Logistic Regression Algorithm

Hi, everyone. I am Orhan Yagizer. In this article, I will work with the logistic regression algorithm in Python. Let’s get started.

Firstly, what is a logistic regression algorithm?

Formula
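For reference, logistic regression is usually written with the sigmoid function, which maps a linear score to a probability between 0 and 1. The notation below (coefficients $\beta_0, \dots, \beta_n$ and features $x_1, \dots, x_n$) is chosen for this sketch and may differ from the notation in the figure.

```latex
p(y = 1 \mid x) = \sigma(z) = \frac{1}{1 + e^{-z}},
\qquad z = \beta_0 + \beta_1 x_1 + \dots + \beta_n x_n

% Equivalently, the log-odds are linear in the features:
\log \frac{p}{1 - p} = \beta_0 + \beta_1 x_1 + \dots + \beta_n x_n
```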

What are the differences between linear regression and logistic regression?

Sometimes these two algorithms can be confused with each other.

Differences
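One difference that is easy to see in code is the range of the output: a linear model returns an unbounded number, while a logistic model passes the same linear score through a sigmoid and returns a value between 0 and 1. Here is a minimal sketch of that idea. The weight `w`, the bias `b`, and the function names are hypothetical and chosen only for illustration.

```python
import math


def linear_output(x, w=2.0, b=-1.0):
    """Linear-regression-style output: an unbounded real number."""
    return w * x + b


def logistic_output(x, w=2.0, b=-1.0):
    """Logistic-regression-style output: the linear score squashed into (0, 1)."""
    z = linear_output(x, w, b)
    return 1.0 / (1.0 + math.exp(-z))


# Compare both outputs for a few inputs (illustrative values only).
for x in (-3.0, 0.5, 3.0):
    print(f"x={x:5.1f}  linear={linear_output(x):6.2f}  logistic={logistic_output(x):.3f}")
```

The logistic output can be read as a probability and turned into a class label with a threshold such as 0.5, which is why logistic regression is used for classification rather than for predicting continuous values.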

Logistic Regression Analysis with Python

Now it’s time to work through an example in Python. I will mostly use scikit-learn, and I will use the Titanic data set from Kaggle. It’s a very famous ML data set. You can download the data set from here.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
train = pd.read_csv('titanic_train.csv')
train.head()
plt.figure(figsize=(10,6))
sns.heatmap(train.isnull(),yticklabels=False,cbar=False,cmap="Greens")
sns.set_style('whitegrid')
sns.countplot(x='Survived',data=train,palette='pastel')
Countplot
sns.set_style('whitegrid')
sns.countplot(x='Survived',hue='Sex',data=train,palette='RdBu_r')
Countplot
sns.set_style('whitegrid')
sns.countplot(x='Survived',hue='Pclass',data=train,palette='viridis')
Countplot
plt.figure(figsize=(10,7))
# distplot is deprecated in recent seaborn releases; histplot draws the same histogram
sns.histplot(train["Age"].dropna(),bins=30);
Distplot
plt.figure(figsize=(12, 7))
sns.boxplot(x='Pclass',y='Age',data=train,palette='viridis')
Boxplot
def trans_age(cols):
    # Fill a missing age with a typical age for the passenger's class
    Age = cols['Age']
    Pclass = cols['Pclass']

    if pd.isnull(Age):
        if Pclass == 1:
            return 37
        elif Pclass == 2:
            return 29
        else:
            return 24
    else:
        return Age

train['Age'] = train[['Age','Pclass']].apply(trans_age,axis=1)
train.drop('Cabin',axis=1,inplace=True)
train.head()
New Dataframe
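An alternative to hard-coding one age per class is to compute the median age of each class from the data itself. The sketch below shows that idea on a small made-up DataFrame called `toy` (hypothetical values, not the Titanic file), so the numbers it produces are not the ones used above.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, used only to illustrate the technique.
toy = pd.DataFrame({
    "Pclass": [1, 1, 1, 2, 2, 3, 3, 3],
    "Age": [40.0, 36.0, np.nan, 30.0, np.nan, 20.0, 26.0, np.nan],
})

# Median age within each passenger class, broadcast back to every row.
# Missing ages are ignored when the median is computed.
class_median = toy.groupby("Pclass")["Age"].transform("median")

# Fill only the missing ages; known ages are left unchanged.
toy["Age"] = toy["Age"].fillna(class_median)
print(toy)
```

This keeps the imputation tied to the data, at the cost of being a little less explicit than a hand-written function.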
plt.figure(figsize=(10,6))
sns.heatmap(train.isnull(),yticklabels=False,cbar=False,cmap="Greens")
Heatmap for missing values
sex = pd.get_dummies(train['Sex'],drop_first=True)
embark = pd.get_dummies(train['Embarked'],drop_first=True)
train.drop(['Sex','Embarked','Name','Ticket'],axis=1,inplace=True)
train = pd.concat([train,sex,embark],axis=1)
train.head()
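To see what `drop_first=True` does in `pd.get_dummies`, here is a small sketch on a made-up column (the values in `sex_col` are hypothetical). With two categories, one indicator column already carries all the information, so the first category in alphabetical order is dropped.

```python
import pandas as pd

# Hypothetical values, chosen only to illustrate the encoding.
sex_col = pd.Series(["male", "female", "female", "male"], name="Sex")

# One indicator column per category.
full = pd.get_dummies(sex_col)

# Same encoding with the first category (alphabetically) dropped.
reduced = pd.get_dummies(sex_col, drop_first=True)

print(list(full.columns))     # ['female', 'male']
print(list(reduced.columns))  # ['male']
```

Dropping one column avoids feeding the model two perfectly correlated features, since `female` is always the opposite of `male` here.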
Modelling Data
from sklearn.model_selection import train_test_split
X = train.drop('Survived',axis=1)
y = train['Survived']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,random_state=42)
from sklearn.linear_model import LogisticRegression
logmodel = LogisticRegression()
logmodel.fit(X_train,y_train)
Output
predictions = logmodel.predict(X_test)
from sklearn.metrics import classification_report, confusion_matrix
print(confusion_matrix(y_test,predictions))
print("\n")
print(classification_report(y_test,predictions))
Confusion Matrix and Classification Report
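The numbers in a classification report can be derived from the confusion matrix by hand. The sketch below does this for the positive class using made-up counts (the values in `matrix` are hypothetical, not the results of the model above), laid out the way scikit-learn's `confusion_matrix` prints them: rows are true labels, columns are predicted labels.

```python
# Hypothetical counts for a binary problem with labels 0 and 1:
# [[true negatives, false positives],
#  [false negatives, true positives]]
matrix = [[50, 10],
          [5, 35]]

tn, fp = matrix[0]
fn, tp = matrix[1]

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)  # of the predicted positives, how many were correct
recall = tp / (tp + fn)     # of the actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```

Reading the report this way makes it easier to tell whether the model's mistakes are mostly false positives or false negatives.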

Orhan Yağızer Çınar

Linkedin

Founder at Codecort | Leader at Young Leaders Over The Horizon | YetGen 21'2 | Advisory Board Member at GelecektekiSen | Blogger | Data Science | orhanyagizercinar.com