
Predict the success rate of your marketing campaign using Logistic Regression

 2 years ago
source link: https://medium.com/sanrusha-consultancy/how-to-predict-product-sale-using-logistics-regression-31c2322c3b3e


Hello there!

Welcome to this article!

We are going to use data related to the direct marketing campaigns of a Portuguese banking institution. The bank's marketing team contacted people over the phone and tried to sell them bank deposits. They created this dataset containing the details of the people called and whether or not they opened a fixed deposit with the bank.

The complete dataset can be found here -> https://archive.ics.uci.edu/ml/datasets/Bank+Marketing

Get the bank-additional.csv and bank-additional-full.csv files.

We are going to use LogisticRegression from the Python library scikit-learn to analyse this data and develop a predictive model.

Let’s begin

Import the required libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

Read the file

df=pd.read_csv(r'C:\Sanrusha-Canon Laptop\Udemy\Machine Learning\SampleDataSet\bank-additional\bank-additional.csv',delimiter=';')

Let’s review first few rows

df.head()

You should get the following output


Let’s confirm the shape of the dataset

df.shape

Result:

(4119, 21)
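Before modeling, it is also worth checking how balanced the target column is. With a heavily imbalanced `y`, raw accuracy can look impressive even for a trivial model that always predicts the majority class. A minimal sketch of the check on a made-up toy frame (the column name `y` matches the dataset; the values and proportions are illustrative only):

```python
import pandas as pd

# Toy stand-in for the bank-marketing target column (made-up 90/10 split)
df_toy = pd.DataFrame({"y": ["no"] * 9 + ["yes"] * 1})

# Relative frequency of each class
balance = df_toy["y"].value_counts(normalize=True)
print(balance)
# "no" dominates, so a model that always answers "no" would already be ~90% accurate
```

Running `df['y'].value_counts(normalize=True)` on the real dataframe gives the actual class proportions.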

Let’s define X and y

y=df['y']
X=df.drop(['y'],axis=1)

Run the Python code below to label-encode the feature values.

from sklearn.preprocessing import LabelEncoder
lbc=LabelEncoder()
for i in range(len(X.columns)):
    X.iloc[:,i]=lbc.fit_transform(X.iloc[:,i])

Let’s review first few records.

X.head()

You should get the following result

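The loop above re-fits the same encoder for each column, so a single LabelEncoder instance suffices; note that it maps every column, numeric ones included, to ordinal integer codes. A small sketch on a toy frame with made-up column names and values:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy frame with one categorical and one numeric column (made-up values)
X_toy = pd.DataFrame({"job": ["admin", "blue-collar", "admin"],
                      "age": [30, 45, 30]})

lbc = LabelEncoder()
for col in X_toy.columns:
    # fit_transform re-fits the encoder for each column, so one instance is enough
    X_toy[col] = lbc.fit_transform(X_toy[col])

print(X_toy)
# "job" becomes [0, 1, 0]; "age" is also mapped to codes [0, 1, 0]
```

For nominal features with no natural order, one-hot encoding (e.g. `pd.get_dummies`) is often preferred over integer codes, but label encoding keeps this walkthrough simple.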

It’s time to scale the feature values.

from sklearn.preprocessing import StandardScaler
stdscl=StandardScaler()
X=stdscl.fit_transform(X)

Let’s print first few records again

pd.DataFrame(X).head()

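StandardScaler centers each feature to zero mean and scales it to unit variance, which helps the logistic regression solver converge. A quick check of that property on made-up numbers:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up feature matrix with very different column scales
X_toy = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
scaled = StandardScaler().fit_transform(X_toy)

# Each column now has mean 0 and unit (population) standard deviation
print(scaled.mean(axis=0))  # ~[0, 0]
print(scaled.std(axis=0))   # ~[1, 1]
```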

It's time to split the data into train and test sets.

from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.2,random_state=0)
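Since "yes" responses are rare in this dataset, passing `stratify=y` to `train_test_split` keeps the class proportions the same in both splits. This is a hedged variation on the split above, not what the article's code does; the toy labels below are made up:

```python
from sklearn.model_selection import train_test_split

# Made-up imbalanced labels: 90 "no", 10 "yes"
y_toy = ["no"] * 90 + ["yes"] * 10
X_toy = [[i] for i in range(100)]

# stratify keeps the yes/no ratio identical in train and test
X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=0, stratify=y_toy
)
print(len(y_te), y_te.count("yes"))  # 20 test rows, exactly 2 of them "yes"
```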

Now fit the training data with LogisticRegression

from sklearn.linear_model import LogisticRegression
lrc=LogisticRegression()
lrc.fit(X_train,y_train)

The model is trained; it's time to predict y now

y_pred=lrc.predict(X_test)

Very good! Let's calculate the accuracy metric. Note that the code below scores the model on the full dataset, including the rows it was trained on, so the figure is optimistic.

from sklearn import metrics
print("Accuracy ", metrics.accuracy_score(y,lrc.predict(X)))


An accuracy of 91% is good!
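A more realistic figure comes from scoring only the held-out test rows, since the model never saw them during fitting. A self-contained sketch of that evaluation pattern, using made-up separable data rather than the bank dataset:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn import metrics

# Made-up, well-separated two-class data just to illustrate the pattern
rng = np.random.RandomState(0)
X_toy = np.r_[rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))]
y_toy = np.array([0] * 50 + [1] * 50)

X_tr, X_te, y_tr, y_te = train_test_split(X_toy, y_toy, test_size=0.2, random_state=0)
clf = LogisticRegression().fit(X_tr, y_tr)

# Score only on rows the model never saw during fitting
acc = metrics.accuracy_score(y_te, clf.predict(X_te))
print("Held-out accuracy:", acc)
```

In the article's own variables, the equivalent call would be `metrics.accuracy_score(y_test, lrc.predict(X_test))`.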

Let's compare the actual values with the predicted values

df3 = pd.DataFrame({'Actual': y.values, 'Predicted':lrc.predict(X)})
df3.head()

You should get the following result


Run the Python code below to count how many predicted values do not match the actual values.

df3['Actual']=lbc.fit_transform(df3['Actual'])
df3['Predicted']=lbc.fit_transform(df3['Predicted'])

j=0
for i in range(len(df3)):
    if df3.iloc[i,0]!=df3.iloc[i,1]:
        j=j+1
print(j)

Result:

356

Only 356 out of 4,119 predictions do not match. Not bad!
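The counting loop above can be replaced with a one-line vectorized comparison. A sketch on toy actual/predicted columns (the values are made up):

```python
import pandas as pd

# Toy actual vs. predicted labels (made-up values)
df3_toy = pd.DataFrame({"Actual": [0, 1, 0, 1, 1],
                        "Predicted": [0, 1, 1, 1, 0]})

# Count rows where the prediction differs from the truth
mismatches = (df3_toy["Actual"] != df3_toy["Predicted"]).sum()
print(mismatches)  # → 2
```

On the article's `df3`, the same expression `(df3['Actual'] != df3['Predicted']).sum()` yields the mismatch count directly.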

Good job! Logistic Regression has worked well in this case.


