
Predict the success rate of your marketing campaign using Logistic Regression

 2 years ago
source link: https://medium.com/sanrusha-consultancy/how-to-predict-product-sale-using-logistics-regression-31c2322c3b3e


Hello there!

Welcome to this article!

We are going to use data related to the direct marketing campaigns of a Portuguese banking institution. The bank's marketing team contacted people over the phone and tried to sell them bank deposits. They created this dataset containing the details of the people called and whether or not they opened a fixed deposit with the bank.

The complete dataset can be found here -> https://archive.ics.uci.edu/ml/datasets/Bank+Marketing

Get the bank-additional.csv and bank-additional-full.csv files.

We are going to use LogisticRegression from the Python library scikit-learn to analyse this data and develop a predictive model.

Let’s begin

Import the required libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

Read the file

df=pd.read_csv(r'C:\Sanrusha-Canon Laptop\Udemy\Machine Learning\SampleDataSet\bank-additional\bank-additional.csv',delimiter=';')

Let’s review first few rows

df.head()

You should get the following output


Let’s confirm the shape of the dataset

df.shape

Result:

(4119, 21)
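Before modeling, it is also worth checking how balanced the target column is. With a heavily imbalanced `y`, raw accuracy can look impressive even for a trivial model that always predicts the majority class. A minimal sketch of the check on a made-up toy frame (the column name `y` matches the dataset; the values and proportions are illustrative only):

```python
import pandas as pd

# Toy stand-in for the bank-marketing target column (made-up 90/10 split)
df_toy = pd.DataFrame({"y": ["no"] * 9 + ["yes"] * 1})

# Relative frequency of each class
balance = df_toy["y"].value_counts(normalize=True)
print(balance)
# "no" dominates, so a model that always answers "no" would already be ~90% accurate
```

Running `df['y'].value_counts(normalize=True)` on the real dataframe gives the actual class proportions.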

Let’s define X and y

y=df['y']
X=df.drop(['y'],axis=1)

Run the Python code below to label-encode the feature values.

from sklearn.preprocessing import LabelEncoder
lbc=LabelEncoder()
for i in range(len(X.columns)):
    X.iloc[:,i]=lbc.fit_transform(X.iloc[:,i])

Let’s review first few records.

X.head()

You should get the following result

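The loop above re-fits the same encoder for each column, so a single LabelEncoder instance suffices; note that it maps every column, numeric ones included, to ordinal integer codes. A small sketch on a toy frame with made-up column names and values:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy frame with one categorical and one numeric column (made-up values)
X_toy = pd.DataFrame({"job": ["admin", "blue-collar", "admin"],
                      "age": [30, 45, 30]})

lbc = LabelEncoder()
for col in X_toy.columns:
    # fit_transform re-fits the encoder for each column, so one instance is enough
    X_toy[col] = lbc.fit_transform(X_toy[col])

print(X_toy)
# "job" becomes [0, 1, 0]; "age" is also mapped to codes [0, 1, 0]
```

For nominal features with no natural order, one-hot encoding (e.g. `pd.get_dummies`) is often preferred over integer codes, but label encoding keeps this walkthrough simple.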

It’s time to scale the feature values.

from sklearn.preprocessing import StandardScaler
stdscl=StandardScaler()
X=stdscl.fit_transform(X)

Let’s print first few records again

pd.DataFrame(X).head()

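StandardScaler centers each feature to zero mean and scales it to unit variance, which helps the logistic regression solver converge. A quick check of that property on made-up numbers:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up feature matrix with very different column scales
X_toy = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
scaled = StandardScaler().fit_transform(X_toy)

# Each column now has mean 0 and unit (population) standard deviation
print(scaled.mean(axis=0))  # ~[0, 0]
print(scaled.std(axis=0))   # ~[1, 1]
```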

It's time to split the data into train and test sets.

from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.2,random_state=0)
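Since "yes" responses are rare in this dataset, passing `stratify=y` to `train_test_split` keeps the class proportions the same in both splits. This is a hedged variation on the split above, not what the article's code does; the toy labels below are made up:

```python
from sklearn.model_selection import train_test_split

# Made-up imbalanced labels: 90 "no", 10 "yes"
y_toy = ["no"] * 90 + ["yes"] * 10
X_toy = [[i] for i in range(100)]

# stratify keeps the yes/no ratio identical in train and test
X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=0, stratify=y_toy
)
print(len(y_te), y_te.count("yes"))  # 20 test rows, exactly 2 of them "yes"
```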

Now fit the training data with LogisticRegression

from sklearn.linear_model import LogisticRegression
lrc=LogisticRegression()
lrc.fit(X_train,y_train)

The model is trained; it's time to predict y now

y_pred=lrc.predict(X_test)

Very good! Let's calculate the accuracy metric. Note that the code below scores the model on the full dataset, including the rows it was trained on, so the figure is optimistic.

from sklearn import metrics
print("Accuracy ", metrics.accuracy_score(y,lrc.predict(X)))


An accuracy of 91% is good!
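A more realistic figure comes from scoring only the held-out test rows, since the model never saw them during fitting. A self-contained sketch of that evaluation pattern, using made-up separable data rather than the bank dataset:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn import metrics

# Made-up, well-separated two-class data just to illustrate the pattern
rng = np.random.RandomState(0)
X_toy = np.r_[rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))]
y_toy = np.array([0] * 50 + [1] * 50)

X_tr, X_te, y_tr, y_te = train_test_split(X_toy, y_toy, test_size=0.2, random_state=0)
clf = LogisticRegression().fit(X_tr, y_tr)

# Score only on rows the model never saw during fitting
acc = metrics.accuracy_score(y_te, clf.predict(X_te))
print("Held-out accuracy:", acc)
```

In the article's own variables, the equivalent call would be `metrics.accuracy_score(y_test, lrc.predict(X_test))`.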

Let's compare the actual values with the predicted values

df3 = pd.DataFrame({'Actual': y.values, 'Predicted':lrc.predict(X)})
df3.head()

You should get the following result


Run the Python code below to count how many predicted values do not match the actual values.

df3['Actual']=lbc.fit_transform(df3['Actual'])
df3['Predicted']=lbc.fit_transform(df3['Predicted'])

j=0
for i in range(len(df3)):
    if df3.iloc[i,0]!=df3.iloc[i,1]:
        j=j+1
print(j)

Result:

356

Only 356 out of 4,119 predictions do not match. Not bad!
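The counting loop above can be replaced with a one-line vectorized comparison. A sketch on toy actual/predicted columns (the values are made up):

```python
import pandas as pd

# Toy actual vs. predicted labels (made-up values)
df3_toy = pd.DataFrame({"Actual": [0, 1, 0, 1, 1],
                        "Predicted": [0, 1, 1, 1, 0]})

# Count rows where the prediction differs from the truth
mismatches = (df3_toy["Actual"] != df3_toy["Predicted"]).sum()
print(mismatches)  # → 2
```

On the article's `df3`, the same expression `(df3['Actual'] != df3['Predicted']).sum()` yields the mismatch count directly.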

Good job! Logistic Regression has worked well in this case.


