The dataset ‘Churn_Modelling.csv’ contains records of 10,000 customers of a bank, with the following columns:
Using columns 1 to 13 as features, we want to predict whether the customer will exit the bank, i.e. column 14.
- Removing unnecessary features
- Label Encoder
- One Hot Encoder
- Train Test Split
- Standard Scaler
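The preprocessing steps above can be sketched as follows. This is a sketch, not the original script: the column names and the tiny synthetic DataFrame standing in for the CSV are assumptions, and a real run would load the file with `pd.read_csv('Churn_Modelling.csv')` instead.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler

# Tiny synthetic stand-in for Churn_Modelling.csv (column names are assumptions);
# the real script would use: dataset = pd.read_csv('Churn_Modelling.csv')
n = 30
dataset = pd.DataFrame({
    'RowNumber': range(1, n + 1),
    'CustomerId': range(1000, 1000 + n),
    'Surname': ['Smith'] * n,
    'CreditScore': [600 + i for i in range(n)],
    'Geography': (['France', 'Spain', 'Germany'] * n)[:n],
    'Gender': (['Male', 'Female'] * n)[:n],
    'Age': [30 + i % 20 for i in range(n)],
    'Tenure': [i % 10 for i in range(n)],
    'Balance': [1000.0 * i for i in range(n)],
    'NumOfProducts': [1 + i % 3 for i in range(n)],
    'HasCrCard': [i % 2 for i in range(n)],
    'IsActiveMember': [(i + 1) % 2 for i in range(n)],
    'EstimatedSalary': [50000.0 + i for i in range(n)],
    'Exited': [i % 2 for i in range(n)],
})

# Remove unnecessary features: the first three columns (row number,
# customer id, surname) carry no predictive signal
X = dataset.iloc[:, 3:13].values
y = dataset.iloc[:, 13].values

# Label-encode the two categorical columns (country and gender)
X[:, 1] = LabelEncoder().fit_transform(X[:, 1])
X[:, 2] = LabelEncoder().fit_transform(X[:, 2])

# One-hot encode the country column, then drop one dummy column
# to avoid the dummy-variable trap; 11 features remain
ct = ColumnTransformer([('geo', OneHotEncoder(), [1])],
                       remainder='passthrough', sparse_threshold=0)
X = ct.fit_transform(X)[:, 1:]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Fit the scaler on the training set only, then apply it to both splits
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
```

Scaling after the split (fitting only on the training data) keeps information from the test set from leaking into the preprocessing.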
Create an ANN with 4 layers in total (counting the 11 input features as a layer):
- One input layer with 11 input features and 6 output features
- Hidden layer with 6 output features
- Final layer with 1 output feature
- 1st layer: ReLU
- 2nd layer: ReLU
- 3rd layer: Sigmoid
- optimizer: adam
- loss: binary_crossentropy
- metrics: accuracy
- batch_size: 10
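Under those choices, the network can be sketched in Keras as follows. This is a sketch rather than the author's exact script; it uses an explicit `Input` layer instead of the older `input_dim` argument, and `X_train`/`y_train` are assumed to come from the preprocessing step.

```python
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Sequential

# 11 scaled features in; two ReLU layers of 6 units; one sigmoid output unit
classifier = Sequential([
    Input(shape=(11,)),
    Dense(6, activation='relu'),     # "input" layer: 11 -> 6
    Dense(6, activation='relu'),     # hidden layer: 6 -> 6
    Dense(1, activation='sigmoid'),  # output layer: probability of exiting
])
classifier.compile(optimizer='adam',
                   loss='binary_crossentropy',
                   metrics=['accuracy'])
# classifier.fit(X_train, y_train, batch_size=10, epochs=100)
```

The sigmoid output gives a probability in [0, 1], which pairs with the `binary_crossentropy` loss for this two-class problem.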
Accuracy (subject to change):
- Training Set: 0.8610
- Testing Set: 0.86
- Use k-fold cross-validation to split the training set into, say, 10 parts, training on 9 of them and validating on the remaining part each time. This reduces the fluctuation in accuracy between runs of the code.
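One way to sketch this is a manual `StratifiedKFold` loop around the model-building function (a scikit-learn wrapper such as `KerasClassifier` would also work). The random stand-in data and the shrunken epoch count here are assumptions so the sketch runs quickly; on the real data you would pass the scaled training set and ~100 epochs.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Sequential

# Synthetic stand-in for the scaled training set (assumption: 11 features)
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 11)).astype('float32')
y_train = (X_train[:, 0] > 0).astype('float32')

def build_classifier():
    model = Sequential([Input(shape=(11,)),
                        Dense(6, activation='relu'),
                        Dense(6, activation='relu'),
                        Dense(1, activation='sigmoid')])
    model.compile(optimizer='adam', loss='binary_crossentropy',
                  metrics=['accuracy'])
    return model

# Train on 9 of 10 folds, evaluate on the held-out fold, each time
accuracies = []
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for train_idx, val_idx in kfold.split(X_train, y_train):
    model = build_classifier()
    model.fit(X_train[train_idx], y_train[train_idx],
              batch_size=10, epochs=5, verbose=0)  # ~100 epochs on real data
    _, acc = model.evaluate(X_train[val_idx], y_train[val_idx], verbose=0)
    accuracies.append(acc)

print(f'mean accuracy: {np.mean(accuracies):.4f} +/- {np.std(accuracies):.4f}')
```

The mean of the 10 fold accuracies is a more stable estimate than a single train/test run, and the standard deviation shows how much the score fluctuates.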
- Use the Dropout technique with a certain threshold (the dropout rate) to decrease overfitting on the training set. Applied to this dataset it gives an accuracy of 0.8321, which suggests the model is now less overfitted to the training set.
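A minimal sketch of the same network with dropout added, assuming a rate of 0.1 (the original's exact rate is not stated):

```python
from tensorflow.keras.layers import Dense, Dropout, Input
from tensorflow.keras.models import Sequential

# Same architecture with a Dropout layer after each ReLU layer; the rate
# (here 0.1, i.e. 10% of units disabled per update) is the threshold to tune
classifier = Sequential([
    Input(shape=(11,)),
    Dense(6, activation='relu'),
    Dropout(0.1),
    Dense(6, activation='relu'),
    Dropout(0.1),
    Dense(1, activation='sigmoid'),
])
classifier.compile(optimizer='adam', loss='binary_crossentropy',
                   metrics=['accuracy'])
```

Dropout layers add no trainable parameters; they only randomly zero out activations during training, which forces the network not to rely on any single unit.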
- Use GridSearchCV to find the best parameters automatically: enter all the hyperparameter values you want to test your network on, and after evaluating every combination it reports the best accuracy and the parameters that produced it. I tried the following:
- batch_size: 25, 32
- epochs: 100, 500
- optimizer: adam, rmsprop
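Running `GridSearchCV` directly requires a scikit-learn-compatible wrapper around the Keras model (for example `KerasClassifier` from the scikeras package). As a dependency-light sketch of the same sweep, the loop below does what `GridSearchCV` does by hand with `ParameterGrid`: cross-validate every combination and keep the best. The stand-in data, the shrunken epoch counts, and the 2-fold split are assumptions so it finishes quickly; the real search used epochs of 100 and 500.

```python
import numpy as np
from sklearn.model_selection import ParameterGrid, StratifiedKFold
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Sequential

# Synthetic stand-in for the scaled training set (assumption: 11 features)
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 11)).astype('float32')
y = (X[:, 0] > 0).astype('float32')

def build_classifier(optimizer):
    model = Sequential([Input(shape=(11,)),
                        Dense(6, activation='relu'),
                        Dense(6, activation='relu'),
                        Dense(1, activation='sigmoid')])
    model.compile(optimizer=optimizer, loss='binary_crossentropy',
                  metrics=['accuracy'])
    return model

# Epoch counts are shrunk here so the sketch runs fast; the text used 100/500
param_grid = ParameterGrid({'batch_size': [25, 32],
                            'epochs': [1, 2],
                            'optimizer': ['adam', 'rmsprop']})
cv = StratifiedKFold(n_splits=2, shuffle=True, random_state=0)
best_score, best_params = -1.0, None
for params in param_grid:
    fold_scores = []
    for tr, va in cv.split(X, y):
        model = build_classifier(params['optimizer'])
        model.fit(X[tr], y[tr], batch_size=params['batch_size'],
                  epochs=params['epochs'], verbose=0)
        fold_scores.append(model.evaluate(X[va], y[va], verbose=0)[1])
    score = float(np.mean(fold_scores))
    if score > best_score:
        best_score, best_params = score, params

print('best:', best_params, best_score)
```

With 2 batch sizes, 2 epoch counts and 2 optimizers, the sweep trains 2 × 2 × 2 × n_folds models, which is why the full-size search below took all night.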
- After waking up in the morning (yes, it takes a long time…), this is what I found:
- batch_size: 25
- epochs: 500
- optimizer: rmsprop
You can further improve this model by changing the hyperparameters and trying other ranges of values in GridSearchCV. Note, however, that as you increase the number of parameter combinations in GridSearchCV, the training time also increases.
Send a pull request for any suggestions and errors…