Bank Customer
Churn Prediction
Leveraging Machine Learning for
Enhanced Customer Retention
Presented by: Saurav Singh
Introduction
• The Banking sector is evolving rapidly, driven by technological
advancements, changing consumer preferences, and a competitive
market.
• Customer churn, which is the phenomenon of customers
discontinuing their relationship with a bank, poses unique
challenges and opportunities. When a bank loses customers, both its
revenue and its market standing can suffer significantly.
• Machine learning, with its predictive capabilities, offers a
transformative approach to understanding and mitigating the
challenges posed by customer churn.
Through data-driven insights and predictive modeling, this presentation aims to showcase my
Machine Learning Capstone Project focused on predicting customer churn in the Banking Sector.
Dataset
Information
Here are the key details about the dataset used in this project:
• Number of records: Our dataset comprises a robust collection of data,
consisting of 10,000 records. Each record represents a unique entry,
contributing to the richness and depth of our analysis.
• Features/Columns: The dataset is characterized by a diverse set of features,
each providing valuable insights into customer behavior, preferences, and
interactions. In total, there are 14 features/columns that form the basis of our
predictive modeling.
Column Names
• Row Number
• Customer ID
• Surname
• Credit Score
• Geography
• Gender
• Age
• Tenure
• Balance
• Number of Products
• Has Credit Card
• Is Active Member
• Estimated Salary
• Churned
Exploratory Data Analysis (EDA)
• Exploring the data allowed us to gain a comprehensive overview of
the data's structure. It uncovered potential patterns, helped us
identify key trends and get essential insights from the dataset.
• Throughout the EDA process, we analyzed the distribution of
individual features, investigated correlations, and explored any
inherent relationships between variables.
• Visualizations also played a crucial role in providing a clear
representation of the data, offering insights into customer behavior
and identifying the factors that may contribute to customer churn.
• First, we checked the dataset for null values and duplicate rows. There weren't any, so
our dataset was clean to begin with.
• Then, we checked our columns to see if they were providing any useful information for us
to work with. We found out that columns like “RowNumber”, “CustomerID” and “Surname”
weren't contributing much to the predictions. Hence, we decided to drop them during
preprocessing.
• The "Geography" and "Gender" columns in our dataset were categorical variables. For
them to work with our model, it was necessary to convert these categorical features into a
numerical format.
• To ensure consistent scales for numerical features, we decided to employ Standard Scaler
during preprocessing.
Exploratory Data Analysis (EDA)
Visualizations
Our target variable 'Churned' exhibits class
imbalance, with one class dominating the other.
This issue of data imbalance needs to be addressed.
The above plot reveals a substantial
customer presence in France, surpassing
other regions by a significant margin.
• The dataset contains more Male entries than Female entries.
• The number of credit card owners is significantly higher than those who don’t own a credit card.
• Credit Card owners have a higher Churn Rate than Non-Credit Card owners.
• The numbers of Active and Inactive members are almost the same.
• Inactive members have a higher Churn Rate than Active members.
• More customers have a Credit Score between 601 and 700 than in any other Credit Score range.
• More customers are aged 31 to 40 than in any other Age Group.
Upon inspecting the heatmap, we can see that there is no significant correlation observed
among the columns. As a result, no columns will be dropped solely based on correlation.
Preprocessing
• First, the “RowNumber”, “CustomerID” and “Surname” columns were dropped as they
didn’t provide any useful information for our predictions.
• Then, we converted the categorical columns (“Geography” and “Gender”) into numerical
data using One-Hot Encoding. This technique creates one binary (0/1) column for each
unique category present in a categorical column.
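The two preprocessing steps above can be sketched in plain Python. This is only an illustration: the sample rows, the category values other than France, and the helper name one_hot_encode are assumptions made for the example, and the deck does not say which library was used for encoding.

```python
# Hypothetical sample rows; the values are illustrative, not taken from the dataset.
rows = [
    {"RowNumber": 1, "CustomerID": 101, "Surname": "A",
     "Geography": "France", "Gender": "Female", "Age": 42},
    {"RowNumber": 2, "CustomerID": 102, "Surname": "B",
     "Geography": "Spain", "Gender": "Male", "Age": 35},
    {"RowNumber": 3, "CustomerID": 103, "Surname": "C",
     "Geography": "France", "Gender": "Male", "Age": 51},
]

DROP = {"RowNumber", "CustomerID", "Surname"}   # identifier columns
CATEGORICAL = ["Geography", "Gender"]           # columns to one-hot encode


def one_hot_encode(rows, drop, categorical):
    """Drop identifier columns and expand each categorical column into 0/1 columns."""
    # Collect the sorted set of categories seen in each categorical column.
    categories = {col: sorted({r[col] for r in rows}) for col in categorical}
    encoded = []
    for r in rows:
        # Keep every column that is neither dropped nor categorical.
        out = {k: v for k, v in r.items() if k not in drop and k not in categorical}
        for col in categorical:
            for cat in categories[col]:
                out[f"{col}_{cat}"] = 1 if r[col] == cat else 0
        encoded.append(out)
    return encoded


encoded = one_hot_encode(rows, DROP, CATEGORICAL)
print(encoded[0])
```

Each encoded row has exactly one 1 among the columns generated from the same original column, which is what makes the representation "one-hot".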
Splitting the data into X and y
• In this step, we partitioned the dataset into two components: X and y.
• The variable X encompasses all independent variables, representing the features
that contribute to our predictions.
• On the other hand, y encapsulates the dependent variable or target variable,
serving as the outcome we aim to predict.
Train-Test Split
• We then split the dataset into training data and testing data.
• We did an 80:20 split, meaning 80% of our data is Training Data and 20% of our data is
Testing Data. So, our test size was set to 0.2.
• We took Random State as 123. This guaranteed the reproducibility of our results across
different runs.
• We also used Stratify = y to ensure that the classes of our Target Variable (y) appear in
the same proportions in both the Training Data and the Testing Data.
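A stratified 80:20 split can be illustrated with a short standard-library sketch. The function stratified_split and the toy labels are assumptions made for this example; a library routine would normally be used, and its shuffling details will differ, so the exact rows selected here are not those of the project.

```python
import random
from collections import Counter, defaultdict


def stratified_split(y, test_size=0.2, seed=123):
    """Return (train_idx, test_idx) so that each class is split in the same ratio."""
    rng = random.Random(seed)          # fixed seed makes the split reproducible
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)

    train_idx, test_idx = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = round(len(idx) * test_size)   # 20% of this class goes to the test set
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy imbalanced target: 80 "not churned" (0) and 20 "churned" (1).
y = [0] * 80 + [1] * 20
train_idx, test_idx = stratified_split(y)

train_counts = Counter(y[i] for i in train_idx)
test_counts = Counter(y[i] for i in test_idx)
print(train_counts, test_counts)
```

Both splits keep the original 4:1 class ratio, and calling the function again with the same seed returns the same indices.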
Standard Scaler
• We used Standard Scaler to standardize the features of the dataset.
• Standardization rescales each feature to zero mean and unit variance, so that all features
are on a consistent scale.
• Standardization is crucial for certain machine learning algorithms, promoting optimal
model performance by mitigating the influence of varying magnitudes among features.
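Standardization itself is a small computation: subtract the mean and divide by the standard deviation. The sketch below uses made-up balance values and hypothetical helper names. Fitting the statistics on the training split only and reusing them for the test split is an assumption of this example; the deck does not state how the scaler was fitted.

```python
from statistics import fmean, pstdev


def fit_scaler(column):
    """Learn the mean and population standard deviation of one feature."""
    mean = fmean(column)
    std = pstdev(column)
    # Guard against constant columns, which would otherwise divide by zero.
    return mean, (std if std > 0 else 1.0)


def transform(column, mean, std):
    """Apply z = (x - mean) / std to every value."""
    return [(v - mean) / std for v in column]


# Illustrative values only, not taken from the dataset.
train_balance = [0.0, 50000.0, 100000.0, 150000.0]
test_balance = [75000.0, 200000.0]

mean, std = fit_scaler(train_balance)                # statistics from training data
train_scaled = transform(train_balance, mean, std)
test_scaled = transform(test_balance, mean, std)     # reuse the training statistics

print(train_scaled)
print(test_scaled)
```

After the transform, the training column has mean 0 and standard deviation 1, while test values are expressed in the same units and need not have those exact statistics.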
Over-Sampling with SMOTE
• We had data imbalance within our target variable. Initially, we evaluated our model's
accuracy in the presence of this imbalance.
• Then, to rectify the issue of imbalance, we implemented the Synthetic Minority Over-
Sampling Technique (SMOTE) as an oversampling method.
• We then compared the model accuracies before and after addressing the data imbalance using
SMOTE, providing valuable insights into the impact of this preprocessing technique.
• Distribution of our y_train before oversampling:
    Not Churned   Churned
    6370          1630
• Distribution of our y_train after oversampling:
    Not Churned   Churned
    6370          6370
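The core idea of SMOTE is to create new minority-class points by interpolating between an existing minority point and one of its nearest minority neighbours. The following is a simplified sketch of that idea, not the implementation used in the project: the function name smote, the toy points, and the choice k=2 are all assumptions for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=123):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base point (excluding itself).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy training data: 8 majority points and 3 minority points (2 features each).
majority = [(float(i), 0.0) for i in range(8)]
minority = [(1.0, 5.0), (2.0, 6.0), (3.0, 5.5)]

# Generate just enough synthetic points to balance the two classes.
new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

Because every synthetic point lies on a segment between two real minority points, the new samples stay inside the region already occupied by the minority class. As in the counts above, only the training labels are rebalanced.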
Applying Machine
Learning Algorithms
This Bank Customer Churn problem we have here is a Binary Classification problem.
Models used:
• Logistic Regression : Logistic Regression is a powerful tool in binary classification. It's very good at modeling
the probability of an event occurring, making it suitable for scenarios where understanding the likelihood of
customers churning is essential.
• Support Vector Machine (SVC) : Support Vector Classification is a robust algorithm employed for classification
tasks, especially when there's a need for clear separation between classes. In the context of customer churn
prediction, it draws distinct decision boundaries between loyal and potential churned customers.
• Naive Bayes : Naive Bayes is a probabilistic classification algorithm known for its simplicity and efficiency. It
assumes that features are independent, making calculations easier. It's often used when simplicity and speed
are crucial.
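To make "modeling the probability of an event" concrete, a fitted logistic regression scores a customer by passing a weighted sum of the features through the sigmoid function. The weights, bias, and feature values below are hypothetical and chosen only to show the calculation; they are not the fitted parameters of the project's model.

```python
import math


def churn_probability(features, weights, bias):
    """Logistic model: sigmoid of a weighted sum of the features."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical parameters for two standardized features.
weights = [0.8, -1.1]
bias = -1.4

p = churn_probability([1.5, -1.0], weights, bias)
predicted_churn = p >= 0.5   # default decision threshold of 0.5
print(round(p, 3), predicted_churn)
```

The output is always between 0 and 1, so it can be read as an estimated churn probability and thresholded to obtain a class label.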
Evaluation Metrics
Without Oversampling (SMOTE)
    Model     Accuracy (%)   Precision (%)   Recall (%)   F1-Score (%)
    LOGI      81.2           59.62           23.58        33.80
    SVC       86.5           80.44           44.47        57.27
    NB        82.1           59.53           37.59        46.08
With Oversampling (SMOTE)
    Model     Accuracy (%)   Precision (%)   Recall (%)   F1-Score (%)
    LOGI_OS   70.75          37.42           65.11        47.53
    SVC_OS    80.75          51.88           74.44        61.15
    NB_OS     71.70          38.91           68.55        49.64
• We can see that Oversampling makes a huge difference.
• After Oversampling, Accuracy dropped by about 6 to 10 points and Precision by more than
20 points for every model, but Recall rose by roughly 30 to 42 points and F1-Score
improved for every model.
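For reference, Precision, Recall and F1-Score are all derived from the counts of true positives, false positives and false negatives. The sketch below shows the formulas and checks that the reported F1-Scores for SVC follow from the reported Precision and Recall, up to rounding. The function names and the example counts are made up for illustration.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)   # share of predicted churners who really churned
    recall = tp / (tp + fn)      # share of real churners that were caught
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Made-up counts: 3 churners caught, 1 false alarm, 2 churners missed.
example = precision_recall_f1(tp=3, fp=1, fn=2)

# Recompute F1 from the Precision and Recall values in the tables (percent).
svc_f1 = f1_from(80.44, 44.47)      # SVC without oversampling
svc_os_f1 = f1_from(51.88, 74.44)   # SVC with oversampling

print(example)
print(round(svc_f1, 2), round(svc_os_f1, 2))
```

The recomputed values agree with the table entries of 57.27 and 61.15 to within 0.01, the difference coming from rounding of the inputs.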
Model Selection and Considerations
• SVC outperforms Logistic Regression and Naive Bayes in all metrics, demonstrating
higher Accuracy, Precision, Recall, and F1-Score. It seems to be a promising model for
our task.
• Based on the provided metrics, SVC stands out as the best-performing model overall. It
achieves a good balance between precision and recall, making it suitable for our
customer churn prediction task.
• While metrics like Accuracy and Precision are essential, Recall is particularly crucial in
Customer Churn Prediction, as it indicates the ability to identify customers who are
likely to Churn. Support Vector Classification gave us the best Recall value.
• Hence, we will go with Support Vector Classification as our final model as it is quite
evident that it performs best for our Bank Customer Churn problem.
Conclusion
• With the help of several insights, patterns and trends in our data, we’ve used Machine Learning to
address the intricate challenge of predicting Customer Churn.
• This project offers significant benefits to banks:
➢ By predicting potential churners, banks can adopt proactive strategies to retain valuable
customers. This involves personalized interventions, loyalty programs, and targeted
communication to address customer concerns and enhance satisfaction.
➢ By focusing efforts on customers at a higher risk of churn, banks can streamline operations,
reduce marketing costs, and improve overall efficiency.
➢ Anticipating and mitigating customer churn contributes directly to revenue optimization.
➢ Understanding the factors influencing customer churn enables banks to tailor their services to
meet individual needs. This level of personalization fosters stronger customer relationships,
increases loyalty, and enhances the overall banking experience.
Thank You!

  • 2. Bank Customer Churn Prediction: Leveraging Machine Learning for Enhanced Customer Retention. Presented by: Saurav Singh
  • 3. Introduction
      • The banking sector is evolving rapidly, shaped by technological advancements, changing consumer preferences, and a competitive market.
      • Customer churn, the phenomenon of customers discontinuing their relationship with a bank, poses unique challenges and opportunities. Losing customers can seriously affect a bank's revenue and its market standing.
      • Machine learning, with its predictive capabilities, offers a way to understand and mitigate customer churn.
      Through data-driven insights and predictive modeling, this presentation showcases my Machine Learning Capstone Project, which focuses on predicting customer churn in the banking sector.
  • 4. Dataset Information
      Key details about the dataset used in this project:
      • Number of records: 10,000. Each record represents a unique entry.
      • Features/Columns: 14 columns in total, describing customer attributes, behavior, and interactions. The last column, Churned, is the target variable.
      Column Names: Row Number, Customer ID, Surname, Credit Score, Geography, Gender, Age, Tenure, Balance, Number of Products, Has Credit Card, Is Active Member, Estimated Salary, Churned
  • 5. Exploratory Data Analysis (EDA)
      • Exploring the data gave us an overview of its structure, uncovered potential patterns, and helped us identify key trends in the dataset.
      • Throughout the EDA process, we analyzed the distribution of individual features, investigated correlations, and explored relationships between variables.
      • Visualizations gave a clear representation of the data, offering insights into customer behavior and the factors that may contribute to customer churn.
  • 6. Exploratory Data Analysis (EDA)
      • First, we checked the dataset for null values and duplicates. There were none, so the dataset was clean to begin with.
      • Next, we checked whether each column provided useful information. Columns such as "RowNumber", "CustomerID" and "Surname" did not contribute to the predictions, so we decided to drop them during preprocessing.
      • The "Geography" and "Gender" columns are categorical variables. For them to work with our models, they had to be converted into a numerical format.
      • To ensure consistent scales for numerical features, we decided to apply Standard Scaler during preprocessing.
  • 7. Visualizations
      • Our target variable 'Churned' exhibits class imbalance, with far more Not Churned than Churned customers. This imbalance needs to be addressed.
      • The plot of customers by region shows a substantial customer presence in France, surpassing other regions by a significant margin.
  • 8. Visualizations (continued)
      • The dataset contains more Male entries than Female entries.
      • Credit card owners significantly outnumber customers who don't own a credit card.
      • Credit card owners have a higher churn rate than non-owners.
      • Active and inactive members are almost evenly distributed.
      • Inactive members have a higher churn rate than active members.
  • 9. Visualizations (continued)
      • The Credit Score range 601 to 700 contains more customers than any other range.
      • The Age group 31 to 40 contains more customers than any other age group.
  • 10. The correlation heatmap shows no significant correlation among the columns. As a result, no columns will be dropped based on correlation alone.
  • 11. Preprocessing
      • First, the "RowNumber", "CustomerID" and "Surname" columns were dropped, as they provided no useful information for our predictions.
      • Then, we converted the categorical data into numerical data using One-Hot Encoding, which creates one binary (0/1) column for each unique class in a categorical column.
      Splitting the data into X and y
      • In this step, we partitioned the dataset into two components: X and y.
      • X contains all the independent variables, i.e. the features used to make predictions.
      • y contains the dependent (target) variable, i.e. the outcome we aim to predict.
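The preprocessing and X/y split described on this slide can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the slides do not name the library used, the sample rows and their values are made up, and the spellings "CreditScore" and "Age" for column keys are assumptions.

```python
# Hypothetical sample rows: the values are invented for illustration only.
rows = [
    {"RowNumber": 1, "CustomerID": 101, "Surname": "A", "CreditScore": 650,
     "Geography": "France", "Gender": "Female", "Age": 42, "Churned": 1},
    {"RowNumber": 2, "CustomerID": 102, "Surname": "B", "CreditScore": 710,
     "Geography": "Spain", "Gender": "Male", "Age": 35, "Churned": 0},
    {"RowNumber": 3, "CustomerID": 103, "Surname": "C", "CreditScore": 580,
     "Geography": "Germany", "Gender": "Male", "Age": 51, "Churned": 1},
]

DROP = ("RowNumber", "CustomerID", "Surname")   # identifier columns
CATEGORICAL = ("Geography", "Gender")           # columns to one-hot encode
TARGET = "Churned"


def one_hot_encode(rows, categorical):
    """Replace each categorical column with one 0/1 column per unique class."""
    classes = {col: sorted({r[col] for r in rows}) for col in categorical}
    encoded = []
    for r in rows:
        out = {k: v for k, v in r.items() if k not in categorical}
        for col, values in classes.items():
            for value in values:
                out[f"{col}_{value}"] = 1 if r[col] == value else 0
        encoded.append(out)
    return encoded


def preprocess(rows):
    """Drop identifier columns, one-hot encode, then separate features and target."""
    kept = [{k: v for k, v in r.items() if k not in DROP} for r in rows]
    encoded = one_hot_encode(kept, CATEGORICAL)
    X = [{k: v for k, v in r.items() if k != TARGET} for r in encoded]
    y = [r[TARGET] for r in encoded]
    return X, y


X, y = preprocess(rows)
print(X[0])
print(y)
```

Each categorical column becomes as many 0/1 columns as it has classes, so a row has exactly one 1 among its "Geography_*" columns and one among its "Gender_*" columns.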
  • 12. Train-Test Split
      • We then split the dataset into training data and testing data.
      • We used an 80:20 split: 80% of the data for training and 20% for testing, i.e. a test size of 0.2.
      • We set Random State to 123, which makes the split reproducible across runs.
      • We also used Stratify = y so that the training and testing sets keep the same class proportions as our target variable (y).
      Standard Scaler
      • We used Standard Scaler to standardize the features of the dataset.
      • This puts all features on a comparable scale.
      • Standardization is important for certain machine learning algorithms, because it reduces the influence of differing magnitudes among features.
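The stratified 80:20 split and the standardization step can be illustrated with a small standard-library sketch. The function names and the generated data are hypothetical. Fitting the scaler on the training rows only is a common practice assumed here; the slides do not say how the scaler was fitted. A seeded generator stands in for the Random State of 123 and will not reproduce the project's actual split.

```python
import random
import statistics


def stratified_split(X, y, test_size=0.2, seed=123):
    """Split per class so both sets keep the class proportions of y."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(set(y)):
        idx = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_size)
        test_idx += idx[:n_test]
        train_idx += idx[n_test:]
    X_train = [X[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_train = [y[i] for i in train_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, X_test, y_train, y_test


def fit_scaler(X_train):
    """Learn per-feature mean and standard deviation from the training rows."""
    columns = list(zip(*X_train))
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.pstdev(c) or 1.0 for c in columns]  # avoid dividing by zero
    return means, stds


def transform(X, means, stds):
    """Standardize each feature: (value - mean) / standard deviation."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]


# Made-up data: 100 rows, two features on very different scales, 20% positives.
data_rng = random.Random(0)
X = [[data_rng.uniform(300, 850), data_rng.uniform(0, 200000)] for _ in range(100)]
y = [1] * 20 + [0] * 80

X_train, X_test, y_train, y_test = stratified_split(X, y)
means, stds = fit_scaler(X_train)
X_train_scaled = transform(X_train, means, stds)
X_test_scaled = transform(X_test, means, stds)  # reuse the training statistics

print(len(X_train), len(X_test), sum(y_train), sum(y_test))
```

Because the split is done per class, the 20% share of positives in this toy data is preserved in both the training and the testing set.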
  • 13. Over-Sampling with SMOTE
      • Our target variable was imbalanced. We first evaluated our models in the presence of this imbalance.
      • Then, to correct the imbalance, we applied the Synthetic Minority Over-Sampling Technique (SMOTE) as an oversampling method.
      • We then compared the model metrics before and after applying SMOTE to see the impact of this preprocessing step.
      • Distribution of our y_train before oversampling: Not Churned 6370, Churned 1630
      • Distribution of our y_train after oversampling: Not Churned 6370, Churned 6370
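The core idea of SMOTE, creating synthetic minority rows by interpolating between a minority row and one of its nearest minority neighbours, can be sketched as follows. This is a simplified standard-library illustration, not the implementation used in the project, which the slides do not name. The function names, k = 5, and the generated data are assumptions; the class counts are scaled down tenfold from the slide (637 and 163 instead of 6370 and 1630).

```python
import math
import random


def smote(X_min, n_new, k=5, seed=123):
    """Create n_new synthetic rows, each on the segment between a minority row
    and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(X_min)
        neighbours = sorted(
            (row for row in X_min if row is not base),
            key=lambda row: math.dist(base, row),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def oversample(X, y, minority=1, k=5, seed=123):
    """Add synthetic minority rows until both classes have the same count."""
    X_min = [row for row, label in zip(X, y) if label == minority]
    n_new = (len(y) - len(X_min)) - len(X_min)
    X_new = smote(X_min, n_new, k=k, seed=seed)
    return X + X_new, y + [minority] * n_new


# Made-up two-feature data with the slide's class ratio, scaled down tenfold.
data_rng = random.Random(0)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(637)]
X += [[data_rng.gauss(2, 1), data_rng.gauss(2, 1)] for _ in range(163)]
y = [0] * 637 + [1] * 163

X_res, y_res = oversample(X, y)
print(y_res.count(0), y_res.count(1))
```

Only the training data would be oversampled in this way. The slide reports the distribution of y_train, which is consistent with the test set being left at its original class ratio.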
  • 14. Applying Machine Learning Algorithms
      Bank customer churn prediction is a binary classification problem. Models used:
      • Logistic Regression: a widely used method for binary classification. It models the probability of an event occurring, which makes it suitable when understanding the likelihood of a customer churning is essential.
      • Support Vector Machine (SVC): Support Vector Classification is a robust algorithm for classification tasks, especially when a clear separation between classes is needed. For customer churn prediction, it draws a decision boundary between loyal customers and potential churners.
      • Naive Bayes: a probabilistic classification algorithm known for its simplicity and efficiency. It assumes that features are independent of one another given the class, which simplifies the calculations. It is often used when simplicity and speed are crucial.
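To illustrate how logistic regression turns features into a churn probability, here is a minimal sketch. The weights, bias, feature choice, and 0.5 threshold are hypothetical values chosen for illustration; they are not taken from the trained model.

```python
import math


def churn_probability(features, weights, bias):
    """Logistic regression: the sigmoid of a weighted sum of the features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical weights for two standardized features (say, Age and Balance).
weights, bias = [0.8, 0.3], -1.5

p = churn_probability([1.2, -0.4], weights, bias)
label = "Churned" if p >= 0.5 else "Not Churned"
print(f"{p:.3f} -> {label}")
```

The output is always between 0 and 1, so it can be read as a probability and compared against a threshold to obtain the binary prediction.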
  • 15. Evaluation Metrics (all values in %)
      Without Oversampling (SMOTE)
      Model      Accuracy   Precision   Recall   F1-Score
      LOGI       81.2       59.62       23.58    33.80
      SVC        86.5       80.44       44.47    57.27
      NB         82.1       59.53       37.59    46.08
      With Oversampling (SMOTE)
      Model      Accuracy   Precision   Recall   F1-Score
      LOGI_OS    70.75      37.42       65.11    47.53
      SVC_OS     80.75      51.88       74.44    61.15
      NB_OS      71.70      38.91       68.55    49.64
      • Oversampling makes a substantial difference.
      • After oversampling, Accuracy and Precision decreased for all three models (for SVC, Accuracy fell from 86.5 to 80.75 and Precision from 80.44 to 51.88), while Recall and F1-Score increased (for SVC, Recall rose from 44.47 to 74.44 and F1-Score from 57.27 to 61.15).
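The four metrics in the table follow from the counts of a confusion matrix, as the sketch below shows. The counts used here are an assumption: they were inferred from the reported SVC percentages (without oversampling) for a 2,000-row test set containing 407 churners, and they do not appear in the slides.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 as percentages."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)   # share of predicted churners who really churned
    recall = tp / (tp + fn)      # share of actual churners the model caught
    f1 = 2 * precision * recall / (precision + recall)
    return tuple(round(100 * m, 2) for m in (accuracy, precision, recall, f1))


# ASSUMPTION: counts inferred from the reported SVC row, not given in the slides.
svc = classification_metrics(tp=181, fp=44, fn=226, tn=1549)
print(svc)
```

With these counts the function reproduces the reported SVC Accuracy, Precision, and Recall, and matches the reported F1-Score to within rounding. F1 is the harmonic mean of Precision and Recall, so it stays low whenever either of the two is low.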
  • 16. Model Selection and Considerations
      • SVC outperforms Logistic Regression and Naive Bayes on all four metrics (Accuracy, Precision, Recall, and F1-Score), both with and without oversampling.
      • Based on these metrics, SVC is the best-performing model overall. It has the highest F1-Score, meaning it achieves the best balance between Precision and Recall, which makes it suitable for our customer churn prediction task.
      • While Accuracy and Precision matter, Recall is particularly important in customer churn prediction, because it measures the share of actual churners the model identifies. Support Vector Classification gave us the best Recall.
      • Hence, we select Support Vector Classification as our final model, as it performs best on our bank customer churn problem.
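The comparison behind this choice can be reproduced from the reported metrics. The sketch below uses the values from the evaluation table; the dictionary layout and the helper name are illustrative choices, not part of the original project.

```python
# Reported metrics in % (accuracy, precision, recall, f1), copied from the table.
results = {
    "LOGI":    (81.2, 59.62, 23.58, 33.80),
    "SVC":     (86.5, 80.44, 44.47, 57.27),
    "NB":      (82.1, 59.53, 37.59, 46.08),
    "LOGI_OS": (70.75, 37.42, 65.11, 47.53),
    "SVC_OS":  (80.75, 51.88, 74.44, 61.15),
    "NB_OS":   (71.70, 38.91, 68.55, 49.64),
}
METRICS = ("accuracy", "precision", "recall", "f1")


def best_model(metric):
    """Return the name of the model with the highest value for the given metric."""
    column = METRICS.index(metric)
    return max(results, key=lambda name: results[name][column])


for metric in METRICS:
    print(metric, best_model(metric))
```

An SVC variant comes out on top for every metric: plain SVC leads on Accuracy and Precision, while the oversampled SVC leads on Recall and F1-Score.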
  • 17. Conclusion
      • Using the insights, patterns and trends found in our data, we applied machine learning to the challenge of predicting customer churn.
      • This project offers significant benefits to banks:
          • By predicting potential churners, banks can adopt proactive strategies to retain valuable customers. This involves personalized interventions, loyalty programs, and targeted communication to address customer concerns and enhance satisfaction.
          • By focusing efforts on customers at a higher risk of churn, banks can streamline operations, reduce marketing costs, and improve overall efficiency.
          • Anticipating and mitigating customer churn contributes directly to revenue optimization.
          • Understanding the factors that influence customer churn enables banks to tailor their services to individual needs. This personalization fosters stronger customer relationships, increases loyalty, and enhances the overall banking experience.