Default Payment Prediction System
Data Analysis and Predictive Analysis – R Programming and Azure ML
ASHISH ARORA
Introduction and Problem
• Banks play a significant role in providing
financial services that help people and
businesses achieve their goals and
reach their potential.
• To protect its integrity, a bank must avoid
lending to customers who are likely to default
and cause losses to the financial institution.
Purpose and Process
• To build a predictive model that helps
banks use their data efficiently to
make better decisions.
• A predictive analytics application allows
banks and other financial institutions to
identify risks and address them in real time
to reach better outcomes.
• A bank must be able to analyze the available data
about a customer before deciding
whether to issue a credit card.
• The model will use all available
factors and data to predict whether a
customer will fail or succeed in making the
next payment with reasonable accuracy. This
helps the bank before it makes any
decision about that customer. The goal is
to minimize the risk of loan losses.
Data Set
• https://archive.ics.uci.edu/ml/datasets/default+of+credit+card+clients
• 30000 rows
• Features in dataset = 25
• This dataset contains
information on default
payments, demographic
factors, credit data, history of
payment, and bill statements
of credit card clients in Taiwan
from April 2005 to September
2005.
• There are no missing data.
R Code – Description
And Results
• # Read the .csv file into the R environment
• creditcarddata <- read.csv("default of credit card clients.csv")
• dim(creditcarddata)
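After the read step, a quick sanity check of the dimensions and of missing values can confirm the figures quoted on the Data Set slide. The following is a minimal sketch in base R; because the real CSV file is not available here, it uses a small made-up data frame named `toy` (a hypothetical stand-in, not the actual data).

```r
# Toy stand-in for the real file; the values are made up for illustration.
toy <- data.frame(
  ID        = 1:4,
  LIMIT_BAL = c(20000, 120000, 90000, 50000),
  SEX       = c(2, 2, 1, 1),
  AGE       = c(24, 26, 34, 37)
)

# Rows and columns, as dim(creditcarddata) would report for the full data.
toy_dim <- dim(toy)

# Count missing values per column; all zeros means no missing data.
missing_per_column <- colSums(is.na(toy))
total_missing <- sum(missing_per_column)

print(toy_dim)
print(missing_per_column)
```

Running the same two checks on `creditcarddata` should report the row and column counts and, if the claim of no missing data holds, a total of zero.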
Data Set Summary
• There are two key variable categories in the
dataset.
• Nominal variables include sex, education,
marriage, repayment statuses (PAY_X), etc.
• Numeric variables include age, amount of
given credit (LIMIT_BAL), amount of bill
statements (BILL_AMT), and amount of
previous payments (PAY_AMT).
• The class variable (y) indicates whether the
customer defaulted on the payment the next
month. If yes, it is labeled 1;
otherwise, it is set to 0.
Structure of Data
Before Adding new
variables and
Tidying the Data
• This is the structure of the data
before the reshaping and
cleaning steps.
• New variables can be created
to improve the chances of
predicting defaulters.
• The SEX, EDUCATION and
MARRIAGE variables can be
converted from integer to
categorical data.
Structure of Data after
adding new variables
• Four new columns are added to
make the data set more
meaningful.
• The new columns are
work_status,
education_cat, MARRIAGE_cat
and SEX_cat.
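The integer-to-categorical conversion described above can be sketched with `factor()`. This is a minimal illustration on made-up rows, assuming the integer codes documented for the UCI dataset (SEX: 1 = male, 2 = female; EDUCATION: 1 = graduate school, 2 = university, 3 = high school, 4 = others; MARRIAGE: 1 = married, 2 = single, 3 = others). The slides do not say how `work_status` was derived, so it is left out.

```r
# Toy rows using the integer codes assumed above; values are made up.
toy <- data.frame(
  SEX       = c(2, 1, 2, 1),
  EDUCATION = c(2, 1, 3, 2),
  MARRIAGE  = c(1, 2, 2, 1)
)

# Map each integer code to a readable label.
toy$SEX_cat <- factor(toy$SEX, levels = c(1, 2),
                      labels = c("Male", "Female"))
toy$education_cat <- factor(toy$EDUCATION, levels = 1:4,
                            labels = c("Graduate school", "University",
                                       "High school", "Others"))
toy$MARRIAGE_cat <- factor(toy$MARRIAGE, levels = 1:3,
                           labels = c("Married", "Single", "Others"))

str(toy)
```

Any code not listed in `levels` becomes `NA`, so it is worth tabulating the raw columns first and deciding how to handle unexpected codes.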
Reshaping the Data
• Reshaping the data by converting
quantitative variables to new factor
variables.
• Factors are categorical variables that are
useful in summary statistics, plots, and
regressions. They act like dummy
variables that R codes for you.
• Removing variables that are not useful for
the analysis.
• Variables removed from the dataset are
PAY_0, PAY_2, PAY_3, PAY_4, PAY_5, PAY_6.
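Dropping the listed columns can be done by name in base R. The sketch below uses a made-up data frame `toy` that contains only some of the PAY_X columns, which also shows that the approach tolerates names that are absent.

```r
# Toy data with a subset of the columns; values are made up.
toy <- data.frame(
  LIMIT_BAL = c(20000, 120000),
  PAY_0     = c(2, -1),
  PAY_2     = c(2, 2),
  AGE       = c(24, 26)
)

# Columns named on the slide as removed from the dataset.
drop_cols <- c("PAY_0", "PAY_2", "PAY_3", "PAY_4", "PAY_5", "PAY_6")

# Keep every column whose name is not in drop_cols.
tidy <- toy[, !(names(toy) %in% drop_cols), drop = FALSE]

names(tidy)
```

Using `drop = FALSE` keeps the result a data frame even if only one column survives.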
Structure And Summary of Data After
Tidying the Data
Exploring Data Via
Basic Visualization
• There are more females than males in the
dataset.
• There are clients who finished university-level
education.
• There are more single clients than married ones,
but the numbers are quite close.
• More clients are employed than unemployed.
Limit Balance Distribution
Determining
Balance Limit
Variability By
Factors of Gender,
Education and
Work State
• After creating box plots, it is evident that gender has
no effect on the balance limits set by the bank.
• Education level and work status are the most
important factors that banks consider
when determining balance limits.
Relationship Between Marital Status &
Balance Limits Categorized By Gender
• This graph shows that
balance limits for females
remain almost the same
whether they are married
or single. For males, however,
balance limits change
considerably with marital status,
possibly because extra
expenditures lead to
higher balance
limits.
Relationship between Limit
Balance & Default Payment
• Balance limits and the count of
defaulted clients are almost the
same for the university and
graduate levels. Additionally,
the ratio of defaulted clients at the
high school level seems almost
the same as at the university and
graduate levels.
Balance Limits By Age
Groups & Education
• These box plots show that the
balance limit for individuals in
higher age groups
increases with their
education level.
Correlations Between Limit Balance,
Bill Amounts & Payments
• This correlation plot shows
that there is a low correlation
between the limit balances
and the payments and bill
amounts. However, the
bill amounts are
highly correlated with each
other, as expected, since the
bills reflect
cumulative amounts.
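A correlation matrix like the one behind this plot can be computed with `cor()`. The sketch below is illustrative only: the data frame `toy` holds made-up values, and only two bill columns and one payment column are included to keep it short.

```r
# Made-up values standing in for a few numeric columns of the dataset.
toy <- data.frame(
  LIMIT_BAL = c(20000, 120000, 90000, 50000, 50000, 500000),
  BILL_AMT1 = c(3900, 2700, 29000, 47000, 8600, 368000),
  BILL_AMT2 = c(3100, 1700, 14000, 48000, 5700, 412000),
  PAY_AMT1  = c(0, 0, 1500, 2000, 2000, 55000)
)

# Pearson correlation between every pair of columns.
corr <- cor(toy)

round(corr, 2)
```

On the full data, the off-diagonal entries among the BILL_AMT columns are the ones the slide describes as high.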
Is there any
variability in
defaulting payment
next month based
on gender,
education and
marital status?
• More males than females appear to default on payment,
and by education, clients whose highest qualification is high
school default more often.
• The marital status of a client doesn't show any variability.
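Because the groups differ in size, comparing default rates within each group is usually more informative than comparing raw counts. A minimal sketch with `table()` and `prop.table()` follows; the data frame `toy` and its values are made up and do not reproduce the deck's results.

```r
# Made-up records: group label and default flag (1 = defaulted).
toy <- data.frame(
  SEX_cat = c("Male", "Male", "Male", "Male",
              "Female", "Female", "Female", "Female", "Female", "Female"),
  default = c(1, 0, 0, 1, 0, 0, 1, 0, 0, 0)
)

# Cross-tabulate group against outcome.
counts <- table(toy$SEX_cat, toy$default)

# Row-wise proportions: share of each group that did or did not default.
rates <- prop.table(counts, margin = 1)

print(counts)
print(round(rates, 3))
```

The same pattern applies to `education_cat` and `MARRIAGE_cat` by swapping the grouping column.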
Model Building
• This section starts building
the model for predicting the
default payment outcome.
• Before building the model, the
dataset was divided into training
and test data sets.
• Train Data Set = 70%
• Test Data Set = 30%
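A 70/30 split of this kind can be sketched in base R with `sample()`. The slides do not show how the split was actually performed, so the random sampling, the seed, and the made-up data frame `toy` below are all assumptions.

```r
set.seed(42)  # arbitrary seed so the split is reproducible

# Made-up data: 100 rows with an id and a binary outcome.
toy <- data.frame(id = 1:100, y = rep(c(0, 1), times = c(78, 22)))

n <- nrow(toy)

# Draw 70% of the row indices at random, without replacement.
train_idx <- sample(seq_len(n), size = round(0.7 * n))

train <- toy[train_idx, ]
test  <- toy[-train_idx, ]

c(train = nrow(train), test = nrow(test))
```

If the share of defaulters should be preserved in both sets, a stratified split is a common alternative to plain random sampling.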
Model Building Using Azure ML
• The Model is trained using Two-Class Decision Forest.
The classification matrix or
the confusion matrix
• This classifies our predictions as true positive, false positive, false negative, or true negative.
• True Positive = The actual value is 1 (the client defaulted) and the predicted value is also 1.
• False Positive = The predicted value is 1 but the actual value is 0: we predicted the client
would default, and they did not.
• False Negative = We predicted the client would not default, and they defaulted.
• True Negative = We predicted the client would not default, and they did not default.
• Accuracy = The percentage of the test data set that is predicted correctly.
• Accuracy = (TP + TN) / Total = (662 + 6734) / (1329 + 275 + 662 + 6734) = 0.82
• Precision = When you predicted default, how likely are you to be correct?
• Precision = TP / (TP + FP) = 662 / (662 + 275) = 0.707
• Recall = Out of the clients who actually defaulted, the fraction you correctly predicted
would default.
• Recall = TP / (TP + FN) = 662 / (662 + 1329) = 0.332
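The arithmetic above can be reproduced directly from the four confusion-matrix counts given on the slide. A minimal sketch in base R (the variable names are chosen here for illustration):

```r
# Counts taken from the slide's confusion matrix.
tp <- 662    # predicted default, actually defaulted
fp <- 275    # predicted default, did not default
fn <- 1329   # predicted no default, actually defaulted
tn <- 6734   # predicted no default, did not default

total <- tp + fp + fn + tn

accuracy  <- (tp + tn) / total   # share of all test cases predicted correctly
precision <- tp / (tp + fp)      # share of predicted defaults that were real
recall    <- tp / (tp + fn)      # share of real defaults that were caught

round(c(accuracy = accuracy, precision = precision, recall = recall), 3)
```

The low recall relative to accuracy reflects that most test cases are non-defaulters, so a model can score well on accuracy while still missing about two thirds of the actual defaults.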
Conclusion
• This project involves predicting defaulters among
a bank's credit card customers.
• R programming is used for exploratory data
analysis and visualization.
• R and Azure ML are used for model building using
logistic regression and the Two-Class Decision
Forest algorithm.

Default payment prediction system

  • 1. Default Payment Prediction System Data Analysis and Predictive Analysis – R Programming and Azure ML ASHISH ARORA
  • 2. Introduction and Problem • Banks plays a significant role in providing financial services to help people and business to achieve their goals as well as reach their potential. • To keep the integrity Bank must avoid in investing wrong customers who can default and cause loss to the Financial Institution.
  • 3. Purpose and Process • To build a predictive model that can be used to help the Banks use their data efficiently to make better decisions. • A predictive analytics application allows the banks and other financial institutions to identify the risks and address them in real time to reach better outcomes. • Bank must able to analyze available data related to the customers before making the decision of issuing credit card. • The model developed will use all possible factors and data to predict whether the customer would fail or succeed in making the next payment with a rational accuracy. It would benefit the bank before they make any decisions against that customers. The target is to minimize the risk of having loan loss.
  • 4. Data Set • https://archive.ics.uci.edu/ml/ datasets/default+of+credit+ca rd+clients • 30000 rows • Features in dataset = 25 • This dataset contains information on default payments, demographic factors, credit data, history of payment, and bill statements of credit card clients in Taiwan from April 2005 to September 2005. • There are no missing data.
  • 5. R Code – Description And Results • # Read the .csv file into the R environment • creditcarddata <- read.csv("default of credit card clients.csv") • dim(creditcarddata)
  • 6. Data Set Summary • There are two key variable categories in the dataset. • Nominal variables include sex, education, marriage, repayment statuses (PAY_X), etc. • Numeric variables include age, amount of given credit (LIMIT_BAL), amount of bill statements (BILL_AMT), and amount of previous payments (PAY_AMT). • The class variable (y) indicates whether the customer defaulted on the payment the next month. If yes, it is labeled 1; otherwise, it is set to 0.
  • 7. Structure of Data Before Adding New Variables and Tidying the Data • This is the structure of the data before the reshaping and cleaning steps. • New variables can be created to improve the chances of predicting defaulters. • The SEX, EDUCATION and MARRIAGE variables can be converted from integer to categorical data.
  • 8. Structure of Data After Adding New Variables • Four new columns are added to make the data set more meaningful. • The new columns are work_status, education_cat, MARRIAGE_cat and SEX_cat.
  • 9. Reshaping the Data • The data is reshaped by converting quantitative variables to new factor variables. • Factors are categorical variables that are useful in summary statistics, plots, and regressions. They act like dummy variables that R codes for you. • Variables that are not useful for the analysis are removed. • The variables removed from the dataset are PAY_0, PAY_2, PAY_3, PAY_4, PAY_5 and PAY_6.
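The factor conversion and column removal described on slides 7 to 9 might look roughly like the sketch below. It runs on a small made-up data frame rather than the real file, and the level labels assume the coding in the UCI dataset description (SEX: 1 = male, 2 = female; MARRIAGE: 1 = married, 2 = single); the slides do not show the author's actual code.

```r
# Toy data frame standing in for creditcarddata; the values are made up.
toy <- data.frame(
  SEX = c(1L, 2L, 2L, 1L),
  MARRIAGE = c(1L, 2L, 2L, 1L),
  PAY_0 = c(0L, 2L, -1L, 0L),
  PAY_2 = c(0L, 2L, 0L, 0L),
  LIMIT_BAL = c(20000, 120000, 90000, 50000)
)

# Assumed coding (UCI dataset description): SEX 1 = male, 2 = female;
# MARRIAGE 1 = married, 2 = single.
toy$SEX_cat <- factor(toy$SEX, levels = c(1, 2),
                      labels = c("Male", "Female"))
toy$MARRIAGE_cat <- factor(toy$MARRIAGE, levels = c(1, 2),
                           labels = c("Married", "Single"))

# Drop the repayment-status columns named on the slide, if present.
drop_cols <- c("PAY_0", "PAY_2", "PAY_3", "PAY_4", "PAY_5", "PAY_6")
toy <- toy[, !(names(toy) %in% drop_cols), drop = FALSE]

str(toy)
```

Keeping the original integer columns next to the new `_cat` factors mirrors the slide's description of columns being added rather than replaced.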
  • 10. Structure And Summary of Data After Tidying the Data
  • 11. Exploring Data Via Basic Visualization • There are more females than males in the dataset. • There are clients who finished university-level education. • There are more single clients than married ones, but the numbers are quite close. • More clients are employed.
  • 13. Determining Balance Limit Variability By Factors of Gender, Education and Work Status • The box plots show that gender has no evident effect on the balance limits set by the bank. • Education level and work status are the most important factors that banks consider when determining balance limits.
  • 14. Relationship Between Marital Status & Balance Limits Categorized By Gender • In this graph, balance limits for females remain almost the same whether they are married or single. For males, however, they change considerably with marital status, possibly because extra expenditures lead to increased balance limits.
  • 15. Relationship Between Limit Balance & Default Payment • Balance limits and counts of defaulted clients are almost the same for the university and graduate levels. Additionally, the ratio of defaulted clients at the high school level seems almost the same as at the university and graduate levels.
  • 16. Balance Limits By Age Groups & Education • These box plots show that balance limits for individuals in higher age groups increase with their education status.
  • 17. Correlations Between Limit Balance, Bill Amounts & Payments • This correlation plot shows a low correlation between limit balances and both payments and bill amounts. However, the bill amounts are highly correlated with each other, as expected, since the bills reflect cumulative amounts.
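A correlation matrix like the one behind slide 17 can be computed with base R's `cor()`. The sketch below uses a few made-up rows with column names borrowed from the dataset; it illustrates the call only and does not reproduce the slide's figures.

```r
# Made-up limit, bill and payment amounts for five hypothetical clients.
toy <- data.frame(
  LIMIT_BAL = c(20000, 120000, 90000, 50000, 200000),
  BILL_AMT1 = c(3900, 2700, 29000, 47000, 8600),
  BILL_AMT2 = c(3100, 1700, 14000, 48000, 5700),
  PAY_AMT1  = c(0, 0, 1500, 2000, 2000)
)

# Pearson correlation matrix of the numeric columns.
corr <- cor(toy)
print(round(corr, 2))
```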
  • 18. Is there any variability in defaulting on next month's payment based on gender, education and marital status? • It seems that more males default on payment, and in terms of education, more clients whose last degree is high school default. • The marital status of the client doesn't show any variability.
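One way to check such group differences numerically is to average the 0/1 class variable within each group, which gives the default rate per group. This is a minimal sketch on made-up data; the column names `SEX_cat` and `default` are assumptions for illustration, not the author's code.

```r
# Made-up clients: group label and a 0/1 default indicator.
toy <- data.frame(
  SEX_cat = c("Male", "Male", "Female", "Female", "Female", "Male"),
  default = c(1, 0, 0, 0, 1, 1)
)

# The mean of a 0/1 indicator within a group is that group's default rate.
rate_by_sex <- tapply(toy$default, toy$SEX_cat, mean)
print(rate_by_sex)
```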
  • 19. Model Building • This section starts building the model for predicting the default payment outcome. • Before building the model, the dataset was divided into training and test data sets. • Train Data Set = 70% • Test Data Set = 30%
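A 70/30 split of this kind can be sketched in base R with `sample()`. The toy data frame and the seed value below are assumptions for illustration; the slides do not show how the author performed the split.

```r
set.seed(42)  # arbitrary seed so the split is reproducible

# Toy data standing in for the cleaned dataset.
toy <- data.frame(id = 1:10, default = rep(c(0, 1), times = 5))

# Randomly pick 70% of the row indices for training.
n <- nrow(toy)
train_idx <- sample(seq_len(n), size = round(0.7 * n))

train <- toy[train_idx, ]
test  <- toy[-train_idx, ]  # the remaining 30%

cat("train rows:", nrow(train), " test rows:", nrow(test), "\n")
```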
  • 20.
  • 21. Model Building Using Azure ML • The Model is trained using Two-Class Decision Forest.
  • 22. The classification matrix or the confusion matrix • This classifies our predictions as true positive, false positive, false negative, or true negative. • True Positive = the actual value is 1 and the predicted value is also 1: the client defaulted and we predicted the default. • False Positive = the predicted value is 1 but the actual value is 0: we predicted a default, but the client did not default. • False Negative = we predicted the client would not default, but they defaulted. • True Negative = we predicted the client would not default, and they did not default. • Accuracy = the percentage of the total test data set population that is predicted correctly. • Accuracy = (TP + TN) / Total = (662 + 6734) / (1329 + 275 + 662 + 6734) = 0.82 • Precision = when you predicted a default, how likely are you to be correct? • Precision = TP / (TP + FP) = 662 / (662 + 275) = 0.707 • Recall = out of all clients who actually defaulted, the fraction you correctly predicted would default. • Recall = TP / (TP + FN) = 662 / (662 + 1329) = 0.332
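The arithmetic on slide 22 can be checked with a few lines of R. The four counts are taken from the slide; the variable names are chosen here for illustration.

```r
# Confusion-matrix counts as reported on slide 22.
tp <- 662    # predicted default, actually defaulted
fp <- 275    # predicted default, did not default
fn <- 1329   # predicted no default, actually defaulted
tn <- 6734   # predicted no default, did not default

total <- tp + fp + fn + tn

accuracy  <- (tp + tn) / total  # share of all predictions that are correct
precision <- tp / (tp + fp)     # share of predicted defaults that are real
recall    <- tp / (tp + fn)     # share of real defaults that were caught

print(round(c(accuracy = accuracy, precision = precision, recall = recall), 3))
```

The gap between accuracy (about 0.82) and recall (about 0.33) reflects that most test clients did not default, so correctly predicted non-defaulters dominate the accuracy figure.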
  • 23. Conclusion • This project involves predicting defaulters among a bank's credit card customers. • R programming is used for exploratory data analysis and visualization. • R and Azure ML are used for model building with the Logistic Regression and Two-Class Decision Forest algorithms.