SFbayACM.org Data Science Camp 
Saturday, October 25, 2014 
Greg Makowski 
Twitter Tag #DMCAMP
 Customer Description – CC Company – 
“Who” vs. “How” to Talk to Customers 
 Hotel Price Optimization – Using 
Clusters as Non-Linear Constraints 
 Retail Supply Chain – Planning 
Replenishment for 52 Week Demand 
Curves
 Context: 
◦ Major credit card company 
◦ South American Market 
◦ Repeat for Argentina, Brazil… and “dollar countries” 
 Objectives or Problem: 
◦ How to best manage the customer population 
◦ Develop a software system, to repeat over geography 
and time 
◦ How to AUTOMATE understanding? 
 How to automate naming the clusters?
 Solution, 3 projects for each customer base 
◦ “WHO” to talk to… 
 Customer Attrition Model – Neural Network (5 algorithms tested) 
 Decrease in spending over time 
 Basic vs. Supplemental Cards 
 By 7 categories 
 Challenge: Double-digit inflation in some countries (1990s) 
 Standardize by monthly spending 
 Mining Factoid: Credit Card Digit 11 was predictive 
 Billing cycles? Monthly salaries + high inflation 
 Customer Profitability – Net Present Value 
◦ “HOW” to talk to them… 
 Cluster Analysis
[Figure: cumulative profit by 5% customer groups (axis ticks 1 to 19), Tree-Net model vs. random selection; total profit per cell, with attrition and profitability series]
 Select the 5-15% of customers “highest” in the spike 
 83% of attrition profit was lost in the top 15%
 How to design the cluster analysis? 
◦ Select top fields from neural network 
 Sensitivity Analysis on the NN 
 % spending by category 
 Restaurant, Retail, Grocery, Hotel, Air, Auto, … 
 Trend over time (slope, expected future value) 
 Decide to create 8 – 12 clusters or customer segments 
to communicate to marketers 
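The cluster inputs listed above (share of spending by category, plus a trend over time) can be sketched as follows. This is a minimal illustration, not the original system: the function name `spending_features`, the use of a straight-line fit for the trend, and the sample numbers are all assumptions.

```python
import numpy as np

def spending_features(monthly_spend):
    """Build cluster inputs for one customer.

    monthly_spend: array of shape (months, categories), hypothetical layout.
    """
    spend = np.asarray(monthly_spend, dtype=float)
    totals = spend.sum(axis=1)                 # total spend per month
    shares = spend.sum(axis=0) / totals.sum()  # % spending by category
    months = np.arange(len(totals))
    # Straight-line trend of total spend over time (an assumed trend model)
    slope, intercept = np.polyfit(months, totals, 1)
    # Extrapolate one month past the observed window
    expected_next = intercept + slope * len(totals)
    return shares, slope, expected_next

# Hypothetical customer: 3 months x 3 categories (restaurant, retail, grocery)
shares, slope, expected_next = spending_features(
    [[100, 50, 50], [110, 55, 55], [120, 60, 60]]
)
```

Working with shares rather than raw amounts keeps the features comparable across customers with very different spending levels.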
 Consider Scalability 
◦ 100k – 500k customers 
◦ Some cluster methods are O(n), others O(n²) 
◦ Use K-means to create 100 clusters, O(n) 
◦ Then use O(n²) methods to reduce from 100 clusters 
down to 8-12 clusters 
◦ This uses all the data scalably, and applies the more 
sophisticated hierarchical cluster search only to the 100 centroids 
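The two-stage idea can be sketched with NumPy alone. Everything specific here is an assumption made for a self-contained, deterministic example: the farthest-first seeding, the centroid-linkage merge rule, the function names, and the synthetic three-blob data. In practice a library K-means and hierarchical clustering routine would normally be used.

```python
import numpy as np

def farthest_first_init(X, k):
    """Deterministic seeding: start at X[0], then add the point farthest from all seeds."""
    seeds = [0]
    d = np.linalg.norm(X - X[0], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(d))
        seeds.append(nxt)
        d = np.minimum(d, np.linalg.norm(X - X[nxt], axis=1))
    return X[seeds].copy()

def assign(X, centroids):
    """Label each row of X with the index of its nearest centroid."""
    dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dist.argmin(axis=1)

def kmeans(X, k, n_iter=20):
    """Stage 1: Lloyd's K-means, linear in the number of rows per iteration."""
    centroids = farthest_first_init(X, k)
    for _ in range(n_iter):
        labels = assign(X, centroids)
        for j in range(k):
            members = X[labels == j]
            if len(members):              # an empty cluster keeps its old centroid
                centroids[j] = members.mean(axis=0)
    return centroids, assign(X, centroids)

def merge_centroids(centroids, sizes, n_final):
    """Stage 2: repeatedly merge the two closest centroids (size-weighted).

    Quadratic distance work per merge, but only on the micro-cluster
    centroids, never on the full customer table.
    """
    cents = [np.asarray(c, dtype=float) for c in centroids]
    wts = [max(float(s), 1e-9) for s in sizes]   # floor avoids dividing by zero
    groups = [[i] for i in range(len(cents))]
    while len(cents) > n_final:
        C = np.array(cents)
        D = np.linalg.norm(C[:, None, :] - C[None, :, :], axis=2)
        np.fill_diagonal(D, np.inf)
        a, b = np.unravel_index(np.argmin(D), D.shape)
        a, b = sorted((int(a), int(b)))
        w = wts[a] + wts[b]
        cents[a] = (wts[a] * cents[a] + wts[b] * cents[b]) / w
        wts[a] = w
        groups[a] += groups[b]
        del cents[b], wts[b], groups[b]
    mapping = np.empty(len(centroids), dtype=int)
    for g, members in enumerate(groups):
        mapping[members] = g
    return mapping

# Synthetic stand-in for customer data: three well-separated groups
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [100.0, 0.0], [0.0, 100.0]])
X = np.vstack([c + rng.normal(size=(200, 2)) for c in centers])

micro_centroids, micro_labels = kmeans(X, k=12)           # many small clusters
sizes = np.bincount(micro_labels, minlength=12)
mapping = merge_centroids(micro_centroids, sizes, n_final=3)
segment = mapping[micro_labels]                            # final segment per row
```

The same pattern scales to the deck's numbers by setting `k=100` in the first stage and `n_final` to somewhere between 8 and 12 in the second.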
[Figure: one-page cluster summary table. Columns: ALL customers (100%), then clusters 1, 2, … N ordered from most customers (36%, 22%, …) to least (5%). Rows: fields, ordered by importance. Cell values range from min to MAX.]
 For each cluster, list the “Most” fields (Var X, Y, Z) and the “Least” fields (Var A, B, C) 
 May have 12 clusters, 36 variables 
 Then each cluster may have 6 attributes 
to use in naming
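One way to automate the "Most" and "Least" lists used in naming is to compare each cluster's mean with the overall mean, field by field. The sketch below assumes a z-score style comparison and a hypothetical function name `describe_clusters`; the deck does not say how the original system ranked fields.

```python
import numpy as np

def describe_clusters(X, labels, fields, n_top=3):
    """For each cluster, return the fields furthest above and below the overall mean."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0                      # constant fields get z = 0
    summary = {}
    for c in np.unique(labels):
        z = (X[labels == c].mean(axis=0) - mean) / std
        order = np.argsort(z)                # ascending: lowest z first
        summary[int(c)] = {
            "most": [fields[i] for i in order[::-1][:n_top]],
            "least": [fields[i] for i in order[:n_top]],
        }
    return summary

# Hypothetical data: 4 customers, 4 category fields, 2 clusters
fields = ["air", "grocery", "hotel", "retail"]
X = [[9, 1, 5, 4], [11, 1, 5, 6], [1, 9, 5, 6], [1, 11, 5, 4]]
labels = [0, 0, 1, 1]
summary = describe_clusters(X, labels, fields, n_top=1)
```

With `n_top=3`, each cluster gets three "most" and three "least" fields, which matches the six naming attributes per cluster mentioned above.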
 Select “WHO” with (Attrition)x(Profitability) 
 Select “HOW” with Cluster Segments 
◦ Given the variable selection, only a few clusters 
matched most of the 15% subset of the customers to 
manage 
 Marketers could understand well the different 
audiences and reasons for attrition – and 
could better write copy for communication 
 About 50 Executives walked around with the 
one page cluster summary in their pocket, 
frequently used to plan customer strategies
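The "WHO" selection can be sketched as ranking customers by attrition probability times profitability and keeping the top slice. The function name `select_who`, the 15% default, and the sample scores are assumptions; the attrition scores and net present values are taken as already produced by the upstream models.

```python
import numpy as np

def select_who(p_attrition, npv, frac=0.15):
    """Rank by expected lost profit and keep the top `frac` of customers."""
    expected_loss = np.asarray(p_attrition, dtype=float) * np.asarray(npv, dtype=float)
    n_sel = max(1, int(round(frac * len(expected_loss))))
    order = np.argsort(-expected_loss)       # highest expected loss first
    selected = order[:n_sel]
    share = expected_loss[selected].sum() / expected_loss.sum()
    return selected, share

# Hypothetical scores for 20 customers: three are both risky and valuable
p_attrition = np.array([0.9, 0.8, 0.7] + [0.05] * 17)
npv = np.array([1000.0, 500.0, 400.0] + [100.0] * 17)
selected, share = select_who(p_attrition, npv)
```

Here `share` is the fraction of total expected lost profit concentrated in the selected customers, the same kind of figure as the 83% quoted for the top 15%.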
[Figure: analysis types by CRM behavior, with media and message. Labels shown: $$$, Best Customers, Upgrade/Downgrade, Loyal, Loyalty, Cross-Sell, Prospect, Segment, Reactivation, Attrition, Retention, Fraud]
 Hotel Price Optimization – Using 
Clusters as Non-Linear Constraints
 Objective: 
◦ Optimize pricing for hotel rooms 
◦ Take into account geography & use 
 weekend, vacation, business, conference, … 
 Seasons of the year as it relates to demand 
 The hotel owns many brands (chains) focused 
on different audiences 
◦ Different price tiers, target audiences,… 
◦ Hotel, motel, extended stay, … 
◦ What “lessons learned” cross brands?
 Revenue Management is a general process used to 
◦ optimize profit 
◦ given the remaining inventory (plane seats or hotel rooms) 
◦ and the remaining time until that inventory is gone 
 Operations Research 
◦ Linear or Non-Linear Programming 
 Linear or non-linear in either the constraints or the objective function 
◦ Need an objective function to optimize 
 Train predictive models to forecast price, given 
conditions
 Data Mining and Operations Research Design 
◦ When training predictive models, it helps to learn 
behavior “in the same ball park” with the same 
model. 
◦ If the underlying thought process is fairly different, 
subdivide the data into different subsets and train 
different models. For example: 
 Attrition: checking, credit card, line of credit, mortgage 
 In Mortgage Bond Pricing: monthly prepayment of 
none vs. 100’s vs. 1,000’s vs. a full refinance
 How do we group or divide individual hotels, 
given all the attributes? 
◦ Brand, location, % utilization weekday or weekend, … 
 Find bottom-up clusters, rather than top-down 
assertions on the data 
 For cluster variables – use best variables in 
pricing predictive models (sound familiar?)
 Solution: 
◦ 1) Build an initial predictive model predicting 
pricing. Find the most important variables. 
◦ 2) Create 8-16 clusters, using those variables 
◦ 3) Within each cluster 
 A) Train a predictive model for use as the OR objective 
function 
 B) Run a LINEAR OR price optimization, on the data 
subset
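Step 3A can be sketched as fitting one simple model per cluster, so that a set of linear pieces stands in for a non-linear relationship across all hotels. The sketch assumes an ordinary least-squares linear model, a single hypothetical feature (occupancy), and made-up numbers; the optimization in step 3B is not shown.

```python
import numpy as np

def fit_per_cluster(features, target, labels):
    """Fit a separate linear model (intercept + coefficients) inside each cluster."""
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    labels = np.asarray(labels)
    models = {}
    for c in np.unique(labels):
        mask = labels == c
        A = np.column_stack([np.ones(mask.sum()), features[mask]])
        coef, *_ = np.linalg.lstsq(A, target[mask], rcond=None)
        models[int(c)] = coef
    return models

def predict(models, features, labels):
    """Score each row with the model belonging to that row's cluster."""
    features = np.asarray(features, dtype=float)
    A = np.column_stack([np.ones(len(features)), features])
    return np.array([A[i] @ models[int(c)] for i, c in enumerate(labels)])

# Hypothetical data: two hotel clusters with different price behavior
occupancy = np.array([10.0, 20.0, 30.0, 40.0, 10.0, 20.0, 30.0, 40.0])
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
price = np.where(labels == 0, 50 + 2.0 * occupancy, 200 + 0.5 * occupancy)

models = fit_per_cluster(occupancy, price, labels)
fitted = predict(models, occupancy, labels)
```

A single linear model over all eight rows could not reproduce both price patterns, which is the motivation for subdividing before modeling.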
 Retail Supply Chain – Planning 
Replenishment for 52 Week Demand 
Curves
 The “Retail Supply Chain” is from 
◦ the manufacturer to 
◦ distribution center to 
◦ Warehouse to 
◦ Store to Consumer 
 Replenishment is to re-supply products on the 
shelves 
◦ Minimize overstock and understock 
◦ Heavy understock causes LOSS OF SALES 
◦ Heavy overstock causes 30% end of season liquidation
 4,000 stores 
 100,000 products/SKU’s (stock keeping units) 
◦ 400 million store-product combinations 
 52 weeks per year 
◦ 20.8 billion store-product-week combinations 
 Not the smallest problem in the mid-90’s 
 Holidays shift in week number, from year to 
year – need to adjust
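The size of the problem follows directly from the counts above. The storage estimate at the end assumes 4 bytes per weekly value, which is an illustrative figure and not something stated in the talk.

```python
stores = 4_000
skus = 100_000
weeks = 52

store_sku = stores * skus               # store-product combinations
store_sku_week = store_sku * weeks      # store-product-week combinations

BYTES_PER_VALUE = 4                     # assumption: one 32-bit number per week
raw_gb = store_sku_week * BYTES_PER_VALUE / 1e9

print(f"{store_sku:,} store-SKU pairs")
print(f"{store_sku_week:,} store-SKU-week values")
print(f"about {raw_gb:.1f} GB if every value is stored")
```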
 End up creating 2,000+ “profiles” or 
centroids 
 Assign new store-SKUs to an existing profile 
 If a new store-SKU doesn’t match any profile (within a radius)… 
◦ Re-run cluster analysis 
◦ Lock existing centroids 
◦ Create new centroids for data points outside the radius 
◦ Add to the “profile library”
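The profile library update can be sketched as nearest-centroid matching with a radius test. As a simplification, a curve that matches nothing becomes a new centroid on its own (a leader-style rule), which stands in for the re-run of cluster analysis described above. The function name, the Euclidean distance, and the sample curves are assumptions.

```python
import numpy as np

def assign_to_library(library, curves, radius):
    """Match each 52-week curve to a profile; grow the library when nothing is close.

    Existing profiles are never moved (they are "locked").
    """
    library = [np.asarray(p, dtype=float) for p in library]
    assignments = []
    for curve in np.asarray(curves, dtype=float):
        dists = [np.linalg.norm(curve - p) for p in library]
        best = int(np.argmin(dists))
        if dists[best] > radius:
            library.append(curve.copy())     # new centroid for the outlier
            best = len(library) - 1
        assignments.append(best)
    return np.array(library), assignments

weeks = np.arange(52)
flat = np.ones(52)                           # steady seller
ramp = weeks / 51.0                          # demand rising through the year
library = [flat, ramp]

near_flat = flat + 0.01                      # close to an existing profile
holiday = np.zeros(52)
holiday[47:] = 5.0                           # spike in the last five weeks

new_library, assignments = assign_to_library(library, [near_flat, holiday], radius=1.0)
```

The first curve reuses profile 0, while the holiday curve falls outside every radius and is added as profile 2.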
 Bottom-up findings (after the fact) 
◦ Buying hunting-related items as the ducks migrate 
north
Three case studies deploying cluster analysis
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
Biometric Authentication: The Evolution, Applications, Benefits and Challenge...
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 

Three case studies deploying cluster analysis

  • 1. SFbayACM.org Data Science Camp, Saturday, October 25, 2014. Greg Makowski. Twitter Tag #DMCAMP
  • 2.
      ▪ Customer Description – CC Company – “Who” vs. “How” to Talk to Customers
      ▪ Hotel Price Optimization – Using Clusters as Non-Linear Constraints
      ▪ Retail Supply Chain – Planning Replenishment for 52 Week Demand Curves
  • 3.
      ▪ Context:
          ◦ Major credit card company
          ◦ South American Market
          ◦ Repeat for Argentina, Brazil… and “dollar countries”
      ▪ Objectives or Problem:
          ◦ How to best manage the customer population
          ◦ Develop a software system, to repeat over geography and time
          ◦ How to AUTOMATE understanding?
              - How to automate naming the clusters?
  • 4.
      ▪ Solution, 3 projects for each customer base
          ◦ “WHO” to talk to…
              - Customer Attrition Model – Neural Network (5 algs tested)
                  - Decrease in spending over time
                  - Basic vs. Supplemental Cards
                  - By 7 categories
                  - Challenge: Double digit inflation in some countries (90’s)
                      - Standardize by monthly spending
                  - Mining Factoid: Credit Card Digit 11 was predictive
                      - Billing cycles? Monthly salaries + high inflation
              - Customer Profitability – Net Present Value
          ◦ “HOW” to talk to them…
              - Cluster Analysis
  • 5.
      ▪ Consider Scalability
          ◦ 100k – 500k customers
          ◦ Some cluster methods are O(n) or O(n²)
          ◦ Use k-means to create 100 clusters, O(n)
          ◦ Then use O(n²) methods to reduce from 100 clusters down to 8-12 clusters
  • 6. [Chart: cumulative profit and total profit per cell across 5% customer groups, Tree-Net vs. Random; series labeled Attrition and Profitability] Select the 5-15% customers “highest” in the spike. 83% of Attrition Profit was Lost in top 15%.
  • 7.
      ▪ How to design the cluster analysis?
          ◦ Select top fields from neural network
              - Sensitivity Analysis on the NN
              - % spending by category
                  - Restaurant, Retail, Grocery, Hotel, Air, Auto, …
              - Trend over time (slope, expected future value)
      ▪ Decide to create 8 – 12 clusters or customer segments to communicate to marketers
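The two kinds of cluster input named on slide 7, percent of spending by category and trend over time, can be sketched as follows. This is only an illustration: the function names, the sample figures, and the three-month look-ahead are assumptions, and a plain least-squares line stands in for whatever trend estimate the project actually used.

```python
def category_shares(spend_by_category):
    """Convert raw spending per category into shares that sum to 1."""
    total = sum(spend_by_category.values())
    if total == 0:
        return {c: 0.0 for c in spend_by_category}
    return {c: v / total for c, v in spend_by_category.items()}

def spending_trend(monthly_totals, months_ahead=3):
    """Least-squares slope of monthly spending and a linear extrapolation."""
    n = len(monthly_totals)
    xs = list(range(n))
    mean_x = sum(xs) / n
    mean_y = sum(monthly_totals) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, monthly_totals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Expected future value: extend the fitted line past the last month.
    expected = intercept + slope * (n - 1 + months_ahead)
    return slope, expected

# Hypothetical customer: one month of spending by category, four monthly totals.
shares = category_shares({"Restaurant": 50.0, "Retail": 30.0, "Grocery": 20.0})
slope, expected = spending_trend([100.0, 90.0, 80.0, 70.0])
```

Working with shares rather than raw amounts also fits the deck's note about standardizing by monthly spending in high-inflation markets.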
  • 8.
      ▪ Consider Scalability
          ◦ 100k – 500k customers
          ◦ Some cluster methods are O(n) or O(n²)
          ◦ Use k-means to create 100 clusters, O(n)
          ◦ Then use O(n²) methods to reduce from 100 clusters down to 8-12 clusters
  • 9.
      ▪ Consider Scalability
          ◦ 100k – 500k customers
          ◦ Some cluster methods are O(n) or O(n²)
          ◦ Use k-means to create 100 clusters, O(n)
          ◦ Then use O(n²) methods to reduce from 100 clusters down to 8-12 clusters
          ◦ This uses all the data scalably, and allows a more sophisticated hierarchical cluster search
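A minimal sketch of the two-stage idea on slide 9: a linear-time k-means pass produces many micro-clusters, then a quadratic-cost agglomerative step merges their centroids down to a handful of segments. Everything here is an assumption made for illustration: the toy 2-D data, 12 micro-clusters reduced to 3 (the deck uses 100 reduced to 8-12), the strided initialization, and centroid linkage as the merge rule. The deck does not say which hierarchical method was used.

```python
import random

def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid_of(points):
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def nearest(p, centroids):
    return min(range(len(centroids)), key=lambda c: dist2(p, centroids[c]))

def kmeans(points, k, iters=10):
    """Plain Lloyd's k-means; each pass is linear in the number of points."""
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]  # demo-only strided init
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        centroids = [centroid_of(g) if g else centroids[i]
                     for i, g in enumerate(groups)]
    return centroids, [nearest(p, centroids) for p in points]

def merge_to_segments(centroids, labels, target):
    """Agglomerative merge of micro-cluster centroids (size-weighted)."""
    sizes = [labels.count(i) for i in range(len(centroids))]
    clusters = [(c, n, [i]) for i, (c, n) in enumerate(zip(centroids, sizes)) if n > 0]
    while len(clusters) > target:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        a, b = min(pairs, key=lambda ij: dist2(clusters[ij[0]][0], clusters[ij[1]][0]))
        (ca, na, ma), (cb, nb, mb) = clusters[a], clusters[b]
        merged = tuple((x * na + y * nb) / (na + nb) for x, y in zip(ca, cb))
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append((merged, na + nb, ma + mb))
    micro_to_segment = {m: s for s, (_, _, members) in enumerate(clusters)
                        for m in members}
    return [micro_to_segment[l] for l in labels]

# Toy data: three well-separated groups of 100 points each.
rng = random.Random(0)
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1))
          for cx, cy in centers for _ in range(100)]
micro_centroids, micro_labels = kmeans(points, k=12)
segments = merge_to_segments(micro_centroids, micro_labels, target=3)
```

The expensive pairwise search runs only over the micro-cluster centroids, so its cost depends on the number of micro-clusters rather than on the 100k to 500k customers.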
  • 10. [Diagram: fields, ordered by importance, against clusters ALL, 1, 2, … N, ordered from most customers to least (100%, 36%, 22% → 5%)]
  • 11. [Same diagram, with a legend running from min to MAX]
  • 12. [Same diagram, annotated] Most: Var X, Y, Z. Least: Var A, B, C. May have 12 clusters, 36 variables. Then each cluster may have 6 attributes to use in naming.
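One way to automate the "Most / Least" annotation on slide 12 is to rank each field by how far a cluster's mean sits from the overall mean, in standard deviations, and keep the highest and lowest few. The sketch below assumes that scoring rule; the deck does not state the actual criterion, and the field names and data are hypothetical.

```python
from statistics import mean, pstdev

def describe_clusters(rows, labels, top=3):
    """For each cluster, list the fields most above and most below average."""
    fields = list(rows[0])
    overall = {f: (mean([r[f] for r in rows]), pstdev([r[f] for r in rows]))
               for f in fields}
    summary = {}
    for cluster in sorted(set(labels)):
        members = [r for r, l in zip(rows, labels) if l == cluster]
        z = {}
        for f in fields:
            mu, sd = overall[f]
            # Assumed score: standardized gap between cluster and overall mean.
            z[f] = (mean([m[f] for m in members]) - mu) / sd if sd > 0 else 0.0
        ranked = sorted(fields, key=lambda f: z[f], reverse=True)
        summary[cluster] = {"most": ranked[:top], "least": ranked[-top:][::-1]}
    return summary

# Hypothetical spending shares for four customers in two clusters.
rows = [
    {"restaurant": 0.6, "grocery": 0.1, "air": 0.3},
    {"restaurant": 0.5, "grocery": 0.2, "air": 0.3},
    {"restaurant": 0.1, "grocery": 0.6, "air": 0.3},
    {"restaurant": 0.2, "grocery": 0.5, "air": 0.3},
]
labels = [0, 0, 1, 1]
names = describe_clusters(rows, labels, top=1)
```

With `top=3` this yields the six attributes per cluster that the slide suggests using when naming a segment.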
  • 13.
      ▪ Select “WHO” with (Attrition) × (Profitability)
      ▪ Select “HOW” with Cluster Segments
          ◦ Given the variable selection, only a few clusters matched most of the 15% subset of the customers to manage
      ▪ Marketers could understand well the different audiences and reasons for attrition – and could better write copy for communication
      ▪ About 50 Executives walked around with the one page cluster summary in their pocket, frequently used to plan customer strategies
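The "WHO" and "HOW" selection on slide 13 can be sketched as a score-and-rank step followed by a count per segment. The field names (`p_attrition`, `npv`, `segment`), the sample customers, and the 40% cut-off used to keep the example small are all assumptions; the deck targets roughly the top 15%.

```python
from collections import Counter

def select_who(customers, fraction=0.15):
    """Rank by attrition probability times net present value; keep the top slice."""
    ranked = sorted(customers, key=lambda c: c["p_attrition"] * c["npv"], reverse=True)
    keep = max(1, round(len(ranked) * fraction))
    return ranked[:keep]

# Hypothetical customers with model outputs and a cluster segment name.
customers = [
    {"id": "A", "p_attrition": 0.90, "npv": 1000.0, "segment": "diner"},
    {"id": "B", "p_attrition": 0.10, "npv": 5000.0, "segment": "traveler"},
    {"id": "C", "p_attrition": 0.80, "npv": 200.0, "segment": "diner"},
    {"id": "D", "p_attrition": 0.50, "npv": 3000.0, "segment": "traveler"},
    {"id": "E", "p_attrition": 0.05, "npv": 100.0, "segment": "grocery"},
]
who = select_who(customers, fraction=0.4)
# "HOW": which segments the selected customers fall into.
how = Counter(c["segment"] for c in who)
```

Multiplying the two scores favors customers who are both likely to leave and costly to lose, which is why customer B, profitable but stable, is not selected here.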
  • 14. [Diagram, labels only: Analysis Type, CRM, Behavior, Media, Message, $$$, Best Customers, Upgrade, Downgrade, Loyal, Loyalty, Cross-Sell, Prospect, Segment, Reactivation, Attrition, Retention, Fraud]
  • 15.
      ▪ Customer Description – CC Company – “Who” vs. “How” to Talk to Customers
      ▪ Hotel Price Optimization – Using Clusters as Non-Linear Constraints
      ▪ Retail Supply Chain – Planning Replenishment for 52 Week Demand Curves
  • 16.
      ▪ Objective:
          ◦ Optimize pricing for hotel rooms
          ◦ Take into account geography & use
              - weekend, vacation, business, conference, …
              - Seasons of the year as it relates to demand
      ▪ The hotel owns many brands (chains) focused on different audiences
          ◦ Different price tiers, target audiences, …
          ◦ Hotel, motel, extended stay, …
          ◦ What “lessons learned” cross brands?
  • 17.
      ▪ Revenue Management is a general process used to
          ◦ optimize profit
          ◦ given the remaining (plane seat or hotel room) inventory
          ◦ the remaining time until the inventory is gone
      ▪ Operations Research
          ◦ Linear or Non-Linear Programming
              - Lin or Non-Lin in either constraints or objective function
          ◦ Need an objective function to optimize
              - Train predictive models to forecast price, given conditions
  • 18.
      ▪ Data Mining and Operations Research Design
          ◦ When training predictive models, it helps to learn behavior “in the same ball park” with the same model.
          ◦ If the underlying thought process is fairly different, subdivide the data into different subsets and train different models. For example:
              - Attrition: checking, credit card, line of credit, mortgage
              - In Mortgage Bond Pricing: monthly prepayment of none vs. 100’s vs. 1,000’s vs. a full refinance
  • 19.
      ▪ How do we group or divide individual hotels, given all the attributes?
          ◦ Brand, location, % utilization weekday or weekend, …
      ▪ Find bottom-up clusters, rather than top-down assertions on the data
      ▪ For cluster variables – use best variables in pricing predictive models (sound familiar?)
  • 20.
      ▪ Solution:
          ◦ 1) Build an initial predictive model predicting pricing. Find the most important variables.
          ◦ 2) Create 8-16 clusters, using those variables
          ◦ 3) Within each cluster
              - A) Train a predictive model for use as the OR objective function
              - B) Run a LINEAR OR price optimization, on the data subset
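Step 3A on slide 20, and the rationale on slide 18, amount to fitting one model per cluster instead of one pooled model. The sketch below illustrates only that step, with a one-variable least-squares line standing in for the predictive model; the cluster names, the occupancy and price figures, and the choice of a linear model are assumptions, and the optimization in step 3B is not shown.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x

def sse(xs, ys, model):
    """Sum of squared errors of a fitted line on (xs, ys)."""
    slope, intercept = model
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Hypothetical hotels: % occupancy vs. room price, in two clusters whose
# pricing behaves differently.
occupancy = [50.0, 60.0, 70.0, 80.0, 90.0]
data = {
    "business": (occupancy, [200.0, 220.0, 240.0, 260.0, 280.0]),
    "resort": (occupancy, [250.0, 240.0, 230.0, 220.0, 210.0]),
}

# One pooled model over all hotels.
all_x = [x for xs, _ in data.values() for x in xs]
all_y = [y for _, ys in data.values() for y in ys]
pooled_sse = sse(all_x, all_y, fit_line(all_x, all_y))

# One model per cluster; each would feed its own optimization run.
models = {name: fit_line(xs, ys) for name, (xs, ys) in data.items()}
segmented_sse = sum(sse(xs, ys, models[name]) for name, (xs, ys) in data.items())
```

In this toy data the pooled line averages two opposite price responses and fits neither cluster, while each per-cluster line fits its own subset almost exactly.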
  • 21.
      ▪ Customer Description – CC Company – “Who” vs. “How” to Talk to Customers
      ▪ Hotel Price Optimization – Using Clusters as Non-Linear Constraints
      ▪ Retail Supply Chain – Planning Replenishment for 52 Week Demand Curves
  • 22.
      ▪ The “Retail Supply Chain” is from
          ◦ the manufacturer to
          ◦ distribution center to
          ◦ warehouse to
          ◦ store to consumer
      ▪ Replenishment is to re-supply products on the shelves
          ◦ Minimize overstock and understock
          ◦ Heavy understock causes LOSS OF SALES
          ◦ Heavy overstock causes 30% end of season liquidation
  • 23.
      ▪ 4,000 stores
      ▪ 100,000 products/SKUs (stock keeping units)
          ◦ 400 million store-product combinations
      ▪ 52 weeks per year
          ◦ 20.8 billion store-product-week combinations
      ▪ Not the smallest problem in the mid-90’s
      ▪ Holidays shift in week number, from year to year – need to adjust
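The problem size on slide 23 follows from simple multiplication, reproduced here as a check. The 4 bytes per stored value used for the storage estimate is an assumption added for scale and is not from the deck.

```python
stores = 4_000
skus = 100_000
weeks_per_year = 52

store_sku = stores * skus                    # 400,000,000 store-product pairs
store_sku_week = store_sku * weeks_per_year  # 20,800,000,000 weekly cells

# Assumed 4 bytes per weekly value, to give a sense of raw storage.
bytes_per_value = 4
raw_gigabytes = store_sku_week * bytes_per_value / 1e9
```

Under that assumption a full year of weekly values comes to roughly 83 GB before any indexing, which helps explain the "not the smallest problem in the mid-90’s" remark.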
  • 24.
  • 25.
      ▪ End up creating 2,000+ “profiles” or centroids
      ▪ Assign new store-SKU’s to an existing profile
      ▪ If it doesn’t match (within a radius)…
          ◦ Re-run cluster analysis
          ◦ Lock existing centroids
          ◦ Create new centroids for data points outside
          ◦ Add to the “profile library”
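The profile-library update on slide 25 can be sketched as a nearest-centroid lookup with a distance threshold. In this simplified version a curve that falls outside the radius of every existing profile seeds a new profile directly, which stands in for re-running the cluster analysis on the unmatched points; the function name, the 4-week curves (52 weeks in the deck), and the radius value are assumptions.

```python
import math

def assign_profiles(library, curves, radius):
    """Assign each demand curve to a profile; existing profiles stay locked."""
    library = list(library)  # work on a copy; existing centroids are never moved
    assignments = []
    for curve in curves:
        best, best_d = None, float("inf")
        for i, centroid in enumerate(library):
            d = math.dist(centroid, curve)
            if d < best_d:
                best, best_d = i, d
        if best is None or best_d > radius:
            # No profile within the radius: add a new one to the library.
            library.append(tuple(curve))
            best = len(library) - 1
        assignments.append(best)
    return assignments, library

# Hypothetical library of two profiles and four new store-SKU demand curves.
library = [(1.0, 1.0, 1.0, 1.0), (1.0, 2.0, 4.0, 2.0)]
curves = [
    (1.0, 1.0, 1.0, 1.2),  # close to profile 0
    (1.0, 2.0, 4.0, 2.1),  # close to profile 1
    (5.0, 1.0, 1.0, 1.0),  # matches nothing: becomes profile 2
    (5.0, 1.0, 1.0, 1.1),  # matches the newly added profile 2
]
assignments, updated = assign_profiles(library, curves, radius=0.5)
```

Because existing centroids are never modified, store-SKU pairs that were already mapped to a profile keep a stable assignment as the library grows.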
  • 26.
      ▪ Bottom-up findings (after the fact)
          ◦ Buying hunting related items as the ducks migrate north