Solving churn challenge in Big Data environment
Jelena Pekez
Principal Business Consultant, Lead Data Scientist
Comtrade System Integrations
CHURN IS A CHALLENGE IN EVERY INDUSTRY, THE
DIFFERENCE IS HOW IT IS MANAGED
Examine how this model will be used
Business
focus
High margin
customers
Strategy
relevant
segments
At Risk
customers
How fresh does the data need to be?
Monthly / daily / real time?
Outcome
Identified Key Patterns of Behavior that
Lead to Churn
Enabled pro-active outreach to save
profitable at-risk customers
The Goal
Enhance churn prediction through multi-
channel customer behavior analytics and
find an incremental number of high risk
churners compared to the traditional
statistical models
BUSINESS GOAL AND OUTCOMES
The goal of retention strategy is to keep churn under control.
• Domain specific features
• Special models outputs as
new features
• Balancing techniques
Evaluation
Data
Deployment
Modeling
Data
preparation
Data
understanding
Business
understanding
CRISP METHODOLOGY
BUSINESS UNDERSTANDING
What is the goal?
Relevant segment
Challenge the business
definition
Formal definition is often
not relevant for targeting
purposes
e.g. churn 90 days → 10 days
Are there already
tested campaigns?
Results, take rate
e.g. do we have integrated
campaigns results,
experiences
Existing reports
review
Reduces data analysis and
understanding
Helps to set expectations
and feasibility of prediction
Trend and seasonality
understanding
Which population is
relevant
Exclude inactive customers
Find irrelevant groups and
black list
Examine existing
segments / behavior
groups
Define metrics
for success
Evaluation metrics and
expectations
What will the product
offering be?
Target list size for
campaign
Frequency
• Set objectives
• Produce Project Plan
• Business success criteria/DS success criteria
• Assess the current situation
• Risks, assumptions, constraints and contingencies
• Terminology
• Cost and benefits
BUSINESS
UNDERSTANDING
POTENTIAL ANALYSIS / MODELS
• Churn prediction model creation
• Sequence of impacting events
• Content Categorization
• Competitor calls recognition
• CEI – Experience Index
• Social Network Analytics (SNA)
• Behavior clusters
• Offer optimization
DATA PREPARATION PHASE
1. Data understanding
2. Data integration:
• Data integration from different data sources
• Data quality report
3. Data preparation:
• Deriving new attributes and trend variables
• Balancing data set
• Handling nulls and outliers
• Normalization and standardization of data
• Data reduction techniques
4. Feature selection
5. Create Event Tables
• Generating and investigating events
• Creating Event History Table
• Fine-tuning of event definitions based on their correlation with churn
6. Create Event Sequences
• Generating event sequences from event table
• Generating subpaths from event sequences
• Analyzing temporal churn effects of event paths
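Steps 5 and 6 can be sketched in a few lines. This is only an illustration under stated assumptions: the event table layout, the event names, and the churn labels below are made up, and "subpath" is taken to mean a contiguous run of events inside a customer's sequence.

```python
from collections import defaultdict

# Hypothetical event table: (customer_id, day, event_name); names are illustrative.
EVENTS = [
    ("c1", 1, "bill_shock"), ("c1", 3, "complaint"), ("c1", 7, "competitor_call"),
    ("c2", 2, "complaint"), ("c2", 5, "competitor_call"),
    ("c3", 1, "bill_shock"), ("c3", 4, "shop_visit"),
]
CHURNED = {"c1", "c2"}  # assumed churn labels


def build_sequences(events):
    """Group events per customer and order them by time."""
    by_customer = defaultdict(list)
    for customer, day, name in events:
        by_customer[customer].append((day, name))
    return {c: tuple(n for _, n in sorted(rows)) for c, rows in by_customer.items()}


def subpaths(sequence, max_len=3):
    """Return the set of contiguous subpaths of a sequence, up to max_len events."""
    found = set()
    for start in range(len(sequence)):
        for end in range(start + 1, min(start + max_len, len(sequence)) + 1):
            found.add(sequence[start:end])
    return found


def churn_rate_by_subpath(sequences, churned):
    """For each subpath: (customers containing it, share of them that churned)."""
    total = defaultdict(int)
    hits = defaultdict(int)
    for customer, seq in sequences.items():
        for path in subpaths(seq):
            total[path] += 1
            hits[path] += customer in churned
    return {p: (total[p], hits[p] / total[p]) for p in total}


sequences = build_sequences(EVENTS)
rates = churn_rate_by_subpath(sequences, CHURNED)
for path, (n, rate) in sorted(rates.items(), key=lambda kv: (-kv[1][1], kv[0])):
    print(" > ".join(path), n, round(rate, 2))
```

Subpaths with a high churn share and enough customers behind them are candidates for event triggers; on real data the counts would need a minimum-support filter.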
Features from different data sources
DWH
Lifecycle stage (near contract
termination indicator)
Drop calls and Silent calls
Products and Discounts
Spending and profitability
Device info
Contract history
NPS score
Close friend churned (based on
freq. calls)
Network KPI-s
Calls to Competition
CRM
Shop visits
Handset service
Campaigns available to the customer
(Upsell, NBA, X-sell)
Previous termination requests
Call Center Activity
IVR
Call logs (frequency, recency,
duration, branch)
Text mining, text segmentation
Complaints (network, device,
contract…)
Web/ App usage
Web/App Categories browsing
Voice, data, SMS usage and limits
Bill shock
Web/App keyword search
CHURN IMPACT THROUGH „BIG DATA“ FEED
Non traditional data
(raw) CDR:
Competitor CC, Poaching
calls, Usage change…
Market Research:
Satisfaction surveys,
competitor new offer
(conjoint)…
Call Center:
Compliance, compliance
path, operator data, …
POS:
Visits, inquiries…
Web:
Self Service portal,
browsing behavior, …
CRM:
Campaign/Response
history, Opt In/Out,
Customer data change, …
Provisioning:
Not successful activations,
…
Network:
All bad network events
(malfunction, dropped calls,
silent calls…)
Network External
Process Interaction
Event triggers
SPECIAL FEATURES EXTRACTION
Customer email
Call record
Call summary note
SN comments
Define key words
Recognize intent
CHURN
NON
CHURN
Web crawled numbers
to all POS and agents
of direct competition
Find trend of calls
Find sequence of calls and SMS to
these numbers for
relevant groups
NON
CHURN
CHURN
Content Categorization
Like: Tariff, product,
service, competitor,..
Competitors calls recognition
AGGREGATED CUSTOMER EXPERIENCE INDEX
Individual Customer Experience Index varies from 0 to 1 and is determined by the following parameters:
CEI = N + P + C + U
Can take any value between 0 and 1, where 1 is Excellent and 0 is Awful
• N: Network Exp. Measured by number of drops & failures
• P: Product Exp. Is measured by number of attempts to search competitors' products or sites
• C: Call center Exp. Is measured by voice sentiments (positive, neutral, negative)
• U: Usage Exp. Is measured by apps usage
Calculated DAILY at SUBSCRIBER level
Benchmarked against AVERAGE CEM score
With ALARMS if score suddenly drops
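A minimal sketch of a daily, subscriber-level index with a drop alarm. The slide combines four component scores but does not state how they are weighted or normalized, so the equal weights, the 7-day window, and the 0.2 alarm threshold below are all assumptions chosen so the result stays between 0 and 1.

```python
from statistics import mean

# Assumed equal weights; the slide does not state how the components are weighted.
WEIGHTS = {"network": 0.25, "product": 0.25, "call_center": 0.25, "usage": 0.25}


def cei(components, weights=WEIGHTS):
    """Weighted average of component scores, each already scaled to [0, 1]."""
    for name, value in components.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} score {value} is outside [0, 1]")
    return sum(weights[name] * components[name] for name in weights)


def sudden_drop(history, today, window=7, threshold=0.2):
    """Alarm if today's score is more than `threshold` below the recent average."""
    recent = history[-window:]
    return bool(recent) and mean(recent) - today > threshold


today = cei({"network": 0.4, "product": 0.5, "call_center": 0.2, "usage": 0.9})
history = [0.82, 0.80, 0.85, 0.79, 0.81]
print(round(today, 3), sudden_drop(history, today))
```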
Using CDR data and modern data integration tools, we can create a graph of customer interactions and calculate different relationship metrics.
Combine social groups with geo-location calculations
Features for model
• Size of network
(number of nodes)
• Number of links + unique links
• Leadership score
• Role in community
• Community shape
• Centrality
• Density
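Some of the listed features (network size, links, unique links, centrality, density) can be derived from call records as sketched here. The CDR rows and subscriber ids are hypothetical, the graph is treated as undirected, and degree centrality stands in for whichever centrality measure the project actually used.

```python
from collections import defaultdict

# Hypothetical CDR rows: (caller, callee, duration_seconds).
CDR = [
    ("a", "b", 120), ("b", "a", 60), ("a", "c", 30),
    ("c", "d", 300), ("a", "b", 45), ("d", "a", 10),
]


def build_graph(cdr):
    """Undirected contact graph: neighbours per subscriber and call counts per pair."""
    neighbours = defaultdict(set)
    calls = defaultdict(int)
    for caller, callee, _duration in cdr:
        neighbours[caller].add(callee)
        neighbours[callee].add(caller)
        calls[frozenset((caller, callee))] += 1
    return neighbours, calls


def degree_centrality(neighbours):
    """Share of the other subscribers each subscriber is directly linked to."""
    n = len(neighbours)
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


def density(neighbours):
    """Existing unique links divided by the number of possible links."""
    n = len(neighbours)
    links = sum(len(adj) for adj in neighbours.values()) / 2
    return links / (n * (n - 1) / 2)


neighbours, calls = build_graph(CDR)
print("nodes:", len(neighbours))
print("links (calls):", sum(calls.values()), "unique links:", len(calls))
print("centrality:", degree_centrality(neighbours))
print("density:", round(density(neighbours), 3))
```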
SOCIAL NETWORK ANALYTICS FEATURES
• Who contacts whom?
• How often?
• How long?
• Both directions?
Identify the social network
• Who influences whom?
• Who works together?
• Close people
Identify important people, calling
circles
SNA: graph analysis where nodes are metrics
CUSTOMER PROFILE – E.G. GAMER
Network:
- Capabilities
- Access
- Bandwidth
- …
Social:
- Social Media
- Gaming forum
- Social Network
- Multi-Gaming
identity
- …
Consumption:
- Data volume
- Messaging
- VoIP
- …
Devices:
- Multi vs Single
- Online / Offline
- …
Sources of Experience:
- Profile
(demographics)
- Behavior (Usage,
CDRs)
- Interaction (CRM)
- Price plan, add-on
services
- History
- …
Traditional
sources:
Areas of importance: (AoI)
- MMO1 vs. Single player
- Online vs. Offline - / multi-screen
- VoIP
- Game communication
- Data volume, Latency
- Access method
- Gaming forum, youtube channels
- …
Experience:
1 MMO – Massively-Multiplayer Online Game
DATA INTEGRATION
From Analytical Data Mart to training table
DWH
Training table
Evaluation table
Scoring table
Features
Engineering
The metric trap: if any of the values is zero, the model is biased
The goal is to get a curve like this
[Figure: class distribution, Non-churn (0) vs. Churn (1); count axis from 0 to 600,000]
The share of churners in the relevant population is below 10% in most cases, and even below 1% in some cases
TYPICAL CHALLENGE IS HIGHLY IMBALANCED DATASET
Confusion Matrix (99% ACCURACY)

                  Predicted Class
                  No        Yes
Observed   No     114700    0
Class      Yes    4334      0
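The metric trap can be checked directly from these counts. A short sketch: a model that never predicts "Yes" still scores a high accuracy (about 0.96 for the counts shown), while recall, precision and F1 are all zero.

```python
# Counts from the confusion matrix above: every customer is predicted "No".
tn, fp = 114700, 0   # observed No:  predicted No / predicted Yes
fn, tp = 4334, 0     # observed Yes: predicted No / predicted Yes


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


accuracy = safe_div(tp + tn, tp + tn + fp + fn)
recall = safe_div(tp, tp + fn)        # share of real churners that were caught
precision = safe_div(tp, tp + fp)     # share of predicted churners that were real
f1 = safe_div(2 * precision * recall, precision + recall)

print(f"accuracy={accuracy:.3f} recall={recall:.3f} precision={precision:.3f} f1={f1:.3f}")
```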
[Figure: Gain Chart; x-axis: % of data sets (0 to 100), y-axis: % of events (0 to 100)]
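The points of a gain chart can be computed as below: customers are ranked by model score, and for each share of the ranked list the share of all churners captured so far is reported. The scores and labels are toy values, not results from the project.

```python
def cumulative_gains(scores, labels, steps=10):
    """Return (% of customers contacted, % of churners captured) pairs."""
    ranked = [y for _, y in sorted(zip(scores, labels), key=lambda p: -p[0])]
    total_events = sum(ranked)
    points = [(0.0, 0.0)]
    for step in range(1, steps + 1):
        cutoff = round(len(ranked) * step / steps)
        captured = sum(ranked[:cutoff])
        points.append((100.0 * step / steps, 100.0 * captured / total_events))
    return points


# Toy scores and labels (1 = churn), purely illustrative.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
for pct_data, pct_events in cumulative_gains(scores, labels, steps=5):
    print(f"{pct_data:5.0f}% of data -> {pct_events:5.1f}% of events")
```

The steeper the curve rises above the diagonal, the more churners are reached by contacting only the top of the list.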
RESAMPLING TECHNIQUES
UNDERSAMPLING
Removing samples from the
majority class
https://www.kaggle.com/rafjaa/resampling-strategies-for-imbalanced-datasets
OVERSAMPLING
Adding more examples from
the minority class
Weaknesses:
1. Loss of information (undersampling)
2. Overfitting (oversampling)
Useful only with big enough data sets,
e.g. when 1% is actually more than
10 thousand units.
Tools:
• SQL / Python
• imbalanced-learn
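Random under- and over-sampling reduce to a few lines. In practice a library such as imbalanced-learn would normally be used; this standard-library sketch only illustrates the idea, and the function names, the `ratio` parameter, and the toy rows are assumptions.

```python
import random


def random_undersample(majority, minority, ratio, seed=0):
    """Keep all minority rows and a random majority subset of ratio * len(minority)."""
    rng = random.Random(seed)
    keep = min(len(majority), int(ratio * len(minority)))
    return rng.sample(majority, keep), list(minority)


def random_oversample(majority, minority, ratio, seed=0):
    """Duplicate random minority rows until majority:minority is about ratio:1."""
    rng = random.Random(seed)
    target = max(len(minority), int(len(majority) / ratio))
    extra = [rng.choice(minority) for _ in range(target - len(minority))]
    return list(majority), list(minority) + extra


# Toy data: 1,000 non-churners and 10 churners (a 100:1 ratio).
majority = [("non_churn", i) for i in range(1000)]
minority = [("churn", i) for i in range(10)]

maj_u, min_u = random_undersample(majority, minority, ratio=5)
maj_o, min_o = random_oversample(majority, minority, ratio=5)
print(len(maj_u), len(min_u))
print(len(maj_o), len(min_o))
```

Undersampling throws away 950 of the 1,000 majority rows here (loss of information), while oversampling repeats each churner many times (risk of overfitting).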
OVER-SAMPLING FOLLOWED BY UNDER-SAMPLING
SMOTE and ADASYN synthesize new elements for the minority class, based on
those that already exist. Both are based on the nearest neighbors:
• Randomly pick a point from the minority class
• Compute the k-nearest neighbors for this point
• Add synthetic points between the chosen point and its neighbors
• Add small random values to the points
• TOMEK LINKS are pairs of very close instances, but of opposite
classes. Removing the instances of the majority class of each pair
increases the space between the two classes, facilitating the
classification process.
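The interpolation step can be sketched as follows. This is a simplified SMOTE-style illustration, not the imbalanced-learn implementation: it uses Euclidean distance on purely numeric features, the function names are made up, and the ADASYN-specific behaviour is not reproduced.

```python
import math
import random


def k_nearest(point, others, k):
    """Return the k points in `others` closest to `point` (Euclidean distance)."""
    return sorted(others, key=lambda other: math.dist(point, other))[:k]


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points on segments between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        candidates = [p for p in minority if p is not base]
        neighbour = rng.choice(k_nearest(base, candidates, k))
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Toy minority class with two numeric features.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_like(minority, n_new=5)
print(new_points)
```

Because every synthetic point lies on a segment between two existing minority points, it stays inside the region the minority class already occupies.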
EXAMPLE OF BALANCING TECHNIQUES COMBINATION
SMOTETomek:
12 months historical data (10:1 ratio)
→ Boost minority class with SMOTE
→ Eliminate similar points with TomekLinks
→ More balanced training set (5:1 ratio)
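The cleaning half of the combination can be sketched like this: find Tomek links (mutual nearest neighbours with opposite labels) and drop their majority-class member. The brute-force neighbour search and the toy points are assumptions for illustration; it would run after the minority class has been boosted.

```python
import math


def nearest(index, points):
    """Index of the nearest other point to points[index]."""
    others = (j for j in range(len(points)) if j != index)
    return min(others, key=lambda j: math.dist(points[index], points[j]))


def tomek_links(points, labels):
    """Pairs (i, j) that are mutual nearest neighbours with different labels."""
    links = []
    for i in range(len(points)):
        j = nearest(i, points)
        if i < j and nearest(j, points) == i and labels[i] != labels[j]:
            links.append((i, j))
    return links


def remove_majority_in_links(points, labels, majority_label=0):
    """Drop the majority-class member of every Tomek link."""
    drop = {k for pair in tomek_links(points, labels) for k in pair
            if labels[k] == majority_label}
    kept = [(p, y) for k, (p, y) in enumerate(zip(points, labels)) if k not in drop]
    return [p for p, _ in kept], [y for _, y in kept]


# Toy set: 0 = non-churn (majority), 1 = churn (minority).
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.05, 1.0), (3.0, 3.0), (3.1, 3.0)]
labels = [0, 0, 0, 1, 0, 0]
print(tomek_links(points, labels))
clean_points, clean_labels = remove_majority_in_links(points, labels)
print(len(clean_points), clean_labels)
```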
• XGBoost offers fast computing speed
combined with explainable results with
regard to ranking feature importances.
• Compatible with the SHAP framework, offering
even more in-depth explanations of model
predictions
MODEL DEVELOPMENT USING XGBoost ALGORITHM
IS THE BEST PRACTICE FOR IMBALANCED DATASET
XGBoost
• Regularization for avoiding overfitting (both Lasso and Ridge)
• Efficient handling of missing data (?)
• Cache awareness and out-of-core computing
• Parallelized processing
• In-built cross-validation capability
• Tree pruning using depth-first approach
A sequential learning algorithm based on function approximation: it optimizes
a specific loss function and applies several regularization techniques.
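How the two penalties act on a single tree leaf can be sketched as follows. The L2 part follows the leaf-weight formula from the XGBoost paper; treating the L1 penalty as soft-thresholding of the gradient sum is an assumption about the implementation, and the numbers are toy values. The parameter names mirror XGBoost's `reg_lambda` and `reg_alpha`.

```python
def soft_threshold(value, alpha):
    """Shrink value toward zero by alpha (the L1 / Lasso effect)."""
    if value > alpha:
        return value - alpha
    if value < -alpha:
        return value + alpha
    return 0.0


def leaf_weight(grad_sum, hess_sum, reg_lambda=1.0, reg_alpha=0.0):
    """Optimal leaf value for summed gradients/hessians under L1 and L2 penalties."""
    return -soft_threshold(grad_sum, reg_alpha) / (hess_sum + reg_lambda)


# Toy leaf: summed first and second derivatives of the loss over its rows.
G, H = -4.0, 6.0
print(leaf_weight(G, H, reg_lambda=0.0))                 # no regularization
print(leaf_weight(G, H, reg_lambda=2.0))                 # L2 (Ridge) shrinks the weight
print(leaf_weight(G, H, reg_lambda=2.0, reg_alpha=1.0))  # L1 (Lasso) shrinks it further
```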
LatBill shock= 1,15
MODEL INTERPRETATION IS VITAL FOR
FINE TUNING OF OFFERING
1. Overall interpretation
Understanding the most important features with feature
importance plot.
2. Local interpretation:
1. understand for an individual case the reasons of the
prediction.
2. understand on a filtered population the most frequent
reasons of their prediction
SHAP summary plot: 3 variables with most contribution

ID        Probability  Class      1st variable     2nd variable     3rd variable
          to churn     predicted  Name    Impact   Name    Impact   Name     Impact
12098321  95%          1          Reb_1   +34      Bill_3  +19%     Lat_2    +8%
12098322  88%          1          Bill_1  +25      NPS_2   +14%     Sill_c3  +13%
12098323  35%          0          Inf_7   -27      Lat_2   -23%     Reb_1    -12%
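The per-customer "top 3 variables" view rests on Shapley values. Below is a brute-force sketch for a toy scoring function: it enumerates every feature coalition, which is only feasible for a handful of features, whereas the SHAP library uses much faster algorithms for tree models. The model, feature names, and baseline customer are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values: features outside a coalition take baseline values."""
    names = list(x)
    n = len(names)

    def value(coalition):
        row = {f: (x[f] if f in coalition else baseline[f]) for f in names}
        return model(row)

    result = {}
    for feature in names:
        others = [f for f in names if f != feature]
        phi = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                phi += weight * (value(set(subset) | {feature}) - value(set(subset)))
        result[feature] = phi
    return result


def toy_model(row):
    """Hypothetical churn score; feature names and coefficients are made up."""
    return (0.05 + 0.30 * row["bill_shock"] + 0.02 * row["dropped_calls"]
            + 0.10 * row["bill_shock"] * row["competitor_calls"])


customer = {"bill_shock": 1, "dropped_calls": 5, "competitor_calls": 2}
average = {"bill_shock": 0, "dropped_calls": 1, "competitor_calls": 0}
phi = shapley_values(toy_model, customer, average)
top3 = sorted(phi.items(), key=lambda kv: -abs(kv[1]))[:3]
print(top3)
```

The contributions always add up to the difference between this customer's score and the baseline score, which is what makes a per-customer "reason" table possible.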
ANALYTICAL
OBJECTIVE
MODEL PERFORMANCE
EVALUATION
• Lift on top 1%, 10%, and 20% most likely churners
• Campaign performance evaluation (A/B testing):
  - Churn rate in different model percentiles
  - Churn rate DNC vs. TGT
  - Offer response rate DNC vs. TGT
  - Churn rate old vs. BD approach
  - Offer response rate old vs. BD approach
  - Monthly level measurement
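Lift on the top-scored share of customers can be computed as sketched here: the churn rate inside the top X% divided by the churn rate of the whole population. The population of 100 customers and the positions of the 5 churners are toy values.

```python
def lift_at(scores, labels, top_share):
    """Churn rate among the top-scored share of customers / overall churn rate."""
    ranked = [y for _, y in sorted(zip(scores, labels), key=lambda p: -p[0])]
    cutoff = max(1, round(len(ranked) * top_share))
    top_rate = sum(ranked[:cutoff]) / cutoff
    base_rate = sum(ranked) / len(ranked)
    return top_rate / base_rate


# Toy population of 100 customers with 5 churners, most of them scored high.
scores = [1 - i / 100 for i in range(100)]
labels = [0] * 100
for churner in (0, 1, 3, 8, 40):
    labels[churner] = 1

for share in (0.01, 0.10, 0.20):
    print(f"lift@{share:.0%} = {lift_at(scores, labels, share):.1f}")
```

A lift of 8 on the top 10% means a campaign aimed at that group reaches churners eight times more often than random targeting would.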
Assign a churn score to all customers in the eligible
segment
Automatically target top X% of customers with high
probability with special offer
The score should be recalculated on a daily level
New events should trigger near real-time scoring
Optimize offer type and price for individual customer
MODEL DEPLOYMENT IN AIRFLOW ENVIRONMENT
BENEFITS OF BIG DATA PLATFORM
1. Include better granularity of specific features.
2. Quickly calculate daily attributes and longer history from more data sources
3. Combine results of different analytical models faster to optimize process and value
4. Recompute score in real-time based on the latest customer activity / event
5. Efficient monitoring of model performance and execution
THANK YOU
Jelena.pekez@comtrade.com
Copyright © 2019 Comtrade. All rights reserved.
The content of this presentation is copyright protected.
Any reproduction, distribution, or modification is not allowed.
The information, solutions, and opinions contained in this presentation are of informative nature only and are not
intended to be a comprehensive study, nor should they be relied on or treated as a means to provide a complete
solution or advice, since we may not be aware of all specific circumstances of the case. We try to provide quality
information, but we make no claims, promises, or guaranties about the accuracy, completeness, or adequacy of the
information contained herein.
www.comtradeintegration.com

Complex AI forecasting methods for investments portfolio optimization - Pawel...Complex AI forecasting methods for investments portfolio optimization - Pawel...
Complex AI forecasting methods for investments portfolio optimization - Pawel...
 
From Zero to ML Hero for Underdogs - Amir Tabakovic
From Zero to ML Hero for Underdogs  - Amir TabakovicFrom Zero to ML Hero for Underdogs  - Amir Tabakovic
From Zero to ML Hero for Underdogs - Amir Tabakovic
 
Data and data scientists are not equal to money david hoyle
Data and data scientists are not equal to money   david hoyleData and data scientists are not equal to money   david hoyle
Data and data scientists are not equal to money david hoyle
 
The price is right - Tomislav Krizan
The price is right - Tomislav KrizanThe price is right - Tomislav Krizan
The price is right - Tomislav Krizan
 
When it's raining gold, bring a bucket - Andjela Culibrk
When it's raining gold, bring a bucket - Andjela CulibrkWhen it's raining gold, bring a bucket - Andjela Culibrk
When it's raining gold, bring a bucket - Andjela Culibrk
 
Reality and traps of real time data engineering - Milos Solujic
Reality and traps of real time data engineering - Milos SolujicReality and traps of real time data engineering - Milos Solujic
Reality and traps of real time data engineering - Milos Solujic
 
Sensor networks for personalized health monitoring - Vladimir Brusic
Sensor networks for personalized health monitoring - Vladimir BrusicSensor networks for personalized health monitoring - Vladimir Brusic
Sensor networks for personalized health monitoring - Vladimir Brusic
 
Improving Data Quality with Product Similarity Search
Improving Data Quality with Product Similarity SearchImproving Data Quality with Product Similarity Search
Improving Data Quality with Product Similarity Search
 
Prediction of good patterns for future sales using image recognition
Prediction of good patterns for future sales using image recognitionPrediction of good patterns for future sales using image recognition
Prediction of good patterns for future sales using image recognition
 
Using data to fight corruption: full budget transparency in local government
Using data to fight corruption: full budget transparency in local governmentUsing data to fight corruption: full budget transparency in local government
Using data to fight corruption: full budget transparency in local government
 
Geospatial Analysis and Open Data - Forest and Climate
Geospatial Analysis and Open Data - Forest and ClimateGeospatial Analysis and Open Data - Forest and Climate
Geospatial Analysis and Open Data - Forest and Climate
 
Machine Learning-Driven Injury Prediction for a Professional Sports Team
Machine Learning-Driven Injury Prediction for a Professional Sports TeamMachine Learning-Driven Injury Prediction for a Professional Sports Team
Machine Learning-Driven Injury Prediction for a Professional Sports Team
 

Recently uploaded

Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our WorldEduminds Learning
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 
convolutional neural network and its applications.pdf
convolutional neural network and its applications.pdfconvolutional neural network and its applications.pdf
convolutional neural network and its applications.pdfSubhamKumar3239
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Boston Institute of Analytics
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max PrincetonTimothy Spann
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxaleedritatuxx
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...Dr Arash Najmaei ( Phd., MBA, BSc)
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxHimangsuNath
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfblazblazml
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data VisualizationKianJazayeri1
 
SMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxSMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxHaritikaChhatwal1
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesTimothy Spann
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Boston Institute of Analytics
 
Cyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataCyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataTecnoIncentive
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsVICTOR MAESTRE RAMIREZ
 
What To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxWhat To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxSimranPal17
 

Recently uploaded (20)

Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
Data Analysis Project: Stroke Prediction
Data Analysis Project: Stroke PredictionData Analysis Project: Stroke Prediction
Data Analysis Project: Stroke Prediction
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 
convolutional neural network and its applications.pdf
convolutional neural network and its applications.pdfconvolutional neural network and its applications.pdf
convolutional neural network and its applications.pdf
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptx
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data Visualization
 
SMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxSMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptx
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
 
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
Data Analysis Project Presentation: Unveiling Your Ideal Customer, Bank Custo...
 
Cyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded dataCyber awareness ppt on the recorded data
Cyber awareness ppt on the recorded data
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
Advanced Machine Learning for Business Professionals
Advanced Machine Learning for Business ProfessionalsAdvanced Machine Learning for Business Professionals
Advanced Machine Learning for Business Professionals
 
What To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptxWhat To Do For World Nature Conservation Day by Slidesgo.pptx
What To Do For World Nature Conservation Day by Slidesgo.pptx
 

Solving churn challenge in Big Data environment - Jelena Pekez

  • 1. Solving Churn challenge in Big data environment Jelena Pekez Principal Business Consultant, Lead Data Scientist Comtrade System Integrations
  • 2. CHURN IS A CHALLENGE IN EVERY INDUSTRY, THE DIFFERENCE IS HOW IT IS MANAGED Examine how this model will be used Business focus High margin customers Strategy relevant segments At Risk customers How fresh data do we need? Monthly/Daily/Real time?
  • 3. Outcome Identified Key Patterns of Behavior that Lead to Churn Enabled pro-active outreach to save profitable at-risk customers The Goal Enhance churn prediction through multi- channel customer behavior analytics and find an incremental number of high risk churners compared to the traditional statistical models BUSINESS GOAL AND OUTCOMES The goal of retention strategy is to keep churn under control.
  • 4. • Domain specific features • Special models outputs as new features • Balancing techniques Evaluation Data Deployment Modeling Data preparation Data understanding Business understanding CRISP METHODOLOGY
  • 5. BUSINESS UNDERSTANDING What is goal? Relevant segment Challenge the business definition Formal definition is often not relevant for targeting purposes e.g. Churn 90 days  10 days Are there already tested campaigns Results, take rate e.g. do we have integrated campaigns results, experiences Existing reports review Reduces data analysis and understanding Helps to set expectations and feasibility of prediction Trend and seasonality understanding Which population is relevant Exclude inactive customers Find irrelevant groups and black list Examine existing segments / behavior groups Define metrics for success Evaluation metrics and expectations What will be product offering Target list size for campaign Frequency
  • 6. BUSINESS UNDERSTANDING: Set objectives; Produce project plan; Business success criteria / DS success criteria; Assess the current situation; Risks, assumptions, constraints and contingencies; Terminology; Costs and benefits. POTENTIAL ANALYSIS / MODELS: Churn prediction model creation; Sequence of impacting events; Content categorization; Competitor calls recognition; CEI – Experience Index; Social Network Analytics (SNA); Behavior clusters; Offer optimization.
  • 7. DATA PREPARATION PHASE
      1. Data understanding
      2. Data integration: data integration from different data sources; data quality report
      3. Data preparation: deriving new attributes and trend variables; balancing the data set; handling nulls and outliers; normalization and standardization of data; data reduction techniques
      4. Feature selection
      5. Create event tables: generating and investigating events; creating the event history table; fine-tuning event definitions based on their correlation with churn
      6. Create event sequences: generating event sequences from the event table; generating subpaths from event sequences; analyzing temporal churn effects of event paths
  • 8. Features from different data sources DWH Lifecycle stage (near contract termination indicator) Drop calls and Silent calls Products and Discounts Spending and profitability Device info Contract history NPS score Close friend churned (based on freq. calls) Network KPI-s Calls to Competition CRM Shop visits Handset service Campaigns available to the customer (Upsell, NBA, X-sell) Previous termination requests Call Center Activity IVR Call logs (frequency, recency, duration, branch) Text mining, text segmentation Complaints (network, device, contract…) Web/ App usage Web/App Categories browsing Voice, data, SMS usage and limits Bill shock Web/App keyword search
  • 9. CHURN IMPACT THROUGH „BIG DATA“ FEED Non-traditional data (raw) CDR: Competitor CC, Poaching calls, Usage change… Market Research: Satisfaction surveys, competitor new offer (conjoint)… Call Center: Compliance, compliance path, operator data, … POS: Visits, inquiries… Web: Self Service portal, browsing behavior, … CRM: Campaign/Response history, Opt In/Out, Customer data change, … Provisioning: Unsuccessful activations, … Network: All bad network events (malfunction, dropped calls, silent calls…) Network External Process Interaction Event triggers
  • 10. SPECIAL FEATURES EXTRACTION Customer email Call record Call summary note SN comments Define key words Recognize intent CHURN NON CHURN Web crawled numbers to all POS and agents of direct competition Find trend of calls Find sequence of calls and SMS to these numbers for relevant groups NON CHURN CHURN Content Categorization Like: Tariff, product, service, competitor,.. Competitors calls recognition
  • 11. AGGREGATED CUSTOMER EXPERIENCE INDEX The Individual Customer Experience Index varies from 0 to 1 and is determined by four parameters, shown on the slide as a sum: N + P + C + U. It can take any value between 0 and 1, where 1 is Excellent and 0 is Awful. Network Exp. (N) is measured by the number of drops and failures. Call center Exp. (C) is measured by voice sentiment (positive, neutral, negative). Product Exp. (P) is measured by the number of attempts to search for competitors' products or sites. Usage Exp. (U) is measured by apps usage. Calculated DAILY, at SUBSCRIBER level, benchmarked against the AVERAGE CEM score, with ALARMS if the score suddenly drops.
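A minimal sketch of how a daily per-subscriber index of this kind could be computed. The slide does not state how the four components are weighted, so equal weights are an assumption here, as are the alarm threshold `max_drop` and all function and field names.

```python
from dataclasses import dataclass


@dataclass
class ExperienceScores:
    """Component scores for one subscriber on one day, each already scaled to [0, 1]."""
    network: float      # N: derived from the number of drops and failures
    call_center: float  # C: derived from voice sentiment
    product: float      # P: derived from searches for competitor products or sites
    usage: float        # U: derived from app usage


def experience_index(scores, weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted average of the four components (equal weights are an assumption)."""
    parts = (scores.network, scores.call_center, scores.product, scores.usage)
    if any(not 0.0 <= p <= 1.0 for p in parts):
        raise ValueError("component scores must lie in [0, 1]")
    return sum(w * p for w, p in zip(weights, parts)) / sum(weights)


def sudden_drop(previous, current, max_drop=0.2):
    """Raise an alarm when the index fell by more than max_drop (hypothetical threshold)."""
    return (previous - current) > max_drop


yesterday = experience_index(ExperienceScores(0.9, 0.8, 1.0, 0.7))
today = experience_index(ExperienceScores(0.4, 0.5, 0.6, 0.7))
alarm = sudden_drop(yesterday, today)
```

Dividing by the sum of the weights keeps the result in the 0 to 1 range regardless of how the weights are later tuned.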
  • 12. SOCIAL NETWORK ANALYTICS FEATURES Using CDR data and modern tools for data integration, we can create a graph of customer interactions and calculate different relationship metrics. Combine social groups with geo-location calculations. Identify the social network: • Who contacts whom? • How often? • How long? • Both directions? Identify important people, calling circles: • Who influences whom? • Who works together? • Close people. Features for model: • Size of network (number of nodes) • Number of links + unique links • Leadership score • Role in community • Community shape • Centrality • Density. SNA: graph analysis where nodes are metrics
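As an illustration of the graph features listed on this slide, the sketch below builds an undirected contact graph from a handful of made-up CDR rows and derives network size, link counts, degree centrality and density with the standard library only. The record layout `(caller, callee, duration_seconds)` and all names are assumptions; a production setup would more likely use a graph library or a distributed engine.

```python
from collections import defaultdict

# Hypothetical CDR rows: (caller, callee, duration_seconds)
cdr = [
    ("A", "B", 120), ("B", "A", 60), ("A", "C", 30),
    ("C", "D", 300), ("A", "D", 45), ("A", "B", 15),
]


def build_graph(records):
    """Return undirected adjacency sets and the number of calls per customer pair."""
    neighbors = defaultdict(set)
    call_counts = defaultdict(int)
    for caller, callee, _duration in records:
        neighbors[caller].add(callee)
        neighbors[callee].add(caller)
        call_counts[frozenset((caller, callee))] += 1
    return neighbors, call_counts


def degree_centrality(neighbors):
    """Share of the other nodes that each node is directly connected to."""
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


def density(neighbors):
    """Existing unique links divided by the number of possible links."""
    n = len(neighbors)
    if n < 2:
        return 0.0
    unique_links = sum(len(adj) for adj in neighbors.values()) / 2
    return unique_links / (n * (n - 1) / 2)


neighbors, call_counts = build_graph(cdr)
network_size = len(neighbors)
centrality = degree_centrality(neighbors)
graph_density = density(neighbors)
```

Per-customer values such as `centrality["A"]` can then be joined to the training table as additional model features.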
  • 13. CUSTOMER PROFILE – E.G. GAMER Network: - Capabilities - Access - Bandwidth - … Social: - Social Media - Gaming forum - Social Network - Multi-Gaming identity - … Consumption: - Data volume - Messaging - VoIP - … Devices: - Multi vs Single - Online / Offline - … Sources of Experience: - Profile (demographics) - Behavior (Usage, CDRs) - Interaction (CRM) - Price plan, add-on services - History - … Traditional sources: Areas of importance: (AoI) - MMO1 vs. Single player - Online vs. Offline - / multi-screen - VoIP - Game communication - Data volume, Latency - Access method - Gaming forum, youtube channels - … Experience: 1 MMO – Massively-Multiplayer Online Game
  • 14. DATA INTEGRATION From Analytical Data Mart to training table DWH Training table Evaluation table Scoring table Features Engineering
  • 15. TYPICAL CHALLENGE IS HIGHLY IMBALANCED DATASET The share of churn in the relevant population is less than 10% in the majority of cases, and even less than 1% in some cases. [Bar chart: number of non-churn (0) vs. churn (1) customers] The metric trap: if any of the values is zero, the model is biased. Confusion matrix (99% ACCURACY): Observed No: predicted No 114700, predicted Yes 0; Observed Yes: predicted No 4334, predicted Yes 0. [Gain chart: % of events vs. % of data sets] The goal is to get a curve like this.
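The metric trap can be made concrete by computing accuracy, recall and precision from the confusion-matrix counts on this slide. This is a small sketch; the function name and the convention of returning 0.0 for an undefined ratio are choices made here, not part of the presentation.

```python
def classification_metrics(tn, fp, fn, tp):
    """Accuracy, recall and precision for the positive (churn) class."""
    total = tn + fp + fn + tp
    return {
        "accuracy": (tn + tp) / total,
        # Recall: share of real churners that the model caught.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        # Precision: share of predicted churners that really churned.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


# Counts from the slide: the model never predicts "Yes".
metrics = classification_metrics(tn=114700, fp=0, fn=4334, tp=0)
```

Accuracy stays high because the majority class dominates, while recall for churners is zero, which is why accuracy alone is a poor success metric on imbalanced data.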
  • 16. RESAMPLING TECHNIQUES UNDERSAMPLING: removing samples from the majority class. OVERSAMPLING: adding more examples from the minority class. Weaknesses: 1. Loss of information (undersampling) 2. Overfitting (oversampling). Useful only with big enough data sets, when 1% is actually more than 10 thousand units. Tools: • SQL / Python • imbalanced-learn. Source: https://www.kaggle.com/rafjaa/resampling-strategies-for-imbalanced-datasets
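Both basic techniques can be sketched in a few lines of plain Python. The helpers below are illustrative only (names, the `ratio` parameter and the fixed seed are assumptions); the imbalanced-learn package named on the slide provides maintained implementations.

```python
import math
import random


def random_undersample(majority, minority, ratio, seed=0):
    """Keep all minority rows and a random subset of at most ratio * len(minority) majority rows."""
    rng = random.Random(seed)
    keep = min(len(majority), int(ratio * len(minority)))
    return rng.sample(majority, keep), list(minority)


def random_oversample(majority, minority, ratio, seed=0):
    """Duplicate random minority rows until majority:minority is at most ratio:1."""
    rng = random.Random(seed)
    target = math.ceil(len(majority) / ratio)
    extra = max(0, target - len(minority))
    return list(majority), list(minority) + rng.choices(minority, k=extra)


# Toy data: 1000 non-churners and 10 churners (row ids only).
majority_rows = list(range(1000))
minority_rows = list(range(1000, 1010))

under_major, under_minor = random_undersample(majority_rows, minority_rows, ratio=5)
over_major, over_minor = random_oversample(majority_rows, minority_rows, ratio=5)
```

The toy run shows both weaknesses: undersampling discards 950 of the 1000 majority rows, and oversampling repeats the same 10 churners many times.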
  • 17. OVER-SAMPLING FOLLOWED BY UNDER-SAMPLING SMOTE consists of synthesizing elements for the minority class, based on those that already exist. It is based on the nearest neighbors: • Randomly pick a point from the minority class • Compute the k-nearest neighbors for this point • Add synthetic points between the chosen point and its neighbors. ADASYN: • Adds small random values to the points. TOMEK LINKS are pairs of very close instances of opposite classes. Removing the majority-class instance of each pair increases the space between the two classes, facilitating the classification process.
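The SMOTE steps listed on this slide can be sketched as follows. This is a simplified, standard-library illustration on numeric tuples, not the imbalanced-learn implementation; the function names, `k`, and the seed are assumptions.

```python
import math
import random


def k_nearest(point, candidates, k):
    """The k candidates closest to point (Euclidean distance), excluding point itself."""
    others = [c for c in candidates if c is not point]
    return sorted(others, key=lambda c: math.dist(point, c))[:k]


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority points by interpolating towards near neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)                        # 1. pick a minority point
        neighbor = rng.choice(k_nearest(base, minority, k))  # 2. pick one of its k neighbors
        gap = rng.random()                                 # 3. position on the segment
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


minority_points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority_points, n_new=6)
```

Because every synthetic point lies on a segment between two existing minority points, the new samples stay inside the region the minority class already occupies.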
  • 18. EXAMPLE OF BALANCING TECHNIQUES COMBINATION (SMOTETomek) 12 months of historical data, 10:1 ratio → boost the minority class with SMOTE → eliminate similar points with TomekLinks → more balanced training set, 5:1 ratio.
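The cleaning half of this combination, Tomek links, can be illustrated with a brute-force sketch: two points of opposite classes form a link when each is the other's nearest neighbor, and the majority-class member is dropped. Function names and the toy data are assumptions; imbalanced-learn's `SMOTETomek`, named on the slide, packages oversampling and cleaning together.

```python
import math


def nearest_index(i, points):
    """Index of the point closest to points[i], excluding i itself."""
    return min(
        (j for j in range(len(points)) if j != i),
        key=lambda j: math.dist(points[i], points[j]),
    )


def tomek_links(points, labels):
    """Pairs (i, j) of mutual nearest neighbors that carry different labels."""
    links = []
    for i in range(len(points)):
        j = nearest_index(i, points)
        if i < j and labels[i] != labels[j] and nearest_index(j, points) == i:
            links.append((i, j))
    return links


def remove_majority_in_links(points, labels, majority_label=0):
    """Drop the majority-class member of every Tomek link."""
    drop = {
        idx
        for pair in tomek_links(points, labels)
        for idx in pair
        if labels[idx] == majority_label
    }
    kept = [(p, l) for idx, (p, l) in enumerate(zip(points, labels)) if idx not in drop]
    return [p for p, _ in kept], [l for _, l in kept]


points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.2, 5.0), (1.0, 0.0)]
labels = [0, 1, 0, 0, 1]  # 0 = non-churn (majority), 1 = churn (minority)

links = tomek_links(points, labels)
clean_points, clean_labels = remove_majority_in_links(points, labels)
```

The quadratic neighbor search is fine for a demonstration but would need an index structure on a data set of telecom size.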
  • 19. MODEL DEVELOPMENT USING XGBoost ALGORITHM IS THE BEST PRACTICE FOR IMBALANCED DATASET XGBoost offers fast computing speed combined with explainable results with regard to ranking feature importances. It is compatible with the SHAP framework, which offers even more in-depth explanations of model predictions. XGBoost: Regularization for avoiding overfitting (both Lasso and Ridge); Efficient handling of missing data (?); Cache awareness and out-of-core computing; Parallelized processing; In-built cross-validation capability; Tree pruning using depth-first approach. It is a sequential learning algorithm based on function approximation, optimizing specific loss functions and applying several regularization techniques.
  • 20. MODEL INTERPRETATION IS VITAL FOR FINE TUNING OF OFFERING 1. Overall interpretation: understanding the most important features with a feature importance plot. 2. Local interpretation: (1) understand the reasons for the prediction for an individual case; (2) understand the most frequent reasons for the prediction on a filtered population. [SHAP summary plot] The 3 variables with the most contribution:
      ID       | Probability to churn | Class predicted | 1st variable (name, impact) | 2nd variable (name, impact) | 3rd variable (name, impact)
      12098321 | 95%                  | 1               | Reb_1, +34                  | Bill_3, +19%                | Lat_2, +8%
      12098322 | 88%                  | 1               | Bill_1, +25                 | NPS_2, +14%                 | Sill_c3, +13%
      12098323 | 35%                  | 0               | Inf_7, -27                  | Lat_2, -23%                 | Reb_1, -12%
  • 21. ANALYTICAL OBJECTIVE: Assign a churn score to all customers in the eligible segment. Automatically target the top X% of customers with a high churn probability with a special offer. The score should be recalculated daily. New events should trigger near real-time scoring. Optimize offer type and price for the individual customer. MODEL PERFORMANCE EVALUATION: Lift on the top 1%, 10%, and 20% most likely churners. Campaign performance evaluation (A/B testing): • Churn rate in different model percentiles • Churn rate DNC vs. TGT • Offer response rate DNC vs. TGT • Churn rate old vs. BD approach • Offer response rate old vs. BD approach • Monthly level measurement
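Lift on the top X% compares the churn rate among the highest-scored customers with the churn rate of the whole scored population. A small sketch with made-up scores and labels; the function name and the rounding of the cut-off are choices made here.

```python
def lift_at(scores, labels, top_fraction):
    """Churn rate in the top-scored fraction divided by the overall churn rate."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, int(len(ranked) * top_fraction))
    top_rate = sum(label for _, label in ranked[:n_top]) / n_top
    base_rate = sum(labels) / len(labels)
    return top_rate / base_rate


# Toy example: 10 customers, 2 real churners, both ranked at the top.
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

lift_top_20 = lift_at(scores, labels, 0.20)
lift_top_50 = lift_at(scores, labels, 0.50)
```

A lift of 1 means the model ranks no better than random selection, so the value at the campaign's actual target list size is the one to report.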
  • 22. MODEL DEPLOYMENT IN AIRFLOW ENVIRONMENT
  • 23. BENEFITS OF BIG DATA PLATFORM 1. Include better granularity of specific features. 2. Quickly calculate daily attributes and longer history from more data sources. 3. Combine results of different analytical models faster to optimize process and value. 4. Recompute the score in real time based on the latest customer activity / event. 5. Efficient monitoring of model performance and execution.
  • 24. THANK YOU Jelena.pekez@comtrade.com Copyright © 2019 Comtrade. All rights reserved. The content of this presentation is copyright protected. Any reproduction, distribution, or modification is not allowed. The information, solutions, and opinions contained in this presentation are of informative nature only and are not intended to be a comprehensive study, nor should they be relied on or treated as a means to provide a complete solution or advice, since we may not be aware of all specific circumstances of the case. We try to provide quality information, but we make no claims, promises, or guaranties about the accuracy, completeness, or adequacy of the information contained herein. www.comtradeintegration.com

Editor's Notes

  1. from imblearn.combine import SMOTETomek Synthetic Minority Oversampling Technique
  2. XGBoost4j on Scala-Spark Early stopping may still contain bugs
  3. Real-time triggers. Examples: 1. Bill shock + call to the Call Center: the customer has a bill shock, and the call to the CC triggers real-time scoring, so the agent can see the new score for that customer during the call. 2. Complaint + low NPS score: the customer submits a complaint and gives a low NPS score, which triggers real-time rescoring for that customer.
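The two trigger examples in this note can be expressed as simple event-combination rules. The sketch below is hypothetical throughout: the event names, the rule table and the idea of checking a customer's recent events against it are assumptions for illustration, not a description of the deployed system.

```python
# Hypothetical rule table: a rule fires when all of its required events
# appear among the customer's recent events.
TRIGGER_RULES = [
    {"name": "bill_shock_then_call_center", "requires": {"bill_shock", "call_center_call"}},
    {"name": "complaint_with_low_nps", "requires": {"complaint", "low_nps"}},
]


def triggered_rules(recent_events):
    """Names of the rules whose required events are all present."""
    seen = set(recent_events)
    return [rule["name"] for rule in TRIGGER_RULES if rule["requires"] <= seen]


def needs_rescoring(recent_events):
    """True when at least one rule fires and the churn score should be recomputed."""
    return bool(triggered_rules(recent_events))


fired = triggered_rules(["bill_shock", "data_topup", "call_center_call"])
quiet = needs_rescoring(["data_topup", "shop_visit"])
```

In a streaming setup the same check would run on each incoming event, with a time window deciding which events still count as recent.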