Solving churn challenge in Big Data environment - Jelena Pekez

Solving Churn challenge in Big data
environment
Jelena Pekez
Principal Business Consultant, Lead Data Scientist
Comtrade System Integrations

CHURN IS A CHALLENGE IN EVERY INDUSTRY, THE
DIFFERENCE IS HOW IT IS MANAGED
Examine how this model will be used
Business focus:
• High margin customers
• Strategy relevant segments
• At Risk customers
How fresh does the data need to be?
Monthly / Daily / Real time?

BUSINESS GOAL AND OUTCOMES
The goal of a retention strategy is to keep churn under control.
The Goal
Enhance churn prediction through multi-channel customer behavior analytics and find an incremental number of high-risk churners compared to the traditional statistical models.
Outcome
• Identified key patterns of behavior that lead to churn
• Enabled proactive outreach to save profitable at-risk customers

CRISP-DM METHODOLOGY
Phases (a cycle around the data):
1. Business understanding
2. Data understanding
3. Data preparation
4. Modeling
5. Evaluation
6. Deployment
• Domain specific features
• Outputs of special models as new features
• Balancing techniques

BUSINESS UNDERSTANDING
• What is the goal?
  - Relevant segment
• Challenge the business definition
  - Formal definition is often not relevant for targeting purposes
  - e.g. churn 90 days → 10 days
• Are there already tested campaigns?
  - Results, take rate
  - e.g. do we have integrated campaign results, experiences
• Existing reports review
  - Reduces the effort of data analysis and understanding
  - Helps to set expectations and feasibility of prediction
  - Trend and seasonality understanding
• Which population is relevant?
  - Exclude inactive customers
  - Find irrelevant groups and blacklist them
  - Examine existing segments / behavior groups
• Define metrics for success
  - Evaluation metrics and expectations
• What will be the product offering?
  - Target list size for campaign
  - Frequency

BUSINESS UNDERSTANDING
• Set objectives
• Produce project plan
• Business success criteria / DS success criteria
• Assess the current situation
• Risks, assumptions, constraints, and contingencies
• Terminology
• Costs and benefits
POTENTIAL ANALYSIS / MODELS
• Churn prediction model creation
• Sequence of impacting events
• Content Categorization
• Competitor calls recognition
• CEI – Experience Index
• Social Network Analytics (SNA)
• Behavior clusters
• Offer optimization

DATA PREPARATION PHASE
1. Data understanding
2. Data integration:
• Data integration from different data sources
• Data quality report
3. Data preparation:
• Deriving new attributes and trend variables
• Balancing data set
• Handling nulls and outliers
• Normalization and standardization of data
• Data reduction techniques
4. Feature selection
5. Create Event Tables
• Generating and investigating events
• Creating Event History Table
• Fine-tuning of event definitions based on their correlation with churn
6. Create Event Sequences
• Generating event sequences from event table
• Generating subpaths from event sequences
• Analyzing temporal churn effects of event paths
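Steps 5 and 6 above can be illustrated with a minimal sketch. It assumes an event history table with one row per (customer, day, event); the table contents, the event names, and the helper functions are all hypothetical and only show the general idea of turning events into sequences, cutting subpaths, and comparing them against churn. It does not attempt the temporal analysis itself.

```python
from collections import Counter, defaultdict

# Hypothetical event history table: (customer_id, day, event_name).
EVENTS = [
    ("c1", 1, "dropped_call"),
    ("c1", 3, "complaint"),
    ("c1", 7, "competitor_call"),
    ("c2", 2, "bill_shock"),
    ("c2", 5, "complaint"),
    ("c3", 1, "dropped_call"),
    ("c3", 4, "complaint"),
]
CHURNED = {"c1", "c3"}  # hypothetical churn labels


def build_sequences(events):
    """Order each customer's events by day to get one event sequence per customer."""
    by_customer = defaultdict(list)
    for customer, _day, name in sorted(events, key=lambda e: (e[0], e[1])):
        by_customer[customer].append(name)
    return dict(by_customer)


def subpaths(sequence, max_len=3):
    """Return the set of contiguous subpaths of length 2..max_len."""
    found = set()
    for length in range(2, max_len + 1):
        for start in range(len(sequence) - length + 1):
            found.add(tuple(sequence[start:start + length]))
    return found


def churn_rate_by_subpath(sequences, churned):
    """For each subpath, the share of customers containing it who churned."""
    seen, churn = Counter(), Counter()
    for customer, seq in sequences.items():
        for path in subpaths(seq):
            seen[path] += 1
            churn[path] += int(customer in churned)
    return {path: churn[path] / seen[path] for path in seen}


sequences = build_sequences(EVENTS)
rates = churn_rate_by_subpath(sequences, CHURNED)
for path, rate in sorted(rates.items(), key=lambda kv: -kv[1]):
    print(" -> ".join(path), f"{rate:.0%}")
```

Subpaths with a churn share well above the base rate are candidates for event features or for refining the event definitions.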

Features from different data sources
DWH
• Lifecycle stage (near contract termination indicator)
• Drop calls and Silent calls
• Products and Discounts
• Spending and profitability
• Device info
• Contract history
• NPS score
• Close friend churned (based on freq. calls)
• Network KPIs
• Calls to Competition
CRM
• Shop visits
• Handset service
• Campaigns available to the customer (Upsell, NBA, X-sell)
• Previous termination requests
Call Center Activity
• IVR
• Call logs (frequency, recency, duration, branch)
• Text mining, text segmentation
• Complaints (network, device, contract…)
Web / App usage
• Web/App categories browsing
Voice, data, SMS usage and limits; Bill shock; Web/App keyword search

CHURN IMPACT THROUGH „BIG DATA“ FEED
Non traditional data
• (raw) CDR: Competitor CC, Poaching calls, Usage change…
• Market Research: Satisfaction surveys, competitor new offer (conjoint)…
• Call Center: Compliance, compliance path, operator data, …
• POS: Visits, inquiries…
• Web: Self Service portal, browsing behavior, …
• CRM: Campaign/Response history, Opt In/Out, Customer data change, …
• Provisioning: Not successful activations, …
• Network: All bad network events (malfunction, dropped calls, silent calls…)
Event triggers: Network, External, Process, Interaction

SPECIAL FEATURES EXTRACTION
Content Categorization
• Sources: Customer email, Call record, Call summary note, SN comments
• Define key words, recognize intent
• Categories like: Tariff, product, service, competitor, …
• CHURN vs. NON CHURN
Competitors calls recognition
• Web crawled numbers to all POS and agents of direct competition
• Find trend of calls
• Find sequence of calls and SMS to these numbers for relevant groups
• CHURN vs. NON CHURN
AGGREGATED CUSTOMER EXPERIENCE INDEX
Individual Customer Experience Index varies from 0 to 1 (1 is Excellent, 0 is Awful) and is determined by the following parameters:
$\mathrm{CEI} = N + P + C + U$
• N: Network Exp. Measured by number of drops & failures
• P: Product Exp. Measured by number of attempts to search competitors' products or sites
• C: Call center Exp. Measured by voice sentiments (positive, neutral, negative)
• U: Usage Exp. Measured by apps usage
Calculated DAILY at SUBSCRIBER level
Benchmarked against AVERAGE CEM score
With ALARMS if score suddenly drops
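A minimal sketch of how such an index could be computed per subscriber per day. The slide does not say how the four components are weighted or scaled, so the equal weights, the assumption that each component is already scaled to [0, 1], and the alarm threshold are all hypothetical choices made for illustration.

```python
# Assumed equal weights; the deck does not state the actual weighting.
WEIGHTS = {"network": 0.25, "product": 0.25, "call_center": 0.25, "usage": 0.25}
ALARM_DROP = 0.2  # hypothetical threshold for a "sudden drop"


def experience_index(components, weights=WEIGHTS):
    """Weighted sum of component scores, each expected to be in [0, 1]."""
    for name, value in components.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return sum(weights[name] * components[name] for name in weights)


def drop_alarm(previous_index, current_index, threshold=ALARM_DROP):
    """True when the index fell by at least `threshold` since the previous day."""
    return previous_index - current_index >= threshold


yesterday = experience_index(
    {"network": 0.9, "product": 0.8, "call_center": 0.7, "usage": 0.8}
)
today = experience_index(
    {"network": 0.3, "product": 0.8, "call_center": 0.3, "usage": 0.8}
)
print(f"yesterday={yesterday:.2f} today={today:.2f} alarm={drop_alarm(yesterday, today)}")
```

With weights that sum to 1 the result stays between 0 and 1, which matches the range given on the slide.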

SOCIAL NETWORK ANALYTICS FEATURES
Using CDR data and modern tools for data integration, we can create a graph of customer interactions and calculate different relationship metrics.
Combine social groups with geo-location calculations.
Identify the social network
• Who contacts whom?
• How often?
• How long?
• Both directions?
Identify important people, calling circles
• Who influences whom?
• Who works together?
• Close people
Features for model
• Size of network (number of nodes)
• Number of links + unique links
• Leadership score
• Role in community
• Community shape
• Centrality
• Density
SNA: graph analysis where nodes are metrics
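A few of the listed graph features can be derived from call records with plain data structures. The sketch below is an assumption-laden illustration: the call rows, the churn set, and the feature names are hypothetical, and a graph library would normally be used for community detection, leadership scores, and larger graphs.

```python
from collections import defaultdict

# Hypothetical CDR rows: (caller, callee, duration_seconds).
CALLS = [
    ("a", "b", 120), ("b", "a", 60), ("a", "c", 30),
    ("a", "d", 300), ("c", "d", 45), ("a", "b", 20),
]
CHURNED = {"d"}  # hypothetical set of customers who already churned


def build_graph(calls):
    """Build an undirected neighbor map plus call counts and call directions."""
    neighbors = defaultdict(set)
    call_count = defaultdict(int)
    directed = set()
    for caller, callee, _duration in calls:
        neighbors[caller].add(callee)
        neighbors[callee].add(caller)
        call_count[frozenset((caller, callee))] += 1
        directed.add((caller, callee))
    return neighbors, call_count, directed


def sna_features(node, neighbors, call_count, directed, churned):
    """Per-customer features from the calling circle of `node`."""
    circle = neighbors[node]
    return {
        "unique_links": len(circle),
        "number_of_links": sum(call_count[frozenset((node, o))] for o in circle),
        "reciprocal_links": sum(
            (node, o) in directed and (o, node) in directed for o in circle
        ),
        "degree_centrality": len(circle) / (len(neighbors) - 1),
        "churned_contacts": len(circle & churned),
    }


def density(neighbors):
    """Share of all possible undirected links that actually exist."""
    n = len(neighbors)
    edges = sum(len(v) for v in neighbors.values()) / 2
    return edges / (n * (n - 1) / 2)


neighbors, call_count, directed = build_graph(CALLS)
features = sna_features("a", neighbors, call_count, directed, CHURNED)
print(features, round(density(neighbors), 2))
```

The `churned_contacts` count corresponds to the "close friend churned" style of feature mentioned earlier in the deck.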

CUSTOMER PROFILE – E.G. GAMER
Sources of Experience:
Network:
- Capabilities
- Access
- Bandwidth
- …
Social:
- Social Media
- Gaming forum
- Social Network
- Multi-Gaming identity
- …
Consumption:
- Data volume
- Messaging
- VoIP
- …
Devices:
- Multi vs Single
- Online / Offline
- …
Traditional sources:
- Profile (demographics)
- Behavior (Usage, CDRs)
- Interaction (CRM)
- Price plan, add-on services
- History
- …
Experience:
Areas of importance (AoI):
- MMO (Massively-Multiplayer Online Game) vs. Single player
- Online vs. Offline / multi-screen
- VoIP
- Game communication
- Data volume, Latency
- Access method
- Gaming forum, YouTube channels
- …

DATA INTEGRATION
From Analytical Data Mart to training table
DWH → Feature Engineering → Training table / Evaluation table / Scoring table

TYPICAL CHALLENGE IS HIGHLY IMBALANCED DATASET
Share of churn in relevant population is less than 10% in majority of cases, even less than 1% in some cases
[Figure: class distribution, Non-churn (0) vs. Churn (1); y-axis: number of customers]
The metric trap: if any of the values is zero, the model is biased
Confusion Matrix (99% ACCURACY)
                 Predicted: No   Predicted: Yes
Observed: No     114700          0
Observed: Yes    4334            0
[Figure: Gain Chart; x-axis: % of data sets, y-axis: % of events]
The goal is to get a curve like this
RESAMPLING TECHNIQUES
UNDERSAMPLING
Removing samples from the majority class
OVERSAMPLING
Adding more examples from the minority class
Weaknesses:
1. Loss of information (undersampling)
2. Overfitting (oversampling)
Useful only with big enough data sets, when 1% is actually more than 10 thousand units.
Tools:
• SQL / Python
• imbalanced-learn
Source: https://www.kaggle.com/rafjaa/resampling-strategies-for-imbalanced-datasets
OVER-SAMPLING FOLLOWED BY UNDER-SAMPLING
• SMOTE / ADASYN synthesize elements for the minority class, based on those that already exist, using the nearest neighbors:
  - Randomly pick a point from the minority class
  - Compute the k-nearest neighbors for this point
  - Add synthetic points between the chosen point and its neighbors
  - Add small random values to the points
• TOMEK LINKS are pairs of very close instances, but of opposite classes. Removing the instances of the majority class of each pair increases the space between the two classes, facilitating the classification process.
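The two ideas can be sketched without external libraries. This is a simplified illustration, not the reference SMOTE or Tomek link implementation from imbalanced-learn: it uses brute-force Euclidean neighbors, omits the random-noise step, and all function names and sample points are hypothetical.

```python
import math
import random


def nearest(point, candidates, k):
    """Indices of the k candidates closest to `point` (Euclidean distance)."""
    order = sorted(range(len(candidates)),
                   key=lambda i: math.dist(point, candidates[i]))
    return order[:k]


def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points on the segment between a random minority
    point and one of its k nearest minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = minority[:i] + minority[i + 1:]
        neighbor = others[rng.choice(nearest(base, others, k))]
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


def tomek_majority_indices(points, labels, majority=0):
    """Indices of majority points in a Tomek link: two points of opposite
    classes that are each other's nearest neighbor."""
    def nn(i):
        return min((j for j in range(len(points)) if j != i),
                   key=lambda j: math.dist(points[i], points[j]))

    to_remove = set()
    for i in range(len(points)):
        j = nn(i)
        if labels[i] != labels[j] and nn(j) == i:
            to_remove.add(i if labels[i] == majority else j)
    return to_remove


minority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3)]
majority = [(1.05, 1.0), (3.0, 3.0), (3.2, 2.8), (2.9, 3.1)]
new_points = smote_like(minority, n_new=3, k=2)
points = majority + minority + new_points
labels = [0] * len(majority) + [1] * (len(minority) + len(new_points))
drop = tomek_majority_indices(points, labels)
print("synthetic points:", new_points)
print("majority indices to drop:", sorted(drop))
```

Running the oversampling first and the Tomek link cleaning second mirrors the order named in the slide title.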

EXAMPLE OF BALANCING TECHNIQUES COMBINATION (SMOTETomek)
1. 12 months of historical data: 10:1 ratio (class 0 vs. class 1)
2. Boost minority class with SMOTE
3. Eliminate similar points with TomekLinks
4. More balanced training set: 5:1 ratio

MODEL DEVELOPMENT USING XGBoost ALGORITHM
IS THE BEST PRACTICE FOR IMBALANCED DATASET
Sequentially learning algorithm that is based on function approximation by optimizing specific loss functions as well as applying several regularization techniques.
• XGBoost offers fast computing speed combined with explainable results with regard to ranking feature importances.
• Compatible with the SHAP framework, offering even more in-depth explanations of model predictions
XGBoost
• Regularization for avoiding overfitting (both Lasso and Ridge)
• Efficient handling of missing data
• Cache awareness and out-of-core computing
• Parallelized processing
• In-built cross-validation capability
• Tree pruning using depth-first approach
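As a sketch of how the listed capabilities map onto XGBoost settings, the block below builds a parameter dictionary for the native training API. The parameter names are XGBoost's own; every value is an illustrative assumption and none comes from the presentation. The class counts reuse the confusion matrix shown earlier in the deck, and `imbalance_weight` is a hypothetical helper.

```python
def imbalance_weight(n_negative, n_positive):
    """Ratio of negative to positive examples, a common starting value for
    XGBoost's scale_pos_weight parameter."""
    return n_negative / n_positive


# Class counts taken from the confusion matrix earlier in the deck.
scale = imbalance_weight(n_negative=114700, n_positive=4334)

# Illustrative values only; tune them on validation data.
params = {
    "objective": "binary:logistic",  # churn / non-churn probability
    "eval_metric": "aucpr",          # precision-recall AUC suits rare positives
    "scale_pos_weight": scale,       # up-weight the churn class
    "max_depth": 6,
    "eta": 0.1,                      # learning rate
    "alpha": 0.1,                    # L1 (Lasso) regularization
    "lambda": 1.0,                   # L2 (Ridge) regularization
    "tree_method": "hist",
}

# With xgboost installed, the dictionary would be used roughly like this:
#   import xgboost as xgb
#   dtrain = xgb.DMatrix(X_train, label=y_train)  # missing values can stay NaN
#   booster = xgb.train(params, dtrain, num_boost_round=200)
print(f"scale_pos_weight = {scale:.1f}")
```

Class weighting of this kind is an alternative or a complement to the resampling techniques above; which works better is an empirical question for the data at hand.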

MODEL INTERPRETATION IS VITAL FOR FINE TUNING OF OFFERING
1. Overall interpretation: understanding the most important features with a feature importance plot (SHAP summary plot).
2. Local interpretation:
   1. understand, for an individual case, the reasons for the prediction.
   2. understand, on a filtered population, the most frequent reasons for their prediction.
3 variables with most contribution
ID        Probability  Class      1st variable     2nd variable     3rd variable
          to churn     predicted  Name    Impact   Name    Impact   Name     Impact
12098321  95%          1          Reb_1   +34      Bill_3  +19%     Lat_2    +8%
12098322  88%          1          Bill_1  +25      NPS_2   +14%     Sill_c3  +13%
12098323  35%          0          Inf_7   -27      Lat_2   -23%     Reb_1    -12%
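A table like the one above can be produced by ranking each customer's per-feature contributions by absolute size. The sketch below assumes the contributions are already available as a mapping per customer (with the shap package they would typically come from `shap.TreeExplainer(model).shap_values(X)`); the numbers are hypothetical values loosely modeled on the table, and `top_contributions` is an illustrative helper.

```python
def top_contributions(contributions, n=3):
    """Return the n (feature, value) pairs with the largest absolute value."""
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return ranked[:n]


# Hypothetical per-customer contributions, keyed by feature name.
customers = {
    "12098321": {"Reb_1": 0.34, "Bill_3": 0.19, "Lat_2": 0.08, "NPS_2": 0.02},
    "12098323": {"Inf_7": -0.27, "Lat_2": -0.23, "Reb_1": -0.12, "Bill_1": 0.05},
}

for customer_id, contrib in customers.items():
    cells = [f"{name} {value:+.0%}" for name, value in top_contributions(contrib)]
    print(customer_id, " | ".join(cells))
```

Positive values push the prediction toward churn and negative values away from it, which is what makes the top drivers usable for choosing an offer.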

ANALYTICAL OBJECTIVE
• Assign a churn score to all customers in the eligible segment
• Automatically target top X% of customers with high probability with special offer
• The score should be recalculated on a daily level
• New events should trigger near real-time scoring
• Optimize offer type and price for individual customer
MODEL PERFORMANCE EVALUATION
• Lift on top 1%, 10%, and 20% most likely churners
• Campaign performance evaluation (A/B testing):
  - Churn rate in different model percentiles
  - Churn rate DNC vs. TGT
  - Offer response rate DNC vs. TGT
  - Churn rate old vs. BD approach
  - Offer response rate old vs. BD approach
  - Monthly level measurement
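The lift measure named on this slide (top 1%, 10%, and 20% most likely churners) is the churn rate among the highest-scored customers divided by the overall churn rate. A minimal sketch follows; the scores and labels are hypothetical toy data and `lift_at` is an illustrative helper.

```python
def lift_at(scores, labels, top_fraction):
    """Churn rate among the top-scored fraction divided by the overall churn rate."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, int(len(ranked) * top_fraction))
    top_rate = sum(label for _, label in ranked[:n_top]) / n_top
    base_rate = sum(labels) / len(labels)
    return top_rate / base_rate


# Toy data: 100 customers, 10 churners, 8 of them ranked within the top 10 scores.
labels = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 88
scores = [1.0 - i / 100 for i in range(100)]

for fraction in (0.01, 0.10, 0.20):
    print(f"top {fraction:.0%}: lift {lift_at(scores, labels, fraction):.1f}")
```

A lift of 1 means the model targets no better than random selection, so the values at small fractions show how much a campaign gains by contacting only the top of the list.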

MODEL DEPLOYMENT IN AIRFLOW ENVIRONMENT

BENEFITS OF BIG DATA PLATFORM
1. Include better granularity of specific features
2. Quickly calculate daily attributes and longer history from more data sources
3. Combine results of different analytical models faster to optimize process and value
4. Recompute score in real time based on the latest customer activity / event
5. Efficient monitoring of model performance and execution

THANK YOU
Jelena.pekez@comtrade.com
Copyright © 2019 Comtrade. All rights reserved.
The content of this presentation is copyright protected.
Any reproduction, distribution, or modification is not allowed.
The information, solutions, and opinions contained in this presentation are of informative nature only and are not
intended to be a comprehensive study, nor should they be relied on or treated as a means to provide a complete
solution or advice, since we may not be aware of all specific circumstances of the case. We try to provide quality
information, but we make no claims, promises, or guaranties about the accuracy, completeness, or adequacy of the
information contained herein.
www.comtradeintegration.com
