The scenario selected for the POC is:
– Studying the events that occurred in the lives of clients in relation to churn.
– The aim is to identify customer behaviors that are likely to improve the
prediction of churn.
– The data used for analysis are those contained in the customer Teradata
– The data will be extracted and imported into Aster, or loaded directly
About the POC
Identify the "client": is it an individual, a household, a corporation, or something else?
Identify the key fields in the data.
Identify churn: which customers should be regarded as "churners"? Customers
who closed an account? All their accounts? Only some, and if so which ones? Customers who
have had a significant decrease in their activity? We call these customers
the "departed clients" of the bank.
Consider "events" in the lives of clients. Identify sequences of events that are
discriminating for departed clients, compared to others.
Events can be of various types, coming from different data in different systems and in
different forms. Their identification may require the execution of algorithms on
In parallel, it is necessary to identify, for each analysis, the data required to
support it. Since the data source is Teradata, you must identify
the tables and columns you need.
The procedure is as follows
Program: Preparing the data - First look at the data - Refinement of the method of
Results: During the week, we extracted and loaded into Aster about 500 tables. The
volume of data in Aster is around 250 GB.
Week of January 21
The presented data
closing in April 2012
Loading additional data
Closed vs. non-closed customers
Creation of cohorts
Preliminary analysis of the CRM events
Principle of “univarié” analysis (Univarié in French = Who uses a single
Performance of the “univarié” analysis of the CRM events
Analysis of closures in relation to CRM events
We now have the following data in Aster as sources of events:
CRM Events (management of customer relations),
Operations, from April 2012 to January 2013
Secure messaging events
Thursday, February 7 – 1/3
Thursday, February 7 – 2/3
We note that the years 2011 and 2012 have data that does not seem consistent. While in 2010 the
number of each type of event varies slightly from one month to the next, in 2011 and 2012 the behavior
It is quite possible that these data are not real, but result from testing in a qualification environment.
We will focus on previous years, in order to have more consistent data. This is valid for the customers
We tried to analyze the sequences of events of closed customers before closing (8 to 2 months before
the closing date), as well as those of the non-closed customer cohorts that we built.
The results are not conclusive, probably because the events of 2011 and 2012 are not
consistent. We decided to move the study range to 2008 to 2010, a range during which the
number of events by type seems consistent.
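The consistency check described above (monthly counts per event type that vary only slightly in a consistent year) could be sketched as follows. This is only an illustration in Python, not the Aster queries used in the POC: the `(event_type, year, month)` tuple layout, the function name, and the use of the coefficient of variation as the measure are all assumptions.

```python
from collections import Counter, defaultdict
from statistics import mean, pstdev

def monthly_variation(events):
    """events: iterable of (event_type, year, month) tuples (hypothetical layout).

    Returns {(event_type, year): coefficient of variation of the 12 monthly counts}.
    A low value means the counts vary only slightly from one month to the next.
    """
    counts = Counter(events)  # occurrences per (type, year, month)
    per_year = defaultdict(lambda: [0] * 12)
    for (event_type, year, month), n in counts.items():
        per_year[(event_type, year)][month - 1] = n
    result = {}
    for key, months in per_year.items():
        avg = mean(months)
        result[key] = pstdev(months) / avg if avg else float("inf")
    return result

# Toy data (invented): one stable year and one erratic year.
stable = [("call", 2010, m) for m in range(1, 13) for _ in range(10)]
erratic = [("call", 2011, m) for m in range(1, 13)
           for _ in range(1 if m % 2 else 30)]
cv = monthly_variation(stable + erratic)
```

Years whose coefficient of variation is far above that of the others would be candidates for exclusion; the cut-off to use is a judgment call.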
Thursday, February 7 – 3/3
Program: Analysis of customer data, from 2008 to 2010 - Analysis of CRM events
from 2008 to 2010 - Performance of the “univarié” analysis of the CRM events
2009 & 2010 - Analysis of sequences of CRM events of closed customers
Monday, February 11 – 1/4
We note that the consistent data run from January 2009 to December 2010. We focus on the period between these dates,
as regards CRM events.
We note that most closed customers have no "discriminating" events (because we excluded
those with a score <= 0.003). In fact, 184,907 customers had no discriminating event in the 14 months
before closing, excluding the last 2 months, out of a total of 201,152 closed customers. We have 43,133
customers who had events, 43,014 events had "discriminatory."
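A minimal sketch of this filter, in Python rather than in Aster SQL, may make the selection rule easier to follow. The data layout, the function names and the handling of window boundaries (whole calendar months, both ends inclusive) are assumptions; only the 0.003 threshold and the 14-month and 2-month limits come from the text.

```python
from datetime import date

SCORE_THRESHOLD = 0.003  # from the text: scores <= 0.003 are excluded

def months_between(earlier, later):
    """Whole calendar months from `earlier` to `later` (day of month ignored)."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)

def customers_without_discriminating_event(closings, events, type_scores,
                                           max_months=14, min_months=2):
    """closings: {customer_id: closing date}
    events: list of (customer_id, event_type, event date)
    type_scores: {event_type: score} (hypothetical score table)

    Returns the closed customers with no discriminating event in the window.
    """
    with_event = set()
    for customer_id, event_type, event_date in events:
        closing = closings.get(customer_id)
        if closing is None:
            continue  # not a closed customer
        age = months_between(event_date, closing)
        in_window = min_months <= age <= max_months  # boundary choice is assumed
        if in_window and type_scores.get(event_type, 0.0) > SCORE_THRESHOLD:
            with_event.add(customer_id)
    return set(closings) - with_event

# Toy data (invented event types and scores).
closings = {"A": date(2010, 12, 15), "B": date(2010, 12, 15), "C": date(2010, 12, 15)}
events = [
    ("A", "complaint", date(2010, 6, 1)),    # in window, discriminating
    ("B", "complaint", date(2010, 11, 20)),  # too close to the closing date
    ("C", "newsletter", date(2010, 6, 1)),   # in window, but score too low
]
type_scores = {"complaint": 0.2, "newsletter": 0.001}
no_event = customers_without_discriminating_event(closings, events, type_scores)
```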
Monday, February 11 – 2/4
We also limit events per customer to the last 4.
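Keeping only the last 4 events per customer could look like the sketch below. It assumes events are available as `(customer_id, timestamp, event_type)` tuples with sortable timestamps; this layout and the function name are illustrative, not the POC's actual tables.

```python
from collections import defaultdict

def last_n_events(events, n=4):
    """events: iterable of (customer_id, timestamp, event_type).

    Returns {customer_id: the n most recent (timestamp, event_type) pairs,
    oldest first}.
    """
    per_customer = defaultdict(list)
    for customer_id, timestamp, event_type in events:
        per_customer[customer_id].append((timestamp, event_type))
    return {cid: sorted(evts)[-n:] for cid, evts in per_customer.items()}

# Toy data: customer "A" has six events, customer "B" has two.
toy = [("A", f"2010-0{m}-01", f"type{m}") for m in range(1, 7)]
toy += [("B", "2010-03-01", "type3"), ("B", "2010-01-01", "type1")]
kept = last_n_events(toy)
```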
Monday, February 11 – 4/4
We now note that most closings are preceded by an event of type "Account-contact report":
face-to-face meeting – Subscription - Free Spirit termination.
Program: Textual analysis of the minutes of advisors - Background and data
preparation – Classification - Verification of stability - Most discriminating words
Friday, February 15 – 1/1
We have built a model. To assess its relevance, we calculate the "lift" that would be obtained if
predictions were based on it. The lift is positive, so we conclude that this analysis can provide some
information on the type of client (closed or non-closed).
To illustrate the analysis, we have created a "word cloud", where the size of the letters is proportional
to the lift. We used a web application for this: http://www.wordle.net.
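The text does not say how the lift is computed. One common definition, used here purely as an assumption, is the closing rate among reports containing a word divided by the overall closing rate, so that values above 1 point towards closing. The sketch below applies that definition per word; the data layout and function name are hypothetical.

```python
def word_lift(documents):
    """documents: list of (words, is_closed) pairs, one per advisor report.

    Returns {word: lift}, where lift is the closing rate among reports that
    contain the word divided by the overall closing rate (assumed definition).
    """
    total = len(documents)
    closed_total = sum(1 for _, closed in documents if closed)
    if total == 0 or closed_total == 0:
        raise ValueError("need at least one report from a closed customer")
    base_rate = closed_total / total
    stats = {}  # word -> [reports containing the word, of which closed]
    for words, closed in documents:
        for word in set(words):
            entry = stats.setdefault(word, [0, 0])
            entry[0] += 1
            entry[1] += int(closed)
    return {w: (c / n) / base_rate for w, (n, c) in stats.items()}

# Toy reports (invented words).
docs = [
    ({"transfer", "fees"}, True),
    ({"fees"}, True),
    ({"loan"}, False),
    ({"loan", "fees"}, False),
]
lifts = word_lift(docs)
```

Under this definition a word cloud would scale each word by `lifts[word]`. If the POC used a difference or a logarithm instead, the baseline would be 0 rather than 1, which would fit the phrase "the lift is positive".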
We have, in the CRM events table, the advisors' minutes as free text. We analyze this information to see if it
Program: Analysis of sequences of events – Overview - Details of the algorithm -
Analysis of accounting movements
Thursday, February 28 – 1/2
We note that, compared to the “univarié” score, we do not necessarily find the same event types
with the highest scores, apart from the type "Termination Free Spirit", which clearly indicates
closure of the account.
We can observe that the event "Termination Free Spirit" often leads to closing. We also
observe that the termination of a dematerialized contract behaves the same way, even if it involves fewer
customers. In contrast, other types of events have the opposite effect: they indicate non-closure. Few
customers go on to closing after these events, compared to those who go on to the non-closing position.
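The comparison described above, how many customers go on to closing versus non-closing after a given event type, can be illustrated with a simple count. This is a simplified Python sketch, not the sequence algorithm used in the POC; the input layout is an assumption and the "Appointment" event type and all figures are invented.

```python
from collections import Counter

def closing_share_by_event(observations):
    """observations: iterable of (event_type, closed) pairs, one per customer
    and event type.

    Returns {event_type: (number of customers, share that went on to close)}.
    """
    totals, closed_counts = Counter(), Counter()
    for event_type, closed in observations:
        totals[event_type] += 1
        closed_counts[event_type] += int(closed)
    return {t: (n, closed_counts[t] / n) for t, n in totals.items()}

# Toy data: invented counts, for illustration only.
obs = (
    [("Termination Free Spirit", True)] * 9
    + [("Termination Free Spirit", False)]
    + [("Appointment", True)] * 2
    + [("Appointment", False)] * 8
)
shares = closing_share_by_event(obs)
```

An event type with a share well above the overall closing rate points towards closing; one well below it points towards non-closure.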
Analysis of accounting movements: data are available from April 2012 until
January 2013. But they are clearly inconsistent outside the period June to
December 2012, so we will focus only on that period.
Thursday, February 28 – 2/2
Phase 1 was conducted together with the IT staff of the Bank. They
were very satisfied with the capabilities of Aster Data, and proposed to continue
the POC with business people.
The continuation will take place in phase 2.
Ball forward !
End of Phase 1