1. Case Study in Banking and Finance: The Real-World Use of Big Data in Financial Services
Big Data World Show, Malaysia
Loon Wing Yuen
Director, Innovation
Group Information Services,
Group Information and Operations Division
Shangri-La Hotel, Kuala Lumpur
9-10 September, 2014
2. The Opportunity in CIMB (circa 2012)
CIMB had the largest Facebook fan-base (over 600k) among banks in Malaysia
CIMB also had the largest Facebook fan-base (over 1m) among banks in ASEAN
CIMB also had a huge Twitter following
CIMB had launched OctoPay (Facebook Banking) and many Facebook-related marketing campaigns, several of which targeted debit card usage. From these we built a database matching CustomerIDs to FBIDs, with the associated structured and unstructured FB data
This data asset opened up potential debit card revenue opportunities
FB Marketing Campaigns (2011 & 2012)
ASEAN’s 1st banking service on Facebook (2012)
3. Application Use Case – Leveraging the value of our customers’ FB Likes
The Debit Card business was new, so the active card base was small and transaction volumes were low. As a result, the transaction data alone did not give the business a good picture of the types of merchant spend.
We decided to check whether FB Likes correlated with merchant spend. The hypothesis was that, if the correlation was good, we could take the wider range of merchant categories found in FB Likes and use them for marketing campaign interventions.
[Diagram: Big Data Platform (FB Likes) customers split into With Debit Card (With Spend: Regular Spender, Irregular* Spender; Without Spend) and Without Debit Card. Labels: Traditional Big Data, New Big Data. 2012]
4. Distribution of Customers’ FB Likes
►Total identified distinct users who generated FB Like*: 27,614 (out of a total of 53,482)
►Out of the 53k total, we linked 12,925 users as active CIMB customers
►Approximately 6.5% of users liked more than 500 pages in their Facebook profile.
►These 6.5% of “heavy FB Likes users” accounted for approximately 36% of the total FB Likes captured.
We discovered that the FB Likes generated by customers were very unevenly distributed.
* -- Excluding an FB Like for CIMB.
Total Number of User Count and Total Number of User Like Pages, by Total Number of Likes per User (2012)
Likes per user      Users     Liked pages   Share of liked pages
<=100               14,544    567,047       12%
>100 and <=300      9,006     1,584,913     34%
>300 and <=500      2,277     874,962       18%
>500                1,787     1,696,692     36%
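The bucketing behind this distribution can be sketched in a few lines. This is a minimal illustration, not the code used on the project: the function names and the input shape (a mapping of user id to number of liked pages) are assumptions. The bucket boundaries are the ones shown on the slide, and the slide's own totals are consistent with them (1,787 of 27,614 users is about 6.5%).

```python
BUCKETS = ["<=100", ">100 and <=300", ">300 and <=500", ">500"]


def bucket_of(likes):
    """Map a user's like count to the bucket labels used on the slide."""
    if likes <= 100:
        return BUCKETS[0]
    if likes <= 300:
        return BUCKETS[1]
    if likes <= 500:
        return BUCKETS[2]
    return BUCKETS[3]


def summarise(likes_per_user):
    """likes_per_user: dict of user id -> number of liked pages (hypothetical input).

    Returns, per bucket, the user count, the total likes and the share of all likes.
    """
    summary = {b: {"users": 0, "likes": 0} for b in BUCKETS}
    for likes in likes_per_user.values():
        bucket = summary[bucket_of(likes)]
        bucket["users"] += 1
        bucket["likes"] += likes
    total_likes = sum(s["likes"] for s in summary.values())
    for s in summary.values():
        s["like_share"] = s["likes"] / total_likes if total_likes else 0.0
    return summary


if __name__ == "__main__":
    sample = {"u1": 50, "u2": 150, "u3": 400, "u4": 1400}  # made-up users
    for name, stats in summarise(sample).items():
        print(name, stats)
```

A skewed result, where a small bucket of heavy users holds a large share of the likes, is a hint that per-user like counts may need capping or weighting before they are compared with spend.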
5. Correlation of FB Likes with Merchant Spend
►Significant amounts of data cleansing and transformations required
►Correlations stronger in certain merchant categories/brands
►Not every FB Like is correlated
►Statistical testing required to determine the strength of correlations
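The slides do not say which statistical test was used. As one plausible illustration, a chi-square test on a 2 x 2 table (liked the brand or not, spent at the merchant or not) gives both a significance check and a strength measure (the phi coefficient). The function name and the example counts below are hypothetical.

```python
import math

# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CHI2_CRITICAL_95 = 3.841


def chi_square_2x2(like_spend, like_no_spend, no_like_spend, no_like_no_spend):
    """Return (chi-square statistic, phi coefficient) for a 2 x 2 contingency table.

    Rows: customer liked the brand page or not.
    Columns: customer spent at the merchant or not.
    """
    a, b, c, d = like_spend, like_no_spend, no_like_spend, no_like_no_spend
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column needs at least one customer")
    diff = a * d - b * c
    chi2 = n * diff * diff / denom
    phi = diff / math.sqrt(denom)
    return chi2, phi


if __name__ == "__main__":
    # Made-up counts for one brand.
    chi2, phi = chi_square_2x2(30, 10, 10, 30)
    print("chi2 =", chi2, "phi =", phi, "significant:", chi2 > CHI2_CRITICAL_95)
```

Running this per merchant category or brand separates the FB Likes that carry a usable signal from those that do not, which matches the observation that not every FB Like is correlated.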
We discovered good correlations among certain merchant categories and brands.
•The matching of merchant_name to fblike_name is based on a simple SQL LIKE pattern match, which does not guarantee a full match between the two names.
•More powerful data cleaning is needed to match the merchant_name and fblike_name more accurately.
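One possible step up from a substring match is to compare whole normalised tokens. The sketch below is an assumption about what "more powerful data cleaning" could look like, not the approach taken on the project; the function names and sample strings are hypothetical. It contrasts a LIKE '%name%' style substring test with a token-level test.

```python
import re


def tokens(name):
    """Lowercase a name and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", name.lower()))


def naive_like_match(merchant_name, fblike_name):
    """Rough equivalent of: merchant_name LIKE '%fblike_name%' (case-insensitive)."""
    return fblike_name.lower() in merchant_name.lower()


def token_match(merchant_name, fblike_name):
    """True when every token of the FB page name appears as a whole token
    in the merchant name."""
    page = tokens(fblike_name)
    return bool(page) and page <= tokens(merchant_name)


if __name__ == "__main__":
    # Made-up merchant descriptors as they might appear on a card statement.
    print(token_match("STARBUCKS COFFEE-KLCC", "Starbucks"))   # whole-token hit
    print(naive_like_match("STARHILL GALLERY", "Star"))        # substring false positive
    print(token_match("STARHILL GALLERY", "Star"))             # rejected at token level
```

Token matching removes the partial-word false positives but still misses abbreviations and misspellings, so a curated alias table or fuzzy matching would likely be needed on top of it.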
[Figure: Debit Card Txn matched with FB data. 2012]
6. Distinct Count of FBLikes for Starbucks by Micro Segment
Targeted Interventions by Merchant Brand
►Size of bubble represents Total FB Likes from Credit Card Prospect Base*
FILTERED BDPP DATASET AS AT 11 MAR 2013:
1. 6 FB CAMPAIGN DATA FROM GMCD - I LOVE NEW YORK, DEBIT CARD RESKIN, MY DEBIT CARD, FOOTBALL FANTASY, YOUTH PEEK BUY, YOUTH VIDEO VOTING
2. FB CIMB_ASSISTS DATA FROM GMCD
3. CUSTOMER TAGGING DATA FROM BIU
We can then partner with a selected existing merchant (e.g. Starbucks) and design a very targeted campaign, or on-board a promising new merchant partnership.
Note*: Prospect Base is based on active customers aged > 21 yrs old without a credit card.
Micro Segment                 Distinct Count of FBLikes for Starbucks
Facebook User Base            4,747
Active Customer Base          1,778
Credit Card Base              96
Credit Card Prospect Base*    1,216
Debit Card Base               946
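A count like the one in this table can be sketched with plain set operations. The sketch assumes that "distinct count" means distinct Facebook users who liked the brand page, and that segment membership is available as sets of FB ids; both are assumptions, and all names and data are hypothetical.

```python
def distinct_likers_by_segment(likes, segments, brand):
    """Count distinct users who liked `brand`, per segment.

    likes:    iterable of (fb_id, page_name) rows, possibly with duplicates
    segments: dict of segment name -> set of fb_ids in that segment
    """
    wanted = brand.lower()
    likers = {fb_id for fb_id, page in likes if page.lower() == wanted}
    return {name: len(likers & members) for name, members in segments.items()}


if __name__ == "__main__":
    # Made-up like rows; "fb1" appears twice to show de-duplication.
    likes = [
        ("fb1", "Starbucks"), ("fb1", "Starbucks"),
        ("fb2", "starbucks"), ("fb3", "Nike"), ("fb4", "Starbucks"),
    ]
    segments = {
        "Facebook User Base": {"fb1", "fb2", "fb3", "fb4"},
        "Debit Card Base": {"fb1", "fb3"},
        "Credit Card Prospect Base": {"fb2", "fb4"},
    }
    print(distinct_likers_by_segment(likes, segments, "Starbucks"))
```

Segments may overlap (a customer can hold both a debit and a credit card), so the per-segment counts are not expected to add up to the Facebook User Base figure.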
[Charts: Distinct Count of FBLikes for Starbucks by Business Segment; Distinct Count of FBLikes for Starbucks by Macro Segment; FBLikes Comparison between Credit Card Base & Debit Card Base for Starbucks by Macro Segment. 2013]
7. Application Use Case – Moving on to the Credit Card base
The results from the work on the Debit Card base were promising enough to gain buy-in for the next phase: work on the Credit Card base.
2013
The scope was to create actionable insights to:
► Increase credit card usage
► Reactivate inactive credit card users
The approach was to:
► Focus on influencing usage behavior – hence the focus on analyzing customer behaviors
► Influence usage behavior by offering targeted merchant offers
► Increased usage will generally lead to increased balances
The deliverables were:
► Decile analysis of the card user base by card spend, merchant category and merchant brand spend
► A range of actionable propositions that can drive card usage
► A fully sized segmentation model for targeted offers
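The decile analysis named in the deliverables can be sketched as follows. This is an illustration under assumptions: the input is a mapping of customer id to total card spend, decile 1 holds the highest spenders, and the function names are hypothetical. The same routine could be run per merchant category or brand by passing in spend for that category only.

```python
def assign_deciles(spend_by_customer):
    """Rank customers by spend (highest first) and assign deciles 1..10.

    Decile 1 is the top 10% of spenders. With fewer than 10 customers
    some deciles stay empty.
    """
    ranked = sorted(spend_by_customer, key=spend_by_customer.get, reverse=True)
    n = len(ranked)
    return {cust: rank * 10 // n + 1 for rank, cust in enumerate(ranked)}


def spend_share_by_decile(spend_by_customer):
    """Return the share of total spend contributed by each decile."""
    deciles = assign_deciles(spend_by_customer)
    total = sum(spend_by_customer.values())
    totals = {d: 0.0 for d in range(1, 11)}
    for cust, decile in deciles.items():
        totals[decile] += spend_by_customer[cust]
    return {d: (t / total if total else 0.0) for d, t in totals.items()}


if __name__ == "__main__":
    # Made-up spend figures for 20 customers.
    spend = {f"c{i}": i * 10 for i in range(1, 21)}
    for decile, share in spend_share_by_decile(spend).items():
        print(decile, round(share, 3))
```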
8. Credit Card Usage Analysis
The business goal at a high level is to maximise both usage and balances for each credit card customer.
[Figure: 3 x 3 grid of Usage (Low, Medium, High) against Balance (Low, Medium, High), with cells numbered 1 to 9. Named cells: High Usage / Low Balance (Transactors) and High Usage / High Balance (Core Revolvers). Callouts: ‘Occasionals’, Profitable group. 2013]
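Placing a customer in the usage and balance grid comes down to banding two numbers. The sketch below is illustrative only: the cut-off values are invented, the choice of monthly spend and revolving balance as inputs is an assumption, and only the two cell names that appear on the slide (Transactors, Core Revolvers) are used.

```python
# Cell names taken from the slide; all other cells are left unnamed.
SEGMENT_NAMES = {
    ("High", "Low"): "Transactors",
    ("High", "High"): "Core Revolvers",
}


def band(value, low_cut, high_cut):
    """Return Low, Medium or High for a value given two cut-offs."""
    if value < low_cut:
        return "Low"
    if value < high_cut:
        return "Medium"
    return "High"


def grid_cell(monthly_spend, revolving_balance,
              usage_cuts=(200.0, 1000.0), balance_cuts=(500.0, 3000.0)):
    """Place a customer in the 3 x 3 grid.

    The default cut-offs are hypothetical; in practice they would come
    from the observed spend and balance distributions.
    """
    usage = band(monthly_spend, *usage_cuts)
    balance = band(revolving_balance, *balance_cuts)
    return usage, balance, SEGMENT_NAMES.get((usage, balance), "")


if __name__ == "__main__":
    print(grid_cell(1500, 0))
    print(grid_cell(1500, 5000))
    print(grid_cell(50, 100))
```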
9. From Analysis to Actionable Insights
An example of crafting a marketing proposition for the ‘Occasionals’ cohort.
1. Understand customer purchases by merchant categories
2. Understand merchant product features
3. Plan campaign and create offers
4. Generate the customer list for each offer and execute according to campaign plan
[Diagram labels: Which product to offer; Choosing who to target. 2013]
10. Big Data Analytics Platform for Business
In reality though, this is how the business is analyzed – by deciles. The new Big Data Exploration Portal allows “speed of thought” analyses as compared to the traditional multi-week report turnarounds from the data-warehouse – a key metric is now “Time to Actionable Insights”.
2014
Entire Customer Base
with > 30 months of transactional data
500+ different metrics calculated in < 2 seconds
11. Our Biggest Challenge
At least two-thirds of our effort and time is spent on data cleansing, filtering, transformation, enrichment, etc. instead of extracting business value from the data.
12. Big Data requires familiarity with Statistical/Machine Learning and NoSQL approaches
The Statistical/Machine Learning approaches used were:
► Principal Component Analysis (a statistical dimensionality reduction approach) was used to reveal key behaviors among the credit card base
► K-Means clustering (a Machine Learning approach) was used to identify and segment “Low to High (Y1 Y2) Usage” spend behavior
► Neural Networks (a Machine Learning approach) were used to predict spend behavior
► Support Vector Machines (a Machine Learning approach) were used to predict customer inactivity
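To make the K-Means item concrete, here is a small dependency-free sketch of the algorithm (Lloyd's iteration) applied to pairs of (year 1 spend, year 2 spend). Reading "Y1 Y2" as year-1 and year-2 spend is an assumption, the data points are made up, and a production system would normally use a library implementation with scaled features.

```python
import math


def kmeans(points, initial_centroids, max_iterations=20):
    """Plain K-Means on tuples of numbers.

    Returns (centroids, labels), where labels[i] is the index of the
    centroid closest to points[i].
    """
    centroids = [tuple(c) for c in initial_centroids]

    def nearest(point):
        return min(range(len(centroids)),
                   key=lambda i: math.dist(point, centroids[i]))

    for _ in range(max_iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            clusters[nearest(point)].append(point)
        updated = []
        for old, members in zip(centroids, clusters):
            if members:
                # New centroid is the per-axis mean of the cluster members.
                updated.append(tuple(sum(axis) / len(members)
                                     for axis in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster where it was
        if updated == centroids:
            break
        centroids = updated

    return centroids, [nearest(point) for point in points]


if __name__ == "__main__":
    # (year 1 spend, year 2 spend) per customer, made-up values.
    customers = [(100, 120), (110, 100), (90, 110),
                 (100, 900), (120, 1000), (80, 950)]
    centroids, labels = kmeans(customers, [(0, 0), (0, 1000)])
    print(centroids)
    print(labels)
```

In this toy data the second cluster holds customers whose spend rose sharply from year 1 to year 2, which is the kind of "Low to High" group a campaign might try to grow.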
The NoSQL approaches used were:
► De-normalisation/nesting of the transactional data
► Modeling the data for optimal access for the purposes of supporting long-term customer analytics and near-realtime customer intervention systems
Some of the statistical/machine learning and NoSQL approaches used were:
2013 - 2014
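De-normalisation/nesting means storing each customer's transactions inside one customer document, so that a customer-centric query reads a single record instead of joining tables. The sketch below shows the idea with plain dictionaries; the field names are hypothetical and the actual repository schema is not described in the slides.

```python
from collections import defaultdict


def nest_by_customer(rows):
    """Turn flat transaction rows into one nested document per customer.

    rows: iterable of dicts with at least 'customer_id' and 'date' keys
          (dates as ISO strings so they sort chronologically).
    """
    grouped = defaultdict(list)
    for row in rows:
        txn = {key: value for key, value in row.items() if key != "customer_id"}
        grouped[row["customer_id"]].append(txn)
    return {
        customer_id: {
            "customer_id": customer_id,
            "transactions": sorted(txns, key=lambda t: t["date"]),
        }
        for customer_id, txns in grouped.items()
    }


if __name__ == "__main__":
    # Made-up flat rows, as they might come from a relational table.
    flat = [
        {"customer_id": "C1", "date": "2014-02-01", "merchant": "Starbucks", "amount": 15.0},
        {"customer_id": "C2", "date": "2014-01-20", "merchant": "Petrol", "amount": 80.0},
        {"customer_id": "C1", "date": "2014-01-05", "merchant": "Grocer", "amount": 42.5},
    ]
    for document in nest_by_customer(flat).values():
        print(document)
```

The trade-off is larger, duplicated records in exchange for fast per-customer reads, which suits long-term customer analytics and near-realtime interventions.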
13. Enhancing Business Capabilities with Big Data analytics
Big Data analytics can enhance all business dimensions of “Analytics” and “Management Information”
Compliance & Regulatory Analytics: Basel II & III; FATCA; Sarbanes Oxley Act (SOX); Fraud / AML; Suspicious Activity; Compliance Reporting; Regulatory Reporting
Risk Management Analytics: Credit Risk; Market Risk; Operations Risk; Liquidity Risk; Capital Analysis; Collection Analysis; Exposure Analysis
Sales Analytics: Event/Campaign Analytics; Behavior Analytics; Market Analytics; Transaction Analytics; Customer Analytics; Targeted Marketing / Sales; Lead Analytics
Management Analytics: Income Analytics; Cost Analytics; Profitability Analytics; Sales Performance; Payment Analytics; Capital Allocation Analytics
Position Analytics: Balance Sheet Analytics; Weighted Average Analytics; Structured Finance Analytics; Liquidity Analytics; Corporate Action Analytics
Performance Analytics: Financial Market Analytics; Foreign Exchange Analytics; Settlement Analytics; Performance vs Benchmark; Asset Allocation Analytics; Product Analytics; Portfolio Performance; Portfolio Risk Analytics
14. Rebuilding our Big Data and Machine Learning Platform
There is an incredible opportunity to leverage Big Data and Machine Learning technologies to add advanced capabilities to our digital channels as well as to dramatically reduce “time to actionable insights” for our business stakeholders.
[Architecture diagram, 2014. Components: Business Analyst; Exploration Portal; Analytics REST-API layer; ElasticSearch Indexing; Customer NoSQL Repository (Cassandra); Enterprise Data Warehouse; Map-Reduce, Pig, Hive, Tez, HBase, Storm, Spark*; Yarn; HDFS]
15. Summary and our learnings along the journey so far (2012 – 2014)
Approach
► Focus on the business priorities first; start with an engaged business stakeholder and a manageable pilot
► Identify a business opportunity to address and prove the viability/business case; let the next business use case build upon this success and expand
Capabilities
► Focus on people and skills, less on the technologies
► The technologies are new, so be prepared to experiment; use, discard and replace technology components as required (many are open-source, fortunately)
► Data cleansing/preparation/management is a big issue, not to be underestimated
► If the existing EDW is not primarily built for customer centricity and insight, don’t retrofit this into the EDW. Instead, build something akin to the Customer NoSQL Repository outside it, using new Big Data technologies
Opportunity
► There is an incredible opportunity to leverage Big Data and Machine Learning technologies to add advanced capabilities to an organisation’s digital channels and to support the business need of significantly reducing “time to actionable insights”
► There is significant business opportunity in leveraging external data such as FB and Twitter
► But rethink the approach to leveraging this FB and Twitter data: start by working on the issue of reliably linking external ids with internal customer ids