How Eastern Bank Uses Big Data to Better Serve and Protect its Customers
1. >Eastern Bank Data Engineering
How Eastern Bank Uses Big Data to
Better Serve & Protect its Customers
Brian Griffith
Principal Data Engineer
2. Agenda
• Introduction
• Eastern Bank & the banking industry
– Data architecture and our big data journey
– Challenges
– Use Case:
• Debit card anomaly detection
3. @bwgriffith
• Database developer and engineer for 15 years
• Working in the “big data” space for about 5
years
– Blizzard Entertainment – Irvine, CA
– Localytics – Boston, MA
• Now @ Eastern Bank, helping engineer their next
generation data platform
b.griffith@easternbank.com
4. Eastern Bank
• 197 year old mutual bank (largest of its kind in the
country)
– Leader in corporate social responsibility
– 8th most charitable business in Massachusetts
• ~1 Million customers
• 4 Organizations:
– Banking: Eastern Bank
– Insurance: Eastern Insurance Group
– Wealth: Eastern Wealth Management
– R & D and Product Dev: Eastern Labs
5. Banking is Evolving
• Customer activity moving more into the mobile
space
• Diverse services continuously emerging
• Customers value personalized service
– Relevant value added services
– Personal relationships
6. Positioned for the Best of Both Worlds
• Like larger banks, leverage data in a manner
that allows us to offer improved features and
convenience
• Like smaller banks, leverage data in a manner
that allows us to offer more customized
services and relationships
8. Past Data Architecture Issues
• Customer data lives in transaction “silos”
– 3 Major data entities: Insurance, wealth, and
banking
– Data access via in-house or out-sourced solution
– Impedes analysis
• Regulatory compliance
– Technical Debt
– Auditing
– 3rd party dependencies
9. Data Architecture Goals
• Abstraction from source systems
• Scale horizontally, not vertically
• Complete ownership of depth and breadth of
our data
• Improve data quality and stewardship
• Drive iterative analytics throughout the
enterprise
• “Make the bank smarter”
10. Data Architecture
[Diagram: four-tier data architecture — Tx, Data Warehouse, Customer Master, Big Data Store]
• Eastern endeavors to be relationship-driven, not transaction-driven. In a digital economy, face-to-face interactions continue to decline. We need to rely on data integration and analytics to know our customers to best meet their evolving needs
• Our Data Architecture is built on four
interdependent “tiers” each with its own
capabilities and contributions to the overall
enterprise platform
11. Hadoop
[Diagram: data architecture with Hadoop tier — Tx, Data Warehouse, Customer Master, Big Data Store]
• Can be a significant driver of customer
intimacy in an increasingly digital world
• Allows us to leverage data we’ve never
thought of as “Customer Data” before
• Goes beyond what a customer has with us –
gives visibility into what a customer does with
us through behavioral analytics
• Scales ability to store with ability to process
• Platform natively supports data analytics
languages and machine learning tools
• Fast processing enables iterative exploration
14. Challenges
• Governance!
– Ingestion
– Data Lineage
– Data Quality
– Managing growth
• Balancing what data we “can” keep vs data we “should” keep
• Security
– Personally Identifiable Information (PII)
– Mask and limit view of data
• Driving Consumption
– “If you build it, they will come” Does not work by itself
– Constant evangelism
– Need to demonstrate value!
16. Hadoop Data Science
Fraud Detection Proof of Concept
17. Fraud in the Financial Industry
An Introduction
• In 2012, there were 31.1 million fraudulent transactions, with a value of $6.1 billion1
1 The 2013 Federal Reserve Payments Study
18. Debit Card Fraud
• Industry-wide debit card fraud has been rising at a significant rate
• > 400% in the last 3 years!
• Mostly due to breaches at large, national
retailers
19. Use Case Generation
• Develop process to work in conjunction with
existing fraud detection tools
– Existing tools mostly rules based
• Leverage Hadoop to traverse broad customer
history for anomalous patterns
– Behavioral analysis
20. Fraud Use Case Workflow
• DATA – sample transactions & claims to build training data
• FEATURES – identify account behavior patterns indicative of fraud
• TRAINING – scoring model will identify suspicious accounts the day after fraud happens
• TESTING – testing and validating features iteratively
21. Data
• Claims – customer reported
• Only use customer's first claim
• Model trained on all available transaction data
22. Features
• Variables indicative of fraud, formatted for
machine learning
• Example: dollarRatio = ratio of dollar spend today vs. history
• Values calculated by comparing variables
today vs history
– Ratios, log(n), binary, etc…
• Higher value = more suspicious
• Hadoop performance
23. Building and Evaluating the Model
[Chart: ROC for TestModel — Fraud Detection Rate vs. Total Accounts; series: training, testing, reference]
Receiver operating characteristic shows model tuning.
Reviewing 20% of accounts finds ~80% of anomalies.
Reference line shows predicted result of random sample.

Feature      Weight   Std Error   Z        p(>|Z|)
(Intercept)  -3.44    0.051       -66.93   < 2e-16
dollarRatio   0.09    0.007        11.75   < 2e-16

[Chart: False Positive Rate for TestModel — False Positive Ratio vs. Fraud Detection Rate; series: testing]
24. Scoring
• How anomalous were a day’s transactions
– Value range: 0.00 – 1.00
– Comparing a day to customer’s history
• Assigned to each unique account
• Function of weights & feature values
30. Results & Testing
ACCOUNT    Score    Feature 1   Feature 2   Feature 3   Feature 4   Feature 5   Feature 6
xxxxxxxx   0.9979   0           14.844      3.088       52.461      41.066      1

Merchant          Amount   Timestamp
Internet Vendor   $12.25   4/30/15 3:42 AM
Internet Vendor   $3.01    4/30/15 3:42 AM
Internet Vendor   $2.46    4/30/15 3:42 AM
Internet Vendor   $1.49    4/30/15 3:42 AM
Internet Vendor   $18.95   4/30/15 3:42 AM
31. Iterating
• Build new features
• Remove ineffective features
• Address feature interaction
• Minimize False Positives
• Try Different Algorithms
32. Next Steps
• Real time w/ Spark & MLlib
– Get closer to when fraud actually occurs
• Expanded customer reach via notifications
– Improved customer service
• More agile feedback loop based on customer
assessment
33. Other Uses
• Comparing customer behaviors day over day carries over to many use cases:
– Predicting churn
– Customer segmentation & personas
– Predicting Customer Lifetime Value (CLV)
34. Wrap up
• Banking is evolving
• Hadoop addresses a very large gap in our
architecture
• Empowers us to know more about our customers
through all of their interactions with us
• Needs to be governed
• Customer fraud detection only the tip of the
iceberg
35. Special Thanks
• Mark Leonard (Eastern Bank) – SVP, Data &
Development Director
• Joe Blue (MapR) – Data Scientist
In this presentation we will be talking about Eastern Bank’s Journey with Hadoop.
This journey is relatively young as we have only had Hadoop under our roof for about 6 months, but in that time we have done some interesting things, as well as learned some valuable lessons.
In this talk I will begin by reviewing the data challenges we face in the banking industry.
I’ll then discuss how hadoop fits into our overall data architecture to help us realize our overall data strategy
Finally, I will dive into a few hadoop-centric use cases, if you will, revolving around debit card anomaly detection
Been working in databases my whole career.
Started out developing small scale OLTP and reporting databases
Transitioned to more data warehousing (star schema)
Then moved out west to Blizzard and helped engineer and build out their first hadoop deployment
Then decided to move back east and worked at a local startup Localytics and their broad AWS system
I then was presented with an opportunity @ EB to help build out a completely new data architecture from the ground up.
Being mutual means we are not publicly traded; our shareholders are our customers
Eastern Bank actually consists of multiple entities:
EIG – Insurance
EWM – Wealth Management
Labs – R & D
Banking industry, as a whole, is changing. Customer activity is moving more and more into the online and mobile space.
Recent marketing research has shown that customers are prioritizing the availability of web and mobile services, even if they do not intend to use them.
As a result, new financial services are emerging in the marketplace.
Even with all of this talk of the digital space, customer studies continue to show that customers value a personal relationship above all else.
Maintaining this relationship gets more difficult with size
EB finds itself in a unique position:
Big Banks all extolling the virtues of big data.
Smaller banks can’t compete there so they are focusing on developing a 360 degree view of the customer.
We’re uniquely positioned to do both.
We have the skills and the scale to leverage big data – and –
We’re still at a size where we know our customers well enough to get to a leverage a 360 view of their relationship with us.
Targeted products and improved features and products
Our old data architecture shared various issues common with financial institutions.
Customer data resides in silos, making it difficult to get a complete picture of the customer. This is done for both security and performance considerations.
These siloes present many difficulties:
There are 3 “major” processing branches to the bank: Insurance, Wealth and Banking; all with their own set of data sources
Data is accessed within these siloes in a mix of in house or out-sourced applications
These siloes also make consumption from downstream systems, like BI tools, difficult
Data augmentation is very difficult
As a financial institution, we have to adhere to strict regulatory compliance
Heavy technical debt was incurred due to the vast variety of source and reporting systems
Lots of auditing overhead
Vendor application imposed various 3rd party dependencies, increasing support complexity
To address these deficiencies in our new architecture, we had several goals
We wanted to abstract ourselves from our source systems so that as they change over time, our downstream systems will remain unaffected
We need a system that can predictably scale in terms of cost and performance
We wanted to have complete ownership of the depth and breadth of our data. Meaning we didn’t want to be limited in terms of what and how much data we can keep.
We wanted an architecture that enabled improved data quality and stewardship. Pushing data ownership down through the business lines.
We also wanted a system that drove iterative analytics throughout the enterprise. If a particular business line wanted to partake in their own data science experiment, we want to be able to provide them the optimal platform.
Finally, I use this quote as my team’s mantra. We want to make the bank smarter. I’m not saying the bank isn’t smart now, far from it. I’m speaking to the fact that we want to empower the bank to be proactive in leveraging all of its data assets at its disposal to make the most informed decisions possible.
While this may look like a pyramid, this illustration is meant to reflect granularity or field of vision of the data available to each level of our new stack.
At the bottom we have our systems of record that quickly and precisely execute transactions.
An example of this being teller transactions
Next up we have our data warehouse layer, which adds history and allows us to report against these transactions
Above that we have our Customer Master system, which shows us the breadth of the relationships our customers have with us.
An example of this identifies a customer's banking account relationships as well as any insurance or wealth relationships
Hadoop
Makes us reconsider what we save and what is valuable. Thinking different, etc..
Hadoop allows us -- demands us to think differently about keeping and using data.
Exhaust data – log files, transaction detail history, email, any evidence of interactions, regardless of format – can now be "customer data"
Hadoop allows us to store and process vast amounts of this formerly untapped data to achieve customer intimacy at web scale. If we think about what data represents for customer behavior, we can know what a customer does with us, not just what they have with us.
This knowledge will allow us to customize services to customers, and fosters a more intimate relationship, even if we don’t see them every day in a branch.
In terms of Hadoop, when implementing a system that can ingest any type of data, we are immediately faced with challenges.
Unchecked, your large data store would quickly become cluttered, and you lose sight of what you actually have.
Some people equate this to having your big data lake become a swamp.
However, I have little girls, so I equate it to trying to find Waldo.
At Eastern Bank we need to be mindful of our customers, and make every effort to protect their data.
And in case you’re wondering… Waldo is here.
To prevent these issues, we institute some strict governance policies. These policies govern:
What data goes into hadoop
Validation of data against source systems where available
Data Lineage - We need to track what happens to that data once in the system.
Is it manipulated? If so, how and by who?
Who has access to this data?
And finally we constantly need to balance what data we can keep vs what data we should keep
These policies are developed and managed, not by a bunch of data nerds, but by a multi-disciplinary team consisting of all business lines of the bank (info sec, systems eng, deposit ops, etc.), driving stewardship into the lines.
The good news is that larger banks are doing this, so some precedent has been set.
We also need to secure hadoop, in terms of who can see what types of data. Sometimes this means that copies of data need to be created to mask certain PII for analytics.
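As a rough illustration of what masking a copy of the data for analytics might look like, the sketch below replaces PII columns with one-way hashes. The field names are hypothetical, and this is not Eastern Bank's actual masking implementation:

```python
import hashlib

def mask_pii(record, pii_fields=("ssn", "name", "address")):
    """Return a copy of the record with PII columns replaced by one-way hashes."""
    masked = dict(record)
    for field in pii_fields:
        if field in masked and masked[field] is not None:
            digest = hashlib.sha256(str(masked[field]).encode("utf-8")).hexdigest()
            # Truncated hash: a stable join key for analysts, but not readable PII.
            masked[field] = digest[:16]
    return masked

row = {"account": "12345", "name": "Jane Doe", "amount": 12.25}
print(mask_pii(row))  # name is hashed; non-PII fields pass through unchanged
```

Because the hash is deterministic, analysts can still group and join on the masked columns without ever seeing the underlying values.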
And finally, with a new technology like hadoop, constant evangelism is needed to drive consumption.
Speak to making the bank smarter
Speak to security in banking industry
We partnered with MapR to build our initial fraud model as a proof of concept. From this POC my team was brought up to speed on how the ML learning process works from an engineering stand point, and more importantly how we can maintain and iterate this model for future development.
Also, when talking about fraud modeling, there is a lot of “secret sauce”, so with some of the data representations you’re about to see…. I made some stuff up. But I can tell you it all reflects what we see on a day to day basis.
Every 3 years the Federal reserve releases a study on financial fraud.
In 2012, the estimated number of “third party fraud” transactions was 31.1 million, which equates to a value of $6.1B
A majority of these, as you can see from these charts, were centered around debit card activity.
These numbers drive why fraud is an excellent candidate for a proof of concept. The subject matter is high visibility with a known monetary impact.
In the past three years fraud has exploded across the industry.
>400%
So for this case study, we wanted to develop a DAILY process to work in conjunction with our existing fraud detection tools, not replace them.
We wanted to leverage hadoop to traverse our individual customer’s histories for anomalous patterns.
We wanted to look at individual customer behavior in more detail than what is currently available with vendor solutions to detect as much fraud as possible.
This use case forced us to look at data differently
The workflow for this use case consists of 4 primary steps.
Collecting data. This includes not only transaction data, but claims data as well (known frauds!), which would be used for training and testing of the fraud model
Next is the design of features, which help identify patterns indicative of fraud. Examples of these may include the $$ transacted today vs history, # transactions, etc…
Next we will train our model and start scoring accounts
And Finally we will analyze suspicious accounts to track feature performance and false positives
For this exercise we will use two different data sets.
Claims data will be used for training and testing our model. Having customers that file claims based on fraudulent activity against their card gives us the ability to train our model against actual fraud patterns. However, we can't just jump in and use all of it. For model building, it is important that only the first known fraud on an account is used. Otherwise, the model may see the prior fraudulent behavior, giving it an unrealistic advantage.
Transactional data is then used to generate feature values and ultimately a “score” for each account.
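The first-claim rule can be sketched in a few lines. This is a simplified stand-in (field names like `account` and `claim_date` are assumed, not from the actual system):

```python
from datetime import date

def first_claims(claims):
    """Keep only the earliest claim per account, so the training set never
    reflects behavior that follows an already-known fraud."""
    earliest = {}
    for c in claims:
        acct = c["account"]
        if acct not in earliest or c["claim_date"] < earliest[acct]["claim_date"]:
            earliest[acct] = c
    return list(earliest.values())

claims = [
    {"account": "A1", "claim_date": date(2015, 3, 1)},
    {"account": "A1", "claim_date": date(2015, 4, 15)},  # later claim, dropped
    {"account": "B2", "claim_date": date(2015, 2, 7)},
]
print(first_claims(claims))  # one claim per account
```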
Features are calculated variables that are predictive of fraud in a format readily consumed by machine learning algorithms
Think of variables that predict fraud. For example:
$$/day or #trans/day
Values for feature are calculated by comparing their value for today vs history
Features are engineered so that high values equate to more suspicious activity
This is why we are using hadoop. Processing vast amount of data in minutes.
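A feature like dollarRatio could be computed along these lines. This is an illustrative sketch, not the bank's actual feature code; the handling of empty or zero history is an assumption:

```python
def dollar_ratio(today_spend, history_daily_spends):
    """Ratio of today's dollar spend to the account's historical average
    daily spend. Engineered so that higher values are more suspicious."""
    if not history_daily_spends:
        return 0.0  # no history: treat as neutral rather than suspicious
    avg = sum(history_daily_spends) / len(history_daily_spends)
    if avg == 0:
        return 0.0
    return today_spend / avg

# An account that usually spends ~$20/day but spends $200 today scores 10x.
print(dollar_ratio(200.0, [18.0, 22.0, 20.0]))  # -> 10.0
```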
The true goal is to provide a robust estimate of expected model performance.
For the purposes of this exercise we are going to use a logistic regression model for several reasons, the most important being:
It is easy to implement in code
It offers insight into which features influenced the score
As part of the model building process we can also test its performance against known fraud in our testing set. The performance of the model can be visualized in an ROC chart.
The dotted red line represents the “brute force” amount of transactions we’d need to look at to find the corresponding amount of fraud… if we look at half the transactions, we’ll likely find half the fraud.
The goal of predictive analytics is to pull that curve as far up and to the left as we can. How can we look at fewer transactions and still find more fraud?
When the blue line (training) and the green line (testing) both move up and to the left, we’re onto a model that shows strong correlation between our features and the outcome we’re looking for.
Finally the graph on the right represents our false positive rate.
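The train-and-evaluate loop described above can be sketched with scikit-learn on synthetic stand-in data. All numbers here are made up for illustration; the real model was trained on the bank's Hadoop platform:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, auc

rng = np.random.default_rng(0)
# Synthetic stand-in: one feature (e.g. dollarRatio), higher for fraud accounts.
X = np.concatenate([rng.normal(1.0, 1.0, (1800, 1)),   # legitimate accounts
                    rng.normal(4.0, 1.0, (200, 1))])   # first-claim fraud accounts
y = np.concatenate([np.zeros(1800), np.ones(200)])

model = LogisticRegression()
model.fit(X, y)

scores = model.predict_proba(X)[:, 1]   # 0.00-1.00 anomaly score per account
fpr, tpr, _ = roc_curve(y, scores)
print(f"AUC: {auc(fpr, tpr):.2f}")      # random reference would be 0.5
print("intercept:", model.intercept_[0], "weight:", model.coef_[0][0])
```

Plotting `tpr` against the fraction of accounts reviewed gives the ROC curve from the slide; pulling it up and to the left means finding more fraud while reviewing fewer accounts.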
Once our model is built and tested, we are ready to score accounts.
In terms of the scoring process the following steps will be taken:
Pull a list of unique accounts that transacted on the day being scored
Pull all available transactions for that account, up to and including, the day being scored.
Generate features based on transaction values. Remember these features are generated for both “today” as well as everyday in the past.
Finally, generate a score based on the feature values and their weights.
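The final step is just the logistic function applied to the weighted feature sum. A sketch using the coefficients from the slide's table (intercept -3.44, dollarRatio weight 0.09); the example feature values are hypothetical:

```python
import math

def score(features, weights, intercept):
    """Logistic-regression score in [0, 1]: higher = more anomalous day."""
    z = intercept + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Coefficients from the fitted model's summary table.
intercept, weights = -3.44, [0.09]
print(round(score([1.0], weights, intercept), 4))   # typical day: low score
print(round(score([60.0], weights, intercept), 4))  # ~60x normal spend: high score
```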
Segue to validation
For this validation step we’re going to sort accounts by their score; 1 being the maximum score an account can obtain.
The feature values that influenced that score are listed across. The higher the feature value, the more influence it had on the score.
Let's take a look at the top scorer here.
It looks like Feature 6 is heavily influencing this score.
For this example, Feature 6 is a ratio that compares the $$ spent today vs the account's historical daily dollar spend.
So let's take a look at this account's transactions.
It looks like our top scorer had only one transaction, which was a significant airline purchase, which in all likelihood is legitimate.
This is an example of a false positive. Feature 6, while properly calculated, placed too much value on this single, large transaction, which overly influenced the score.
Moving down the list, we see that the 4th record down has a good distribution of feature values
Features 2, 3, 4 and 5 all show some significance.
Pulling this account's transactions for the day scored shows some highly suspicious activity. A large amount of online transactions in a very short period of time.
This is clearly an account worth investigating.
Based on this testing we then iterate on the model by:
Building new features
Removing features that don’t perform well
Address feature interactions. Do two features influence each other? For example, having a feature that evaluates $$ spend, and another that evaluates # of trans/day, could potentially cause an interaction. Perhaps a calculation that combines the two, like an average or median, would work better?
Constantly strive to minimize false positives.
Try other algorithms
At Eastern Bank we use an approach similar to the scientific method during this process.
We come up with a Hypothesis, test it, and document the outcomes.
Talk to examples higher volume, $$ etc..
The next step for this use case is to bring it out of a batch process and into a real-time framework, using Spark and MLlib
As a result of being faster with scoring (same day) we can look to expand our customer interaction with real time alerting through email or mobile
And from these interactions build a more agile feedback loop that will allow us to retrain our model quicker based on feedback from real fraud as well as false positives.
The closer to real time we get, the more valuable we'll be to customers.
So to wrap up,
Evolution to online and mobile
Customers want value added services that are relevant to them
Hadoop addresses a very large gap in our data architecture, by allowing us to ingest a variety of data that previously was not available to us, or could not be analyzed effectively.
Hadoop allows us to become smarter about our customers, which in turn lets us cater service and products in a more effective manner.
More benefits to the customer
Better understand them through transactions
However, all of this power and agility, needs to be governed to avoid risk.
And finally, the fraud detection use case is only the tip of the iceberg in terms of what hadoop can do. We now have a solid foundation to let us bridge into newer technologies such as streaming, NoSQL and others.