Data privacy is on everyone's mind right now. Regulations such as GDPR, as well as public sentiment, mean that governance and compliance are must-have capabilities for data lakes. Learn how to curate meaningful data from your data lake, accelerate governance and compliance, and enable your organization with searchable, trusted datasets.
Delivering Analytics at Scale with a Governed Data Lake
1. DELIVERING ANALYTICS AT SCALE WITH A GOVERNED DATA LAKE
Jean-Michel Franco
Sr Director for Data Governance Products
@jmichel_franco
2. AGENDA
01 A data governance model for the Data Lake
02 Building the platform for the governed data lake
03 Use Cases: Establishing GDPR/Data Privacy compliance in a customer 360° lake
3. DATA ECONOMICS ARE BROKEN
TRADITIONAL APPROACH CAN NO LONGER KEEP UP
[Chart, 2012 to 2021: a widening gap between data & business expectations and delivery capabilities]
• 3X growth rate in self-service users
• Data doubling every 2 years
Chart labels: cloud, machine learning, IoT, people
5. WHY MOST DATA LAKE APPROACHES WILL FAIL
New model -> Struggling To Control The Data Sprawl
Diagram labels:
• Sources (raw data): Open Data, Hadoop & NoSQL, Traditional Data Sources, Streams, Enterprise Apps, Cloud
• Pipeline (shown three times): INGEST, CURATE, MANAGE, CONSUME, with GOVERN
• Output: Smart Data
• Any data worker: Data Scientists, Business Analysts, Operations
• Concerns: Costs, Time to value, Scalability, Governance, Risks
7. COLLABORATIVE GOVERNANCE FROM THE GET GO
Scaling Trust And Reach Through Collaboration
Diagram labels:
• Sources (raw data): Open Data, Hadoop & NoSQL, Traditional Data Sources, Streams, Enterprise Apps, Cloud
• Content: IT's Sanctioned Content, Business Crowdsourced Content
• Output: Smart Data for any data worker
• Concerns: Costs, Time to value, Scalability, Governance
9. A 5-STEP APPROACH FOR MODERN DATA GOVERNANCE
• Establish Data Quality Upfront
• Unleash Data as a Service for People and Apps
• Capture & Document Any Data Sources
• Take Control & Protect
• Foster accountability
Diagram labels: Data Engineers, Business Users, Data Scientists, Customers, Applications, API
10. WHAT ARE THE RELATED 5 DATA MGMT DISCIPLINES?
• Know Your Data: Data Cataloging & Metadata Management
• Build the 360° view: Data Quality & Master Data Management
• Protect and govern your Data: Data Anonymization, Policy Enforcement, Data Lineage
• Foster Accountabilities: Data Stewardship
• Publish data in a controlled way: Data Cataloging and API
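As an illustration of the data anonymization discipline named on this slide, here is a minimal sketch that replaces personal fields with keyed-hash tokens before a record is shared. It is an assumption, not the method used in the talk: the function names `pseudonymize` and `mask_record`, the field names, and the key are all hypothetical. Strictly speaking, keyed hashing is pseudonymization rather than full anonymization, because whoever holds the key can re-link tokens to individuals.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Return a stable HMAC-SHA256 token for a personal-data value."""
    # Normalize so that "Jane@Example.com " and "jane@example.com" map to one token.
    normalized = value.strip().lower()
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def mask_record(record: dict, personal_fields: set, secret_key: bytes) -> dict:
    """Return a copy of the record with the listed personal fields tokenized."""
    masked = {}
    for field, value in record.items():
        if field in personal_fields and value is not None:
            masked[field] = pseudonymize(str(value), secret_key)
        else:
            masked[field] = value
    return masked


if __name__ == "__main__":
    key = b"demo-key-keep-the-real-one-in-a-secret-store"  # hypothetical key
    customer = {"email": "Jane@Example.com ", "country": "FR", "basket_total": 42.5}
    print(mask_record(customer, {"email"}, key))
```

Because the token is stable for a given key, datasets masked this way can still be joined on the tokenized field, which is what a customer 360° view needs.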
11. BOOSTING DATA USAGE TENFOLD AT ISO-BUDGET WITH A CLOUD DATA-LAKE
Goals & Benefits
• Expanding data usage and reaping the benefits of data monetization
• Turbo-charging analytics from batch to near real time
• Establishing end-to-end security and compliance (MiFID, GDPR…)
• Improving data accessibility -> from 45 days to 1 day for provisioning a data lab
Diagram labels: Data Lake; Catalog & Search; Access & UI; Processing and Analytics; Governance & Security; Data Ingestion & Integration
13. MOST COMPANIES FAIL, BADLY
Policies are defined…
• 98% HAVE UPDATED THEIR PRIVACY POLICIES FOR GDPR
But are not enforced… or poorly delivered
• 70% FAILED TO PROVIDE THE DATA REQUESTED!
• 21 days: AVG TIME IT TOOK COMPLIANT COMPANIES TO RESPOND
14. THE ROAD TO COMPLIANCE: WHY DO COMPANIES FAIL?
• No established accountability
• Unautomated process
• A legal process, rather than a customer service engagement
• People as human data integrators
• No control over personal data
• Reluctance to share the data
CAPTURE AND TRACK PERSONAL DATA
Data Management Discipline: Data Cataloging
The road to success (1/5)
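One way a catalog can capture where personal data lives is to sample each column and tag it when most values match a known pattern. The sketch below shows that idea under stated assumptions: the `PATTERNS` table, the `tag_columns` function, the 80% threshold, and the sample rows are all hypothetical, and the regular expressions are deliberately simple. A production catalog would combine such rules with richer semantic detection and human stewardship.

```python
import re

# Hypothetical, deliberately simple patterns; real detectors are more thorough.
PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "phone": re.compile(r"^\+?[0-9][0-9 .()-]{6,}$"),
}


def tag_columns(rows, threshold=0.8):
    """Return {column: tag} for columns whose sampled values mostly match a pattern."""
    tags = {}
    columns = list(rows[0].keys()) if rows else []
    for column in columns:
        # Ignore empty cells so sparse columns are judged on the values they do hold.
        values = [str(row.get(column)) for row in rows if row.get(column) not in (None, "")]
        if not values:
            continue
        for tag, pattern in PATTERNS.items():
            hits = sum(1 for value in values if pattern.match(value))
            if hits / len(values) >= threshold:
                tags[column] = tag
                break  # first matching tag wins
    return tags


if __name__ == "__main__":
    sample = [
        {"contact": "a@example.com", "tel": "+33 1 23 45 67 89", "city": "Paris"},
        {"contact": "b@example.org", "tel": "01 23 45 67 89", "city": "Lyon"},
    ]
    print(tag_columns(sample))
```

Tags produced this way can be stored as catalog metadata, so that later steps such as masking or answering a data subject request know which columns to touch.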