The document discusses how data virtualization can help organizations comply with the General Data Protection Regulation (GDPR). It provides an overview of GDPR requirements and outlines how data virtualization addresses three pillars of compliance: providing a complete view of data subjects, enabling self-service data catalogs, and designing for privacy and responsibility. Specifically, data virtualization can give a single, real-time view of customer data across systems, allow discovery and access to curated data, and ensure consistent security, governance and auditability of personal data.
GDPR Noncompliance: Avoid the Risk with Data Virtualization
1. Mark Pritchard, Principal Sales Engineer, Denodo
Lakshmi Randall, Director of Product Marketing, Denodo
Jan 2018
2. Agenda
1. GDPR Overview
2. Why Data Virtualization for GDPR Compliance?
3. Three Essential Pillars of GDPR Compliance
4. Q & A
4. 4
GDPR
• Accountability – GDPR requires you to show HOW you comply
with principles
• Personal data should be
• Processed in a lawful, fair and transparent way
• Collected for specific, explicit and legitimate purposes
• Adequate, relevant and limited to what is necessary for processing
• Accurate and where necessary kept up to date, rectified without
delay
• Kept in a form permitting identification of the subject for no longer than
absolutely necessary
• Processed in a manner ensuring appropriate security of data
(unlawful viewing, processing, loss, destruction or damage)
• Controllers
• Demonstrate compliance with the principles.
Principles
5. 5
GDPR
• GDPR
• Comes into effect 25th May 2018
• Affects how companies collect, use and transfer personal
data
• Locate Information
• Document personal data – where from (outside/inside org)
• Information audit
• Duplicated personal information
• Accurate Information
• Personal information must be accurate and able to be
corrected on request
• On-line access (360 degree view)
Background Context
6. 6
GDPR
• Need legal basis for processing personal data
• Need to explain the legitimate basis/interests for using the data not just
claim to.
• Need reason for collecting PI – employment, preventing bribes for
example
• Or full consents – how to prove?
• Prove consents
• Freely given, specific, informed, unambiguous, explicit
• Ensure children are protected – parental consent, e.g. UK < 13 years
• Detection and notification of a data breach
• Notify data protection authorities (e.g. ICO in the UK) when a breach happens
(within 72 hours, or face a fine of up to EUR 10M or 2% of worldwide turnover)
Background Context
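The 72-hour notification window on this slide is simple date arithmetic. A minimal Python sketch, assuming the clock starts when the organisation becomes aware of the breach; the function names `notification_deadline` and `is_overdue` are hypothetical and only illustrate the calculation:

```python
from datetime import datetime, timedelta, timezone

# Assumption: the 72-hour window is counted from the moment of awareness.
NOTIFICATION_WINDOW = timedelta(hours=72)

def notification_deadline(aware_at: datetime) -> datetime:
    """Return the latest time by which the authority should be notified."""
    return aware_at + NOTIFICATION_WINDOW

def is_overdue(aware_at: datetime, now: datetime) -> bool:
    """True if the notification window has already elapsed."""
    return now > notification_deadline(aware_at)

aware = datetime(2018, 5, 28, 9, 0, tzinfo=timezone.utc)
print(notification_deadline(aware).isoformat())  # 2018-05-31T09:00:00+00:00
```

Using timezone-aware timestamps avoids ambiguity when operations span several countries.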
7. 7
GDPR
• Automated/Bulk Processing Sensitive Data
• Require Data Protection Impact Assessment and Data Protection
by Design
• New systems need to be developed with privacy in mind – to
comply with privacy principles.
• Appoint a data protection officer
• Monitor data on a large scale – how?
• See the Global Picture
• Operations in other countries
• Which DP body do you have to comply with – several/HQ?
• Keep data map of all repositories
Background Context
8. 8
GDPR
The costs of non-compliance
Regulatory fines and response
Mandated security and audit requirements
• Resulting from a legal or regulatory
settlement
Brand recovery costs
• Rebuilding customer trust will carry its
own costs
Notification Costs
Lawsuits and settlements
10. 10
Five Essential Capabilities of Data Virtualization
1. Data abstraction
2. Zero replication, zero relocation
3. Real-time information
4. Self-service data services
5. Centralized metadata, security & governance
11. 11
1. Data Abstraction
Abstracts access to disparate data sources.
Acts as a single virtual repository.
Abstracts data complexities like location,
format, protocols
…hides data complexity for ease of data access by business
“Enterprise architects must revise their data architecture
to meet the demand for fast data.”
– Create a Road Map For A Real-time, Agile, Self-Service Data
Platform, Forrester Research
12. 12
2. Zero Replication, Zero Relocation
…reduces development time and overall TCO
“The Denodo Platform enables us to build and deliver data
services, to our internal and external consumers, within a
day instead of the 1 – 2 weeks it would take with ETL.”
– Manager, DrillingInfo
Leaves the data at its source; extracts only what is
needed, on demand.
Diminishes the need for effort-intensive ETL
processes.
Eliminates unnecessary data redundancy.
13. 13
3. Real-time Information
Provisions data in real-time to consumers
Creates real-time logical views of data across many
data sources.
Supports transformations and quality functions
without the latency, redundancy, and rigidity of legacy
approaches
…enables timely decision-making
“Data virtualization integrates disparate data sources in real time or
near-real time to meet demands for analytics and transactional data.”
– Create a Road Map For A Real-time, Agile, Self-Service Data Platform, Forrester
Research, Dec 16, 2015
14. 14
4. Self-Service Data Services
Facilitates access to all data, both internal and external
Enables creation of universal semantic models reflecting
business taxonomy
Connects data silos to provide best available information to
drive business decisions
…enables information discovery and self-service
“Impressively quick turnaround time to "unlock" data from
additional siloes and from legacy systems - Few vendors (if any) can
compete with Denodo's support of the RESTful/OData standard -
both to provide data (northbound) and to access data from the
sources (southbound).”
– Business Analyst, Swiss Re
15. 15
5. Centralised Metadata, Security & Governance
Abstracts data source security models and enables single-point
security and governance.
Extends single-point control across cloud and on-premises
architectures
Provides multiple forms of metadata (technical, business,
operational) to facilitate understanding of data.
…simplifies data security, privacy, audit
“Our Denodo rollout was one of the easiest and most successful rollouts of critical
enterprise software I have seen. It was successful in handling our initial, security,
use case immediately, and has since shown a strong ability to cover additional
use cases, in particular acting as a Data Abstraction Layer via its web service
functionality.”
– Enterprise Architect, Asurion
16. 16
Three Pillars of GDPR Compliance
• Complete View of Data Subjects
• Self-service Data Catalog
• Privacy by Design, Responsibility & Accountability
19. 19
1. Analytical Focus
[Diagram: DATA VIRTUALIZATION layer over MDM (Master Data) and Data Warehouse (Transactional Data)]
• DV combines master data
from MDM and transactional
data (facts) in DW to provide
a complete and contextual
view of the enterprise data
• Used in compliance, financial
reporting use cases
20. 20
2. Operational Focus
[Diagram: DATA VIRTUALIZATION layer over MDM (Master Data) and transactional systems (Transactional Data)]
• DV combines master data
from MDM and transactional
data directly from the
transactional systems to provide
a complete and contextual
view of the enterprise data
• Used in operational
applications like call center
apps
21. 21
3. Virtual MDM
[Diagram: DATA VIRTUALIZATION layer over source systems, each holding Master Data and Transactional Data]
• DV uses “registry-style MDM”
to match/merge the data
• Used where storing data is
prohibited – healthcare,
public sector
• Mostly used to support
operational applications (not
much for reporting)
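Registry-style matching can be pictured as keeping only a cross-reference of source keys and assembling the merged record at query time, so no customer data is stored centrally. The sketch below is illustrative only: the source systems, field names and the `REGISTRY` mapping are hypothetical, and it does not represent how any particular product implements match/merge.

```python
# Hypothetical source systems; in practice these would be live queries.
CRM = {"c-17": {"name": "Ana Ruiz", "email": "ana@example.com"}}
BILLING = {"b-903": {"name": "A. Ruiz", "balance": 120.50}}
SOURCES = {"crm": CRM, "billing": BILLING}

# The registry stores only keys: global id -> list of (system, local key).
REGISTRY = {"G1": [("crm", "c-17"), ("billing", "b-903")]}

def golden_record(global_id: str) -> dict:
    """Merge source records on demand; earlier sources win on conflicts."""
    merged: dict = {}
    for system, key in REGISTRY[global_id]:
        for field, value in SOURCES[system][key].items():
            # setdefault keeps the first value seen for each field.
            merged.setdefault(field, value)
    return merged

print(golden_record("G1"))
```

The "earlier sources win" survivorship rule is an assumption chosen for brevity; real deployments usually define survivorship per attribute.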
22. 22
Data Virtualization and Master Data
Benefits
• A complete view of the entity
• Single view of the customer, a 360° view of customer relationships, and a complete
view of customer interactions.
• The ability to combine master data with any other data throughout the
enterprise
• Data virtualization can connect to MDM and other data sources.
• Real-time data access to the complete customer view
• For any individual or organization across the enterprise.
• Reduced replication and its associated costs and risks.
• Data virtualization provides access to the data without replicating it.
• A short implementation timeframe
• A robust data virtualization layer can be developed and deployed in a matter of weeks.
25. 25
Most Self-Service Initiatives Fail
Why Self-Service Needs Data Virtualization
More than 70% of self-service initiatives ranked as “average” or
lower
Problems: “More complicated than expected”, “spawns more
requests to IT than before”
Solution: expose curated information in business-friendly form
But creating physical, curated repositories is slow, expensive and
hard to maintain
Find more details at:
“How Data Virtualization Helps Build Self-Reliance for Information
Self-Service”
http://news.sys-con.com/node/3969453
26. 26
Self-Service Architecture with Denodo
BA 1 BA 2 BA 3
Data Access Views [Data Engineers]
Canonical Views
[Data Engineers and Business Dev]
Business Views [Business Analysts/Dev]
Self-Service Catalog
Enterprise Apps
[App Developers][Data Analysts and Data Explorers]
[BI Developers]
27. 27
The Information Self-Service Catalog in the Reference Architecture
The Role of the Information Self-Service Catalog
Catalog of available business / canonical views
For: data analysts, business explorers, app developers
Search / browse data and metadata of existing views
See relationships between views and data lineage
Consume and customize existing views for particular needs
For: data analysts and business explorers
Saved queries for personal use (can be shared)
Export for continuing analysis in other tools (self-service, data prep)
Share with other users
Propose new standard business / canonical views
Preview datasets to business data consumers
For: Data engineers, app developers
28. 28
Catalogs and the Data Delivery Infrastructure
Need for Collaboration
Catalog / Discovery Features need to be tightly
linked to the Data Delivery Infrastructure
• Guarantee information about datasets is up to
date
• Provide Access to both actual data and
metadata:
• Discovery may require exploring the
actual data, not only metadata
• Discovery and final data preparation are
tightly interrelated activities
The Data Delivery Layer contextualizes usage of
datasets
• Who uses a Dataset, When and How
• Who created it, who maintains it and how
often
• What datasets are frequently used together
• Allows estimating metrics such as relevance or
timeliness
38. 38
The Business Need
Ready Access to Critical Information to Support Business Processes
Marketing, Sales, Executive, Support
Customers
Invoices Products
Service
Usage
Access to complete information: business
entities and pre-integrated views
Access to related information: discovery
and self service
Access in real-time from different apps and
devices
39. 39
Governing Personal Data
The Challenge
Marketing, Sales, Executive, Support
Is the data being processed in
a lawful, fair and transparent
way?
Is the data being collected for
a specific, explicit and
legitimate purpose?
Is the data adequate and
limited to what is necessary
for processing?
Is the data you are viewing
accurate, up-to-date?
Is the data kept in a form
where subject is identifiable
no longer than is necessary?
Is the data processed in a
manner that ensures
appropriate security of data?
Data sources: Database, Apps, Warehouse, Cloud, Big Data, Documents, Apps, NoSQL
Multiple copies of the data?
Lineage of the data?
Consistent security of the data?
Data on premise and off?
Data access audit? Who is
replicating the data?
Discovery of what data is actually
published to consumers?
Access to most up to date data? Is data anonymised for
40. 40
Denodo Platform Architecture
Facilitating GDPR Compliance
Multiple copies of the data
reduced through virtualization
approach.
Lineage of the data.
Understanding from which
systems the data is published.
Consistent security of the data,
applied in a single point of
access.
Data on premise and off,
combined through the same
governed virtual layer.
Data access audit and
monitoring. Logging who is
replicating and accessing the
data.
Self-service discovery, enabling
location of what data is actually
published to consumers.
Access to most up to date data
through right time access to
data sources.
Data masking on the fly.
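Masking on the fly means the source rows stay untouched and the masking is applied as data passes through the virtual layer. A minimal Python sketch of the idea, assuming a hypothetical `email` column and a hypothetical privileged role named `dpo`; it is not the Denodo implementation:

```python
def mask_email(email: str) -> str:
    """Keep the first character and the domain, hide the rest."""
    local, _, domain = email.partition("@")
    if not domain:
        # Not a well-formed address: hide everything.
        return "*" * len(email)
    return local[:1] + "*" * (len(local) - 1) + "@" + domain

def masked_view(rows, user_roles):
    """Yield copies of rows, masking email unless the user is privileged."""
    privileged = "dpo" in user_roles  # hypothetical role name
    for row in rows:
        out = dict(row)  # copy, so the source row is never modified
        if not privileged:
            out["email"] = mask_email(out["email"])
        yield out

rows = [{"id": 1, "email": "ana@example.com"}]
print(list(masked_view(rows, {"analyst"})))
```

Because the view yields copies, the same source can serve masked and unmasked consumers without a second physical copy of the data.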
41. 41
With Denodo Platform and Data Virtualization, You Can
• Adopt a cost-benefit-based approach to protecting and securing customer
data and privacy
• Instill data privacy and security easily into new initiatives requiring
information access
• Leverage data privacy and security to drive superior customer experience
• Meet regional data privacy and security requirements
• Prevent any non-compliance costs
42. 42
Security in Denodo
Overview
Authentication
• Pass-through authentication
• Kerberos and Windows SSO
• OAuth, SPNEGO
Authentication
• Standard JDBC/ODBC security
• Kerberos and Windows SSO
• Web Service security
LDAP
Active Directory
Role based Authentication &
Authorization
Guest, employee, corporate
Schema-wide Permissions
Data Specific Permissions
(Row, Column level, Masking)
Policy Based Security
Data in motion
• SSL/TLS
Encrypted data
at rest
• Cache
• Swap
47. 47
Custom
Policy
Conditions satisfied
Security: applies custom security policies
• If person accessing data has role of
'Supervisor' and location is 'New York', then
show compensation information for
employees in the New York office only.
Enforcement: rejects/filters queries by specified
criteria like user priority, cost, time of day etc.
• If the production batch window runs from 3
am - 6 am, there is increased load on
production servers at this time. So, all
queries on these servers can be blocked
during this time to prevent failure of a
process.
Data consuming users, Apps
Query
Accept / add filters
Reject
Security in Denodo
Custom Policies: Interception of queries before they are executed
Policy Server
(e.g. Axiomatics)
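The two custom-policy examples on this slide (the 'Supervisor' rule and the batch-window block) can be sketched as a function that intercepts a query and either rejects it, adds filters, or passes it through. Everything here is hypothetical: the `Query` class, the `compensation` view name and the `office` filter are illustrative, not a Denodo or Axiomatics API. The sketch also assumes that non-supervisors are rejected outright and that a supervisor is limited to their own office, which generalises the New York example.

```python
from dataclasses import dataclass
from datetime import time

@dataclass
class Query:
    user_role: str
    user_location: str
    view: str
    filters: dict

BATCH_START, BATCH_END = time(3, 0), time(6, 0)  # 3 am - 6 am batch window

def apply_policy(query: Query, now: time):
    """Return the (possibly rewritten) query, or None if it is rejected."""
    # Enforcement: block all queries during the production batch window.
    if BATCH_START <= now < BATCH_END:
        return None
    # Security: compensation data is visible only to supervisors,
    # and only for employees in the supervisor's own office.
    if query.view == "compensation":
        if query.user_role != "Supervisor":
            return None
        filters = {**query.filters, "office": query.user_location}
        return Query(query.user_role, query.user_location, query.view, filters)
    # Accept unchanged.
    return query

q = Query("Supervisor", "New York", "compensation", {})
print(apply_policy(q, time(10, 0)))
```

Returning a new `Query` rather than mutating the input keeps the original request available for auditing.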
48. 48
Security in Denodo
• Audit trail of all the queries and other actions executed on the system
Complete Auditability
• With this information it is possible to check
at any time who has accessed which
resources, what changes have been made,
what queries have been executed, and when
each happened
• The information is stored centrally and
Denodo supports SNMP, JMX and WS-
Management standards
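A central audit trail of this kind answers "who accessed which resource, and when" with a simple lookup. The following Python sketch uses an in-memory list as a stand-in for the central store; the entry fields and the function names `record` and `who_accessed` are assumptions for illustration, not the product's audit schema:

```python
from datetime import datetime, timezone

AUDIT_LOG: list[dict] = []  # stand-in for a central audit store

def record(user: str, action: str, resource: str, when: datetime) -> None:
    """Append one audit entry."""
    AUDIT_LOG.append({"user": user, "action": action,
                      "resource": resource, "when": when})

def who_accessed(resource: str) -> list[tuple[str, datetime]]:
    """List (user, timestamp) pairs for a resource, oldest first."""
    hits = [(e["user"], e["when"]) for e in AUDIT_LOG
            if e["resource"] == resource]
    return sorted(hits, key=lambda pair: pair[1])
```

Calling `record(...)` for each executed query and `who_accessed("customer_view")` afterwards returns the access history of that view in time order.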
49. 49
Monitoring Activity at the Delivery Layer
Who Uses What, When and How
Who uses each dataset, when, and
how often
What datasets are used together
Usage reports for multiple criteria
Different UIs for system
administrators (Diagnostic and
Monitoring) and Analysts
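The "who uses what" and "what datasets are used together" reports can be derived from access events. A minimal Python sketch, assuming each event is a hypothetical `(user, session_id, dataset)` tuple and that "used together" means "accessed in the same session":

```python
from collections import Counter
from itertools import combinations

def usage_report(events):
    """events: iterable of (user, session_id, dataset) tuples."""
    per_dataset = Counter()   # (dataset, user) -> number of accesses
    by_session = {}           # session_id -> set of datasets touched
    for user, session, dataset in events:
        per_dataset[(dataset, user)] += 1
        by_session.setdefault(session, set()).add(dataset)
    together = Counter()      # (dataset_a, dataset_b) -> shared sessions
    for datasets in by_session.values():
        for pair in combinations(sorted(datasets), 2):
            together[pair] += 1
    return per_dataset, together

events = [
    ("ana", "s1", "customers"), ("ana", "s1", "invoices"),
    ("bob", "s2", "customers"), ("bob", "s2", "invoices"),
    ("bob", "s2", "customers"),
]
per_dataset, together = usage_report(events)
print(per_dataset.most_common(1), together.most_common(1))
```

Sorting each session's datasets before pairing ensures that a pair is always counted under one key regardless of access order.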
51. 51
Resources
INFOGRAPHIC
Data Virtualization for GDPR
SOLUTION BRIEF
The 6 Main GRC-Related Challenges and How
Data Virtualization Addresses Them
SOLUTION BRIEF
Seamlessly Comply with the GDPR
SOLUTION BRIEF
Facilitating the Digital Transformation in Banking
EBOOK
Data Virtualization for Logical Data Warehouse