If your organization is in a highly regulated industry – or relies on data for competitive advantage – data governance is undoubtedly a top priority. Whether you’re focused on “defensive” data governance (supporting regulatory compliance and risk management) or “offensive” data governance (extracting the maximum value from your data assets, and minimizing the cost of bad data), data quality plays a critical role in ensuring success.
View this webinar on-demand to learn how enterprise data quality drives stronger data governance, including:
• The overlaps between data governance and data quality
• The “data” dependencies of data governance – and how data quality addresses them
• Key considerations for deploying data quality for data governance
2. Agenda
Introduction
Why Data Quality & Data Governance are top of mind
Data Quality & Data Governance: a symbiotic relationship
How Data Quality strengthens Enterprise Data Governance
Summary
Syncsort Confidential and Proprietary - do not copy or distribute
3. Speakers
Harald Smith
Director of Product Management, Trillium Software
20 years in Information Management incl. data quality, integration, and governance
Co-author of Patterns of Information Management
Author of two Redbooks on Information Governance and Data Integration
Davinity Powis
Pre-Sales Consultant for Syncsort
Founded UK-based data-marketing agency until its acquisition in 2012
Specialises in Data Quality, Data Governance, Data Integration and Big Data.
Particular interest in data quality and enrichment
Passionate about making data understandable and exciting!
4. Data: the fuel of the future
Data is to this century, what oil was to the last one: a driver of growth and change.
Flows of data have created new infrastructures, new businesses, new monopolies, new politics and crucially new economics.
Digital information is unlike any previous resource: it is extracted, refined, valued, bought and sold in different ways.
It changes the rules for markets and it demands new approaches from regulators.
Many a battle will be fought over who should own, and benefit from, data.
The Economist: Fuel of the future - Data is giving rise to a new economy: 6th May 2017
5. Data Governance & Quality are top of mind
Many sources are predicting exponential data growth toward 2020 and beyond. In almost a repeat of Moore’s Law, they are all in broad agreement that the size of the digital universe will double every two years at least.
Human-generated data is experiencing an overall 10x faster growth rate than traditional business data, and machine data is increasing even more rapidly at 50x the growth rate!
Acceleration due to: IoT, AI, ML, Big Data, Blockchain
Volume and complexity of data is growing
New tools allowing more granular data dissection
Broader and deeper compliance & regulation
Expectations: trust & confidence
6. A CDO’s nightmare!
Can I even trust this data?
Is duplication causing ‘permission clash’?
Where is all my data?
How many places store the same data?
Are we compliant with all necessary regulations? Can we prove it?
Do we know what & how much customer data we even hold?
Do we have the right internal training & policies to manage this much data?
Is my customer data safe & secure?
Could we survive the bad publicity & financial impact of a GDPR fine?
8. Data impacts all areas of the business
sales marketing finance legal IT logistics management
Analysis
Sales reports
Dashboards
Performance metrics
Territory management
Segmentation
SCV / 360
Understanding & CRM
Content
Campaign management
ROI
UX
All reports!
Aggregations
Forecasting & modelling
Cash flow
Contingency planning
Data compliance
Data regulation
Governance
Risk
Access
Security
Disaster recovery
Scheduling
Workloads
Performance planning
Route planning
Capacity management
Environmental
Competitor analysis
HR / recruitment
Overall business strategy!
9. Terminology
Data Governance is the set of policies, processes, rules, roles and responsibilities that help organisations manage data as a corporate asset.
It ensures the availability, usability, integrity, accuracy, compliance and security of data.
Data Quality refers to ensuring that data is “fit for use” in its intended operational, decision-making and other roles.
It covers the accuracy, completeness, consistency, relevance, timeliness and validity of data.
Data Quality
ACCURACY
COMPLETENESS
CONSISTENCY
RELEVANCE
TIMELINESS
VALIDITY
Data Governance
PEOPLE
PROCESSES
POLICIES
RULES
STANDARDS
DOCUMENTATION
SECURITY
Data Availability
Data Compliance
Defining Key Data Elements
Assigning Data Stewards & Council
Glossaries & Dictionaries
Data Consistency & Standardisation
Monitoring
Analytics
Policies & Rules
Metrics
Data Lineage
Reporting
In practice
Areas of common interest
Cleansing
Enrichment
Parsing
Discovery & Profiling
Matching, Suppression & Deduplication
10. Symbiosis
“a relationship between two entities
for mutual benefit, often without
competing with each other”
Data Quality & Data Governance
share a ‘symbiotic relationship’
11. Symbiotic relationship between DQ & DG
Relevant Rules & Policies
DQ needs appropriate DG tools to ensure the data is cleaned and maintained within an appropriate data framework which is relevant and pertinent to the business needs
High Quality Data
DG needs appropriate DQ tools not only to clean the raw data, but also to illustrate data errors, peculiarities and issues, in order to help compile the best standards and monitor the data quality over time
12. We all use information, intelligence & insight
But they are only useful if they are accurate!
Essex
Kent
Surrey
Shrops
surrey
London
Cornwall
Merseyside
Surry
W. Sussex
PRE-DQ POST-DQ
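The PRE-DQ / POST-DQ contrast on this slide can be illustrated with a small sketch. It assumes a hand-built lookup of known variants (`COUNTY_VARIANTS` is hypothetical, as is the reading of "Shrops" as Shropshire); a real DQ tool would rely on reference data rather than a short dictionary.

```python
from collections import Counter

# Hypothetical lookup of known variants -> standard county name (assumption).
COUNTY_VARIANTS = {
    "surrey": "Surrey",
    "surry": "Surrey",
    "shrops": "Shropshire",
    "w. sussex": "West Sussex",
}

def standardise_county(raw: str) -> str:
    """Trim, lower-case and collapse spaces, then map known variants."""
    key = " ".join(raw.strip().lower().split())
    return COUNTY_VARIANTS.get(key, raw.strip().title())

# County values as they appear on the slide.
raw_values = ["Essex", "Kent", "Surrey", "Shrops", "surrey",
              "London", "Cornwall", "Merseyside", "Surry", "W. Sussex"]

pre_dq = Counter(raw_values)                                   # counts on raw values
post_dq = Counter(standardise_county(v) for v in raw_values)   # counts after standardisation

print("PRE-DQ :", dict(pre_dq))
print("POST-DQ:", dict(post_dq))
```

Before standardisation, "Surrey", "surrey" and "Surry" are counted as three separate counties; afterwards they aggregate into one, which is what makes a chart or report built on the column trustworthy.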
13. But they are only useful if they are accurate!
Essex
Kent
Surrey
Shrops
surrey
London
Cornwall
Merseyside
Surry
W. Sussex
PRE-DQ POST-DQ
What you don’t know CAN hurt you!
Other changes to data quality quickly undermine trust
Signal loss
Noise
Differing aggregations
Invalid correlations
Unexpressed assumptions
Incorrect defaults
Lack of context
Missing inputs
14. More than simply ‘understanding’ your data!
What you don’t know CAN hurt you!
Essex
Kent
Surrey
Shrops
surrey
London
Cornwall
Merseyside
Surry
W. Sussex
PRE-DQ POST-DQ
Other changes to data quality quickly undermine trust
Signal loss
Noise
Differing aggregations
Invalid correlations
Unexpressed assumptions
Incorrect defaults
Lack of context
Missing inputs
Necessary to actively Record, Monitor & Measure
Enumerate: Establish the criteria defining goals, relevance, and fitness for purpose
Acquire: Capture the metadata for data sources being considered and used
Discover: Profile the data sources which are required for the desired analysis
Validate: Evaluate the data sources for the identified and required qualities
Document: Document and store the findings about data sources and processes
Catalog: Provide and communicate findings about data sources and processes for others to utilize
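The Discover and Validate steps above can be sketched as a simple column profile. This is only an illustration under stated assumptions: the function names, the metrics chosen (completeness, distinct count, character patterns) and the sample postcodes are not from the presentation, and a profiling tool would measure far more.

```python
import re
from collections import Counter

def pattern_of(value: str) -> str:
    """Reduce a value to a character-class pattern: A = letter, 9 = digit."""
    return re.sub(r"[0-9]", "9", re.sub(r"[A-Za-z]", "A", value))

def profile_column(values):
    """Return simple profile metrics for one column of string values."""
    total = len(values)
    populated = [v.strip() for v in values if v is not None and v.strip()]
    return {
        "total": total,
        "completeness": len(populated) / total if total else 0.0,
        "distinct": len(set(populated)),
        "patterns": Counter(pattern_of(v) for v in populated),
    }

# Illustrative postcode column with blanks and inconsistent formatting.
postcodes = ["S66 7EN", "S667EN", "", "LN21 1RW", None, "S66 7EN"]
profile = profile_column(postcodes)

print(f"completeness: {profile['completeness']:.0%}")
for pattern, count in profile["patterns"].most_common():
    print(pattern, count)
```

The pattern counts are the useful part for validation: a rare pattern such as `A999AA` points at values that break the expected format and may need a rule.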
15. The role of DQ in DG
It is challenging for organisations to respond to regulatory mandates in a timely manner.
Data typically comes from multiple disparate systems & sources.
The number of touchpoints has grown dramatically.
There is a higher demand and expectation for real-time data.
Regardless of the compliance mandate, the simple fact is that every one of them requires accurate source data.
‘Rubbish in, rubbish out’ is more pertinent than ever before!
16. What are the regulations there for?
Regulations are there to protect and regulate:
privacy
disclosure
risk management
fraud prevention
anti-money laundering
anti-terrorism
anti-usury lending, and the promotion of lending to lower-income populations.
17. Types of regulations
Risk & Compliance
GDPR
CCPA
FSCS
FATCA
Customer Data Management & KYC
Regulatory Reporting & Data Assurance
Operational Governance
BCBS 239
Data Stewardship
AnaCredit
HIPAA
Basel II/III
CCAR / Stress Testing
DQ Assurance
AML
19. GDPR is essentially about knowing:
What personal & sensitive data you hold – and is it up-to-date?
What you are doing with it & how you are processing it?
That you have permission to use it
Where it is stored? Is it duplicated?
Who has access to it? How are you keeping it SAFE?
20. Customers are now recognising their new power
What do you know about me?
Right to access data plus receive a copy of data
Data about me is wrong - fix it!
Right to have inaccurate data corrected
Erase all my data for good!
Right to be forgotten
Has my data been breached?
Right to be informed of a breach without undue delay (the regulator must be notified within 72 hours)
How do you use my data?
Right to limit processing of personal data and object to how it is processed
Demand human interaction
Right to not participate in fully-automated decisions based on customer profile
21. Suddenly it’s serious!
Source: Oliver Wyman, Global Management Consultancy (May 2017)
Google hit with £44m GDPR fine over ads
22. Multiple touchpoints/databases - which is ‘right’?
ID Title Forename Surname Full Name Address 1 City Postcode email Phone Permission
SMI20033 XXX XXX Dr B. Smith 3 Davy Dr Maltby S66 7EN bob.smith@hotmail.com XXX XXX
Bob Smith bob.smith@hotmail.com
2000138604 Dr Bob Smith xxx xxx bob.smith@hotmail.com
134567542 Bob Smith 3 Davey Drive Rotherham S667EN 01189407600
SMI16975 Dr B. Smith 3 Davy Dryve MALtby S66 7EN bob.smith@hotmail.com 07123 5579421
Dr Bob Smith 3 Davy Drive Rotherham S66 7EN bob.smith@hotmail.com 01189407600 07123 5579421 xxxxxxxxxxxxx
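Linking the records on this slide into one cluster can be sketched as below. This is an illustration under loud assumptions: records are linked when they share a normalised email or postcode, which is far cruder than the name and address matching a DQ tool would apply (a shared postcode alone would also link neighbours). The `OTHER01` record and all function names are hypothetical.

```python
from collections import defaultdict

def norm_postcode(postcode):
    """Upper-case and remove spaces so 'S66 7EN' and 'S667EN' compare equal."""
    return "".join(postcode.upper().split()) if postcode else None

def norm_email(email):
    return email.strip().lower() if email else None

def link_records(records):
    """Group record IDs that share a normalised email or postcode (union-find)."""
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    first_seen = {}
    for r in records:
        keys = [("email", norm_email(r.get("email"))),
                ("postcode", norm_postcode(r.get("postcode")))]
        for key in keys:
            if key[1] is None:
                continue
            if key in first_seen:
                parent[find(r["id"])] = find(first_seen[key])
            else:
                first_seen[key] = r["id"]

    clusters = defaultdict(list)
    for r in records:
        clusters[find(r["id"])].append(r["id"])
    return sorted(sorted(ids) for ids in clusters.values())

records = [
    {"id": "SMI20033", "postcode": "S66 7EN", "email": "bob.smith@hotmail.com"},
    {"id": "2000138604", "email": "bob.smith@hotmail.com"},
    {"id": "134567542", "postcode": "S667EN"},
    {"id": "SMI16975", "postcode": "S66 7EN", "email": "bob.smith@hotmail.com"},
    {"id": "OTHER01", "postcode": "LN21 1RW", "email": "someone.else@example.com"},  # hypothetical
]
clusters = link_records(records)
print(clusters)
```

Once the four records sit in one cluster, a single permission, correction or deletion request can be applied to all of them rather than to whichever copy was found first.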
23. Single View enables accuracy and excellence in…
Analytics: Analysis of clean data will be accurate
Segmentation & Targeting: Marketers will place consumers into the correct segments. Campaigns are more relevant
Reporting & Visualisation: Reports will be reliable. Dashboards show correct findings - giving a true representation.
Customer Experience: Customers will receive consistent messaging and communications.
Customer Understanding: Accurate understanding leads to appropriate communications and dialogue.
Strategy: All these lead to accurate, sensible business decisions.
24. Regulation demands evidence & documentation
Article 5: Provide evidence that your company’s personal data processing adheres to GDPR principles:
Processed lawfully, transparently
Collected for specific purposes
Limited to data relevant for specific purposes
Kept accurate and current
Processed securely and protected
Article 30: Provide documentation on your company’s Records of Processing Activities
Article 32: Provide documentation on your company’s Security of Processing
Article 35: Provide documentation on your company’s Data Protection Impact Assessment
GDPR is about more than just data quality though
25. Data Quality tools are
no longer a “nice to have”
26. GDPR – where DQ helps deliver compliance
1. Data Discovery
Highlights bad data, typos, mis-fielded data, outlying data not conforming to policy, formatting, structure, syntax etc.
Exposes text fields with buried, unexpected personal & sensitive data
Build technical business rules to mirror DG rules and identify and monitor ongoing data issues
2. Data Quality Processing
Real-time & batch data cleansing & matching across multiple data sources generating SCV; enabling businesses to locate all of a customer’s records quickly from a single record
SCV also means customer permissions are respected, records can be amended or suppressed / deleted, plus businesses can respond to subject access requests (SARs) quickly
Full traceability of original data source
Documented DQ routines for transparency & auditing (e.g. user & process control, security)
3. Data Integration
Integration with Data Governance tools. Triggers issue management and controls.
Integration with analytical & dashboarding tools so that GDPR rules and reports (and overall compliance) can be easily understood and monitored.
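The Data Discovery point about text fields with buried personal data can be illustrated with a minimal sketch. The three regular expressions are deliberately simple assumptions (they are not from the presentation and would both miss and over-match real data); the note text reuses sample values from the earlier single-view slide.

```python
import re

# Deliberately simple, illustrative patterns (assumptions, not production rules).
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "uk_phone": re.compile(r"\b0\d{4}\s?\d{6}\b"),
    "uk_postcode": re.compile(r"\b[A-Z]{1,2}\d[A-Z\d]?\s?\d[A-Z]{2}\b"),
}

def find_buried_pii(text):
    """Return {label: [matches]} for each pattern found in a free-text field."""
    hits = {}
    for label, pattern in PATTERNS.items():
        found = pattern.findall(text)
        if found:
            hits[label] = found
    return hits

note = ("Customer called from 01189407600 and asked us to email "
        "bob.smith@hotmail.com; delivery to S66 7EN.")
hits = find_buried_pii(note)
print(hits)
```

Running a scan like this across notes and comments fields gives a first list of columns that hold personal data nobody expected, which can then be turned into monitored business rules.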
27. GDPR Summary
GDPR mandates tight control of customer data!
Without DQ, duplication and poor data will propagate, resulting in misunderstanding and failing to respect the customers’ wishes and demands. Over time, this will inevitably escalate to non-compliance with GDPR!
DQ helps ensure DG compliance
29. FATCA
FATCA stands for the Foreign Account Tax Compliance Act.
It is a 2010 US federal law enforcing the requirement for US citizens (including those living outside the US) to report their non-US financial accounts each year to the US Internal Revenue Service (IRS), alongside the separate FBAR filing made to the Financial Crimes Enforcement Network (FinCEN).
Introduced April 2015, it requires all non-US financial institutions to search their records for customers with indicia (flags) of ‘US citizen’ status, such as a US place of birth, and to flag & identify such records for further inspection.
30. FATCA – where DQ helps deliver compliance
DQ processing is typically used as a precursor to a bank’s internal FATCA process.
It uses all key steps such as parsing, standardisation, cleansing, matching, commonisation and merging to deliver Single Customer View (SCV).
SCV ensures all duplicate records are linked, often highlighting conflicting information and indicia, such as:
Country of Origin of address (US vs. Non-US)
US Birthplace
US Telephone numbers
De-minimis (aggregated account balances with currency conversion)
PO Box/Care of addresses
US Social Security Numbers
US Citizenship
Once data is remediated and harmonised, the right decisions can be made, ensuring the organisation is FATCA compliant.
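The de-minimis indicium listed on this slide depends on aggregating balances across every account linked to one customer, after converting to a single currency. A minimal sketch follows; the FX rates, the threshold value and the account records are all hypothetical placeholders, not figures from the presentation or the regulation.

```python
from decimal import Decimal

# Hypothetical FX rates to USD and a hypothetical threshold (assumptions only).
FX_TO_USD = {"USD": Decimal("1.00"), "GBP": Decimal("1.30"), "EUR": Decimal("1.10")}
DE_MINIMIS_USD = Decimal("50000")

def aggregated_balance_usd(accounts):
    """Sum the balances of all accounts in one linked cluster, converted to USD."""
    return sum(
        (Decimal(a["balance"]) * FX_TO_USD[a["currency"]] for a in accounts),
        Decimal("0"),
    )

def exceeds_de_minimis(accounts):
    return aggregated_balance_usd(accounts) > DE_MINIMIS_USD

# Two accounts that the Single Customer View has linked to the same person.
cluster_accounts = [
    {"record_id": "A", "balance": "30000", "currency": "GBP"},
    {"record_id": "B", "balance": "15000", "currency": "EUR"},
]

print(aggregated_balance_usd(cluster_accounts))        # combined balance in USD
print(exceeds_de_minimis(cluster_accounts))            # combined view
print(exceeds_de_minimis(cluster_accounts[:1]))        # single record in isolation
```

With these placeholder figures neither account crosses the threshold alone, but the linked pair does, which is why the duplicates need to be matched before the test is applied.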
31. DQ: highlights address indicia errors
Identifies the real country of origin - irrespective of data captured.
Non-US country codes which would otherwise have incorrectly excluded records from FATCA processing
Erroneous US country codes which would have incorrectly included records in FATCA processing, unnecessarily wasting time and resource.
32. DQ: highlights Nationality indicia conflicts
Identifies where duplicate records contain conflicting Nationality indicia.
Some records in a cluster have been implicated for FATCA and others have not, leading to fuzzy decisions.
DQ harmonises the cluster so that each record has the same indicia.
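Harmonising a cluster, as described on this slide, can be sketched as follows. The sketch assumes one particular rule: if any record in the cluster carries an indicium, every record receives it. The field names and records are hypothetical, and a real process may resolve conflicts differently (for example by manual review).

```python
def harmonise_cluster(records, indicia_fields=("us_birthplace", "us_citizenship", "us_phone")):
    """Return (harmonised records, list of fields on which the records disagreed)."""
    # Assumed rule: an indicium present on any record applies to the whole cluster.
    combined = {f: any(r.get(f, False) for r in records) for f in indicia_fields}
    conflicts = [
        f for f in indicia_fields
        if len({bool(r.get(f, False)) for r in records}) > 1
    ]
    harmonised = [{**r, **combined} for r in records]
    return harmonised, conflicts

# Two duplicate records for the same customer with conflicting indicia (hypothetical).
cluster = [
    {"id": "R1", "us_birthplace": True, "us_citizenship": False},
    {"id": "R2", "us_birthplace": False, "us_citizenship": False},
]
harmonised, conflicts = harmonise_cluster(cluster)

print("conflicts:", conflicts)
for record in harmonised:
    print(record)
```

Reporting the conflicting fields alongside the harmonised records keeps the decision auditable rather than silently overwriting one source with another.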
33. DQ: results
No Data Quality = inaccurate decisions
Implicated: Records which clearly contain implicated indicia
Not Implicated: Records which do not contain implicated indicia
Suspect: Records which may contain implicated indicia.
= sensible decisions
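The three result categories on this slide can be sketched as a simple classifier. The split into "strong" and "weak" indicia below is purely an assumption made for illustration; the presentation does not say which indicia lead to which category, and the real rules come from the institution's FATCA procedures.

```python
# Hypothetical grouping of indicia (assumption, not from the presentation).
STRONG_INDICIA = {"us_citizenship", "us_birthplace", "us_ssn"}
WEAK_INDICIA = {"us_phone", "us_address", "po_box_or_care_of"}

def classify(indicia):
    """Classify a harmonised record from the set of indicia found on it."""
    if indicia & STRONG_INDICIA:
        return "Implicated"
    if indicia & WEAK_INDICIA:
        return "Suspect"
    return "Not Implicated"

examples = {
    "R1": {"us_birthplace", "us_phone"},
    "R2": {"po_box_or_care_of"},
    "R3": set(),
}
results = {record_id: classify(found) for record_id, found in examples.items()}
print(results)
```

The Suspect bucket is the one that benefits most from clean, harmonised data, since it is where records are routed for further inspection.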
34. FATCA Summary
Not performing DQ processing before FATCA procedures could easily lead to implicated records being missed from selection, and thus to failing FATCA regulation!
DQ helps ensure DG compliance
36. AML
Money laundering refers to the exchange of money or assets that were obtained criminally for other money or assets that appear ‘clean’. It also includes money that is used to fund terrorism, however it’s obtained.
Under rules introduced in May 2018, FS organisations must put in place controls to prevent their business from being used for money laundering:
checking the identity of your customers
checking the identity of ‘beneficial owners’ of corporate bodies and partnerships
monitoring your customers’ business activities and reporting anything suspicious to the National Crime Agency (NCA)
making sure you have the necessary management control systems in place
keeping all documents that relate to financial transactions, the identity of your customers, risk assessment and management procedures and processes
37. AML – where DQ helps deliver compliance
DQ processing is typically used as a prerequisite to a bank’s internal AML process.
It uses key steps such as parsing, standardisation and cleansing to ensure the bank’s own data is of the highest standard possible.
It also allows the organisation to link all monetary activities to specific individuals, giving the firm the best chance of identifying and combatting potential money-laundering and other financial crimes, and of taking appropriate action.
38. DQ: enabling accurate matching & suppression
PRE-DQ POST-DQ
Once standardised and cleansed, the bank’s data then has the optimum chance of
matching data on sanctions lists of known money launderers, criminals or terrorists.
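Why standardisation improves the chance of a sanctions-list match can be shown with a small sketch. It uses Python's standard `difflib.SequenceMatcher` as a stand-in for a proper matching engine; the watchlist entries, the title list and the 0.85 threshold are all hypothetical.

```python
import re
from difflib import SequenceMatcher

TITLES = {"MR", "MRS", "MS", "DR"}  # illustrative list of titles to ignore

def normalise_name(name):
    """Upper-case, strip punctuation and titles, and sort the remaining tokens."""
    tokens = re.sub(r"[^A-Za-z ]", " ", name).upper().split()
    return " ".join(sorted(t for t in tokens if t not in TITLES))

def screen(name, watchlist, threshold=0.85):
    """Return watchlist entries whose normalised form is similar to the name."""
    target = normalise_name(name)
    hits = []
    for entry in watchlist:
        score = SequenceMatcher(None, target, normalise_name(entry)).ratio()
        if score >= threshold:
            hits.append((entry, round(score, 2)))
    return hits

# Hypothetical watchlist entries, written in a different style from the bank's data.
watchlist = ["WONG, Bob", "Jane Example"]

print(screen("MR BOB WONG", watchlist))
print(SequenceMatcher(None, "MR BOB WONG", "WONG, Bob").ratio())  # raw comparison, for contrast
```

Comparing the raw strings scores far lower than comparing the normalised forms, which is the practical meaning of giving the data "the optimum chance of matching".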
39. DQ ensures SWIFT message quality
When banks transfer money and data, SWIFT messages are the format or schema used by financial institutions to exchange data.
SWIFT messages are complex data structures consisting of five blocks of data: three headers, the message content and a trailer.
Data Quality is paramount for operational, reporting, governance, and AML requirements.
40. DQ: highlights & remediates data in-flight
Cleanses, corrects, validates and enriches customer information on SWIFT message to enable accurate AML checks
50K|/809615 01178139~MR BOB WONG~53 NEEDLESS RD~LINCOLN LINCOLNSHIRE~LN21 |52A|BEASHKHHXXX|59|/1995 8242 207458~WONG MEI LING AND WONG BOB|57A| 5 | CANADA SQU LONDON|SENDER|LOYDGB2XXX| RECEIVER|BKCHHKHH
Clean / Correct / Validation
Title
Forename
Recoded Forename
Surname
HouseNo
StreetName
StreetType
City
County
Postcode
Country
<OrderingCustomer>
…
<Name>MR BOB WONG</Name>
<Address>
<Line1>53 NEEDLESS RD</Line1>
<Line2>LINCOLN LINCOLNSHIRE</Line2>
<Line3>LN21 </Line3>
</Address>
…
</OrderingCustomer>
<BeneficiaryInstitution>
…
<BIC></BIC>
<Address>
<Line1> </Line1>
<Line2>5 CANADA SQU </Line2>
<Line3>LONDON</Line3>
</Address>
<Account/>
…
</BeneficiaryInstitution>
Parse
MR BOB WONG
53 NEEDLESS RD
LINCOLN LINCOLNSHIRE
LN21
ROBERT
ROAD
LN21 1RW
GBR
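The parse and cleanse steps on this slide can be sketched for the ordering-customer field. Note the assumptions: the `~` delimiter is taken from the slide's flattened sample rather than from the SWIFT standard, the lookup tables are hypothetical, and enrichment such as completing the postcode or deriving the country would need reference data that is not modelled here.

```python
# Hypothetical lookup tables (assumptions for illustration only).
FORENAME_RECODES = {"BOB": "ROBERT"}
STREET_TYPES = {"RD": "ROAD", "ST": "STREET", "AVE": "AVENUE"}
TITLES = {"MR", "MRS", "MS", "DR"}

def parse_ordering_customer(field_value):
    """Split the '~'-delimited sample field into account, name and address lines."""
    account, name, *address = [part.strip() for part in field_value.split("~")]
    return {"account": account, "name": name, "address": address}

def standardise_name(name):
    """Split a name into title, forename and surname, and recode the forename."""
    tokens = name.upper().split()
    title = tokens.pop(0) if tokens and tokens[0] in TITLES else ""
    forename = tokens[0] if len(tokens) > 1 else ""
    surname = tokens[-1] if tokens else ""
    return {
        "title": title,
        "forename": forename,
        "recoded_forename": FORENAME_RECODES.get(forename, forename),
        "surname": surname,
    }

def standardise_street(line):
    """Expand a trailing street-type abbreviation, e.g. RD -> ROAD."""
    tokens = line.upper().split()
    if tokens and tokens[-1] in STREET_TYPES:
        tokens[-1] = STREET_TYPES[tokens[-1]]
    return " ".join(tokens)

raw = "/809615 01178139~MR BOB WONG~53 NEEDLESS RD~LINCOLN LINCOLNSHIRE~LN21 "
parsed = parse_ordering_customer(raw)
name = standardise_name(parsed["name"])
street = standardise_street(parsed["address"][0])

print(parsed)
print(name)
print(street)
```

The recoded forename and expanded street type correspond to the ROBERT and ROAD values shown on the slide.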
41. DQ: highlights & remediates data in-flight
Match / Link / Deduplication
Cleanses, corrects, validates and enriches Beneficiary Institution by matching BIC codes on SWIFT message to enable accurate AML checks
Bank of America NA
BOFAGB22SCP
E14 5AQ
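Before a BIC can be matched against reference data, a basic shape check catches malformed values. The sketch below assumes the widely used 8- or 11-character layout (four-letter institution code, two-letter country code, two-character location code, optional three-character branch code). It checks structure only and does not confirm that a code exists in any directory.

```python
import re

# Assumed layout: 4 letters + 2 letters + 2 alphanumerics + optional 3 alphanumerics.
BIC_PATTERN = re.compile(r"^[A-Z]{4}[A-Z]{2}[A-Z0-9]{2}([A-Z0-9]{3})?$")

def is_well_formed_bic(bic):
    """Return True if the value has the shape of an 8- or 11-character BIC."""
    return bool(BIC_PATTERN.match(bic.strip().upper()))

# Values as they appear in the sample message and on this slide.
samples = ["BOFAGB22SCP", "BKCHHKHH", "BEASHKHHXXX", "LOYDGB2XXX"]
checks = {bic: is_well_formed_bic(bic) for bic in samples}
print(checks)
```

In the sample message the sender value `LOYDGB2XXX` has ten characters, so a check like this would flag it for remediation before any lookup is attempted.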
42. AML Summary
Without DQ processing, an organisation is far more likely to process illegal transactions unknowingly, and/or trade with known criminals.
It would then fail AML regulation!
DQ helps ensure DG compliance
44. 1. Start small: challenges & best practices
Information overload
Multiple versions of the truth
Data challenges
Lack of agility
Identify Business Objectives
• Increase revenue
• Minimize risk
• Decrease costs
Secure Executive Sponsorship
• Identify pain
• Understand policies
• Determine metrics
Initiate Small Projects
• Align to objectives
• Adopt what you need
• Adapt how you see fit
• Gain quick wins
Evaluate Progress
• Understand successes/failures
• Shift as needed
• Establish a ‘way of thinking’
45. 2. Collaborate: challenges & best practices
Lack of Common Terminology
Organizational Barriers & Silos
Isolated or Unknown Work
Lack of Engagement
Establish a Common Language
• Define terminology – a ‘stake in the ground’
• Map information
• Support with policies/standards
Gain Broader Buy-In
• Bring stakeholders together
• Build the structure, culture, ownership, steering groups, stewardship over time
Enrich Information
• Discover what you don’t know
• Resolve differences
• Enhance/annotate to increase insight
Share Insights Regularly
• Produce and share tangible outcomes
• Highlight ‘wins’
• Demonstrate efficiencies & savings
46. 3. Quantify: challenges & best practices
Hidden Activities
Money, Time and Resource Waste
Lack of Transparency and Trust
Disconnect Between Process and Measures
Identify Baseline Measures
• Keep a focus on lean and agile
• Define value accurately for the business
Link to Business Performance
• Create and refine streams of value
• Transform culture through action and empowerment
Monitor, Report and Remediate Issues
• Continuously review
• Ensure issues are visible and understood
• Understand root causes
• Address/resolve issues
Quantify Impact of Changes
• Demonstrate through clearly understood measures
• Establish value continuously
• Finish early, finish often
48. Summary
The accuracy of data directly affects any activity downstream – from analytics, reporting & dashboards, segmentation & targeting and customer care through to risk & compliance… in fact ANY business decision!
DQ not only strengthens DG compliance; it also means you make SENSIBLE BUSINESS DECISIONS