Data Quality - Are We There Yet?
1. Data Quality – “Are We There Yet?”
August 17, 2011
Presented by Arvind Mattoo, CBIP
2. Data Quality
• Data Quality – Explained
• Data Quality – CEO’s Concern
• Data Quality – CIO’s Nightmare
• Data Quality – PM’s Approach
• Data Quality – IT’s Deliverable
3. Data Quality – Dimensions
Data Quality FACT
Process Dimension
• Accessible
• Consistent
• Complete
• Lineage
• Controllable
• Secure
Business Dimension
• Relevant
• Existent
• Reliable
• Reportable
• Compliant
• Measurable
Technical Dimension
• Accurate
• Integral
• Unique
• Valid
• Secure
Time Dimension
• Currency
• Timeliness
• Historical
4. Dimension – Business
Relevant: Does it Map to our Requirements?
Existent: Do we Own it?
Reliable: Can we Trust it?
Reportable: Can we Visualize it?
Compliant: Is it Mandated?
Measurable: Can we Baseline it?
5. Dimension – Process
Accessible: Can I Get it?
Consistent: Can I Standardize it?
Complete: Does it Encompass Usability?
Lineage: Can we Trace it?
Controllable: Can we Discipline it?
Secure: Can we Trust it?
6. Dimension – Technical
Accurate: To what Degree does it Jibe?
Integral: Does it Comply Structurally?
Unique: To what extent is it De-Duped?
Valid: Does it Conform to the Rules?
Secure: To what Level is it Secured?
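The Unique and Valid questions above lend themselves to simple ratios. The following is a minimal sketch, not a method from the presentation: the sample records, the field names, the country reference list, and the deliberately simple email pattern are all assumptions made for illustration.

```python
import re

# Hypothetical sample records; field names are illustrative only.
records = [
    {"id": 1, "email": "ann@example.com", "country": "NL"},
    {"id": 2, "email": "bob@example", "country": "US"},
    {"id": 2, "email": "bob@example.com", "country": "US"},
    {"id": 3, "email": "cy@example.com", "country": "XX"},
]

VALID_COUNTRIES = {"NL", "US", "DE"}  # assumed reference list
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # deliberately simple rule


def uniqueness(rows, key):
    """Share of rows whose key value is distinct (1.0 means fully de-duplicated)."""
    if not rows:
        return 1.0
    return len({row[key] for row in rows}) / len(rows)


def validity(rows, rule):
    """Share of rows that pass a rule, where rule is a function returning a bool."""
    if not rows:
        return 1.0
    return sum(1 for row in rows if rule(row)) / len(rows)


unique_ids = uniqueness(records, "id")
valid_emails = validity(records, lambda r: bool(EMAIL_PATTERN.match(r["email"])))
valid_countries = validity(records, lambda r: r["country"] in VALID_COUNTRIES)
```

Expressing each dimension as a share between 0 and 1 keeps the measures comparable, which makes it easier to baseline them and set targets later.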
7. Dimension – Time
Currency: To what Degree is it Current?
Timeliness: How Readily is it Available?
Historical: How far back can we Audit?
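The Currency and Historical questions above can be approximated from record timestamps. This is a rough sketch under stated assumptions: the 30-day freshness window, the sample timestamps, and the function names `currency_share` and `audit_depth` are hypothetical, and using the oldest timestamp as the audit horizon is only a proxy.

```python
from datetime import datetime, timedelta


def currency_share(last_updated, as_of, max_age):
    """Share of records updated within max_age of the as_of time."""
    if not last_updated:
        return 1.0
    fresh = sum(1 for ts in last_updated if as_of - ts <= max_age)
    return fresh / len(last_updated)


def audit_depth(last_updated, as_of):
    """Time span between as_of and the oldest record (a proxy for audit reach)."""
    return as_of - min(last_updated)


# Hypothetical last-updated timestamps for four records.
as_of = datetime(2011, 8, 17)
timestamps = [
    datetime(2011, 8, 16),
    datetime(2011, 7, 1),
    datetime(2010, 8, 17),
    datetime(2011, 8, 10),
]

share = currency_share(timestamps, as_of, timedelta(days=30))
depth = audit_depth(timestamps, as_of)
```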
8. Data Quality – CEO’s Concern
• Lack of Strategic Information Capabilities
• Quality of Decision Making
• Lack of Visibility
• Loss of Opportunities
• Increasing IT Expenditures
• Diminishing Rate of Return
• Lack of Collaboration
9. Data Quality – CIO’s Nightmare
• How did we get into this mess?
• How does it impact our business?
• Are we the only ones?
• How do we get out of this?
• How do we sustain it?
• Are we there yet?
10. Data Quality – As We Speak!
• Data Misused: Not Authorized
• Data Abused: Not Qualified
• Data Confused: Not Clarified
• Data Refused: Not Ratified
• Data Diffused: Not Archived
11. How did we get into this mess?
Business
• Mergers
• Acquisitions
• Expansions
• Diversification
• Regulatory
• Lack of Ownership
• Business Process Changes
• Lack of Executive Awareness
• Lack of Training
Technical
• Conversion
• Manual Data Feeds
• Lack of Automation
• System Upgrades
• Consolidation
• Insufficient DQ Rules
• System Errors
• Source System Changes
• Lack of Expertise
12. How does it impact our business?
CEO
• Reputation at Stake
• Lower Quality of Service
• Customer Dissatisfaction
• Loss of Motivation
• Compliance Issues
• Expectations not met
CIO
• Time to Reconcile Data
• Delay in New System Deployment
• Poor System Performance
• Loss of Credibility
• Downstream System Data Issues
• No Single Version of Truth
Surging Cost
16. How do we get out of this?
• Data Quality – PM’s Approach
• Data Quality – IT’s Deliverables
17. Data Quality – PM’s Approach
Methodology
• Assess/Profile Data
• Define Baseline
• Define Metrics and Targets
• Define and Build Data Quality Rules
• Enforce Data Standards across the Board
• Monitor Data Quality against Targets
• Review Exceptions and Gaps
• Catalogue Errors
• Refine Data Quality Rules
• Manage Data Quality against Targets
• Automate Data Quality Process
• Fine-Tune Data Quality Rules
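The first half of the methodology above (profile, baseline, targets, rules, monitor, review gaps) can be sketched as a small rule loop. This is an illustration only, not the presenter's tooling: the rule names, the target values, the record fields, and the helper names `profile` and `review` are all hypothetical.

```python
# Hypothetical data quality rules: each maps a rule name to a per-record check.
RULES = {
    "customer_id present": lambda r: r.get("customer_id") is not None,
    "amount non-negative": lambda r: isinstance(r.get("amount"), (int, float))
    and r["amount"] >= 0,
}

# Hypothetical targets: the minimum acceptable pass rate for each rule.
TARGETS = {"customer_id present": 1.0, "amount non-negative": 0.95}


def profile(rows):
    """Return the pass rate of every rule over the given rows."""
    total = len(rows)
    return {
        name: (sum(1 for r in rows if check(r)) / total if total else 1.0)
        for name, check in RULES.items()
    }


def review(rates, targets):
    """List, in alphabetical order, the rules whose pass rate misses its target."""
    return sorted(name for name, rate in rates.items() if rate < targets[name])


# Hypothetical batch of records to profile.
batch = [
    {"customer_id": 1, "amount": 10.0},
    {"customer_id": None, "amount": 5.0},
    {"customer_id": 3, "amount": -2.0},
    {"customer_id": 4, "amount": 7.5},
]

baseline = profile(batch)        # assess/profile the data and record a baseline
gaps = review(baseline, TARGETS) # monitor against targets and surface the gaps
```

Running `profile` on each new batch and comparing against the stored baseline is one way to monitor over time; refining or fine-tuning then amounts to editing `RULES` and `TARGETS`.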
18. Data Quality – PM’s Approach
Governance Team
• Governance Committee
• Data Stewards
• Business SME
• Business Analysts
• Technology SME
• Process SME
19. Data Quality – PM’s Approach
Technology
• Data Profiler
• CRM
• Data Warehouse
• Master Data Management
• ETL/ELT
• CASE
• Custom Data Integration
• Master Data Integration
21. How do we Sustain over time?
• Follow Data Quality Framework
• Profile Data consistently
• Update Rule-Based Engine Frequently
• Exploit Embedded DQ Functions/Solutions
• Adopt Proactive Approach
• Establish Stewardship
• Practice DQ Governance
22. Data Quality – Are We There Yet?
• Accessible
• Relevant
• Reliable
• Reportable
• Compliant
• Accurate
• Consistent
• Complete
• Secured
• Integral
23. Data Quality – Are We There Yet?
Not really!
Data Quality is an iterative process…