Data Governance
in a Big Data Era
Pieter De Leenheer, PhD
Columbia University in the City of New York
August 7, 2017
Administration
• Slide Deck
• < slideshare link>
• Software
• Data Governance Center:
• inno.collibra.com
• User: johnfisher / Pwd: 08072017
• On the go:
• Collibra on the App Store
• Collibra University
• Sign up for free: http://university.collibra.com
• Collibra Blog, e.g.:
• https://www.collibra.com/blog/unleash-the-data-democracy-5-misconceptions-of-data-governance/
Overview
Intro – A Data Governance Odyssey
• Library of Babel
• About the Company
Part 1 – The Chief Data Officer Rises
• Digital Darwinism
• The Big Data Bang
• Data Brawls and FUD
• The Chief Data Officer Role Types
Part 2 – Data Universe Expands
• Data Value Hierarchies, Networks and Hybrids
• Shift in Data Governance Approaches
• Systems of Record vs. Systems of Engagement
• Challenges:
• Big Data Analytics
• Digitalization of Trust
• Weapons of Math Destruction
Part 3 – A Lens on the Data Universe
• A system of record for data
• Use Cases
Introduction
2008: A Data Governance Odyssey
Library of
Babel
JL Borges
6 | ©Collibra 2016
Leading Data Governance Software company
HQ in NYC
Founded in BE,
Spin Off VUB
Engineering in EU
Across verticals & geographies
With Blue-chip Customers
Analysts’ Positioning
A Fool with a Tool: Collibra Data Education
university.collibra.com
Part 1
The Chief Data Officer Rises
What do the companies in these groups have in
common?
• Group A: American Motors, Brown Shoe, Studebaker,
Collins Radio, Detroit Steel, Zenith Electronics, and
National Sugar Refining.
• Group B: Boeing, Campbell Soup, General Motors,
Kellogg, Procter and Gamble, Deere, IBM and Whirlpool.
• Group C: Facebook, eBay, Home Depot, Microsoft, Office
Depot and Target.
Conclusion
• only 12.2% of the Fortune 500 companies in 1955 were
still on the list 59 years later in 2014
• life expectancy of a firm in the Fortune 500
• 50 years ago : ~ 75 years
• Today: < 15 years and declining
MIT Technology Review, Sept., 2013
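As a rough back-of-the-envelope check of the figures above, the sketch below turns the 12.2% survival share into a firm count and an implied average annual attrition rate. The constant-rate model is an assumption made here for illustration only; it does not come from the cited source.

```python
# Figures quoted on the slide: Fortune 500 list, 1955 vs. 2014.
LIST_SIZE = 500
SURVIVAL_SHARE = 0.122      # share of 1955 firms still listed in 2014
YEARS = 2014 - 1955         # 59 years

survivors = round(LIST_SIZE * SURVIVAL_SHARE)   # firms still on the list
dropped = LIST_SIZE - survivors                 # firms that left the list

# Assumption (illustrative only): firms leave the list at a constant
# annual rate r, so SURVIVAL_SHARE = (1 - r) ** YEARS.
annual_attrition = 1 - SURVIVAL_SHARE ** (1 / YEARS)

print(f"{survivors} survivors, {dropped} dropped")
print(f"implied annual attrition: {annual_attrition:.1%}")
```

Under that simplifying assumption, roughly one in thirty listed firms would leave the list each year.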
What happened between 1955 and today
that caused this ‘creative destruction’?
• Name some compelling events in information
technology history
• Order them chronologically
• Try to explain the phenomenon in terms of the events
• E.g.,
• Invention of the transistor
• First modern computer
• Publication of the Internet protocol
• Launch of the World Wide Web
• Wikipedia
• Internet startups: FB, Google, etc.
• Data big bang
Data Big Bang
• Phenomenon: connectivity between
• Social
• Knowledge
• Technology
• Draws curiosity
• Web Science (Pentland, etc.)
• Big-data-native market entrants (23andMe, Uber, Inventure)
• Enter bottom-up at the low end and disrupt
• Pure data strategy
• Serving “data-citizen” Millennials
• +80% unstructured data or ‘dark energy’
“Data-driven” is the Holy Grail of business
But Becoming the Fittest is not Easy
74% of firms say they want to be data-driven
In reality, only 29% say they are good at connecting analytics to action
Here’s what typically happens
• Hours spent preparing for the meeting
• Collect data from finance, IT and the
data lake
• Do your analysis, prepare insights and
recommendations
• You’re ready to go
Same company, same data, different results
Meetings dissolve into data brawls
You need: trustworthy data, common understanding, complete
traceability, transparent data ownership
Solving the problem with duct tape and chewing gum
No unified system.
• Who can help me with this?
• I don’t trust these numbers
• Am I allowed to use this?
• Where do I find the data?
• I don’t understand this report
Data Chaos Leads to Data FUD*
* fear, uncertainty and doubt
• TREND: Exploding volume, velocity, and veracity of data
• NEED: Manage data complexity
• TREND: Increased reliance on analytics and regulatory reporting
• NEED: Trusted data as a business dependency
• NEED: Data Collaboration (Understanding, Discovery, Trust) between Data Infrastructure (IT) and Data Consumers (Business)
You Need the Right Level of Control and Trust in Data
Data governance & stewardship provide the right level of control and trust in data
Data Consumers (Business)
• LEADERSHIP: CEO, CFO, VP, Marketing
• ROLES: Data Scientist, Business Analyst
• TECHNOLOGY: Visualization, Self-service BI
Data Infrastructure (IT)
• LEADERSHIP: CIO
• ROLES: Information Manager, Data Architect, Data Modeler
• TECHNOLOGY: Hadoop, Databases, Data Integration
Data Collaboration
• LEADERSHIP: Chief Data Officer
• ROLES: Data Governance Manager, Data Steward
• TECHNOLOGY: Data Stewardship Platform
The Rise of the Chief Data Officer (CDO)
Gartner on Chief Data Officers:
1,000 Chief Data Officers or Chief Analytics
Officers Forecast in Large Organizations by
the End of 2015, Up From 400 in 2014
90% of Large Organizations Will Have a Chief
Data Officer by 2019
Role Types for the Chief Data Officer (CDO)
(Lee et al., 2014)
• Dimensions of CDO Roles
• Collaboration: inwards / outwards
• Data Space: traditional data / big data
• Value Impact: service / strategy
• Reporting:
• 30% to CEO
• 20% to COO
• 10% to CFO
CDO Assessment
Join our MIT Sloan CDO Research
https://university.collibra.com/cdo-survey/
Shift in Data Governance Approaches
• Digital forces pose enormous risks as well as opportunities for organizations; a balance is needed between:
• Hierarchical data governance (system of record)
• CDO as a Coordinator: Inward-oriented / Traditional Data / Service
• Defensive: Risk-driven
• Scarcity: Few consumers, few producers
• Compromises rooted in obsolete cost assumptions about digital power
• Use of digital technology optimizes only to some extent
• Does not scale to big data used by larger ‘data scientist’ populations
• Networked data governance (systems of engagement)
• CDO as an Experimenter: Outward / Big Data / Strategy
• Offensive: Value-driven
• Abundance
• Many Producers (Data Democratization)
• Eliminate Breadlines
• Consumerization of BI and cheap digital power
• Many serve many
• Supports customer
• Many Consumers (Data Amazonification)
• Access, SLA, Trust, Secure Cloud, etc.
System of Record vs. System of Engagement (AIIM 2017)
System of Record
• Purpose: control and regulate
• Top-down design around discrete pieces of information
(“records”)
• Decomposition in ‘black boxes’
• Presumes ‘big picture’
• Examples: SFDC, Workday, ServiceNow, Atlassian
System of Engagement
• Purpose: innovate
• Bottom-up, decentralized, incorporate technologies which
encourage peer interactions, leveraged by cloud technologies
• Seed Model
• Emergence of complex system
• Examples: Slack, Confluence
Big Data Analytics
Challenges
• Where everybody has data scientists, predicting the next transaction is no longer a competitive advantage
• Move from ‘predict next transaction’ to lifelong relationship building and value creation
• Reduce search and navigation for the customer with better apps
• Use crowdsourcing to cross-compare with and learn from other customers (Opower, INRIX, Zillow)
• Earn customer trust through branded, non-intrusive apps: personal health monitoring, Nest
• Retention analysis example
Digitalization of Trust
Challenges
• In Hierarchical Data Governance, trust is established by
• a centrally sanctioned competence center
• or externally appointed trustees with formal roles: stewards, owners, architects
• In Networked Data Governance, trust is more complicated:
• Authenticity: is the data fact or opinion?
• Intention: was this data shared with good intentions? Can I use it without peril? Are there hidden privacy concerns I should be aware of?
• Expertise or quality: are the people involved skilled or certified stewards?
• Does it accurately represent our business reality, i.e. customer base?
• Is it complete and up to date?
• Has it been certified through a standard process?
Danger of the old paradigm models
• Weapons of Math Destruction (WMDs) are models that threaten to destabilize
• Equality
• Democracy
• Traits of WMDs
• Opaque
• Unregulated
• Incontestable
• …hence: ungoverned
Preliminary Conclusions
• Digital forces have digitally empowered individuals in the organization
• Hybrid data governance approach should combine
• Top-down governance of critical data assets to enhance internal coordination
• Networked peer-driven empowerment to drive ‘serendipity’
• On a shared platform
• Key challenges are:
• Digitalization of trust with focus on social capital
• Big data analytics that drives life-time value for customer
• Data Valuation based on Usage
• Legacy of opaque, unregulated and incontestable models
• Recognize CDO Leadership and Role transition
Part 3
A System of Record for Data Assets
The System of Record for Data Assets
the authoritative source of information for any given data asset used by (hence valuable for) the organization
Know where the data comes from
Know what the data means
Know that the data is right
Find Understand Trust
The System of Record for Data Assets
All activities and information surrounding the data, its meaning and its use.
the authoritative source of information for any given data asset used by (hence valuable for) the organization
Demo
Part 3b
Data Governance Business Cases
4 Industries
• Technology – Big Data Valuation
• Health Care – Reference Data
• Manufacturing – IoT and GDPR
• Banking – Compliance
• Can you identify hierarchical vs. networked mechanisms in these business cases?
Big Data in Tech Industry - Life Cycle
Not all data is of equal value
• At Dell, lean data governance catalogs an inventory of all types of data assets while implementing a minimum set of business-specific metadata attributes
• Data is governed based on level of consumption: the value of the data and how much it is shared
• Data is categorized as enterprise-supported (“operationalized”) or innovation discovery
(courtesy Barbara Tulippe)
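To make the idea of governing by level of consumption concrete, here is a minimal sketch that tiers data assets by how many distinct teams use them. The access log, the team names and the sharing threshold are all hypothetical; the slide does not describe how Dell actually measures consumption.

```python
from collections import Counter

# Hypothetical access log of (data asset, consuming team) pairs.
access_log = [
    ("orders", "finance"), ("orders", "marketing"), ("orders", "supply_chain"),
    ("web_clicks", "data_science"), ("web_clicks", "data_science"),
    ("customers", "finance"), ("customers", "marketing"),
]

SHARING_THRESHOLD = 2  # assumed cut-off: shared by at least this many teams


def classify(log, threshold=SHARING_THRESHOLD):
    """Tier each asset by the number of distinct teams consuming it."""
    # set(log) removes repeated (asset, team) pairs, so each team counts once.
    teams_per_asset = Counter(asset for asset, _team in set(log))
    return {
        asset: "operationalized" if count >= threshold else "innovation discovery"
        for asset, count in teams_per_asset.items()
    }


tiers = classify(access_log)
for asset in sorted(tiers):
    print(f"{asset}: {tiers[asset]}")
```

Widely shared assets land in the enterprise-supported tier and would receive the fuller set of metadata attributes; assets used by a single team stay in the lighter-weight discovery tier.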
Data Valuation
Reference Data in Health Care
• Independence is the leading health insurer in southeastern Pennsylvania.
• Serves close to seven million people nationwide, including 2.1 million in the region.
• 42,000 physicians
• 160 hospitals
https://prezi.com/ve1ws8jmpqcn/workflow/
IoT + GDPR in Manufacturing
• Internet of Things and GDPR
• From responsive to competitive advantage
• Steps
• Identify processes ‘touching’ EU citizen data: employees / customers /…
• Identify critical data elements: name, SSN, address
• Lineage / Traceability
https://hbr.org/2014/11/how-smart-connected-products-are-transforming-competition
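The three steps above (find the processes that touch personal data, identify the critical data elements, trace lineage) can be sketched as a graph traversal. The system names, the lineage edges and the data element inventory below are hypothetical, and a real lineage model would be far richer than a plain adjacency list.

```python
from collections import deque

# Hypothetical lineage: system -> systems that receive its data.
lineage = {
    "hr_system": ["data_lake"],
    "crm": ["data_lake", "billing"],
    "sensor_feed": ["iot_platform"],
    "data_lake": ["bi_reports"],
    "billing": [],
    "iot_platform": ["bi_reports"],
    "bi_reports": [],
}

# Hypothetical inventory of critical data elements held at each source.
personal_data = {
    "hr_system": {"name", "SSN", "address"},
    "crm": {"name", "address"},
}


def downstream(graph, start):
    """Breadth-first search: every system reachable from `start`."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def systems_in_scope(graph, sources):
    """Sources holding personal data plus everything downstream of them."""
    scope = set(sources)
    for src in sources:
        scope |= downstream(graph, src)
    return scope


in_scope = systems_in_scope(lineage, personal_data)
print(sorted(in_scope))
```

In this toy example the sensor feed and the IoT platform fall outside the scope because no personal data element flows into them, while the reports inherit the obligation through the data lake.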
Compliance in Banking - Scorecard
• How many Critical Data Elements (CDEs) have a dedicated
stewardship resource assigned?
• Are those Business Stewards actively participating in stewardship
activities?
• Are CDEs progressing through the expected life cycle?
• Are relations to physical data assets and source systems defined?
• Is data profiling occurring based on defined Data Quality Rules?
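The scorecard questions above can each be read as a share of Critical Data Elements that meet a condition. The sketch below assumes a hypothetical CDE record; the field names and status values are illustrative and do not reflect a Collibra schema. Life-cycle progress is simplified to "has reached an Approved status".

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CriticalDataElement:
    """Hypothetical CDE record; not an actual Collibra data model."""
    name: str
    steward: str | None = None        # assigned stewardship resource, if any
    steward_active: bool = False      # steward took part in recent activities
    status: str = "Candidate"         # assumed life-cycle status label
    source_systems: list[str] = field(default_factory=list)
    profiled: bool = False            # profiled against defined quality rules


def scorecard(cdes: list[CriticalDataElement]) -> dict[str, float]:
    """Return each scorecard question as a share (0.0 to 1.0) of all CDEs."""
    total = len(cdes)
    if total == 0:
        return {}

    def share(predicate) -> float:
        return sum(1 for c in cdes if predicate(c)) / total

    return {
        "steward_assigned": share(lambda c: c.steward is not None),
        "steward_active": share(lambda c: c.steward is not None and c.steward_active),
        "approved": share(lambda c: c.status == "Approved"),
        "lineage_defined": share(lambda c: bool(c.source_systems)),
        "profiled": share(lambda c: c.profiled),
    }


cdes = [
    CriticalDataElement("Counterparty ID", "alice", True, "Approved", ["CRM"], True),
    CriticalDataElement("Exposure Amount", "bob", False, "Under Review", ["Risk DB"], False),
    CriticalDataElement("Country Code"),
    CriticalDataElement("Legal Entity Name", "carol", True, "Approved", [], True),
]

result = scorecard(cdes)
for kpi, value in result.items():
    print(f"{kpi}: {value:.0%}")
```

Expressing every question as a ratio over the same population keeps the KPIs comparable and makes it easy to track them over time.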
Compliance in Banking – Operating Model
[Diagram: operating model]
• Compliance results fetching:
• Collibra DGC automation for computed results on asset counts
• Collibra Connect for results coming from external systems
• Collibra Workflows for results captured by stakeholders
• Compliance score cards
• Principles & Requirements, Policies & Standards, Regulatory Report Catalog, Critical reporting elements, Regulations & Regulators
• Lines of Business, Business Process, Data Categories, System inventory
• Business glossaries: Enterprise, Market risk, Credit risk, Third party, Financial instruments
• Layers: BCBS 239 model & content (1.0); BCBS 239 Metamodel; BCBS 239 compliance KPI, result capturing mechanisms & scorecards (3.0); Company specifics; Reference Business Glossaries (1.0); …
• BCBS 239 model & content – Collibra-configured metamodel loaded with BCBS 239 content from the Basel Committee and other recognized bodies. Independent of customer context. Can be used out of the box.
• BCBS 239 compliance KPI, result capturing mechanisms & scorecards – Collibra out-of-the-box workflows, asset count dashboards, compliance KPIs and scorecards. To be used with specific customer configurations and integrations when required.
• Company specifics – Collibra standard content to be modified to fit company specifics.
• Business Glossaries – Common definitions of major financial business concepts. To be used with specific customer adaptations.
[Diagram labels: Critical Business Term, Policies, Data Categories, Principles, Life Cycle Management (2.0), Business Dimensions]
Compliance in Banking - Scorecard
Appendix
Recommended Reading
• Books:
• O’Neil, C.: Weapons of Math Destruction
• Franks, B.: Taming the Big Data Tidal Wave
• Sundararajan, A.: The Sharing Economy
• Pentland, S.: Social Physics: How Good Ideas Spread
• Zittrain, J.: The Future of the Internet
• Tunguz, T.; Bien, F. (2016) Winning with Data
• Articles:
• Lee et al. (2014) A Cubic Framework for the Chief Data Officer: Succeeding in a World of Big Data. MIS Quarterly Executive 13:1
• AIIM, Systems of Engagement and the Future of Enterprise IT (2017)
• http://mitiq.mit.edu/IQIS/Documents/CDOIQS_201177/Papers/05_01_7A-1_Laney.pdf
• http://si.deis.unical.it/zumpano/2004-2005/PSI/lezione2/ValueOfInformation.pdf
• http://dupress.deloitte.com/dup-us-en/topics/emerging-technologies/the-burdens-of-the-past.html
• Blog Posts
• https://www.collibra.com/blog/unleash-the-data-democracy-5-misconceptions-of-data-governance/
• https://www.collibra.com/blog/the-rise-of-the-chief-data-officer-cdo/
• https://www.collibra.com/blog/blognew-years-resolution/
• https://www.collibra.com/blog/data-lineage-diagrams-paradigm-shift-information-architects/

Contenu connexe

Tendances

To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...
To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...
To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...Jochem van Grondelle
 
Enabling a Data Mesh Architecture with Data Virtualization
Enabling a Data Mesh Architecture with Data VirtualizationEnabling a Data Mesh Architecture with Data Virtualization
Enabling a Data Mesh Architecture with Data VirtualizationDenodo
 
Data Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to MeshData Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to MeshJeffrey T. Pollock
 
Data Catalog for Better Data Discovery and Governance
Data Catalog for Better Data Discovery and GovernanceData Catalog for Better Data Discovery and Governance
Data Catalog for Better Data Discovery and GovernanceDenodo
 
Data Mesh for Dinner
Data Mesh for DinnerData Mesh for Dinner
Data Mesh for DinnerKent Graziano
 
Business Data Lake Best Practices
Business Data Lake Best PracticesBusiness Data Lake Best Practices
Business Data Lake Best PracticesCapgemini
 
Data Governance
Data GovernanceData Governance
Data GovernanceRob Lux
 
Data Governance Best Practices
Data Governance Best PracticesData Governance Best Practices
Data Governance Best PracticesDATAVERSITY
 
Wallchart - Data Warehouse Documentation Roadmap
Wallchart - Data Warehouse Documentation RoadmapWallchart - Data Warehouse Documentation Roadmap
Wallchart - Data Warehouse Documentation RoadmapDavid Walker
 
Enterprise Data Architecture Deliverables
Enterprise Data Architecture DeliverablesEnterprise Data Architecture Deliverables
Enterprise Data Architecture DeliverablesLars E Martinsson
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)James Serra
 
Data Lake Architecture – Modern Strategies & Approaches
Data Lake Architecture – Modern Strategies & ApproachesData Lake Architecture – Modern Strategies & Approaches
Data Lake Architecture – Modern Strategies & ApproachesDATAVERSITY
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)James Serra
 
Intuit's Data Mesh - Data Mesh Leaning Community meetup 5.13.2021
Intuit's Data Mesh - Data Mesh Leaning Community meetup 5.13.2021Intuit's Data Mesh - Data Mesh Leaning Community meetup 5.13.2021
Intuit's Data Mesh - Data Mesh Leaning Community meetup 5.13.2021Tristan Baker
 
Data Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced AnalyticsData Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced AnalyticsDATAVERSITY
 
Master Data Management methodology
Master Data Management methodologyMaster Data Management methodology
Master Data Management methodologyDatabase Architechs
 
Data Governance
Data GovernanceData Governance
Data GovernanceBoris Otto
 
Data Lake Overview
Data Lake OverviewData Lake Overview
Data Lake OverviewJames Serra
 
Formalize Data Governance with Policies and Procedures
Formalize Data Governance with Policies and ProceduresFormalize Data Governance with Policies and Procedures
Formalize Data Governance with Policies and ProceduresDATAVERSITY
 

Tendances (20)

To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...
To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...
To mesh or mess up your data organisation - Jochem van Grondelle (Prosus/OLX ...
 
[XConf Brasil 2020] Data mesh
[XConf Brasil 2020] Data mesh[XConf Brasil 2020] Data mesh
[XConf Brasil 2020] Data mesh
 
Enabling a Data Mesh Architecture with Data Virtualization
Enabling a Data Mesh Architecture with Data VirtualizationEnabling a Data Mesh Architecture with Data Virtualization
Enabling a Data Mesh Architecture with Data Virtualization
 
Data Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to MeshData Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to Mesh
 
Data Catalog for Better Data Discovery and Governance
Data Catalog for Better Data Discovery and GovernanceData Catalog for Better Data Discovery and Governance
Data Catalog for Better Data Discovery and Governance
 
Data Mesh for Dinner
Data Mesh for DinnerData Mesh for Dinner
Data Mesh for Dinner
 
Business Data Lake Best Practices
Business Data Lake Best PracticesBusiness Data Lake Best Practices
Business Data Lake Best Practices
 
Data Governance
Data GovernanceData Governance
Data Governance
 
Data Governance Best Practices
Data Governance Best PracticesData Governance Best Practices
Data Governance Best Practices
 
Wallchart - Data Warehouse Documentation Roadmap
Wallchart - Data Warehouse Documentation RoadmapWallchart - Data Warehouse Documentation Roadmap
Wallchart - Data Warehouse Documentation Roadmap
 
Enterprise Data Architecture Deliverables
Enterprise Data Architecture DeliverablesEnterprise Data Architecture Deliverables
Enterprise Data Architecture Deliverables
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)
 
Data Lake Architecture – Modern Strategies & Approaches
Data Lake Architecture – Modern Strategies & ApproachesData Lake Architecture – Modern Strategies & Approaches
Data Lake Architecture – Modern Strategies & Approaches
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)
 
Intuit's Data Mesh - Data Mesh Leaning Community meetup 5.13.2021
Intuit's Data Mesh - Data Mesh Leaning Community meetup 5.13.2021Intuit's Data Mesh - Data Mesh Leaning Community meetup 5.13.2021
Intuit's Data Mesh - Data Mesh Leaning Community meetup 5.13.2021
 
Data Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced AnalyticsData Architecture Best Practices for Advanced Analytics
Data Architecture Best Practices for Advanced Analytics
 
Master Data Management methodology
Master Data Management methodologyMaster Data Management methodology
Master Data Management methodology
 
Data Governance
Data GovernanceData Governance
Data Governance
 
Data Lake Overview
Data Lake OverviewData Lake Overview
Data Lake Overview
 
Formalize Data Governance with Policies and Procedures
Formalize Data Governance with Policies and ProceduresFormalize Data Governance with Policies and Procedures
Formalize Data Governance with Policies and Procedures
 

Similaire à Data Governance in a big data era

MIT ICIQ 2017 Keynote: Data Governance and Data Capitalization in the Big Dat...
MIT ICIQ 2017 Keynote: Data Governance and Data Capitalization in the Big Dat...MIT ICIQ 2017 Keynote: Data Governance and Data Capitalization in the Big Dat...
MIT ICIQ 2017 Keynote: Data Governance and Data Capitalization in the Big Dat...Pieter De Leenheer
 
Data Governance in the Big Data Era
Data Governance in the Big Data EraData Governance in the Big Data Era
Data Governance in the Big Data EraPieter De Leenheer
 
Usama Fayyad talk in South Africa: From BigData to Data Science
Usama Fayyad talk in South Africa:  From BigData to Data ScienceUsama Fayyad talk in South Africa:  From BigData to Data Science
Usama Fayyad talk in South Africa: From BigData to Data ScienceUsama Fayyad
 
Fate of the Chief Data Officer
Fate of the Chief Data OfficerFate of the Chief Data Officer
Fate of the Chief Data OfficerTamarah Usher
 
Big data
Big dataBig data
Big dataRiya
 
Perspectives on Ethical Big Data Governance
Perspectives on Ethical Big Data GovernancePerspectives on Ethical Big Data Governance
Perspectives on Ethical Big Data GovernanceCloudera, Inc.
 
-Enrichment - Unlocking the value of data for digital transformation - Big Da...
-Enrichment - Unlocking the value of data for digital transformation - Big Da...-Enrichment - Unlocking the value of data for digital transformation - Big Da...
-Enrichment - Unlocking the value of data for digital transformation - Big Da...webwinkelvakdag
 
Big Data, Big Investment
Big Data, Big InvestmentBig Data, Big Investment
Big Data, Big InvestmentGGV Capital
 
Internet of things, Big Data and Analytics 101
Internet of things, Big Data and Analytics 101Internet of things, Big Data and Analytics 101
Internet of things, Big Data and Analytics 101Mukul Krishna
 
Active Governance Across the Delta Lake with Alation
Active Governance Across the Delta Lake with AlationActive Governance Across the Delta Lake with Alation
Active Governance Across the Delta Lake with AlationDatabricks
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big DataUmair Shafique
 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise AnalyticsDATAVERSITY
 
Generating Big Value from Big Data
Generating Big Value from Big DataGenerating Big Value from Big Data
Generating Big Value from Big DataBrendan Aldrich
 
Keynote: Graphs in Government_Lance Walter, CMO
Keynote:  Graphs in Government_Lance Walter, CMOKeynote:  Graphs in Government_Lance Walter, CMO
Keynote: Graphs in Government_Lance Walter, CMONeo4j
 
Riding and Capitalizing the Next Wave of Information Technology
Riding and Capitalizing the Next Wave of Information TechnologyRiding and Capitalizing the Next Wave of Information Technology
Riding and Capitalizing the Next Wave of Information TechnologyGoutama Bachtiar
 
Big Data, Analytics and Data Science
Big Data, Analytics and Data ScienceBig Data, Analytics and Data Science
Big Data, Analytics and Data Sciencedlamb3244
 
Data2030 Summit MEA: Data Chaos to Data Culture March 2023
Data2030 Summit MEA: Data Chaos to Data Culture March 2023Data2030 Summit MEA: Data Chaos to Data Culture March 2023
Data2030 Summit MEA: Data Chaos to Data Culture March 2023Matt Turner
 
20140826 I&T Webinar_The Proliferation of Data - Finding Meaning Amidst the N...
20140826 I&T Webinar_The Proliferation of Data - Finding Meaning Amidst the N...20140826 I&T Webinar_The Proliferation of Data - Finding Meaning Amidst the N...
20140826 I&T Webinar_The Proliferation of Data - Finding Meaning Amidst the N...Steven Callahan
 
Valuing the data asset
Valuing the data assetValuing the data asset
Valuing the data assetBala Iyer
 

Similaire à Data Governance in a big data era (20)

MIT ICIQ 2017 Keynote: Data Governance and Data Capitalization in the Big Dat...
MIT ICIQ 2017 Keynote: Data Governance and Data Capitalization in the Big Dat...MIT ICIQ 2017 Keynote: Data Governance and Data Capitalization in the Big Dat...
MIT ICIQ 2017 Keynote: Data Governance and Data Capitalization in the Big Dat...
 
Data Governance in the Big Data Era
Data Governance in the Big Data EraData Governance in the Big Data Era
Data Governance in the Big Data Era
 
Usama Fayyad talk in South Africa: From BigData to Data Science
Usama Fayyad talk in South Africa:  From BigData to Data ScienceUsama Fayyad talk in South Africa:  From BigData to Data Science
Usama Fayyad talk in South Africa: From BigData to Data Science
 
Fate of the Chief Data Officer
Fate of the Chief Data OfficerFate of the Chief Data Officer
Fate of the Chief Data Officer
 
Big data
Big dataBig data
Big data
 
Perspectives on Ethical Big Data Governance
Perspectives on Ethical Big Data GovernancePerspectives on Ethical Big Data Governance
Perspectives on Ethical Big Data Governance
 
-Enrichment - Unlocking the value of data for digital transformation - Big Da...
-Enrichment - Unlocking the value of data for digital transformation - Big Da...-Enrichment - Unlocking the value of data for digital transformation - Big Da...
-Enrichment - Unlocking the value of data for digital transformation - Big Da...
 
Big Data, Big Investment
Big Data, Big InvestmentBig Data, Big Investment
Big Data, Big Investment
 
Internet of things, Big Data and Analytics 101
Internet of things, Big Data and Analytics 101Internet of things, Big Data and Analytics 101
Internet of things, Big Data and Analytics 101
 
Active Governance Across the Delta Lake with Alation
Active Governance Across the Delta Lake with AlationActive Governance Across the Delta Lake with Alation
Active Governance Across the Delta Lake with Alation
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
 
2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics2023 Trends in Enterprise Analytics
2023 Trends in Enterprise Analytics
 
Generating Big Value from Big Data
Generating Big Value from Big DataGenerating Big Value from Big Data
Generating Big Value from Big Data
 
Keynote: Graphs in Government_Lance Walter, CMO
Keynote:  Graphs in Government_Lance Walter, CMOKeynote:  Graphs in Government_Lance Walter, CMO
Keynote: Graphs in Government_Lance Walter, CMO
 
Riding and Capitalizing the Next Wave of Information Technology
Riding and Capitalizing the Next Wave of Information TechnologyRiding and Capitalizing the Next Wave of Information Technology
Riding and Capitalizing the Next Wave of Information Technology
 
Big Data, Analytics and Data Science
Big Data, Analytics and Data ScienceBig Data, Analytics and Data Science
Big Data, Analytics and Data Science
 
Data2030 Summit MEA: Data Chaos to Data Culture March 2023
Data2030 Summit MEA: Data Chaos to Data Culture March 2023Data2030 Summit MEA: Data Chaos to Data Culture March 2023
Data2030 Summit MEA: Data Chaos to Data Culture March 2023
 
uae views on big data
  uae views on  big data  uae views on  big data
uae views on big data
 
20140826 I&T Webinar_The Proliferation of Data - Finding Meaning Amidst the N...
20140826 I&T Webinar_The Proliferation of Data - Finding Meaning Amidst the N...20140826 I&T Webinar_The Proliferation of Data - Finding Meaning Amidst the N...
20140826 I&T Webinar_The Proliferation of Data - Finding Meaning Amidst the N...
 
Valuing the data asset
Valuing the data assetValuing the data asset
Valuing the data asset
 

Plus de Pieter De Leenheer

TiE DC GovCon Panel on Emerging Technologies: AI/ML/Blockchain/Data Managemen...
TiE DC GovCon Panel on Emerging Technologies: AI/ML/Blockchain/Data Managemen...TiE DC GovCon Panel on Emerging Technologies: AI/ML/Blockchain/Data Managemen...
TiE DC GovCon Panel on Emerging Technologies: AI/ML/Blockchain/Data Managemen...Pieter De Leenheer
 
The Data Driven University - Automating Data Governance and Stewardship in Au...
The Data Driven University - Automating Data Governance and Stewardship in Au...The Data Driven University - Automating Data Governance and Stewardship in Au...
The Data Driven University - Automating Data Governance and Stewardship in Au...Pieter De Leenheer
 
Business Semantics for Data Governance and Stewardship
Business Semantics for Data Governance and StewardshipBusiness Semantics for Data Governance and Stewardship
Business Semantics for Data Governance and StewardshipPieter De Leenheer
 
Data Stewardship and Governance: how to reach global adoption and systematic ...
Data Stewardship and Governance: how to reach global adoption and systematic ...Data Stewardship and Governance: how to reach global adoption and systematic ...
Data Stewardship and Governance: how to reach global adoption and systematic ...Pieter De Leenheer
 
Business Service Semantics: Ontological Representation & Governance of Busine...
Business Service Semantics: Ontological Representation & Governance of Busine...Business Service Semantics: Ontological Representation & Governance of Busine...
Business Service Semantics: Ontological Representation & Governance of Busine...Pieter De Leenheer
 
Data Governance in a Big Data Era

  • 1. Data Governance in a Big Data Era Pieter De Leenheer, PhD Columbia University in the City of New York August 7, 2017
  • 2. Administration • Slide Deck • < slideshare link> • Software • Data Governance Center: • inno.collibra.com • User: johnfisher / Pwd: 08072017 • On the go: • Collibra on the App Store • Collibra University • Sign up for free: http://university.collibra.com • Collibra Blog, e.g.: • https://www.collibra.com/blog/unleash-the-data-democracy-5-misconceptions-of-data-governance/
  • 3. Overview Intro - A Data Governance Odyssey •Library of Babel •About the Company Part 1 – The Chief Data Officer Rises •Digital Darwinism •The Big Data Bang •Data Brawls and FUD •The Chief Data Officer Role Types Part 2: Data Universe Expands • Data Value Hierarchies, Networks and Hybrids • Shift in Data Governance Approaches • Systems of Record vs. Systems of Engagement • Challenges: • Big Data Analytics • Digitalization of Trust • Weapons of Math Destruction Part 3: A Lens on the Data Universe • A system of record for data • Use Cases
  • 4. Introduction 2008: A Data Governance Odyssey
  • 6. 6 | ©Collibra 2016 Leading Data Governance Software company HQ in NYC Founded in BE, Spin Off VUB Engineering in EU
  • 7. 7 | ©Collibra 2016 Across verticals & geographies
  • 8. 8 | ©Collibra 2016 With Blue-chip Customers
  • 10. A Fool with a Tool: Collibra Data Education university.collibra.com
  • 11. Part 1 The Chief Data Officer Rises
  • 12. What do the companies in these groups have in common? • Group A: American Motors, Brown Shoe, Studebaker, Collins Radio, Detroit Steel, Zenith Electronics, and National Sugar Refining. • Group B: Boeing, Campbell Soup, General Motors, Kellogg, Procter and Gamble, Deere, IBM and Whirlpool. • Group C: Facebook, eBay, Home Depot, Microsoft, Office Depot and Target. Conclusion • only 12.2% of the Fortune 500 companies in 1955 were still on the list 59 years later in 2014 • life expectancy of a firm in the Fortune 500 • 50 years ago : ~ 75 years • Today: < 15 years and declining MIT Technology Review, Sept., 2013
  • 13. What happened between 1955 and today that caused this ‘creative destruction’? • Name some compelling events in information technology history • Order them chronologically • Try to explain the phenomenon in terms of the events • E.g., • Invention of the transistor • First modern computer • Publication of the Internet protocol • Launch of the World Wide Web • Wikipedia • Internet startups: FB, Google, etc. • data big bang
  • 14. Data Big Bang • Phenomenon: connectivity between • Social • Knowledge • Technology • Draws curiosity • Web Science (Pentland, etc.) • Big Data Native Market Entrants (23andMe, Uber, Inventure) • Big-data native entrants • 23andMe, Uber, Inventure • Enter bottom-up, low-end and disrupt • Pure data strategy • Serving “data-citizen” Millennials • +80% unstructured data or ‘dark energy’
  • 15. 16 | ©Collibra 2016 “Data-driven” is the Holy Grail of business
  • 16. 17 | ©Collibra 2016 But Becoming the Fittest is not Easy • 74% of firms say they want to be data-driven • In reality, only 29% say they are good at connecting analytics to action
  • 17. 18 | ©Collibra 2016 Here’s what typically happens • Hours spent preparing for the meeting • Collect data from finance, IT and the data lake • Do your analysis, prepare insights and recommendations • You’re ready to go
  • 18. 19 | ©Collibra 2016 Same company, same data, different results
  • 19. 20 | ©Collibra 2016 Meetings dissolve into data brawls You need: trustworthy data, common understanding, complete traceability, transparent data ownership
  • 20. 21 | ©Collibra 2016 Solving the problem with duct tape and chewing gum No unified system.
  • 21. 22 | ©Collibra 2016 Data Chaos Leads to Data FUD* (* fear, uncertainty and doubt) • Who can help me with this? • I don’t trust these numbers • Am I allowed to use this? • Where do I find the data? • I don’t understand this report • TREND: Exploding volume, velocity, and veracity of data • NEED: Manage data complexity • TREND: Increased reliance on analytics and regulatory reporting • NEED: Trusted data as a business dependency • NEED: Data Collaboration (Understanding, Discovery, Trust) • Data Infrastructure (IT) • Data Consumers (Business)
  • 22. 23 | ©Collibra 2016 You Need the Right Level of Control and Trust in Data Data governance & stewardship provide the right level of control and trust in data LEADERSHIP CEO, CFO, VP, Marketing ROLES Data Scientist, Business Analyst TECHNOLOGY Visualization, Self-service BI LEADERSHIP CIO ROLES Information Manager, Data Architect, Data Modeler TECHNOLOGY Hadoop, Databases, Data Integration LEADERSHIP Chief Data Officer Data Collaboration ROLES Data Governance Manager, Data Steward TECHNOLOGY Data Stewardship Platform Data Infrastructure (IT) Data Consumers (Business)
  • 23. 24 | ©Collibra 2016 The Rise of the Chief Data Officer (CDO) • Gartner on Chief Data Officers: • 1,000 Chief Data Officers or Chief Analytics Officers forecast in large organizations by the end of 2015, up from 400 in 2014 • 90% of large organizations will have a Chief Data Officer by 2019
  • 24. Role Types for the Chief Data Officer (CDO) (Lee et al., 2014) • Dimensions of CDO Roles • Collaboration: inwards / outwards • Data Space: traditional data / big data • Value Impact: service / strategy • Reporting: • 30% to CEO • 20% to COO • 10% to CFO
  • 25. CDO Assessment Join our MIT Sloan CDO Research https://university.collibra.com/cdo-survey/
  • 26. Shift in Data Governance Approaches • Digital forces pose gigantic risks as well as opportunities for organizations; a balance is needed between: • Hierarchical data governance (system of record) • CDO as a Coordinator: Inward-oriented / Traditional Data / Service • Defensive: Risk-driven • Scarcity: Few consumers, few producers • Compromises on old obsolete cost assumptions of digital power • Use of digital optimizes to some extent • Not scalable for big data by larger ‘data scientist’ populations • Networked data governance (systems of engagement) • CDO as an Experimenter: Outward / Big Data / Strategy • Offensive: Value-driven • Abundance • Many Producers (Data Democratization) • Eliminate Breadlines • Consumerization of BI and cheap digital power • Many serve many • Supports customer • Many Consumers (Data Amazonification) • Access, SLA, Trust, Secure Cloud, etc.
  • 27. 32 | ©Collibra 2016 System of Record vs. System of Engagement (AIIM 2017) • System of Record • Purpose: control and regulate • Top-down design around discrete pieces of information (“records”) • Decomposition in ‘black boxes’ • Presumes ‘big picture’ • Examples: SFDC, Workday, ServiceNow, Atlassian • System of Engagement • Purpose: innovate • Bottom-up, decentralized, incorporate technologies which encourage peer interactions, leveraged by cloud technologies • Seed Model • Emergence of complex system • Examples: Slack, Confluence
  • 28. Big Data Analytics Challenges • Where everybody has data scientists, predicting the next transaction is not competitive anymore • from 'predict next transaction' to life-long relation building and value creation • reduce search and navigation for the customer with better apps • crowdsourcing to cross-compare with and learn from other customers (Opower, INRIX, Zillow) • gain the customer's trust through branded non-intrusive apps: personal health monitoring, Nest • Retention analysis example
  • 29. Digitalization of Trust Challenges • In Hierarchical Data Governance, trust is • established by a centrally sanctioned competence center • or by externally appointed trustees with formal roles: stewards, owners, architects • In Networked Data Governance, trust is more complicated: • Authenticity: is the data factual or opinion-based? • Intention: does this data have good intentions? Can I use it without peril? Are there hidden privacy concerns I should be aware of? • Assess expertise or quality: are the people involved skilled or certified stewards? • Is it accurately representing our business reality, i.e. customer base? • Is it complete and up to date? • Has it been certified through a standard process?
  • 30. Danger of the old paradigm models • Weapons of Math Destruction (WMD) are models • Threaten to destabilize • Equality • Democracy • Traits of WMDs • Opaque • Unregulated • Uncontestable • …hence : ungoverned
  • 31. Preliminary Conclusions • Digital forces have digitally empowered individuals in the organization • Hybrid data governance approach should combine • Top-down governance of critical data assets to enhance internal coordination • Networked peer-driven empowerment to drive ‘serendipity’ • On a shared platform • Key challenges are: • Digitalization of trust with focus on social capital • Big data analytics that drives lifetime value for the customer • Data Valuation based on Usage • Legacy of opaque, unregulated and incontestable models • Recognize CDO Leadership and Role transition
  • 32. Part 3 A System of Record for Data Assets
  • 33. 38 | ©Collibra 2016 The System of Record for Data Assets: the authoritative source of information for any given data asset used by (hence valuable for) the organization • Know where the data comes from • Know what the data means • Know that the data is right
  • 34. 39 | ©Collibra 2016 The System of Record for Data Assets: the authoritative source of information for any given data asset used by (hence valuable for) the organization • Find: know where the data comes from • Understand: know what the data means • Trust: know that the data is right • All activities and information surrounding the data, its meaning and its use.
  • 35. Demo
  • 36. Part 3b Data Governance Business Cases
  • 37. 4 Industries • Technology – Big Data Valuation • Health Care – Reference Data • Manufacturing – IoT and GDPR • Banking – Compliance • Can you identify hierarchical vs networked mechanisms in these business cases?
  • 38. Big Data in Tech Industry - Life Cycle
  • 39. Not all data is of equal value • At Dell, lean data governance: catalogs an inventory of all types of data assets while implementing a minimum set of business specific metadata attributes • Data is governed based on level of consumption – the value of the data and how much is shared • Categorized as enterprise supported “operationalized" or innovation discovery (courtesy Barbara Tulippe)
  • 41. Reference Data in Health Care • Independence is the leading health insurer in southeastern Pennsylvania. • Serves close to seven million people nationwide, including 2.1 million in the region. • 42,000 physicians • 160 hospitals https://prezi.com/ve1ws8jmpqcn/workflow/
  • 42. IoT + GDPR in Manufacturing • Internet of Things and GDPR • From responsive to competitive advantage • Steps • Identify processes ‘touching’ EU citizen data: employees / customers /… • Identify critical data elements: name, SSN, address • Lineage / Traceability https://hbr.org/2014/11/how-smart-connected-products-are-transforming-competition
  • 43.
  • 44. Compliance in Banking - Scorecard • How many Critical Data Elements (CDEs) have a dedicated stewardship resource assigned? • Are those Business Stewards actively participating in stewardship activities? • Are CDEs progressing through the expected life cycle? • Are relations to physical data assets and source systems defined? • Is data profiling occurring based on defined Data Quality Rules?
  • 45. Compliance in Banking – Operating Model Compliance results fetching Principles & Requirements Policies & Standards Regulatory Report Catalog Critical reporting elements Regulations & Regulators Lines of Business Business Process Data Categories System inventory Enterprise business glossary Market risk business glossary Credit risk business glossary Third party business glossary Financial instruments business glossary Compliance score cards Collibra DGC automation for computed results on asset counts Collibra Connect for results coming from external systems Collibra Workflows for results captured by stakeholders BCBS 239 model & content (1.0) BCBS 239 Metamodel BCBS 239 compliance KPI, result capturing mechanisms & scorecards (3.0) Company specifics Reference Business Glossaries (1.0) … 1 2 3 4 • BCBS 239 model & content – Collibra configured metamodel and loaded BCBS 239 content from the Basel Committee and other recognized bodies. Independent from customer context. Can be used out-of-the-box. • BCBS 239 compliance KPI, result capturing mechanisms & scorecards – Collibra out-of-the-box workflows, asset counts dashboards. Compliance KPIs and scorecards. To be used with specific customer configurations and integrations when required. • Company specifics – Collibra standard content to be modified to fit company specifics. • Business Glossaries – Common definitions on major financial business concepts. To be used with specific customer adaptations. 1 2 3 4 Critical Business Term Policies Data Categories Principles Life Cycle Management (2.0) 3 Business Dimensions
  • 46. Compliance in Banking - Scorecard
  • 48. Recommended Reading • Books: • O’Neil, C.: Weapons of Math Destruction • Franks, B.: Taming the Big Data Tidal Wave • Sundararajan, A.: The Sharing Economy • Pentland, S.: Social Physics: How Good Ideas Spread • Zittrain, J.: The Future of the Internet • Tunguz, T.; Bien, F. (2016) Winning with Data • Articles: • Lee et al. (2014) A Cubic Framework for the Chief Data Officer: Succeeding in a World of Big Data. MIS Quarterly Executive 13:1 • AIIM, Systems of Engagement and the Future of Enterprise IT (2017) • http://mitiq.mit.edu/IQIS/Documents/CDOIQS_201177/Papers/05_01_7A-1_Laney.pdf • http://si.deis.unical.it/zumpano/2004-2005/PSI/lezione2/ValueOfInformation.pdf • http://dupress.deloitte.com/dup-us-en/topics/emerging-technologies/the-burdens-of-the-past.html • Blog Posts • https://www.collibra.com/blog/unleash-the-data-democracy-5-misconceptions-of-data-governance/ • https://www.collibra.com/blog/the-rise-of-the-chief-data-officer-cdo/ • https://www.collibra.com/blog/blognew-years-resolution/ • https://www.collibra.com/blog/data-lineage-diagrams-paradigm-shift-information-architects/
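
The scorecard questions on slide 44 (stewardship assignment, steward activity, life-cycle progress, links to physical assets, data profiling) can be read as simple ratios over the set of Critical Data Elements. The following is a minimal sketch of that idea, not Collibra's implementation: the `CriticalDataElement` class, its field names, the `scorecard` function and the sample records are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class CriticalDataElement:
    """Hypothetical record for one CDE; field names are illustrative assumptions."""
    name: str
    steward: Optional[str] = None            # dedicated stewardship resource, if any
    steward_active: bool = False             # steward takes part in stewardship activities
    lifecycle_on_track: bool = False         # CDE progresses through the expected life cycle
    linked_to_physical_assets: bool = False  # relations to physical assets / source systems defined
    profiled_against_dq_rules: bool = False  # profiling runs against defined Data Quality Rules


def scorecard(cdes: List[CriticalDataElement]) -> Dict[str, float]:
    """Return each scorecard question as a percentage of all CDEs."""
    total = len(cdes)
    if total == 0:
        return {}

    def pct(count: int) -> float:
        return round(100.0 * count / total, 1)

    return {
        "steward_assigned": pct(sum(1 for c in cdes if c.steward is not None)),
        "steward_active": pct(sum(1 for c in cdes if c.steward is not None and c.steward_active)),
        "lifecycle_on_track": pct(sum(1 for c in cdes if c.lifecycle_on_track)),
        "linked_to_physical_assets": pct(sum(1 for c in cdes if c.linked_to_physical_assets)),
        "profiled_against_dq_rules": pct(sum(1 for c in cdes if c.profiled_against_dq_rules)),
    }


# Made-up sample data, only to show the shape of the result.
sample = [
    CriticalDataElement("customer_name", "alice", True, True, True, True),
    CriticalDataElement("ssn", "bob", False, True, True, False),
    CriticalDataElement("address"),
    CriticalDataElement("account_balance", "carol", True, False, True, True),
]
result = scorecard(sample)
```

Expressing every question as a share of all CDEs keeps the five indicators comparable on one dashboard; a real deployment would more likely read these flags from the governance platform than from hand-built records.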

Editor's notes

  1. Intro: an introduction to our company, to make clear the background of what I am going to tell you
  2. inno.com
  3. Invention of the transistor; 60’s: Internet; 1989: publication by CERN of a global hypertext system; 2004: social media; streaming; explosion of mobile devices
  4. * Evolution, not transformation, because the latter would not take into account the “creative destruction” part. An acceleration will happen as many of these young data citizens enter the job market and change their employers’ behaviour.
  5. Data-driven is the word of the day. Organizations increasingly depend on data to thrive.
  6. According to Forrester Research, even though 74% of firms say they want to be data-driven, in reality, only 29% say they are good at connecting analytics to action. There are many reasons why this happens. See if this one sounds familiar:
  7. You spend hours preparing for a meeting. You collect the data you need from finance, IT, and even retrieve some from the data lake. You analyze it from every angle, and prepare your insights and recommendations. You’re confident in your findings, and are ready to make your data-driven argument.
  8. But before you have a chance to present, your colleague presents her findings and recommendations, using data she analyzed from the data lake, salesforce.com, and IT. It’s vaguely familiar, but distinctly different, and leads to a data-driven argument with a completely different conclusion.
  9. Now, instead of making a data-driven argument, you find yourself engaged in a data brawl. The meeting is no longer focused on the decisions at hand, but rather on answering questions about the data itself: Where did it come from? What does it mean? What metrics were applied to it? Who can access it? Is it correct? If not, how can I fix it? Everyone realizes that “data-driven” doesn’t work without trustworthy data.
  10. In an effort to answer those questions so this situation doesn’t happen again, you head back to your office and put in place processes driven by Excel spreadsheets, emails, meetings, and Sharepoint documents. You track information about the data - where it comes from, who can access it, what it means, and more - in all these different ways, holding the process together with duct tape and string. Your colleague does the same, and before you know it, many people across your organization have a similar process that they follow, which only multiplies the problem. These processes are inefficient, inaccurate, and expensive. It’s no wonder your meetings dissolve into data brawls. This approach isn’t just unsustainable – it simply doesn’t work. ENDLESS MEETINGS – people fly in every two weeks
  11. On the left side of the diagram, you have the traditional data infrastructure (IT), which is coping with the trend of exploding volumes and types of data. In addition, data growth is becoming more and more complex, increasing their challenge. At the same time, on the right side, you have the business, which needs to report more complex data, faster than ever, and also needs analytics. This can include internal reports, which are often compiled manually on spreadsheets; often reports rely on multiple layers of interrelated spreadsheets. This makes it more difficult than ever to trust and depend on the data. Then from the business focus, in the center, you have all kinds of trends like consumerization of IT (where IT is purchasing software tools), social and data maturity. This leads to a need for a data authority.
  12. If all these communication channels are in place, we can trace whereabouts and usage of every data product individually, from the definition level down to the storage.
  13. If all these communication channels are in place, we can trace whereabouts and usage of every data product individually, from the definition level down to the storage.
  14. Outwards: e.g., a manufacturing company may agree with its suppliers and distributors on one global product ID. Traditional: enterprise-level MDM, BI and analytics. Big Data: more on the application level, more self-service BI, more data scientists experimenting with big data, which requires appropriate approvals for data usage and sharing. Data as a service: an immediate need to improve service quality, regulatory compliance and the reputation of the company. Data as a strategy: build aggregated data products and resell them, e.g., a cab company selling the GPS information of its cabs to Google.
  15. Top-down: methodological step-wise decomposition into subcomponents (‘black boxes’) in a reverse-engineering fashion. Presumes a preconception of the ‘big picture’. Bottom-up: allowed behaviour is based on a simple rule set. Subsystems give rise to complex systems. Original systems become subsystems of the emerging system. Perception. Seed model.
  16. To be data driven, an organization needs three things: knowing where its data comes from, knowing what it means, and knowing that it’s right. What if everyone in your organization had the same ability – to find, understand, and trust their data? This is what Collibra brings to its clients:
  17. Collibra allows you to find, understand and trust your data. Finding data by giving you a catalog which ingests information about datasets and the metadata underneath. Understanding data by putting this metadata into context. By linking it to business concepts: tags, business units, data dimensions, KPI’s reports... And finally Collibra allows you to trust this information by enabling policy management, a data helpdesk and the worlds best stewardship platform.