SlideShare une entreprise Scribd logo
1  sur  32
Télécharger pour lire hors ligne
Tristan Baker - linkedin.com/in/tristanbaker
Suresh Raman - linkedin.com/in/ramansuresh
Allison Bellah (in absentia) - linkedin.com/in/allisonbellah
Data Mesh at Intuit
May 13, 2021
©2021 Intuit Inc. All rights reserved. 2
Arriving at data mesh
Our vision and four part strategy
Now with 25% more parts!
Q&A
Fire away!
Agenda
Arriving at data mesh
©2021 Intuit Inc. All rights reserved. 4
A brief history of data infrastructure
©2021 Intuit Inc. All rights reserved. 5
A brief history of data infrastructure
©2021 Intuit Inc. All rights reserved. 6
A brief history of data infrastructure
©2021 Intuit Inc. All rights reserved. 7
Today
©2021 Intuit Inc. All rights reserved. 8
What “we cannot scale” sounds like from our users
Discovering Data
● Where can I find data about a particular thing (customer,
company, etc)?
● Where can I find the data sourced from a particular product
or service?
Understanding Data
● Who can approve my access so that I can see samples of the
data?
● What is the schema of the data?
● What is the business meaning and context of the data?
● Is this data related to other concepts? Is it joinable to other
data? What is the meaning of the relationship?
Trusting Data
● What system produces this data and at what latency?
● What other systems use this data?
● What is the quality of this data? Is it ‘clean’?
● Which team supports this data if it breaks?
Publishing Data
● How do I describe my data so that others understand what it
means and how to use it?
● Where do I host my data so that other systems can access it?
● Data systems are complicated, how can I build and operate
my process on top of one?
● What are my operational responsibilities once my
process/data is in production?
● How do I meet my compliance requirements for
processing/storing/publishing data?
● Am I duplicating processing/data that already exists?
Consuming Data
● How is this table/topic partitioned?
● Who can approve my production system to access it?
● Will I get alerted if the schema changes?
©2021 Intuit Inc. All rights reserved. 9
The future of data infrastructure
● Data treated as code
● Data service as a facet of a product
● Data responsibility decentralized
● Producers take responsibility for data
● Producers serve consumers
● Data platform provides the ecosystem to
govern and manage the lifecycle of data and
machine learning
The provocation
Data Mesh is born
Our vision and four part
strategy
©2021 Intuit Inc. All rights reserved. 11
Enable more Intuit teams
to more easily use and
create data
©2021 Intuit Inc. All rights reserved. 12
Four part strategy
• Stewardship
– ensures accountability for a set of defined responsibilities in building and managing their solutions; including
adherence to a set of defined best practices to produce only high quality data.
• Organizing people, code and data
– A systematic approach to organizing the people, code and data which clearly identifies the owners of a business problem and its
solution.
• Self serve products
– A rich suite of self serve products that enable teams to more easily author, deploy, govern and operate their own solutions, aided
by automation and processes that support best practices and high quality as a precondition for deployment.
• Rationalizing data definitions
– A process for rationalizing all critical data definitions at the company so that data concepts like Customer, Product and
Entitlement are unique, re-usable and non-conflicting.
Stewardship
©2021 Intuit Inc. All rights reserved. 14
©2021 Intuit Inc. All rights reserved. 15
Stewardship goals for next year
Organizing People,
Code, and Data
©2021 Intuit Inc. All rights reserved. 17
Raw information about physical systems that describes where the data is stored and where code is
executing. This describes where data is physically located so that it can be accessed.
©2021 Intuit Inc. All rights reserved. 18
Basic dependency, ownership and classification information provides additional context about physical
data and code locations so that data can be better governed, secured and operated by the owning
teams.
©2021 Intuit Inc. All rights reserved. 19
Why organizing people, code and data matters
19
Private vs Public
~50% tables are either
temp/sandbox/staging/test/backu
p tables
- Messes up search & discovery
- Teams consume data not meant
for external use
Data Ownership
~50% tables don’t have clearly
identified owners
- Erodes Trust
- Copies proliferate
- Operational, Governance risk
Self Serve Products
©2021 Intuit Inc. All rights reserved. 21
Data Processing Capabilities Data Serving Capabilities
©2021 Intuit Inc. All rights reserved. 22
Self Serve goals for next year
100% of Top 20 tasks in the Data lifecycle are Self Serve
Infra Provisioning
● Transactional Persistence
● Compute for stream, batch
processing
● Monitor, Debug Infra
● Cost
Data Authoring
● Events, Schemas
● Ingestion
● Transformations
● Entities
● ML Features
● Data Quality,
Observability
● Orchestration
Data Governance
● Access Management
● Key management
● Compliance Controls &
Audit
● Privacy
Rationalizing Data
Definitions
©2021 Intuit Inc. All rights reserved. 24
Clean entity information with formally defined meaning and relationships enables better data understanding. This
is the purpose of entity definitions. They ensure that data is clean, organized, connected, discoverable and
documented in a formal way.
©2021 Intuit Inc. All rights reserved. 25
When you bring it all together, you get Intuit’s Data Mesh
©2021 Intuit Inc. All rights reserved. 26
©2021 Intuit Inc. All rights reserved. 27
©2021 Intuit Inc. All rights reserved. 28
©2021 Intuit Inc. All rights reserved. 29
©2021 Intuit Inc. All rights reserved. 30
Capturing meaning,
relationship, ownership, and
system dependencies builds a
full, rich picture for everyone.
No tribal knowledge needed!
In this example, the clean
information describes entities
OII Account and Intuit Product
and the Entitled To relationship
between them.
The basic information describes
how the data for these entities
are sourced from the Identity
Universal Service and the
Entitlement Reference Service.
The raw information describes
which Event Bus topic and Data
Lake table the data for these
entities can be found in.
Q&A
Tristan Baker - linkedin.com/in/tristanbaker
Suresh Raman - linkedin.com/in/ramansuresh
Allison Bellah (in absentia) - linkedin.com/in/allisonbellah
©2021 Intuit Inc. All rights reserved. 32
32

Contenu connexe

Tendances

Enabling a Data Mesh Architecture with Data Virtualization
Enabling a Data Mesh Architecture with Data VirtualizationEnabling a Data Mesh Architecture with Data Virtualization
Enabling a Data Mesh Architecture with Data VirtualizationDenodo
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...DataScienceConferenc1
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture DesignKujambu Murugesan
 
Activate Data Governance Using the Data Catalog
Activate Data Governance Using the Data CatalogActivate Data Governance Using the Data Catalog
Activate Data Governance Using the Data CatalogDATAVERSITY
 
DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDatabricks
 
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data PipelinesPutting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data PipelinesDATAVERSITY
 
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...DATAVERSITY
 
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...HostedbyConfluent
 
Data Mesh at CMC Markets: Past, Present and Future
Data Mesh at CMC Markets: Past, Present and FutureData Mesh at CMC Markets: Past, Present and Future
Data Mesh at CMC Markets: Past, Present and FutureLorenzo Nicora
 
Lambda Architecture in the Cloud with Azure Databricks with Andrei Varanovich
Lambda Architecture in the Cloud with Azure Databricks with Andrei VaranovichLambda Architecture in the Cloud with Azure Databricks with Andrei Varanovich
Lambda Architecture in the Cloud with Azure Databricks with Andrei VaranovichDatabricks
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?DATAVERSITY
 
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...Cathrine Wilhelmsen
 
The Importance of Metadata
The Importance of MetadataThe Importance of Metadata
The Importance of MetadataDATAVERSITY
 
Enterprise Architecture vs. Data Architecture
Enterprise Architecture vs. Data ArchitectureEnterprise Architecture vs. Data Architecture
Enterprise Architecture vs. Data ArchitectureDATAVERSITY
 
Data Quality Best Practices
Data Quality Best PracticesData Quality Best Practices
Data Quality Best PracticesDATAVERSITY
 
Enterprise Architecture vs. Data Architecture
Enterprise Architecture vs. Data ArchitectureEnterprise Architecture vs. Data Architecture
Enterprise Architecture vs. Data ArchitectureDATAVERSITY
 
Five Things to Consider About Data Mesh and Data Governance
Five Things to Consider About Data Mesh and Data GovernanceFive Things to Consider About Data Mesh and Data Governance
Five Things to Consider About Data Mesh and Data GovernanceDATAVERSITY
 
Data Lake Overview
Data Lake OverviewData Lake Overview
Data Lake OverviewJames Serra
 

Tendances (20)

Enabling a Data Mesh Architecture with Data Virtualization
Enabling a Data Mesh Architecture with Data VirtualizationEnabling a Data Mesh Architecture with Data Virtualization
Enabling a Data Mesh Architecture with Data Virtualization
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
Data Mesh 101
Data Mesh 101Data Mesh 101
Data Mesh 101
 
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture Design
 
Activate Data Governance Using the Data Catalog
Activate Data Governance Using the Data CatalogActivate Data Governance Using the Data Catalog
Activate Data Governance Using the Data Catalog
 
DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data PipelinesPutting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
Putting the Ops in DataOps: Orchestrate the Flow of Data Across Data Pipelines
 
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
Data Architecture, Solution Architecture, Platform Architecture — What’s the ...
 
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...
 
Data Mesh at CMC Markets: Past, Present and Future
Data Mesh at CMC Markets: Past, Present and FutureData Mesh at CMC Markets: Past, Present and Future
Data Mesh at CMC Markets: Past, Present and Future
 
Lambda Architecture in the Cloud with Azure Databricks with Andrei Varanovich
Lambda Architecture in the Cloud with Azure Databricks with Andrei VaranovichLambda Architecture in the Cloud with Azure Databricks with Andrei Varanovich
Lambda Architecture in the Cloud with Azure Databricks with Andrei Varanovich
 
Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?Emerging Trends in Data Architecture – What’s the Next Big Thing?
Emerging Trends in Data Architecture – What’s the Next Big Thing?
 
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...
 
The Importance of Metadata
The Importance of MetadataThe Importance of Metadata
The Importance of Metadata
 
Enterprise Architecture vs. Data Architecture
Enterprise Architecture vs. Data ArchitectureEnterprise Architecture vs. Data Architecture
Enterprise Architecture vs. Data Architecture
 
Data Quality Best Practices
Data Quality Best PracticesData Quality Best Practices
Data Quality Best Practices
 
Enterprise Architecture vs. Data Architecture
Enterprise Architecture vs. Data ArchitectureEnterprise Architecture vs. Data Architecture
Enterprise Architecture vs. Data Architecture
 
Five Things to Consider About Data Mesh and Data Governance
Five Things to Consider About Data Mesh and Data GovernanceFive Things to Consider About Data Mesh and Data Governance
Five Things to Consider About Data Mesh and Data Governance
 
Data Lake Overview
Data Lake OverviewData Lake Overview
Data Lake Overview
 

Similaire à Intuit's Data Mesh - Data Mesh Leaning Community meetup 5.13.2021

Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...Data Con LA
 
inSis Suite - Process Data Analytics, Dashboards, Portal & Historian
inSis Suite - Process Data Analytics, Dashboards, Portal & HistorianinSis Suite - Process Data Analytics, Dashboards, Portal & Historian
inSis Suite - Process Data Analytics, Dashboards, Portal & HistorianKondapi V Siva Rama Brahmam
 
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...Denodo
 
RWDG Slides: Using Tools to Advance Your Data Governance Program
RWDG Slides: Using Tools to Advance Your Data Governance ProgramRWDG Slides: Using Tools to Advance Your Data Governance Program
RWDG Slides: Using Tools to Advance Your Data Governance ProgramDATAVERSITY
 
How often do Your Machines and People talk? Humanizing the IoT - AWS IoT Web Day
How often do Your Machines and People talk? Humanizing the IoT - AWS IoT Web DayHow often do Your Machines and People talk? Humanizing the IoT - AWS IoT Web Day
How often do Your Machines and People talk? Humanizing the IoT - AWS IoT Web DayAWS Germany
 
Data Virtualization Accelerating Your Data Strategy
Data Virtualization Accelerating Your Data StrategyData Virtualization Accelerating Your Data Strategy
Data Virtualization Accelerating Your Data StrategyDenodo
 
Rethink Your 2021 Data Management Strategy with Data Virtualization (ASEAN)
Rethink Your 2021 Data Management Strategy with Data Virtualization (ASEAN)Rethink Your 2021 Data Management Strategy with Data Virtualization (ASEAN)
Rethink Your 2021 Data Management Strategy with Data Virtualization (ASEAN)Denodo
 
The value of big data analytics
The value of big data analyticsThe value of big data analytics
The value of big data analyticsMarc Vael
 
¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?Denodo
 
Modernize your Infrastructure and Mobilize Your Data
Modernize your Infrastructure and Mobilize Your DataModernize your Infrastructure and Mobilize Your Data
Modernize your Infrastructure and Mobilize Your DataPrecisely
 
SG Data Mgt - Findings and Recommendations.pptx
SG Data Mgt - Findings and Recommendations.pptxSG Data Mgt - Findings and Recommendations.pptx
SG Data Mgt - Findings and Recommendations.pptxssuser57f752
 
Tdwi march 2015 presentation
Tdwi march 2015 presentationTdwi march 2015 presentation
Tdwi march 2015 presentationAlison Macfie
 
GDPR Noncompliance: Avoid the Risk with Data Virtualization
GDPR Noncompliance: Avoid the Risk with Data VirtualizationGDPR Noncompliance: Avoid the Risk with Data Virtualization
GDPR Noncompliance: Avoid the Risk with Data VirtualizationDenodo
 
How to Build An AI Based Customer Data Platform: Learn the design patterns fo...
How to Build An AI Based Customer Data Platform: Learn the design patterns fo...How to Build An AI Based Customer Data Platform: Learn the design patterns fo...
How to Build An AI Based Customer Data Platform: Learn the design patterns fo...TigerGraph
 
KASHTECH AND DENODO: ROI and Economic Value of Data Virtualization
KASHTECH AND DENODO: ROI and Economic Value of Data VirtualizationKASHTECH AND DENODO: ROI and Economic Value of Data Virtualization
KASHTECH AND DENODO: ROI and Economic Value of Data VirtualizationDenodo
 
A Real World Case Study for Implementing an Enterprise Scale Data Fabric
A Real World Case Study for Implementing an Enterprise Scale Data FabricA Real World Case Study for Implementing an Enterprise Scale Data Fabric
A Real World Case Study for Implementing an Enterprise Scale Data FabricNeo4j
 
Webinar #2 - Transforming Challenges into Opportunities for Credit Unions
Webinar #2 - Transforming Challenges into Opportunities for Credit UnionsWebinar #2 - Transforming Challenges into Opportunities for Credit Unions
Webinar #2 - Transforming Challenges into Opportunities for Credit UnionsDenodo
 
China data-mngnt-solution-market-report
China data-mngnt-solution-market-reportChina data-mngnt-solution-market-report
China data-mngnt-solution-market-reportssuser7709011
 
Data Governance for End-User Computing
Data Governance for  End-User ComputingData Governance for  End-User Computing
Data Governance for End-User ComputingDATAVERSITY
 
Go from data to decision in one unified platform.pdf
Go from data to decision in one unified platform.pdfGo from data to decision in one unified platform.pdf
Go from data to decision in one unified platform.pdfwebmaster553228
 

Similaire à Intuit's Data Mesh - Data Mesh Leaning Community meetup 5.13.2021 (20)

Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
Data Con LA 2022 - Modernizing Analytics & AI for today's needs: Intuit Turbo...
 
inSis Suite - Process Data Analytics, Dashboards, Portal & Historian
inSis Suite - Process Data Analytics, Dashboards, Portal & HistorianinSis Suite - Process Data Analytics, Dashboards, Portal & Historian
inSis Suite - Process Data Analytics, Dashboards, Portal & Historian
 
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...
 
RWDG Slides: Using Tools to Advance Your Data Governance Program
RWDG Slides: Using Tools to Advance Your Data Governance ProgramRWDG Slides: Using Tools to Advance Your Data Governance Program
RWDG Slides: Using Tools to Advance Your Data Governance Program
 
How often do Your Machines and People talk? Humanizing the IoT - AWS IoT Web Day
How often do Your Machines and People talk? Humanizing the IoT - AWS IoT Web DayHow often do Your Machines and People talk? Humanizing the IoT - AWS IoT Web Day
How often do Your Machines and People talk? Humanizing the IoT - AWS IoT Web Day
 
Data Virtualization Accelerating Your Data Strategy
Data Virtualization Accelerating Your Data StrategyData Virtualization Accelerating Your Data Strategy
Data Virtualization Accelerating Your Data Strategy
 
Rethink Your 2021 Data Management Strategy with Data Virtualization (ASEAN)
Rethink Your 2021 Data Management Strategy with Data Virtualization (ASEAN)Rethink Your 2021 Data Management Strategy with Data Virtualization (ASEAN)
Rethink Your 2021 Data Management Strategy with Data Virtualization (ASEAN)
 
The value of big data analytics
The value of big data analyticsThe value of big data analytics
The value of big data analytics
 
¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?
 
Modernize your Infrastructure and Mobilize Your Data
Modernize your Infrastructure and Mobilize Your DataModernize your Infrastructure and Mobilize Your Data
Modernize your Infrastructure and Mobilize Your Data
 
SG Data Mgt - Findings and Recommendations.pptx
SG Data Mgt - Findings and Recommendations.pptxSG Data Mgt - Findings and Recommendations.pptx
SG Data Mgt - Findings and Recommendations.pptx
 
Tdwi march 2015 presentation
Tdwi march 2015 presentationTdwi march 2015 presentation
Tdwi march 2015 presentation
 
GDPR Noncompliance: Avoid the Risk with Data Virtualization
GDPR Noncompliance: Avoid the Risk with Data VirtualizationGDPR Noncompliance: Avoid the Risk with Data Virtualization
GDPR Noncompliance: Avoid the Risk with Data Virtualization
 
How to Build An AI Based Customer Data Platform: Learn the design patterns fo...
How to Build An AI Based Customer Data Platform: Learn the design patterns fo...How to Build An AI Based Customer Data Platform: Learn the design patterns fo...
How to Build An AI Based Customer Data Platform: Learn the design patterns fo...
 
KASHTECH AND DENODO: ROI and Economic Value of Data Virtualization
KASHTECH AND DENODO: ROI and Economic Value of Data VirtualizationKASHTECH AND DENODO: ROI and Economic Value of Data Virtualization
KASHTECH AND DENODO: ROI and Economic Value of Data Virtualization
 
A Real World Case Study for Implementing an Enterprise Scale Data Fabric
A Real World Case Study for Implementing an Enterprise Scale Data FabricA Real World Case Study for Implementing an Enterprise Scale Data Fabric
A Real World Case Study for Implementing an Enterprise Scale Data Fabric
 
Webinar #2 - Transforming Challenges into Opportunities for Credit Unions
Webinar #2 - Transforming Challenges into Opportunities for Credit UnionsWebinar #2 - Transforming Challenges into Opportunities for Credit Unions
Webinar #2 - Transforming Challenges into Opportunities for Credit Unions
 
China data-mngnt-solution-market-report
China data-mngnt-solution-market-reportChina data-mngnt-solution-market-report
China data-mngnt-solution-market-report
 
Data Governance for End-User Computing
Data Governance for  End-User ComputingData Governance for  End-User Computing
Data Governance for End-User Computing
 
Go from data to decision in one unified platform.pdf
Go from data to decision in one unified platform.pdfGo from data to decision in one unified platform.pdf
Go from data to decision in one unified platform.pdf
 

Dernier

Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...Dr Arash Najmaei ( Phd., MBA, BSc)
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max PrincetonTimothy Spann
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxHimangsuNath
 
INTRODUCTION TO Natural language processing
INTRODUCTION TO Natural language processingINTRODUCTION TO Natural language processing
INTRODUCTION TO Natural language processingsocarem879
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Boston Institute of Analytics
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBoston Institute of Analytics
 
SMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxSMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxHaritikaChhatwal1
 
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...Milind Agarwal
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksdeepakthakur548787
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data VisualizationKianJazayeri1
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfblazblazml
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our WorldEduminds Learning
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Boston Institute of Analytics
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxMike Bennett
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesTimothy Spann
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxaleedritatuxx
 
convolutional neural network and its applications.pdf
convolutional neural network and its applications.pdfconvolutional neural network and its applications.pdf
convolutional neural network and its applications.pdfSubhamKumar3239
 

Dernier (20)

Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
6 Tips for Interpretable Topic Models _ by Nicha Ruchirawat _ Towards Data Sc...
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 
Networking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptxNetworking Case Study prepared by teacher.pptx
Networking Case Study prepared by teacher.pptx
 
INTRODUCTION TO Natural language processing
INTRODUCTION TO Natural language processingINTRODUCTION TO Natural language processing
INTRODUCTION TO Natural language processing
 
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
Data Analysis Project : Targeting the Right Customers, Presentation on Bank M...
 
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis ProjectBank Loan Approval Analysis: A Comprehensive Data Analysis Project
Bank Loan Approval Analysis: A Comprehensive Data Analysis Project
 
SMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptxSMOTE and K-Fold Cross Validation-Presentation.pptx
SMOTE and K-Fold Cross Validation-Presentation.pptx
 
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
Unveiling the Role of Social Media Suspect Investigators in Preventing Online...
 
Digital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing worksDigital Marketing Plan, how digital marketing works
Digital Marketing Plan, how digital marketing works
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
Principles and Practices of Data Visualization
Principles and Practices of Data VisualizationPrinciples and Practices of Data Visualization
Principles and Practices of Data Visualization
 
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdfEnglish-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
English-8-Q4-W3-Synthesizing-Essential-Information-From-Various-Sources-1.pdf
 
Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
Decoding the Heart: Student Presentation on Heart Attack Prediction with Data...
 
Semantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptxSemantic Shed - Squashing and Squeezing.pptx
Semantic Shed - Squashing and Squeezing.pptx
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 
convolutional neural network and its applications.pdf
convolutional neural network and its applications.pdfconvolutional neural network and its applications.pdf
convolutional neural network and its applications.pdf
 

Intuit's Data Mesh - Data Mesh Leaning Community meetup 5.13.2021

  • 1. Tristan Baker - linkedin.com/in/tristanbaker Suresh Raman - linkedin.com/in/ramansuresh Allison Bellah (in absentia) - linkedin.com/in/allisonbellah Data Mesh at Intuit May 13, 2021
  • 2. ©2021 Intuit Inc. All rights reserved. 2 Arriving at data mesh Our vision and four part strategy Now with 25% more parts! Q&A Fire away! Agenda
  • 4. ©2021 Intuit Inc. All rights reserved. 4 A brief history of data infrastructure
  • 5. ©2021 Intuit Inc. All rights reserved. 5 A brief history of data infrastructure
  • 6. ©2021 Intuit Inc. All rights reserved. 6 A brief history of data infrastructure
  • 7. ©2021 Intuit Inc. All rights reserved. 7 Today
  • 8. ©2021 Intuit Inc. All rights reserved. 8 What “we cannot scale” sounds like from our users Discovering Data ● Where can I find data about a particular thing (customer, company, etc)? ● Where can I find the data sourced from a particular product or service? Understanding Data ● Who can approve my access so that I can see samples of the data? ● What is the schema of the data? ● What is the business meaning and context of the data? ● Is this data related to other concepts? Is it joinable to other data? What is the meaning of the relationship? Trusting Data ● What system produces this data and at what latency? ● What other systems use this data? ● What is the quality of this data? Is it ‘clean’? ● Which team supports this data if it breaks? Publishing Data ● How do I describe my data so that others understand what it means and how to use it? ● Where do I host my data so that other systems can access it? ● Data systems are complicated, how can I build and operate my process on top of one? ● What are my operational responsibilities once my process/data is in production? ● How do I meet my compliance requirements for processing/storing/publishing data? ● Am I duplicating processing/data that already exists? Consuming Data ● How is this table/topic partitioned? ● Who can approve my production system to access it? ● Will I get alerted if the schema changes?
  • 9. ©2021 Intuit Inc. All rights reserved. 9 The future of data infrastructure ● Data treated as code ● Data service as a facet of a product ● Data responsibility decentralized ● Producers take responsibility for data ● Producers serve consumers ● Data platform provides the ecosystem to govern and manage the lifecycle of data and machine learning The provocation Data Mesh is born
  • 10. Our vision and four part strategy
  • 11. ©2021 Intuit Inc. All rights reserved. 11 Enable more Intuit teams to more easily use and create data
  • 12. ©2021 Intuit Inc. All rights reserved. 12 Four part strategy • Stewardship – ensures accountability for a set of defined responsibilities in building and managing their solutions; including adherence to a set of defined best practices to produce only high quality data. • Organizing people, code and data – A systematic approach to organizing the people, code and data which clearly identifies the owners of a business problem and its solution. • Self serve products – A rich suite of self serve products that enable teams to more easily author, deploy, govern and operate their own solutions, aided by automation and processes that support best practices and high quality as a precondition for deployment. • Rationalizing data definitions – A process for rationalizing all critical data definitions at the company so that data concepts like Customer, Product and Entitlement are unique, re-usable and non-conflicting.
  • 14. ©2021 Intuit Inc. All rights reserved. 14
  • 15. ©2021 Intuit Inc. All rights reserved. 15 Stewardship goals for next year
  • 17. ©2021 Intuit Inc. All rights reserved. 17 Raw information about physical systems that describes where the data is stored and where code is executing. This describes where data is physically located so that it can be accessed.
  • 18. ©2021 Intuit Inc. All rights reserved. 18 Basic dependency, ownership and classification information provides additional context about physical data and code locations so that data can be better governed, secured and operated by the owning teams.
  • 19. ©2021 Intuit Inc. All rights reserved. 19 Why organizing people, code and data matters 19 Private vs Public ~50% tables are either temp/sandbox/staging/test/backu p tables - Messes up search & discovery - Teams consume data not meant for external use Data Ownership ~50% tables don’t have clearly identified owners - Erodes Trust - Copies proliferate - Operational, Governance risk
  • 21. ©2021 Intuit Inc. All rights reserved. 21 Data Processing Capabilities Data Serving Capabilities
  • 22. ©2021 Intuit Inc. All rights reserved. 22 Self Serve goals for next year 100% of Top 20 tasks in the Data lifecycle are Self Serve Infra Provisioning ● Transactional Persistence ● Compute for stream, batch processing ● Monitor, Debug Infra ● Cost Data Authoring ● Events, Schemas ● Ingestion ● Transformations ● Entities ● ML Features ● Data Quality, Observability ● Orchestration Data Governance ● Access Management ● Key management ● Compliance Controls & Audit ● Privacy
  • 24. ©2021 Intuit Inc. All rights reserved. 24 Clean entity information with formally defined meaning and relationships enables better data understanding. This is the purpose of entity definitions. They ensure that data is clean, organized, connected, discoverable and documented in a formal way.
  • 25. ©2021 Intuit Inc. All rights reserved. 25 When you bring it all together, you get Intuit’s Data Mesh
  • 26. ©2021 Intuit Inc. All rights reserved. 26
  • 27. ©2021 Intuit Inc. All rights reserved. 27
  • 28. ©2021 Intuit Inc. All rights reserved. 28
  • 29. ©2021 Intuit Inc. All rights reserved. 29
  • 30. ©2021 Intuit Inc. All rights reserved. 30 Capturing meaning, relationship, ownership, and system dependencies builds a full, rich picture for everyone. No tribal knowledge needed! In this example, the clean information describes entities OII Account and Intuit Product and the Entitled To relationship between them. The basic information describes how the data for these entities are sourced from the Identity Universal Service and the Entitlement Reference Service. The raw information describes which Event Bus topic and Data Lake table the data for these entities can be found in.
  • 31. Q&A Tristan Baker - linkedin.com/in/tristanbaker Suresh Raman - linkedin.com/in/ramansuresh Allison Bellah (in absentia) - linkedin.com/in/allisonbellah
  • 32. ©2021 Intuit Inc. All rights reserved. 32 32