Architecting
Modern Data Platforms
by ankitrathi.com
Content
• Data Architecture Principles
• Data Lake Basics
• High Level Architecture
• Data Characteristics
• Putting It All Together
• Product-Driven Data Architecture
• Reference Architecture
Data Architecture Principles
• Adhere to ADDA (Accessibility, Definition, Decoupling, Agility)
• Design for RSM (Reliability, Scalability, Maintainability)
• Use Right Tools
• Cloud Native/Agnostic
• Be Cost Conscious
Adhere to ADDA
• Accessibility: Easily accessible data for business
• Definition: Data catalog for simplified data discovery
• Decoupling: Decoupled layers for flexibility
• Agility: Agile enough to cater to evolving business requirements
Design for RSM
• Reliability: works correctly, fault-tolerant
• Scalability: adapts to growth
• Maintainability: remains easy to maintain
Use Right Tools
• Data Structure: Structured, Semi-structured, Unstructured
• Latency: Low, Medium, High
• Throughput: High, Medium, Low
• Access Pattern: Key-value, Search, Transactions
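As an illustration of the four selection criteria above, a workload can be described along the same dimensions before any tool is chosen. This is only a minimal sketch: the `WorkloadProfile` class, its field names and the `clickstream` example are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from enum import Enum


class DataStructure(Enum):
    STRUCTURED = "structured"
    SEMI_STRUCTURED = "semi-structured"
    UNSTRUCTURED = "unstructured"


class Level(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


class AccessPattern(Enum):
    KEY_VALUE = "key-value"
    SEARCH = "search"
    TRANSACTIONS = "transactions"


@dataclass(frozen=True)
class WorkloadProfile:
    """Hypothetical record of the four criteria listed on the slide."""
    name: str
    structure: DataStructure
    latency: Level
    throughput: Level
    access_pattern: AccessPattern

    def summary(self) -> str:
        # One-line description that can be attached to a design document.
        return (
            f"{self.name}: {self.structure.value} data, "
            f"{self.latency.value} latency, "
            f"{self.throughput.value} throughput, "
            f"{self.access_pattern.value} access"
        )


# Assumed example workload, purely for illustration.
clickstream = WorkloadProfile(
    name="clickstream",
    structure=DataStructure.SEMI_STRUCTURED,
    latency=Level.LOW,
    throughput=Level.HIGH,
    access_pattern=AccessPattern.KEY_VALUE,
)
print(clickstream.summary())
```

Writing the profile down first keeps the tool discussion tied to requirements rather than to a preferred product.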
Cloud Native/Agnostic
Cloud Native
Pros:
• Better performance
• Better efficiency
• Lower costs (generic services)
Cons:
• Vendor lock-in
• Higher costs (specific services)
Cloud Agnostic
Pros:
• Flexibility
• Minimal vendor lock-in
• Standard performance
Cons:
• Underutilization of vendor capabilities
• Solution can become complex
• Performance, logging and monitoring can take a hit
Be Cost Conscious
• Efficient consumption of services
• Select cost-conscious options
• Enforce policies and controls
Data Lake
• Data Lake Definition
• An architectural approach
• Massive heterogeneous data stored centrally
• Available to diverse group of users
• To be categorized, processed, analyzed & consumed
• Data Lake Characteristics
• Structured, semi-structured & unstructured data
• Scaled out as required
• Diverse set of storage, analytics and ML/AI tools
• Designed for low-cost storage and analytics
High-Level Architecture
Data → Ingest → Store → Process/Analyse → Serve → Actionable Insights
Considerations across the stages: Latency, Throughput, Cost
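The four stages named above (Ingest, Store, Process/Analyse, Serve) can be sketched as small decoupled functions. This is a toy illustration under stated assumptions: the event shape (`user`, `amount`), the in-memory list standing in for a storage layer and all function names are hypothetical, not part of the deck.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List

Record = Dict[str, Any]


def ingest(raw_events: Iterable[Record]) -> List[Record]:
    """Ingest: accept events and drop ones missing the assumed fields."""
    return [e for e in raw_events if "user" in e and "amount" in e]


def store(records: List[Record], storage: List[Record]) -> None:
    """Store: append to the storage layer (a plain list stands in for it)."""
    storage.extend(records)


def process(storage: List[Record]) -> Dict[str, float]:
    """Process/Analyse: total the stored amounts per user."""
    totals: Dict[str, float] = defaultdict(float)
    for record in storage:
        totals[str(record["user"])] += float(record["amount"])
    return dict(totals)


def serve(totals: Dict[str, float], user: str) -> float:
    """Serve: answer a query from the processed result."""
    return totals.get(user, 0.0)


storage_layer: List[Record] = []
events = [
    {"user": "a", "amount": 10},
    {"user": "b", "amount": 5},
    {"user": "a", "amount": 2.5},
    {"oops": True},  # malformed event, filtered out at ingest
]
store(ingest(events), storage_layer)
insights = process(storage_layer)
print(serve(insights, "a"))
```

Because each stage only depends on the output of the previous one, any stage can be swapped for a different implementation without touching the others.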
Ingest
Source            Data Type          Data
Web/Mobile Apps   Records            Transactions
Databases         Records            Transactions
Logging           Search documents   Files
Logging           Log files          Files
Messaging         Messages           Events
IoT               Data Streams       Events
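The ingest table can also be held as a small lookup structure, for example to find which sources produce a given kind of data. The rows below are copied from the slide; the names `INGEST_SOURCES` and `sources_for_category` are hypothetical helpers introduced only for this sketch.

```python
from typing import Dict, List, Tuple

# (data type, data category) pairs per source, as listed on the slide.
# "Logging" appears twice on the slide, so each source maps to a list.
INGEST_SOURCES: Dict[str, List[Tuple[str, str]]] = {
    "Web/Mobile Apps": [("Records", "Transactions")],
    "Databases": [("Records", "Transactions")],
    "Logging": [("Search documents", "Files"), ("Log files", "Files")],
    "Messaging": [("Messages", "Events")],
    "IoT": [("Data Streams", "Events")],
}


def sources_for_category(category: str) -> List[str]:
    """Return the sources (sorted by name) that emit the given category."""
    return sorted(
        source
        for source, rows in INGEST_SOURCES.items()
        if any(row_category == category for _, row_category in rows)
    )


print(sources_for_category("Events"))
```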
Data Characteristics
               Hot         Warm      Cold
Volume         MB-GB       GB-PB     PB-EB
Item Size      B-KB        KB-MB     KB-TB
Latency        ms          ms, sec   min, hrs
Durability     Low-high    High      Very high
Request Rate   Very high   High      Low
Cost/GB        $$-$        $-¢¢      ¢¢-¢
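One way to apply the hot/warm/cold distinction is to classify a dataset by the latency its consumers can tolerate. The sketch below follows the Latency row of the table (ms for hot, up to seconds for warm, minutes or hours for cold), but the exact cut-offs of 0.1 s and 60 s are assumptions chosen for illustration; the slide gives no numeric thresholds.

```python
def temperature_for_latency(max_latency_seconds: float) -> str:
    """Classify data as hot, warm or cold from the tolerated read latency.

    The thresholds are assumed values, not taken from the slides.
    """
    if max_latency_seconds <= 0:
        raise ValueError("latency must be positive")
    if max_latency_seconds < 0.1:   # millisecond range
        return "hot"
    if max_latency_seconds < 60:    # up to seconds
        return "warm"
    return "cold"                   # minutes to hours


for latency in (0.005, 2, 3600):
    print(latency, temperature_for_latency(latency))
```

In practice the other rows (volume, request rate, cost per GB) would feed into the same decision; latency is used alone here to keep the example short.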
Data Characteristics
• Type of Data Structures
• Fixed Schema
• Schema Free
• Key-Value
• Type of Access Patterns
• Key-Value
• Simple relations (1:N, M:N)
• Multi-table joins, transactions
• Faceting, Search
Storage
• Storage options: In-memory, File Storage, NoSQL, SQL
• Data temperature: Hot data, Warm data, Cold data
• Structure: High to Low
• Request rate, Cost per GB: High (hot) to Low (cold)
• Latency, Data Volume: Low (hot) to High (cold)
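The storage options above can be tied to the access patterns and data temperature with a simple rule table. The mapping below is a deliberately simplified assumption for illustration, not a rule stated in the deck: cold data goes to file storage, hot key-value data goes in-memory, and everything else follows the access pattern. The pattern labels and the "Search index" family are hypothetical names.

```python
# Assumed default storage family per access pattern (illustrative only).
STORE_BY_ACCESS_PATTERN = {
    "key-value": "NoSQL",
    "simple-relations": "NoSQL",
    "joins-transactions": "SQL",
    "search": "Search index",
}
TEMPERATURES = ("hot", "warm", "cold")


def suggest_storage(access_pattern: str, temperature: str) -> str:
    """Suggest a storage family from access pattern and data temperature."""
    if access_pattern not in STORE_BY_ACCESS_PATTERN:
        raise ValueError(f"unknown access pattern: {access_pattern}")
    if temperature not in TEMPERATURES:
        raise ValueError(f"unknown temperature: {temperature}")
    if temperature == "cold":
        # Cold data: low request rate, lowest cost per GB.
        return "File Storage"
    if temperature == "hot" and access_pattern == "key-value":
        # Hot key-value lookups: lowest latency, highest cost per GB.
        return "In-memory"
    return STORE_BY_ACCESS_PATTERN[access_pattern]


print(suggest_storage("key-value", "hot"))
print(suggest_storage("joins-transactions", "warm"))
```

A real selection would weigh cost, durability and volume as well; the point of the sketch is only that the choice is driven by requirements, not by a single default store.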
Analytics Types
• Message/Stream Analysis
• Interactive Analysis
• Batch Analysis
• Machine Learning/AI
ETL Processing
Stages shown: Store, ETL, Process/Analyse
Serve
• Applications & APIs
• Analysis & Visualization
• Notebooks
• IDEs
Putting It All Together
• Sources: Web Apps, Mobile Apps, Data Centers, Logging, Messaging, Devices, Sensors
• Data types: Records, Documents, Files, Messages, Streams
• Ingest
• Store: Cache, NoSQL, SQL, ElasticSearch, Object Storage, SQS, Streams
• ETL
• Process/Analyse: ML/AI, Interactive, Batch, Message, Streams
• Serve: APIs, Analysis, Visualization, Notebooks, IDE
• Across all stages: Security & Governance, Data Catalog
Product-Driven Data Architecture
Reference: https://martinfowler.com/articles/data-monolith-to-mesh.html
Reference Architecture - Azure
Reference: https://docs.microsoft.com/en-us/azure/architecture/example-scenario/dataplate2e/data-platform-end-to-end
Reference Architecture - AWS
Reference: https://docs.aws.amazon.com/solutions/latest/data-lake-solution/architecture.html
Reference Architecture - GCP
Reference: https://cloud.google.com/solutions/big-data
Questions…?
Thank You
ankitrathi.com
