5. The Trends
● Realtime DWHs
● Analytics Engineering
● Data Mesh
● Data Catalogs
● Reverse ETL
● Headless BI
● Data Integrity
● Data Lakehouses
● DataOps
● White-label Data Viz
https://twitter.com/criccomini/status/1451557884769169412
https://preset.io/blog/reshaping-data-engineering/
13. Why Realtime DWHs?
● Debugging
○ Investigate application errors
○ Audit log shows how things changed
● Operational
○ Monitoring
○ Scripts that pull from DWH
● Security/Compliance
○ Audit log
● Customer data products
○ Ad hoc customer reports (e.g. Stripe Sigma, WePay txns)
○ Data clean rooms
14. Realtime DWHs Technical Advantages
● Handles hard deletes
● No schema requirements (e.g. no modified-timestamp columns needed on source tables)
● Replay from Kafka
● Data integration
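A minimal sketch of why a change stream handles hard deletes and replay. The event shape (`op`, `key`, `after`) and the in-memory `table` are hypothetical stand-ins for a CDC topic and a warehouse table, not any specific tool's format.

```python
from typing import Any

# Hypothetical change events: op is "c" (create), "u" (update) or "d" (delete).
EVENTS = [
    {"op": "c", "key": 1, "after": {"id": 1, "status": "new"}},
    {"op": "c", "key": 2, "after": {"id": 2, "status": "new"}},
    {"op": "u", "key": 1, "after": {"id": 1, "status": "paid"}},
    {"op": "d", "key": 2, "after": None},  # hard delete in the source DB
]


def apply_change(table: dict[int, dict[str, Any]], event: dict[str, Any]) -> None:
    """Apply one change event to the warehouse copy of the table."""
    op = event["op"]
    if op in ("c", "u"):
        # Upsert the full row image; no timestamp column is consulted.
        table[event["key"]] = event["after"]
    elif op == "d":
        # The delete arrives as an explicit event, so the row is removed.
        table.pop(event["key"], None)
    else:
        raise ValueError(f"unknown op: {op!r}")


def replay(events: list[dict[str, Any]]) -> dict[int, dict[str, Any]]:
    """Rebuild the table from scratch by replaying the retained log."""
    table: dict[int, dict[str, Any]] = {}
    for event in events:
        apply_change(table, event)
    return table


warehouse = replay(EVENTS)
```

A polling loader that selects rows by a modified timestamp never sees row 2 disappear; here the delete is just another event, and replaying the same log always rebuilds the same table.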
15. Realtime DWH Drawbacks
● Operationally complex
● Depends on source DB support (for CDC)
● Inline transformation is harder
● Fixing bad data is harder
17. Data Mesh
“A data mesh is a type of data platform architecture that embraces the ubiquity of
data in the enterprise by leveraging a domain-oriented, self-serve design.”
● Domain-oriented decentralized data ownership and architecture
● Data as a product
● Self-serve data infrastructure as a platform
● Federated computational governance
https://towardsdatascience.com/what-is-a-data-mesh-and-how-not-to-mesh-it-up-210710bb41e0
wat.
19. Data is a Product
“A product is any item or service you sell to serve a customer's need or want.”
https://www.aha.io/roadmapping/guide/product-management/what-is-a-product
20. Data is a Product
● Customers
○ Data scientists
○ Business analysts
○ Finance
○ Sales
○ Product managers
○ Engineers
○ External customers
● Products
○ Recommender systems
○ Billing
○ Fraud
○ Reports
○ Dashboards
https://cnr.sh/essays/what-the-heck-data-mesh
24. Metrics then
● BI tools to create and visualize metrics
○ Looker
○ Mode
○ Tableau
○ Data Studio
● Answer internal business questions
○ How healthy is a product?
○ What does revenue look like?
25. Metrics now
● Metrics matter for external business workflows
○ Predicting when a customer might churn
○ Notifying users when they reach their capacity limit
○ Building data science models that optimize certain metrics
○ Computing customer bills
● BI tools aren’t meant for this
○ Walled garden
○ Have to re-implement the same metrics in different systems
26. Headless BI
“For data consumption, we heard complaints from decision makers that different
teams reported different numbers for very simple business questions, and
there was no easy way to know which number was correct.”
“...the teams that own metrics would be able to define them once, in a way
that’s consistent across dashboards, automation tools, sales reporting, and so
on. Let’s call this ‘Headless BI’.”
https://medium.com/airbnb-engineering/how-airbnb-achieved-metric-consistency-at-scale-f23cc53dea70
https://basecase.vc/blog/headless-bi
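A rough sketch of the "define them once" idea: one metric definition, several consumers. `Metric`, `REGISTRY`, `query`, and the two consumers are hypothetical names for illustration, not the API of any headless BI product.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Metric:
    name: str
    description: str
    compute: Callable[[list[dict]], float]


# Hypothetical central registry: the owning team defines each metric once.
REGISTRY: dict[str, Metric] = {}


def register(metric: Metric) -> None:
    if metric.name in REGISTRY:
        raise ValueError(f"metric already defined: {metric.name}")
    REGISTRY[metric.name] = metric


def query(name: str, rows: list[dict]) -> float:
    """Single entry point that every consumer calls."""
    return REGISTRY[name].compute(rows)


register(Metric(
    name="revenue",
    description="Sum of paid order amounts",
    compute=lambda rows: sum(r["amount"] for r in rows if r["status"] == "paid"),
))


# Two consumers that would otherwise each re-implement "revenue".
def dashboard_tile(rows: list[dict]) -> str:
    return f"Revenue: {query('revenue', rows):.2f}"


def billing_job(rows: list[dict]) -> float:
    return query("revenue", rows)


ORDERS = [
    {"amount": 10.0, "status": "paid"},
    {"amount": 20.0, "status": "paid"},
    {"amount": 5.0, "status": "refunded"},
]
tile = dashboard_tile(ORDERS)
bill = billing_job(ORDERS)
```

Because both consumers go through `query`, a change to how refunds are treated lands in the dashboard and in billing at the same time.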
29. Q&A
● Realtime DWHs
● Analytics Engineering
● Data Mesh
● Data Catalogs
● Reverse ETL
● Headless BI
● Data Integrity
● Data Lakehouses
● DataOps
● White-label Data Viz
31. Analytics Engineering
Analytics engineers provide clean data sets to end users, modeling data in a way
that empowers end users to answer their own questions.
While a data analyst spends their time analyzing data, an analytics engineer
spends their time transforming, testing, deploying, and documenting data.
Analytics engineers apply software engineering best practices like version
control and continuous integration to the analytics code base.
https://www.getdbt.com/what-is-analytics-engineering
32. Analytics Engineering
● Job
○ Building
○ Testing
○ Cataloging
● Tools
○ dbt
○ Airflow
● Customers
○ Data science
○ Data analysts
○ BI
○ Reporting
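A small sketch of the "Testing" part of the job: column checks in the spirit of the not-null and uniqueness tests commonly run against models. This is plain Python over a hypothetical `customers` data set, not dbt itself; the function names are assumptions.

```python
def not_null(rows: list[dict], column: str) -> list[dict]:
    """Return the rows where the column is missing or None."""
    return [r for r in rows if r.get(column) is None]


def unique(rows: list[dict], column: str) -> list[dict]:
    """Return the rows whose column value was already seen earlier."""
    seen = set()
    duplicates = []
    for r in rows:
        value = r.get(column)
        if value in seen:
            duplicates.append(r)
        seen.add(value)
    return duplicates


# Hypothetical cleaned data set an analytics engineer would publish.
customers = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 2, "email": "c@example.com"},
]

failures = {
    "not_null(email)": not_null(customers, "email"),
    "unique(id)": unique(customers, "id"),
}
```

Running checks like these in continuous integration lets a failing test block a deploy, the same way a failing unit test would in an application code base.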
33. Data Catalogs
● Flavor of the month
○ Amundsen
○ DataHub
○ Metaphor
○ Marquez
○ Atlan
○ Collibra
○ Alation
● Use cases
○ Discoverability
○ Operations
○ Governance
34. Reverse ETL
“Reverse ETL syncs data from a system of records like a warehouse to a system
of actions like CRM, MAP, and other SaaS apps to operationalize data.”
https://blog.getcensus.com/what-is-reverse-etl/
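A minimal sketch of the sync step described in the quote, assuming the warehouse rows and the CRM records are both available as lists of dicts. `plan_sync` and the `email` key are hypothetical; a real sync would call the CRM's API with the resulting plan.

```python
def plan_sync(
    warehouse_rows: list[dict],
    crm_records: list[dict],
    key: str = "email",
) -> tuple[list[dict], list[dict]]:
    """Diff warehouse rows against CRM records and return (to_create, to_update)."""
    existing = {record[key]: record for record in crm_records}
    to_create, to_update = [], []
    for row in warehouse_rows:
        current = existing.get(row[key])
        if current is None:
            to_create.append(row)   # not in the CRM yet
        elif current != row:
            to_update.append(row)   # in the CRM but out of date
    return to_create, to_update


# Hypothetical warehouse model and current CRM state.
warehouse_rows = [
    {"email": "a@example.com", "plan": "pro"},
    {"email": "b@example.com", "plan": "free"},
    {"email": "c@example.com", "plan": "pro"},
]
crm_records = [
    {"email": "a@example.com", "plan": "pro"},
    {"email": "b@example.com", "plan": "trial"},
]

to_create, to_update = plan_sync(warehouse_rows, crm_records)
```

Sending only the difference keeps the number of CRM writes small, which matters when the downstream SaaS API is rate limited.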