Data Mesh is a decentralized architecture in which the unit of architecture is a domain-driven data set treated as a product. Each data product is owned by the domain or team that knows the data most intimately, whether by creating it or by consuming and re-sharing it, with specific roles carrying the accountability and responsibility to provide that data as a product. Complexity is abstracted away into a self-serve infrastructure layer so that these products can be created much more easily.
2. e-Commerce: Platform as a service
Diagram: producers (creators of the product offerings) and consumers (buyers of the product offerings) meet on the platform; the platform owner/provider creates the interfaces for the platform.
4. Data Landscape
• Operational data sits in databases behind business capabilities served with microservices, has a
transactional nature, keeps the current state and serves the needs of the applications running the
business.
• Analytical data is a temporal and aggregated view of the facts of the business over time, often
modeled to provide retrospective or future-perspective insights; it trains the ML models or feeds the
analytical reports.
9. Data Mesh: Architecture Principles
• Domain-oriented decentralized data ownership and architecture
• Data as a product
• Federated computational governance
• Self-serve data infrastructure as a platform
10. Data Mesh Addressing Dimensions
• Changes in the data landscape
• Proliferation of sources of data
• Diversity of data use cases and users
• Speed of response to change
11. Data Mesh: Product Owner
• Delivering data as a product
• Objective measures (see the sketch below)
• data quality
• decreased lead time of data consumption
• data user satisfaction
• Those closest to the data are best equipped to manage it capably
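The following is a minimal sketch of how a data product owner might track these objective measures. The class name, field names, and thresholds are illustrative assumptions, not part of any specific data mesh platform.

```python
# Illustrative sketch of a data product owner's objective measures.
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ProductMetrics:
    rows_total: int
    rows_failing_quality_checks: int
    source_event_time: datetime          # when the fact occurred in the source domain
    available_to_consumers_at: datetime  # when the data product exposed it
    satisfaction_scores: list[float]     # e.g. survey scores from data users, 1-5

    @property
    def data_quality(self) -> float:
        """Share of rows passing the domain's quality checks."""
        return 1 - self.rows_failing_quality_checks / max(self.rows_total, 1)

    @property
    def lead_time_seconds(self) -> float:
        """Lead time of data consumption: source event to availability."""
        return (self.available_to_consumers_at - self.source_event_time).total_seconds()

    @property
    def user_satisfaction(self) -> float:
        """Average satisfaction reported by data users."""
        return sum(self.satisfaction_scores) / max(len(self.satisfaction_scores), 1)


metrics = ProductMetrics(
    rows_total=10_000,
    rows_failing_quality_checks=12,
    source_event_time=datetime(2023, 5, 1, 8, 0),
    available_to_consumers_at=datetime(2023, 5, 1, 8, 30),
    satisfaction_scores=[4.0, 4.5, 3.5],
)
print(metrics.data_quality, metrics.lead_time_seconds, metrics.user_satisfaction)
```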
12. Data Product: Attributes
• Discoverable. Easy to find, e.g., through natural-language search.
• Addressable. Easy to access (once found), assuming the end user has permissions. If they don’t have
permissions, it’s vital they have a means to request access, or work with someone granted access.
• Trustworthy and truthful. Signals around the quality and integrity of the data are essential if people
are to understand and trust it. Data provenance and lineage, for example, clarify an asset’s origin and
past usages, important details for a newcomer to understand and trust that asset. Data observability —
comprising identifying, troubleshooting, and resolving data issues — can be achieved through quality
testing built by teams within each domain.
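Below is a minimal sketch of the kind of domain-owned quality tests a producing team might run before publishing a data product. The checks and the sample `orders` records are illustrative assumptions, not a specific observability tool's API.

```python
# Illustrative domain-owned data quality checks, surfaced as trust signals.
from datetime import datetime, timedelta

orders = [
    {"order_id": "o-1", "amount": 42.0, "created_at": datetime.utcnow()},
    {"order_id": "o-2", "amount": 17.5, "created_at": datetime.utcnow() - timedelta(hours=2)},
]


def check_not_null(rows, field):
    """Every row must carry a value for `field`."""
    return all(row.get(field) is not None for row in rows)


def check_unique(rows, field):
    """`field` must be unique across the data set (e.g. a primary key)."""
    values = [row[field] for row in rows]
    return len(values) == len(set(values))


def check_freshness(rows, field, max_age):
    """The newest record must be no older than `max_age` (timeliness signal)."""
    newest = max(row[field] for row in rows)
    return datetime.utcnow() - newest <= max_age


results = {
    "order_id not null": check_not_null(orders, "order_id"),
    "order_id unique": check_unique(orders, "order_id"),
    "fresh within 24h": check_freshness(orders, "created_at", timedelta(hours=24)),
}
print(results)
```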
13. Data Product: Attributes
• Self-describing. The data must be easily understood and consumed — e.g., through data schemas,
wiki-like articles, and other crowdsourced feedback, like deprecations or warnings.
• Interoperable and governed by global standards. With different teams responsible for data,
governance will be federated (more on this later). But everyone must still abide by a global set of rules
that reflect current regulations, including geography-specific requirements.
• Secure and governed by global access control. Users must be able to access data securely — e.g.,
through RBAC policy definition (sketched below).
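A minimal sketch of role-based access control (RBAC) for a single data product follows. The roles, actions, and policy structure are illustrative assumptions; real platforms typically express such policies declaratively and enforce them through the platform itself.

```python
# Illustrative RBAC policy for one data product.
POLICY = {
    "data_product": "orders.monthly_revenue",
    "roles": {
        "analyst": {"read"},
        "domain_engineer": {"read", "write"},
        "auditor": {"read"},
    },
}


def is_allowed(role: str, action: str, policy: dict = POLICY) -> bool:
    """Return True if `role` may perform `action` on the data product."""
    return action in policy["roles"].get(role, set())


assert is_allowed("analyst", "read")
assert not is_allowed("analyst", "write")
```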
15. Self-Serve Data Infrastructure as a Platform: Persona Benefits
• For producers: Producers need a place to manage their data products (store, create, curate, destroy,
etc.) and make those products accessible to consumers.
• For consumers: Consumers need a place to find data products, within a UI that guides how to use
these products compliantly and successfully.
16. Technology planes of a self-service data mesh
• Plane 1: Data Infrastructure Plane. Addresses networking, storage, access control. Examples include
public cloud vendors like AWS, Azure, and GCP.
• Plane 2: Data Product Developer Experience Plane. This plane uses “declarative interfaces to manage
the lifecycle of a data product” to help developers, for example, build, deploy, and monitor data
products (a minimal spec sketch follows this list). This is relevant to many development environments,
depending on the underlying repository, e.g., SQL for cloud data warehouses.
• Plane 3: Mesh Supervision Plane. This is a consumer-facing place to discover & explore data products,
curate data, manage security policies, etc. While some may call it a data marketplace, others see the
data catalog as the mesh supervision plane. Simply put, this plane addresses the consumer needs
discussed above: discoverability, trustworthiness, etc. And this is where the data catalog plays a role.
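As a rough illustration of the declarative interface mentioned for Plane 2, here is a sketch of a data product specification expressed as plain Python data. Every field name is an illustrative assumption, not a standard schema; the point is that the domain team declares intent and the platform provisions the rest.

```python
# Illustrative declarative data product specification for a developer
# experience plane ("create/deploy/monitor this product").
data_product_spec = {
    "name": "orders.monthly_revenue",
    "domain": "orders",
    "owner": "orders-data-product-owner@example.com",
    "inputs": [
        {"type": "event_stream", "source": "orders.order_placed"},
    ],
    "output_ports": [
        {"type": "table", "format": "parquet", "schema": "monthly_revenue_v1"},
    ],
    "slo": {"freshness_hours": 24, "max_error_rate": 0.01},
    "access": {"default": "deny", "grants": [{"role": "analyst", "action": "read"}]},
}

# The platform, not the domain team, would interpret this spec and provision
# storage, pipelines, access policies and monitoring accordingly.
```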
17. Data Domains
• Domain oriented data decomposition and ownership
• Source oriented domain data
• systems of reality
• truths of their business domain
• raw data at the point of creation
• Consumer oriented and shared domain data
• Distributed pipelines as domain internal implementation
• Service Level Objectives for the quality of the data it provides: timeliness, error rates, etc.
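The following is a minimal sketch of evaluating such Service Level Objectives for a domain's data product. The SLO targets and observed measurements are illustrative assumptions.

```python
# Illustrative SLO evaluation for a data product (timeliness, error rate).
from datetime import timedelta

slo = {
    "max_staleness": timedelta(hours=6),   # timeliness target
    "max_error_rate": 0.005,               # share of records failing validation
}

observed = {
    "staleness": timedelta(hours=2),
    "error_rate": 0.001,
}

violations = []
if observed["staleness"] > slo["max_staleness"]:
    violations.append("timeliness")
if observed["error_rate"] > slo["max_error_rate"]:
    violations.append("error rate")

print("SLOs met" if not violations else f"SLO violations: {violations}")
```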
18. Data Mesh Implementation
• As such, a data mesh implementation “requires a governance model that embraces decentralization
and domain self-sovereignty, interoperability through global standardization, a dynamic topology, and,
most importantly, automated execution of decisions by the platform.” In this way, a tension arises:
which rules are universal, and which are local to a domain? Which practices are universal, and which must be
tailored by domain?
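To make "automated execution of decisions by the platform" concrete, here is a minimal sketch of one globally standardized rule, masking fields marked as sensitive when the consumer sits outside the owning domain, applied uniformly to every data product. The field names, sensitivity labels, and masking rule are illustrative assumptions.

```python
# Illustrative global governance rule executed automatically by the platform.
def apply_global_policy(record: dict, field_sensitivity: dict,
                        consumer_domain: str, owning_domain: str) -> dict:
    """Mask sensitive fields when the consumer is outside the owning domain."""
    if consumer_domain == owning_domain:
        return record
    return {
        field: ("***" if field_sensitivity.get(field) == "pii" else value)
        for field, value in record.items()
    }


sensitivity = {"customer_email": "pii", "order_total": "public"}
record = {"customer_email": "jane@example.com", "order_total": 99.0}

print(apply_global_policy(record, sensitivity,
                          consumer_domain="marketing", owning_domain="orders"))
# {'customer_email': '***', 'order_total': 99.0}
```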
19. Paradigm Shift: A New Language
Pre data mesh governance aspect → Data mesh governance aspect
• Centralized team → Federated team
• Responsible for data quality → Responsible for defining how to model what constitutes quality
• Responsible for data security → Responsible for defining aspects of data security, i.e. data sensitivity levels, for the platform to build in and monitor automatically
• Responsible for complying with regulation → Responsible for defining the regulation requirements for the platform to build in and monitor automatically
• Centralized custodianship of data → Federated custodianship of data by domains
• Responsible for global canonical data modeling → Responsible for modeling polysemes - data elements that cross the boundaries of multiple domains
• Team is independent from domains → Team is made of domain representatives
• Aiming for a well defined static structure of data → Aiming for enabling effective mesh operation embracing a continuously changing and dynamic topology of the mesh
• Centralized technology used by monolithic lake/warehouse → Self-serve platform technologies used by each domain
• Measure success based on number or volume of governed data (tables) → Measure success based on the network effect - the connections representing the consumption of data on the mesh
• Manual process with human intervention → Automated processes implemented by the platform
• Prevent error → Detect error and recover through platform's automated processing
20. Principles underpinning Data mesh
• Domain-oriented decentralized data ownership and architecture: so that the ecosystem creating and consuming data can scale out as the number of sources of data, number of use cases, and diversity of access models to the data increases; simply increase the autonomous nodes on the mesh.
• Data as a product: so that data users can easily discover, understand and securely use high quality data with a delightful experience; data that is distributed across many domains.
• Self-serve data infrastructure as a platform: so that the domain teams can create and consume data products autonomously using the platform abstractions, hiding the complexity of building, executing and maintaining secure and interoperable data products.
• Federated computational governance: so that data users can get value from aggregation and correlation of independent data products - the mesh is behaving as an ecosystem following global interoperability standards; standards that are baked computationally into the platform.
21. Paradigm Shift: A New Language
• Serving over Ingesting
• Discovering and using over Extracting and loading
• Publishing events as streams over flowing data around via centralized pipelines
• Ecosystem of data products over centralized data platform
22. Data Mesh: Architecture Principles
• Domain-oriented decentralized data ownership and architecture
• Data as a product
• Federated computational governance
• Self-serve data infrastructure as a platform
Diagram: as producers and consumers proliferate, the mesh scales out.
Data Landscape
Data flows from the operational data plane to the analytical plane, and back to the operational plane.
The first generation: proprietary enterprise data warehouse and business intelligence platforms; solutions with large price tags that have left companies with equally large amounts of technical debt; technical debt in thousands of unmaintainable ETL jobs, tables and reports that only a small group of specialized people understand, resulting in an under-realized positive impact on the business.
The second generation: big data ecosystem with a data lake as a silver bullet; complex big data ecosystems and long-running batch jobs operated by a central team of hyper-specialized data engineers have created data lake monsters that have at best enabled pockets of R&D analytics; over-promised and under-realized.
The third and current generation data platforms are more or less similar to the previous generation, with a modern twist towards (a) streaming for real-time data availability with architectures such as Kappa, (b) unifying batch and stream processing for data transformation with frameworks such as Apache Beam, and (c) fully embracing cloud-based managed services for storage, data pipeline execution engines and machine learning platforms. It is evident that the third generation data platform addresses some of the gaps of the previous generations, such as real-time data analytics, and reduces the cost of managing big data infrastructure. However, it suffers from many of the underlying characteristics that led to the failures of the previous generations.
https://www.alation.com/blog/data-mesh-vs-data-fabric/
A Data Swamp, in contrast, has little or no organizing system. Data Swamps have no curation: little to no active management throughout the data life cycle, and little to no contextual metadata or Data Governance. As a result, Data Swamps are of little use, or unusable and frustrating.
Data Mesh: A culture
Data mesh inverts this model with domain-driven design and product thinking. Responsibilities are distributed to the people who are closest to the data. These product owners are responsible for delivering data as a product and, as such, they are accountable for objective measures, including “data quality, decreased lead time of data consumption, and general data user satisfaction…”
A data catalog is essential for “Data as a product” capabilities.
Previous data architectures failed to address scale in other dimensions: changes in the data landscape, proliferation of sources of data, diversity of data use cases and users, and speed of response to change. Data mesh addresses these dimensions, founded in four principles: domain-oriented decentralized data ownership and architecture, data as a product, self-serve data infrastructure as a platform, and federated computational governance. Each principle drives a new logical view of the technical architecture and organizational structure.
By “responsibility,” we mean the manipulation (creation and transformation), maintenance, and distribution of the data to the consumers who need it within the organization. This stands in contrast to the de facto models of data ownership (lakes and warehouses), in which the people responsible for the data infrastructure are also responsible for serving the data.
Data mesh supporters argue that this centralized model is no longer tenable in the expanding data universe of the enterprise. As data landscapes grow more wild, vast, and complex, centralized data ownership has become unwieldy and impossible to scale.
FAIR emphasizes that data must be Findable, Accessible, Interoperable, and Reusable to benefit humans and machines alike.
https://www.nature.com/articles/sdata201618
Code: it includes (a) code for data pipelines responsible for consuming, transforming and serving upstream data - data received from the domain's operational system or an upstream data product; (b) code for APIs that provide access to data, semantic and syntax schema, observability metrics and other metadata; (c) code for enforcing traits such as access control policies, compliance, provenance, etc.
Data and Metadata: well, that's what we are all here for - the underlying analytical and historical data in a polyglot form. Depending on the nature of the domain data and its consumption models, data can be served as events, batch files, relational tables, graphs, etc., while maintaining the same semantic. For data to be usable there is an associated set of metadata including documentation, semantic and syntax declaration, quality metrics, etc.: metadata that is intrinsic to the data, e.g. its semantic definition, and metadata that communicates the traits used by computational governance to implement the expected behavior, e.g. access control policies.
Infrastructure: The infrastructure component enables building, deploying and running the data product's code, as well as storage and access to big data and metadata.
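As a rough illustration of the three structural components described above (code, data and metadata, infrastructure), here is a minimal sketch grouping them into one data product unit. The class and field names are illustrative assumptions, not a reference implementation.

```python
# Illustrative grouping of a data product's code, data/metadata and
# infrastructure declaration into a single unit.
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class DataProduct:
    # Code: pipeline logic (plus, in practice, serving APIs and policy hooks)
    pipeline: Callable[[list[dict]], list[dict]]
    # Data and metadata: the served data plus descriptive/governance metadata
    data: list[dict] = field(default_factory=list)
    metadata: dict = field(default_factory=lambda: {
        "schema": {}, "quality_metrics": {}, "access_policy": {}, "lineage": []})
    # Infrastructure: a declaration of what the platform should provision
    infrastructure: dict = field(default_factory=lambda: {
        "storage": "object_store", "compute": "stream_processor"})

    def serve(self, upstream: list[dict]) -> list[dict]:
        """Run the pipeline over upstream data and expose the result."""
        self.data = self.pipeline(upstream)
        return self.data


product = DataProduct(pipeline=lambda rows: [r for r in rows if r.get("valid", True)])
print(product.serve([{"id": 1, "valid": True}, {"id": 2, "valid": False}]))
```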