6. What is Domain Driven Design?
It is an approach to building software that has complex and ever-changing business requirements.
8. Key Concepts
Domain
A sphere of knowledge, influence, or activity. What an organisation does, and the world it does it in.
Subdomains
A system of abstractions describing selected aspects of a Domain, used to solve problems related to that Domain.
9. Key Concepts
Ubiquitous Language
A Language structured around the Subdomain and used by all team members.
Bounded Context
An explicit boundary within which a Subdomain exists. Inside, all terms have a specific meaning and are part of the Ubiquitous Language.
12. Event Storming
Event Storming is a workshop format for quickly exploring complex business domains.
http://ziobrando.blogspot.com/2013/11/introducing-event-storming.html
By Henning Schwentner - Own work, CC BY-SA 4.0,
https://commons.wikimedia.org/w/index.php?curid=57766348
13. Key Concepts
Domain Event
Something that happened, of interest to a domain expert.
Command
An external instruction to do something. It triggers a Domain Event.
Aggregate
The portion of the system that receives Commands and decides whether or not to execute them, thus producing a Domain Event.
20. Where to start?
● Pace of change
● Performance
● Security
● Team structure
● Dependencies (internal or external)
● Technology
Identify Business Value
25. “Observability is the ability to interrogate your system and get accurate answers that improve your understanding of it”
26. Tracing
Monitoring
Alerts
Logging
No Logs by Sweet Farm from the Noun Project
Alert by ✦ Shmidt Sergey ✦ from the Noun Project
traceability by Timofey Rostilov from the Noun Project
29. Monitoring
● What’s relevant for each microservice and for the entire system?
● It may be a combination of data points rather than single ones.
● Start with something simple and evolve as things get more complicated.
Allows you to make informed decisions
37. Take Away
● Start by identifying business domains; DDD can help you with this.
● Don’t wait until you have the perfect list of domains; they will change over time.
● Prioritize decoupling based on business value.
● Include operational concerns in your backlog (logging, monitoring, alerting & tracing)
○ Be intentional about things you log, monitor, and alert on
○ Proactively iterate and eliminate noise
○ Evaluate and use existing tools and libraries (ELK stack, OpenTracing, Honeycomb, Datadog, Opsgenie, etc)
Microservices is really not a free lunch!
Hi everyone, thank you for coming to this talk. Before I start let me introduce myself: my name is Maria Gomez, I’m the Head of Technology at ThoughtWorks Spain, and I have spent the last 4 years helping organizations on their transformation journey. This talk contains experiences and learnings that I gathered over that period, so I hope what I’ll say resonates with you, either because you have already lived a similar situation or because you’re in the middle of one. We are going to be talking about microservices from two different angles: first, how to use DDD to decide how to break down the monolith, and second, some operational concerns you should look at for these new microservices.
So let’s start
We have a monolith that is not working for us anymore, so we would like to break it down into smaller pieces that are autonomous and that allow us to deliver value to our customers faster.
That's the dream. But this is a very naïve scenario; in reality things are a bit more complex.
Because organizations have very complex systems and delivery pressure. There was probably an initiative or two that tried to separate part of the system, failed, and left things in a slightly worse place. There are decisions that seemed right at the time but now give us a lot of coupling and accidental complexity, and so on. And untangling all of this into nice, small, autonomous microservices doesn’t look that easy anymore. We don’t know where to start; we don’t know how to keep, or even better increase, the confidence in our system. And, the biggest of all, we don’t know how to talk to the business to make them understand that we need to do something about this and we need time.
Does this picture sound familiar to any of you?
There are various antipatterns in here:
DB integration, chatty services, lack of deployment independence, and decisions being taken by technical teams without understanding the business impact, which can lead to badly defined services that can’t be truly autonomous.
So what I do is help teams take a step back and allow them to use Domain Driven Design techniques as a way of learning more about the business processes, and therefore being able to untangle the different business capabilities. And, in time, extract them into microservices.
So, how can DDD help us?
Domain Driven Design, in a nutshell, is an approach to building software that has complex and ever-changing business requirements. And the emphasis here is on the “business” requirements.
There are many books on the subject, some of them more theoretical, some more practical. Regardless, it’s a topic that has been around for some time and is again getting a lot of traction.
A Customer may not represent the same thing in Payments as in Credit Card applications, for instance.
Your bounded context should look like this. We are within the ecommerce domain, for instance, and we have Sales and Support subdomains. They both have Customer and Product as entities, but in each subdomain their meaning is different; they may have different attributes and perform different actions. A Customer in the Sales context would contain some financial information, whereas in the Support context they may be able to interact with their open tickets.
The power of DDD is that it allows you to draw these boundaries and create specialization within them, so you can have a team (Product and Tech) specialized in Sales that evolves that space more or less autonomously.
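To make the bounded-context idea concrete, here is a minimal Python sketch of the same term, Customer, modeled differently in the Sales and Support contexts. The class names, attributes, and sample data are illustrative assumptions, not from the talk:

```python
# Sketch: the same term "Customer" modeled separately per bounded context.
# Attributes and names are illustrative assumptions.
from dataclasses import dataclass, field


@dataclass
class SalesCustomer:
    """Customer inside the Sales context: carries financial information."""
    customer_id: str
    name: str
    payment_method: str


@dataclass
class SupportCustomer:
    """Customer inside the Support context: interacts with open tickets."""
    customer_id: str
    name: str
    open_tickets: list = field(default_factory=list)

    def raise_ticket(self, summary: str) -> None:
        self.open_tickets.append(summary)


# The same real-world person, but each context keeps only the attributes
# and behavior relevant to its own Ubiquitous Language.
sales_view = SalesCustomer("c-1", "Ada", payment_method="visa")
support_view = SupportCustomer("c-1", "Ada")
support_view.raise_ticket("Order never arrived")
```

Keeping the two models separate (rather than sharing one "Customer" class) is what lets each team evolve its context autonomously.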
Bringing that over to our monolith problem, we can use DDD to identify our own subdomains and bounded contexts, focus on isolating them in the monolith, and then extract them into their own microservices.
There are many ways of doing this. I’m going to show you a couple that I have used in the past.
Regardless of the how, the most important thing for identifying contexts and creating this common language is to involve people from across the organization in these exercises (product owners, infrastructure people, developers, etc.).
That’s when you can do another exercise a bit more complex, Event Storming. Created by Alberto Brandolini
The basic idea is to bring together software developers and domain experts and learn from each other. You use stickies and walls, which helps make this learning easier.
These few concepts are used in Event Storming and are also part of the Domain Driven Design vocabulary
Domain Event: User has added an article to their basket
Command: Add item to basket
Aggregate: Basket
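The basket example above, combined with the Command/Aggregate/Domain Event definitions, can be sketched as a minimal Python snippet. Class and field names are illustrative assumptions, not from the talk:

```python
# Minimal sketch of the Command -> Aggregate -> Domain Event flow.
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class AddItemToBasket:          # Command: an external instruction
    basket_id: str
    article_id: str


@dataclass(frozen=True)
class ItemAddedToBasket:        # Domain Event: something that happened
    basket_id: str
    article_id: str


@dataclass
class Basket:                   # Aggregate: decides whether to execute
    basket_id: str
    items: List[str] = field(default_factory=list)

    def handle(self, cmd: AddItemToBasket) -> ItemAddedToBasket:
        # The aggregate checks its invariants before producing the event.
        if cmd.basket_id != self.basket_id:
            raise ValueError("command routed to the wrong basket")
        self.items.append(cmd.article_id)
        return ItemAddedToBasket(cmd.basket_id, cmd.article_id)


basket = Basket("b-1")
event = basket.handle(AddItemToBasket("b-1", "article-42"))
```

Note that the event is produced only after the aggregate decides the command is valid, which is exactly the relationship the three definitions describe.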
Cinema website. Customers go and buy tickets for their shows.
Event Storming talks about getting people from different parts of the organization interested in this buying-a-cinema-ticket flow and starting to talk about the process. Like we said before: POs, developers, subject matter experts, etc.
The process could start at either end of the flow: what's the first event that happened, or what's the last one?
If you're lucky, like in this example, you know both the first event and the last one, so the group just needs to fill in the gaps.
Another example, something I did quite recently. This team is in charge of the whole checkout process for an online retailer, and we ran this exercise to start exploring the different subdomains within checkout.
I’ve simplified the flow quite a lot, but in essence this is what you do
We identified multiple subdomains. The boundaries were a bit fuzzy at the end, and that’s fine because we knew we were not going to get it right the first time and this is an exercise we will keep running every so often to keep validating our assumptions and the boundaries.
But in a nutshell, once you have identified subdomains, you just need to isolate them within the monolith (I can’t stress this enough: make the context independent inside your monolith, remove all database integrations) and only then extract them into a service.
Because….
So, if we go back to the retail example, we looked at the different contexts identified, and obviously we couldn’t tackle them all at the same time; we needed to find a way of prioritizing them and choosing one. In this case, we knew that the business wanted to improve the delivery experience customers had and wanted to experiment with a few things, so the Delivery context was the one chosen to start working on. But there are many dimensions to deciding where to start.
So, now, you have a new set of tools you can use to untangle your complex system and extract business domains into microservices
With this, we now have a framework to identify subdomains and extract them into services based on the business value they provide.
This is where problems can start, when we underestimate what’s needed for creating new services
We’ve just looked at the tip of the iceberg (what to extract), which is very complex, but there are tons of other things under the surface we need to take care of as we start creating new services. And these are the areas people get wrong time and time again.
Microservices, or any distributed architecture, may reduce the complexity of your code, but they will increase the complexity of something else. Your system will still be complex, just in a different way.
Testing, deployment and operations are just three of those things, and each of them has enough material for another 1-hour talk. Actually, we’ve seen them covered in some of the talks today.
The advantage you have is that by applying DDD you now understand much better your domain and business needs and will be able to make better decisions to solve all the concerns below the surface.
For the rest of this talk, we’re focusing on Operations, and more specifically on a concept called Observability
It encompasses any question you might want to ask your system, not limited to errors or anomalies.
Why is observability so important now? Because our systems are increasingly complex, so it’s not enough to react to failures gracefully. It’s also important to be able to understand and ask questions that give insight into what our system is up to. Not just the software we built, but also what someone else built and we use, like DBs or 3rd party software, or the monolith we are still going to live with for some time.
And when we start creating new services, we need to keep observability as a first class citizen and design the services to be observable.
What does that mean?
Well, it means we should have a strategy for Logging, Monitoring, Alerting and Tracing.
Logging is your building block for everything else. And having to access production boxes to copy the application.log is not scalable when you have multiple services.
So, the most common approach to organize logs is to apply the Log aggregation pattern, which should be quite standard by now.
We have multiple services in different tech stacks writing logs. A collector collects these logs and transforms them into a structure that can be easily consumed by other systems, like a visualization or monitoring tool.
We should set up our new services so their logs can be collected. Building this infrastructure is something you should do from the first microservice you decide to create. The ELK stack, for instance, is pretty standard right now.
The library used for logging, or the specific implementation, is the concern of individual teams. As an architect, you should care about what is logged, not how.
Make sure the data is logged in a structured format, so you can search by various dimensions, transform it the way you need, and, in short, use it well.
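As a hedged illustration of what a structured format can look like in practice, here is a minimal Python sketch that emits log lines as JSON so an aggregator such as the ELK stack can index and search them by dimension. The field names and the "checkout" service name are assumptions:

```python
# Sketch: structured JSON logging with the standard library.
# Field names and the service name are illustrative assumptions.
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "service": "checkout",          # assumed service name
            "message": record.getMessage(),
        }
        # Merge in any structured context attached via `extra={"context": ...}`.
        payload.update(getattr(record, "context", {}))
        return json.dumps(payload)


handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("checkout")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Each log line is now searchable by order_id, total, level, service, etc.
logger.info("order placed", extra={"context": {"order_id": "o-7", "total": 42.5}})
```

A collector (Logstash, Fluentd, etc.) can then ship these lines to a search index without any per-service parsing rules.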
The goal of monitoring is to have enough information to help you make decisions
See trends, observe the system over time, confirm your hypothesis. When a new functionality is deployed in production you should be able to assess how it affects the system as a whole and then act accordingly.
Probably the tool you’re using to query your logs has dashboard capabilities; Kibana has them, for instance. You could start with that and evolve if the tool is insufficient.
An example of things you may want to monitor: in the retail example we talked about before, we wanted to monitor the behavior of the new Delivery service, and our data points were the response times of the new service and the conversion rate in general (our hypothesis was that by decoupling this we were not affecting sales).
With DDD we understood the impact that Delivery has on the rest of the checkout process and were able to make better decisions in terms of monitoring.
For the last one we used a technique called semantic monitoring. We automated some user flows and ran them in production round the clock. We tagged those transactions so they wouldn’t mess with other metrics, and obviously they didn’t trigger a real sale, but they gave us the information that we needed. You can also do this type of monitoring with some of the analytics tools out there.
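The tagging idea behind semantic monitoring can be sketched like this. The header name and helper functions are hypothetical, not from the talk:

```python
# Sketch: tag synthetic transactions so the metrics pipeline can exclude
# them from real business numbers. Header name and helpers are assumptions.
SYNTHETIC_HEADER = "X-Synthetic-Check"


def make_order_request(headers: dict, synthetic: bool = False) -> dict:
    """Build headers for an order request, marking synthetic probes."""
    tagged = dict(headers)
    if synthetic:
        tagged[SYNTHETIC_HEADER] = "true"
    return tagged


def should_count_in_metrics(headers: dict) -> bool:
    """Metrics side: drop synthetic probes from e.g. conversion-rate numbers."""
    return headers.get(SYNTHETIC_HEADER) != "true"


# The round-the-clock probe is tagged; real customer traffic is not.
probe = make_order_request({"Accept": "application/json"}, synthetic=True)
real = make_order_request({"Accept": "application/json"})
```

The same tag is also the hook for making sure a probe never triggers a real sale: the order service can short-circuit fulfilment when it sees it.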
Regardless, you want all of that data in a single place that is accessible to everyone in the team, not just developers or POs.
Make sure you don’t end up with a dashboard full of noise, because then it’s not as useful. So, review what you’re monitoring frequently to remove noise or unnecessary metrics.
Each tool you use will allow you to configure templates for alerts. Kibana, Opsgenie, Prometheus, etc
Autoscaling is a great example. If you’re using a cloud provider for your infra then that should be easy enough to set up
If you need to investigate an error in production, you need to be able to track the request as it flows through your system. And most likely the place where the error happens is not the most obvious one. This is a very simple example, but in real life some of these integrations may be asynchronous, and some of these services may not fail gracefully or give you proper information.
The most obvious thing is to set up your services in a way that they attach a TraceId to every request that comes through them.
However, this may not be enough. Sometimes having to trace this manually is very time consuming. So, you may want to look into a distributed tracing tool for your services.
This tool will take care of attaching the TraceId and sending the information to a centralized place where it can be visualized.
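Before reaching for a tracing tool, the manual TraceId propagation described above can be sketched in a few lines. The header name follows a common convention; everything else is an illustrative assumption:

```python
# Sketch: manual trace-id propagation. Each service reuses an incoming
# X-Trace-Id header or generates one at the edge, and forwards the same
# id on every outbound call (and in every log line).
import uuid

TRACE_HEADER = "X-Trace-Id"


def ensure_trace_id(incoming_headers: dict) -> str:
    """Reuse the caller's trace id, or start a new trace at the edge."""
    return incoming_headers.get(TRACE_HEADER) or uuid.uuid4().hex


def outbound_headers(trace_id: str) -> dict:
    """Propagate the same id on every downstream request."""
    return {TRACE_HEADER: trace_id}


edge_id = ensure_trace_id({})              # first service: no incoming id
downstream = outbound_headers(edge_id)     # headers for the next hop
same_id = ensure_trace_id(downstream)      # downstream service reuses it
```

A distributed tracing tool automates exactly this, plus shipping the spans to a central place for visualization.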
What the services use to implement tracing is not as important as the protocol. OpenTracing, for instance, is a standard that is more and more widely adopted, so it’s probably something you’d like to have a look at. Most current tools accept it (Datadog, Dynatrace, Zipkin, etc.).
Prefer service templates to bootstrap microservices over shared libraries. Add a service mesh if required.