ContainerDays 2018, Hamburg: Talk by Florian Lautenschlager (@flolaut, Senior Software Engineer at QAware) and Josef Fuchshuber (@fuchshuber, Principal Software Architect at QAware)
Abstract:
This is not just another “Oh look! I’ll show you how to use OpenTracing, Prometheus and EFK for distributed Hello World projects” talk. There are tons of great talks on that out there. Instead, we present a case study of an observable, large, real-world cloud native application and share our key findings from a technical, functional and collaborative point of view.

For typical monitoring and observability, Sleuth, Prometheus and the EFK stack are perfect, bulletproof tools. They are the means to collect, store and analyze traces, metrics and logs. For technical monitoring of resources, e.g. memory and CPU consumption, we use the USE method described by Brendan Gregg [1], and for functional monitoring, e.g. use cases and business services, we use the RED method described by Tom Wilkie [2]. Continuous end-to-end tests deployed along with the software system give us constant feedback about it. All relevant metrics are checked by automated alerts, defined in Grafana, which keep us up to date. In addition, we link all information (traces, logs, metrics) in order to gain as much knowledge as possible, e.g. by adding the trace id to every log event (called contextualized logging [3] or log correlation [4]).

On top of our technical and functional monitoring we designed a so-called collaborative monitoring. This means that our observability tools are integrated into the standard tools of our audience, which is highly heterogeneous: engineers, QA, managers, operations, help desk. The big benefit of such collaborative monitoring is better collaboration between the people around the project, and also with the machines. It allows us, for example, to build chatbots for easily interacting with the software system, and everyone can jump directly to the traces, logs and metrics of a request and send them to a person who can help if something bad happens. With these opportunities, observability leads to an improvement of documentation, tickets, bug-fix processes and communication all across the project. It has never been easier to talk about a software system (OK, that was just for fun). We show you our solution (also at code level) and talk about pros and cons.
[1] http://www.brendangregg.com/usemethod.html
[2] https://www.weave.works/blog/of-metrics-and-middleware/
[3] https://medium.com/opentracing/take-opentracing-for-a-hotrod-ride-f6e3141f7941
[4] http://cloud.spring.io/spring-cloud-static/spring-cloud-sleuth/2.0.0.M8/single/spring-cloud-sleuth.html#_log_correlation
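To make the RED method [2] concrete before diving into the case study: a minimal sketch of use-case instrumentation, assuming Micrometer in a Spring Boot service. The class, metric name and tags are illustrative, not taken from the talk.

```java
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;

// Illustrative RED-method instrumentation for one business use case:
// a single timer yields Rate (the count), Errors (the outcome tag)
// and Duration (the latency distribution). Names are made up here.
public class OrderUseCase {

    private final MeterRegistry registry;

    public OrderUseCase(MeterRegistry registry) {
        this.registry = registry;
    }

    public void placeOrder(String orderId) {
        Timer.Sample sample = Timer.start(registry);
        String outcome = "success";
        try {
            // ... actual business logic ...
        } catch (RuntimeException e) {
            outcome = "error";
            throw e;
        } finally {
            sample.stop(registry.timer("usecase_duration",
                    "usecase", "place_order", "outcome", outcome));
        }
    }
}
```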
5. In our cloud backend we have a vital microservice ecosystem.
6. Our team is just as vital and heterogeneous as our software.
Platform Developer
App Developer
Skill Developer
Client Developer
Tester
Ops
Help Desk
Product Management
Data Scientist
UX Designer
8. What is the hardest step in the DevOps process?
DEV OPS
9. Much better: The 6 Cs of the DevOps Cycle.
Source: https://dzone.com/articles/6-cs-of-devops-adoption
10. Observability in the wild! A case study… and how we found collaborative monitoring.
11. Monitoring Toolchain: Simply Cloud Native Standard.
Metrics, Events, Traces
Java (Spring Boot) or Python
on Azure / Kubernetes / OpenShift / Docker
12. Monitoring: Technical and Functional
Layer stack, bottom up: Hardware, Hypervisor, Operating System, Kubernetes, Docker, Runtime, Application.
Infrastructure-Monitoring (health of platform and application): generic monitoring of the lower layers that does not need knowledge about the application.
Application-Monitoring (telemetry data): monitoring of the runtime and the application that does need knowledge about the application.
13. Monitoring: Technical and Functional
Infrastructure-Monitoring (health of platform and application)
Questions: Are the services up and running? Can the services accept traffic? (A readiness-check sketch follows after this slide.)
Sources: kube-state-metrics exporter, Prometheus Node Exporter, JMX, top, iostat, etc.
Application-Monitoring (telemetry data)
Questions: What are the use-case runtimes? Are the service level agreements met?
Sources: specific instrumentation (around use cases, etc.)
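For the two infrastructure questions above (“up and running”, “can accept traffic”), a minimal readiness-style sketch, assuming Spring Boot Actuator; the indicator and its dependency check are hypothetical:

```java
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

// Illustrative: a custom Actuator health indicator answering
// "can this service accept traffic?". The check is a placeholder.
@Component
public class TrafficReadinessIndicator implements HealthIndicator {

    @Override
    public Health health() {
        if (checkDownstreamDependencies()) {
            return Health.up().build();
        }
        return Health.down()
                .withDetail("reason", "downstream dependency unreachable")
                .build();
    }

    private boolean checkDownstreamDependencies() {
        // Placeholder for real connectivity checks (database, queue, ...).
        return true;
    }
}
```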
16. I know. Most of you do this already. But what about …
Collaborative Monitoring!?!?
17. An example is the best explanation.
Once there was a little tiny application…
… and a monitoring toolchain…
… and a chatbot…
21. Standard Zipkin features: the total duration of a request and the services involved.
22. Code-Slide: Standardize tracing and metrics.
Traces and metrics for every database call, with standardized names and trace tags:
database_call_duration{repository="yy", call="zz"}
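The code slide itself did not survive the export; a minimal sketch of the convention it describes, assuming Micrometer for metrics and Brave (the tracer behind Spring Cloud Sleuth 2.x) for traces. The aspect, base package and tag values are illustrative, not the talk’s actual code:

```java
import brave.ScopedSpan;
import brave.Tracer;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.springframework.stereotype.Component;

// Illustrative sketch: instrument every repository call with a timer and
// a child span, using standardized names ("database_call_duration") and tags.
@Aspect
@Component
public class DatabaseCallInstrumentation {

    private final MeterRegistry registry;
    private final Tracer tracer; // brave.Tracer, provided by Sleuth

    public DatabaseCallInstrumentation(MeterRegistry registry, Tracer tracer) {
        this.registry = registry;
        this.tracer = tracer;
    }

    @Around("execution(* com.example..*Repository.*(..))") // hypothetical base package
    public Object instrument(ProceedingJoinPoint pjp) throws Throwable {
        String repository = pjp.getTarget().getClass().getSimpleName();
        String call = pjp.getSignature().getName();

        ScopedSpan span = tracer.startScopedSpan("db:" + repository + "." + call);
        span.tag("span.tag.db", "true");
        Timer.Sample sample = Timer.start(registry);
        try {
            return pjp.proceed();
        } catch (Throwable t) {
            span.error(t);
            throw t;
        } finally {
            sample.stop(registry.timer("database_call_duration",
                    "repository", repository, "call", call));
            span.finish();
        }
    }
}
```

Because the names (`database_call_duration`, `repository`, `call`) are standardized in one place, every repository gets comparable dashboards and trace tags for free.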
23. Code-Slide: Standardize tracing logs and tags.
Span logs: we model database calls, as well as other expensive calls, as span logs, using a template to reduce the size of traces:
db:<Repo>.<Call> took: xx ms.
call:<Class>.<Method> took: xx ms.
Span tags: used to model values that are valid for a whole span. We use a template to standardize tags:
span.tag. (prefix to mark our tags)
environment (staging, integration, etc.)
db (to mark spans with db calls)
param.<name>=value (call parameters)
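Again, the code slide is missing from the transcript; a minimal sketch of the templates above, assuming a `brave.Tracer` injected by Sleuth 2.x (the helper class is hypothetical):

```java
import brave.Span;
import brave.Tracer;

// Illustrative: apply the span-log and span-tag templates above to the
// current span. Null checks guard against code running outside a trace.
public class SpanConventions {

    private final Tracer tracer;

    public SpanConventions(Tracer tracer) {
        this.tracer = tracer;
    }

    /** Span log following the "db:<Repo>.<Call> took: xx ms." template. */
    public void logDatabaseCall(String repo, String call, long millis) {
        Span span = tracer.currentSpan();
        if (span != null) {
            span.annotate("db:" + repo + "." + call + " took: " + millis + " ms.");
        }
    }

    /** Span tags following the "span.tag." prefix template. */
    public void tagEnvironmentAndParam(String environment, String paramName, String value) {
        Span span = tracer.currentSpan();
        if (span != null) {
            span.tag("span.tag.environment", environment);  // staging, integration, ...
            span.tag("span.tag.db", "true");                // marks spans with db calls
            span.tag("span.tag.param." + paramName, value); // call parameters
        }
    }
}
```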
24. Standard EFK + contextual logging: the logs for a given trace, together with the involved services.
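The link between logs and traces needs no extra code in the services themselves: with Sleuth on the classpath, the current trace and span ids are copied into the SLF4J MDC, so the log pattern can stamp every event with the trace id and Kibana can filter all logs of one trace. A minimal sketch; the service class is illustrative:

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Illustrative: with Spring Cloud Sleuth on the classpath, the current
// trace/span ids live in the SLF4J MDC, so a log pattern referencing them
// (e.g. "%X{X-B3-TraceId}") stamps every log event with the trace id --
// no code change needed in the service itself.
public class CheckoutService {

    private static final Logger log = LoggerFactory.getLogger(CheckoutService.class);

    public void checkout(String orderId) {
        // This line ends up in Elasticsearch with the trace id attached,
        // so Kibana can show "all logs for a given trace" across services.
        log.info("Starting checkout for order {}", orderId);
    }
}
```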
28. End-to-end tests are also integrated in our observability stack.
They run in their own Docker containers, execute Spock tests periodically and export Prometheus metrics; their logs can be inspected like any other service’s.
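A minimal sketch of the export side, assuming the plain Prometheus Java client (simpleclient); the metric names, port and the Spock runner hook are made up for the example:

```java
import io.prometheus.client.Gauge;
import io.prometheus.client.exporter.HTTPServer;

// Illustrative: expose the outcome of periodically executed e2e tests as
// Prometheus metrics, so alerts and dashboards can use them like any
// other time series.
public class E2eMetricsExporter {

    static final Gauge LAST_RUN_SUCCESS = Gauge.build()
            .name("e2e_last_run_success")
            .help("1 if the last e2e test run passed, 0 otherwise.")
            .labelNames("suite")
            .register();

    static final Gauge LAST_RUN_DURATION = Gauge.build()
            .name("e2e_last_run_duration_seconds")
            .help("Duration of the last e2e test run.")
            .labelNames("suite")
            .register();

    public static void main(String[] args) throws Exception {
        HTTPServer server = new HTTPServer(9091); // scraped by Prometheus

        while (true) {
            long start = System.nanoTime();
            boolean passed = runSpockSuite("checkout"); // hypothetical test hook
            LAST_RUN_DURATION.labels("checkout").set((System.nanoTime() - start) / 1e9);
            LAST_RUN_SUCCESS.labels("checkout").set(passed ? 1 : 0);
            Thread.sleep(60_000); // run the suite periodically
        }
    }

    private static boolean runSpockSuite(String suite) {
        // Placeholder: in the real setup the Spock tests run here and
        // report their result; returning true keeps the sketch runnable.
        return true;
    }
}
```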
29. Our current setup: A chatbot as generic interface.
Development Setup!
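The slide shows a screenshot at this point; as a sketch of the “generic interface” idea, here is a tiny command handler that turns chat messages into deep links to the monitoring tools. Bot framework, commands and URLs are all hypothetical:

```java
import java.util.Map;

// Illustrative sketch of "chatbot as generic interface": map chat commands
// to deep links into Zipkin, Kibana and Grafana, so anyone in the chat can
// jump to (and forward) the traces, logs and metrics of a request.
public class MonitoringBot {

    private static final Map<String, String> TOOL_URLS = Map.of(
            "trace", "https://zipkin.example.com/zipkin/traces/%s",
            "logs", "https://kibana.example.com/app/kibana#/discover?_q=traceId:%s",
            "metrics", "https://grafana.example.com/d/service-dashboard?var-service=%s");

    /** Handles commands such as "trace 463ac35c9f6413ad" or "metrics checkout". */
    public String handle(String message) {
        String[] parts = message.trim().split("\\s+", 2);
        if (parts.length < 2 || !TOOL_URLS.containsKey(parts[0])) {
            return "Usage: trace|logs|metrics <trace id or service>";
        }
        return String.format(TOOL_URLS.get(parts[0]), parts[1]);
    }
}
```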
30. … and even our help desk / first-level support uses it.
Production Setup!
31. Early prototype of the Customer Care Observability Tool.
Activate tracing for a user
Health checks + e2e
Logs
36. Three steps to enable collaborative monitoring.
Start here:
1. Standardize metrics, logs and traces: structured logging + context, metric names, etc.
2. Link and combine them as far as possible: correlate events and traces by context, and metrics with events and traces by time.
3. Integrate them into everyone’s tools: the tools your team already uses.
37. Did we create an uncontrollable observability monster?
38. There’s No Such Thing as a Free Lunch
• The more complex a microservice architecture is, the more sophisticated the observability solution must be.
• For collaborative observability there is no out-of-the-box solution.
39. Collaborative Monitoring by everyone.
Ease of use:
Simple, general interface to access the various monitoring tools.
Integrated into everyone’s daily tools (chatbots, e-mail, etc.).
Supports all kinds of teams: operations / DevOps / developers / QA team / my mum =)
Allow everyone to get superman insights:
Decrease Mean Time To Recovery (MTTR) through fast analysis.
Integrates different kinds of monitoring data (traces, metrics and logs) from different monitoring layers.
The right information:
Provide relevant information for different teams, e.g. runtimes for the performance engineer.
Level of detail: abstract (use-case level) for management vs. detailed (database calls) for developers.
The behavior of a system is not just a single metric.
40. Lessons Learned
The tool stack is awesome: Prometheus, Sleuth / Zipkin and the logging stack (Fluentd, Elastic) are stable and well documented.
Maximum flexibility compared to commercial products.
But: effort for concepts, implementation and quality checks. Conventions and rulesets are important!
Mindset: we had to convince people first, but then saw a high level of acceptance.
Example: the chatbot with trace links is the standard tool for discussing possible bugs between all project roles.
Development and system understanding: no need for “cloudy” conversations; just provide the context, e.g. a trace id.
Example: issues typically contain the context (trace id) that points the developer to the logs and the trace.
41. Any Questions?
Come to our booth. We’re hiring!
#CloudNativeNerd #CloudKoffer
chatbot: cvi scale up team