chronosphere.io
3 Pitfalls Everyone Should Avoid with Cloud Native Observability
Eric D. Schabell
Director Evangelism
@ericschabell{@fosstodon.org}
10 Jan 2024
Cloud Native London
Cloud Native Observability
Cloud Native
Data volume
Experiment:
- Hello World app on a 4-node Kubernetes cluster with tracing, End User Metrics (EUM), logs, and metrics (containers / nodes)
- 30 days == +450 GB
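The experiment's figure implies a rough daily rate. The following is a minimal sketch of that arithmetic, assuming the volume accrues evenly over the 30 days and scales linearly with the retention window. Both are simplifying assumptions rather than claims from the slides, and the retention windows shown are hypothetical examples.

```python
# Figures from the slide: roughly 450 GB generated over 30 days.
TOTAL_GB = 450
DAYS = 30


def daily_volume_gb(total_gb: float, days: int) -> float:
    """Average telemetry volume per day, assuming even accrual."""
    return total_gb / days


def retained_volume_gb(daily_gb: float, retention_days: int) -> float:
    """Data held at steady state for a retention window (assumes linear scaling)."""
    return daily_gb * retention_days


daily = daily_volume_gb(TOTAL_GB, DAYS)
print(f"{daily:.1f} GB/day")  # 15.0 GB/day

# Hypothetical retention windows, for illustration only.
for retention_days in (30, 90, 365):
    print(retention_days, "days ->", retained_volume_gb(daily, retention_days), "GB")
```

Under these assumptions, a single Hello World app would hold about 5,475 GB at a one-year retention window, which is why the retention setting matters as much as the ingest rate.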
Retention
Cloud Native at Scale
Observability…
Cloud Native Observability
at Scale
O11y at Scale (need)
Picking Your Pitfalls
1. Ignoring costs in application landscape
2. Focusing on The Pillars
3. Sneaky sprawling tooling mess
4. Controlling costs
5. Losing your way in the protocol jungles
6. Underestimating cardinality
1. Ignoring costs in application landscape
Prometheus for metrics, alerting, queries
Prometheus auto discovery
Manual instrumentation (java client lib)
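Manual instrumentation means the application itself counts events and exposes them for Prometheus to scrape. The sketch below illustrates the idea using only the Python standard library. It is not the Java client library the slide refers to: the `Counter` class, the metric name, and the label are all hypothetical, and only the text layout (`# HELP`, `# TYPE`, then one line per labeled series) follows the Prometheus text exposition format.

```python
class Counter:
    """A tiny stand-in for a client-library counter (hypothetical, not a real API)."""

    def __init__(self, name: str, help_text: str) -> None:
        self.name = name
        self.help_text = help_text
        # Maps a sorted tuple of (label, value) pairs to the running total.
        self._values = {}

    def inc(self, amount: float = 1.0, **labels: str) -> None:
        """Increase the counter for one combination of label values."""
        key = tuple(sorted(labels.items()))
        self._values[key] = self._values.get(key, 0.0) + amount

    def render(self) -> str:
        """Render all series in the Prometheus text exposition layout."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for key, value in sorted(self._values.items()):
            label_str = ",".join(f'{k}="{v}"' for k, v in key)
            series = f"{self.name}{{{label_str}}}" if label_str else self.name
            lines.append(f"{series} {value}")
        return "\n".join(lines) + "\n"


# Hypothetical metric for a Hello World style service.
hello_requests = Counter("hello_requests_total", "Total requests handled.")
hello_requests.inc(path="/hello")
hello_requests.inc(path="/hello")
hello_requests.inc(path="/bye")
print(hello_requests.render())
```

Each distinct label combination becomes its own series in the output, so every label added during manual instrumentation also adds to the data volume and cost discussed in this section.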
Short link: bit.ly/prom-workshop
OpenTelemetry (Auto) instrumentation
[Diagram: Applications (Java), OTel Auto Instrumentation (libraries), OTel API, OTel SDK, and OTel Collector, linked by OTLP]
OpenTelemetry Collector (agent)
[Diagram: a Host with Applications, OTel Auto Instrumentation, OTel API, OTel SDK, and an OTel Collector Agent, linked by OTLP to an Observability Backend (Prometheus, Jaeger, Fluent Bit, etc.)]
Collector (gateway)
[Diagram: multiple Hosts with Applications, OTel Auto Instrumentation, OTel API, OTel SDK, and an OTel Collector Agent, linked by OTLP to an OTel Collector Gateway and an Observability Backend (Prometheus, Jaeger, Fluent Bit, etc.)]
Short link: bit.ly/opentelemetry-workshop
2. Focusing on The Pillars
Pillars Phases
Developer
Technology
Bottom up
Pillar problems…
Car is on fire…
Better outcomes…
Faster remediation…
Easier detection…
Happier customers…
Phase 1
Know something is happening as fast as possible…
Phase 2
Triage with specific information…
Phase 3
Understand to ensure it never happens again…
3. Sneaky sprawling tooling mess
Over 66% of organizations use more than 10 different observability tools
Source: ESG report on exploding data volumes
Know
Triage
Understand
4. Controlling costs
“It’s remarkable how common this situation is, where an organization is paying more for their observability data, than they do for their production infrastructure.”
O11y data storage costs are broken.
Keeping everything model?
Know the cost of observability metrics data?
[Diagram: Chronosphere platform handling Traces, Metrics, and Events]
- CHRONOSPHERE LENS: Developer-centric observability
- CONTROL PLANE: Real-time insight and transformation of observability data
- PURPOSE-BUILT DATA STORES PER TELEMETRY TYPE: Single tenant architecture, proven industrial reliability and scalability
5. Losing your way in the protocol jungles
Without open standards, you’ll not find a way back…
OpenTelemetry Collector (agent)
[Diagram: a Host with Applications, OTel Auto Instrumentation, OTel API, OTel SDK, and an OTel Collector Agent, linked by OTLP to an Observability Backend (Prometheus, Jaeger, Fluent Bit, etc.)]
Prometheus for metrics, alerting, queries
6. Underestimating cardinality
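Cardinality, in the usual metrics sense, is the number of distinct time series a metric can produce. Its upper bound is the product of the number of distinct values each label can take. The sketch below shows how quickly that product grows; the label names and counts are hypothetical and do not come from the slides.

```python
from math import prod


def max_series(label_value_counts: dict[str, int]) -> int:
    """Upper bound on time series for one metric: product of distinct values per label."""
    return prod(label_value_counts.values())


# Hypothetical labels on a single request counter.
labels = {"service": 20, "endpoint": 15, "status_code": 5, "pod": 50}
print(max_series(labels))  # 75000

# Adding one unbounded-looking label multiplies the whole product.
labels["user_id"] = 10_000
print(max_series(labels))  # 750000000
```

The bound is rarely reached in practice because not every combination occurs, but it shows why a single high-cardinality label such as a user identifier is easy to underestimate.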
The struggle is real
“I don't yet collect spans/traces because I can hardly get our devs to care about basic metrics, let alone traces.”
“This is a large enterprise with approx. 1000 developers. Cultivating a culture of engineering that cares about availability is a challenge that we need to solve alongside any technical implementations.”
10 hours on average, per week, trying to triage and understand incidents - a quarter of a 40 hour work week
33% said those issues disrupted their personal life
39% admitted they are frequently stressed out
Cloud Native Observability at Scale
[Diagram: Chronosphere platform handling Traces, Metrics, and Events]
- CHRONOSPHERE LENS: Developer-centric observability
- CONTROL PLANE: Real-time insight and transformation of observability data
- PURPOSE-BUILT DATA STORES PER TELEMETRY TYPE: Single tenant architecture, proven industrial reliability and scalability
What should be the #1 item on your cloud native observability wishlist?
Questions?
Eric D. Schabell
Director Evangelism
@ericschabell{@fosstodon.org}
