The Top Outages of 2022: Analysis and Takeaways

ThousandEyes
© 1992–2023 Cisco Systems, Inc. All rights reserved.
Featured Speaker
Mike Hicks
Principal Solutions Analyst
Before We Begin...
• If you have any questions, please type them in the Questions window.
• If you have any audio problems, please message us in the chat for help.
• A recording of this presentation will be sent to you in a few days.
@ThousandEyes
Agenda
• About ThousandEyes
• Noteworthy Outages of 2022
• Primer: Digital Service Building Blocks
• Top Ten Outage Countdown
• Lessons & Takeaways
• Q&A
Actionable Insight for Internet, Cloud, and SaaS
• Correlated Insights: Quickly isolate issues to app, network, or service
• Network Visibility: Overlay, hop-by-hop underlay, ISP performance, and BGP routing
• App Experience: SaaS, API, and internal app performance and user experience
2022 Noteworthy Outages
Severity legend: Major, Significant, Shadow
• British Airways (2/25)
• Twitter prefixes hijacked (3/28)
• Atlassian services unavailable (4/5)
• Rogers routing failure (7/8)
• AWS AZ failure (7/28)
• Zoom outage (9/15)
• Zscaler Internet Access failure (10/25)
• WhatsApp outage (10/25)
• AWS packet loss (12/5)
The Building Blocks of Today’s Digital Services
Diagram components: DNS, BGP, CDN, Cloud, SaaS
Many Options, Complex Dependencies
Diagram components: Users, ISP, BGP, DNS, CDN, Security, Your App, Cloud APIs, Cloud IaaS, Data Center
Step 1: DNS – Where Are We Going?
Diagram components: Users, ISP, BGP, DNS (Root Server, TLD Server, Authoritative Server), CDN, Your App
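
The lookup chain named on this slide (root server, then TLD server, then authoritative server) can be illustrated with a toy model. This is only a sketch under stated assumptions: the zone data, the server names, and the resolve function are hypothetical, and no real DNS queries are made.

```python
# Toy model of iterative DNS resolution: root -> TLD -> authoritative.
# All zone data is made up for illustration (example.com, 192.0.2.10).

ROOT = {"com.": "tld-com"}                                   # root refers to a TLD server
TLD = {"tld-com": {"example.com.": "auth-example"}}          # TLD refers to an authoritative server
AUTH = {"auth-example": {"www.example.com.": "192.0.2.10"}}  # authoritative server holds the record


def resolve(name):
    """Return (address, servers_asked); raise LookupError at the failing step."""
    if not name.endswith("."):
        name += "."
    labels = name.rstrip(".").split(".")
    tld = labels[-1] + "."
    zone = ".".join(labels[-2:]) + "."
    path = ["root"]

    tld_server = ROOT.get(tld)
    if tld_server is None:
        raise LookupError(f"root has no delegation for {tld}")
    path.append(tld_server)

    auth_server = TLD[tld_server].get(zone)
    if auth_server is None:
        raise LookupError(f"{tld_server} has no delegation for {zone}")
    path.append(auth_server)

    address = AUTH[auth_server].get(name)
    if address is None:
        raise LookupError(f"{auth_server} has no record for {name}")
    return address, path
```

A failure at any of the three steps stops the lookup before the user ever reaches the app, which is why DNS is treated here as the first dependency.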
Step 2: How Do We Get There?
Diagram components: Users, ISP, BGP, DNS, CDN, Your App
Step 3: CDNs - Do We Have to Travel So Far?
Diagram components: Users, ISP, BGP, DNS, CDN, Your App
Step 4: Rinse and Repeat for Services & API Calls
Diagram components: Your App, SaaS Apps, Cloud APIs, Data Center, Backend Services
Top Ten Countdown
Five outages ranked #10 to #6 (listed by date)
• Atlassian, Apr 5, 2022 (~2.5 days; service unavailable/data loss): Due to a maintenance script error, Atlassian services experienced a days-long outage. Lesson: One cannot rely on status pages alone to communicate about outages. Customers can be left worrying, with no answer as to how serious an outage is and when it will be fixed.
• Rogers, Jul 8, 2022 (~24 hours; app + routing issues): Rogers withdrew its prefixes due to an internal routing issue, rendering it unreachable across the Internet for nearly 24 hours. Lesson: No provider is immune to outages. Plan for a backup network provider that can limit the duration and scope of an outage.
• Zscaler Internet Access, Oct 25, 2022 (~30 minutes; network traffic loss): Customers using Zscaler Internet Access (ZIA) experienced connectivity failures or high latency in reaching Zscaler proxies. Lesson: Having network-agnostic data for complex scenarios like this can enable quicker attribution and remediation.
• WhatsApp, Oct 25, 2022 (~2 hours; failure to send/receive messages): The two-hour outage left WhatsApp users unable to send or receive messages. Lesson: A thriving SaaS business relies on continuous improvement, which is why an immediate feedback loop, whereby mistakes can be rectified quickly, is necessary.
• AWS, Dec 5, 2022 (~1 hour; network traffic/packet loss): Significant packet loss between 2 global locations and AWS' us-east-2 region. Lesson: It’s important to monitor not just the applications, but also the cloud infrastructure components and any dependent cloud software services.
#5: Zoom, September 15, 2022
• Service unavailable for ~20 minutes
• Users were unable to log in or join meetings
• Most of the HTTP errors seen were 502 Bad Gateway responses, indicative of potential CDN issues
• The service would appear to be available when testing reachability by IP alone, but the HTTP results and service status tell a different story
Lesson: The app itself, rather than the network, may be causing the issue. Visibility into which layer is at fault can prevent confusion and finger-pointing during root cause analysis.
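
The distinction drawn on this slide, a server that is reachable at the network layer while HTTP requests fail, can be sketched as a small attribution rule. The classify function and its inputs are hypothetical names introduced for illustration; a real monitoring setup would feed it measured results, and treating every non-5xx response as healthy is a simplifying assumption.

```python
def classify(tcp_connected, http_status):
    """Attribute a check result to the network or the application layer.

    tcp_connected: True if a TCP connection to the server succeeded.
    http_status:   integer HTTP status code, or None if no response arrived.
    """
    if not tcp_connected:
        return "network"        # the server could not be reached at all
    if http_status is None:
        return "application"    # reachable, but the request timed out
    if 500 <= http_status <= 599:
        return "application"    # reachable, but the server side returned an error
    return "ok"                 # simplification: any non-5xx response counts as healthy


# A reachable server that returns a 5xx error points at the app, not the network.
print(classify(True, 502))
```

A reachability-only test corresponds to looking at tcp_connected alone, which is why it would have reported this kind of outage as healthy.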
#4: British Airways, February 25, 2022
• Service unavailable for ~20 minutes
• Outage caused hundreds of flight cancellations and disruptions to the airline's operations
• Network paths to the airline’s online services (and servers) were reachable, but server and site responses were timing out
Lesson: Architecting backends that avoid single points of failure can reduce the likelihood of a chain of cascading failures.
#3: Google, August 9, 2022
• Service unavailable for ~60 minutes
• Outage affected Google Search and Maps
• During this time, Google web servers responded with HTTP 500 Internal Server Error messages, 502 Bad Gateway errors, and timeouts
Lesson: It is important to monitor not just your application front ends but also the performance-critical dependencies that power your app.
#2: AWS AZ Failure, July 28, 2022
• Service unavailable for ~20 minutes, with ~3 hours for customers to recover
• Caused by an Availability Zone power failure
• Impacted applications such as Webex, Okta, and Splunk
• Affected EC2 instances and EBS volumes as well as traffic routing
Lesson: Architect for redundancy across Availability Zones. Multi-AZ deployments are typically active/active, which removes the need to execute a backup plan during a failure.
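
One way to act on this lesson is to check how a workload's instances are spread across Availability Zones. The sketch below assumes a hypothetical inventory of (instance ID, zone) pairs. It calls no cloud API, and the az_report helper and the sample IDs are invented for illustration.

```python
from collections import Counter


def az_report(instances, min_zones=2):
    """Summarize zone spread for an iterable of (instance_id, availability_zone) pairs."""
    per_zone = Counter(zone for _, zone in instances)
    return {"zones": dict(per_zone), "redundant": len(per_zone) >= min_zones}


# Hypothetical inventory: every instance sits in one zone, so a single
# AZ power failure would take the whole service down.
single_az = [("i-001", "us-east-2a"), ("i-002", "us-east-2a")]
multi_az = [("i-001", "us-east-2a"), ("i-002", "us-east-2b")]

print(az_report(single_az)["redundant"])
print(az_report(multi_az)["redundant"])
```

Counting zones is only a first check. It says nothing about whether each zone can carry the full load on its own.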
#1: Twitter, March 28, 2022
• Service unavailable for ~45 minutes
• Twitter was rendered unreachable for some users when JSC RTComm.RU (AS 8342) announced one of Twitter’s prefixes and subsequently blackholed traffic
• Because Twitter’s service is not located within RTComm’s network, any Twitter traffic routed to RTComm would have failed
Lesson: Even if your company has implemented RPKI to fend off BGP threats, your telco might not have. Consider this when selecting ISPs.
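
The RPKI check mentioned in this lesson is route origin validation: a BGP announcement is compared against signed Route Origin Authorizations (ROAs). The following is a minimal sketch of that decision logic, in the spirit of RFC 6811. The ROA table, the prefixes (documentation ranges), and the AS numbers are hypothetical and are not the values involved in the Twitter incident.

```python
import ipaddress

# Hypothetical ROA table: (authorized prefix, max prefix length, authorized origin AS).
ROAS = [(ipaddress.ip_network("203.0.113.0/24"), 24, 64500)]


def validate(prefix, origin_as, roas=ROAS):
    """Return 'valid', 'invalid', or 'not-found' for an announced prefix and origin AS."""
    net = ipaddress.ip_network(prefix)
    covered = False
    for roa_net, max_len, roa_as in roas:
        # A ROA covers the announcement if the announced prefix lies inside it.
        if net.version == roa_net.version and net.subnet_of(roa_net):
            covered = True
            if net.prefixlen <= max_len and origin_as == roa_as:
                return "valid"
    # Covered by a ROA but no match: wrong origin AS or too specific a prefix.
    return "invalid" if covered else "not-found"


print(validate("203.0.113.0/24", 64500))  # announcement from the authorized origin
print(validate("203.0.113.0/24", 64501))  # same prefix announced by another AS
```

A network that drops "invalid" announcements would not accept the second route. A network that skips validation would, which is the gap the lesson points to when choosing ISPs.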
Lessons and Takeaways
• BGP powers the Internet, but it can also be misused and abused. Visibility and planning are needed to protect your network.
• Public cloud is ubiquitous and reliable, but be sure to monitor all of your cloud dependencies.
• Avoid single points of failure. Your apps are only as resilient as your architecture.
• Security is essential, but it can add significant complexity that requires continuous end-to-end visibility.
• Failures can occur whenever the infrastructure is touched. Visibility is critical before and after each network change to avoid impacts.
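
The last takeaway, visibility before and after each change, can be sketched as a comparison of two metric snapshots. The metric names, the 10% tolerance, and the regressions helper are assumptions made for illustration. The sketch also assumes every metric is one where a higher value is worse.

```python
def regressions(before, after, tolerance=0.10):
    """Return metrics that worsened by more than `tolerance` (a fraction) after a change.

    before/after: dicts mapping metric name -> value, where higher is worse
    (for example latency in ms or packet loss in percent).
    """
    worse = {}
    for name, old in before.items():
        new = after.get(name)
        if new is None:
            continue                      # metric missing after the change: skip it
        if new > old * (1 + tolerance):
            worse[name] = (old, new)
    return worse


# Hypothetical snapshots taken before and after a network change.
baseline = {"latency_ms": 40.0, "loss_pct": 0.0}
post_change = {"latency_ms": 42.0, "loss_pct": 2.0}

print(regressions(baseline, post_change))
```

Here the small latency increase stays within tolerance, while the new packet loss is flagged for review.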
Next Steps
Copyright ©2023 ThousandEyes
• Subscribe to the blog: https://blog.thousandeyes.com
• Get a real-time view of the health of the Internet: https://thousandeyes.com/outages
• Sign up for a free trial: https://www.thousandeyes.com/signup
• Request a demo: https://www.thousandeyes.com/request-demo
Q&A
The Top Outages of 2022: Analysis and Takeaways

  • 1. 1 © 1992–2023 Cisco Systems, Inc. All rights reserved.
  • 2. 2 © 1992–2023 Cisco Systems, Inc. All rights reserved. Featured Speaker Mike Hicks Principal Solutions Analyst
  • 3. 3 © 1992–2023 Cisco Systems, Inc. All rights reserved. Before We Begin... • If you have any questions, please type them in the Questions window. • If you have any audio problems, please chat us for help. • A recording of this presentation will be sent to you in a few days. 3 @ThousandEyes © 1992–2023 Cisco Systems, Inc. All rights reserved.
  • 4. 4 © 1992–2023 Cisco Systems, Inc. All rights reserved. Agenda • About ThousandEyes • Noteworthy Outages of 2022 • Primer: Digital Service Building Blocks • Top Ten Outage Countdown • Lessons & Takeaways • Q&A 4 @ThousandEyes
  • 5. 5 © 1992–2023 Cisco Systems, Inc. All rights reserved. Actionable Insight for Internet, Cloud, and SaaS Correlated Insights Quickly isolate issues to app, network, or service Network Visibility Overlay, hop-by-hop underlay, ISP performance, and BGP routing App Experience SaaS, API, and internal app performance and user experience
  • 6. 6 © 1992–2023 Cisco Systems, Inc. All rights reserved. 2022 Noteworthy Outages Major Significant Shadow British Airways (2/25) Twitter prefixes hijacked (3/28) Atlassian services unavailable (4/5) Rogers routing failure (7/8) AWS AZ Failure (8/9) Zoom Outage (9/15) Zscaler Internet Access Failure (10/25) WhatsApp Outage (10/25) AWS packet loss (12/5)
  • 7. 7 © 1992–2023 Cisco Systems, Inc. All rights reserved. CDN Cloud BGP DNS The Building Blocks of Today’s Digital Services SaaS
  • 8. 8 © 1992–2023 Cisco Systems, Inc. All rights reserved. DNS BGP Many Options, Complex Dependencies ISP Users CDN Your App Security
  • 9. 9 © 1992–2023 Cisco Systems, Inc. All rights reserved. DNS BGP Many Options, Complex Dependencies ISP Users CDN Your App Cloud APIs Data Center Cloud IaaS Security
  • 10. 10 © 1992–2023 Cisco Systems, Inc. All rights reserved. Step 1: DNS – Where are We Going? Users CDN Your App BGP ISP DNS Root Server TLD Server Authoritative Server
  • 11. 11 © 1992–2023 Cisco Systems, Inc. All rights reserved. Step 2: How do We Get There? Users BGP ISP DNS CDN Your App
  • 12. Step 3: CDNs - Do We Have to Travel So Far? (diagram labels: Users, ISP, DNS, BGP, CDN, Your App)
  • 13. Step 4: Rinse and Repeat For Services & API Calls (diagram labels: Your App, SaaS Apps, Cloud APIs, Data Center, Backend Services)
  • 15. Outages #10 to #6
      • Atlassian, Apr 5, 2022 (~2.5 days; service unavailable/data loss). Due to a maintenance script error, Atlassian services experienced a days-long outage. Lesson: Status pages alone are not enough to communicate about outages. Customers can be left worrying, with no answer as to how serious an outage is or when it will be fixed.
      • Zscaler Internet Access, Oct 25, 2022 (~30 minutes; network traffic loss). Customers using Zscaler Internet Access (ZIA) experienced connectivity failures or high latency in reaching Zscaler proxies. Lesson: Having network-agnostic data for complex scenarios like this can enable quicker attribution and remediation.
      • WhatsApp, Oct 25, 2022 (~2 hours; failure to send/receive messages). The two-hour outage left WhatsApp users unable to send or receive messages. Lesson: A thriving SaaS business relies on continuous improvement, which requires an immediate feedback loop so that mistakes can be rectified quickly.
      • AWS, Dec 5, 2022 (~1 hour; network traffic/packet loss). Significant packet loss between 2 global locations and AWS' us-east-2 region. Lesson: It’s important to monitor not just the applications, but also the cloud infrastructure components and any dependent cloud software services.
      • Rogers, Jul 8, 2022 (~24 hours; app + routing issues). Rogers withdrew its prefixes due to an internal routing issue, rendering it unreachable across the Internet for nearly 24 hours. Lesson: No provider is immune to outages. Plan for a backup network provider that can limit the length and scope of an outage.
  • 16. Zoom, September 15th, 2022 (#5)
      • Service unavailable ~20 minutes
      • Users were unable to log in or join meetings
      • Most of the HTTP errors seen were 502 Bad Gateway responses, indicative of potential CDN issues
      • The service would appear to be available if tested only at the IP level, but HTTP results and service status tell a different story
      Lesson: The app itself, rather than the network, may be causing the issue. Visibility into which layer is at fault can prevent confusion and finger-pointing during root cause analysis.
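The distinction drawn on the Zoom slide, where the server is reachable at the network level but the service itself returns server errors, can be sketched as a small classification step. This is a minimal illustration, not how any particular monitoring product works; the `Probe` structure and the `classify` function are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Probe:
    """Result of one synthetic test (hypothetical structure)."""
    tcp_connected: bool          # did a TCP connection to the server succeed?
    http_status: Optional[int]   # HTTP status code, or None if no response arrived


def classify(probe: Probe) -> str:
    """Attribute a probe result to the network layer or the application layer."""
    if not probe.tcp_connected:
        return "network"         # the server could not be reached at all
    if probe.http_status is None:
        return "application"     # reachable, but the HTTP request timed out
    if probe.http_status >= 500:
        return "application"     # reachable, but the service returned a server error
    return "healthy"             # any status below 500 counts as the service responding
```

A check that stops at the first `if` would report this kind of outage as healthy, because the connection succeeds; only the HTTP-level result reveals the failure.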
  • 17. British Airways, February 25, 2022 (#4)
      • Service unavailable ~20 minutes
      • Outage caused hundreds of flight cancellations and disruptions in the airline's operations
      • Network paths to the airline’s online services (and servers) were reachable, but server and site responses were timing out
      Lesson: Architecting backends that avoid single points of failure can reduce the likelihood of a cascading chain of failures.
  • 18. Google, August 9, 2022 (#3)
      • Service unavailable for ~60 minutes
      • Outage affected Google Search and Maps
      • During this time, Google web servers responded with HTTP 500 Internal Server Error messages, 502 Bad Gateway errors, and timeouts
      Lesson: It is important to monitor not just your application front ends but also the performance-critical dependencies that power your app.
  • 19. AWS AZ Failure, July 28th, 2022 (#2)
      • Service unavailable ~20 minutes, ~3 hours for customers to recover
      • Caused by an Availability Zone power failure
      • Impacted applications such as Webex, Okta, and Splunk
      • Affected EC2 instances and EBS volumes as well as traffic routing
      Lesson: Build redundancy across AZs. Multi-AZ architectures are typically active/active, which removes the need to execute a separate backup plan.
  • 20. Twitter, March 28th, 2022 (#1)
      • Service unavailable ~45 minutes
      • Twitter was rendered unreachable for some users when JSC RTComm.RU (AS 8342) announced one of Twitter’s prefixes and subsequently blackholed traffic
      • Since Twitter’s service is not located within RTComm’s network, any Twitter traffic routed to RTComm would have failed
      Lesson: Though your company might have RPKI implemented to fend off BGP threats, your telco may not. This is something to consider when selecting ISPs.
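The RPKI protection mentioned in the lesson rests on route origin validation: an announcement is checked against Route Origin Authorizations (ROAs) and classified as valid, invalid, or not-found, in the manner described in RFC 6811. The sketch below shows only that classification step. The `ROA` tuple and `validate` function are hypothetical names, and the prefixes and AS numbers are reserved documentation values, not the ones involved in the Twitter incident.

```python
import ipaddress
from typing import List, NamedTuple


class ROA(NamedTuple):
    prefix: str      # prefix the holder has authorized
    max_length: int  # longest prefix length the holder allows
    asn: int         # AS authorized to originate the prefix


def validate(prefix: str, origin_asn: int, roas: List[ROA]) -> str:
    """Return 'valid', 'invalid', or 'not-found' for a BGP announcement."""
    announced = ipaddress.ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # Skip ROAs for the other address family (IPv4 vs IPv6).
        if announced.version != roa_net.version:
            continue
        # A ROA is relevant only if its prefix covers the announced one.
        if not announced.subnet_of(roa_net):
            continue
        covered = True
        if roa.asn == origin_asn and announced.prefixlen <= roa.max_length:
            return "valid"
    # Covered by at least one ROA but matching none: treat as a likely hijack.
    return "invalid" if covered else "not-found"


# Illustrative data only: documentation prefix and documentation AS numbers.
roas = [ROA("192.0.2.0/24", 24, 64496)]
print(validate("192.0.2.0/24", 64496, roas))  # legitimate origin
print(validate("192.0.2.0/24", 64511, roas))  # same prefix from another AS
```

A network that drops announcements classified as invalid would not accept the second route. A network that skips validation accepts it, which is how traffic can still be misdirected even when the prefix holder has published ROAs.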
  • 21. Lessons and Takeaways
      • BGP powers the Internet, but can also be misused and abused. Visibility and planning are needed to protect your network.
      • Public cloud is ubiquitous and reliable, but ensure that you are monitoring all cloud dependencies.
      • Avoid single points of failure. Your apps are only as resilient as your architecture.
      • Security is essential, but it can add great complexity that requires continuous end-to-end visibility.
      • Whenever the infrastructure is touched, failures can occur. Visibility is critical before and after each network change to avoid impacts.
  • 22. Next Steps (@ThousandEyes, Copyright ©2023 ThousandEyes)
      Learn more
      • Subscribe! https://blog.thousandeyes.com
      • Get a real-time view of the health of the Internet: https://thousandeyes.com/outages
      Free Trial / Demo
      • Sign up for a Free Trial: https://www.thousandeyes.com/signup
      • Request a demo: https://www.thousandeyes.com/request-demo
  • 23. Q&A