Learning Webops
  the Hard Way

What could possibly go wrong?

                    Cosimo Streppone
         WebOps Lead – Opera Software
The Hard Way ?
failure
teams organization

  mail    webops    sysadmin
1
a cascade of errors
#
# Cyrus IMAPD annotation definitions file
#
/vendor/messagingengine.com/
preview,message,string,backend,value.shared,
misplaced comma                           +
 fix didn't make it to master             +
  unintended general rollout              +
    parser choked on comma                +
      fork with no rate limiting          +
        fatal() dumped core               +
         kernel.core_uses_pid = 1         +
           small SSD metadata partition   +
             index corruption             =
                massive outage (no data loss)
DO
Rate limit fork of children

Test disk full conditions

Master your infrastructure
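
One way to read "Rate limit fork of children" is a cap on how many children a master process may spawn per time window, so that a child which dies at startup cannot trigger an unbounded respawn loop. A minimal Python sketch, assuming a sliding-window limit; the class name `SpawnLimiter` and its parameters are hypothetical and are not taken from Cyrus:

```python
import time
from collections import deque


class SpawnLimiter:
    """Allow at most max_spawns child spawns per window_seconds (sliding window).

    Hypothetical helper for illustration only; not part of any real daemon.
    """

    def __init__(self, max_spawns, window_seconds, clock=time.monotonic):
        self.max_spawns = max_spawns
        self.window_seconds = window_seconds
        self.clock = clock      # injectable so the limiter can be tested
        self.recent = deque()   # timestamps of spawns still inside the window

    def try_acquire(self):
        """Return True if a new child may be forked right now."""
        now = self.clock()
        # Forget spawns that have already left the window.
        while self.recent and now - self.recent[0] >= self.window_seconds:
            self.recent.popleft()
        if len(self.recent) >= self.max_spawns:
            return False        # over budget: the caller should back off
        self.recent.append(now)
        return True
```

A master loop would call `try_acquire()` before each `os.fork()` and sleep or log an alert when it returns `False`, instead of forking again immediately.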
DO NOT
Underestimate Mighty Comma

Roll out everywhere at once

Leave your CI builds messy
read more
“A cascade of errors”
http://blog.fastmail.fm/2011/05/15/outage-report-a-cascade-of-errors/
2
magic numbers
random failures in our infrastructure

WTF?!?

physical bladecenters? LVS? network? kernel? solar storms? defective cpus?
DDoS? Mayas? bnx2? traffic? recent deploys?
what we experienced

random performance degradation
general instability
steady increase of WTFs/min!
real problem
● 2.6.32 = debian squeeze kernel
● sched – find_busiest_group()
● TSC register wraparound
Proof

$$\frac{2^{64}}{2^{10} \cdot 86400 \cdot 10^{9}} \approx 208.49 \text{ days}$$
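
The figure in the proof can be checked with a few lines of arithmetic. The sketch below assumes the reading suggested by the patch that follows: a 64-bit intermediate product, a scale shift of 10 bits (`CYC2NS_SCALE_FACTOR`), and time counted in nanoseconds. The function name is hypothetical.

```python
SECONDS_PER_DAY = 86400
NS_PER_SECOND = 10**9
CYC2NS_SCALE_FACTOR = 10  # shift used in the cycles-to-nanoseconds conversion


def days_until_overflow(bits, ticks_per_second, scale_shift=0):
    """Days until a counter of `bits` bits overflows.

    `scale_shift` accounts for a value that is kept multiplied by
    2**scale_shift before being shifted back down.
    """
    return 2**bits / (2**scale_shift * SECONDS_PER_DAY * ticks_per_second)


uptime_limit = days_until_overflow(64, NS_PER_SECOND, CYC2NS_SCALE_FACTOR)
print(f"{uptime_limit:.4f} days")
```

The result sits just under 208.5 days, which matches the uptime at which the random failures started to appear.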
Subject:    [PATCH] sched: avoid unnecessary overflow in sched_clock
From:       Salman Qazi <sqazi@google.com>
Date:       2011-11-16 20:55:31

In hundreds of days, the __cycles_2_ns calculation in sched_clock
has an overflow. cyc * per_cpu(cyc2ns, cpu) exceeds 64 bits, causing
the final value to become zero. We can solve this without losing
any precision.

We can decompose TSC into quotient and remainder of division by the
scale factor, and then use this to convert TSC into nanoseconds.

Reviewed-by: Paul Turner <pjt@google.com>
Acked-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Salman Qazi <sqazi@google.com>
---
 arch/x86/include/asm/timer.h |   23 ++++++++++++++++++++++-
 1 files changed, 22 insertions(+), 1 deletions(-)

                                   Patch #1, Nov 16th 2011
diff --git a/arch/x86/include/asm/timer.h b/arch/x86/include/asm/timer.h
index fa7b917..431793e 100644
--- a/arch/x86/include/asm/timer.h
+++ b/arch/x86/include/asm/timer.h
@@ -32,6 +32,22 @@ extern int no_timer_check;
  * (mathieu.desnoyers@polymtl.ca)
  *
  *        -johnstul@us.ibm.com "math is hard, lets go shopping!"
                                   Patch #2, Mar 8th 2012
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -608,6 +608,8 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, ...)
 {
    unsigned long long tsc_now, ns_now, *offset;
    unsigned long flags, *scale;
+   unsigned long long quot;
+   unsigned long long rem;
    local_irq_save(flags);
    sched_clock_idle_sleep_event();
@@ -620,7 +622,15 @@ static void set_cyc2ns_scale(unsigned long cpu_khz, ...)

    if (cpu_khz) {
        *scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR)/cpu_khz;
-       *offset = ns_now - (tsc_now * *scale >> CYC2NS_SCALE_FACTOR);
+
+       /*
+        * Avoid premature overflow by splitting into quotient
+        * and remainder. See the comment above __cycles_2_ns
+        */
+       quot = (tsc_now >> CYC2NS_SCALE_FACTOR);
+       rem = tsc_now & ((1ULL << CYC2NS_SCALE_FACTOR) - 1);
+       *offset = ns_now - (quot * *scale +
+                  ((rem * *scale) >> CYC2NS_SCALE_FACTOR));
    }
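
The quotient/remainder trick from the patch can be reproduced outside the kernel. This is a sketch in Python that emulates 64-bit wraparound with a mask; the 2 GHz clock rate is an assumed example value, and the function names are hypothetical rather than kernel identifiers.

```python
MASK64 = (1 << 64) - 1
CYC2NS_SCALE_FACTOR = 10
NSEC_PER_MSEC = 1_000_000


def scale_for(cpu_khz):
    """Same formula as the *scale assignment in the patch."""
    return (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR) // cpu_khz


def cycles_to_ns_naive(cyc, scale):
    """Pre-patch form: the 64-bit product wraps before the shift."""
    return ((cyc * scale) & MASK64) >> CYC2NS_SCALE_FACTOR


def cycles_to_ns_split(cyc, scale):
    """Patched form: split cyc into quotient and remainder first."""
    quot = cyc >> CYC2NS_SCALE_FACTOR
    rem = cyc & ((1 << CYC2NS_SCALE_FACTOR) - 1)
    return ((quot * scale) & MASK64) + ((rem * scale) >> CYC2NS_SCALE_FACTOR)


cpu_khz = 2_000_000                       # assumed 2 GHz CPU
scale = scale_for(cpu_khz)
cyc = 209 * 86400 * cpu_khz * 1000        # TSC value after 209 days of uptime

print("exact:", (cyc * scale) >> CYC2NS_SCALE_FACTOR)
print("naive:", cycles_to_ns_naive(cyc, scale))
print("split:", cycles_to_ns_split(cyc, scale))
```

The split form gives the same result as exact arithmetic because `quot * 2**10 + rem` equals `cyc`, so no precision is lost; the naive form wraps once the product exceeds 64 bits.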
$$\frac{2^{32}}{86400 \cdot 10^{3}} \approx 49.7 \text{ days}$$
sven's explanation video
DO
 Be persistent and creative :)

 Learn more about your kernel

 Improve tools to collect data
DO NOT
 Run servers continuously
 for more than 208 days?
3
#Leapocalypse
23:59:60
T - 4y 2m
From: Roman Zippel <zippel@linux-m68k.org>
Date: Thu, 1 May 2008 04:34:41 -0700
Subject: [PATCH] ntp: handle leap second via timer

Remove the leap second handling from second_overflow(), which doesn't have to
check for it every second anymore. With CONFIG_NO_HZ this also makes sure the
leap second is handled close to the full second. Additionally this makes it
possible to abort a leap second properly by resetting the STA_INS/STA_DEL status bits.

Signed-off-by: Roman Zippel <zippel@linux-m68k.org>
Cc: john stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 include/linux/clocksource.h |    2 +
 include/linux/timex.h       |    1 +
 kernel/time/ntp.c           | 133 +++++++++++++++++++++++++++++--------------
 kernel/time/timekeeping.c   |    4 +-
T – 9m
$$\mathrm{lie}(t) = \frac{1 - \cos(\pi t / w)}{2}$$

[Plot: lie (s) against t]
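
The smear function on this slide is easy to evaluate directly. A minimal sketch, assuming `t` is the time elapsed since the start of the smear window and `w` is the window length in the same unit; that reading is inferred from the shape of the formula and is not stated on the slide.

```python
import math


def lie(t, w):
    """Portion of the leap second (0.0 to 1.0) applied after t out of w time units."""
    return (1.0 - math.cos(math.pi * t / w)) / 2.0


window = 1000.0  # hypothetical window length, in seconds
for elapsed in (0.0, 250.0, 500.0, 750.0, 1000.0):
    print(f"t={elapsed:7.1f}  lie={lie(elapsed, window):.4f}")
```

The cosine shape starts and ends with zero slope, so the clock adjustment ramps in and out gradually instead of jumping by a full second.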
T - 6m




http://bit.ly/NmA47E

http://my.opera.com/marcomarongiu/blog/index.dml/tag/ntp
T – 1 month
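# Packages used for the leap second adjustment.
package {
    "ntpdate": ensure => installed;
    "adjtimex": ensure => installed;
}

# Helper script shipped from the ntp module.
file { "/usr/local/bin/leap-adjust.pl":
    ensure => present,
    source => "puppet:///modules/ntp/leap-adjust.pl",
}

# Cron fragment shipped from the ntp module.
# Package["ntp"] is not declared on this slide; it is assumed to be
# declared elsewhere in the ntp module.
file { "/etc/cron.d/ntp-leap-second":
    ensure  => present,
    source  => "puppet:///modules/ntp/leap-crontab",
    require => [ Package["ntp"], Package["adjtimex"] ],
}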
T - 2d


June
29th
T – 1 day
  June 30th 2012



chaos begins
T - 8h




http://bit.ly/PSBMRP

http://serverfault.com/questions/403732/leapocalypse
the workaround

 # date -s now
T + {1,2}m
 {August,September} 1st, 2012



fake leap seconds
read more
A story of leaping seconds
   http://blog.fastmail.fm/2012/07/03/a-story-of-leaping-seconds/


Tips and tricks to deal with leap seconds
   http://my.opera.com/marcomarongiu/blog/index.dml/tag/ntp



Serverfault question on random debian crashes
   http://serverfault.com/questions/403732/leapocalypse

Wired article about leap second problems
    http://www.wired.com/wiredenterprise/2012/07/leap-second-bug-wreaks-havoc-with-java-linux/
DO
Keep your kernel updated

Use valuable external resources
(serverfault etc...)
DO NOT
Underestimate the
importance of time
¿questions?
failure lessons learned

  expect
  assume
  prepare
  simulate   }  failure
  measure
  embrace
ops lessons learned
Don't repeat yourself (DRY)
Always keep it simple (KISS)
A separate ops team doesn't work well
Practice continuous deployment. Now.
Communication makes the difference
Learn your tools
Master your infrastructure
RTFM
...
Thanks!

@cstrep
cosimo@opera.com

Velocity 2012 - Learning WebOps the Hard Way
