Peter holditch devops

Realising the true
value of DevOps
The DevOps Payrise

Peter Holditch
Senior Sales Engineer
@pholditch

Developers working
together with
Operations to get
things done faster in an
automated and
repeatable way

Typical Dev Day
1. Look at the overnight integration tests
2. Buy chocolates for the team if you broke the build
3. Scramble to fix the build
4. Pick the top priority item from your backlog
5. Start coding
6. Get dragged into troubleshooting prod. incidents
7. Hastily check in new code as you ran out of time

What do developers care
about?
Learn
Eat Pizza Innovate

What does development really
care about?

What did the Business
care about?
£

Features = £
Even though the business never measured it.

OPS:
“Everything is fine
from our end.”

Typical Ops Day
1. Open 30 new tickets
2. Make 200 phone calls
3. Attend executive P1 status update meeting
4. Argue about what a P1 and a P2 really are
5. Reprioritise P2 tickets to P1
6. Reprioritise P3 tickets to P2
7. Close tickets as ‘Cannot reproduce’ or ‘Duplicate’

What do operators care
about?

What does operations really
care about?
P1s
SLAs

P1 = £
Even though the business could never prove it.

How the Business often
view dev & ops

How L2 & L3 Support
often view dev & ops

False Alarms
Site is
down
404 Errors
My search
is slow

2am Friday - #FFS
We have had an
alert that the load on
one of your staging
servers is critical.

How much time do false
alarms waste?
Role      Hours Per Week   Cost Per Week   Cost Per Year
Ops       20               £400            £20,800
L2        10               £200            £10,400
L3        15               £300            £15,600
Hosting   6                £120            £6,240
Network   6                £120            £6,240
CMS       10               £200            £10,400
Total     67               £1,340          £69,680
Conservative estimates assuming £20/hour
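
The arithmetic behind these figures can be reproduced with a minimal sketch. It assumes the slide's £20/hour rate and a 52-week year (the yearly column is the weekly column times 52); the function name `false_alarm_cost` is hypothetical. The per-role hours listed above sum to 67, which is the figure that matches the £1,340 weekly total.

```python
# Hours per week lost to false alarms, per role (figures from the slide).
HOURS_PER_WEEK = {
    "Ops": 20,
    "L2": 10,
    "L3": 15,
    "Hosting": 6,
    "Network": 6,
    "CMS": 10,
}
HOURLY_RATE_GBP = 20   # the slide's conservative assumption
WEEKS_PER_YEAR = 52    # assumption implied by the yearly figures


def false_alarm_cost(hours_per_week, rate=HOURLY_RATE_GBP, weeks=WEEKS_PER_YEAR):
    """Return (total hours/week, cost/week, cost/year) for the given roles."""
    total_hours = sum(hours_per_week.values())
    weekly_cost = total_hours * rate
    return total_hours, weekly_cost, weekly_cost * weeks


hours, weekly, yearly = false_alarm_cost(HOURS_PER_WEEK)
print(f"{hours} hours/week, GBP {weekly:,}/week, GBP {yearly:,}/year")
```

Changing the rate or the per-role hours to your own numbers gives a quick baseline for the cost of false alarms in your organisation.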

How much revenue did the
business lose?
No
idea

Typical Day
1. Open 30 new tickets
2. Make 300 phone calls
3. Attend executive P1 status update meeting
4. Argue about what a P1 and a P2 really are
5. Reprioritise P2 tickets to P1
6. Reprioritise P3 tickets to P2
7. Close tickets as ‘Cannot reproduce’ or ‘Duplicate’
1. Look at the overnight integration tests
2. Buy chocolates for the team if you broke the build
3. Scramble to fix the build
4. Pick the top priority item from your backlog
5. Start coding
6. Get dragged into troubleshooting prod. incidents
7. Hastily check in new code as you ran out of time

Things that would help
1. Automation
2. Collaboration
3. Better Tooling
4. Business Metrics

Things that could justify
them
1. Baseline the starting point
2. Measure progress
3. Calculate Business Impact
4. Promote success not problems
5. Demonstrate value

Modern-day User
Expectations…

3 billion
daily transactions
250
milliseconds
500+
updates/yr
Spot the App…

1 million+ servers
100 million GB
1,000 man years
1,500 miles
Konstantin Karpov
Users Expectations

Internet -> Firewall -> Load Balancer -> Web server 1 / Web server 2 -> Database

Pre-Production APM – “Non-Production Data”
Pre-Production Production
Dev Test Staging Live
Profile QA Load Test Monitor & Manage
Development Operations

Production APM – “Production Data”
Pre-Production Production
Dev Test Staging Live
Monitor & Manage
Profile QA Load Test
Development Operations

right tools
right hands
right use

INFRASTRUCTURE AUTOMATION
How much time and £
do these tools save?

DEPLOYMENT AUTOMATION

LOG AUTOMATION
LogStash

Monitoring
How much time and $

PLAN FOR FAILURE!
be stronger than the weakest link

Traditional monitoring approach is limited
END USER EXPERIENCE
BUSINESS TRANSACTION
APPLICATION
Server
OS
DB
MQ
Web
JVM
EXPANDED
APPROACH
Business transaction
EXISTING
APPROACH
Siloed domain visibility
99.9% 99.9% 99.9% 99.9%
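
One way to read the four 99.9% figures: each silo can look healthy on its own while the business transaction that crosses all of them does worse. A minimal sketch, assuming the four tiers sit in series and fail independently (both are simplifying assumptions, and `end_to_end_availability` is a hypothetical helper):

```python
from math import prod

MINUTES_PER_YEAR = 365 * 24 * 60


def end_to_end_availability(tier_availabilities):
    """Availability of a chain in which every tier must be up.

    Assumes tiers fail independently, which is a simplification.
    """
    return prod(tier_availabilities)


tiers = [0.999, 0.999, 0.999, 0.999]  # four silos, each reporting 99.9%
overall = end_to_end_availability(tiers)
downtime_minutes = (1 - overall) * MINUTES_PER_YEAR
print(f"end-to-end availability: {overall:.4%}")
print(f"expected downtime: {downtime_minutes:.0f} minutes per year")
```

Under these assumptions the transaction as a whole is available roughly 99.6% of the time, even though every individual dashboard shows 99.9%.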

How many of you
use performance
management tools?

Identify early
!
Troubleshoot fast
!
Resolve quickly
!
Quantify impact
x

IT ENVIRONMENT
1200
servers
300,000
trans/min
MONITORING ENVIRONMENT
700 cores
92 servers
80TB storage
8%

smart data
actionable, intelligent, information

IS THIS PERSON PERFORMING WELL?
Blood pressure: 165/100
Heart rate: 150bpm

are we talking about this person?

What data could we collect?
Attribute                Person 1   Person 2
Heart Rate               150        150
Blood Pressure           180/90     180/90
Eye Color                Blue       Brown
Blood Type               O+         O-
White Blood Cell Count   3.5        3.8
Hair Color               Brown      Blue
Height                   180cm      175cm
Shoe size                11         10
Weight                   180kg      94kg
Current activity         sitting    skating

IS PERSON 2 PERFORMING WELL?
Time
Distance
10,000 metres!
Record time: 12min 58sec
12min 44sec!
baseline

New Olympic Record
Jorrit Bergsma 10,000m winner

average response time with historical baseline
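
A minimal sketch of what comparing against a historical baseline could look like. It is not how any particular monitoring product works: the function `is_anomalous`, the three-standard-deviation threshold and the sample data are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def is_anomalous(history_ms, current_ms, threshold=3.0):
    """Flag current_ms if it is more than `threshold` standard deviations
    above the mean of history_ms (the historical baseline)."""
    baseline = mean(history_ms)
    spread = stdev(history_ms)
    return current_ms > baseline + threshold * spread


# Hypothetical response times (ms) for the same hour on previous days.
history = [240, 250, 260, 245, 255, 250, 248]
print(is_anomalous(history, 252))   # within the normal range
print(is_anomalous(history, 400))   # well above the baseline
```

As with the skater, the raw number means little on its own; it only becomes a signal once it is compared with what is normal for that application at that time.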

monitoring platforms should do the heavy lifting
User & IT perspective
Analytics
Correlation
Intelligent alerting
Resolution path

plan ahead
anticipate needs
intended purpose

And remember: Monitoring is not all traffic lights…

Understand the impact of slow performance
* Screenshot from US e-Commerce AppDynamics Customer
Application Revenue: $64,499 per min, $11,987 per min
Application Response time: 100 ms, 10.1 s
Application Errors
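
A rough sketch of how figures like these could be turned into a business impact number. It assumes $64,499 per minute is the normal rate and $11,987 per minute the rate during the slowdown (one reading of the screenshot), and that the whole drop is caused by the slowdown; the 30-minute duration and the function name `revenue_lost` are hypothetical.

```python
def revenue_lost(normal_per_min, degraded_per_min, minutes):
    """Estimate revenue lost while the application runs degraded.

    Assumes the whole drop in revenue per minute is caused by the slowdown.
    """
    return (normal_per_min - degraded_per_min) * minutes


# Per-minute figures from the slide; the 30-minute duration is hypothetical.
loss = revenue_lost(normal_per_min=64_499, degraded_per_min=11_987, minutes=30)
print(f"estimated loss: ${loss:,}")
```

The same calculation with the sign reversed gives the uplift from a release that improves revenue per minute.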

Understand the benefit of an application release
Application Revenue: $44,499 per min, $58,237 per min
Application Response time: 1.9 s, 3.1 s
Markers: code release 1, code release 2, code release 3
