3. Who’s Jason?
Dad, downhill + enduro mountain biker 🚵, music lover 👨🎤🤘, dog owner 🐶, geek 🤘, site reliability engineering, STEM parent, @Concur
Expertise
● DevOps Champion
● Team Building
● Leveling Up
● Recruiting
Follow me @JasonGrimes
4. My Background
● My first computer; C64
● Decade of on-premise datacenter experience
● Transitioned to DevOps + SRE 2015
● STEM Parent
● Growth Mindset
● Open source everything
@jasongrimes
5. What Will You Walk Away With….
● DevOps Journey @Concur
● Commitment, People and Journey
● Building Great Teams Not Toolsets
● Measuring Customer Experience
● Leveling Up
● The Automators
7. In the beginning, before there was DevOps
As told to you by the cell phones I carried.
              Early 2000s         2006                    2013                       Future
SDLC          Waterfall           Mixed/Agile             Agile                      Agile
Releases      9-18 months         4 months                1 month                    Continuous
Architecture  On Premise Service  Monolithic Hybrid       Microservices Aspirations  Microservices
Ops Model     IT -> Ops           Centralized hosted Ops  DevOps                     Embedded DevOps
8. Early 2000s
● SDLC: Waterfall releases
● Releases: Every 9-18 months
● Hosting: On premise service
● Ops Model: IT -> Ops
Reliable, repeatable, results over time - David Gedye (2000)
9. 2006
● SDLC: Mixed/Agile
● Releases: Every 4 months
● Hosting: Monolithic Hybrid
● Ops Model: Centralized Hosted Ops
You build it, you run it - Werner Vogels (2006)
11. 2013
● SDLC: Agile
● Releases: Every month
● Hosting: Moving to Microservices
● Ops Model: DevOps
Meet people where they are - Nell Shamrell (2016)
12. In the future...
● SDLC: Agile
● Releases: Continuous
● Hosting: Microservices
● Ops Model: Embedded DevOps
Everyone should do everything - Alice Goldfuss (2017)
20. Measuring the Customer Experience
● 9’s don’t matter if customers are unhappy
● 20 years of alert and monitoring bias
Metrics to Observe
● Latency
● Requests per second
● Errors
● Saturation
21. Minimum Viable Ops
● Measure: Apdex, Disk, CPU/Load, Memory
● Code is the source of truth
● Alerts are codified
22. Leveling Up Strategy
● Sharing and exchanging ideas
● Bring in learning from the field
● WIKI for the future
● Local meetups matter
● Conferences
24. Takeaways
Deck - http://bit.ly/DevOpsIRL
● Show up
● Attitude matters
● Team first, no rockstars
● Learn from failure
● Capacity to continue learning
● Be Active: GitHub, LinkedIn, Twitter
25. Follow-up
I think I’ve done enough talking.
Email, DM me or if you must LinkedIn.
I would love to connect.
Email: jason.grimes@gmail.com
Follow me @JasonGrimes
Editor's notes
Audience: Automators, Engineers, Builders and Janitors
I’m Jason Grimes
I’m a site reliability engineering expert at Concur on the Cloud Services Team in Bellevue, WA, just outside of Seattle in the States.
Today I will be discussing Concur’s Digital Transformation from Monolith to Microservices; DevOps IRL 🐼
Concur has recently gone through this digital transformation of the business, from a monolithic application with 2 major releases per year to 700+ developers who are committed to owning their own code in production and owning performance end to end. Product teams now have ownership from design through production release and operations.
As Concur SRE we are here to put the right talent, practices, strategies, change control and ultimately global teamwork together to run today's high availability production operations business environment.
If you were to take away 1 thing from today’s presentation -> DevOps is about Sharing.
I would say each of you likely share this belief as you are spending time LEVELING UP today at DevOpsDays.
To solve today’s infrastructure and software problems we need an inclusive community.
TheITSkeptic (2017) Tech is a closed solved problem. Cool and fun, but done.
Complex systems, culture, and people are the hard problems: open, not necessarily ever solvable.
I tweet, share on LinkedIn, am a big fan of GIFs as teaching moments, post occasional Snaps, but Insta has my heart. I’m into art, media, music and yes, communication. I’m a high-energy Generation Xer who spent almost 15 years in the industry before I knew what my gift was. I was always good at Operations, but this new movement, DevOps + SRE, speaks to my core: the radical, evangelical change that companies like Concur need as they transition from the monolith to microservices and true ownership of end-to-end performance.
Our secret at Concur is the team and our people. SRE is at the center of supporting this new generation of microservices, automation, higher standards for reliability and reduced latency. In today’s presentation I’ll cover how we have transitioned DevOps teams in one of the largest SaaS companies in the world.
First computer I programmed on was a C64 with a Tape Drive so I could play Frogger.
I have a decade of on-prem datacenter experience - that and a couple of dollars will get me a latte.
Transitioned in 2015
I’m convinced there are multiple winners in SRE/DEVOPS; it’s a huge GREENFIELD of opportunity as we march towards automation
I’m committed to Open Sourcing Everything, my approach and always accepting feedback
Why are you here? What will you walk away with?
Mention EACH ONE OF THESE. READ IT if you have to.
How many folks here are familiar with Concur?
Concur allows you to see your travel, expenses, and invoice-driven spending clearly throughout your whole organization → CONCUR is the INDUSTRY STANDARD
So how did we get here….what was our journey and how is it different from your own?
This is a high-level look at the DevOps journey Concur and many other orgs have taken in their relentless pursuit of Microservices.
For context, I also show the mobile phone I carried during each stage of this journey, starting with a BlackBerry - who else had a BlackBerry? Who are my people?
How many folks here are running microservices or moving to microservices?
The next few slides I will be walking you through this DevOps journey and you will be accompanied by some helpful quotes from some DevOps leaders who have influenced me.
David Gedye was my original mentor when I was fresh into the industry.
Before meeting David, I had served time as an Intern and Full Time employee at Microsoft. David would introduce me to the world of Technical Operations at Apex Learning.
We supported 250,000 students taking courseware online in the early 2000s.
That’s right 9-18 months to deliver a new version of our Online Learning Management Suite. Too much risk. These were waterfall releases and often involved adjusting the ship date. And if you missed customer expectations - you missed big!
Anyone remember what was introduced in Spring 2006? Werner Vogels of Amazon fame would help launch Amazon Web Services and with one simple quote we had our marching orders for the next decade - YOU BUILD IT, YOU RUN IT
Back in 2006 I was Group Director for Jobster.com -> a $42M SaaS employment startup with the charter of killing the resume and slaying the LinkedIn drago...despite our best efforts we would fail miserably. We pivoted too often, focused on bleeding-edge new features and forgot to solve the problem for the enterprise.
It was a challenging time and my world was about to change again with the introduction of VMware 3.5. We were virtualizing for the very * FIRST TIME! *
We set out to remove physical machines and put more and more machines in VMs
CA(L)MS Model
CA(L)MS is an acronym describing the core values of the DevOps Movement: Culture, Automation, (Lean), Measurement, and Sharing.
Coined by Damon Edwards and John Willis at DevOpsDays Mountain View 2010; Jez Humble later added the L
Culture
DevOps is mostly about breaking down barriers between teams. An enormous amount of time is wasted with tickets sitting in queues, or individuals writing handoff documentation for the person sitting right next to them. In pathological organizations it is unsafe to ask other people questions or to look for help outside of official channels. In healthy organizations, such behavior is rewarded and supported with inquiry into why existing processes fail. Fostering a safe environment for innovation and productivity is a key challenge for leadership and directly opposes our tribal managerial instincts.
Automation
Perhaps the most visible aspect of DevOps. Many people focus on the productivity gains (output per worker per hour) as the main reason to adopt DevOps. But automation is used not just to save time, but also prevent defects, create consistency, and enable self-service.
Lean
Added by Jez Humble
Measurement
How can you have continuous improvement without the ability to measure improvement? How do you know if an automation task is worthwhile? Basing decisions on data, rather than instinct, leads to an objective, blameless path of improvement. Data should be transparent, accessible to all, meaningful, and able to be visualized in an ad hoc manner.
Sharing
Key to the success of DevOps at any organization is sharing the tools, discoveries, and lessons. By finding people with similar needs across the organization, new opportunities to collaborate can be discovered, duplicate work can be eliminated, and a powerful sense of engagement can be created among the staff. Outside the organization, sharing tools and code with people in the community helps get new features implemented in open source software quickly. Conference participation leaves staff feeling energized and informed about new ways to innovate.
In 2013 Concur began their DevOps journey following many of the tenets set forth by the Google SRE program
In 2016 I was sitting out where you are at DevOpsDays Seattle when I met Nell. She was speaking about the intersection of politics and technology and she imparted a very valuable lesson to us that day - LET GO OF BEING RIGHT and MEET PEOPLE WHERE THEY ARE
Wait a minute, you mean, I’m not right? (pause)
So what does the future look like?
I’m heavily influenced by many of the automators mentioned today, but Alice Goldfuss speaks to me. She spoke at DockerCon last month, giving her talk “Rockstars, Builders and Janitors: you are doing it wrong”, and she added an epic quote….
EVERYONE should do EVERYTHING! EXACTLY!
In fact, Alice would say that you need to EMPOWER your teams to never get paged twice for the same event….they will be more flexible,
I have described the journey and discussed some important tips from my mentors, but how exactly do we build Great Teams, NOT Toolsets?
To provide a truly high availability service you need a great team. All the tools, processes and tech in place won’t operate if the team doesn’t have clear direction and focus.
This is a picture of a US basketball team, the World Champion Golden State Warriors.
While each player has a role, they all play team defense and move the ball better than any team in the modern NBA.
Each player is bringing their experience, perspectives and skill. Many of them overlap and complement each other.
We want to build our DevOps teams in the same way.
Committed to a set of values
Each person knows the business and their role
And when you work together you can accomplish almost anything
So if Alice is right and everyone does everything...what does EVERYTHING look like?
9 Keys to End to End Ownership
Design
Code
Build
Test
Secure
Package
Release
Configure
Run
In a growth mindset, people believe that their most basic abilities can be developed through dedication and hard work—brains and talent are just the starting point. This view creates a love of learning and a resilience that is essential for great accomplishment
Transformational leaders share five common characteristics that significantly shape an organization's culture and practices, leading to high performance. The characteristics of transformational leadership — vision, inspirational communication, intellectual stimulation, supportive leadership, and personal recognition — are highly correlated with IT performance. High-performing teams have leaders with the strongest behaviors across these dimensions. Low-performing teams reported the lowest levels of these traits. Teams that reported the least transformative leaders were half as likely to be high performers.
• Establishing and supporting generative and high-trust cultural norms.
• Implementing technologies and processes that enable developer productivity, reducing code deployment lead times and supporting more reliable infrastructures.
• Supporting team experimentation and innovation, to create and implement better products faster.
• Working across organizational silos to achieve strategic alignment.
The model of transformational leadership used here includes five dimensions (Rafferty and Griffin 2004). According to this model, the five characteristics of a transformational leader are:
• Vision. Has a clear concept of where the organization is going and where it should be in five years.
• Inspirational communication. Communicates in a way that inspires and motivates, even in an uncertain or changing environment.
• Intellectual stimulation. Challenges followers to think about problems in new ways.
• Supportive leadership. Demonstrates care and consideration of followers’ personal needs and feelings.
• Personal recognition. Praises and acknowledges achievement of goals and improvements in work quality; personally compliments others when they do outstanding work.
Rafferty, A. E., & Griffin, M. A. (2004). Dimensions of transformational leadership: Conceptual and empirical extensions. The Leadership Quarterly, 15(3), 329-354.
While we are enabling END to END we must remember that Culture is not a toolset.
We need to cultivate a culture
SRE is truly a global sport. That I traveled here all the way from Seattle in the States tells you that our mission is global, and so are the partners who make it happen.
We couldn’t do it without our geographically distributed teams. My sister team is in Prague, almost 12 hours’ difference.
Pager fatigue is real; to address it, it’s best to have a solid rotation of every 4-8 weeks.
On call coverage is 12 hour shifts in your timezone.
We share Slack, JIRA, Exchange, Zoom and other tech to keep us connected.
Focus on what matters
For Concur this means aligning our practices with customer satisfaction and Net Promoter Score
As a Cloud Services Team we are also responsible for P1s and our Crisis Response
Where possible, CODIFY everything - for example, we use New Relic Monitoring and we take advantage of NRQL
And the number one thing affecting employee happiness is the CI/CD pipeline
Devs wanna Dev
PMs wanna ship
Account Execs want features in the hands of customers
I also want to focus on Minimum Viable Ops
AUTOMATE EVERYTHING
Using Modern Cloud Toolsets to bring CHATOPS to the enterprise.
We want to bring the operations experience inside our chatroom #SLACK - where people are...
GitHub
PagerDuty
Outside-In Monitoring
JIRA
New Relic Notifications
2 Types of Data we store….
Application Performance Monitoring (APM)
Real User Monitoring (RUM)
Customer Experience is all that matters...blinking lights are neat, but don’t keep the bills paid. Net Promoter Score Matters
9’s DON’T MATTER if Customers are unhappy - I took this quote from Charity Majors’ shirt
Charity Majors has some excellent words of wisdom that I’d like to share.
Latency
Requests per second
Errors
Saturation
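To make these four metrics concrete, here is a minimal sketch of deriving them from a window of request records. The `Request` record, the `summarize` helper and the worker-pool view of saturation are illustrative assumptions, not Concur's tooling or any vendor's API.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical request record; field names are assumptions."""
    timestamp_s: float
    duration_ms: float
    status: int


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(requests, window_s, busy_workers, total_workers):
    """Return latency, requests per second, error rate and saturation."""
    if not requests:
        raise ValueError("need at least one request in the window")
    durations = [r.duration_ms for r in requests]
    errors = sum(1 for r in requests if r.status >= 500)
    return {
        "latency_p50_ms": percentile(durations, 50),
        "latency_p99_ms": percentile(durations, 99),
        "requests_per_second": len(requests) / window_s,
        "error_rate": errors / len(requests),
        # Saturation here is modeled as the share of workers in use.
        "saturation": busy_workers / total_workers,
    }


window = [
    Request(0.0, 100.0, 200),
    Request(1.0, 200.0, 200),
    Request(2.0, 300.0, 500),
    Request(3.0, 400.0, 200),
]
summary = summarize(window, window_s=4.0, busy_workers=3, total_workers=4)
```

Percentiles are used for latency instead of an average because a handful of slow requests is exactly what an unhappy customer experiences.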
Monitoring; We are doing it wrong. 20+ year bias in how we solve problems.
The future is
4 metrics to observability
Canaries in production with roll out migrations and rollback. That is always the use case.
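One way to picture the canary use case is a small promote-or-rollback check that compares the canary's error rate against the current baseline. This is a sketch under stated assumptions: the function name, the `max_ratio`, `min_requests` and `floor` parameters and their default values are hypothetical, not a description of Concur's pipeline.

```python
def canary_decision(baseline_errors, baseline_total,
                    canary_errors, canary_total,
                    max_ratio=2.0, min_requests=100, floor=0.001):
    """Return "wait", "promote" or "rollback" for a canary release.

    All thresholds are illustrative assumptions.
    """
    # Not enough canary traffic yet to judge either way.
    if canary_total < min_requests:
        return "wait"
    baseline_rate = baseline_errors / baseline_total if baseline_total else 0.0
    canary_rate = canary_errors / canary_total
    # Tolerate up to max_ratio times the baseline error rate, with a small
    # absolute floor so a spotless baseline does not fail on one error.
    allowed = max(baseline_rate * max_ratio, floor)
    return "rollback" if canary_rate > allowed else "promote"


healthy = canary_decision(10, 10_000, 1, 1_000)
broken = canary_decision(10, 10_000, 50, 1_000)
too_early = canary_decision(10, 10_000, 0, 10)
```

The same check can gate each step of a gradual rollout, so a rollback happens before most customers ever see the bad build.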
Also Known AS - Batteries Included
Enabling Operational Maturity from Day 1
Combine Technical + ChatOps (Be specific)
Define Apdex and why it’s critical. Other common counters: Disk, Load, Memory
Apdex is a measure of response time based against a set threshold. It measures the ratio of satisfactory response times to unsatisfactory response times. The response time is measured from an asset request to completed delivery back to the requestor.
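As a worked example, the standard Apdex formula counts responses at or under the threshold T as satisfied, responses between T and 4T as tolerating (worth half), and anything slower as frustrated. The sketch below assumes response times in seconds; the sample values and the 0.5 second threshold are made up for illustration.

```python
def apdex(response_times_s, threshold_s):
    """Apdex score: (satisfied + tolerating / 2) / total samples."""
    if not response_times_s:
        raise ValueError("need at least one sample")
    satisfied = sum(1 for t in response_times_s if t <= threshold_s)
    tolerating = sum(
        1 for t in response_times_s if threshold_s < t <= 4 * threshold_s
    )
    # Frustrated samples (slower than 4T) contribute nothing to the score.
    return (satisfied + tolerating / 2) / len(response_times_s)


# Illustrative samples: 2 satisfied, 2 tolerating, 1 frustrated.
samples = [0.2, 0.4, 0.9, 1.5, 3.0]
score = apdex(samples, threshold_s=0.5)  # (2 + 2 / 2) / 5 = 0.6
```

A score of 1.0 means every request was satisfactory; the score drops toward 0 as more requests cross the 4T line.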
All alerts are codified.
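A minimal sketch of what codified alerts can look like: rules live in version control as data and are evaluated by code. The `AlertRule` type, metric names and thresholds below are hypothetical and do not represent New Relic's API or Concur's actual rules.

```python
import operator
from dataclasses import dataclass

# Supported comparisons for a rule.
_COMPARISONS = {"above": operator.gt, "below": operator.lt}


@dataclass(frozen=True)
class AlertRule:
    """One alert definition, reviewed and versioned like any other code."""
    name: str
    metric: str
    comparison: str  # "above" or "below"
    threshold: float

    def fires(self, metrics):
        value = metrics.get(self.metric)
        if value is None:
            return False  # metric missing: this sketch stays quiet
        return _COMPARISONS[self.comparison](value, self.threshold)


# Hypothetical Minimum Viable Ops rule set: Apdex, disk, CPU load, memory.
RULES = [
    AlertRule("Apdex too low", "apdex", "below", 0.85),
    AlertRule("Disk nearly full", "disk_used_pct", "above", 90.0),
    AlertRule("CPU load high", "cpu_load", "above", 8.0),
    AlertRule("Memory nearly full", "memory_used_pct", "above", 90.0),
]


def firing(rules, metrics):
    """Names of the rules that fire for the given metric snapshot."""
    return [rule.name for rule in rules if rule.fires(metrics)]


snapshot = {"apdex": 0.70, "disk_used_pct": 50.0,
            "cpu_load": 1.2, "memory_used_pct": 95.0}
active = firing(RULES, snapshot)
```

Because the rules are plain data, a new service can start with the same baseline set on day one and changes to thresholds go through code review.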
This project was led by an intern on my team to describe Batteries Included, or that feeling you get when you have what you need to be successful.
REPEAT MISSION - DevOps is Sharing ….
LOOK TO YOUR LEFT LOOK TO YOUR RIGHT MEET THE 3 PEOPLE ON EITHER SIDE
Hiring is not enough, you need your current talent to be LEVELING UP at all times.
Learning from the field can be some of the most impactful since …..COFFEE OPS STORY
For example: We’ve seen full toolset exchanges (say that better)
Wiki for the future - breadcrumbs and notes to your future self when you are troubleshooting
HOW MANY PEOPLE USE TWITTER? Don’t be afraid, I need to know what I’m up against! TWITTER IS A DATASOURCE….
I don’t know what inspires each of you, but comments from folks like these in the field I LIVE AND BREATHE and You can’t not help but
These four automators, especially Alice Goldfuss, were the reason I started speaking.
ALICE - She had me at “Builders, Janitors and Rock Stars: You are doing it wrong” - Spoke at DockerCon recently
KELSEY - Making K8s+DevOps+SRE accessible for the non-experts and next generation problem solvers, Author of Kubernetes the Hard Way.
JESSE - ORIGINAL MASTER OF DISASTER from Amazon, founded Opscode which became Chef.
JEFF - NERD DAY; DEVOPS JEDI - Saw at DevOpsDays Seattle
FIND THE VOICES YOU LISTEN TO AND SEEK THEM OUT
Changing the world with each Tweet and speaking engagement.
Show up - Each of you is doing this today by attending DevOpsDays
Attitude matters, be the SRE/DevOps Engineer that EVERYONE wants to work with
Welcome failure -- some of our most powerful lessons come from failure
It’s also critical to KEEP LEARNING
And if I leave you with anything - BE ACTIVE, SHARE -> Get involved on GitHub, LinkedIn and yes TWITTER
Please take a picture of this, would love to connect. - #DADOPS