Within the observability community, there’s a saying, “nines don’t matter if users aren’t happy,” meaning that 99.999% server uptime is a pointless goal if our customers aren’t having a fast, smooth, productive experience. But how do we know if users are happy? As members of the web performance community, we’ve been thinking about the best ways to answer that question for years. Now the observability community is asking the same questions, but coming at them from the opposite side of the stack. What can we learn from each other? Emily will talk about how approaching web performance through the lens of observability has changed the way her team thinks about performance instrumentation and optimization. She’ll cover the nuts & bolts of how Honeycomb instrumented its customer-facing web app, and she’ll show how the Honeycomb team is using this data to find and fix some of its trickiest performance issues, optimize customer productivity, and drive the design of new features.
6. [Map illustration: the communities drawn as islands (Site Reliability Engineering (SRE) Island, The Isle of Ops, Backend Engineer Island, Fullstackville, Frontend Island, Perf Island, The Isle of Browser Vendors, and the Expansive Domain of Designers & Product Managers), with Observability and Web Perf at opposite ends of the map]
7. Observability: An observable system is one whose internal state can be deeply understood just by observing its outputs.
13. Same deal with Web Perf & Observability folks
[Photos: a Web Perf practitioner and an Observability practitioner]
@solutionist | https://www.flickr.com/photos/solutionist/48227528782/
14. Observability: An observable system is one whose internal state can be deeply understood just by observing its outputs. Observability is a system property, just like performance.
16. Same deal with Web Perf & Observability folks
[Photos: a Web Perf practitioner and an Observability practitioner, annotated with traits both share]
@solutionist | https://www.flickr.com/photos/solutionist/48227528782/
• Worries about developer adoption
• Thinks these numbers look wrong
• Cares deeply about UX
• Not sure how to balance this, my real job, with what they think they pay me for
• Debating nine different ways to measure the same thing
• Lots of emotional energy going into some standards or spec process
• Obsessed with numbers
18. This talk
1. Talk about birds for five minutes
2. Data models
3. SLOs vs. performance budgets
4. Observability for perf optimization
5. Observability for UX design
21. Two communities, different superpowers
Web Perf:
• Sophisticated tooling, many tool types
• Amazing, mature developer experience
• Lots of experience improving the ecosystem through specs & new browser APIs
Observability:
• Focused on instrumentation best practices
• Goal is to enable answering any question about the state of your software
• Just starting on specs
26. High cardinality
Fields that may have many unique values
Common examples:
• email address
• username / user id / team id
• server hostname
• IP address
• user agent string
• build id
• request url
• feature flags / flag combinations
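To make "high cardinality" concrete, here is a minimal TypeScript sketch (the event shape and values are invented): a field's cardinality is just the number of distinct values it takes across your events.

// Hypothetical event shape; cardinality = number of distinct values.
type TelemetryEvent = Record<string, string | number>;

function cardinality(events: TelemetryEvent[], field: string): number {
  return new Set(events.map((e) => e[field])).size;
}

const events: TelemetryEvent[] = [
  { type: "page-load", user_id: "u-1042", status: 200 },
  { type: "page-load", user_id: "u-7731", status: 200 },
  { type: "page-load", user_id: "u-0009", status: 500 },
];

cardinality(events, "user_id"); // 3: grows with the user base (high cardinality)
cardinality(events, "status");  // 2: small, bounded set (low cardinality)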
27. What about the Three Pillars?
Logs, Metrics, Traces
207.46.1.2 - [03/Nov/2016:16:11:43 -0700] "GET /robots.txt HTTP/1.1"
29. Derive all Three Pillars from Events
1 (structured) log line ~= 1 event
metrics can be derived from events
traces = n events (spans) with parent/child relationships
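As a rough TypeScript sketch (the span shape here is an assumption, not Honeycomb's actual schema), deriving a metric and a trace from the same event stream might look like:

// Assumed span shape for illustration.
interface Span {
  id: string;
  parent_id?: string; // absent on the root span
  name: string;
  duration_ms: number;
}

// A metric is an aggregate over events, e.g. the median duration.
function medianDuration(events: Span[]): number {
  const sorted = events.map((e) => e.duration_ms).sort((a, b) => a - b);
  return sorted[Math.floor(sorted.length / 2)];
}

// A trace is the same events reassembled via parent/child links.
function childrenOf(events: Span[], parentId: string): Span[] {
  return events.filter((e) => e.parent_id === parentId);
}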
36. When we create events (spans)
• On page load
• On history state change (SPA navigation)
• On significant user actions
• On error (also send to error monitoring tools)
• On page unload
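In browser terms, those hooks might look roughly like this (sendEvent is a hypothetical helper that queues an event for delivery):

declare function sendEvent(type: string, fields?: Record<string, unknown>): void;

window.addEventListener("load", () => sendEvent("page-load"));

// popstate only fires on back/forward; SPA routers typically also wrap
// history.pushState to catch forward navigations.
window.addEventListener("popstate", () => sendEvent("spa-navigation"));

// Also forward errors to error monitoring tools.
window.addEventListener("error", (e) => sendEvent("error", { message: e.message }));

// pagehide fires more reliably than unload, especially on mobile.
window.addEventListener("pagehide", () => sendEvent("page-unload"));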
37. What's in an event?
{
  // For the page load event, collect information about the page
  "type": "page-load",
  "duration": 1278,
  "device_type": "tablet",
  "connection_type": "3g",
  "user_agent": "Mozilla/5.0 (Macintosh)…",
  // ...all feature flag states
  // ...all navigation timing measurements
}
{
  // For the button click, collect information about the interaction
  "type": "usage-mode-button-click",
  "duration": 28,
  "location": "dataset list",
  "animation_render_duration": 127
}
47. SLOs, SLIs, SLAs
For each facet of system performance (latency, errors, etc.) ask:
• SLI: Service Level Indicator — what do we measure?
○ response time of web app requests
• SLA: Service Level Agreement — what did we promise our customers?
○ response time will be under 10 seconds, 99% of the time
• SLO: Service Level Objective — what number would keep our users happy?
○ response time should be under 1 second, 99.9% of the time
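As a toy worked example (numbers invented), checking that SLO against a window of measured response times is just a ratio:

// Invented sample: does this window meet "under 1 second, 99.9% of the time"?
const responseTimesMs = [312, 480, 1250, 97, 640];

const withinTarget = responseTimesMs.filter((ms) => ms < 1000).length;
const sli = withinTarget / responseTimesMs.length; // 0.8 for this sample
const sloMet = sli >= 0.999;                       // false: one request blew the target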
49. Two tools, different superpowers
Performance Budgets:
• Many easy ways to get started
• Great tooling support (webpack, Lighthouse)
• Easy to understand
• Business stakeholders might not care
SLOs (Service Level Objectives):
• Extremely flexible: use any number you measure!
• Get burn alerts when your budget runs low
• Requires reading multiple book chapters to understand
• You get business stakeholder buy-in up front
50. 5. Observability for Perf Optimization
Using observability to make things faster
56. High-performance browser instrumentation code:
1. Batch requests together so you don’t run down battery & use up resources
2. Use the Beacon API to send events in a non-blocking way
3. Use `requestIdleCallback` or `setTimeout` to handle slower calculations
Don’t shoot yourself in the foot
while trying to look at your own foot
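Put together, the three tips above might look something like this sketch (the endpoint and flush policy are assumptions, not Honeycomb's actual client code):

const ENDPOINT = "/telemetry"; // hypothetical collection endpoint
const queue: object[] = [];

// Tip 3: defer slower work to idle time, falling back to setTimeout.
const whenIdle: (cb: () => void) => void =
  "requestIdleCallback" in window
    ? (cb) => window.requestIdleCallback(() => cb())
    : (cb) => setTimeout(cb, 0);

// Tip 1: batch events in memory instead of sending one request apiece.
function enqueue(event: object) {
  whenIdle(() => queue.push(event)); // expensive field computation goes here too
}

// Tip 2: flush the whole batch with the non-blocking Beacon API.
function flush() {
  if (queue.length === 0) return;
  navigator.sendBeacon(ENDPOINT, JSON.stringify(queue.splice(0)));
}

// Flush when the page is hidden or unloading, not on every event.
document.addEventListener("visibilitychange", () => {
  if (document.visibilityState === "hidden") flush();
});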
70. How to tell a hawk from a falcon
Red-Tailed Hawk
@hmclin | https://www.flickr.com/photos/hmclin/14119319574
Peregrine Falcon
@zonotrichia | https://www.flickr.com/photos/zonotrichia/31001823086