Are SRE, DevOps and serverless a match made in heaven, or is something missing? What about cost when building reliable Serverless systems? To answer this, let's explore SRE and Serverless principles, a new concept called FinDevOps, and along the way make a few predictions about our serverless future.
2. @silvexis
Hello
• CEO and Founder of CloudZero
• I’m recovering from the application security industry, now 100% focused on Cloud Management and Serverless Computing
• Have been building systems on AWS since 2008
• Previously
• Veracode, HP, SPI Dynamics, GuardedNet, Sanctum
• United Nations IAEA, US Department of State, SunTrust, Moody’s Investors
erik@cloudzero.com | @silvexis
14. @silvexis
Serverless effect on Site Reliability Engineering
CLOUD SLAs & AUTOSCALING
OBSERVABILITY
SERVICE LIMIT PLANNING
COST
AVAILABILITY
LATENCY
PERFORMANCE
EFFICIENCY
CHANGE MANAGEMENT
MONITORING
EMERGENCY RESPONSE
CAPACITY PLANNING
PROVISIONING
Congrats! You're still on call!
Still your problem
Harder to understand
More tracking, less management
Automation
17. @silvexis
Can we get real for a second:
FaaS is NOT Serverless
[Cost chart. Recoverable labels: CloudWatch Logs; $1.79; $15; $0.89; $789!!!; $12]
FaaS Cost: $(1.79)
Other Cost: $(818.68)
Avg. Per Day Operations
100% Serverless, and 100% not free
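How small the FaaS line item is becomes clearer as a percentage. The following is a minimal sketch of that arithmetic, assuming the slide's "FaaS Cost" and "Other Cost" figures are separate average per-day line items; the variable names are illustrative.

```python
from decimal import Decimal

# Average per-day figures as quoted on the slide.
faas_cost = Decimal("1.79")
other_cost = Decimal("818.68")

# Assumption: the two figures are separate line items, so the total is their sum.
total_cost = faas_cost + other_cost
faas_share = faas_cost / total_cost * 100

print(f"Total per day: ${total_cost}")
print(f"FaaS share of spend: {faas_share:.2f}%")
```

Under that assumption, function execution accounts for well under 1% of the daily bill; the rest comes from the services around it.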
18. @silvexis
MONITORING -> OBSERVABILITY
• Observability is a measure of how well the state of a system can be determined from the analysis of its outputs.
• Emergent properties will be the bane of your existence.
• Dumping all your logs somewhere will not solve your problems.
• Focus on sampling the outputs of your system to understand connective tissue & be able to do analysis without a priori knowledge.
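The sampling advice on this slide can be sketched roughly as follows. This is an illustration, not the author's tooling: the `sample_events` helper, the event fields, and the 10% rate are all assumptions. It keeps every error event and a random fraction of the rest, and records the sample rate on each kept event so that counts can be scaled back up during analysis.

```python
import random
from typing import Iterable, Iterator, Optional


def sample_events(events: Iterable[dict], rate: float,
                  seed: Optional[int] = None) -> Iterator[dict]:
    """Yield every error event plus roughly `rate` of the others.

    Each yielded event carries a `sample_rate` field so that later
    analysis can weight it by 1 / sample_rate.
    """
    rng = random.Random(seed)
    for event in events:
        if event.get("error"):
            yield {**event, "sample_rate": 1.0}
        elif rng.random() < rate:
            yield {**event, "sample_rate": rate}


# Hypothetical function-invocation events; one in fifty is an error.
events = [
    {"fn": "resize", "duration_ms": 120 + i, "error": i % 50 == 0}
    for i in range(1000)
]
kept = list(sample_events(events, rate=0.1, seed=42))

# Weight each kept event by its sample rate to estimate the original volume.
estimated_total = sum(1 / e["sample_rate"] for e in kept)
print(f"kept {len(kept)} of {len(events)} events, "
      f"estimated original volume ~{estimated_total:.0f}")
```

Keeping the rate on the event itself is what lets you ask new questions later without knowing in advance which aggregates you would need.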
20. @silvexis
CAPACITY PLANNING -> SERVICE LIMITS
• Capacity is built in, but Serverless systems have limits and constraints.
• You will hit them once you are in prod and under heavy customer load…on a Friday…at 6pm
• It can be very, very hard to figure out when the limits are being hit in a large system with many moving parts. Here are just a few examples:
AWS Lambda
Examples:
• Maximum number of concurrent executions per AWS account (1000, changeable)
• Immediate Concurrency Increase (500 or more per min, depends on region, fixed)
API Gateway
Examples:
• Integration timeout (29 sec max, fixed)
• Max Payload size (10 MB, fixed)
Serverless Invocation Limits
Examples:
• S3 will asynchronously call Lambda
• Lambda polls DynamoDB Streams only once per second, per shard
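One way to plan against a concurrency limit is to estimate steady-state concurrency as request rate multiplied by average duration, then compare it with the limit. The sketch below assumes the 1000 concurrent executions figure quoted above; the helper names, the traffic numbers, and the 70% warning threshold are hypothetical.

```python
ACCOUNT_CONCURRENCY_LIMIT = 1000  # per-account figure quoted on the slide (changeable)
WARN_AT = 0.7                     # hypothetical alerting threshold


def required_concurrency(requests_per_second: float,
                         avg_duration_seconds: float) -> float:
    """Estimate steady-state concurrent executions (rate x duration)."""
    return requests_per_second * avg_duration_seconds


def utilization(requests_per_second: float, avg_duration_seconds: float,
                limit: int = ACCOUNT_CONCURRENCY_LIMIT) -> float:
    """Return the fraction of the concurrency limit the workload would use."""
    return required_concurrency(requests_per_second, avg_duration_seconds) / limit


# Hypothetical peak: 400 requests per second, 2 seconds per invocation.
peak = utilization(400, 2.0)
if peak >= WARN_AT:
    print(f"Warning: projected peak uses {peak:.0%} of the concurrency limit")
else:
    print(f"OK: projected peak uses {peak:.0%} of the concurrency limit")
```

The same comparison applies to any limit that can be expressed as a number, which is most of them; the hard part in practice is knowing which limit you are approaching.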
22. @silvexis
Stop Faking DevOps
• DevOps wasn’t a merger, it was a hostile takeover
• If you see an Ops team, you blew it
• Effective Serverless engineering teams must take ownership of operations
DEV
OPS
25. @silvexis
It’s time we had the talk about cost
If you have infinite scale, it follows you must also have an infinite wallet
At what point will you degrade your customers' experience to save your wallet?
A system that puts your company out of business is not a reliable system
26. @silvexis
Thinking about Cost and Architecture
• Let's come back to this chart for a second
[Cost chart. Recoverable labels: CloudWatch Logs; $1.79; $15; $0.89; $789; $12]
This could be a big problem
Question: Could it be worse?
27. @silvexis
Thinking about Cost and Architecture
• Of course it can be worse!
[Diagram: a serverless pipeline with steps numbered 1 to 7. Recoverable labels: Writes 100 files; Invoked 100x; 1000 records written; Invoked 100x; Invoked 3x per transaction; Writes 1000 files]
What happens at step 7?
Hint: It’s both costly and catastrophic
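Why a chain like this becomes both costly and catastrophic comes down to multiplication. The sketch below is illustrative only: the diagram's exact wiring is not recoverable from this transcript, so the order of the stages is an assumption, with multipliers borrowed from the slide's labels. Each stage multiplies the work handed to the next one.

```python
from itertools import accumulate
from operator import mul

# Hypothetical stage order; multipliers borrowed from the slide's labels.
stages = [
    ("writes 100 files", 100),
    ("invoked 100x", 100),
    ("1000 records written", 1000),
    ("invoked 3x per transaction", 3),
]

# Running product of the multipliers: the downstream work after each stage.
cumulative = list(accumulate((factor for _, factor in stages), mul))

for (label, factor), total in zip(stages, cumulative):
    print(f"{label:<28} x{factor:<5} -> {total:,} downstream operations")
```

Four modest-looking multipliers are enough to turn a single upstream event into tens of millions of downstream operations, each of which is billed.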
28. @silvexis
Thinking about Cost and Architecture
• Denial of Wallet is a very real problem
• What if you are only responsible for a small part of this system?
• What if your part is in a separate AWS Account or from a 3rd party?
• How do you detect when this is even happening?
7
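One way to approach the detection question is to compare each day's spend with a recent baseline. This is a minimal sketch, assuming you already have daily cost totals from your billing data; the `is_cost_anomaly` helper, the 3x threshold, and the sample figures are hypothetical.

```python
from statistics import median
from typing import List


def is_cost_anomaly(history: List[float], today: float,
                    factor: float = 3.0) -> bool:
    """Flag `today` if it exceeds `factor` times the median of `history`."""
    if not history:
        return False  # no baseline yet, so nothing to compare against
    return today > factor * median(history)


# Hypothetical daily totals in USD for the previous five days.
daily_costs = [812.40, 820.47, 805.10, 831.92, 818.68]

print(is_cost_anomaly(daily_costs, 840.00))   # an ordinary day
print(is_cost_anomaly(daily_costs, 9500.00))  # a runaway fan-out day
```

A median baseline is used here because a single earlier spike would distort a mean. A daily check like this is also far from real-time, so it illustrates the idea rather than a complete defense.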
30. @silvexis
What is a First-Class Metric?
It’s real-time
It has context
You can measure it
You have a clear definition of good and bad
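The four properties above can be made concrete as a small data structure. This is a sketch under stated assumptions, not a schema from the talk: the `CostMetric` class, its field names, and the thresholds are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class CostMetric:
    name: str
    value: float       # you can measure it
    unit: str
    context: dict      # it has context (service, feature, customer, ...)
    good_below: float  # clear definition of good
    bad_above: float   # clear definition of bad
    # It's real-time: stamped when the reading is taken.
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def status(self) -> str:
        """Classify the reading against the good/bad thresholds."""
        if self.value <= self.good_below:
            return "good"
        if self.value >= self.bad_above:
            return "bad"
        return "warning"


# Hypothetical reading for an example service.
metric = CostMetric(
    name="cost_per_day",
    value=640.0,
    unit="USD",
    context={"service": "example-service", "environment": "prod"},
    good_below=50.0,
    bad_above=500.0,
)
print(metric.name, metric.status())
```

Treating cost this way puts it on the same footing as latency or error rate: a reading with a timestamp, a scope, and a threshold someone can be paged on.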
31. @silvexis
DevOps -> FinDevOps
• Cost becomes a first-class operational metric
• First suggested by Simon Wardley @ ServerlessConf in 2018 as FinDev
• A merger of financial, development and operations practices
• Understands the tight correlation between cost and well-architected systems
33. @silvexis
FinDevOps will replace DevOps
Ultimately FinDevOps isn’t just about monitoring cost
Your cloud spend is an investment, and you should be tracking the performance of this investment
Serverless enables a clear path to mapping revenue-generating IT activities against engineering and IT costs
34. @silvexis
Tracking gross margins and COGS is what really matters
Serverless makes this possible
[Cost chart. Recoverable labels: CloudWatch Logs; $(1.79); $(15); $(0.89); $(789); $(12)]
FaaS Cost: $(1.79)
Other Cost: $(818.68)
Avg. Per Day Operations
Where FinDevOps will take us
Revenue: $9324.04
Profit: $8503.57
Who cares? We are RICH!
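The numbers on this slide can be tied together with a short calculation. The sketch below assumes the FaaS and Other figures are separate per-day costs and treats them together as cost of goods sold (COGS), which is a simplification; the variable names are illustrative.

```python
from decimal import Decimal

# Average per-day figures as quoted on the slide.
revenue = Decimal("9324.04")
faas_cost = Decimal("1.79")
other_cost = Decimal("818.68")

# Assumption: all cloud spend is treated as COGS.
cogs = faas_cost + other_cost
profit = revenue - cogs
gross_margin = profit / revenue * 100

print(f"COGS: ${cogs}")
print(f"Profit: ${profit}")
print(f"Gross margin: {gross_margin:.1f}%")
```

The computed profit matches the slide's figure, and expressing it as a margin gives a single number to track as usage and architecture change.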