SlideShare une entreprise Scribd logo
1  sur  32
© Predictable Network Solutions Ltd 2017 www.pnsol.com
PEnDAR Focus Group
Performance Ensurance by Design, Assuring Requirements
Second Webinar
© Predictable Network Solutions Ltd 2017 www.pnsol.com
Recap
Previous webinar recording at https://attendee.gotowebinar.com/register/1615742612853821699
Also on SlideShare: https://www.slideshare.net/pnsol-slides
© Predictable Network Solutions Ltd 2017 www.pnsol.com
3
Goals
• Consider how to enable Validation & Verification
of cost and performance
• For distributed and hierarchical systems
• via sophisticated but easy-to-use tools
• Supporting both initial and ongoing incremental
development
• Provide early visibility of cost/performance
hazards
• Avoiding costly failures
• Maximising the chances of successful in-budget
delivery of acceptable end-user outcomes
Partners
Supported by:
PEnDAR project
© Predictable Network Solutions Ltd 2017 www.pnsol.com
4
Our offer to you
Learn what we know about the laws of the
distributed world:
• Where the realm of the possible lies;
• How to push the envelope in a way that
benefits you most;
• How to innovate without getting your fingers
burnt.
Get an insight into the future:
• Understand more about achieving
sustainability and viability;
• See where competition may lie in the
future.
Our ‘ask’
Help us to find out:
• In which markets/sectors/application
domains will the benefits outweigh the
costs?
• Which are the major benefits?
• Which are just ‘nice to have’?
• How could this methodology be
consumed?
• Tools?
• Services?
Role of the focus group
© Predictable Network Solutions Ltd 2017 www.pnsol.com
Outline of a coherent methodology
© Predictable Network Solutions Ltd 2017© Predictable Network Solutions Ltd 2017 www.pnsol.com
6
Outcomes
Delivered
Resources
Consumed
Variability
Exception
/failure
Externally created
Mitigation
System impact
Propagation
Scale
Schedulability
Capacity
Distance
Number
Time
Space
Density
© Predictable Network Solutions Ltd 2017 www.pnsol.com
7Performance/resource analysis
Sub-
system
Sub-
system
Sub-
system
∆Q
Shared resources
Starting with a functional decomposition:
• Take each subsystem in isolation
• analyse performance
• modelling remainder of system as ∆Q
• quantify resource consumption
• may be dependent on the ∆Q
• Examine resource sharing
• within system – quantify resource costs
• between systems – quantify opportunity cost
• Successive refinements
• consider couplings
• iterate to fixed point
Quantitative
Timeliness
Agreement
Nottoday
© Predictable Network Solutions Ltd 2017
8
www.pnsol.com
Recap of ∆Q
• ∆Q is a measure of the ‘quality impairment’ of an outcome
• The extent of deviation from ‘instantaneous and infallible’
• Nothing in the real world is perfect so ∆Q always exists
• ∆Q is conserved
• A delayed outcome can’t be ‘undelayed’
• A failed outcome can’t be ‘unfailed’
• ∆Q can be traded
• E.g. accept more delay in return for more certainty of completion
• ∆Q can be represented as an improper random variable
• Combines continuous and discrete probabilities
• Thus encompasses normal behaviour and exceptions/failures in one model
© Predictable Network Solutions Ltd 2017 www.pnsol.com
9
• Decompose the performance
requirement following system
structure
• Using engineering judgment/best
practice/cosmic constraints
• Creates initial subsystem requirements
• Validate the decomposition by re-
combining via the behaviour
• Formally and automatically checkable
• Can be part of continuous integration
• Captures interactions and couplings
• Necessary and sufficient:
• IF all subsystems function correctly
and integrate properly
• AND all subsystems satisfy their
performance requirements
• THEN the overall system will meet
its performance requirement
• Apply this hierarchically until
• Behaviour is trivially provable OR
• Have a complete set of testable
subsystem verification/acceptance
criteria
Validating performance requirements
© Predictable Network Solutions Ltd 2017 www.pnsol.com
Quantitative worked example
© Predictable Network Solutions Ltd 2017 www.pnsol.com
11Generic RPC
Front end Back endNetwork
6
Transmit
Response
2
Send
request
4
Process
request
5
Commit
results
1
Construct
transaction
7
Receive
Response
0
Idle
8
Complete
transaction
3
Receive
request
© Predictable Network Solutions Ltd 2017 www.pnsol.com
12Generic RPC - observables
Front end Back endNetwork
6
Transmit
Response
2
Send
request
4
Process
request
5
Commit
results
1
Construct
transaction
7
Receive
Response
0
Idle
8
Complete
transaction
3
Receive
request
A
E
D
CB
F
∆Q(A⇝B) ∆Q(B⇝C)
∆Q(C⇝D)
∆Q(D⇝E)∆Q(E⇝F)
© Predictable Network Solutions Ltd 2017 www.pnsol.com
13Generic RPC – abstract performance
Front end
1
Construct
transaction
0
Idle
8
Complete
transaction
A
E
B
F
Network
6
Transmit
Response
2
Send
request
7
Receive
Response
3
Receive
request
D
C
∆Q(C⇝D
)
© Predictable Network Solutions Ltd 2017 www.pnsol.com
14Generic RPC – abstract performance
Front end
1
Construct
transaction
0
Idle
8
Complete
transaction
A
E
B
F
∆Q(B⇝E
)
© Predictable Network Solutions Ltd 2017 www.pnsol.com
15Generic RPC – abstract performance
0
Idle
A
F
∆Q(A⇝F)
© Predictable Network Solutions Ltd 2017 www.pnsol.com
16
• User performance requirement is
a bound on ∆Q(A⇝F), e.g.
• 50% within 500ms
• 95% within 750ms
• etc.
• Decompose this as
∆Q(A⇝F) = ∆Q(A⇝B) ⊕
∆Q(B⇝E) ⊕ ∆Q(E⇝F)
• How these terms combine
depends on the behaviour
• Will examine the interaction of
performance and behaviour next
Performance requirement
© Predictable Network Solutions Ltd 2017 www.pnsol.com
17
• Since the request or its
acknowledgement may be lost,
the process retries after a 333ms
timeout
• Failure mitigation
• After three attempts the
transaction is deemed to have
failed
• Failure propagation
• Can unroll the loop for a clearer
picture
Front-end behaviour (1)
A
E
B
F
© Predictable Network Solutions Ltd 2017 www.pnsol.com
18Front-end behaviour (2)
© Predictable Network Solutions Ltd 2017 www.pnsol.com
19
• We have a non-preemptive first-
to-finish synchronisation between
receiving the response and the
timeout
• Which execution path is taken is a
probabilistic choice depending on
the ∆Q(B ⇝E)
• We combine the ∆Qs of the
different paths with these
probabilities
Assumptions:
For an initial customer deployment
over UK broadband with the server
in the US East Coast:
• ∆Q(B⇝C) and ∆Q(D⇝E) the
same:
• Minimum delay 95ms
• 50ms variability
• 1-2% loss
• ∆Q(C⇝D) distributed between
25ms and 60ms
Performance analysis (1)
© Predictable Network Solutions Ltd 2017 www.pnsol.com
20Performance analysis (2)
∆Q(C⇝D)
∆Q(B⇝E)
∆Q(A⇝F)
Requirement
© Predictable Network Solutions Ltd 2017
21
www.pnsol.com
Verification
• So far, the verification task seems straightforward:
• Provided the network and server performance are within the given bounds,
then the front end will deliver the required end-user experience
• However, this seems to have plenty of slack
• Can we validate a different set of requirements?
• How far can we relax them?
• With this model we can explore:
• Varying the network performance
• Varying the server performance (e.g. as a result of load)
© Predictable Network Solutions Ltd 2017 www.pnsol.com
22Performance analysis (3)
∆Q(C⇝D
)
∆Q(B⇝E)
∆Q(A⇝F)
Requirement
© Predictable Network Solutions Ltd 2017 www.pnsol.com
23
• Virtualised server has a limit on
the rate of I/O operations
• So that the underlying infrastructure
can be reasonably shared
• Such parameters affect the OPEX
• Typically access is controlled via a
token-bucket shaper
• Allows small bursts
• Limit long-term average use
• This constitutes a QTA
• Need to verify both sides
• Delivered performance of the
virtualised I/O thus depends on
recent usage history
• Load is a function of number of
front-ends accessing the service and
the proportion of re-tries
• Re-tries depend on the ∆Q(B ⇝E) –
which in turn depends on the
performance of the I/O system!
• This has been modeled here via a
probability of taking a ‘slow path’
in the server
Server resource consumption
© Predictable Network Solutions Ltd 2017 www.pnsol.com
24Performance analysis (4)
∆Q(C⇝D
)
∆Q(B⇝E)
Requirement
∆Q(A⇝F)
© Predictable Network Solutions Ltd 2017© Predictable Network Solutions Ltd 2017 www.pnsol.com
25
Outcomes
Delivered
Resources
Consumed
Variability
Exception
/failure
Externally created
Mitigation
System impact
Propagation
Scale
Schedulability
Capacity
Distance
Number
Time
Space
Density
Impact of retry
behaviour on
delivered ∆Q
Impact of retry
behaviour on
server load
Effect of server
loading on
delivered ∆Q
Server I/O
constraint
triggered by load
Coupling of
retries/load/resp
onse delay
© Predictable Network Solutions Ltd 2017 www.pnsol.com
Relation to challenges in ICT
E.g. virtualisation
© Predictable Network Solutions Ltd 2017 www.pnsol.com
27
Creates:
• New digital supply chain hazards
• Loss of mitigation opportunities
• Aspects of performance are no longer
under direct control
• Additional ∆Q
• As seen in example
• This will be ‘traded’ to extract value
• Who benefits? The user or the
infrastructure provider?
Now need to consider:
• Ongoing resource costs
• OPEX rather than (sunk) CAPEX
• Opportunity costs
• Amazon Web Service pricing of compute
instances, SQS and lambda micro-services
shows engagement with opportunity cost
issues
Virtualisation
© Predictable Network Solutions Ltd 2017 www.pnsol.com
28
Need quantitative intent
• At the very least, some indication
of what is really not acceptable
• Comparison with something else
(that is measurable) will do
• But not “the best possible” or
“whatever will sell”
Can approximate best possible ∆Q
• Use to quickly check degree of
feasibility
• Add up minimum times
• Means and variances also add
• Puts bounds on future
improvement
• Early visibility of performance
hazards
• Early mitigation
General ICT application
© Predictable Network Solutions Ltd 2017
29
www.pnsol.com
Summary
• System performance validation consists of:
• Analysing the interaction of system behaviour and subsystem performance
requirements
• Showing that this ‘adds up’ to meet the quantified user requirements (QTA)
• System performance verification consists of showing that subsystems
meet their QTAs
• By analysis and/or measurement of ∆Q of the subsystems’ observable
behaviour
• This provides acceptance and/or contractual criteria for third-party
subsystems or services
• Substantially reduces performance integration risks and hence re-work
© Predictable Network Solutions Ltd 2017 www.pnsol.com
What now ?
© Predictable Network Solutions Ltd 2017
31
www.pnsol.com
The questions
Given the right inputs, this sort of analysis takes relatively little time
and effort, however:
• Quantitative intent is fundamental
• Where can this be found?
• Which parts of the system decomposition are forced?
• E.g. use of existing infrastructure, such as a network
• How to handle subsystems with undocumented characteristics?
• Can mitigate this with in-life measurement of ∆Q driven by the validation
• Forewarned is forearmed
• Inexpensive, focused data rather than wide-angle ‘big data’ approach
© Predictable Network Solutions Ltd 2017 www.pnsol.com
Thank you!

Contenu connexe

Tendances

Estimation – a waste of time master 2013 sdc gothenburg w hp rules
Estimation – a waste of time master 2013 sdc gothenburg w hp rulesEstimation – a waste of time master 2013 sdc gothenburg w hp rules
Estimation – a waste of time master 2013 sdc gothenburg w hp rules
tom gilb
 
Cevn Vibert Testimonials
Cevn Vibert   TestimonialsCevn Vibert   Testimonials
Cevn Vibert Testimonials
cevn
 

Tendances (18)

Running successful agile projects
Running successful agile projectsRunning successful agile projects
Running successful agile projects
 
Estimation – a waste of time master 2013 sdc gothenburg w hp rules
Estimation – a waste of time master 2013 sdc gothenburg w hp rulesEstimation – a waste of time master 2013 sdc gothenburg w hp rules
Estimation – a waste of time master 2013 sdc gothenburg w hp rules
 
Illinois Technology Association Tech Talk
Illinois Technology Association Tech TalkIllinois Technology Association Tech Talk
Illinois Technology Association Tech Talk
 
Building a Compelling Business Case for Continuous Delivery
Building a Compelling Business Case for Continuous DeliveryBuilding a Compelling Business Case for Continuous Delivery
Building a Compelling Business Case for Continuous Delivery
 
Continuous delivery best practices and essential tools
Continuous delivery best practices and essential toolsContinuous delivery best practices and essential tools
Continuous delivery best practices and essential tools
 
Driving Continuous Delivery Transformation in a Data-Driven Way
Driving Continuous Delivery Transformation in a Data-Driven WayDriving Continuous Delivery Transformation in a Data-Driven Way
Driving Continuous Delivery Transformation in a Data-Driven Way
 
Strayer bus 375 final exam part 1
Strayer bus 375 final exam part 1Strayer bus 375 final exam part 1
Strayer bus 375 final exam part 1
 
Strayer bus 375 final exam part 1
Strayer bus 375 final exam part 1Strayer bus 375 final exam part 1
Strayer bus 375 final exam part 1
 
Webinar - Devops platform for the evolving enterprise
Webinar - Devops platform for the evolving enterpriseWebinar - Devops platform for the evolving enterprise
Webinar - Devops platform for the evolving enterprise
 
ODD+PC: How to Get Stuff Right
ODD+PC: How to Get Stuff RightODD+PC: How to Get Stuff Right
ODD+PC: How to Get Stuff Right
 
Webinar: 5 Steps To The Perfect Storage Refresh
Webinar: 5 Steps To The Perfect Storage RefreshWebinar: 5 Steps To The Perfect Storage Refresh
Webinar: 5 Steps To The Perfect Storage Refresh
 
Problem management foundation - Engineering
Problem management foundation - EngineeringProblem management foundation - Engineering
Problem management foundation - Engineering
 
IT Operations Consulting
IT Operations Consulting  IT Operations Consulting
IT Operations Consulting
 
Cevn Vibert Testimonials
Cevn Vibert   TestimonialsCevn Vibert   Testimonials
Cevn Vibert Testimonials
 
2019 State of DevOps Report: Database Best Practices for Strong DevOps
2019 State of DevOps Report: Database Best Practices for Strong DevOps2019 State of DevOps Report: Database Best Practices for Strong DevOps
2019 State of DevOps Report: Database Best Practices for Strong DevOps
 
Telecoms Evangelist no.2
Telecoms Evangelist no.2Telecoms Evangelist no.2
Telecoms Evangelist no.2
 
TOC instuments for IT projects
TOC instuments for IT projectsTOC instuments for IT projects
TOC instuments for IT projects
 
K metrics
K metricsK metrics
K metrics
 

En vedette

En vedette (16)

TA Info Lit Workshop 7 22 09
TA Info Lit Workshop 7 22 09TA Info Lit Workshop 7 22 09
TA Info Lit Workshop 7 22 09
 
How to come up with great startup ideas today.
How to come up with great startup ideas today. How to come up with great startup ideas today.
How to come up with great startup ideas today.
 
Documenting APIs: Sample Code and More (with many pictures of cats)
Documenting APIs: Sample Code and More (with many pictures of cats)Documenting APIs: Sample Code and More (with many pictures of cats)
Documenting APIs: Sample Code and More (with many pictures of cats)
 
Az email marketing reneszánsza - Tartalommarketing Konferencia 2017
Az email marketing reneszánsza - Tartalommarketing Konferencia 2017Az email marketing reneszánsza - Tartalommarketing Konferencia 2017
Az email marketing reneszánsza - Tartalommarketing Konferencia 2017
 
WWBF May event
WWBF May eventWWBF May event
WWBF May event
 
What is Usersnap? An Introduction to bug tracking.
What is Usersnap? An Introduction to bug tracking.What is Usersnap? An Introduction to bug tracking.
What is Usersnap? An Introduction to bug tracking.
 
Principles Into Practice: Co Design in Healthcare
Principles Into Practice: Co Design in HealthcarePrinciples Into Practice: Co Design in Healthcare
Principles Into Practice: Co Design in Healthcare
 
Presentasi Omjay di Universitas Mulawarman, Samarinda
Presentasi Omjay di Universitas Mulawarman, SamarindaPresentasi Omjay di Universitas Mulawarman, Samarinda
Presentasi Omjay di Universitas Mulawarman, Samarinda
 
Doing business in china
Doing business in china Doing business in china
Doing business in china
 
Five Tier for Media Publishers and Advertising Networks
Five Tier for Media Publishers and Advertising NetworksFive Tier for Media Publishers and Advertising Networks
Five Tier for Media Publishers and Advertising Networks
 
1 st year infographics team sports
1 st year infographics team sports1 st year infographics team sports
1 st year infographics team sports
 
Negotiating Digital Academic Writing Literacy Development Using an Open Educa...
Negotiating Digital Academic Writing Literacy Development Using an Open Educa...Negotiating Digital Academic Writing Literacy Development Using an Open Educa...
Negotiating Digital Academic Writing Literacy Development Using an Open Educa...
 
101 استراتيجية التعلم النشط
101 استراتيجية التعلم النشط101 استراتيجية التعلم النشط
101 استراتيجية التعلم النشط
 
Мультимедийная редакция: технология организации работы и интернет-сервисы
Мультимедийная редакция: технология организации работы и интернет-сервисыМультимедийная редакция: технология организации работы и интернет-сервисы
Мультимедийная редакция: технология организации работы и интернет-сервисы
 
Angus @WebAdeptUK BNi Presentation 2017
Angus @WebAdeptUK BNi Presentation 2017Angus @WebAdeptUK BNi Presentation 2017
Angus @WebAdeptUK BNi Presentation 2017
 
お役所風最悪なスライド
お役所風最悪なスライドお役所風最悪なスライド
お役所風最悪なスライド
 

Similaire à PEnDAR webinar 2 with notes

SRE and GitOps for Building Robust Kubernetes Platforms.pdf
SRE and GitOps for Building Robust Kubernetes Platforms.pdfSRE and GitOps for Building Robust Kubernetes Platforms.pdf
SRE and GitOps for Building Robust Kubernetes Platforms.pdf
Weaveworks
 

Similaire à PEnDAR webinar 2 with notes (20)

Time-resource v&v for complex systems
Time-resource v&v for complex systemsTime-resource v&v for complex systems
Time-resource v&v for complex systems
 
The Overture ΔQ testbed for design and deployment planning
The Overture ΔQ testbed for design and deployment planningThe Overture ΔQ testbed for design and deployment planning
The Overture ΔQ testbed for design and deployment planning
 
Getting Started with ThousandEyes Proof of Concepts
Getting Started with ThousandEyes Proof of ConceptsGetting Started with ThousandEyes Proof of Concepts
Getting Started with ThousandEyes Proof of Concepts
 
SRE and GitOps for Building Robust Kubernetes Platforms.pdf
SRE and GitOps for Building Robust Kubernetes Platforms.pdfSRE and GitOps for Building Robust Kubernetes Platforms.pdf
SRE and GitOps for Building Robust Kubernetes Platforms.pdf
 
3 Critical Keys to DevOps Success: Lessons from Forrester Research, Intel, an...
3 Critical Keys to DevOps Success: Lessons from Forrester Research, Intel, an...3 Critical Keys to DevOps Success: Lessons from Forrester Research, Intel, an...
3 Critical Keys to DevOps Success: Lessons from Forrester Research, Intel, an...
 
Network performance optimisation using high-fidelity measures
Network performance optimisation using high-fidelity measuresNetwork performance optimisation using high-fidelity measures
Network performance optimisation using high-fidelity measures
 
Assuring and Troubleshooting Business VPNs with Cisco Orchestrated Assurance ...
Assuring and Troubleshooting Business VPNs with Cisco Orchestrated Assurance ...Assuring and Troubleshooting Business VPNs with Cisco Orchestrated Assurance ...
Assuring and Troubleshooting Business VPNs with Cisco Orchestrated Assurance ...
 
Getting Started With ThousandEyes Proof of Concepts: End User Digital Experience
Getting Started With ThousandEyes Proof of Concepts: End User Digital ExperienceGetting Started With ThousandEyes Proof of Concepts: End User Digital Experience
Getting Started With ThousandEyes Proof of Concepts: End User Digital Experience
 
Getting Started with ThousandEyes Proof of Concepts
Getting Started with ThousandEyes Proof of ConceptsGetting Started with ThousandEyes Proof of Concepts
Getting Started with ThousandEyes Proof of Concepts
 
The Business Case for Cloud Management - RightScale Compute 2013
The Business Case for Cloud Management - RightScale Compute 2013The Business Case for Cloud Management - RightScale Compute 2013
The Business Case for Cloud Management - RightScale Compute 2013
 
Continuous Performance Testing
Continuous Performance TestingContinuous Performance Testing
Continuous Performance Testing
 
Getting Started with ThousandEyes Proof of Concepts
Getting Started with ThousandEyes Proof of ConceptsGetting Started with ThousandEyes Proof of Concepts
Getting Started with ThousandEyes Proof of Concepts
 
Getting Started with ThousandEyes Proof of Concepts
Getting Started with ThousandEyes Proof of ConceptsGetting Started with ThousandEyes Proof of Concepts
Getting Started with ThousandEyes Proof of Concepts
 
Managing a Widely Distributed Network
Managing a Widely Distributed NetworkManaging a Widely Distributed Network
Managing a Widely Distributed Network
 
Automatic Performance Modelling from Application Performance Management (APM)...
Automatic Performance Modelling from Application Performance Management (APM)...Automatic Performance Modelling from Application Performance Management (APM)...
Automatic Performance Modelling from Application Performance Management (APM)...
 
Quantifying Genuine User Experience in Virtual Desktop Ecosystems
Quantifying Genuine User Experience in Virtual Desktop EcosystemsQuantifying Genuine User Experience in Virtual Desktop Ecosystems
Quantifying Genuine User Experience in Virtual Desktop Ecosystems
 
IBM Software Defined Networking = Brave New World of IT
IBM Software Defined Networking = Brave New World of  ITIBM Software Defined Networking = Brave New World of  IT
IBM Software Defined Networking = Brave New World of IT
 
Neev Load Testing Services
Neev Load Testing ServicesNeev Load Testing Services
Neev Load Testing Services
 
Getting Started with ThousandEyes Proof of Concepts
Getting Started with ThousandEyes Proof of ConceptsGetting Started with ThousandEyes Proof of Concepts
Getting Started with ThousandEyes Proof of Concepts
 
A Framework to Measure and Maximize Cloud ROI
A Framework to Measure and Maximize Cloud ROIA Framework to Measure and Maximize Cloud ROI
A Framework to Measure and Maximize Cloud ROI
 

Dernier

Dernier (20)

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024Manulife - Insurer Innovation Award 2024
Manulife - Insurer Innovation Award 2024
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 

PEnDAR webinar 2 with notes

  • 1. © Predictable Network Solutions Ltd 2017 www.pnsol.com PEnDAR Focus Group Performance Ensurance by Design, Assuring Requirements Second Webinar
  • 2. © Predictable Network Solutions Ltd 2017 www.pnsol.com Recap Previous webinar recording at https://attendee.gotowebinar.com/register/1615742612853821699 Also on SlideShare: https://www.slideshare.net/pnsol-slides
  • 3. © Predictable Network Solutions Ltd 2017 www.pnsol.com 3 Goals • Consider how to enable Validation & Verification of cost and performance • For distributed and hierarchical systems • via sophisticated but easy-to-use tools • Supporting both initial and ongoing incremental development • Provide early visibility of cost/performance hazards • Avoiding costly failures • Maximising the chances of successful in-budget delivery of acceptable end-user outcomes Partners Supported by: PEnDAR project
  • 4. © Predictable Network Solutions Ltd 2017 www.pnsol.com 4 Our offer to you Learn what we know about the laws of the distributed world: • Where the realm of the possible lies; • How to push the envelope in a way that benefits you most; • How to innovate without getting your fingers burnt. Get an insight into the future: • Understand more about achieving sustainability and viability; • See where competition may lie in the future. Our ‘ask’ Help us to find out: • In which markets/sectors/application domains will the benefits outweigh the costs? • Which are the major benefits? • Which are just ‘nice to have’? • How could this methodology be consumed? • Tools? • Services? Role of the focus group
  • 5. © Predictable Network Solutions Ltd 2017 www.pnsol.com Outline of a coherent methodology
  • 6. © Predictable Network Solutions Ltd 2017© Predictable Network Solutions Ltd 2017 www.pnsol.com 6 Outcomes Delivered Resources Consumed Variability Exception /failure Externally created Mitigation System impact Propagation Scale Schedulability Capacity Distance Number Time Space Density
  • 7. © Predictable Network Solutions Ltd 2017 www.pnsol.com 7Performance/resource analysis Sub- system Sub- system Sub- system ∆Q Shared resources Starting with a functional decomposition: • Take each subsystem in isolation • analyse performance • modelling remainder of system as ∆Q • quantify resource consumption • may be dependent on the ∆Q • Examine resource sharing • within system – quantify resource costs • between systems – quantify opportunity cost • Successive refinements • consider couplings • iterate to fixed point Quantitative Timeliness Agreement Nottoday
  • 8. © Predictable Network Solutions Ltd 2017 8 www.pnsol.com Recap of ∆Q • ∆Q is a measure of the ‘quality impairment’ of an outcome • The extent of deviation from ‘instantaneous and infallible’ • Nothing in the real world is perfect so ∆Q always exists • ∆Q is conserved • A delayed outcome can’t be ‘undelayed’ • A failed outcome can’t be ‘unfailed’ • ∆Q can be traded • E.g. accept more delay in return for more certainty of completion • ∆Q can be represented as an improper random variable • Combines continuous and discrete probabilities • Thus encompasses normal behaviour and exceptions/failures in one model
  • 9. © Predictable Network Solutions Ltd 2017 www.pnsol.com 9 • Decompose the performance requirement following system structure • Using engineering judgment/best practice/cosmic constraints • Creates initial subsystem requirements • Validate the decomposition by re- combining via the behaviour • Formally and automatically checkable • Can be part of continuous integration • Captures interactions and couplings • Necessary and sufficient: • IF all subsystems function correctly and integrate properly • AND all subsystems satisfy their performance requirements • THEN the overall system will meet its performance requirement • Apply this hierarchically until • Behaviour is trivially provable OR • Have a complete set of testable subsystem verification/acceptance criteria Validating performance requirements
  • 10. © Predictable Network Solutions Ltd 2017 www.pnsol.com Quantitative worked example
  • 11. © Predictable Network Solutions Ltd 2017 www.pnsol.com 11Generic RPC Front end Back endNetwork 6 Transmit Response 2 Send request 4 Process request 5 Commit results 1 Construct transaction 7 Receive Response 0 Idle 8 Complete transaction 3 Receive request
  • 12. © Predictable Network Solutions Ltd 2017 www.pnsol.com 12Generic RPC - observables Front end Back endNetwork 6 Transmit Response 2 Send request 4 Process request 5 Commit results 1 Construct transaction 7 Receive Response 0 Idle 8 Complete transaction 3 Receive request A E D CB F ∆Q(A⇝B) ∆Q(B⇝C) ∆Q(C⇝D) ∆Q(D⇝E)∆Q(E⇝F)
  • 13. © Predictable Network Solutions Ltd 2017 www.pnsol.com 13Generic RPC – abstract performance Front end 1 Construct transaction 0 Idle 8 Complete transaction A E B F Network 6 Transmit Response 2 Send request 7 Receive Response 3 Receive request D C ∆Q(C⇝D )
  • 14. © Predictable Network Solutions Ltd 2017 www.pnsol.com 14Generic RPC – abstract performance Front end 1 Construct transaction 0 Idle 8 Complete transaction A E B F ∆Q(B⇝E )
  • 15. © Predictable Network Solutions Ltd 2017 www.pnsol.com 15Generic RPC – abstract performance 0 Idle A F ∆Q(A⇝F)
  • 16. © Predictable Network Solutions Ltd 2017 www.pnsol.com 16 • User performance requirement is a bound on ∆Q(A⇝F), e.g. • 50% within 500ms • 95% within 750ms • etc. • Decompose this as ∆Q(A⇝F) = ∆Q(A⇝B) ⊕ ∆Q(B⇝E) ⊕ ∆Q(E⇝F) • How these terms combine depends on the behaviour • Will examine the interaction of performance and behaviour next Performance requirement
  • 17. © Predictable Network Solutions Ltd 2017 www.pnsol.com 17 • Since the request or its acknowledgement may be lost, the process retries after a 333ms timeout • Failure mitigation • After three attempts the transaction is deemed to have failed • Failure propagation • Can unroll the loop for a clearer picture Front-end behaviour (1) A E B F
  • 18. © Predictable Network Solutions Ltd 2017 www.pnsol.com 18Front-end behaviour (2)
  • 19. © Predictable Network Solutions Ltd 2017 www.pnsol.com 19 • We have a non-preemptive first- to-finish synchronisation between receiving the response and the timeout • Which execution path is taken is a probabilistic choice depending on the ∆Q(B ⇝E) • We combine the ∆Qs of the different paths with these probabilities Assumptions: For an initial customer deployment over UK broadband with the server in the US East Coast: • ∆Q(B⇝C) and ∆Q(D⇝E) the same: • Minimum delay 95ms • 50ms variability • 1-2% loss • ∆Q(C⇝D) distributed between 25ms and 60ms Performance analysis (1)
  • 20. © Predictable Network Solutions Ltd 2017 www.pnsol.com 20Performance analysis (2) ∆Q(C⇝D) ∆Q(B⇝E) ∆Q(A⇝F) Requirement
  • 21. © Predictable Network Solutions Ltd 2017 21 www.pnsol.com Verification • So far, the verification task seems straightforward: • Provided the network and server performance are within the given bounds, then the front end will deliver the required end-user experience • However, this seems to have plenty of slack • Can we validate a different set of requirements? • How far can we relax them? • With this model we can explore: • Varying the network performance • Varying the server performance (e.g. as a result of load)
  • 22. © Predictable Network Solutions Ltd 2017 www.pnsol.com 22Performance analysis (3) ∆Q(C⇝D ) ∆Q(B⇝E) ∆Q(A⇝F) Requirement
  • 23. © Predictable Network Solutions Ltd 2017 www.pnsol.com 23 • Virtualised server has a limit on the rate of I/O operations • So that the underlying infrastructure can be reasonably shared • Such parameters affect the OPEX • Typically access is controlled via a token-bucket shaper • Allows small bursts • Limit long-term average use • This constitutes a QTA • Need to verify both sides • Delivered performance of the virtualised I/O thus depends on recent usage history • Load is a function of number of front-ends accessing the service and the proportion of re-tries • Re-tries depend on the ∆Q(B ⇝E) – which in turn depends on the performance of the I/O system! • This has been modeled here via a probability of taking a ‘slow path’ in the server Server resource consumption
  • 24. © Predictable Network Solutions Ltd 2017 www.pnsol.com 24Performance analysis (4) ∆Q(C⇝D ) ∆Q(B⇝E) Requirement ∆Q(A⇝F)
  • 25. © Predictable Network Solutions Ltd 2017© Predictable Network Solutions Ltd 2017 www.pnsol.com 25 Outcomes Delivered Resources Consumed Variability Exception /failure Externally created Mitigation System impact Propagation Scale Schedulability Capacity Distance Number Time Space Density Impact of retry behaviour on delivered ∆Q Impact of retry behaviour on server load Effect of server loading on delivered ∆Q Server I/O constraint triggered by load Coupling of retries/load/resp onse delay
  • 26. © Predictable Network Solutions Ltd 2017 www.pnsol.com Relation to challenges in ICT E.g. virtualisation
  • 27. © Predictable Network Solutions Ltd 2017 www.pnsol.com 27 Creates: • New digital supply chain hazards • Loss of mitigation opportunities • Aspects of performance are no longer under direct control • Additional ∆Q • As seen in example • This will be ‘traded’ to extract value • Who benefits? The user or the infrastructure provider? Now need to consider: • Ongoing resource costs • OPEX rather than (sunk) CAPEX • Opportunity costs • Amazon Web Service pricing of compute instances, SQS and lambda micro-services shows engagement with opportunity cost issues Virtualisation
  • 28. © Predictable Network Solutions Ltd 2017 www.pnsol.com 28 Need quantitative intent • At the very least, some indication of what is really not acceptable • Comparison with something else (that is measurable) will do • But not “the best possible” or “whatever will sell” Can approximate best possible ∆Q • Use to quickly check degree of feasibility • Add up minimum times • Means and variances also add • Puts bounds on future improvement • Early visibility of performance hazards • Early mitigation General ICT application
  • 29. © Predictable Network Solutions Ltd 2017 29 www.pnsol.com Summary • System performance validation consists of: • Analysing the interaction of system behaviour and subsystem performance requirements • Showing that this ‘adds up’ to meet the quantified user requirements (QTA) • System performance verification consists of showing that subsystems meet their QTAs • By analysis and/or measurement of ∆Q of the subsystems’ observable behaviour • This provides acceptance and/or contractual criteria for third-party subsystems or services • Substantially reduces performance integration risks and hence re-work
  • 30. © Predictable Network Solutions Ltd 2017 www.pnsol.com What now ?
  • 31. © Predictable Network Solutions Ltd 2017 31 www.pnsol.com The questions Given the right inputs, this sort of analysis takes relatively little time and effort, however: • Quantitative intent is fundamental • Where can this be found? • Which parts of the system decomposition are forced? • E.g. use of existing infrastructure, such as a network • How to handle subsystems with undocumented characteristics? • Can mitigate this with in-life measurement of ∆Q driven by the validation • Forewarned is forearmed • Inexpensive, focused data rather than wide-angle ‘big data’ approach
  • 32. © Predictable Network Solutions Ltd 2017 www.pnsol.com Thank you!

Notes de l'éditeur

  1. PEnDAR - Performance ENsurance by Design, Analysing Requirements TSB REFERENCE: 132304 Why? Seeing cost/performance hazards becoming visible late in the development process – too late to save some projects! Multi-$B problem worldwide Pressure to re-purpose commodity infrastructure for safety/mission-critical objectives; need to be able to articulate a safety case.
  2. Constraints on system developers: cosmic, ludic and ecological. Constraints on applicability of new approaches: established procedures, market inertia etc.
  3. We went over this picture in the first webinar. The core is to understand the relationship between the delivery of outcomes and consumption of resources. Other aspects (exceptions/failures, scaling, variability/correlations) can then be incorporated into the model.
  4. A Quantitative Timeliness Agreement (QTA) is a relationship between the demand (the applied load, including its pattern) and the delivered quality impairment (as a probability distribution, ∆Q) Opportunity cost between one system and another sharing the same resources, and successive refinements won’t be considered in this webinar.
  5. This can be combined with a corresponding analysis of the resource consumption
  6. We consider a very generic remote procedure call in which a front end (which could be a web browser, an embedded sensor, a smart meter etc.) engages with a transaction across a network with a back end that includes an interaction with a database of some sort. This is illustrated in this slide, which shows the system components in blue and the typical sequence of events as numbered circles. RPC is a common design pattern: DNS lookup Web request IoT sensor system Billing system e.g. smart meters
  7. We measure the performance of the system on the basis of passage times between the labelled observation points A - F, which we characterise using improper random variables, ∆Q.
  8. From a quality impairment perspective, we can ‘roll up’ the last stage of the process into a ∆Q – from the viewpoint of the rest of the system, how the transition from C to D occurs is irrelevant, we are just interested in how long it might take and how likely it is to fail (its ∆Q).
  9. In the same way we can combine the network portions with the back end to give a composite ∆Q.
  10. Finally we can roll up the front end behaviour to give the ∆Q for the whole system. This is the quality impairment from the ‘user’ point of view, which is where we can start to impose requirements.
  11. This is where we need to bring in a quantitative intent: how rapidly and reliably we want the system to perform. Here we choose some numbers for the sake of an example, which are plotted on the right as a CDF.
  12. We show the state diagram for the front-end process. Note that not every situation has an explicit notification of failure as we do here. We’ll get round to C and D later!
  13. If N=4 we can unroll the process like this. We have a non-preemptive first-to-finish synchronisation between receiving the response and the timeout. Which route is taken is a probabilistic choice, shown here with blue and green examples. Which path is taken depends on the ∆Q of the B – E path, represented here by the ‘wait for response’ states.
  14. The network characteristics are assuming communication between a UK-broadband-connected front-end and a US East Coast cloud-located server. Combining ∆Qs means convolving the corresponding probability distributions – this is mathematically quite straightforward and computationally inexpensive.
  15. The first curve shows the server response ∆Q, the second its convolution (twice) with the network transport ∆Q. On the right we see the result of combining this with the behaviour of the front-end, compared with the performance requirement.
  16. Having constructed a model, it is now easy to experiment with varying some parameters.
  17. The server performance is the same but the network has higher delays – equivalent to moving the server from London the the US West Coast, with more variety of routing choices. We can see on the right that the resulting overall ∆Q no longer meets the requirement.
  18. We now add in a probability of the server following a ‘slow path’ (which could be as a result of load exceeding an aspect of the virtualisation constraint), modeled as a uniform delay of between 15ms and 150ms. On the left we have five ∆Q curves for the server response, with the slow path probability varying from 0% (the original curve) to 25%, 50%, 75% and 100%. On the right we see the ∆Q resulting from combining this with the front-end behaviour; even a 25% chance of following the slow path breaks the performance requirement.
  19. In this short example we have touched on several aspects of the problem space we described in the first webinar.
  20. Virtualisation forces detailed costing of resource usage – different from the typical situation of ‘sunk cost’ in a piece of hardware. If a server has been paid for, we don’t care whether its memory is 25% or 75% used, but if we have to pay for memory in a virtual server such a factor of 3 may be significant!
  21. Applying a performance methodology absolutely requires some bounds on what is acceptable. ‘Better than the competition” or ‘no worse than last week’ will do provided these things can be measured. Minimum possible ∆Qs can easily be estimated and added up to determine whether requirements are at all feasible (for example whether it’s possible to have the server in another continent). This puts bounds on how much slack any real implementation has, and thus indicates where performance hazards might appear (and hence where mitigation might be helpful).
  22. Preparation of the model used here was a day’s work. Creating all the graphs shown takes < 1s of processing on a laptop. Decomposing subsystem requirements can also be used to establish operating limits that can be continually measured.