2. About the Speaker
• Alexander Podelko
• Specializing in performance since 1997
• Currently Consulting Member of Technical Staff at Oracle
(Stamford, CT)
• Performance testing and optimization of Enterprise
Performance Management (EPM) a.k.a. Hyperion products
• Board director at Computer Measurement Group (CMG) – non-
profit organization of performance and capacity professionals
Disclaimer: The views expressed here are my personal views only and do not necessarily represent those of my current or previous employers. All brands
and trademarks mentioned are the property of their owners.
3. Why do we need modeling and where are we?
How much math do we need here?
How may we approach the challenge?
4. Starting Points
• Importance of performance
• Preaching to the choir
• Importance of modeling / large picture
• A part of performance engineering
• Provide a large picture
• Validate results
• Complement testing
• Much cheaper (when a model is already built)
5. Modeling Cases
• The classic case: what-if analysis
• Workload growth
• Configuration change
• Projecting performance test results
• Now: Continuous performance engineering
• Projecting continuous / component / service performance testing
• Discussing “white box” modeling here
• To be able to make changes in underlying architecture
• Not touching data extrapolation: time series forecasting, trending, ML, etc.
6. Two Opposite Views
• Authors describe in detail how you can model performance using
sophisticated math
• You feel that as soon as you comprehend that math, you won’t have any problem with
modeling
• Authors say that it is black magic and you’d better stay away from it [or
do it in a minimal way with simple trending]
• While you probably won’t see that view in serious books, it can often be seen in
Internet discussions
7. 1966: Instrumentation
• 1966 – SMF (System Management Facilities) released as part of OS/360
• Still in use
• Big Data ?
• Deep Diagnostics ?
• IT Operations Analytics ?
• Observability ?
8. 1977: Performance Analysis Tool
• 1977 – BEST/1, a capacity and performance management tool, was released
by BGS Systems
• The first commercial package for computer performance analysis based on
analytic models
• BGS Systems was acquired
by BMC Software in 1998
9. Two Main Challenges
• Modeling works well for known resource limitations
• You can’t model what you don’t know
• Lack of performance-related metrics of hardware to use in modeling
• Hardware specifications won’t tell you how fast your systems would work on this
hardware
10. Tools - Analytical
• BEST/1 [acquired by BMC, TrueSight Capacity Optimization now (??)]
• TeamQuest [acquired by HelpSystems]
• Metron Athene [acquired by SyncSort]
• PDQ - open source by Dr. Neil Gunther
• SPE· ED – by Dr. Connie U. Smith
• JMT – Java Modelling Tools
11. Tools - Simulation
• HyPerformix [acquired by CA]
• Palladio
• QPME (Queueing Petri net Modeling Environment)
• Not IT-specific / Discrete Event Simulation
• MathWorks SimEvents
• OMNeT++
• Many other commercial and open source tools
12. Decline of Modeling Tools
• No modeling tool in active development [according to my knowledge]
• Existing tools don’t keep up with modern trends
• Scaling out
• Lower need for queueing models
• Drastic increase in sophistication
• Need to model very complex architectures
• Configuration discovery
• Large variations in systems, no generic approach
13. Resources
• CPU, I/O, memory, and network
• And software objects
• Connection and thread pools, caches, etc.
• Resource Utilization
• Related to a particular configuration
• Often generic policies like CPU below 70%
• Relative values (in %) are not useful if configuration is not given
• Commercial Off-the-Shelf (COTS) software
• Virtual environments
14. Resources: Absolute Values
• Absolute values
• # of instructions, I/O per transaction
• Seen mainly in modeling
• MIPS in mainframe world
• IOPS, MiB/s
• Importance increases again with the trends of virtualization, cloud
computing, and SOA
• VMware: CPU usage in MHz
• Microsoft: Megacycles
• Amazon: EC2 Compute Units (ECU)
• Oracle: OCPU
15. Why do we need modeling and where are we?
How much math do we need here?
How may we approach the challenge?
19. Operational Laws
• Utilization Law
• Utilization = Service Time x Throughput
• Service Demand Law
• Service Demand = Utilization / Throughput
• Forced Flow Law
• Component Throughput = Visits x System Throughput
• Little’s Law
• Number in system = Response Time x Throughput
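The four laws above can be sanity-checked with a few lines of arithmetic. The numbers below are illustrative assumptions, not measurements from any real system:

```python
# Sanity-checking the operational laws on assumed, illustrative numbers.

throughput = 50.0        # X: completed requests per second
service_time = 0.012     # S: seconds of service per request

# Utilization Law: U = S * X
utilization = service_time * throughput          # 0.6, i.e. 60%

# Service Demand Law: D = U / X (recovers per-request demand from measurements)
service_demand = utilization / throughput        # 0.012 s

# Forced Flow Law: X_k = V_k * X (a component sees visits * system throughput)
visits_to_db = 3                                 # each request hits the DB 3 times
db_throughput = visits_to_db * throughput        # 150 ops/s

# Little's Law: N = X * R
response_time = 0.2                              # seconds
in_system = throughput * response_time           # 10 concurrent requests

print(utilization, service_demand, db_throughput, in_system)
```

Note how the Service Demand Law is just the Utilization Law inverted: measure utilization and throughput in production, and per-request demand falls out.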
20. Universal Scalability Law
• By Dr. Neil Gunther
• Concurrency - ideal parallelism
• Contention (α) – for shared resources
• Coherency (β) – crosstalk delay
• Extension of Amdahl’s Law (β=0)
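The USL is simple enough to evaluate directly. A minimal sketch, with α and β values assumed for illustration rather than fitted to real data:

```python
def usl_capacity(n, alpha, beta):
    """Universal Scalability Law: relative capacity at concurrency n.

    C(n) = n / (1 + alpha*(n - 1) + beta*n*(n - 1))

    alpha models contention for shared resources, beta the coherency
    (crosstalk) penalty. With beta = 0 it reduces to Amdahl's Law.
    """
    return n / (1 + alpha * (n - 1) + beta * n * (n - 1))

# Illustrative parameters (assumed, not measured):
alpha, beta = 0.03, 0.0005
for n in (1, 8, 32, 64):
    print(n, usl_capacity(n, alpha, beta))
```

Because of the β term, C(n) eventually peaks and declines (near n = sqrt((1 − α)/β)); this is how the USL can predict retrograde scaling, which Amdahl's Law cannot.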
23. Sometimes a Linear Model Works
• Linear model may often work for multi-processor machines
• Considering the working range of CPU utilization
• Most IT shops don't want more than 70-80%
• Throughput and CPU utilization increase proportionally
• Response times increase insignificantly
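One way to check whether the linear range holds, sketched with made-up measurements: within the linear working range, utilization divided by throughput (the service demand) stays roughly constant.

```python
# Checking the linear working range: within it, CPU utilization grows
# proportionally to throughput, so their ratio (the per-request service
# demand) stays roughly constant. Numbers below are illustrative only.
measurements = [  # (throughput req/s, CPU utilization as a fraction)
    (100, 0.11), (200, 0.22), (300, 0.34), (400, 0.46), (500, 0.62),
]
demands = [u / x for x, u in measurements]
baseline = demands[0]
# Flag each point as "linear" if its demand is within 10% of the baseline.
linear = [abs(d - baseline) / baseline < 0.10 for d in demands]
print(linear)
```

The points flagged True fall within the linear working range; once the demand starts climbing (queueing effects kick in), the linear model stops being safe.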
24. Example
• Simple queuing model was built using TeamQuest
• Specific workload executed on a four-way server
• Eight different load levels
• Step 1 represents 100-user workload, each next step adds 200 users, step 8 represents
1,500 users
• A linear model would be good through step 6
• With CPU utilization at 87%
25. CPU Utilization
[Chart: System (Test_Windows) CPU (Application) utilization by workload (Request Medium), steps 1–8, 0–100%]
28. Why do we need modeling and where are we?
How much math do we need here?
How may we approach the challenge?
29. Building a Model
• Significantly increases the value of performance testing
• One more way to validate tests and help to identify problems
• Answers questions about sizing the system
• Doesn't need to be a formal model
• May be just simple observations as to how much resources are needed by each
component for the specific workload
30. May Be as Simple as That
• Software needs X units of hardware power per request
• Processor has Y units of hardware power
• The number of such processors N needed for processing Z requests is
N = Z * X / Y
• See the need in absolute values?
• We ignore all non-linear effects here – to be verified
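This back-of-the-envelope sizing can be written down directly; rounding up to whole processors is the only wrinkle. The values below are hypothetical:

```python
import math

def processors_needed(z_requests, x_units_per_request, y_units_per_processor):
    """N = Z * X / Y, rounded up to whole processors.

    Ignores all non-linear effects (contention, coherency), as noted above -
    the result should be verified against the linear working range.
    """
    return math.ceil(z_requests * x_units_per_request / y_units_per_processor)

# E.g. 1,000 requests/s at 2 power units each, on 300-unit processors:
print(processors_needed(1000, 2, 300))  # 7
```

Note that both X and Y must be in the same absolute units (e.g. megacycles, ECU) for this to work at all, which is exactly why absolute values matter.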
31. Principles
• “All models are wrong, but some are useful”
• by George Box
• "Everything should be made as simple as possible, but not simpler"
• by Albert Einstein
32. Today’s Challenges
• Systems became very sophisticated
• Chances to find an appropriate tool are limited
• Some tools disappeared, others are rather basic
• Chances to find an appropriate “template” are limited
• Depends on the degree of non-linearity, etc.
• Probably at least a combination of existing approaches
• Use ideas
• Not much else available
33. HyPerformix Approach
• HyPerformix docs / papers by Richard Gimarc, Amy Spellmann
• Three types of data (Type 1, 2, and 3)
• T1 Single Business Function Trace
• tier-to-tier workflow of each of the application’s business functions
• T2 Single Business Function Load Test
• system resources required to process each business function on each server
• T3 Load Test
• typical load test with a mixed workload
• model validation
• WIFR – Workload, Infrastructure, Flow, Resource
34. T1 - Architecture
• HyPerformix: Flow data
• Communications and flow between tiers
• Visit counts to tiers
• Message sizes for each communication
• Architecture and data / processing flow
• Configuration discovery
• Already exists in some tools (APM, AIOps, etc)
• Communication between tiers/component
35. T2 – Resources
• Run independent tests for each business function (T2)
• To see how much resources each function requires
• Per tier / component
• Steady load
• Load shouldn't be too light
• So results are not distorted by noise
• Load shouldn't be too heavy
• To avoid non-linear effects
36. T3 - Realistic Workload
• Realistic load test
• Mix of business functions (of T2)
• Way to validate model
• Steady load
• Not too low
• Not too high
37. TeamQuest Approach
• Steady production or test load
• Not too light, not too heavy
• Realistic mix (T3 in HyPerformix terms)
• Get model parameters from it
• Model what-if scenarios for load / configuration changes
• No easy way to change the mix
39. Positive Attitude
• In modeling concentrate on what you can predict
• Value of the model
• Understanding what is outside your models
• Not a reason not to do it at all
40. The Future of Performance Modeling
[Diagram: Configuration Discovery, Performance Testing, Workload Analysis, and Production Monitoring feeding Modeling]
• Continuous performance testing – modeling as quality gates
[similar to Keptn / Pitometer]
• Performance / capacity / costs what-if analysis
41. Summary
• Modeling is an important part of performance engineering
• Even more so as full-scale tests become rarer
• You do not necessarily need sophisticated math
• Minimal math and common sense are often enough
• Good performance engineers do have non-formalized models in their mind
• But systems become too complex to keep it all in mind
• Let’s revive the art !