An 86-slide presentation, derived from years of practical experience, covering the following topics:
- understanding HTTP, the web protocol (and other related standards, e.g., XML)
- assessment of web performance (bytes and turns, Apdex)
- recommendations in web performance (web fat index, caching strategy, compression strategy)
- troubleshooting the web (user experience, access topologies, service variations, toolbox)
- golden rules for web robustness
2. why do we have to learn about web performance?
Response times of corporate distributed applications
Network? Hosts? Application?
The network team has to take the initiative
• Network: Bandwidth, Latency, Congestion
• Hosts: Client, Server, Processing Time, Sending Time
• Application: Too many exchanges, Too much data, Serialized requests …
3. training on web performance
q Understanding HTTP, the web protocol
§ specifications and implementations
q Best practices in web performance management
§ assessment of web performance
§ recommendations to web designers and to webmasters
q Troubleshooting the web
§ practical steps and methodology
§ mastering the toolbox
q This training is for you if:
§ you have to interact with web designers or web masters,
§ you would like to refine ITM positioning,
§ you are curious about improving or troubleshooting web transactions.
6. what the w3 has defined:
q the idea of a boundless world in which all items have a
reference by which they can be retrieved,
q the address system (URL) implemented to make this
world possible, despite many different protocols,
q a network protocol (HTTP) used by native web servers
giving features not otherwise available,
q a markup language (HTML) which every web client is
required to understand, and is used for the
transmission of basic things,
q the body of data available on the Internet using all or
some of the preceding listed items.
7. most related standards are freely available
q IETF
§ RFC2616 – HTTP 1.1
§ RFC2617 – HTTP authentication
§ RFC2518 – WebDAV extensions
§ RFC2246 – TLS 1.0
§ RFC2818 – HTTP over TLS
§ RFC2109 - cookies
§ RFC1952 – gzip
§ RFC2045, 2046, 2047 – MIME types
§ RFC4287 – ATOM 1.0
q W3C
§ XML
8. HTTP is about clients talking to servers
[Diagram: the web browser sends some HTTP request across the IP network to the web server, which returns the server answer]
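As an illustration of the request and answer exchanged on this slide, here is a minimal sketch in Python. The helper names `build_request` and `parse_status_line` and the host `www.example.com` are hypothetical; the sketch only formats the text of a GET request and splits the first line of an answer, it does not open any connection.

```python
def build_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request, as a web browser would send it."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",        # the Host header is mandatory in HTTP/1.1
        "Connection: close",    # ask the server to close the socket afterwards
        "",                     # an empty line ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(line: bytes):
    """Split the first line of the server answer into version, code, reason."""
    version, code, reason = line.decode("ascii").rstrip("\r\n").split(" ", 2)
    return version, int(code), reason


if __name__ == "__main__":
    print(build_request("www.example.com", "/index.html").decode("ascii"))
    print(parse_status_line(b"HTTP/1.1 200 OK\r\n"))
```

The same status line parsing applies to the 200 and 304 answers discussed in the caching slides.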
11. HTML mixes data and their presentation
Content and related resources
Document structure
Style sheets
<html>
<head>
</head>
<body>
<h1>The title of my page</h1>
<hr>
<table border="1" width="80%" align="center">
<tr>
<td><img src="bouquetin.gif"></td>
<td>This is the content of this cell. </td>
</tr>
</table>
<hr>
</body>
</html>
12. XML structures data
Extensible Markup Language, or XML for short, is a new technology for web
applications.
XML is a World Wide Web Consortium standard that lets you create your own tags.
XML simplifies business-to-business transactions on the web
With XML, you can understand the meaning of the tags.
More importantly, a computer can understand them as well.
It's easier for a computer to understand that the tag <zipcode>34829</zipcode> is
a zip code.
A DTD (Document Type Definition) is used to identify the tags to be used in the XML
message.
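To illustrate how a program can pick out the zip code by its tag, here is a small sketch using the Python standard library. The wrapping `<address>` element, the `<city>` value and the function name `extract_zipcode` are assumptions made for the example; only the `<zipcode>` tag comes from the slide.

```python
import xml.etree.ElementTree as ET


def extract_zipcode(xml_text: str) -> str:
    """Return the text of the <zipcode> child of the root element."""
    root = ET.fromstring(xml_text)
    return root.findtext("zipcode")


if __name__ == "__main__":
    # Hypothetical message: only the <zipcode> element is taken from the slide.
    message = "<address><city>Anytown</city><zipcode>34829</zipcode></address>"
    print(extract_zipcode(message))
```

Note that this sketch only checks that the message is well-formed; validating it against a DTD would need a validating parser, which is outside the Python standard library.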
13. Example: from SPEC2000 to XML
SPEC2000 message:
CAM
S1BOOKED/QF2/81205/USD/1/BNO 3/
341/EOIJ1234567/65B2468/1/EA/25.20/15073
Equivalent XML message:
<S1CPOXMT>
<S1CPOXMTcmd>
<CCD>S1BOOKED</CCD>
<CIC>QF2</CIC>
<SPL>81205</SPL>
<ICR>USD</ICR>
<POC>1</POC>
<BNO>3</BNO>
<S1CPOXMT_ORDER>
<TNC>341</TNC>
<CPO>EOIJ1234567</CPO>
<PNR>65B2468</PNR>
<QTO>1</QTO>
<UNT>EA</UNT>
<UNP>25.20</UNP>
<SSD>20030715</SSD>
</S1CPOXMT_ORDER>
</S1CPOXMTcmd>
</S1CPOXMT>
14. DTD is used to identify XML tags
DTD:
<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT S1CPOXMT (SPEC_HDR?, S1CPOXMTcmd)>
<!ELEMENT S1CPOXMTcmd (CCD, CIC, SPL, ICR, POC, BNO?, QTL?, EVT?, S1CPOXMT_ORDER+)>
<!ELEMENT S1CPOXMT_ORDER (TNC, CPO, PNR, QTO, UNT, UNP, SSD, SSM?, SHT?, QTN?, SDC?, MFR?, PRI?, ACN?, DIS?, PKG?, LSE?, PBR?, PDP?, VRN?, ACK?, POU?, CTN?, REM*)>
Include SPEC HDR
Include Common Support Data Dictionary
XML message:
<S1CPOXMT>
<S1CPOXMTcmd>
<CCD>S1BOOKED</CCD>
<CIC>QF2</CIC>
<SPL>81205</SPL>
<ICR>USD</ICR>
<POC>1</POC>
<BNO>3</BNO>
<S1CPOXMT_ORDER>
<TNC>341</TNC>
<CPO>EOIJ1234567</CPO>
<PNR>65B2468</PNR>
<QTO>1</QTO>
<UNT>EA</UNT>
<UNP>25.20</UNP>
<SSD>20030715</SSD>
</S1CPOXMT_ORDER>
</S1CPOXMTcmd>
</S1CPOXMT>
16. The writeable web is a highly fragmented environment:
business (micro-transactions), process (on-demand interactions),
people (user-driven innovation), information (uncorrelated data)
18. web server performance
End-to-end Response Time
Service Time, Transfer Time, User Time
[Diagram: from the data center across the IP infrastructure to the user: Service, Security, Distance, Last mile, Rendering]
19. (bare) performance model
Response Time = Service Time + Transfer Time + User Time
Transfer Time = Bytes / Throughput + Turns * Latency / Parallelism
Reduce turns (web designer)
Check available bandwidth
Reduce latency
Avoid drops
Check size of TCP window
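A worked example may help to read the model. The sketch below assumes that the second line of the model expands Transfer Time, that throughput is expressed in bytes per second and latency in seconds; the function names and the sample figures are hypothetical.

```python
def transfer_time(total_bytes, throughput_bytes_per_s, turns, latency_s, parallelism=1):
    """Transfer Time = Bytes / Throughput + Turns * Latency / Parallelism."""
    if throughput_bytes_per_s <= 0 or parallelism <= 0:
        raise ValueError("throughput and parallelism must be positive")
    return total_bytes / throughput_bytes_per_s + turns * latency_s / parallelism


def response_time(service_time_s, transfer_time_s, user_time_s):
    """Response Time = Service Time + Transfer Time + User Time."""
    return service_time_s + transfer_time_s + user_time_s


if __name__ == "__main__":
    # Hypothetical page: 250,000 bytes over 125,000 bytes/s (1 Mbit/s),
    # 40 turns at 100 ms latency, 2 requests in parallel.
    t = transfer_time(250_000, 125_000, 40, 0.1, parallelism=2)
    print(f"transfer time: {t:.1f} s")
    print(f"response time: {response_time(0.5, t, 0.3):.1f} s")
```

With these figures the bytes and the turns each cost 2 seconds, which is why reducing turns matters as much as adding bandwidth.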
20. bandwidth vs. distance
Number of lanes = bandwidth impact (throughput)
Maximum speed = distance impact (latency)
21. also consider TCP built-in mechanisms
• The window size limits the bursts sent by the sender
• The window size is defined by the receiver (= size of buffer in memory)
• ‘Slow-start’ is a mechanism to ensure network conditions can support burstiness
• The window is almost closed on any network problem
[Sequence diagram over time between Server (sender) and Client (receiver): SYN, SYN ACK; PACKET 1, ACK 1; PACKET 2, PACKET 3, ACK 3; PACKET 4 to PACKET 7, ACK 7 (the TCP window increases); …; FIN, FIN ACK]
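Because the sender cannot have more than one window of data in flight per round trip, the window size caps the throughput of a single connection. The sketch below is an illustration of that bound (window divided by round-trip time), not a figure from the slides; the function name and the sample values are assumptions.

```python
def max_throughput(window_bytes, rtt_seconds):
    """Upper bound on single-connection throughput, in bytes per second."""
    if rtt_seconds <= 0:
        raise ValueError("round-trip time must be positive")
    return window_bytes / rtt_seconds


if __name__ == "__main__":
    # Hypothetical case: a 65,535-byte receive window over a 260 ms round trip.
    bound = max_throughput(65_535, 0.260)
    print(f"at most {bound / 1000:.0f} kbytes/s, whatever the link bandwidth")
```

This is one reason why checking the size of the TCP window belongs in the performance model alongside bandwidth and latency.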
22. what is a good performance level?
q Look at what others are doing:
§ benchmark against legacy competition (e.g., Boeing vs. Airbus),
§ also consider consumer sites (e.g., lastminute.com vs. airfrance.com)
q Manage user expectations
§ Where do they live?
§ From which network do they visit?
§ What connection speed do they enjoy?
§ What are the peak usage times and patterns of site visitation?
q Integrate technical constraints
§ Static or dynamic pages?
§ Web content, or multimedia?
§ Pure HTML, or Flash, or Java, or AJAX?
23. How Users View Application Task Performance
q Satisfied
§ User maintains concentration
§ Performance is not a factor in the user experience
§ Time limit threshold is unknowingly set by users and is consistent
q Tolerating
§ Concentration is impaired
§ Performance is now a factor in the user experience
§ User will notice how long it is taking
q Frustrated
§ Performance is typically called unacceptable
§ Casual user may abandon the process
§ Production user is very likely to stop working
24. how Apdex works
1. Define T for the application
T = the application target time (threshold between satisfied and tolerating users).
F = threshold between tolerating and frustrated users is calculated (F = 4T).
2. Define a Report Group (details available are tool dependent).
3. Extract data set from existing measurements for Report Group.
4. Count the number of samples in three performance zones.
5. Calculate the Apdex formula.
6. Display Apdex result (T is always shown as part of the result).
ApdexT = ( Satisfied + Tolerating / 2 ) / Total samples
Report Group: Application, User Group, Time Period
Performance zones: Satisfied (up to T), Tolerating (from T to F), Frustrated (beyond F)
Rating scale:
Excellent: 0.94T to 1.00T
Good: 0.85T to 0.94T
Fair: 0.70T to 0.85T
Poor: 0.50T to 0.70T
Unacceptable: 0.00T to 0.50T
[Diagram: Existing Task Response Time Measurement Samples feed steps 2 to 6]
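The six steps can be sketched in a few lines of Python. This is a minimal illustration, assuming the formula Apdex = (Satisfied + Tolerating / 2) / Total samples, F = 4T, and the rating thresholds shown on the slide (0.94, 0.85, 0.70, 0.50); the treatment of samples exactly equal to T or F, the function names and the sample data are assumptions.

```python
def apdex(samples, t):
    """Apdex score for response-time samples and target time t (same unit)."""
    if not samples:
        raise ValueError("at least one sample is required")
    f = 4 * t                                                   # frustration threshold
    satisfied = sum(1 for s in samples if s <= t)
    tolerating = sum(1 for s in samples if t < s <= f)
    return (satisfied + tolerating / 2) / len(samples)          # frustrated count as 0


def rating(score):
    """Map an Apdex score to the rating scale of the slide."""
    if score >= 0.94:
        return "Excellent"
    if score >= 0.85:
        return "Good"
    if score >= 0.70:
        return "Fair"
    if score >= 0.50:
        return "Poor"
    return "Unacceptable"


if __name__ == "__main__":
    measurements = [1, 2, 3, 5, 10, 20]     # hypothetical response times
    score = apdex(measurements, t=3)
    print(f"Apdex = {score:.2f} (T = 3), {rating(score)}")
```

Reporting T next to the score matters: the same measurements give a different score for a different target time.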
25. Apdex examples
On-line stock trading company
§ Within the United States: Apdex 0.79 (T = 3), Fair
§ Outside the United States: Apdex 0.43 (T = 3), Unacceptable
Retirement funds manager, by their customers’ ISP
§ SBC (many dial-up users): Apdex 0.62 (T = 3), Poor
§ Sprint (mix of access types): Apdex 0.72 (T = 3), Fair
§ MCI (only corporate broadband access): Apdex 0.92 (T = 3), Good
Supply chain management system
§ Original system across the United States: Apdex 0.71 (T = 4), Fair
§ Adding content compression: Apdex 0.90 (T = 4), Good
§ Adding transparent turns reduction: Apdex 0.94 (T = 4), Excellent
27. web deployment framework
[Diagram: from the Application (Origin Server) to the End-user across the Deployment Infrastructure (Internet)]
Items to assess: Production, Bandwidth, Interface, Content overhead (fat index), HTTP requests, SSL/TLS sockets, End-to-end latency
28. the web Fat Index
q It is a method to assess the noise / signal ratio, that is, a
measurement of actual overhead
q Fat Index = 10 log( Total Bytes / Useful Bytes )
§ Total Bytes = Sum of all network bytes exchanged during the transaction
§ Useful Bytes = Bytes displayed to the end-user
q Typical values:
§ 1.2 dB for a raw text file
§ 13 dB to 17 dB for popular dynamic Internet sites
§ 22 dB for an intranet web site
29. how to use the web Fat Index?
q A typical portal page:
§ 250 kbytes (including all web objects)
§ 2 kbytes actually useful to the end user
§ Fat Index = 10 log( 250 / 2 ) = 21 dB
q Caching may help a lot, except on first access
q gzip compression may save up to 5 dB
q Shape a strategy, e.g.:
§ 10 dB on front page (first contact with brand)
§ 20 dB on authentication
§ 10 dB during navigation after login
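The portal page example can be checked with a short sketch. It assumes the logarithm is base 10, as usual for decibels; the function name `fat_index` is hypothetical.

```python
import math


def fat_index(total_bytes, useful_bytes):
    """Fat Index in dB = 10 log10(Total Bytes / Useful Bytes)."""
    if total_bytes <= 0 or useful_bytes <= 0:
        raise ValueError("byte counts must be positive")
    return 10 * math.log10(total_bytes / useful_bytes)


if __name__ == "__main__":
    # The typical portal page of the slide: 250 kbytes for 2 kbytes of useful data.
    print(f"{fat_index(250_000, 2_000):.0f} dB")
```

Since the scale is logarithmic, every 10 dB means ten times more bytes on the wire for the same useful content.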
30. where can data be cached?
[Diagram: caches on the path from the Application (Origin Server) across the Internet to the End-user: Reverse proxy cache, CDN (Akamai, Netli), Proxy cache, Browser cache]
31. how to define a caching strategy
q The impact on usability is huge
§ End-users can accept delays on first access
§ Usually, expectations are higher for subsequent accesses
§ Navigating back has to be almost instantaneous
q How to avoid transmitting the same bytes again and again?
§ Browser validation
§ Server validation
§ Expiration scheme
q What cannot be cached?
§ Normally, web objects fetched through SSL/TLS
§ The outcome of POST requests
32. regular population of cache
network
Web Browser Web Server
GET /path/object
200 OK
Last-Modified: <date>
ETag: <unique string>
Content-Length: <bytes>
Whole content is transmitted
33. browser (poor) validation
Observed when server does not validate static content
network
Web Browser Web Server
GET /path/object
If-Modified-Since: …
If-None-Match: …
200 OK
Socket Reset
1 - Whole content is transmitted despite browser indications
2 - Browser tries to break the transfer
3 - The connection cannot be reused for other requests (HTTP 1.1)
34. server (normal) validation
network
Web Browser Web Server
GET /path/object
If-Modified-Since: …
If-None-Match: …
304 Not Modified (use local copy)
One RTT per request
35. cache dynamic objects (1/2)
network
Web Browser Web Server
GET /path/object
200 OK
ETag: <unique string>
Content-Length: <bytes>
Program steps:
1- compute page content
2- ETag = md5(content)
36. cache dynamic objects (2/2)
network
Web Browser Web Server
GET /path/object
If-None-Match: …
304 Not Modified (use local copy)
No data is transmitted if client and server have same values for ETag
Program steps:
1- compute page content
2- ETag = md5(content)
3- if ETags are equal, send 304 Not Modified (use local copy)
4- else send content and code 200 OK
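The program steps for dynamic objects can be sketched as follows. This is a minimal illustration and not a complete server: the function name `respond_with_etag` and the returned tuple (status, headers, body) are assumptions, and the comparison handles a single ETag value only.

```python
import hashlib
from typing import Optional


def respond_with_etag(content: bytes, if_none_match: Optional[str] = None):
    """Return (status, headers, body) for a dynamic page, validating by ETag."""
    # Steps 1 and 2: the page content is already computed, derive the ETag from it.
    etag = '"' + hashlib.md5(content).hexdigest() + '"'
    # Step 3: same ETag on both sides, so the browser can use its local copy.
    if if_none_match == etag:
        return 304, {"ETag": etag}, b""
    # Step 4: otherwise send the whole content with code 200.
    headers = {"ETag": etag, "Content-Length": str(len(content))}
    return 200, headers, content


if __name__ == "__main__":
    page = b"<html><body>Hello</body></html>"
    status, headers, body = respond_with_etag(page)
    print(status, headers["ETag"], len(body))
    status, headers, body = respond_with_etag(page, headers["ETag"])
    print(status, len(body))
```

The server still computes the page on every request; what is saved is the transfer of the bytes, not the processing time.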
37. cache through expiration
network
Web Browser Web Server
GET /path/object
200 OK
Content-Length: <bytes>
Expires: <deadline>
Cache-Control: <max age>
Browser will use cached objects
until time limits defined by server
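As an illustration, the two expiration headers can be produced like this. The sketch assumes a lifetime expressed in seconds; the function name `expiry_headers` and the one-day lifetime in the usage example are hypothetical.

```python
import time
from email.utils import formatdate


def expiry_headers(max_age_seconds: int, now: float = None) -> dict:
    """Build Expires and Cache-Control headers for a given lifetime."""
    if now is None:
        now = time.time()
    return {
        # HTTP dates are expressed in GMT, e.g. "Thu, 01 Jan 1970 01:00:00 GMT"
        "Expires": formatdate(now + max_age_seconds, usegmt=True),
        "Cache-Control": f"max-age={max_age_seconds}",
    }


if __name__ == "__main__":
    # Hypothetical policy: static objects stay fresh for one day.
    for name, value in expiry_headers(24 * 3600).items():
        print(f"{name}: {value}")
```

Until the deadline passes, the browser does not contact the server at all for this object, which saves the round trip that validation still costs.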
38. how to define a compression strategy?
q The impact on usability is huge
§ Sensitivity to network conditions is reduced
§ Impact of TCP Receive Window Size (RWIN) is reduced
q How to avoid transmission overhead?
§ Compression is an effective way to approximate minimum message size
q Technology limitations
§ HTTP headers are not compressed
§ Compression may require more horsepower on servers
§ Some browsers may process compressed content badly
39. how to compress dynamic objects?
network
Web Browser Web Server
GET /path/object
Accept-Encoding:
200 OK
Content-Encoding: gzip
Content-Length: <bytes>
Browser has to accept compression explicitly
Data is uncompressed on client side according to Content-Encoding
Program steps:
1- compute page content
2- if Accept-Encoding, compress content
3- send content
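The program steps for compression can be sketched as follows. This is a simplified illustration: the function name `respond_compressed` is hypothetical, and the parsing of Accept-Encoding ignores quality values (a browser sending `gzip;q=0` would still receive gzip here).

```python
import gzip


def respond_compressed(content: bytes, accept_encoding: str = ""):
    """Return (status, headers, body), gzip-compressed if the browser accepts it."""
    # Step 1: the page content is already computed.
    accepted = [token.split(";")[0].strip().lower()
                for token in accept_encoding.split(",")]
    # Step 2: compress only when the browser announced gzip support.
    if "gzip" in accepted:
        body = gzip.compress(content)
        headers = {"Content-Encoding": "gzip", "Content-Length": str(len(body))}
    else:
        body = content
        headers = {"Content-Length": str(len(body))}
    # Step 3: send the content.
    return 200, headers, body


if __name__ == "__main__":
    page = b"<p>hello</p>" * 200
    _, headers, body = respond_compressed(page, "gzip, deflate")
    print(len(page), "bytes before,", len(body), "bytes after", headers)
```

Content-Length reports the compressed size, since that is what travels on the network.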
43. start from the end-user
1. Monitor the User Experience
2. Explain overall response time by subsequent analysis on connected components
[Diagram: connected components: Security, Web Server, Database, Web Service]
44. overview of past assignments
- Internet sub-optimal routing
- polluting traffic (P2P)
- poor DNS performance
- failure on the intranet
- proxy delay
- 1st hop congestion
[Diagram: path between the Web servers (Origin Server) and the Airline (Proxy) across the Internet, annotated with: content, routing, DNS, P2P, LAN, 1st hop]
45. the methodology for web troubleshooting
q Ensure end users have done their homework
q Agree on reference transactions to be audited
§ Documented sequence of URLs, clicks, etc.
q Profile reference transactions
§ Play transactions in front of servers, and compute footprint figures.
q Capture data based on different scenarios
§ Play transactions in different situations and capture data.
§ Correlate these observations to estimate response time breakdown.
q Identify main sources of latency
§ This will be provided by a graphical “trace route” utility.
§ Also consider proxies and gateways
46. questions that end-users should answer
q Do browsers on the workstations support HTTP/1.1?
q Do corporate proxies support HTTP/1.1 as well?
q Are there fewer than 50 broadcast packets/s on the LAN?
q Network delay and loss rate from workstation to the
proxy? (provided by ping)
q Network delay and loss rate from intranet border to web
servers? (provided by ping)
q Time required to retrieve a public fixed web object
from servers? (found in proxy log)
q Bandwidth utilisation of the Internet connection?
48. sample scenarios
q Client topologies:
§ Direct connection from behind the firewall
§ Direct connection in front of the firewall
§ Connection via Proxy-server
§ Connection via ADSL link (to benchmark the corporate Internet link)
§ Connection via SITA private infrastructure
q Service variations:
§ Direct connection to servers.
§ Connection to alternate servers (e.g., HTTP/1.1 instead of HTTP/1.0)
§ Connection to third party, for instance Akamai or equivalent.
49. start with internal tests
q Play reference transactions with your laptop, from
within the intranet, through the proxy and directly
q In the background, record data
§ At the network level, with Application Vantage or Sniffer or …
§ At the application level, with TracePlus or Tamper Data or HTTPWatch, …
§ At the screen level, with SnagIt or Camcorder or …
q In case of security concerns, you may import Sniffer
captures made by the Third Party into Application
Vantage
50. Sample internal tests
[Diagram labels: SITA laptop, Singapore Airlines intranet, Proxy, TCP/80 & 443, Internet, SITA ATEX, Airbus (Legacy AOLS, New Access Layer)]
51. Border tests
q First, launch PingPlotter to capture data related to
network routes to servers.
q Play reference transactions with your laptop, from the
demilitarized zone, or just in front of the ISP router
q In the background, record data
§ At the network level, with Application Vantage or Sniffer or …
§ At the application level, with TracePlus or Tamper Data or HTTPWatch, …
§ At the screen level, with SnagIt or Camcorder or …
52. Sample border tests
[Diagram labels: SITA laptop, Singapore Airlines intranet, Corporate, PACNet, Internet, Airbus (Legacy AOLS, New Access Layer)]
53. Benchmark test (optional)
q First, launch PingPlotter to capture data related to
network routes to servers.
q Play reference transactions with your laptop, from the
demilitarized zone, or just in front of the ISP router
q In the background, record data
§ At the network level, with Application Vantage or Sniffer or …
§ At the application level, with TracePlus or Tamper Data or HTTPWatch, …
§ At the screen level, with SnagIt or Camcorder or …
54. Sample benchmark test
[Diagram labels: SITA laptop, Singapore Airlines premises, Standalone Internet Broadband, Airbus (Legacy AOLS, New Access Layer)]
55. Service variation (optional)
q Play reference transactions with your laptop against
various hosts
q In the background, record data
§ At the network level, with Application Vantage or Sniffer or …
§ At the application level, with TracePlus or Tamper Data or HTTPWatch, …
§ At the screen level, with SnagIt or Camcorder or …
56. Sample tests related to web acceleration
[Diagram labels: SITA laptop, Singapore Airlines intranet, Proxy, Internet, Akamai, Airbus data center (New Access Layer)]
57. useful tools
q Trace routes through the Internet
§ PingPlotter, from Nessoft
q Network Capture and Analysis
§ Application Vantage, from Compuware
§ Sniffer
§ Ethereal
q Web Capture and Analysis
§ TracePlus Web Detective, from SST
§ HTTPWatch
§ Tamper Data
q Screen capture
§ SnagIt, from TechSmith Corporation
§ Camcorder
58. PingPlotter
q Use it during border or benchmark tests, to capture
data related to end-to-end network path.
q Check every increase of round-trip time. Is it expected
(an ocean to cross) or suspect (congestion)?
q A high level of latency on the first hop proves undersizing of
the Internet link to the ISP.
q If you suspect congestion, capture data early in the
morning, during business hours and late at night.
q If the tool reports changing routes, extend the
duration of capture to prove flapping.
q Drops of ICMP packets are rarely correlated with
poor application performance, except above 30% loss.
59. CPA to Europe, through PCCW
[Trace route: China to USA: 260 ms due to the Pacific Ocean and USA size; USA to France: 100 ms due to the Atlantic Ocean]
60. Regular link to Toulouse
[Trace route: U.A.E., USA, Europe; 175 ms + loss, expected]
61. Air China Infrastructure Assessment: Path from Beijing to Europe I
Average Round Trip Time to Airbus = 354 ms
Packet loss: No Packet Loss
62. Air China Infrastructure Assessment: Path from Beijing to Europe II
Average Round Trip Time to Airbus = 465 ms
Packet loss: Almost No Packet Loss
Round Trip Time has increased by 100 ms
from day 1 to day 3
64. Application Vantage
q Captures network packets (= Sniffer)
q Allows for graphical handling of them (≠ Sniffer)
§ Clean the capture (remove packets afterwards)
§ Compute footprints
q Shows network errors and retransmissions
§ Configure the software to change some errors to warnings
q Allows for a number of complementary tests
§ Number of sockets open
§ Time to open a socket (SYN-ACK) compared to RTT
§ Size of TCP receive window
74. TracePlus or HTTPWatch
q Capture web transactions, as seen at browser level
q Useful to assess proxy efficiency
§ Compare time to first byte through the proxy with time to first byte on direct
access
q Look at response header to check:
§ Cacheability of web objects
§ Compression
§ Web errors (404, 500, …)
75. Explain proxy impact
Emirates Airbus
corporate ADSL
Proxy Server
33 seconds
49 seconds
76. Indirect proof of proxy impact
Cathay Pacific Airbus
corporate Internet
Proxy/NTLM Server
Pilot Response Time Network Delay
50 s
323 ms
25 s
9:00 9:15 9:30 9:00 9:15 9:30
77. SnagIt
q A straightforward capture tool
q Make it visual to help motivate decisions
§ 90 seconds, does it mean something to you?
§ Provide a 90-second video, this will make a big difference
§ Example: transaction using free ADSL link compared to transaction through
corporate saturated link
§ Put several videos on the same slide, and trigger all of them at the same time
78. Knowledge test: how would you prove:
- Internet sub-optimal routing
- polluting traffic (P2P)
- poor DNS performance
- failure on the intranet
- proxy delay
- 1st hop congestion
[Diagram: path between the web servers (Origin Server) and the users (Proxy) across the Internet, annotated with: content, routing, DNS, P2P, LAN, 1st hop]
80. golden rules for web robustness
Web definition
§ Where is the competition?
§ Set objectives for end-user response times
§ Set data URLs for next 10 years
Web development
§ < 20 objects/page
§ < 100 kbytes/HTML
§ 20 dB Fat Index on 1st access, 10 dB afterwards
§ Ask for XHTML/CSS
§ Minimum Javascript
§ No Java applet
Web deployment
§ HTTP/1.1
§ Compress dynamic web pages
§ Validate dynamic web pages
§ Expire static objects
Fat Index = 10 log( Total Bytes / Useful Bytes )
81. takeaways
q Understanding HTTP, the web protocol
§ do your homework, and read specifications
q Best practices in web performance management
§ ask the network team to sponsor you to enter web teams
q Troubleshooting the web
§ if it is not in the IETF specification, it will happen…
§ use your brain and ask your peers