2. Introduction
• The Web’s growth has transformed communications and business
services such that the speed, accuracy, and availability of
network-delivered content have become critical.
• Caches are the typical solution for content delivery.
• But to what extent?
5. Problems with Caching
• A significant fraction (>50%?) of HTTP objects is uncacheable
• Sources of dynamism?
– Dynamic data: Stock prices, scores, web cams
– CGI scripts: results based on passed parameters
– Cookies: results may be based on passed data
– SSL: encrypted data is not cacheable
– Advertising / analytics: owner wants to measure # hits
• Random strings in content to ensure unique counting
• But…much dynamic content small, while static content large (images, video,
.js, .css, etc.)
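
The sources of dynamism listed above can be turned into a rough cacheability check. The following is a minimal sketch, not a full HTTP caching implementation: the function name `is_probably_cacheable`, the `/cgi-bin/` path test, and the specific header checks are assumptions chosen for illustration.

```python
from urllib.parse import urlsplit


def is_probably_cacheable(url, request_headers, response_headers):
    """Heuristic: False if any source of dynamism from the slide applies."""
    parts = urlsplit(url)
    req = {k.lower(): v for k, v in request_headers.items()}
    resp = {k.lower(): v for k, v in response_headers.items()}

    # SSL: a shared cache cannot read (or reuse) encrypted data.
    if parts.scheme == "https":
        return False
    # CGI scripts: the result depends on the passed parameters.
    if parts.query or "/cgi-bin/" in parts.path:
        return False
    # Cookies: the result may depend on the passed data.
    if "cookie" in req or "set-cookie" in resp:
        return False
    # The owner may explicitly forbid shared caching (e.g. to count hits).
    cache_control = resp.get("cache-control", "").lower()
    if "no-store" in cache_control or "private" in cache_control:
        return False
    return True
```

A plain image fetched over HTTP passes the check, while the same URL over HTTPS, with a query string, or with a cookie attached does not.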
6. Content Distribution Networks (CDN)
• A content provider such as CNN or Yahoo pays a CDN company
(such as Akamai) to get its content to the requesting users with
short delays.
• A CDN provides a mechanism for
• Replicating content on multiple servers in the Internet
• Providing clients with a means to determine the servers that can
deliver the content fastest.
7. CDN Distribution
• Content providers are CDN customers
Content replication
• CDN company installs thousands of
servers throughout Internet
– In large datacenters
– Or, close to users
• CDN replicates customers’ content
• When provider updates content,
CDN updates servers
[Figure: origin server in North America, CDN distribution node, and CDN servers in S. America, Europe, and Asia]
8. Server Selection Policy
• Live server
– For availability
• Lowest load
– To balance load across the servers
• Closest
– Nearest geographically, or in round-trip time
• Best performance
– Throughput, latency, …
• Cheapest bandwidth, electricity, …
Requires continuous monitoring of liveness, load, and performance
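
One way the criteria above could be combined is sketched below. The `Server` record, the `max_load` threshold of 0.8, and the ordering (liveness first, then load, then round-trip time) are all assumptions for illustration. The slide does not say how a real CDN weighs these factors, and cost (bandwidth, electricity) is left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Server:
    name: str
    alive: bool     # result of the most recent liveness probe
    load: float     # fraction of capacity in use, 0.0 to 1.0
    rtt_ms: float   # measured round-trip time to the client


def select_server(servers: List[Server], max_load: float = 0.8) -> Optional[Server]:
    """Pick a live server, preferring lightly loaded ones, then lowest RTT."""
    live = [s for s in servers if s.alive]
    if not live:
        return None
    # Prefer servers under the load threshold; fall back to any live server.
    candidates = [s for s in live if s.load <= max_load] or live
    return min(candidates, key=lambda s: (s.rtt_ms, s.load))
```

The `alive`, `load`, and `rtt_ms` fields are exactly the values that the continuous monitoring has to keep fresh. Stale measurements would make the selection meaningless.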
9. Server Selection Mechanism
• Application
– HTTP redirection
• Advantages
– Fine-grain control
– Selection based on
client IP address
• Disadvantages
– Extra round-trips for
TCP connection to
server
– Overhead on the
server
[Figure: client sends GET, receives Redirect, sends a second GET, receives OK]
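
The redirection step can be sketched as a function that maps the client IP address to a replica and returns an HTTP 302 response. The replica hostnames and address blocks below are hypothetical (documentation-only ranges and `example.net` names), and a real deployment would sit behind an actual HTTP server.

```python
import ipaddress

# Hypothetical mapping from client address blocks to replica hostnames.
REPLICAS = [
    (ipaddress.ip_network("192.0.2.0/24"), "eu.cdn.example.net"),
    (ipaddress.ip_network("198.51.100.0/24"), "asia.cdn.example.net"),
]
DEFAULT_REPLICA = "origin.example.net"


def redirect_for(client_ip, path):
    """Return (status, headers) redirecting the client to a chosen replica."""
    addr = ipaddress.ip_address(client_ip)
    # Fine-grained control: the choice is made per request, per client IP.
    host = next((h for net, h in REPLICAS if addr in net), DEFAULT_REPLICA)
    return 302, {"Location": f"http://{host}{path}"}
```

The cost named on the slide is visible here: the client must issue a second GET to the `Location` target, which means another connection set-up before any content arrives.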
10. Server Selection Mechanism
• Routing
– Anycast routing
• Advantages
– No extra round trips
– Route to nearby server
• Disadvantages
– Does not consider
network or server load
– Different packets may
go to different servers
– Used only for simple
request-response apps
[Figure: the same prefix, 1.2.3.0/24, announced from two locations]
11. Server Selection Mechanism
• Naming
– DNS-based server
selection
• Advantages
– Avoid TCP set-up delay
– DNS caching reduces
overhead
– Relatively fine control
• Disadvantages
– Based on IP address of
local DNS server
– “Hidden load” effect
– DNS TTL limits adaptation
[Figure: DNS query from the local DNS server, with server addresses 1.2.3.4 and 1.2.3.5]
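
The trade-offs above can be made concrete with a small simulation. This is a sketch under stated assumptions: the addresses 1.2.3.4 and 1.2.3.5 come from the slide, but the resolver address blocks, the 20-second TTL, and the names `resolve` and `LocalDNSCache` are hypothetical.

```python
import ipaddress

# Hypothetical table: address block of the local DNS server -> CDN node.
NODE_FOR_RESOLVER = [
    (ipaddress.ip_network("192.0.2.0/24"), "1.2.3.4"),
    (ipaddress.ip_network("198.51.100.0/24"), "1.2.3.5"),
]
FALLBACK_NODE = "1.2.3.4"
TTL_SECONDS = 20  # assumed value; a short TTL allows faster adaptation


def resolve(name, resolver_ip):
    """Answer as the CDN's DNS server: it sees only the resolver's IP."""
    addr = ipaddress.ip_address(resolver_ip)
    node = next((n for net, n in NODE_FOR_RESOLVER if addr in net), FALLBACK_NODE)
    return {"name": name, "address": node, "ttl": TTL_SECONDS}


class LocalDNSCache:
    """A local DNS server that reuses one answer for all of its clients."""

    def __init__(self, resolver_ip):
        self.resolver_ip = resolver_ip
        self._cache = {}  # name -> (address, expiry time)

    def lookup(self, name, now):
        hit = self._cache.get(name)
        if hit and now < hit[1]:
            return hit[0]  # cached: the CDN never sees this request
        answer = resolve(name, self.resolver_ip)
        self._cache[name] = (answer["address"], now + answer["ttl"])
        return answer["address"]
```

Every client behind one `LocalDNSCache` receives the same node until the TTL expires, however many clients there are. That is the "hidden load" effect, and it is also why the TTL bounds how quickly the CDN can shift traffic.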
12. CDNs scale Web servers by having clients
get content from a nearby CDN node
(cache)
13. Directing clients to nearby CDN nodes with DNS:
– Client query returns local CDN node as response
– Local CDN node caches content for nearby clients
and reduces load on the origin server
14. Origin server rewrites pages to serve
content via CDN
[Figure: traditional Web page on server, compared with a page that distributes content via CDN]
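
The rewriting step might look like the sketch below, which points root-relative references to static objects at a CDN hostname and leaves everything else on the origin. The hostname `cdn.example.net`, the URL layout (origin hostname embedded in the CDN path), and the extension list are assumptions. The slide does not specify the rewriting scheme, and a regular expression is used here only for brevity, not as a robust HTML parser.

```python
import re

CDN_HOST = "cdn.example.net"  # hypothetical CDN hostname
STATIC_EXT = (".gif", ".jpg", ".png", ".js", ".css")

# Matches src="/..." or href="/..." (root-relative, not protocol-relative).
_REF_RE = re.compile(r'(src|href)="(/(?!/)[^"]*)"')


def rewrite_page(html, origin_host):
    """Rewrite references to static objects so they are fetched via the CDN."""
    def repl(match):
        attr, path = match.group(1), match.group(2)
        if path.lower().endswith(STATIC_EXT):
            return f'{attr}="http://{CDN_HOST}/{origin_host}{path}"'
        return match.group(0)  # dynamic or HTML content stays on the origin
    return _REF_RE.sub(repl, html)
```

With this scheme the HTML page itself is still served by the origin, while the large embedded objects are fetched from the CDN.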
16. Conclusions (IMW'2001, 11/02/2001)
• There is a clear increase in the number and percentage of
popular origin sites using CDNs
• CDNs performed significantly better than origin sites,
although caching options reduce the performance difference
• Content distribution is hard
– Many, diverse, changing objects
– Clients distributed all over the world
– Reducing latency is king