Accelerated Web Content Delivery


Sanjeet Joshi
Architecture Technology Services
HCL Technologies Ltd.

© 2010, HCL Technologies Ltd.

November 2010

The author would like to thank Dr. Usha Thakur of ATS for her valuable help in
content formatting and content enhancement.




                                        NON-DISCLOSURE OBLIGATIONS AND DISCLAIMER

                                        The data, information or material provided herein is
                                        confidential and proprietary to HCL and shall not be
                                        disclosed, duplicated or used in whole or in part for any
                                        purpose other than as approved by an authorized official of
                                        HCL in writing. The recipient agrees to maintain complete
                                        confidentiality of the information; data received and shall
                                        take all reasonable precautions/steps in maintaining
                                        confidentiality of the same, however in any event not less
                                        than the precautions/steps taken for its own confidential
                                        material. If you are not the intended recipient of this
                                        information, you are not authorized to read, forward, print,
                                        retain, copy or disseminate this document or any part of it.
                                        Any statements in this presentation that are not historical
                                        facts may include forward-looking statements that involve
                                        risks and uncertainties; actual results may differ from the
                                        forward-looking statements.




Background

In 1995, when the Internet was still in its infancy, it had about 16 million users worldwide,
compared to about 2 billion users worldwide today¹. Over the last fifteen years not only
has the number of people using the Internet grown exponentially, but we have also witnessed
an evolution of technology standards, protocols, and information consumption patterns. The
Internet is no longer limited to desktop/laptop computers. An increasing number of people on
the go are using handheld devices to access their preferred websites. The easy access of
websites has resulted in a significant increase in Web traffic.

Today while designing a Web application or a website that is expected to generate a lot of
interest, one has to ensure that the Web application has the right design and infrastructure to
handle the extra load, failing which websites are likely to experience difficulties. For instance,
the highly popular micro-blogging website twitter.com faced stability issues for a long time after
its launch, since it was not designed to handle a large amount of traffic.

The performance of a Web application is determined by multiple factors such as design and
application architecture, quality of code and hardware infrastructure. Performance needs to be
built at every layer of the technology stack to get a solid finished product.

This paper focuses on the Web content caching aspect of website performance.

Purpose

Web caching is not a new idea. It has been in use for quite some time, and current browsers,
caching proxies, and Web servers provide support for it. However, Web caching is often
ignored when designing the technology stack of a Web application.

Web content caching can be implemented by content consumers (end users) to improve their
Internet browsing experience or by content providers to reduce the load on their origin
infrastructure, as well as to give their customers a better Web surfing experience.

Caching at the content consumer’s end is handled by Web browsers such as Internet Explorer and
Firefox. This is done automatically and end users have limited control over how and what
will be cached. Some organizations also install caching proxies to cache incoming Web content
and to apply security policies.

This paper will focus on caching solutions from a content provider’s point of view and the
various ways in which content caching can enhance a website’s performance. The paper
assumes that the reader is technical and familiar with Web standards such as HTTP and HTML.
It is targeted towards technology architects and solution designers.

1
 See http://www.internetworldstats.com/stats.htm (accessed November 2010).

Web Caching Concepts

The concept of caching has been widely used since the early days of computing and
implemented at various layers in a technology stack. For example the processor chip layer has
a hardware cache that is used for storing most frequently accessed instructions. Irrespective of
where a cache is used, its main function is to store the most frequently accessed data
(information or instructions) and its main goal is to improve performance by reducing
read/computation cycle times.

It is common knowledge that application level caching can be extremely beneficial in saving
multiple expensive database reads or expensive repetitive computations thereby improving the
overall application performance.

HTTP caching or Web caching goes one layer above and caches entire static Web resources
(e.g., HTML pages, CSS files etc) either at the client side (browser cache) or at the server side
(origin cache infrastructure).

Let us take a quick look at some of the common terms used with respect to caching in general
and Web caching in particular.

Origin server or origin infrastructure is the server infrastructure where Web servers or
application servers are hosted. These servers are responsible for serving fresh content upon
request.

Time to live (TTL): Cacheable data has a validity period beyond which it is considered stale.
This is referred to as TTL. It is a critical parameter because a very low TTL makes caching
ineffective and a very high TTL results in stale data being served to clients.

Cache hit occurs each time an HTTP request is served from cache.

Cache hit ratio is the percentage of all requests that result in cache hits.

A cache miss occurs when a request cannot be served from cache.
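
The terms above can be made concrete with a small sketch. The class below is not taken from any product; the name TTLCache, the injectable clock, and the fetch_from_origin callback are assumptions made for illustration. It counts hits and misses against a fixed TTL and reports the hit ratio as a percentage.

```python
import time


class TTLCache:
    """Tiny in-memory cache that tracks hits and misses (illustrative only)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so the behavior can be tested
        self.store = {}         # key -> (value, expiry_time)
        self.hits = 0
        self.misses = 0

    def get(self, key, fetch_from_origin):
        """Serve from cache while fresh, otherwise go back to the origin."""
        now = self.clock()
        entry = self.store.get(key)
        if entry is not None and now < entry[1]:
            self.hits += 1      # fresh copy available: cache hit
            return entry[0]
        self.misses += 1        # absent or stale: cache miss
        value = fetch_from_origin(key)
        self.store[key] = (value, now + self.ttl)
        return value

    def hit_ratio(self):
        """Percentage of all requests that were served from cache."""
        total = self.hits + self.misses
        return 100.0 * self.hits / total if total else 0.0
```

With a very low TTL almost every request becomes a miss, and with a very high TTL the cached copy keeps being served long after the origin content has changed, which is the trade-off described for TTL above.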




Controlling Caching Behavior of Your Content

Web browsers and caching proxies depend on the HTML and HTTP headers of the delivered
content for determining if the content can be cached, and if so, for how long it can be cached.
These cache headers can be tuned to define the cache behavior of a Web application/website.

Cache Headers

HTML authors can use tags in the <HEAD> section of the HTML page to dictate the caching
behavior of that page. However, such tags do not have defined standards for caching and
hence not all browsers or caches honor them. For example, using
<META HTTP-EQUIV="Pragma" CONTENT="no-cache"> does not guarantee that the content will never
be cached. Hence it is not advisable to rely on HTML cache tags.

A more reliable approach is to use HTTP headers. HTTP headers are created by the Web
server and sent in response to a request. The headers help the caching layer decide if the
content can be cached, for how long it can be served, and when it needs to be refreshed from
the origin server. Some important HTTP headers that control caching are as follows:

       Expires: Gives the date and time after which response is considered stale. For
       example,

              Expires: Sat, 06 Aug 2011 10:00:00 GMT

       Cache-Control: Provides multiple directives for controlling the caching mechanism. They
       are as follows:

              max-age=[seconds] — specifies the maximum time for which a resource will be
              considered fresh. Like Expires, it sets a freshness lifetime, but it is relative to the
              time of the request rather than an absolute date.

              s-maxage=[seconds] — similar to max-age, except that it only applies to shared
              (e.g., proxy) caches.

              public — marks authenticated responses as cacheable; normally, if HTTP
              authentication is required, responses are automatically private.

              private — allows caches that are specific to one user (e.g., in a browser) to
              store the response; shared caches (e.g., in a proxy) may not.

              no-cache — forces cache to submit each request back to the origin server for
              validation before releasing a cached copy. This is useful for ensuring that
              authentication has been respected (in combination with public) and for
              maintaining freshness without sacrificing all of the benefits of caching.

              no-store — instructs caches not to keep a copy of the representation under any
              conditions.

must-revalidate — tells cache that it must obey any freshness information
              you give about a representation. HTTP allows caches to serve stale
              representations under special conditions; this directive tells caches to follow
              your freshness rules strictly.

                   proxy-revalidate — similar to must-revalidate, except that it only applies to
                   proxy caches.


       Note: One important point to remember here is that not all types of content can be
       cached. For instance, dynamic content generated using server side scripting cannot be
       cached under normal conditions. However, dynamically assembled content that does not
       change frequently can be cached by making those scripts return valid cache headers.
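
To illustrate how a cache might interpret the headers described above, here is a minimal sketch. The function names parse_cache_control and freshness_lifetime are hypothetical, header names are assumed to be title-cased dictionary keys, and the handling is deliberately simplified (for example, no-cache and no-store are both treated as a zero freshness lifetime). It is not a complete implementation of the HTTP caching rules.

```python
from email.utils import parsedate_to_datetime


def parse_cache_control(value):
    """Split a Cache-Control header value into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def freshness_lifetime(headers, shared_cache=False):
    """Return the freshness lifetime in seconds, or None if none is given."""
    cc = parse_cache_control(headers.get("Cache-Control", ""))
    # Simplification: both directives mean "do not serve without the origin".
    if "no-store" in cc or "no-cache" in cc:
        return 0
    # A shared (proxy) cache may not reuse private responses.
    if shared_cache and "private" in cc:
        return 0
    # s-maxage applies only to shared caches and takes precedence there.
    if shared_cache and cc.get("s-maxage") is not None:
        return int(cc["s-maxage"])
    if cc.get("max-age") is not None:
        return int(cc["max-age"])
    # Fall back to the absolute Expires date, measured from the Date header.
    if "Expires" in headers and "Date" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        date = parsedate_to_datetime(headers["Date"])
        return max(0, int((expires - date).total_seconds()))
    return None
```

A script that generates dynamic but slowly changing content could, for example, send "Cache-Control: public, max-age=300" so that browsers and proxies treat the response as fresh for five minutes.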

Content Delivery Networks

Content Delivery Networks (CDN) are established commercial solutions on the market that
provide a Web content caching layer. These networks provide a transparent caching layer
between Web clients and the origin infrastructure, and intercept every request going to the
origin server. Typically CDNs have their cache servers distributed around the world and have
smart algorithms for delivering cached content from the nearest (in terms of network hops)
cache location. CDNs take a major share of the content serving load away from the origin
infrastructure. CDNs are also used for delivering rich multimedia content such as audio and
video files.

Figure 1 illustrates where a CDN fits in the overall workflow.

[Figure 1: Positioning a CDN. Web clients reach the CDN over HTTP across the Web (www), and the
CDN reaches the origin server infrastructure over HTTP.]

Although CDNs deliver huge value, they may not be suitable for small organizations with limited
budgets because commercial CDN services are expensive. A custom CDN is recommended mostly for
organizations that want more control over the caching behavior of their content. In such cases,
a custom CDN not only works out to be cheaper to implement but also gives greater control over
caching.


                                               Page 6 of 12
SQUID Proxy in Server Acceleration Mode

Squid is an open source caching proxy product licensed under the GNU GPL. It is one of the
most widely used, robust and feature-rich open source products available on the market. Squid
is used by websites such as Wikipedia.org that witness very high traffic volumes.

Squid can be installed as a proxy to improve client side Web surfing performance, apply security
and filtering mechanisms, and enforce organizational policies by monitoring outgoing requests.

Squid can also be installed in a reverse proxy mode to improve server side content delivery
performance. This is also known as server accelerator mode. A reverse proxy is set up close to
the origin Web servers to serve incoming requests rather than outgoing requests.

[Figure 2: Squid as Reverse Proxy. Web clients connect over HTTP across the Web (www) to the
Squid reverse proxies, which sit in front of the origin server infrastructure.]

A reverse proxy acts as an intermediary between a Web client and the origin Web server(s). It
receives all content requests and delivers valid content available in cache. If the requested
content is not available in cache, the reverse proxy requests it from the origin server. This
reduces the TCP connection and content rendering load on the origin servers, making them
available for other important tasks.

Some key benefits of the afore-mentioned architecture are as follows.

   1. LOAD BALANCING: If the Web server infrastructure requires expensive server hardware,
      Squid can be installed on a number of inexpensive commodity hardware boxes, thereby
      reducing the number of expensive origin servers.
   2. SECURITY: This can also provide an effective security solution because the origin server
      infrastructure is hidden behind the Squid infrastructure layer. Hence any attack on the
      website is limited to the Squid infrastructure, and any damage is limited to the cached
      content.
   3. PERFORMANCE: A correctly tuned Squid installation can provide significant performance
      gains as the proxy is meant for serving cached content at very high speeds. It uses in-
      memory caching for better performance. Squid also provides various cache replacement
      policies that play a major role in determining the performance of a Squid server.
Squid Cache Replacement Policies

Cache replacement policy determines which objects in the cache can be replaced by other new
objects that are most likely to be served and thereby improve the cache hit ratio. This is an
important choice because it helps in disk and memory usage optimization. For example, the
most popular objects should not be removed from the cache and least accessed cached objects
should be replaced by more popular objects.

There are various replacement policies offered by Squid. Below we provide a brief introduction
to all of them. There is no single recommended or best policy. The right policy is chosen after
studying the content and how it is accessed.

LRU (Least Recently Used)

LRU is a common and effective choice for most cache implementations. It removes objects with
the oldest last-accessed timestamp, i.e. cached objects that have not been accessed for a long
time are the prime candidates for replacement. LRU works well when objects that are most
recently accessed have a greater likelihood of being accessed again in the near future.
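
A minimal sketch of the LRU idea follows. It uses Python's OrderedDict purely for illustration and is not how Squid implements the policy internally; the class name LRUCache and its capacity parameter (a count of objects rather than bytes) are assumptions.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()   # oldest access first, newest last

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            # Drop the entry whose last access is the oldest.
            self.items.popitem(last=False)
```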

LFUDA (Least Frequently Used with Dynamic Aging)

LFU is another commonly used policy that keeps a count of object references and removes the
least frequently used objects.

LFUDA is a variant of LFU that uses a dynamic aging policy to accommodate shifts in the set of
popular objects. In the dynamic aging policy, a cache age factor is added to the reference
count when an object is added to the cache or an existing object is referenced again. This
prevents previously popular documents from polluting the cache.
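
The aging step can be sketched as follows. This is a simplified model under stated assumptions, not Squid's implementation: each object's priority is its reference count plus the cache age at its last reference, the object with the lowest priority is evicted, and the cache age is then set to the evicted object's priority. The class name LFUDACache and the object-count capacity are hypothetical.

```python
class LFUDACache:
    """Simplified LFU with Dynamic Aging (illustrative model only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.age = 0.0       # cache age factor
        self.count = {}      # key -> reference count
        self.priority = {}   # key -> count + cache age at last reference
        self.values = {}

    def _touch(self, key):
        self.count[key] = self.count.get(key, 0) + 1
        self.priority[key] = self.count[key] + self.age

    def get(self, key):
        if key not in self.values:
            return None
        self._touch(key)
        return self.values[key]

    def put(self, key, value):
        if key not in self.values and len(self.values) >= self.capacity:
            victim = min(self.priority, key=self.priority.get)
            # Aging: new arrivals start from the evicted object's priority,
            # so old reference counts stop protecting objects indefinitely.
            self.age = self.priority[victim]
            for table in (self.values, self.count, self.priority):
                del table[victim]
        self.values[key] = value
        self._touch(key)
```

In plain LFU, an object that was referenced many times in the past could stay cached indefinitely. Here, every eviction raises the cache age, so an object that is no longer referenced is eventually overtaken by newer objects and evicted.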

GDSF (Greedy Dual-Size Frequency)

GDSF is an enhancement of GDS (Greedy Dual-Size), which takes into account the size of the
cached object and the cost of retrieving it. GDSF additionally takes into account the frequency
of reference. This policy is optimized for more popular, smaller objects in order to maximize
the object hit rate.
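
One common way to express the GDSF priority is the cache age plus frequency multiplied by cost and divided by size. The sketch below assumes that formulation with a default cost of 1; the function names and the sample object names are hypothetical and only illustrate why small, popular objects tend to survive.

```python
def gdsf_priority(age, frequency, size_bytes, cost=1.0):
    """Priority of a cached object; the lowest priority is evicted first."""
    return age + frequency * cost / size_bytes


def choose_victim(objects, age=0.0):
    """objects maps name -> (frequency, size_bytes); return the name to evict."""
    return min(objects, key=lambda name: gdsf_priority(age, *objects[name]))
```

For two objects referenced equally often, such as a 2 KB image and a 50 MB video, the larger one has the lower priority and is evicted first, which favors the object hit rate over the byte hit rate.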

Squid Deployment Topologies

Multiple Squid servers can be configured to work together to improve cache hit ratios or to
handle additional load. Squid caches, when installed in such a group, share either a sibling
relationship or a parent relationship. A Squid server running as a parent can have multiple
sibling nodes communicating with it, essentially forming a hierarchy. A flat topology may
include Squid servers with only sibling relationships.

If a request results in a cache miss on a sibling node, it is transferred to the parent node. If
the parent also returns a cache miss, the parent contacts the origin server for fresh content.
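
The lookup order described above can be modeled in a few lines. This is a toy model in which each cache is a plain dictionary; the function name resolve and its parameters are assumptions, and the real inter-cache communication between Squid peers is not modeled.

```python
def resolve(url, local, siblings, parent, origin_fetch):
    """Return (body, source) following local -> sibling -> parent -> origin."""
    if url in local:
        return local[url], "local"
    for sibling in siblings:
        if url in sibling:
            local[url] = sibling[url]      # keep a copy for later requests
            return local[url], "sibling"
    if url in parent:
        source = "parent"
    else:
        # Only the parent talks to the origin server on a full miss.
        parent[url] = origin_fetch(url)
        source = "origin"
    local[url] = parent[url]
    return local[url], source
```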
Squid Capacity Planning

Squid's hardware requirements are generally modest. Memory is often the most important
resource. A memory shortage significantly reduces performance. Higher hit ratios are obtained
by caching more objects. Caching more objects requires more disk space. Therefore disk space
is also an important factor that needs to be considered. Fast disks and interfaces are also
beneficial in improving disk access time. SCSI performs better than ATA, and may be chosen if
the higher cost can be justified. While fast CPUs are nice, they are not critical to good
performance.

Squid allocates a small amount of memory for each cached resource (up to 24 bytes per
resource). As a rule of thumb, it requires 32MB of RAM for each GB of disk cache. So a server with
512MB RAM can serve a disk cache of 16GB, and a 300GB disk cache needs approximately 10GB of
RAM (300 x 32MB = 9,600MB).
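
The rule of thumb above is simple arithmetic and can be written down as a small helper. The constant comes from the text; the function names are hypothetical.

```python
MB_RAM_PER_GB_DISK = 32  # rule of thumb quoted in the text


def ram_needed_mb(disk_cache_gb):
    """RAM (in MB) suggested for a disk cache of the given size in GB."""
    return disk_cache_gb * MB_RAM_PER_GB_DISK


def disk_supported_gb(ram_mb):
    """Disk cache size (in GB) that the given amount of RAM can serve."""
    return ram_mb / MB_RAM_PER_GB_DISK
```

For example, ram_needed_mb(300) gives 9600 MB, which the text rounds to approximately 10GB, and disk_supported_gb(512) gives 16 GB.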



Conclusion

   -   Using reverse proxies for Web caching is a non-intrusive way of improving content
       delivery performance.
   -   Reverse proxy based Web caching can be implemented as a cost effective replacement
       for commercial CDNs.
   -   A customized CDN gives better control over the caching infrastructure and helps meet
       the specific performance needs of an enterprise as compared to an expensive
       commercial CDN which may provide limited configuration options.
   -   CDNs can reduce considerable load from the origin servers thus freeing up the origin
       server resources for other tasks.




Appendix A – Case Study

“Squid Implementation for a Leading Global Entertainment Content
Company”

The customer uses Akamai Edge Server Platform for improved content delivery. Edge Server
Platform’s design helps in improving content availability and reducing request response time.
This ideally translates into less Web traffic coming directly to the Web servers (origin servers)
thus improving the overall efficiency of the infrastructure and reducing infrastructure costs.

Ironically though, it was observed that the origin servers were receiving increased Web traffic
from the Akamai Edge servers themselves. A solution had to be put in place to tackle that
problem with minimal impact on existing applications and content.

Problem Context

The Akamai Edge Platform offers a robust design for highly efficient content delivery across the
globe. This is achieved by deploying several thousand servers at data centers all over the world
(edge servers) and then replicating the content to be delivered on appropriate servers. The key
then is to route all content requests from clients to the nearest (in terms of network hops)
available server resulting in minimal response time and higher availability. Here the edge
servers act as caching proxies that request content from the origin server and then serve the
cached copy until its expiry, at which point a fresh copy is again requested from the origin server.
Akamai uses a hierarchical architecture for its edge platform to avoid thousands of edge servers
making multiple refresh requests to the origin server.

The problem is that the ‘innermost’ edge servers still need to make a refresh request to get the
new content from the origin server. This results in the origin server having to serve each of the
requests separately. This was the root cause of the problem.


[Figure 3: High-level Problem Representation. Several servers in the Akamai CDN each request the
same document (Foo.htm) from the origin server infrastructure.]



The customer summarized the problem at hand thus:

   -   High traffic documents such as home pages were being requested from their origin
       servers as many as 70 times within a single TTL interval. This meant that there were that
       many innermost Akamai servers in the hierarchy.
   -   Far too many requests were being received for pages, XML documents, dynamically
       generated JS, CSS etc.

The customer felt that if the above-mentioned problems were addressed, the availability of the
origin servers would rise close to 99.99%.

Solution Approaches Considered by HCL

Below is a brief summary of the approaches evaluated by the HCL team and its assessment of
those approaches.

Approach 1: Custom Solution - Application Server Side

The first approach called for intercepting incoming content refresh requests from the Akamai
servers to the origin servers, queuing and prioritizing them, and then rendering the highest
priority content.

HCL Assessment of Approach 1

   -   The solution was workable but complex, and many race conditions would have to be
       considered before its effectiveness became known.
   -   The robustness and performance of such a solution were not obvious.
   -   The solution mandated changes to the application layer, which could have resulted in a
       cascading effect on the underlying layers.

Approach 2: Using Pre-fetch Settings Provided by Akamai

The second approach called for asynchronous content refresh. When this feature is enabled in
Akamai, the content refresh requests are sent even before the content becomes stale. Akamai
servers continue to serve the existing content even after sending refresh requests, thereby
refreshing content asynchronously.

HCL Assessment of Approach 2

   -   The solution seemed like a perfect fit for the problem at hand, but it would not provide
       a complete solution.
   -   The solution would work well only when content was requested during the threshold set
       by the pre-fetch settings. For example, if pre-fetch was set to 90%, Akamai servers would
       send refresh requests to the origin after 90% of the TTL had elapsed.
   -   The core problem of receiving multiple requests for the same content would remain
       unaddressed.

HCL’s Squid Reverse Proxy-Based Solution

The HCL solution was based on the following design principles:

   1. Minimal or no changes to the application layer
   2. No rework for content producers or brand owners
   3. Once installed, solution should work transparently (without any other layers being aware
      of its existence)
   4. Solution should be repeatable/reusable

Using Squid as a Reverse Proxy

The goal of HCL’s solution was to minimize the number of requests going to the origin servers
while still serving content that was as fresh as possible.

As a first step, the HCL team proposed the installation of Squid in the reverse proxy mode on a
separate infrastructure. This introduced an additional caching layer between Akamai servers
and the origin servers. Upon setup, it cached all the relevant content and served it whenever
requested by Akamai. The team used advanced cache control settings provided by Squid (v 2.7)
to control the number of redundant requests for a single resource and to also support
asynchronous refresh.
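
The general idea of limiting redundant requests for a single resource can be illustrated with a small sketch. This is not the Squid configuration used by the team, which the paper does not list, and it is not how Squid is implemented. It only shows, under stated assumptions, how concurrent misses for the same URL can share one origin fetch. The class name CoalescingCache and the fetch_from_origin callback are hypothetical, and TTL handling and asynchronous refresh are left out for brevity.

```python
import threading
from concurrent.futures import Future


class CoalescingCache:
    """Cache in which concurrent misses for one URL share a single fetch."""

    def __init__(self, fetch_from_origin):
        self.fetch = fetch_from_origin
        self.lock = threading.Lock()
        self.cached = {}      # url -> body
        self.in_flight = {}   # url -> Future shared by waiting requesters

    def get(self, url):
        with self.lock:
            if url in self.cached:
                return self.cached[url]
            future = self.in_flight.get(url)
            leader = future is None
            if leader:
                # The first requester becomes responsible for the fetch.
                future = Future()
                self.in_flight[url] = future
        if leader:
            try:
                body = self.fetch(url)
            except Exception as exc:
                with self.lock:
                    del self.in_flight[url]
                future.set_exception(exc)
            else:
                with self.lock:
                    self.cached[url] = body
                    del self.in_flight[url]
                future.set_result(body)
        # Followers block here until the leader has finished.
        return future.result()
```

In this model, 70 simultaneous requests for the same page reach the origin only once, because every requester other than the first waits for the fetch that is already in flight.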

Goals Achieved

The solution proposed by the HCL team passed the rigorous performance checks with over
90% load reduction.




                                        Page 12 of 12

Contenu connexe

En vedette

Energy & Utilities Case Study: Truck to boardroom integration for a leading e...
Energy & Utilities Case Study: Truck to boardroom integration for a leading e...Energy & Utilities Case Study: Truck to boardroom integration for a leading e...
Energy & Utilities Case Study: Truck to boardroom integration for a leading e...HCL Technologies
 
HCLT Whitepaper: Responsive Efficient Customer- focused Creating Value Across...
HCLT Whitepaper: Responsive Efficient Customer- focused Creating Value Across...HCLT Whitepaper: Responsive Efficient Customer- focused Creating Value Across...
HCLT Whitepaper: Responsive Efficient Customer- focused Creating Value Across...HCL Technologies
 
Wise Men Sap Analytics Brochure
Wise Men Sap Analytics BrochureWise Men Sap Analytics Brochure
Wise Men Sap Analytics BrochureWise Men
 
Capgemini SAP Cloud People - Global capabilities and offerings
Capgemini SAP Cloud People - Global capabilities and offeringsCapgemini SAP Cloud People - Global capabilities and offerings
Capgemini SAP Cloud People - Global capabilities and offeringsCapgemini
 
Zensar SAP Practice
Zensar SAP PracticeZensar SAP Practice
Zensar SAP PracticeNiraj Singh
 
Paisajismo nº 31 _ Natureza Em Risco
Paisajismo nº 31 _ Natureza Em RiscoPaisajismo nº 31 _ Natureza Em Risco
Paisajismo nº 31 _ Natureza Em RiscoLara Plácido
 

En vedette (8)

Energy & Utilities Case Study: Truck to boardroom integration for a leading e...
Energy & Utilities Case Study: Truck to boardroom integration for a leading e...Energy & Utilities Case Study: Truck to boardroom integration for a leading e...
Energy & Utilities Case Study: Truck to boardroom integration for a leading e...
 
HCLT Whitepaper: Responsive Efficient Customer- focused Creating Value Across...
HCLT Whitepaper: Responsive Efficient Customer- focused Creating Value Across...HCLT Whitepaper: Responsive Efficient Customer- focused Creating Value Across...
HCLT Whitepaper: Responsive Efficient Customer- focused Creating Value Across...
 
Wise Men Sap Analytics Brochure
Wise Men Sap Analytics BrochureWise Men Sap Analytics Brochure
Wise Men Sap Analytics Brochure
 
Capgemini SAP Cloud People - Global capabilities and offerings
Capgemini SAP Cloud People - Global capabilities and offeringsCapgemini SAP Cloud People - Global capabilities and offerings
Capgemini SAP Cloud People - Global capabilities and offerings
 
Zensar SAP Practice
Zensar SAP PracticeZensar SAP Practice
Zensar SAP Practice
 
20090913 Announcements
20090913 Announcements20090913 Announcements
20090913 Announcements
 
Music Camp
Music Camp Music Camp
Music Camp
 
Paisajismo nº 31 _ Natureza Em Risco
Paisajismo nº 31 _ Natureza Em RiscoPaisajismo nº 31 _ Natureza Em Risco
Paisajismo nº 31 _ Natureza Em Risco
 

Similaire à HCLT Whitepaper: Accelerated Web Content Delivery

The Most Frequently Used Caching Headers
The Most Frequently Used Caching HeadersThe Most Frequently Used Caching Headers
The Most Frequently Used Caching HeadersHTS Hosting
 
Web Site Optimization
Web Site OptimizationWeb Site Optimization
Web Site OptimizationSunil Patil
 
Web site optimization
Web site optimizationWeb site optimization
Web site optimizationSunil Patil
 
Basic Caching Terminology
Basic Caching TerminologyBasic Caching Terminology
Basic Caching TerminologyHTS Hosting
 
IWMW 2003: C7 Bandwidth Management Techniques: Technical And Policy Issues
IWMW 2003: C7 Bandwidth Management Techniques: Technical And Policy IssuesIWMW 2003: C7 Bandwidth Management Techniques: Technical And Policy Issues
IWMW 2003: C7 Bandwidth Management Techniques: Technical And Policy IssuesIWMW
 
Restful web-services
Restful web-servicesRestful web-services
Restful web-servicesrporwal
 
Introduction to the World Wide Web
Introduction to the World Wide WebIntroduction to the World Wide Web
Introduction to the World Wide WebAbdalla Mahmoud
 
International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)International Journal of Engineering Research and Development (IJERD)
International Journal of Engineering Research and Development (IJERD)IJERD Editor
 
A COMPARISON OF CACHE REPLACEMENT ALGORITHMS FOR VIDEO SERVICES
A COMPARISON OF CACHE REPLACEMENT ALGORITHMS FOR VIDEO SERVICESA COMPARISON OF CACHE REPLACEMENT ALGORITHMS FOR VIDEO SERVICES
A COMPARISON OF CACHE REPLACEMENT ALGORITHMS FOR VIDEO SERVICESijcsit
 
Caching in Drupal 8
Caching in Drupal 8Caching in Drupal 8
Caching in Drupal 8valuebound
 
Cdn technology overview
Cdn technology overviewCdn technology overview
Cdn technology overviewYoohyun Kim
 
Implementing a Caching Scheme for Media Streaming in a Proxy Server
Implementing a Caching Scheme for Media Streaming in a Proxy ServerImplementing a Caching Scheme for Media Streaming in a Proxy Server
Implementing a Caching Scheme for Media Streaming in a Proxy ServerAbdelrahman Hosny
 
Secure Distributed Deduplication Systems with Improved Reliability
Secure Distributed Deduplication Systems with Improved ReliabilitySecure Distributed Deduplication Systems with Improved Reliability
Secure Distributed Deduplication Systems with Improved Reliability1crore projects
 
Secure distributed deduplication systems
Secure distributed deduplication systemsSecure distributed deduplication systems
Secure distributed deduplication systemsPvrtechnologies Nellore
 
6 Week / Month Industrial Training in Hoshiarpur Punjab- PHP Project Report
6 Week / Month Industrial Training in Hoshiarpur Punjab- PHP Project Report 6 Week / Month Industrial Training in Hoshiarpur Punjab- PHP Project Report
6 Week / Month Industrial Training in Hoshiarpur Punjab- PHP Project Report c-tac
 
How to Measure Your CDN’s Cache Hit Ratio and Increase Cache Hits
How to Measure Your CDN’s Cache Hit Ratio and Increase Cache HitsHow to Measure Your CDN’s Cache Hit Ratio and Increase Cache Hits
How to Measure Your CDN’s Cache Hit Ratio and Increase Cache HitsMedianova
 

HCLT Whitepaper: Accelerated Web Content Delivery

  • 1. Accelerated Web Content Delivery Sanjeet Joshi Architecture Technology Services HCL Technologies Ltd.
  • 2. Accelerated Web Content Delivery © 2010, HCL Technologies Ltd. November 2010 The author would like to thank Dr. Usha Thakur of ATS for her valuable help in content formatting and content enhancement. NON-DISCLOSURE OBLIGATIONS AND DISCLAIMER The data, information or material provided herein is confidential and proprietary to HCL and shall not be disclosed, duplicated or used in whole or in part for any purpose other than as approved by an authorized official of HCL in writing. The recipient agrees to maintain complete confidentiality of the information; data received and shall take all reasonable precautions/steps in maintaining confidentiality of the same, however in any event not less than the precautions/steps taken for its own confidential material. If you are not the intended recipient of this information, you are not authorized to read, forward, print, retain, copy or disseminate this document or any part of it. Any statements in this presentation that are not historical facts may include forward-looking statements that involve risks and uncertainties; actual results may differ from the forward-looking statements.
  • 3. Background
    In 1995, when the Internet was still in its infancy, it had about 16 million users worldwide, compared with about 2 billion today [1]. Over the last fifteen years the number of Internet users has grown more than a hundredfold, and we have also witnessed an evolution of technology standards, protocols, and information consumption patterns. The Internet is no longer limited to desktop and laptop computers. An increasing number of people on the go use handheld devices to access their preferred websites. Easy access to websites has resulted in a significant increase in Web traffic.
    Anyone designing a Web application or website that is expected to generate a lot of interest has to ensure that it has the right design and infrastructure to handle the extra load; otherwise the site is likely to experience difficulties. For instance, the highly popular micro-blogging website twitter.com faced stability issues for a long time after its launch because it was not designed to handle a large amount of traffic.
    The performance of a Web application is determined by multiple factors, such as design and application architecture, quality of code, and hardware infrastructure. Performance needs to be built into every layer of the technology stack to get a solid finished product. This paper focuses on the Web content caching aspect of website performance.
    Purpose
    Web caching is not a new idea. It has been in use for quite some time, and current browsers, caching proxies, and Web servers support it. However, Web caching is often overlooked when designing the technology stack of a Web application.
    Web content caching can be implemented by content consumers (end users) to improve their Internet browsing experience, or by content providers to reduce the load on their origin infrastructure and to give their customers a better Web surfing experience.
    Caching at the content consumer's end is handled by Web browsers such as Internet Explorer and Firefox. This is done automatically, and end users have limited control over how and what is cached. Some organizations also install caching proxies to cache incoming Web content and to apply security policies.
    This paper focuses on caching solutions from a content provider's point of view and on the various ways in which content caching can enhance a website's performance. The paper is technical in nature and assumes that the reader is familiar with Web standards such as HTTP and HTML. It is targeted at technology architects and solution designers.
    [1] See http://www.internetworldstats.com/stats.htm (accessed November 2010).
  • 4. Web Caching Concepts
    Caching has been widely used since the early days of computing and is implemented at various layers of a technology stack. For example, the processor chip has a hardware cache that stores the most frequently accessed instructions. Irrespective of where a cache is used, its main function is to store the most frequently accessed data (information or instructions), and its main goal is to improve performance by reducing read or computation cycle times.
    Application-level caching can be extremely beneficial in saving multiple expensive database reads or expensive repetitive computations, thereby improving overall application performance. HTTP caching, or Web caching, goes one layer above and caches entire static Web resources (e.g., HTML pages and CSS files) either at the client side (browser cache) or at the server side (origin cache infrastructure).
    Let us take a quick look at some of the common terms used with respect to caching in general and Web caching in particular.
    Origin server or origin infrastructure: the server infrastructure where Web servers or application servers are hosted. These servers are responsible for serving fresh content upon request.
    Time to live (TTL): the validity period of cacheable data, beyond which the data is considered stale. It is a critical parameter because a very low TTL makes caching ineffective and a very high TTL results in stale data being served to clients.
    Cache hit: occurs each time an HTTP request is served from the cache.
    Cache hit ratio: the percentage of all requests that result in cache hits.
    Cache miss: occurs when a request cannot be served from the cache.
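The terms defined above can be tied together in a small sketch. It is only an illustration under stated assumptions: the class name `TTLCache`, the injectable `clock` parameter, and the `fetch_from_origin` callback are hypothetical names introduced here, not part of any product discussed in this paper.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries go stale after ttl_seconds.

    Hypothetical helper for illustration only.
    """

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock      # injectable so the behavior can be tested
        self.entries = {}       # key -> (value, time the value was stored)
        self.hits = 0
        self.misses = 0

    def get(self, key, fetch_from_origin):
        """Return a fresh cached value, or fetch it from the origin on a miss."""
        now = self.clock()
        entry = self.entries.get(key)
        if entry is not None and now - entry[1] < self.ttl_seconds:
            self.hits += 1      # cache hit: served without touching the origin
            return entry[0]
        # Entry is missing or older than the TTL: this is a cache miss.
        self.misses += 1
        value = fetch_from_origin(key)
        self.entries[key] = (value, now)
        return value

    def hit_ratio(self):
        """Cache hits as a percentage of all requests seen so far."""
        total = self.hits + self.misses
        return 100.0 * self.hits / total if total else 0.0


if __name__ == "__main__":
    fake_now = [0.0]
    cache = TTLCache(ttl_seconds=60, clock=lambda: fake_now[0])
    origin = lambda path: "content of " + path

    cache.get("/index.html", origin)   # miss: nothing cached yet
    cache.get("/index.html", origin)   # hit: entry is still fresh
    fake_now[0] = 61.0                 # move past the 60-second TTL
    cache.get("/index.html", origin)   # miss: entry has gone stale
    print(cache.hits, cache.misses)    # one hit and two misses
```

The sketch also shows why the TTL matters: with a TTL close to zero almost every request becomes a miss, while a very long TTL keeps serving the stored copy even after the origin content has changed.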
  • 5. Controlling Caching Behavior of Your Content
    Web browsers and caching proxies rely on the HTML and HTTP headers of the delivered content to determine whether the content can be cached and, if so, for how long. These cache headers can be tuned to define the caching behavior of a Web application or website.
    Cache Headers
    HTML authors can use tags in the <HEAD> section of an HTML page to dictate the caching behavior of that page. However, these tags are not backed by a defined standard, so not all browsers or caches honor them. For example, using <META HTTP-EQUIV="Pragma" CONTENT="no-cache"> does not guarantee that the content will never be cached. Hence it is not advisable to rely on HTML cache headers.
    A more reliable approach is to use HTTP headers. HTTP headers are created by the Web server and sent in response to a request. The headers help the caching layer decide whether the content can be cached, for how long it can be served, and when it needs to be refreshed from the origin server. Some important HTTP headers that control caching are as follows:
    Expires: gives the date and time after which the response is considered stale. For example, Expires: Sat, 06 Aug 2011 10:00:00 GMT.
    Cache-Control: provides multiple directives for controlling the caching mechanism. They are as follows:
      max-age=[seconds]: specifies the maximum time for which a resource will be considered fresh. Like Expires, it sets a freshness lifetime, but it is relative to the time of the request rather than an absolute date.
      s-maxage=[seconds]: similar to max-age, except that it only applies to shared (e.g., proxy) caches.
      public: marks authenticated responses as cacheable; normally, if HTTP authentication is required, responses are automatically private.
      private: allows caches that are specific to one user (e.g., in a browser) to store the response; shared caches (e.g., in a proxy) may not.
      no-cache: forces caches to submit each request back to the origin server for validation before releasing a cached copy. This is useful for ensuring that authentication has been respected (in combination with public) and for maintaining freshness without sacrificing all of the benefits of caching.
      no-store: instructs caches not to keep a copy of the representation under any conditions.
      must-revalidate: tells caches that they must obey any freshness information the content provider gives about a representation. HTTP allows caches to serve stale representations under special conditions; this directive tells caches to follow the freshness rules strictly.
      proxy-revalidate: similar to must-revalidate, except that it only applies to proxy caches.
  • 6. Note: Not all types of content can be cached. For instance, dynamic content generated using server-side scripting cannot be cached under normal conditions. However, dynamically assembled content that does not change frequently can be cached by making those scripts return valid cache headers.
    Content Delivery Networks
    Content Delivery Networks (CDNs) are established commercial solutions that provide a Web content caching layer. These networks provide a transparent caching layer between Web clients and the origin infrastructure, and intercept every request going to the origin server. Typically, CDNs have their cache servers distributed around the world and use smart algorithms to deliver cached content from the nearest (in terms of network hops) cache location. CDNs take a major share of the content-serving load away from the origin infrastructure. CDNs are also used for delivering rich multimedia content such as audio and video files. Figure 1 illustrates where a CDN fits in the overall workflow.
    [Figure 1: Positioning a CDN. Web clients send HTTP requests over the Web to the CDN, which sits in front of the origin server infrastructure.]
    Although CDNs deliver huge value, they may not be suitable for small organizations with limited budgets because they are expensive to hire. Custom CDNs are recommended mostly for organizations that want more control over the caching behavior of their content. In such cases, a custom CDN not only works out to be cheaper to implement but also gives immense control over caching.
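To make the earlier point about scripts returning valid cache headers concrete, the following sketch builds Expires and Cache-Control values for a response. The function name `cache_headers` and its parameters are hypothetical and chosen only for this illustration; the header names and directives are the ones described in the Cache Headers section.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def cache_headers(max_age, shared_max_age=None, public=False,
                  must_revalidate=False, now=None):
    """Build caching headers for a response (hypothetical helper).

    max_age and shared_max_age are freshness lifetimes in seconds.
    """
    if now is None:
        now = datetime.now(timezone.utc)

    directives = ["public" if public else "private", f"max-age={max_age}"]
    if shared_max_age is not None:
        # s-maxage only affects shared caches such as proxies.
        directives.append(f"s-maxage={shared_max_age}")
    if must_revalidate:
        directives.append("must-revalidate")

    return {
        "Cache-Control": ", ".join(directives),
        # Expires is an absolute HTTP date; max-age is relative.
        "Expires": format_datetime(now + timedelta(seconds=max_age), usegmt=True),
    }


if __name__ == "__main__":
    # A dynamically assembled page that changes rarely: browsers may keep it
    # for an hour, shared proxies for ten minutes.
    for name, value in cache_headers(3600, shared_max_age=600, public=True).items():
        print(f"{name}: {value}")
```

A server-side script would attach these values to its response in whatever way its framework provides; content that must never be stored would instead send `Cache-Control: no-store`.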
  • 7. SQUID Proxy in Server Acceleration Mode

Squid is an open source caching proxy product licensed under the GNU GPL. It is one of the most widely used, robust and feature-rich open source products available. Squid is used by websites such as Wikipedia.org that experience very high traffic volumes. Squid can be installed as a proxy to improve client-side Web surfing performance, to apply security and filtering mechanisms, and to enforce organizational policies by monitoring outgoing requests. Squid can also be installed in reverse proxy mode to improve server-side content delivery performance. This is also known as server accelerator mode. A reverse proxy is set up close to the origin Web servers to serve incoming requests rather than outgoing requests.

Figure 2: Squid as Reverse Proxy (Web clients reach the origin server infrastructure through Squid reverse proxies over HTTP)

A reverse proxy acts as an intermediary between a Web client and the origin Web server(s). It receives all content requests and delivers valid content available in its cache. If the requested content is not available, the reverse proxy requests it from the origin server. This reduces the TCP connection and content rendering load on the origin servers, making them available for other important tasks. Some key benefits of this architecture are as follows.

1. LOAD BALANCING: If the Web server infrastructure requires expensive server hardware, Squid can be installed on a number of inexpensive commodity hardware boxes, thereby reducing the number of expensive origin servers.
2. SECURITY: This can also provide an effective security solution because the origin server infrastructure is hidden behind the Squid infrastructure layer. Hence any attack on the website is limited to the Squid infrastructure, and any damage is limited to the cached content.
3. PERFORMANCE: A correctly tuned Squid installation can provide significant performance gains because the proxy is designed to serve cached content at very high speeds. It uses in-memory caching for better performance. Squid also provides various cache replacement policies that play a major role in determining the performance of a Squid server.
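The hit/miss behavior of a reverse proxy described above can be sketched in a few lines. This is a simplified illustration, not Squid's implementation: the class name `ReverseProxyCache`, the fixed per-object TTL and the injectable clock are all assumptions made for the example.

```python
import time


class ReverseProxyCache:
    """Toy reverse proxy: serve from cache while fresh, else ask the origin."""

    def __init__(self, fetch_from_origin, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch_from_origin   # callable: url -> body
        self._ttl = ttl_seconds
        self._clock = clock               # injectable for testing
        self._store = {}                  # url -> (expires_at, body)
        self.origin_requests = 0

    def get(self, url):
        now = self._clock()
        entry = self._store.get(url)
        if entry is not None and entry[0] > now:
            return entry[1]               # cache hit: origin is not contacted
        # Cache miss or expired entry: fetch from the origin and store it.
        self.origin_requests += 1
        body = self._fetch(url)
        self._store[url] = (now + self._ttl, body)
        return body
```

Repeated requests for the same URL within the TTL reach the origin only once, which is the source of the load reduction discussed above.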
  • 8. Squid Cache Replacement Policies

A cache replacement policy determines which cached objects are evicted to make room for new objects that are more likely to be requested, thereby improving the cache hit ratio. This is an important choice because it helps optimize disk and memory usage. For example, the most popular objects should not be removed from the cache, and the least accessed objects should be replaced by more popular ones. Squid offers several replacement policies, each introduced briefly below. There is no single recommended or best policy; the right policy is chosen after studying the content and how it is accessed.

LRU (Least Recently Used)
LRU is a common and effective choice for most cache implementations. It removes the objects with the oldest last-accessed timestamp, i.e. cached objects that have not been accessed for a long time are the prime candidates for replacement. LRU works well when the most recently accessed objects have a greater likelihood of being accessed again in the near future.

LFUDA (Least Frequently Used with Dynamic Aging)
LFU is another commonly used policy that keeps a count of object references and removes the least used objects. LFUDA is a variant of LFU that uses a dynamic aging policy to accommodate shifts in the set of popular objects. In the dynamic aging policy, the cache age factor is added to the reference count when an object is added to the cache or an existing object is referenced again. This prevents previously popular documents from polluting the cache.

GDSF (Greedy Dual-Size Frequency)
GDSF is an enhancement of GDS (Greedy Dual-Size), which takes into account the size of a cached object and the cost of retrieving it. GDSF additionally takes into account the frequency of reference. This policy is optimized for more popular, smaller objects in order to maximize the object hit rate.
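As an illustration of the simplest of these policies, the following is a minimal LRU sketch. It counts capacity in objects rather than bytes and is not how Squid implements LRU internally; the class name `LRUCache` is chosen for the example.

```python
from collections import OrderedDict


class LRUCache:
    """Evicts the entry that was accessed longest ago once capacity is exceeded."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()   # ordered from oldest to newest access

    def get(self, key):
        if key not in self._items:
            return None               # cache miss
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used
```

With a capacity of two, storing `a` and `b`, reading `a`, and then storing `c` evicts `b`, because `b` is the entry that has gone longest without being accessed.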

Squid Deployment Topologies

Multiple Squid servers can be configured to work together to improve cache hit ratios or to handle additional load. Squid caches installed in such a group share either a sibling relationship or a parent relationship. A Squid server running as a parent can have multiple sibling nodes communicating with it, essentially forming a hierarchy. A flat topology may include Squid servers with only sibling relationships. If a request results in a cache miss on a sibling node, it is passed to the parent node. If the parent also returns a cache miss, the parent contacts the origin server for fresh content.
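The miss path through a hierarchy (node, then parent, then origin) can be sketched as follows. This is a simplified model under stated assumptions: it ignores sibling-to-sibling queries, freshness and eviction, and the class name `CacheNode` is hypothetical.

```python
class CacheNode:
    """Toy cache node that forwards misses to its parent, or to the origin."""

    def __init__(self, name, parent=None, origin=None):
        self.name = name
        self.parent = parent    # another CacheNode, or None for the top node
        self.origin = origin    # callable url -> body, used by the top node
        self.store = {}

    def get(self, url):
        """Return (body, where_it_was_found)."""
        if url in self.store:
            return self.store[url], self.name
        if self.parent is not None:
            body, source = self.parent.get(url)          # miss: ask the parent
        else:
            body, source = self.origin(url), "origin"    # parent miss: ask origin
        self.store[url] = body   # keep a local copy for later requests
        return body, source
```

In this model, the first request for a URL through any node reaches the origin; a later request through a different node under the same parent is answered by the parent's copy.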
  • 9. Squid Capacity Planning

Squid's hardware requirements are generally modest. Memory is often the most important resource, and a memory shortage significantly reduces performance. Higher hit ratios are obtained by caching more objects, and caching more objects requires more disk space, so disk space is also an important factor. Fast disks and interfaces help reduce disk access time. SCSI performs better than ATA, and may be chosen if the higher cost can be justified. While fast CPUs are nice, they are not critical to good performance.

Squid allocates a small amount of memory for each cached resource (up to 24 bytes per resource). As a rule of thumb, it requires 32 MB of RAM for each GB of disk cache. So a server with 512 MB of RAM can serve a disk cache of 16 GB, and a 300 GB disk cache needs approximately 10 GB of RAM.

Conclusion

- Using reverse proxies for Web caching is a non-intrusive way of improving content delivery performance.
- Reverse proxy based Web caching can be implemented as a cost-effective replacement for commercial CDNs.
- A customized CDN gives better control over the caching infrastructure and helps meet the specific performance needs of an enterprise, as compared to an expensive commercial CDN which may provide limited configuration options.
- CDNs can remove considerable load from the origin servers, thus freeing up origin server resources for other tasks.
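As a worked check of the capacity planning rule of thumb (32 MB of RAM per GB of disk cache), the arithmetic can be written out as below. The function names are hypothetical, and the figure is the paper's rule of thumb rather than a measured value.

```python
MB_RAM_PER_GB_DISK = 32  # rule of thumb quoted in the capacity planning text


def ram_needed_mb(disk_cache_gb):
    """RAM (in MB) suggested for a disk cache of the given size in GB."""
    return disk_cache_gb * MB_RAM_PER_GB_DISK


def disk_cache_supported_gb(ram_mb):
    """Disk cache size (in GB) that the given RAM (in MB) can support."""
    return ram_mb / MB_RAM_PER_GB_DISK
```

For a 300 GB disk cache this gives 9,600 MB, which the text rounds to approximately 10 GB; 512 MB of RAM corresponds to a 16 GB disk cache.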
  • 10. Appendix A – Case Study “Squid Implementation for a Leading Global Entertainment Content Company”

The customer uses the Akamai Edge Server Platform for improved content delivery. The Edge Server Platform's design helps improve content availability and reduce request response time. Ideally, this translates into less Web traffic coming directly to the Web servers (origin servers), thus improving the overall efficiency of the infrastructure and reducing infrastructure costs. Ironically, though, it was observed that the origin servers were receiving increased Web traffic from the Akamai edge servers themselves. A solution had to be put in place to tackle that problem with minimal impact on existing applications and content.

Problem Context

The Akamai Edge Platform offers a robust design for highly efficient content delivery across the globe. This is achieved by deploying several thousand servers (edge servers) at data centers all over the world and then replicating the content to be delivered on appropriate servers. The key then is to route all content requests from clients to the nearest (in terms of network hops) available server, resulting in minimal response time and higher availability. Here the edge servers act as caching proxies that request content from the origin server and then serve the cached copy until it expires, at which point a fresh copy is again requested from the origin server. Akamai uses a hierarchical architecture for its edge platform to avoid thousands of edge servers making multiple refresh requests to the origin server. The problem is that the ‘innermost’ edge servers still need to make a refresh request to get the new content from the origin server. As a result, the origin server has to serve each of these requests separately. This was the root cause of the problem.

Figure 3 – High-level Problem Representation (several Akamai CDN servers each requesting Foo.htm from the origin server infrastructure)
  • 11. The customer summarized the problem at hand thus:
- High-traffic documents such as home pages were being requested from their origin servers as many as 70 times within a single TTL interval. This meant that there were that many innermost Akamai servers in the hierarchy.
- Far too many requests were being received for pages, XML documents, dynamically generated JS, CSS etc.

The customer felt that if the above-mentioned problems were addressed, the availability of the origin servers would rise close to 99.99%.

Solution Approaches Considered by HCL

Below is a brief summary of the approaches evaluated by the HCL team and its assessment of those approaches.

Approach 1: Custom Solution - Application Server Side
The first approach called for intercepting incoming content refresh requests from the Akamai servers to the origin servers, queuing and prioritizing them, and then rendering the highest priority content.

HCL Assessment of Approach 1
- The solution was workable but complex, and many race conditions would have to be considered before its effectiveness became known.
- The robustness and performance of such a solution were not obvious.
- The solution mandated changes to the application layer, which could have had a cascading effect on the underlying layers.

Approach 2: Using Pre-fetch Settings Provided by Akamai
The second approach called for asynchronous content refresh. When this feature is enabled in Akamai, content refresh requests are sent even before the content becomes stale. Akamai servers continue to serve the existing content even after sending refresh requests, thereby refreshing content asynchronously.

HCL Assessment of Approach 2
- The solution seemed a perfect fit for the problem at hand, but it would not provide a complete solution.
- The solution would work well only when content was requested within the threshold window set by the pre-fetch settings.
For example, if pre-fetch was set to 90%, Akamai servers would send refresh requests to the origin once 90% of the TTL had elapsed.
- The core problem of receiving multiple requests for the same content would remain unaddressed.
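The pre-fetch threshold in Approach 2 can be made concrete with a small calculation. This is only a sketch of the behavior as described in the assessment above, not Akamai's actual implementation; the function names and the percentage parameter are hypothetical.

```python
def prefetch_start(ttl_seconds, prefetch_percent):
    """Age (in seconds) at which a refresh may be sent ahead of expiry."""
    return ttl_seconds * prefetch_percent / 100


def request_triggers_prefetch(age_seconds, ttl_seconds, prefetch_percent):
    """True if a request arriving at this object age falls in the pre-fetch window."""
    return prefetch_start(ttl_seconds, prefetch_percent) <= age_seconds < ttl_seconds
```

With a 600-second TTL and pre-fetch at 90%, only requests arriving between 540 and 600 seconds after the last refresh would trigger an asynchronous refresh, which is why the approach helps only when content happens to be requested inside that window.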
  • 12. HCL’s Squid Reverse Proxy-Based Solution

The HCL solution was based on the following design principles:
1. Minimal or no changes to the application layer
2. No rework for content producers or brand owners
3. Once installed, the solution should work transparently (without any other layers being aware of its existence)
4. The solution should be repeatable/reusable

Using Squid as a Reverse Proxy

The goal of HCL’s solution was to minimize the number of requests going to the origin servers while still serving content that was as fresh as possible. As a first step, the HCL team proposed installing Squid in reverse proxy mode on a separate infrastructure. This introduced an additional caching layer between the Akamai servers and the origin servers. Once set up, it cached all the relevant content and served it whenever requested by Akamai. The team used advanced cache control settings provided by Squid (v 2.7) to limit the number of redundant requests for a single resource and also to support asynchronous refresh.

Goals Achieved

The solution proposed by the HCL team passed the rigorous performance checks with over 90% load reduction.
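The general idea of collapsing redundant requests for a single resource, which the case study relies on, can be sketched as below. This illustrates the technique only; it is not Squid's implementation and does not reproduce the settings the HCL team used, which the paper does not name. The class name `CoalescingCache` is hypothetical, and freshness (TTL) handling is omitted for brevity.

```python
import threading


class CoalescingCache:
    """Many concurrent requests for one URL result in one origin fetch."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin   # callable: url -> body
        self._lock = threading.Lock()
        self._cache = {}                  # url -> body
        self._in_flight = {}              # url -> Event set when the fetch ends
        self.origin_requests = 0

    def get(self, url):
        with self._lock:
            if url in self._cache:
                return self._cache[url]
            event = self._in_flight.get(url)
            leader = event is None
            if leader:
                # First requester for this URL performs the origin fetch.
                event = threading.Event()
                self._in_flight[url] = event
                self.origin_requests += 1
        if leader:
            try:
                body = self._fetch(url)
                with self._lock:
                    self._cache[url] = body
            finally:
                with self._lock:
                    del self._in_flight[url]
                event.set()               # wake everyone waiting on this URL
            return body
        # Followers wait for the leader, then read the cached copy.
        # If the leader's fetch failed, this retries as a new leader.
        event.wait()
        return self.get(url)
```

If 70 requesters ask for the same page at once, as in the problem summary, only the first one reaches the origin and the remaining 69 are served the copy it brings back.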