This document describes how caching was implemented to improve the performance of a complex travel booking engine built on Joomla. It took an average of 45-65 seconds to display search results initially due to multiple remote XML requests. Caching was applied at various levels including pages, XML queries, images, and descriptions, which sped up performance by 10 times, reducing average load times to 5-6 seconds. The document also outlines major changes to Joomla's caching framework in version 1.6 that improved its functionality and APIs.
3. HOW IT ALL CAME ABOUT
// How I got involved with cache //
4. Quite an amazing story
of willpower
and of turning a failure into a success
with contributions to the common good
as a final consequence
5. In 2007 I was approached by a client to build a travel
portal.
No clearly defined business strategy, exact site scope was
unknown
Joint venture starting with a feasibility study
6. NOT AT ALL AN EASY TASK!
especially if you look unimportant in the eyes of big players
10. Vacations, Hotels, Travel and Airtickets
[Diagram: system architecture. Labels: website search box; booking request form; booking management system HTML integration; already made airticket booking engine; CORE (routing searches, combining results, rendering results, temporary booking, routing booking requests); XML search, XML results, static XML, XML over XML; German suppliers; other SLO suppliers; Traffics, Tibet, Cosmo, Palma; ADMINISTRATION (own inventory management, manual booking, bookings management).]
16. Probably one of the most complex and technologically advanced
sites ever built on Joomla (it is simple..but in a Google way)
17. Combined power of the Joomla PHP framework
and the ExtJS JavaScript framework, used on the presentation layer
18. Results from own database merged with results from multiple
different on-demand XML data sources
[Diagram: website search box; CORE (routing searches, combining results, rendering results, temporary booking, routing booking requests); XML search and XML results from German suppliers (Traffics, Tibet); booking requests over XML; own database; separate sources: descriptions & images 1, descriptions & images 2.]
19. Initially it took on average 45-65 seconds to perform a search &
render results.
Site badly needed a speed boost.
29. Typical web site is displaying the same content over and
over again
No cache:
everything has to be generated for each and every page
view
Caching:
• store some or all of the information your code generates
in a cache object
• serve it when the next user requests the same page or
particular piece of information.
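The store/serve cycle above can be sketched in a few lines of plain PHP. This is only an illustration: `SimpleCache`, `renderPage()` and `servePage()` are hypothetical names, not Joomla APIs, and the in-memory array stands in for whatever persistent storage a real site would use.

```php
<?php
// Hypothetical in-memory cache object (not a Joomla class).
class SimpleCache
{
    private $items = array();

    // Returns the cached data, or false when nothing is stored under $id.
    public function get($id)
    {
        return array_key_exists($id, $this->items) ? $this->items[$id] : false;
    }

    public function store($id, $data)
    {
        $this->items[$id] = $data;
    }
}

// Stand-in for expensive page generation; counts how often it really runs.
function renderPage($url, &$renderCount)
{
    $renderCount++;
    return '<html>Content for ' . $url . '</html>';
}

// Serve from the cache when possible, otherwise generate and store.
function servePage(SimpleCache $cache, $url, &$renderCount)
{
    $page = $cache->get($url);
    if ($page === false) {
        $page = renderPage($url, $renderCount);
        $cache->store($url, $page);
    }
    return $page;
}

$cache = new SimpleCache();
$renderCount = 0;
$first = servePage($cache, '/hotels', $renderCount);  // generated and stored
$second = servePage($cache, '/hotels', $renderCount); // served from the cache
```

The second request returns the stored copy, so the generation code runs only once.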
33. Page cache: takes snapshots of entire pages including everything -
component, modules, plugins and a template.
The widest and the least flexible approach of all caching
options.
// enabled by the core system plugin -> site administrator's choice //
35. Component view cache and module cache form a group, as they
both create a static copy of the complete output of a component
or a module
J1.5.
Records based on the calling URL // UNSAFE, DOS //
J1.6.
Changes // more later //
36. Important difference // J1.5., changed in J1.6! //
module caching can only be set on/off for all instances,
can’t be controlled from within the module
37. Most widespread cache type, sometimes equated with Joomla
caching in general.
Performs well in terms of speed
Disables any
user<->extension<->framework
interaction until a cached copy has expired
Not suitable for any components or modules that react to
user actions or render frequently changing content.
38. Side effect
Cached copy includes only the module's or component's own
output -> any external file that is added by using
methods like $document->addStyleSheet() won't get included
Workarounds are performed to get around these limitations
Catch-22
Workarounds require computing time -> diminish the
effect
40. The first of flexible caching types that enable us to
differentiate between various parts of the extension and
cache only those parts that are cacheable, while keeping
dynamic parts uncached.
41. Caches results of function calls
Records (cacheID) based on the arguments passed to the
function
42. Often useful to cache model methods and keep views
uncached.
Example use
Model performing expensive queries or similar operations
is run only once to create a larger pool of data which is
then further manipulated inside the view (sorting,
pagination, calculations etc.)
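A minimal sketch of the idea in plain PHP, assuming a cache id derived from the callback name and its serialized arguments. `CallbackCache` and `HotelModel` are hypothetical names for illustration and do not reproduce Joomla's own callback controller.

```php
<?php
// Hypothetical helper: caches results of function calls,
// keyed by the callback name plus the arguments passed to it.
class CallbackCache
{
    private $results = array();

    public function call($callback, array $args = array())
    {
        // This sketch only handles function names and static class methods.
        $name = is_array($callback) ? implode('::', $callback) : $callback;
        $id = md5($name . serialize($args));
        if (!array_key_exists($id, $this->results)) {
            $this->results[$id] = call_user_func_array($callback, $args);
        }
        return $this->results[$id];
    }
}

// Stand-in for a model running an expensive query.
class HotelModel
{
    public static $queries = 0;

    public static function getHotels($country)
    {
        self::$queries++;
        return array($country . ' Hotel B', $country . ' Hotel A');
    }
}

$cache = new CallbackCache();
$hotels = $cache->call(array('HotelModel', 'getHotels'), array('Spain'));
$again = $cache->call(array('HotelModel', 'getHotels'), array('Spain')); // cached
$other = $cache->call(array('HotelModel', 'getHotels'), array('Italy')); // new id

// View-side manipulation (here: sorting) stays uncached.
sort($hotels);
```

Only two model calls actually run: the repeated call with the same argument is answered from the cache, while the view is free to sort or paginate its own copy.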
48. Fully controlled by the coder – what, when to store, how
to classify stored units (cache id).
Highly useful when we are dealing with finite number of
reusable data units
49. Example use
High number of possible combinations of relatively small
number of units – e.g. products in online store. No point
to cache multiple parameter searches, cache each product
separately.
Other examples
Expensive queries, remote XML, thumbnails, reusable
texts or any reusable data set.
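The per-unit approach can be sketched as follows, assuming each product is stored under its own cache id. The functions `loadProduct()`, `getProduct()` and `search()` and the id format `product.N` are hypothetical, and the array stands in for a real cache store.

```php
<?php
$unitCache = array(); // hypothetical store: cache id => product data
$loads = 0;           // counts expensive loads that actually ran

// Stand-in for an expensive query or remote XML request.
function loadProduct($productId, &$loads)
{
    $loads++;
    return array('id' => $productId, 'name' => 'Product ' . $productId);
}

// Each product is cached separately under its own cache id.
function getProduct($productId, array &$unitCache, &$loads)
{
    $cacheId = 'product.' . $productId;
    if (!isset($unitCache[$cacheId])) {
        $unitCache[$cacheId] = loadProduct($productId, $loads);
    }
    return $unitCache[$cacheId];
}

// A search result is assembled from cached units, not cached as a whole.
function search(array $productIds, array &$unitCache, &$loads)
{
    $results = array();
    foreach ($productIds as $productId) {
        $results[] = getProduct($productId, $unitCache, $loads);
    }
    return $results;
}

$searchA = search(array(1, 2, 3), $unitCache, $loads);
$searchB = search(array(2, 3, 4), $unitCache, $loads); // reuses products 2 and 3
```

Two different searches share the units they have in common, so six product lookups cost only four loads.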
50. Also useful to pass large amounts of data between
different parts of application (e.g. steps in click flow)
Used in Joomla core: list of components, list of modules,
menu tree, available languages, user groups, html
renderers etc.
53. DIAGNOSIS
Largest portion of time spent on remote XML requests,
waiting for and receiving replies.
Other two performance hogs were resizing images and
internal database queries.
2nd and other pages of paginated results displayed
instantly
54. Each result from the badly designed primary XML is paired with
a description and an image that come from separate
XML sources (one by one)
[Diagram repeated from slide 18.]
55. [Diagram repeated from slide 18, with the label SLOW added.]
56. SOLUTION
Cache data on multiple levels:
• Cache pages that were fully rendered - a lot of users
search with default parameters or click lastminute
links, most just check one or two pages of paginated
results
• Cache XML queries (with longer lifetime). Pass data
from short results to detail pages
• Long-term caching of images and descriptions -> most
important one
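One way to picture the levels is a cache with a separate lifetime per group. The sketch below is plain PHP with a hypothetical `TtlCache` class. The lifetimes (15 minutes, 1 hour, 30 days) are assumptions chosen for illustration only, as the slides give no actual values. The current time is passed in explicitly so the behaviour is easy to follow.

```php
<?php
// Hypothetical cache with one lifetime (in seconds) per group.
class TtlCache
{
    private $items = array();
    private $lifetimes;

    public function __construct(array $lifetimes)
    {
        $this->lifetimes = $lifetimes;
    }

    public function store($group, $id, $data, $now)
    {
        $this->items[$group][$id] = array(
            'data' => $data,
            'expires' => $now + $this->lifetimes[$group],
        );
    }

    // Returns the data, or false when it is missing or expired.
    public function get($group, $id, $now)
    {
        if (!isset($this->items[$group][$id])) {
            return false;
        }
        $item = $this->items[$group][$id];
        return $now < $item['expires'] ? $item['data'] : false;
    }
}

// Assumed lifetimes: pages shortest, XML longer, descriptions long-term.
$cache = new TtlCache(array(
    'page' => 900,
    'xml' => 3600,
    'description' => 2592000,
));

$t0 = 1000000;
$cache->store('page', 'search.default.p1', '<html>results</html>', $t0);
$cache->store('xml', 'query.default', '<results/>', $t0);
$cache->store('description', 'hotel.42', 'Seaside hotel', $t0);

$later = $t0 + 1800; // half an hour later
$pageHit = $cache->get('page', 'search.default.p1', $later);
$xmlHit = $cache->get('xml', 'query.default', $later);
$descHit = $cache->get('description', 'hotel.42', $later);
```

After half an hour the rendered page has expired, but it can be rebuilt from the still-cached XML and descriptions without any remote request.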
57. ALSO..
Add more indexes to MySQL tables
Query and render images and descriptions per page:
primary source queried once, secondary separately for
each subpage (pagination).
58. RESULT
Site runs 10 times faster
Average first page loads in 5-6 seconds
Subpages (paginated) load in 0.5 - 1 seconds
Those timings drop dramatically further (50% or more) at
peak hours, when the most popular searches are returned
from the cache.
The more users on the site, the faster it goes.
// within the hosting limits //
62. Most important framework changes
Cache handlers are now known as cache controllers (page,
view, output, callback)
Parent JCacheController was added - among other things
it controls raw get and store calls.
63. New cachelite and wincache storage handlers (drivers).
File cache handler heavily optimised
All other handlers fixed with missing functions (gc, clean)
added, their code cleaned and tested and should now be
working properly.
Semaphore locking was added for reliability and
improved performance
65. CMS FRAMEWORK LEVEL CHANGES
Caching implemented in all components and modules that
can potentially gain from using it
Caching added to some most expensive and frequent
framework calls:
JComponentHelper::_load(), JModuleHelper::_load(), JMenuSite::load(),
JDocumentHTML::getBuffer()..
66. USER LEVEL FUNCTIONAL CHANGES
Cache administration (Clean cache, Purge cache) now
works with all storage handlers
New standalone garbage collect script to be run from a
crontab added
libraries/joomla/utilities/garbagecron.php
68. COMPONENT VIEW CACHE
Takes an array of url parameters and their types to create
Cacheid.
A replacement for a previous unsafe way which took the
whole URL and so opened the doors for DOS attacks via
random parameters/values added to request
// Old cacheid created from URL retained for backwards compatibility if there are no
$safeurlparams (to be removed in 1.7) //
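The difference between the two approaches can be sketched in plain PHP. `makeSafeCacheId()` is a hypothetical function, not the Joomla implementation, and it handles only `int` and string types. The point is that parameters outside the whitelist never reach the cache id.

```php
<?php
// Hypothetical sketch: build a cache id only from whitelisted,
// type-cast URL parameters, ignoring everything else in the request.
function makeSafeCacheId(array $request, array $safeUrlParams)
{
    $safe = array();
    foreach ($safeUrlParams as $name => $type) {
        if (!isset($request[$name])) {
            continue;
        }
        // Cast to the declared type; anything that is not 'int' is a string here.
        $safe[$name] = ($type === 'int') ? (int) $request[$name] : (string) $request[$name];
    }
    ksort($safe); // id must not depend on the order of the whitelist
    return md5(serialize($safe));
}

$safeUrlParams = array('id' => 'int', 'Itemid' => 'int');

$normal = array('id' => '12', 'Itemid' => '5');
$attack = array('id' => '12', 'Itemid' => '5', 'rnd' => 'a8f3'); // random extra parameter
$otherItem = array('id' => '13', 'Itemid' => '5');

$idNormal = makeSafeCacheId($normal, $safeUrlParams);
$idAttack = makeSafeCacheId($attack, $safeUrlParams);
$idOther = makeSafeCacheId($otherItem, $safeUrlParams);
```

A request with a random extra parameter maps to the same cache id as the clean request, so it cannot be used to fill the cache with endless copies of one page.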
70. MODULE CACHE
5 different modes of operation, 3 of them are to be set
from module XML file, while 2 are meant to be used from
within the module itself.
Default is backwards compatible oldstatic mode that
requires no changes to a module.
71. MODES TO BE SET IN XML
Static - one cache file for all pages with the same module
parameters
Oldstatic - 1.5. definition of module caching, one cache
file for all pages with the same module id and user aid.
Default for backwards compatibility
Itemid - changes on itemid change
72. In addition to cache field that was required in 1.5 there is
now new hidden field called cachemode that sets any of
the above modes.
<field name="cachemode" type="hidden" default="static">
<option value="static"></option>
</field>
73. MODES TO BE CALLED FROM INSIDE THE MODULE:
Safeuri - id is created from URL params array, as in
component view cache
Id - module sets own cache id's
To use these modes, rename the 'cache' field in the XML to
'owncache' and call JModuleHelper::ModuleCache from within the module
// actually a shortcut to cache callback to avoid code duplication in every module //
74. An example that uses safeuri mode and replaces uncached
$list = modRelatedItemsHelper::getList($params) :
// Create the parameter object first, then fill it in
$cacheparams = new stdClass;
$cacheparams->cachemode = 'safeuri';
$cacheparams->class = 'modRelatedItemsHelper';
$cacheparams->method = 'getList';
$cacheparams->methodparams = $params;
// URL parameters (and their types) used to build the cache id
$cacheparams->modeparams = array('id'=>'int','Itemid'=>'int');
$list = JModuleHelper::ModuleCache ($module, $params, $cacheparams);
75. RAW CACHE
Raw cache get and store are easily accessed by passing ''
(empty string) as cache controller to JFactory::getCache
Data is automatically serialized / deserialized
Locking & unlocking are performed automatically
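To show what serializing and locking involve when done by hand, here is a plain PHP sketch of a file-based store. `rawStore()` and `rawGet()` are hypothetical functions using PHP's `flock()`. They do not reproduce Joomla's internal locking, which the raw cache handles for you.

```php
<?php
// Hypothetical file-based store: serializes data and holds an
// exclusive lock while writing.
function rawStore($file, $data)
{
    $handle = fopen($file, 'c'); // create if missing, do not truncate yet
    if ($handle === false) {
        return false;
    }
    flock($handle, LOCK_EX);     // block other readers and writers
    ftruncate($handle, 0);
    fwrite($handle, serialize($data));
    fflush($handle);
    flock($handle, LOCK_UN);
    fclose($handle);
    return true;
}

// Reads under a shared lock and unserializes the stored data.
function rawGet($file)
{
    if (!is_file($file)) {
        return false;
    }
    $handle = fopen($file, 'r');
    if ($handle === false) {
        return false;
    }
    flock($handle, LOCK_SH);     // wait for any writer to finish
    $contents = stream_get_contents($handle);
    flock($handle, LOCK_UN);
    fclose($handle);
    return unserialize($contents);
}

$file = tempnam(sys_get_temp_dir(), 'rawcache');
$data = array('hotel' => 'Seaside', 'prices' => array(120, 95));

$stored = rawStore($file, $data);
$loaded = rawGet($file);
unlink($file); // clean up the temporary file
```

The array comes back unchanged, and a reader can never see a half-written file because the writer holds the exclusive lock until it is done.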