With more than 6 million pageviews each month and a place on alexa.com's list of the 200 most visited websites in Portugal, SerBenfiquista.com is probably the most popular Drupal-powered website in Portugal. It has been built and maintained since 2001 by a community of fans with scarce resources for running it, so performance has to be planned carefully to withstand usage peaks that can reach about 2000 online users on match days. This session details every step we took during the migration to a Drupal architecture, from design and architecture through the configuration of several cache levels to database and server tuning. Pressflow, AuthCache, Memcached, nginx, APC and lots of imagination all play a part and will star in the credits. Unfortunately not everything is as glorious as Benfica, so we will also talk about the problems we have had since launch, how small details can become critical under an unexpected traffic load, and what we learnt from the experience to prevent them.
Serbenfiquista.com Drupal Performance Case Study, Drupal Camp Lisbon 2011
1. Drupal Performance - Case Study
Hernâni Borges de Freitas
hernani@acquia.com
@hernanibf
DrupalCamp Lisboa 2011
2. Biggest online fan community about Benfica.
About 10 years old.
Done as a hobby by:
▪ Staff: around 20 people
▪ Members: ~31 000 users, 8000 active members
Articles, blog aggregation, image gallery, forum, press aggregation, matches and players’ profiles.
3. January 2011
7M pageviews total
▪ 225k / day
590,848 visits
▪ 19k visits / day
▪ ~8k unique visits / day
12 pages per session (!)
5500 forum messages per day.
50 blogs aggregated.
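The daily averages and the pages-per-session figure follow from the monthly totals. A minimal sketch of that arithmetic, assuming the "7M" total is a rounded figure and that the averages are taken over the 31 days of January:

```php
<?php
// Monthly totals as quoted on the slide; 31 is the number of days in January.
$pageviews = 7000000;  // "7M" is rounded, so results are approximate
$visits    = 590848;
$days      = 31;

$pageviews_per_day = $pageviews / $days;    // roughly 225k
$visits_per_day    = $visits / $days;       // roughly 19k
$pages_per_visit   = $pageviews / $visits;  // roughly 12

printf("%d pageviews/day, %d visits/day, %.1f pages/visit\n",
  round($pageviews_per_day), round($visits_per_day), $pages_per_visit);
```

The result is consistent with the rounded values on the slide: about 225k pageviews and 19k visits per day, and close to 12 pages per session.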
4. According to Alexa.com (Feb 2011)
170th most visited website in Portugal.
In the top 50 Portuguese websites.
Most popular Portuguese website made in DRUPAL!
5. Technology was slowing us down
Custom-designed CMS, with 10 years of legacy code and cache control based on Smarty.
Site must be done by the community
Workflow and revision process was weak and unsafe
Developments took too much time
Hard to implement or change existing features.
Performance problems on traffic peaks
Around 1700 users at the same time.
6. Dedicated server
Quad Xeon 2.4 GHz
4 GB memory
Lighttpd/Apache as app server
PHP with eAccelerator.
Most pageviews are seen by registered users.
Most pageviews are generated by the forum (based on Simple Machines Forum).
Cache control was done using Smarty and the forum cache system.
7. Development started in August 2010
Done in spare time by 1 Drupalista.
We went live on 24th October 2010 (2 months of dev...)
Website redone from scratch
Data migration done using custom scripts
60k nodes, 30k users, 2.5k terms, 16 content types
100 modules
Portuguese/English, Web/Mobile site
9. Use Pressflow!
High-speed Drupal fork
▪ Optimized for PHP5 and MySQL (no wrapper functions)
▪ Designed for database replication and reverse proxies.
▪ Squid / Varnish
▪ Optimized session and path handling
▪ No anonymous sessions (lazy session creation)
▪ Fast path alias detection
Avoids tons of calls to drupal_lookup_path.
10. Use the Devel module
Use Xdebug profiling.
Identify the heaviest pages, functions and queries.
Start with the most visited pages
Try to identify which functions take the most time.
You’ll find an unpleasant detail
Drupal bootstrap on a normal site can be slow.
Great for understanding how Drupal core works!
Great for measuring cost vs importance
11. Generic Drupal cache handling
Page Cache
• Only for anonymous users
Content Cache
• Block cache
• Filter cache
• Modules cache
Function Cache
• Static variables already fetched from db
12. If the user is anonymous and the cache system is active
On bootstrap
▪ Check if we have a valid cached page for that URL and deliver it without loading all modules, rendering all regions... FAST!
Blocks and some content are also cached, and can be served to authenticated users.
Cached content is stored in database tables
Tables are flushed when:
▪ Nodes and comments are posted
▪ Cron runs
▪ Explicit calls to cache_clear_all are made
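The anonymous page cache flow described above can be modelled in a few lines. This is only an illustrative sketch, not Drupal's implementation: `serve_page()`, the `$page_cache` array (standing in for the cache tables) and the `$build` callable (standing in for the full bootstrap and render) are hypothetical names.

```php
<?php
// Hypothetical in-memory stand-in for Drupal's page cache table.
$page_cache = array();

/**
 * Illustrative only: serve a cached page to anonymous users, otherwise build it.
 * $build is a callable standing in for the full bootstrap and page render.
 */
function serve_page($url, $is_anonymous, $build, array &$page_cache) {
  if ($is_anonymous && isset($page_cache[$url])) {
    // Cache hit: nothing is loaded or rendered.
    return $page_cache[$url];
  }
  $html = call_user_func($build, $url);
  if ($is_anonymous) {
    // Only anonymous responses are stored in the page cache.
    $page_cache[$url] = $html;
  }
  return $html;
}

$renders = 0;
$build = function ($url) use (&$renders) {
  $renders++;
  return "<html>page for $url</html>";
};

serve_page('/noticias', TRUE, $build, $page_cache);   // miss: rendered and stored
serve_page('/noticias', TRUE, $build, $page_cache);   // hit: served from cache
serve_page('/noticias', FALSE, $build, $page_cache);  // authenticated: always rendered
echo $renders;  // 2
```

Only the first anonymous request and the authenticated request pay for a render, which is why this cache alone does little for a site whose traffic is mostly logged in.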
13. However, most of our traffic is authenticated
We can’t use Drupal’s base cache
There’s a module for that! => authcache
AuthCache
▪ Registers cookie variables on login, like roles, login name and login profile.
▪ On the page_early_cache bootstrap, verifies whether there’s a cached version of that page for the roles the user belongs to.
▪ If there isn’t, does a full_bootstrap, renders the page and saves the rendered version in the cache for future use.
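The key idea is that the cache entry is shared by everyone holding the same set of roles, not stored per user. A minimal sketch of such a key, where `role_cache_key()` and the role names are hypothetical and the key format is not the one authcache actually uses:

```php
<?php
/**
 * Hypothetical helper: build a cache key shared by every user with the same roles.
 */
function role_cache_key($url, array $roles) {
  sort($roles);  // the same role set in any order yields the same key
  return md5(implode(',', $roles)) . ':' . $url;
}

$key_a = role_cache_key('/forum', array('member', 'authenticated user'));
$key_b = role_cache_key('/forum', array('authenticated user', 'member'));
$key_c = role_cache_key('/forum', array('authenticated user', 'moderator'));

var_dump($key_a === $key_b);  // bool(true): same roles, same cached page
var_dump($key_a === $key_c);  // bool(false): different roles, different page
```

With a handful of roles, a popular page needs only a handful of cached variants, however many members are online.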
14. Include a setting in settings.php:
$conf['cache_inc'] = './sites/all/modules/authcache/authcache.inc';
Configure roles and pages to cache
We are not caching anything for editors/moderators, nor any page in the admin section or content editing.
Be careful with AJAX stuff..
15. Small problem: every page looks the same to everyone.
We want to customize the header with a pleasant message.
Authcache’s recommendation is to do page replacements using AJAX calls => more HTTP calls
To avoid HTTP traffic I tweaked the authcache module to do a str_replace of a certain zone, and started storing cached pages non-gzipped.
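The slides do not show the tweak itself, so the following is only a sketch of the general technique: the cached HTML carries a marker, and the marker is swapped for a per-user message just before the page is sent. The marker string, the function name and the greeting text are all assumptions.

```php
<?php
// Hypothetical marker; the slides do not show the one actually used.
define('GREETING_PLACEHOLDER', '<!--user-greeting-->');

/**
 * Illustrative only: personalize a cached page shared by a whole role.
 */
function personalize_cached_page($cached_html, $username) {
  // Escape the name, since it ends up in HTML.
  $greeting = 'Welcome, ' . htmlspecialchars($username, ENT_QUOTES, 'UTF-8') . '!';
  return str_replace(GREETING_PLACEHOLDER, $greeting, $cached_html);
}

$cached = '<div id="header">' . GREETING_PLACEHOLDER . '</div><div id="content">...</div>';
$page = personalize_cached_page($cached, 'eusebio');
echo $page;
```

A string replacement like this needs the cached copy as plain HTML, which fits the decision to stop storing cached pages gzipped.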
16. MySQL is not designed to be used as a cache backend.
Fortunately Drupal allows pluggable cache backends.
We started by using Cache Router with memcached
One bin for most cache tables
We ended up using the memcache module because of several crashes and white screens.
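For reference, a settings.php fragment for the Drupal 6 memcache module on its own typically looks like the sketch below. The server address is an assumption, and the slides do not show how this was combined with the authcache.inc setting, so treat it as a starting point, not as the site's actual configuration.

```php
<?php
// In a real settings.php $conf may already exist; this line only keeps the sketch self-contained.
$conf = isset($conf) ? $conf : array();

// Use the memcache module's cache backend in place of the database tables.
$conf['cache_inc'] = './sites/all/modules/memcache/memcache.inc';

// Assumed: a single memcached instance on the same host, default port.
$conf['memcache_servers'] = array('127.0.0.1:11211' => 'default');

// Map the default cache bin to that server cluster.
$conf['memcache_bins'] = array('cache' => 'default');
```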
18. On popular pages use custom cache!
  // Serve the rendered block from cache_block when it is available.
  if ($cached = cache_get('artigos:home:cronicas', 'cache_block')) {
    $block->content = $cached->data;
  }
  else {
    // Cache miss: render the view and append the "View all" link.
    $view = views_get_view('artigos');
    $view->set_display('panel_pane_1');
    $block->content = $view->render();
    $block->content .= '<span style="clear:both;float:right">' . l(t('View all'), 'cronicas') . '</span>';
    // Store the rendered HTML for the next request.
    cache_set('artigos:home:cronicas', $block->content, 'cache_block');
  }
We are storing it in cache_block so that the block is refreshed when new content arrives.
19. UPDATE sessions SET uid = 1, cache = 0, hostname = '10.1.1.2', session = '', timestamp = 1243567406 WHERE sid = '74db42a7f35d1d54fc6b274ce840736e'
The sessions table was heavily used.
We replaced it with the memcache session module.
$conf['session_inc'] = './sites/all/modules/memcache/memcache-session.inc';
Serious drop in server load
Pressflow already does not store sessions for anonymous users => the non_anon module does the same.
20. In forum pages just call what is needed.
  require_once './includes/bootstrap.inc';
  // Bootstrap only as far as path handling, not the full stack.
  drupal_bootstrap(DRUPAL_BOOTSTRAP_PATH);
  $arg0 = arg(0);
  if ($arg0 == 'forum') {
    require_once './includes/common.inc';
    // Load only the modules the forum page needs.
    drupal_load('module', 'serbenfiquista');
    drupal_load('module', 'filter');
    drupal_load('module', 'locale');
    require_once './includes/theme.inc';
    $content = render_forum();
    /* do some load of modules when content not cached, vars will be available at theme …. */
    require_once './sites/default/themes/serbenfiquista/page.tpl.php';
  }
21. Remember we were redirecting only non-static content to Apache
  $HTTP["url"] !~ ".(js|css|png|gif|jpg|ico|txt|swf)$" {
    proxy.server = ( "" => ( ( "host" => "localhost", "port" => 81 ) ) )
  }
What about aggregated CSS/JS files and imagecache files? => They were going to Apache as well..
23. Apache was handling too many connections.
We were running out of memory, with no more connections available...
After that nightmare of a night we decided to switch to nginx.
24. Using PHP in php-fpm mode
Configuration based on perusio’s examples: https://github.com/perusio/drupal-with-nginx
Using APC 3.1.7 as opcode cache, and for the shared keys in memory used by SMF.
  apc.enabled = 1
  apc.shm_segments = 1
  apc.shm_size = 300
  apc.max_file_size = 100M
  apc.stat = 0 ; avoid checking whether a file was changed
25. Don’t forget to use CSS and JS aggregation to avoid HTTP connections
Index customization (use EXPLAIN to understand queries)
  SELECT title, created, nid FROM node USE INDEX (node_status_type) WHERE status = 1 AND type = 'usernews' ORDER BY created DESC LIMIT 0, 4
Run cron twice a day
  0 */12 * * * cd /var/www/html/ && drush core-cron >> /var/log/crondrupal
Do not use Drupal’s cron to import feeds; run the import separately
  */10 * * * * cd /var/www/html/ && drush feeds-refresh blogs >> /var/log/cronblogs
Use Apache Solr to index your content.
26. We have load peaks when some content is changed: most of the cached content is erased.
Control in detail what is cached and expire only what is invalid.
Pre-cache most page details.
Use Cache Actions and Rules to clear specific views / blocks / panes.
When a page is regenerated its components are already rendered.
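Expiring only what is invalid comes down to knowing which cache entries depend on which content. A minimal standalone sketch of that idea follows: the in-memory `$cache` array, the content type names and both helper functions are hypothetical, and on the real site this job was given to Cache Actions and Rules.

```php
<?php
// Hypothetical in-memory stand-in for a Drupal cache bin, keyed by cache ID.
$cache = array(
  'artigos:home:cronicas' => '<ul>...cronicas...</ul>',
  'artigos:home:noticias' => '<ul>...noticias...</ul>',
  'forum:latest'          => '<ul>...posts...</ul>',
);

/**
 * Hypothetical mapping from a content type to the cache IDs that display it.
 */
function cache_ids_for_type($type) {
  $map = array(
    'cronica' => array('artigos:home:cronicas'),
    'noticia' => array('artigos:home:noticias'),
  );
  return isset($map[$type]) ? $map[$type] : array();
}

/**
 * Expire only the entries affected by a change to one content type.
 */
function expire_for_type($type, array &$cache) {
  foreach (cache_ids_for_type($type) as $cid) {
    unset($cache[$cid]);
  }
}

expire_for_type('cronica', $cache);
print_r(array_keys($cache));  // 'artigos:home:noticias' and 'forum:latest' remain
```

In Drupal 6 the equivalent of the `unset()` would be a call such as `cache_clear_all('artigos:home:cronicas', 'cache_block')`, so a new article rebuilds one block and leaves the rest of the cache warm.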
27. SerBenfiquista.com by:
Alberto Rodrigues, André Garcia, André Sabino, André Marques, António Alves, António Costa, Bernardo Azevedo, Diogo Sarmento, Élvio da Silva, Filipe Varela, Francisco Castro, Hugo Manita, Hernâni Freitas, João Pessoa Jorge, João Cunha, João Mariz, José Barata, Isabel Cutileiro, Luis Correia, Miguel Ferreira, Nelson Vidal, Nuno Silva, Paulo Paulos, Pedro Lança, Pedro Neto, Rafael Santos, Rodrigo Marques, Ricardo Martins, Ricardo Solnado, Valter Gouveia and plenty of others!