Most PHP developers focus on writing code. But creating Web applications is about much more than just wrting PHP. Take a step outside the PHP cocoon and into the big PHP ecosphere to find out how small code changes can make a world of difference on servers and network. This talk is an eye-opener for developers who spend over 80% of their time coding, debugging and testing.
12. Who am I ?
Wim Godden (@wimgtr)
Founder of Cu.be Solutions (http://cu.be)
Open Source developer since 1997
Developer of PHPCompatibility, PHPConsistent, Nginx SLIC, ...
Speaker at PHP and Open Source conferences
13. Cu.be Solutions ?
Open source consultancy
PHP-centered (ZF2, Symfony2, Magento, Pimcore, ...)
Training courses
High-speed redundant network (BGP, OSPF, VRRP)
High scalability development
Nginx + extensions
MySQL Cluster
Projects :
mostly IT & Telecom companies
lots of public-facing apps/sites
14. Who are you ?
Developers ?
Anyone setup a MySQL master-slave ?
Anyone setup a site/app on separate web and database server ?
→ How much traffic between them ?
15. The topic
Things we take for granted
Famous last words : "It should work just fine"
Works fine today
→ might fail tomorrow
Most common mistakes
PHP code ↔ PHP ecosystem
17. Database queries – complexity
SELECT DISTINCT n.nid, n.uid, n.title, n.type, e.event_start, e.event_start AS event_start_orig,
e.event_end, e.event_end AS event_end_orig, e.timezone, e.has_time, e.has_end_date, tz.offset
AS offset, tz.offset_dst AS offset_dst, tz.dst_region, tz.is_dst, e.event_start - INTERVAL IF(tz.is_dst,
tz.offset_dst, tz.offset) HOUR_SECOND AS event_start_utc, e.event_end - INTERVAL IF(tz.is_dst,
tz.offset_dst, tz.offset) HOUR_SECOND AS event_end_utc, e.event_start - INTERVAL IF(tz.is_dst,
tz.offset_dst, tz.offset) HOUR_SECOND + INTERVAL 0 SECOND AS event_start_user,
e.event_end - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND + INTERVAL 0
SECOND AS event_end_user, e.event_start - INTERVAL IF(tz.is_dst, tz.offset_dst, tz.offset)
HOUR_SECOND + INTERVAL 0 SECOND AS event_start_site, e.event_end - INTERVAL
IF(tz.is_dst, tz.offset_dst, tz.offset) HOUR_SECOND + INTERVAL 0 SECOND AS event_end_site,
tz.name as timezone_name FROM node n INNER JOIN event e ON n.nid = e.nid INNER JOIN
event_timezones tz ON tz.timezone = e.timezone INNER JOIN node_access na ON na.nid = n.nid
LEFT JOIN domain_access da ON n.nid = da.nid LEFT JOIN node i18n ON n.tnid > 0 AND n.tnid =
i18n.tnid AND i18n.language = 'en' WHERE (na.grant_view >= 1 AND ((na.gid = 0 AND na.realm =
'all'))) AND ((da.realm = "domain_id" AND da.gid = 4) OR (da.realm = "domain_site" AND da.gid =
0)) AND (n.language ='en' OR n.language ='' OR n.language IS NULL OR n.language = 'is' AND
i18n.nid IS NULL) AND ( n.status = 1 AND ((e.event_start >= '2010-01-31 00:00:00' AND
e.event_start <= '2010-03-01 23:59:59') OR (e.event_end >= '2010-01-31 00:00:00' AND
e.event_end <= '2010-03-01 23:59:59') OR (e.event_start <= '2010-01-31 00:00:00' AND
e.event_end >= '2010-03-01 23:59:59')) ) GROUP BY n.nid HAVING (event_start >= '2010-02-01
00:00:00' AND event_start <= '2010-02-28 23:59:59') OR (event_end >= '2010-02-01 00:00:00' AND
event_end <= '2010-02-28 23:59:59') OR (event_start <= '2010-02-01 00:00:00' AND event_end >=
'2010-02-28 23:59:59') ORDER BY event_start ASC;
18. Database - indexing
'select id from stock where status = 2 order by qty'
→ aggregate index on (status, qty)
'select id from stock where status > 2 order by qty'
→ aggregate index on (status, qty) ?
→ Depends :
- Btree : yes
- Hash : range selection stops use of aggregate index
→ separate index on status and qty (since recent versions)
19. Database - indexing
Indexes make database faster
→ Let's index everything !
→ DON'T :
Insert/update/delete → Index modification
Each select → evaluation of all indexes
"Relational schema design is based on data
but index design is based on queries"
- Bill Karwin,
author of “SQL Antipatterns”
20. Databases – detecting problematic queries
Slow query log
→ SET GLOBAL slow_query_log = ON;
Queries not using indexes
→ In my.cnf/my.ini : 'log_queries_not_using_indexes'
General query log
→ SET GLOBAL general_log = ON;
→ Turn it off quickly !
Percona Toolkit
pt-query-digest
25. Databases – next step : explain
Type of lookup
'system', 'const' and 'ref' = good
'ALL' = bad
Extra info
Using index = good
Using filesort = usually bad
26. For / foreach (N+1 problem)
$customers = CustomerQuery::create()
->filterByState('MN')
->find();
foreach ($customers as $customer) {
$contacts = ContactsQuery::create()
->filterByCustomerid($customer->getId())
->find();
foreach ($contacts as $contact) {
doSomestuffWith($contact);
}
}
28. Better...
10001 → 1 query
Sadly : people still produce code with query loops
Usually :
Growth not anticipated
Internal app → Public app
29. The origins of this talk
Customers :
Projects we built
Projects we didn't build, but got pulled into
Fixes
Changes
Infrastructure migration
15 years of 'how to cause mayhem with a few lines of code'
30. Client X
Jobs search site
Monitor job views :
Daily hits
Weekly hits
Monthly hits
Which user saw which job
31. Client X
Originally : when user viewed job details
Now : when job is in search result
Search for 'php' → 50 jobs = 50 jobs to be updated
→ 50 updates for shown_today
→ 50 updates for shown_week
→ 50 updates for shown_month
→ 50 inserts for shown_user
= 200 queries for 1 search !
32. Client X : the code
foreach ($jobs as $job) {
$db->query("
insert into shown_today(
jobId,
number
) values(
" . $job['id'] . ",
1
)
on duplicate key
update
number = number + 1
");
$db->query("
insert into shown_week(
jobId,
number
) values(
" . $job['id'] . ",
1
)
on duplicate key
update
number = number + 1
");
$db->query("
insert into shown_month(
jobId,
number
) values(
" . $job['id'] . ",
1
)
on duplicate key
update
number = number +
1
");
$db->query("
insert into shown_user(
jobId,
userId,
when
) values (
" . $job['id'] . ",
" . $user['id'] . ",
now()
)
");
}
37. Client X : possible cause ?
Code changes ?
→ According to developers : none
Action : turn on general log, analyze with pt-query-digest
→ 50+-fold increase in 4 queries
→ Developers : 'Oops we did make a change'
After 3 days : 2,5 days behind
Every hour : 50 min extra lag
38. Client X : But why is the slave lagging ?
Master Slave
File :
master-bin-xxxx.log
File :
master-bin-xxxx.logSlave I/O thread
Binlog dump
thread
Slave
SQL
thread
41. Client X : fix ?
foreach ($jobs as $job) {
$db->query("
insert into shown_today(
jobId,
number
) values(
" . $job['id'] . ",
1
)
on duplicate key
update
number = number + 1
");
$db->query("
insert into shown_week(
jobId,
number
) values(
" . $job['id'] . ",
1
)
on duplicate key
update
number = number + 1
");
$db->query("
insert into shown_month(
jobId,
number
) values(
" . $job['id'] . ",
1
)
on duplicate key
update
number = number +
1
");
$db->query("
insert into shown_user(
jobId,
userId,
when
) values (
" . $job['id'] . ",
" . $user['id'] . ",
now()
)
");
}
42. Client X : the code change
insert into shown_today values (5, 1), (8, 1), (12, 1), (18, 1), … on duplicate key … ;
insert into shown_week values (5, 1), (8, 1), (12, 1), (18, 1), … on duplicate key … ;
insert into shown_month values (5, 1), (8, 1), (12, 1), (18, 1), … on duplicate key … ;
insert into shown_user values (5, 23, "2013-11-12 12:01:00"), (8, 23, "2013-11-12
12:01:00"), … ;
43. Client X : the code change
$todayQuery = "
insert into shown_today(
jobId,
number
) values ";
foreach ($jobs as $job) {
$todayQuery .= "(" . $job['id'] . ", 1),";
}
$todayQuery = substr($todayQuery, 0, strlen($todayQuery) - 1);
$todayQuery .= "
on duplicate key
update
number = number + 1
";
$db->query($todayQuery);
44. Client X : the chosen solution
$db->autocommit(false);
foreach ($jobs as $job) {
$db->query("
insert into shown_today(
jobId,
number
) values(
" . $job['id'] . ",
1
)
on duplicate key
update
number = number + 1
");
$db->query("
insert into shown_week(
jobId,
number
) values(
" . $job['id'] . ",
1
)
on duplicate key
update
number = number + 1
");
$db->query("
insert into shown_month(
jobId,
number
) values(
" . $job['id'] . ",
1
)
on duplicate key
update
number = number + 1
");
$db->query("
insert into shown_user(
jobId,
userId,
when
) values (
" . $job['id'] . ",
" . $user['id'] . ",
now()
)
");
}
$db->commit();
45. Client X : conclusion
For loops are bad (we already knew that)
Add master/slave and it gets much worse
Use transactions : it will provide huge performance increase
Better yet : use MariaDB 10 or higher → slave_parallel_threads
Result : slave caught up 5 days later
46. Database → Network
Customer Y
Top 10 site in Belgium
Growing rapidly
At peak traffic :
Unexplicable latency on database
Load on webservers : minimal
Load on database servers : acceptable
49. Client Y : network overload
Cause : Drupal hooks → retrieving data that was not needed
Only load data you actually need
Don't know at the start ? → Use lazy loading
Caching :
Same story
Memcached/Redis are fast
But : data still needs to cross the network
50. Network trouble : more than just traffic
Customer Z
150.000 visits/day
News ticker :
XML feed from other site (owned by same customer)
Cached for 15 min
51. Customer Z – fetching the feed
if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) {
unlink(APP_DIR . '/tmp/cacheFile.xml');
file_put_contents(
APP_DIR . '/tmp/cacheFile.xml',
file_get_contents('http://www.scrambledsitename.be/xml/feed.xml')
);
}
$xmlfeed = ParseXmlFeed(APP_DIR . '/tmp/cacheFile.xml');
What's wrong with this code ?
52. Customer Z – no feed without the source
Feed source
53. Customer Z – no feed without the source
Feed source
54. Customer Z : timeout
default_socket_timeout : 60 sec by default
Each visitor : 60 sec wait time
People keep hitting refresh → more load
More active connections → more load
Apache hits maximum connections → entire site down
60. Customer Z : process early
$context = stream_context_create(
array(
'http' => array(
'timeout' => 5
)
)
);
if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) {
$feed = file_get_contents(
'http://www.scrambledsitename.be/xml/feed.xml',
false,
$context
);
if ($feed !== false) {
file_put_contents(
APP_DIR . '/tmp/cacheFile.xml',
ParseXmlFeed($feed)
);
}
}
61. Customer Z : file_[get|put]_contents atomicity
if (filectime(APP_DIR . '/tmp/cacheFile.xml') < time() - 900) {
$feed = file_get_contents(
'http://www.scrambledsitename.be/xml/feed.xml',
false,
$context
);
if ($feed !== false) {
file_put_contents(
APP_DIR . '/tmp/cacheFile.xml',
ParseXmlFeed($feed)
);
}
}
Relying on user → concurrent requests → possible data corruption
Better : run every 15min through cronjob
62. Network resources
Use timeouts for all :
fopen
curl
SOAP
…
Data source trusted ?
→ setup a webservice
→ let them push updates when their feed changes
→ less load on data source
→ no timeout issues
Add logging → early detection
63. Logging
Logging = good
Logging in PHP using fopen
→ bad idea : locking issues
→ Use monolog : file, syslog, mail, Pushover, HipChat, Graylog,
Rollbar, ElasticSearch (and 50 more)
For Firefox : FirePHP (add-on for Firebug)
Debug logging = bad on production
Watch your logs !
Don't log on slow disks → I/O bottlenecks
64. File system : I/O bottlenecks
Causes :
Excessive writes (database updates, logfiles, swapping, …)
Excessive reads (non-indexed database queries, swapping, small file
system cache, …)
How to detect ?
top
iostat
See iowait ? Stop worrying about php, fix the I/O problem !
Cpu(s): 0.2%us, 3.0%sy, 0.0%ni, 61.4%id, 35.5%wa, 0.0%hi
avg-cpu: %user %nice %system %iowait %steal %idle
0.10 0.00 0.96 53.70 0.00 45.24
Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
sda 120.40 0.00 123289.60 0 616448
sdb 2.10 0.00 4378.10 0 18215
dm-0 4.20 0.00 36.80 0 184
dm-1 0.00 0.00 0.00 0 0
65. Much more than code
DB
server
Webserver
User
Network
XML feed
70. Step-by-step : most common issues
Using NFS ? Get rid of it ;-)
iowait on database server
I/O reads (use iostat) ? → missing/wrong indexes
I/O writes ?
→ no transactions ?
→ too many queries ?
→ bad DB engine settings
iowait on webserver (logs ? static files ?)
CPU on database server (missing/wrong/too many indexes)
CPU on webserver (PHP)
Notes de l'éditeur
5kbit/sec or 100Mbit/sec ?
Let&apos;s talk about code
Without : we don&apos;t exist
What are most common mistakes in ecosystem
Let&apos;s start with the database
time spent per query pattern
how many queries of that query pattern
Get back to what I said
Lots of people use ORM
- easier
- don&apos;t need to write queries
- object-oriented
but people start doing this
Imagine 10000 customers → 10001 queries
Not best code
Uses deprecated mysql extension
no error handling
Master : 16 CPU cores
12 cores for SQL
1 core for binlog dump
rest for system
Slave : 16 CPU cores
1 core for slave I/O
1 core for slave SQL
Grouping
Works fine, but :
maximum size of string ?
PHP = no limit
MySQL = max_allowed_packet
Grouping
Works fine, but :
maximum size of string ?
PHP = no limit
MySQL = max_allowed_packet
All in a single commit
Note : transaction has max. size
Possible : combination with previous solution
took few moments to figure out
No network monitoring
→ iptraf
→ 100Mbit/sec limit
→ packets dropped
→ connections dropped
Customer : upgrade switch
Us : why 100Mbit/sec ?
Databases → network
What other network related issues ?
Server on which feed located : crashed
Fine for few minutes (cache)
15 minutes : file_get_contents uses default_socket_timeout
Better, not perfect.
What else is wrong ?
Multiple visitors hit expiring cache
→ file delete
→ xml feed hit a lot
Better, not perfect.
What else is wrong ?
Multiple visitors hit expiring cache
→ file delete
→ xml feed hit a lot
Better, not perfect.
What else is wrong ?
Multiple visitors hit expiring cache
→ file delete
→ xml feed hit a lot
Better, not perfect.
What else is wrong ?
Multiple visitors hit expiring cache
→ file delete
→ xml feed hit a lot
Better, not perfect.
What else is wrong ?
Multiple visitors hit expiring cache
→ file delete
→ xml feed hit a lot
Better, not perfect.
What else is wrong ?
Multiple visitors hit expiring cache
→ file delete
→ xml feed hit a lot
How do you treat your data :
- where do you get it
- how long did you have to wait to get it
- how is it transported
- how is it processed
minimize the amount of data :
retrieved
transported
processed,
sent to db and users