Kill bottlenecks with gearman, sphinx, and memcached, Confoo 2011
1. Killing Bottlenecks
paul reinheimer @preinheimer 1
Thursday, March 10, 2011 1
2. Goals for Today Horror Stories
Introduce Tools
Help you recognize problems
Hints on implementation
2
Thursday, March 10, 2011 2
3. Provides a simple queue service
Workers connect and identify work they can complete, clients
hand off jobs when available
Created by Danga Interactive (livejournal)
3
Thursday, March 10, 2011 3
4. Stores data in memory
really really fast!
Many servers can be used at once to provide a single large
datastore
Created by Danga Interactive (livejournal)
4
Thursday, March 10, 2011 4
5. Excellent search tool
Import data from a variety of sources (commonly a database),
index it quickly, then run searches
5
Thursday, March 10, 2011 5
7. Workers sign up with work they can do, many workers can each
sign up with a different set of jobs.
Clients connect to Gearman and pass off jobs, this is done
asynchronously, they get to keep working
Failed jobs are re-submitted a specified number of times.
7
Thursday, March 10, 2011 7
8. Gearman
Workers sign up with work they can do, many workers can each
sign up with a different set of jobs.
Clients connect to Gearman and pass off jobs, this is done
asynchronously, they get to keep working
Failed jobs are re-submitted a specified number of times.
8
Thursday, March 10, 2011 8
17. Gearman Alternatives
RabitMQ
Amazon Simple Queue Service
ZeroMQ
15
Thursday, March 10, 2011 15
18. Stores data in memory
really really fast!
Many servers can be used at once to provide a single large
datastore
Created by Danga Interactive (livejournal)
16
Thursday, March 10, 2011 16
20. Promises
memcached doesn't lie, it just doesn't
make promises
18
Thursday, March 10, 2011 18
21. Promises
it makes a promise
19
Thursday, March 10, 2011 19
22. Promises
data will no longer be available after it expires
20
Thursday, March 10, 2011 20
23. Promises
When memcached doesn't have the
information you need, get it elsewhere.
Use a read-through caching callback or put it in
your caching object.
memcached is allowed to lose data!
21
Thursday, March 10, 2011 21
25. Hot Items = Hot Servers
23
Thursday, March 10, 2011 23
26. Hot Items = Hot Servers
23
Thursday, March 10, 2011 23
27. Memcached Alternatives
redis
Berkeley DB
Tokyo Cabinet
Tokyo Tyrant
Voldemort
Riak
24
Thursday, March 10, 2011 24
28. Excellent search tool
Import data from a variety of sources (commonly a database),
index it quickly, then run searches
25
Thursday, March 10, 2011 25
29. id title year rating
1 TMNT 1988 9.5
2 Goonies 1989 9.0
3 The Postman 1992 3.5
4 Alien 1994 8.5
26
Thursday, March 10, 2011 26
30. id title year rating studio category
1 TMNT 1988 9.5 x action
2 Goonies 1989 9.0 y adventure
3 The Postman 1992 3.5 z boring
4 Alien 1994 8.5 a awesome
27
Thursday, March 10, 2011 27
31. id title year rating studio category
1 TMNT 1988 9.5 x action
2 Goonies 1989 9.0 y adventure
3 The Postman 1992 3.5 z boring
4 Alien 1994 8.5 a awesome
28
Thursday, March 10, 2011 28
32. id director
film_id tag_id 1 blue
film_id description
1 2 2 red
1 Four awesome turtles...
2 3 3 green
2 Friends, pirates, goonies.
3 4 4 orange 3 Nap
4 5 4 They tried to turn around
id title year rating studio category
1 TMNT 1988 9.5 x action
2 Goonies 1989 9.0 y adventure
3 The Postman 1992 3.5 z boring
4 Alien 1994 8.5 a awesome
id tag
1 kids
id producer
id actor 1 1
2 awesome
1 Mel Gibson 2 2
3 sci-fi
2 Shredder 3 3
3 Sigourney Weaver 4 4
28
Thursday, March 10, 2011 28
33. id director
film_id tag_id 1 blue
film_id description
1 2 2 red
1 Four awesome turtles...
2 3 3 green
2 Friends, pirates, goonies.
3 4 4 orange 3 Nap
4 5 4 They tried to turn around
id title year rating studio category
1 TMNT 1988 9.5 x action
2 Goonies 1989 9.0 y adventure
3 The Postman 1992 3.5 z boring
4 Alien 1994 8.5 a awesome
id tag
1 kids
id producer
id actor 1 1
2 awesome
1 Mel Gibson 2 2
3 sci-fi
2 Shredder 3 3
3 Sigourney Weaver 4 4
29
Thursday, March 10, 2011 29
34. id director
film_id tag_id 1 film_id blue
director_id film_id description
1 film_id 2 tag_id 2 1 red 1 1 film_id desc_id
Four awesome turtles...
2 1 3 1 3 2 green 2 2 Friends, pirates, goonies.
1 1
3 2 4 2 4 3 orange 3 3 2 Nap 2
4 3 5 3 4 4 4 They tried to turn around
3 3
4 4 4 4
id title year rating studio category
1 TMNT 1988 9.5 x action
2 Goonies 1989 9.0 y adventure
3 The Postman 1992 3.5 z boring
4 Alien 1994 8.5 a awesome
id tag
1
id producer
film_id kids tag_id
id actor 1 film_id1 producer_id
2 1 awesome 1
1 film_id
Mel Gibson actor_id 2 1 2 1
3 2 sci-fi 2
2 Shredder
1 1 3 2 3 2
3 3
3 2
Sigourney Weaver 2 4 3 4 3
4 4
3 3 4 4
4 4
29
Thursday, March 10, 2011 29
35. SELECT
`dvd_scenes`.`scene_id`,
`dvds`.`dvd_id`,
`dvds`.`name`,
`dvds`.`name` AS sort_title,
`dvds`.`featured` AS premium,
`dvds`.`description`,
`dvds`.`date_released`,
`dvds`.`channel_id`,
`dvds`.`studio_id`,
`dvd_scenes_dyn`.`times_viewed` AS sub_order,
`dvd_scenes_dyn`.`rating` AS rating,
`channels`.`name` AS channel_name,
`studios`.`name` AS studio_name,
`channels_keywords`.`keywords` AS channel_keywords,
GROUP_CONCAT(DISTINCT `categories`.`name` ORDER BY `categories`.`name` DESC SEPARATOR ', ') AS categories_list,
GROUP_CONCAT(DISTINCT `models`.`name` ORDER BY `models`.`name` DESC SEPARATOR ', ') AS models_list
FROM `dvd_scenes`
INNER JOIN dvds
ON dvds.dvd_id = dvd_scenes.dvd_id
INNER JOIN dvd_scenes_dyn
ON dvd_scenes_dyn.id = dvd_scenes.scene_id
LEFT JOIN scenes_categories
ON dvd_scenes.scene_id = scenes_categories.scene_id
LEFT JOIN categories
ON `scenes_categories`.`category_id` = categories.category_id
LEFT JOIN studios
ON studios.studio_id = dvds.studio_id
LEFT JOIN scenes_models
ON scenes_models.scene_id = dvd_scenes.scene_id
LEFT JOIN `channels`
ON channels.channel_id = dvds.channel_id
LEFT JOIN `channels_keywords`
ON dvds.channel_id = channels_keywords.channel_id
LEFT JOIN `models`
ON scenes_models.model_id = models.model_id
WHERE dvds.is_active = 1
AND dvd_scenes.is_active = 1
AND dvds.date_released < UNIX_TIMESTAMP()
AND dvds.date_released > 0
AND dvds.channel_id NOT IN (21,26) 30
GROUP BY scene_id
Thursday, March 10, 2011 30
36. sphinx
http://www.flickr.com/photos/waldenpond/4237758518/ 31
Thursday, March 10, 2011 31
37. Sphinx Alternatives
Lucene
With Elastic Search
solr
32
Thursday, March 10, 2011 32