Gearman: A Job Server made for Scale
1. MinneBar April 7, 2012
A Job Server to Scale
By Mike Willbanks
Software Engineering Manager
CaringBridge
2. Housekeeping…
• Talk
Slides will be online later!
• Me
Software Engineering Manager at CaringBridge
MNPHP Organizer
Open Source Contributor (Zend Framework and various others)
Where you can find me:
• Twitter: mwillbanks
• G+: Mike Willbanks
• IRC (freenode): mwillbanks
• Blog: http://blog.digitalstruct.com
• GitHub: https://github.com/mwillbanks
3. Agenda
• What is Gearman
Yeah yeah…
• Main Concepts
How it really works
• Quick Start
Get it up and running and start playing.
• The Details
How can it be a tech talk without details?
• Some use cases
How you might use it.
• Questions
Although you can bring them up at any time!
5. Official Statement
“Gearman provides a generic application framework to farm
out work to other machines or processes that are better
suited to do the work. It allows you to do work in parallel,
to load balance processing, and to call functions between
languages.”
6. What The Hell? Tell me!
• Gearman consists of a daemon, client and worker
At the core, they are simply small programs.
• The daemon handles the negotiation of work
between workers and clients
• The worker does the work
• The client requests work to be done
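To make the three roles concrete, here is a toy, in-process sketch. It does not speak the Gearman protocol; `ToyJobServer` and its method names are hypothetical and only mirror the division of labor described above.

```python
from collections import deque


class ToyJobServer:
    """In-process stand-in for the daemon: queues jobs and hands them to workers."""

    def __init__(self):
        self.queues = {}   # function name -> queued workloads
        self.workers = {}  # function name -> callable registered by a worker

    def register(self, function_name, handler):
        # A worker announces which function it can perform.
        self.workers[function_name] = handler
        self.queues.setdefault(function_name, deque())

    def submit(self, function_name, workload):
        # A client asks for work to be done; the server only queues it.
        self.queues.setdefault(function_name, deque()).append(workload)

    def run_pending(self, function_name):
        # Hand each queued workload to the registered worker and collect results.
        handler = self.workers[function_name]
        queue = self.queues[function_name]
        results = []
        while queue:
            results.append(handler(queue.popleft()))
        return results


server = ToyJobServer()
server.register("reverse", lambda workload: workload[::-1])  # the worker
server.submit("reverse", "hello")                            # the client
print(server.run_pending("reverse"))
```

In a real deployment the three pieces are separate processes, often on separate machines, and the daemon is the only one the other two ever talk to.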
13. Installation
• Head to gearman.org
• Click Download
• Click on the LaunchPad download
• Download the source tarball
• Unpack the tarball
• ./configure && make && make install
• Bam! You’re off!
For more advanced configuration see ./configure --help
• Starting
gearmand -d
16. PHP – Zend Framework
• So, you know… we all like to talk about ourselves…
Yes, I wrote a layer on top of Zend Framework called
Zend_Gearman; wow unique.
https://github.com/mwillbanks/Zend_Gearman
18. Persistence
• Gearman by default is an in-memory queue
Leaving this as the default is ideal; however, it does not work in all
environments.
• Persistent Queues
Libdrizzle
Libsqlite3
Libmemcached
Postgres
TokyoCabinet
MySQL
Redis
19. Getting Up and Running with Persistence
• Persistent queues require specific configuration during the
compilation of gearman.
• Additionally, arguments to the gearman daemon need to be
passed to talk to the specific persistence layer.
• Each persistence layer is actually built as a plugin to
gearmand
http://bazaar.launchpad.net/~tangent-org/gearmand/trunk/files/head:/libgearman-server/plugins/queue/
21. Clients
• Clients send work to the gearmand server
This is called the workload; it can be anything that can become a
string.
Utilize an open format; it will make life easier if you choose to use
a different language for processing
• XML, JSON, etc.
• Yes, you can serialize objects if you want to, though it is not
recommended.
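A minimal sketch of an open-format workload, assuming JSON. The helper names and the payload fields (`image_id`, `sizes`) are hypothetical; the point is only that the workload is a plain string any language can parse.

```python
import json


def encode_workload(payload):
    """Serialize a job payload to a JSON string for the client to send."""
    return json.dumps(payload, sort_keys=True)


def decode_workload(workload):
    """Inverse of encode_workload, as a worker would call it on receipt."""
    return json.loads(workload)


workload = encode_workload({"image_id": 42, "sizes": [64, 128]})
print(workload)
print(decode_workload(workload)["sizes"])
```

A worker written in another language only needs a JSON parser to read the same workload, which is the advantage over language-specific object serialization.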
22. Workers
• Workers are the dudes in the factory doing all the work
• Generally they will run as a daemon in the background
• Workers register a function that they perform
They should ONLY be doing a single task.
This makes them far easier to manage.
• The worker does the work and “can” return results
If you are doing the work asynchronously, you generally do not
return the result.
For synchronous work, you return the result.
23. Workers – special notes
• Utilizing the Database
If you keep a database connection open:
• The worker must be able to reconnect to the database.
• Watch for connection timeouts.
• Handling Memory Leaks
Watch the worker's memory usage; if you detect a leak, kill the
worker.
• Request-Based Languages
PHP, for instance, sometimes slows down after hundreds of
executions; kill the worker off if you know this will happen.
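One way to act on the memory and execution-count notes is to have the worker retire itself and let a supervisor start a fresh one. This is a sketch under stated assumptions: the limits (500 jobs, 256 MiB) are arbitrary example values, the function names are hypothetical, and the `resource` module is Unix-only.

```python
import resource
import sys


def peak_rss_mb():
    """Peak resident memory of this process in MiB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def should_retire(jobs_done, rss_mb, max_jobs=500, max_rss_mb=256):
    """Return True once the worker has hit either limit (both are example values)."""
    return jobs_done >= max_jobs or rss_mb >= max_rss_mb


def run_worker(jobs, handler, max_jobs=500, max_rss_mb=256):
    """Process jobs until they run out or a limit is reached; return the count done."""
    done = 0
    for job in jobs:
        handler(job)
        done += 1
        if should_retire(done, peak_rss_mb(), max_jobs, max_rss_mb):
            break  # exit cleanly; a process supervisor starts a fresh worker
    return done
```

Checking the limits only between jobs means a job is never abandoned halfway through.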
24. Keeping the Daemon Running
• Workers sometimes have issues and die, or you need to boot
them back up after a restart
Utilizing a service to watch your workers and ensure they are
always running is a GOOD thing.
• Supervisord
Can watch processes, restart them if they die or get killed
Can manage multiple processes of the same program
Can start and stop your workers.
• When running workers, BE SURE to handle termination signals such
as SIGTERM (SIGKILL cannot be caught).
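A minimal sketch of graceful shutdown, assuming the supervisor stops workers with SIGTERM (supervisord's default stop signal). SIGKILL itself can never be caught by a process, so the handler covers SIGTERM and SIGINT. `GracefulWorker` is a hypothetical name, and the job source is a plain iterable rather than a real Gearman connection.

```python
import signal


class GracefulWorker:
    """Finishes the job in progress, then stops once SIGTERM or SIGINT arrives."""

    def __init__(self):
        self.stop_requested = False
        # Must be called from the main thread; Python only allows that.
        signal.signal(signal.SIGTERM, self._request_stop)
        signal.signal(signal.SIGINT, self._request_stop)

    def _request_stop(self, signum, frame):
        # Only set a flag here; the loop below decides when it is safe to exit.
        self.stop_requested = True

    def run(self, jobs, handler):
        """Process jobs until none remain or a stop was requested; return count done."""
        done = 0
        for job in jobs:
            if self.stop_requested:
                break
            handler(job)
            done += 1
        return done
```

Because the flag is checked between jobs, a stop request never interrupts a job halfway through.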
26. Monitoring
• Until recently you were writing something against the
gearman socket interface…
telnet on port 4730
Write “STATUS”
• Gives you the registered functions, number of workers and items in the
queue.
• Gearman Monitor – PHP Project
NOTE: I’ve never actually attempted this; BUT it is referenced on
gearman.org so it must be doing something!
https://github.com/yugene/Gearman-Monitor
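The telnet approach can be scripted. The sketch below assumes the text admin protocol replies to `status` with one tab-separated line per function (name, jobs in queue, jobs running, available workers) and ends the reply with a line containing a single dot. The host, port, and dictionary keys are illustrative choices.

```python
import socket


def parse_status(text):
    """Parse an admin 'status' reply into {function: counts}."""
    stats = {}
    for line in text.splitlines():
        if line == ".":  # a lone dot terminates the reply
            break
        if not line.strip():
            continue
        name, total, running, workers = line.split("\t")
        stats[name] = {
            "queued": int(total),
            "running": int(running),
            "workers": int(workers),
        }
    return stats


def fetch_status(host="127.0.0.1", port=4730, timeout=5.0):
    """Send 'status' to a gearmand admin socket and return the parsed reply."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"status\n")
        buffer = b""
        while not (buffer == b".\n" or buffer.endswith(b"\n.\n")):
            chunk = conn.recv(4096)
            if not chunk:
                break
            buffer += chunk
    return parse_status(buffer.decode("utf-8", errors="replace"))
```

With a gearmand instance listening locally, `fetch_status()` returns a dictionary keyed by function name, which is easy to feed into whatever alerting you already run.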
28. Images
• If you resize images on your web server:
Web servers should serve, not process images.
Images require a lot of memory AND processing power
• They are best processed on their own!
• Processing in the Background
Generally will require a change to your workflow and checking the
status with XHR to see if the job has been completed.
• This allows you to process them as you have resources available.
• Have enough workers to process them “quickly enough”
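The workflow change can be sketched as a small status tracker: the upload request gets a job handle back immediately, and the XHR endpoint polls that handle. `JobTracker` and the state names are hypothetical. This version keeps state in memory, so a real setup would need a store shared between the web server and the workers.

```python
import uuid


class JobTracker:
    """Tracks background jobs so a polling endpoint can report progress."""

    def __init__(self):
        self._jobs = {}

    def submit(self, workload):
        # Called by the upload request: record the job and return at once.
        handle = uuid.uuid4().hex
        self._jobs[handle] = {"state": "queued", "workload": workload, "result": None}
        return handle

    def complete(self, handle, result):
        # Called when the worker finishes resizing.
        self._jobs[handle].update(state="done", result=result)

    def status(self, handle):
        # Called by the XHR endpoint; unknown handles are reported as such.
        job = self._jobs.get(handle)
        if job is None:
            return {"state": "unknown", "result": None}
        return {"state": job["state"], "result": job["result"]}


tracker = JobTracker()
handle = tracker.submit('{"image_id": 42}')
print(tracker.status(handle)["state"])
tracker.complete(handle, "thumb_42.jpg")
print(tracker.status(handle))
```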
30. Email
• Sending email and/or generating templates and processing
variables can take up time, time that is better spent getting
the user to the next page.
• The feedback on the mail doesn’t really make a difference
so it is great to send it to the background.
32. Log Analysis / Aggregation
• Get all of your logs to a single place
• Process the logs to produce analytical data
• Impression / Click Tracking
• Why run a cron over your logs nightly?
Real-time data is where it is at!
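As a rough illustration of the real-time alternative to a nightly cron, a worker can fold each event into running totals as it arrives. The event fields (`type`, `ad_id`) and the function name are hypothetical, and the counter lives in memory only.

```python
import json
from collections import Counter


def track_event(workload, counts):
    """Worker function: fold one JSON event into the running totals."""
    event = json.loads(workload)
    key = (event["type"], event["ad_id"])
    counts[key] += 1
    return counts[key]


counts = Counter()
for line in (
    '{"type": "impression", "ad_id": 7}',
    '{"type": "impression", "ad_id": 7}',
    '{"type": "click", "ad_id": 7}',
):
    track_event(line, counts)
print(counts.most_common())
```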