This document discusses Celery, a distributed, asynchronous task queue based on message passing. It gives example use cases such as sending thousands of e-mails or running computation-heavy queries, outlines benefits such as making tasks distributed and configuring queues, routing, and task utilities, and covers pitfalls encountered with Celery and common complaints about it.
2. About me
Learning Python since 2004
Former Chief Technical Director at Softstar.
Now an Architect at Trend Micro
Light Celery user for several months.
3. Celery is X
Distributed
Asynchronous
Task Queue
4. Have you run into these?
Delivering thousands to millions of e-mails to your members?
Many computation-heavy queries at the same time that block your web site or DB?
Slow queries or too many connections to an external API (like Twitter/Facebook)?
Long-running tasks that occupy your web server resources?
6. Why Celery?
Write the POC without Celery.
Add “@celery.task” and the functions become distributed tasks.
Multiple queues and routing, all configurable.
Many utilities to help task execution:
Group/Chord/Chain/Callback
Retry/timeout
Etc.
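7.
from celery import Celery

celery = Celery("tasks")  # app name is a placeholder; the slide omits this line
# Twitter is the speaker's client class; its import is not shown on the slide.

@celery.task(
    max_retries=3,
    default_retry_delay=2 * 60,  # retry in 2 minutes
    rate_limit=20,               # at most 20 tasks per second per worker
    time_limit=10,               # hard time limit, in seconds
    soft_time_limit=5,           # soft time limit, in seconds
    ignore_result=True,
    acks_late=True,  # acknowledge after execution, to make sure this task gets done
)
def send_twitter_status(oauth, tweet):
    try:
        twitter = Twitter(oauth)
        twitter.update_status(tweet)
    except (Twitter.FailWhaleError, Twitter.LoginError) as exc:
        # Re-queue the task; gives up after max_retries attempts.
        raise send_twitter_status.retry(exc=exc)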
8. Understand the personality of your Celery
What happens if the program throws an exception?
What happens if the program hits the hard/soft time limit?
What happens if the hard/soft time limit is reached in your group/chord function?
Many others.
9. Our pitfalls/trouble cases
Many, far too many, small tasks
For example, 1 billion per day.
Large task arguments (workaround => store the payload in a cache)
Sending a file directly
Better not to wait for another task inside a task.
Using Redis as the broker instead of AMQP, with no HA.
Hard time limit reached in a chord function.
Multi-process and multi-thread problems on Windows
Logging.
Queue management
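Two of the workarounds above (batching many small tasks, and passing a cache key instead of a large argument) can be sketched in plain Python. This is an illustration under assumptions, not the speaker's implementation: a dict stands in for the external cache, and `chunked`, `stash_payload`, and `process_by_key` are hypothetical names. In a real deployment `process_by_key` would be the function decorated with `@celery.task`.

```python
import uuid
from itertools import islice

CACHE = {}  # stand-in for an external cache shared by producers and workers


def chunked(items, size):
    """Yield lists of up to `size` items so one task handles a whole batch."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def stash_payload(payload):
    """Store a large payload in the cache; return a short key for the message."""
    key = uuid.uuid4().hex
    CACHE[key] = payload
    return key


def process_by_key(key):
    """Worker side: fetch the payload by key, then drop it from the cache."""
    payload = CACHE.pop(key)
    return len(payload)


# Producer side: 10 items become 3 task messages instead of 10,
# and each message carries only a short key rather than the data itself.
keys = [stash_payload(batch) for batch in chunked(range(10), 4)]
sizes = [process_by_key(key) for key in keys]
```

The trade-off is that the cache becomes a second dependency: if an entry expires or is evicted before the worker picks up the task, the task has nothing to process.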
10. Some complaints about Celery on the internet
Python only
API changes that break existing programs
2.5.x => 3.0.x
Pylint