High scale flavour

(recipes for message queueing)

Tomas (t0m) Doran
London Perl
Workshop 2010

What is message
queueing?
• In its simplest form, it’s just a list.
• 1 (or more) ‘producers’ (writers)
• 1 (or more) ‘consumers’ (readers)
• A queue builds up if rate of production > rate of
consumption
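
A minimal sketch of the bullets above, written in Python purely as a runnable illustration (the deck itself is about Perl tooling). It models the queue as a plain list with one writer and one reader; the function name and the rates are made up for the example.

```python
from collections import deque


def simulate(ticks, produced_per_tick, consumed_per_tick):
    """Return the backlog length after each tick of a toy producer/consumer."""
    queue = deque()
    backlog = []
    next_id = 0
    for _ in range(ticks):
        # Producer appends new messages to the tail of the list.
        for _ in range(produced_per_tick):
            queue.append(next_id)
            next_id += 1
        # Consumer removes from the head, oldest message first.
        for _ in range(min(consumed_per_tick, len(queue))):
            queue.popleft()
        backlog.append(len(queue))
    return backlog


# Producing 3 per tick while consuming 2 per tick: the backlog grows by one each tick.
print(simulate(ticks=5, produced_per_tick=3, consumed_per_tick=2))  # [1, 2, 3, 4, 5]
```

When the consumer keeps up (or is faster), the backlog stays at zero and the list is just a hand-off point.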

Why do I want it?
• Decouples producers and consumers
(probably across the network).
• Lets you manage load, and spikes in load
(only n consumers).
• In a web environment - lets you serve
more pages, quicker.

What is the problem
with web apps?
• App servers take a lot of RAM
• Context switching expensive
• A lot of apps think like a CGI.
• Making the user wait IS HORRIBLE.
• Anything but very fast pages is fail.

One page request per CPU core
Even if extra context switching has zero overhead
you serve people sooner if you queue requests.

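[Diagram: two timelines on one CPU core. Upper: requests A and B interleaved in alternating time slices (A B A B A B A B). Lower: A runs to completion, then B (A B).]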

A finishes significantly before B in the lower diagram
B finishes at the same time in both
N.B. This explicitly assumes you never wait on external IO
(e.g. a database)
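
The claim can be checked with a little arithmetic. The sketch below is an illustration only, in Python, assuming two CPU-bound requests of 4 time units each, a single core, zero context-switch cost and no external IO (the same assumptions as the slide). The function names and the job costs are invented for the example.

```python
def interleaved(jobs, slice_len=1):
    """Round-robin time slicing on one core; returns the finish time of each job."""
    remaining = dict(jobs)
    finish = {}
    clock = 0
    while remaining:
        for name in list(remaining):
            run = min(slice_len, remaining[name])
            clock += run
            remaining[name] -= run
            if remaining[name] == 0:
                finish[name] = clock
                del remaining[name]
    return finish


def queued(jobs):
    """Run each job to completion in arrival order; returns the finish times."""
    finish = {}
    clock = 0
    for name, cost in jobs.items():
        clock += cost
        finish[name] = clock
    return finish


jobs = {"A": 4, "B": 4}
print(interleaved(jobs))  # {'A': 7, 'B': 8}
print(queued(jobs))       # {'A': 4, 'B': 8}
```

B finishes at time 8 either way, but A finishes at 4 instead of 7 when the requests are queued rather than interleaved.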

What are the solutions?
• Use a PAL (Page Assembly Layer) - e.g.
Varnish
• Defer doing work inside the web request
to a queue
• AJAX (heavy javascript) apps making many
requests make this even more important.
• If your requests are all small and fast then
you can win by doing multiple small
requests rather than one expensive one

Message queueing

• Many, many flavors.
• Going to cover options available right
now in Perl
• First, a little theory

Messaging Topologies
• I.e. how producers and consumers interact
together through the message broker
• 3 common patterns - considerably more
complex applications possible.
• Even within these there is additional
complexity to consider, e.g. message
durability.

1: Publish-Subscribe
• ‘Topic’ in ActiveMQ
• Anonymous queues in AMQP
• One (or more) publishers
• Zero (or more) consumers
• Every consumer gets every message
• Messages discarded if no consumers
• E.g. log message listener(s)
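
A minimal in-memory sketch of these semantics, in Python for illustration only. The `Topic` class is a toy stand-in, not the API of ActiveMQ or any AMQP client: every subscriber receives every message, and nothing is stored when nobody is subscribed.

```python
class Topic:
    """Toy publish-subscribe channel with no storage."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, message):
        # With zero subscribers the loop does nothing: the message is discarded.
        for callback in self.subscribers:
            callback(message)
        return len(self.subscribers)


log_topic = Topic()
delivered_to = log_topic.publish("nobody hears this")  # 0 receivers, message lost

console, archive = [], []  # two hypothetical log listeners
log_topic.subscribe(console.append)
log_topic.subscribe(archive.append)
log_topic.publish("disk nearly full")

print(delivered_to, console, archive)
# 0 ['disk nearly full'] ['disk nearly full']
```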

2: Queue(s)
• One or more producers
• One or more consumers
• Each message delivered to exactly one
consumer
• Messages ‘queued’ (possibly to disk)
• E.g. Job queue with worker pool allowing
you to work efficiently through high load
spikes
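
By contrast with publish-subscribe, a queue stores messages and hands each one to exactly one consumer. The following Python sketch is a toy model under that assumption; `WorkQueue`, `drain` and the worker names are hypothetical, and a real broker would also handle acknowledgements and persistence.

```python
from collections import deque


class WorkQueue:
    """Toy job queue: messages wait here until a worker takes them."""

    def __init__(self):
        self.pending = deque()

    def put(self, job):
        self.pending.append(job)

    def take(self):
        return self.pending.popleft() if self.pending else None


def drain(queue, worker_names):
    """Workers take turns pulling jobs; each job goes to exactly one worker."""
    handled = {name: [] for name in worker_names}
    turn = 0
    while (job := queue.take()) is not None:
        handled[worker_names[turn % len(worker_names)]].append(job)
        turn += 1
    return handled


jobs = WorkQueue()
for n in range(5):  # a spike of five jobs arrives before any worker is ready
    jobs.put(f"job-{n}")

print(drain(jobs, ["w1", "w2"]))
# {'w1': ['job-0', 'job-2', 'job-4'], 'w2': ['job-1', 'job-3']}
```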

3: Request / Response

• Create anonymous queue for replies
• Publish to well known queue(s), include
return address
• Wait for reply.
• Arrange for messages to be discarded if
you stop listening for the reply.
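
The steps above can be sketched with a dictionary of queues standing in for the broker. This is an illustration in Python only: the queue name `thumbnailer`, the `reply-` prefix and all function names are invented, and deleting the reply queue stands in for "arrange for messages to be discarded".

```python
import uuid
from collections import defaultdict, deque

queues = defaultdict(deque)  # stand-in for a broker: queue name -> messages


def send_request(service_queue, body):
    """Publish to a well-known queue, including a private return address."""
    reply_to = "reply-" + uuid.uuid4().hex  # anonymous queue for the reply
    queues[service_queue].append({"body": body, "reply_to": reply_to})
    return reply_to


def serve_one(service_queue, handler):
    """Service side: handle one request and answer on its return address."""
    request = queues[service_queue].popleft()
    queues[request["reply_to"]].append({"body": handler(request["body"])})


def receive_reply(reply_to):
    """Client side: read the reply, then drop the reply queue."""
    reply = queues[reply_to].popleft()
    del queues[reply_to]
    return reply["body"]


reply_queue = send_request("thumbnailer", "cat.png")
serve_one("thumbnailer", lambda name: "thumb-" + name)
result = receive_reply(reply_queue)
print(result)  # thumb-cat.png
```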

STOMP
• Streaming Text Oriented Messaging
Protocol
• Simple. Interoperable.
• You probably want to use ActiveMQ
• Net::Stomp
• Simple semantics: Queues and Topics
• You can build the 3 simple patterns from
these
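
To show how simple the protocol is on the wire: a STOMP frame is a command line, `key:value` header lines, a blank line, the body, and a terminating NUL byte. The Python sketch below builds and parses such a frame by hand, for illustration only. It is not a client library, and the destination `/queue/jobs` is a made-up name following ActiveMQ's `/queue/` and `/topic/` prefix convention.

```python
def build_frame(command, headers, body=""):
    """Serialise one STOMP frame: command, headers, blank line, body, NUL."""
    lines = [command] + [f"{key}:{value}" for key, value in headers.items()]
    return ("\n".join(lines) + "\n\n" + body + "\x00").encode("utf-8")


def parse_frame(raw):
    """Inverse of build_frame for a single complete frame."""
    text = raw.decode("utf-8").rstrip("\x00")
    head, _, body = text.partition("\n\n")
    command, *header_lines = head.split("\n")
    headers = dict(line.split(":", 1) for line in header_lines)
    return command, headers, body


frame = build_frame("SEND", {"destination": "/queue/jobs"}, '{"job": "resize"}')
print(frame)
# b'SEND\ndestination:/queue/jobs\n\n{"job": "resize"}\x00'
print(parse_frame(frame))
# ('SEND', {'destination': '/queue/jobs'}, '{"job": "resize"}')
```

Swapping the destination for a `/topic/...` name is all it takes to move from queue semantics to topic semantics.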

ActiveMQ - Topics
• Publish / subscribe semantics.
• Message goes to all the subscribers
• Zero or more subscribers
• Messages thrown away if no subscribers
• Safe (cannot fill up server with undelivered
messages)

ActiveMQ - Queues
• Load balancer semantics.
• Each message - 1 consumer.
• Messages queued.
• Multiple consumers
• Ack required (auto-ack possible)
• Danger, Will Robinson! (Unlike topics, undelivered
messages can fill up the server.)
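
A toy model of why acks matter, in Python for illustration. `AckQueue` is a hypothetical in-memory class, not ActiveMQ's implementation: a delivered message stays on the books until it is acknowledged, and goes back on the queue if the consumer disappears first.

```python
from collections import deque


class AckQueue:
    """Toy queue where delivered messages must be acknowledged."""

    def __init__(self):
        self.ready = deque()
        self.unacked = {}
        self.next_tag = 0

    def put(self, message):
        self.ready.append(message)

    def deliver(self):
        message = self.ready.popleft()
        self.next_tag += 1
        self.unacked[self.next_tag] = message
        return self.next_tag, message

    def ack(self, tag):
        del self.unacked[tag]

    def consumer_died(self):
        # Anything delivered but never acked returns to the head of the
        # queue, oldest first, ready for redelivery.
        for tag in sorted(self.unacked, reverse=True):
            self.ready.appendleft(self.unacked.pop(tag))


q = AckQueue()
q.put("a")
q.put("b")
tag, _ = q.deliver()
q.ack(tag)          # "a" is done for good
q.deliver()         # "b" is handed out, but the worker crashes before acking
q.consumer_died()
print(list(q.ready))  # ['b']
```

With auto-ack, the broker behaves as if `ack` were called at delivery time, so a crash mid-job loses the message.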

AMQP
• More complex than STOMP
• Wiring of message routing is part of the
protocol.
• All your clients know (at least half of) the
wiring.
• Different topologies depending on routing
configuration.
• Nice when your server dies - no ‘current
config’ to restore, as clients declare the wiring

AMQP Concepts

[Diagram: within a RabbitMQ vhost, a Publisher sends messages to an Exchange, which routes them to a Queue, from which a Consumer reads.]

Concepts - Exchanges

• Named
• Messages are published (sent) to one
exchange
• Can be durable (forces all queues attached
to be durable)

Queues
• Queues can be named.
• Queues are bound to one (or more)
exchanges.
• Queues can have 0 or more clients
• Queues may persist or be deleted when
they have no clients
• FIFO (if you have 1 consumer)
• Message never delivered to > 1 client

Bindings
• Binding is what joins a queue and an
exchange.
• There can be more than one binding for
each queue, allowing a single consumer to
listen to multiple message sources
• You can bind to topic exchanges selectively
via the message ‘routing key’
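
For topic exchanges, AMQP binding patterns are dot-separated words where `*` matches exactly one word and `#` matches zero or more words. The Python function below is a small illustrative matcher for that rule, not code from any broker or client, and the routing keys are invented.

```python
def topic_matches(pattern, routing_key):
    """Return True if an AMQP-style topic binding pattern matches a routing key."""

    def match(p, k):
        if not p:
            return not k
        if p[0] == "#":
            # '#' may swallow zero or more words.
            return any(match(p[1:], k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        # '*' matches any single word; otherwise the words must be equal.
        return p[0] in ("*", k[0]) and match(p[1:], k[1:])

    return match(pattern.split("."), routing_key.split("."))


print(topic_matches("logs.*.error", "logs.web.error"))     # True
print(topic_matches("logs.*.error", "logs.web.db.error"))  # False
print(topic_matches("logs.#", "logs.web.db.error"))        # True
```

A queue bound with `logs.#` sees every log message, while one bound with `logs.*.error` sees only errors from a single-word source.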

Job queue
• Named exchange
• Bound to 1 named and persistent queue
• 0 or more listeners get round-robin
messages
• Messages queue when nobody listens / if
consumers are slow

Publish/Subscribe
• Named exchange
• Each client creates an anonymous
ephemeral queue
• Client binds the queue to that exchange
• All clients get all messages
• Messages go to /dev/null if no clients
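
The last two slides use the same building blocks wired differently. A toy Python model (illustration only: the `Exchange` class, queue names and helper functions are all hypothetical, and the exchange here simply copies to every bound queue):

```python
import uuid
from collections import deque


class Exchange:
    """Toy exchange that copies each message to every bound queue."""

    def __init__(self, name):
        self.name = name
        self.bound = []

    def bind(self, queue):
        self.bound.append(queue)

    def publish(self, message):
        for queue in self.bound:
            queue.append(message)


def named_queue(registry, name):
    return registry.setdefault(name, deque())


def anonymous_queue(registry):
    return registry.setdefault("anon-" + uuid.uuid4().hex, deque())


queues = {}

# Job queue: one named queue, shared by two workers taking turns.
jobs = Exchange("jobs")
jobs.bind(named_queue(queues, "jobs.pending"))
for n in range(4):
    jobs.publish(n)
pending = queues["jobs.pending"]
worker_a, worker_b = [], []
while pending:
    worker_a.append(pending.popleft())
    if pending:
        worker_b.append(pending.popleft())

# Publish/subscribe: each client binds its own anonymous queue.
events = Exchange("events")
events.publish("lost")  # no queue bound yet, so nothing stores this
client_1, client_2 = anonymous_queue(queues), anonymous_queue(queues)
events.bind(client_1)
events.bind(client_2)
events.publish("deployed")

print(worker_a, worker_b)              # [0, 2] [1, 3]
print(list(client_1), list(client_2))  # ['deployed'] ['deployed']
```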

AMQP can be complex
• Different exchange types - direct, topic,
fanout (and custom exchange types
possible)
• Messages have a routing key allowing
selective binding.
• You can do a lot using these and a mix of
named and anonymous queues
• Much more complex topologies possible

Implementations:

• So - I want a queue
• What do I use?
• Naive approaches
• Job queue only approaches
• More sophisticated/custom approaches

(Shared) database table
• Have a ‘jobs’ table, with some data, and a
‘status’ column.
• Waiting => Running => Done
• Job workers poll the table and change
statuses.
• NO NO NO NO NO NO NO NO
• No, really, mst will come and break your
legs if you do this (after he stops laughing)

• MySQL will get the query plan wrong if
you try joining this table (hint: the
cardinality on your status column is 3).
• You will lose super-hard. HAND.
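
For clarity about what is being warned against, here is a sketch of the polling pattern, in Python with an in-memory SQLite database so that it runs standalone (the slide is about MySQL; the table layout and function name are assumptions). Every idle worker would repeat this SELECT on a timer, which is where the load and the query-plan trouble come from.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, payload TEXT, status TEXT)")
db.executemany(
    "INSERT INTO jobs (payload, status) VALUES (?, 'waiting')",
    [("resize cat.png",), ("send email",)],
)


def claim_next_job(conn):
    """One poll: find a waiting job and try to flip it to running."""
    row = conn.execute(
        "SELECT id, payload FROM jobs WHERE status = 'waiting' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    job_id, payload = row
    # The status check in the WHERE clause guards against another worker
    # having claimed the same row between our SELECT and our UPDATE.
    claimed = conn.execute(
        "UPDATE jobs SET status = 'running' WHERE id = ? AND status = 'waiting'",
        (job_id,),
    ).rowcount
    conn.commit()
    return (job_id, payload) if claimed == 1 else None


print(claim_next_job(db))  # (1, 'resize cat.png')
print(claim_next_job(db))  # (2, 'send email')
print(claim_next_job(db))  # None
```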

(Queue) database table

• Have separate queued / running / done tables
• Less bad for performance - at least the ‘find
something to do’ query is very cheap.
• Still pretty terrible.
• You still re-invented a big old wheel here,
probably badly.

Gearman
• Around since 2006. Multi platform.
• Not really a message queue - designed as a
job queuing system
• Client, Job Server, Worker
• Failover (multiple job servers)
• NOT persistent
• Simple, works well (if that’s all you need)

TheSchwartz
• Is Persistent
• From the same place as Gearman
• Not as well adopted
• Relies on a MySQL database - a single point of failure (SPOF)
• Still simple - maybe the easiest way to get
started (if you need reliable)?

Client libs: Net::Stomp
• Apache ActiveMQ.
• Dead simple producers and consumers.
• Just a client - you need to manage / run
your own jobs.
• Blocking.
• Works perfectly well for sending messages
from a web app.

Client libs:
Net::RabbitFoot
• AMQP / RabbitMQ
• AnyEvent based - non blocking.
• Documentation pretty poor (sorry).
• Works well if you have an async app.
• Can be used inside a web app for simple
sending.

Catalyst::Engine::Stomp
• By chrisa @ Venda & yours truly; now
maintained by Paul Mooney.
• Simple framework for writing jobs / managing
workers.
• Allows you to fire and forget, or fire and wait
for termination (and pass a message back)
• Achieves many of the same things as Gearman
• Used in anger by several companies.
(http://miltonkeynes.pm.org/talks/2010/06/paul_mooney_stomp_moosex_workers.pdf)

Net::ActiveMQ

• Builds on Catalyst::Engine::Stomp
• Chisel talked about this at YAPC::EU (Friday
PM ‘Going Postal’)
• Net-A-Porter people - poke him to release it
(or at least put it on github)!

Web::Hippie
• Persistent (potentially bidirectional) web
pipe to applications.
• Cheap connection, no polling needed (on
reasonably modern browsers). Great as the
listener part for ‘replies’
• Downsides - needs to be async - no DBI
(kinda)!
• Plays very nicely with RabbitMQ (hint:
MooseX::Storage & Joose.Storage <3)

CatalystX::JobServer
• My current baby.
• Uses AMQP.
• Provides Web::Hippie pipes for jobs - shiny
shiny ajax updates.
• Barely production ready (but usable).
• I’m talking about it later...

Conclusions
• Use JSON for your message payloads.
• You probably want to use Gearman if you
can get away with it.
• STOMP works well and is a simple, tried and
tested jobs solution.
• AMQP is nicer and more flexible, but there
are fewer proven solutions (in Perl).
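
On the first conclusion: JSON keeps the payload readable by producers and consumers written in any language. A minimal Python sketch of an envelope, for illustration only; the `type` and `args` field names are an assumed convention, not something the talk prescribes.

```python
import json


def encode_message(job_type, args):
    """Serialise a job description to bytes suitable for a message body."""
    return json.dumps({"type": job_type, "args": args}, sort_keys=True).encode("utf-8")


def decode_message(raw):
    """Inverse of encode_message."""
    message = json.loads(raw.decode("utf-8"))
    return message["type"], message["args"]


body = encode_message("resize", {"file": "cat.png", "width": 64})
print(body)
# b'{"args": {"file": "cat.png", "width": 64}, "type": "resize"}'
print(decode_message(body))
# ('resize', {'file': 'cat.png', 'width': 64})
```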
