Towards Improved Data Dissemination of Publish-Subscribe Systems
1. Towards Improved Data Dissemination of Publish-Subscribe Systems
Srinath Perera, Ramith Jayasinghe, Dinesh Gamage
Lanka Software Foundation
2. Outline
● Outline of Pub/sub Paradigm
● 3 challenges
● Avoiding Blocking IO
● Avoiding message accumulation through parallel message delivery
● Working around slow and unreliable consumers/publishers
● OGCE workflow suite
● Conclusions
3. Publish/Subscribe Paradigm and Message Broker
● Many event sources generate events
● Subscribers register their interest through
subscriptions
● The broker matches events against subscriptions and delivers
them to the interested subscribers
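The three roles above can be sketched in a few lines of Java. This is a minimal illustration only, not the WS-Messenger implementation: the class name `TopicBroker`, its methods, and the exact-match topic rule are all assumptions made for the example.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical minimal broker: matches events to subscriptions by exact topic name.
class TopicBroker {
    // topic -> callbacks of the subscribers interested in that topic
    private final Map<String, List<Consumer<String>>> subscriptions = new ConcurrentHashMap<>();

    // A subscriber registers its interest in a topic.
    void subscribe(String topic, Consumer<String> subscriber) {
        subscriptions.computeIfAbsent(topic, t -> new CopyOnWriteArrayList<>()).add(subscriber);
    }

    // An event source publishes an event; the broker delivers it to every
    // matching subscriber and returns the number of deliveries.
    int publish(String topic, String event) {
        List<Consumer<String>> matched =
                subscriptions.getOrDefault(topic, Collections.emptyList());
        for (Consumer<String> subscriber : matched) {
            subscriber.accept(event);
        }
        return matched.size();
    }
}
```

Delivery here happens on the publisher's thread, one subscriber after another, which is exactly the serial behaviour that the later challenges try to improve on.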
5. Goals of Message Broker
● High throughput
● Preserving message order for messages
generated from the same event source
● Reducing publish-to-delivery time
7. Goals of this Paper
● Three architectural challenges
● Blocking IO
● Parallel Message Delivery
● Unreliable Consumers
● We explore architectural options for addressing
these challenges
● We have improved OGCE Messenger based
on our observations and used it as the test bed
for this study.
8. Challenge 1: Blocking IO
● Blocking IO assigns each request to a thread, so
the number of parallel clients is limited by the number of
threads.
● A potential alternative is non-blocking IO, which uses an
event-based model and minimizes thread blocking
due to IO.
● A message broker has an IO-dominated workload, so
we believed a non-blocking approach could
provide major improvements.
● We took advantage of Axis2's pluggable
transport architecture and set up the broker with the NIO
transport from Apache.
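The event-based model can be illustrated with the standard `java.nio` Selector API, where one thread waits on many channels instead of dedicating a thread to each. The sketch below is not the Axis2 NIO transport; the class `SelectorSketch`, the method `multiplex`, and the use of in-process pipes in place of network connections are assumptions chosen to keep the example self-contained. It also assumes short messages (under 1024 bytes).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo: a single thread serves many channels through one Selector.
class SelectorSketch {
    // Writes each message into its own pipe, then collects all of them with
    // one thread. The order of the returned messages is not guaranteed.
    static List<String> multiplex(List<String> messages) {
        List<String> received = new ArrayList<>();
        try (Selector selector = Selector.open()) {
            int open = 0;
            for (String message : messages) {
                Pipe pipe = Pipe.open();
                pipe.source().configureBlocking(false); // required before register()
                pipe.source().register(selector, SelectionKey.OP_READ);
                pipe.sink().write(ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8)));
                pipe.sink().close(); // reader will see the data, then end-of-stream
                open++;
            }
            ByteBuffer buffer = ByteBuffer.allocate(1024);
            while (open > 0) {
                selector.select(); // the only place this thread blocks
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    Pipe.SourceChannel source = (Pipe.SourceChannel) key.channel();
                    buffer.clear();
                    int n = source.read(buffer);
                    if (n < 0) {            // end-of-stream: this channel is done
                        source.close();
                        open--;
                    } else if (n > 0) {
                        received.add(new String(buffer.array(), 0, n, StandardCharsets.UTF_8));
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return received;
    }
}
```

With blocking IO the same workload would need one reader thread per channel; here the thread count stays at one regardless of how many channels are registered.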
9. Experimental setup
● Loaded the system with XML messages over a
constant set of subscriptions
● Loaded the system without loading the
network.
● Statistics were calculated periodically (e.g. every 2
seconds)
● 10 topics, 1000 messages per topic, 200
consumers
10. NIO vs. Blocking IO
● NIO transport increases the throughput of the
system.
● NIO is able to handle more concurrent
connections (publishers) with fewer resources.
11. Challenge 2: Message Accumulation
● If the message dissemination rate is lower than the message
reception rate,
● => messages accumulate => the system slows down
and eventually crashes
● Often one incoming message needs to be delivered to
multiple consumers. With a high number of consumers,
this problem is very likely.
● A single delivery thread could pose major limitations.
● But a naive parallel solution will break the order of
message delivery
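A back-of-the-envelope model makes the accumulation condition concrete. The symbols are assumptions introduced for illustration and do not come from the slides: λ is the incoming message rate, n the number of consumers each message fans out to, d the average time of one delivery, k the number of delivery threads (assuming ideal parallel speed-up), μ_k the resulting rate at which incoming messages can be fully disseminated, and Q(t) the backlog at time t.

```latex
\mu_k = \frac{k}{n\,d}, \qquad
Q(t) = Q(0) + (\lambda - \mu_k)\,t \quad \text{for } \lambda > \mu_k
```

For example, with assumed values n = 200 and d = 5 ms, a single thread (k = 1) needs 200 × 0.005 s = 1 s per incoming message, so μ_1 = 1 message/s. At λ = 10 messages/s the backlog then grows by 9 messages every second, and at least k = λ·n·d = 10 threads would be needed just to keep up.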
12. Parallel Message Delivery
● Two parallelization strategies considered
● Topic based
– Each thread is assigned a set of topics or XPath
expressions
– A thread delivers a message to consumers if it
matches a topic/XPath expression handled by that thread.
● Consumer (EPR, endpoint reference) based
– Each active consumer is assigned a standing job and
a message queue
– The job delivers the messages accumulated in its message
queue
13. Topic Based Message Delivery
● There is a queue for each topic, and messages matching
the topic are placed in that queue.
● A thread assigned to each queue picks up messages
and delivers them.
● Concerns
● If a message matches multiple subscriptions submitted
by the same consumer, it will be delivered multiple times.
● If two subscriptions for the same consumer are handled by
different threads, how can the system preserve the order of
messages?
14. Consumer (EPR) Based Parallelization
● There is a queue for each consumer.
● Messages for that consumer are placed on that queue.
● A thread assigned to the queue delivers the messages in the
queue to the consumer.
● Facts
● Since only one thread delivers messages to a consumer, order is
preserved.
● Jobs and queues are created only when messages are
available for a consumer.
● A queue eventually expires when no messages are available for it
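The per-consumer queue idea can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the WS-Messenger code: the names `ConsumerQueueDispatcher`, `StandingJob`, and `sender` are hypothetical, the job drains its queue completely once scheduled, and queue expiry is left out for brevity.

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BiConsumer;

// Hypothetical sketch: one queue and one standing job per consumer (EPR).
class ConsumerQueueDispatcher {
    private static final class StandingJob {
        final Queue<String> queue = new ConcurrentLinkedQueue<>();
        // true while some pool thread is delivering for this consumer
        final AtomicBoolean running = new AtomicBoolean(false);
    }

    private final Map<String, StandingJob> jobs = new ConcurrentHashMap<>();
    private final Executor pool;
    private final BiConsumer<String, String> sender; // (consumer EPR, message)

    ConsumerQueueDispatcher(Executor pool, BiConsumer<String, String> sender) {
        this.pool = pool;
        this.sender = sender;
    }

    // The job and its queue are created lazily, when the first message arrives.
    void enqueue(String epr, String message) {
        StandingJob job = jobs.computeIfAbsent(epr, k -> new StandingJob());
        job.queue.add(message);
        schedule(epr, job);
    }

    // At most one drain per consumer runs at a time, which preserves order.
    private void schedule(String epr, StandingJob job) {
        if (job.running.compareAndSet(false, true)) {
            pool.execute(() -> drain(epr, job));
        }
    }

    private void drain(String epr, StandingJob job) {
        try {
            String message;
            while ((message = job.queue.poll()) != null) {
                sender.accept(epr, message);
            }
        } finally {
            job.running.set(false);
        }
        // A message may have arrived between the last poll and clearing the flag.
        if (!job.queue.isEmpty()) {
            schedule(epr, job);
        }
    }
}
```

Different consumers are served in parallel by the pool, while each consumer still sees its messages in enqueue order because the `running` flag admits only one deliverer at a time.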
15. Scheduling Standing Jobs
● We use read-write locks to maximize concurrency
● We cannot assign a dedicated thread to each consumer, as that will not
scale with a large number of consumers => we use a thread pool.
● Dynamic thread pool
● A standing job drains its queue entirely and tries to deliver every message
● It does not release control as long as messages are in the queue.
● Potential starvation
● Static thread pool (size is configurable)
● Each thread iterates over the standing jobs assigned to it.
● Each thread drains a queue only partially (the amount is configurable)
● A standing job releases control after delivering the drained messages
● We optimized the system by allowing greater concurrency for message filtering
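The partial-drain behaviour of the static thread pool can be shown with a sketch of a single worker. The class `PartialDrainWorker`, the `batchSize` parameter, and the "epr:message" result format are hypothetical names chosen for this example; the real system's job assignment and locking are not shown, and the class is deliberately single-threaded.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical sketch of one worker in a static pool. Not thread-safe:
// it models only the worker's own loop over the jobs assigned to it.
class PartialDrainWorker {
    // consumer EPR -> pending messages, in the order the jobs were created
    private final Map<String, Queue<String>> jobs = new LinkedHashMap<>();
    private final int batchSize; // maximum messages taken from one queue per pass

    PartialDrainWorker(int batchSize) {
        this.batchSize = batchSize;
    }

    void enqueue(String epr, String message) {
        jobs.computeIfAbsent(epr, k -> new ArrayDeque<>()).add(message);
    }

    // One pass over all standing jobs: each job delivers at most batchSize
    // messages and then releases control to the next job.
    // Returns the deliveries, in the order performed, as "epr:message".
    List<String> runOnePass() {
        List<String> delivered = new ArrayList<>();
        for (Map.Entry<String, Queue<String>> job : jobs.entrySet()) {
            Queue<String> queue = job.getValue();
            for (int i = 0; i < batchSize && !queue.isEmpty(); i++) {
                delivered.add(job.getKey() + ":" + queue.poll());
            }
        }
        return delivered;
    }
}
```

With `batchSize = 2`, a consumer with five queued messages cannot hold the worker for more than two deliveries before a consumer with a single message gets its turn, which is how the bounded drain avoids the starvation noted for the dynamic pool.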
18. Performance: Summary
● Both parallel implementations perform better
than the serial one (throughput and round-trip time)
● The static thread pool reports better round-trip times.
● The dynamic thread pool increases throughput
19. Slow and Unreliable Consumers/Publishers
● They are distributed in a heterogeneous environment and
are unpredictable (beyond the control of the system)
● This affects the performance of the middleware
● Round-trip time
● Throughput
● The delay incurred by delivering to slow consumers
is propagated to fast consumers as well.
● E.g. a connection timeout blocks the delivery thread until
the timeout period elapses
20. Solution 1: Soft State Subscriptions
● Subscriptions expire unless the consumer renews them, which
forces consumers to renew periodically
● But the problem persists until the timeout
happens.
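A soft-state subscription table can be sketched as a map from consumer to lease expiry. The class `SoftStateSubscriptions`, the `leaseMillis` parameter, and the explicit `nowMillis` arguments (used so the behaviour is deterministic) are assumptions for illustration rather than the WS-Messenger API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: a subscription stays active only while its lease is renewed.
class SoftStateSubscriptions {
    private final Map<String, Long> expiresAtMillis = new ConcurrentHashMap<>();
    private final long leaseMillis;

    SoftStateSubscriptions(long leaseMillis) {
        this.leaseMillis = leaseMillis;
    }

    // Subscribing and renewing are the same operation: (re)start the lease.
    void renew(String epr, long nowMillis) {
        expiresAtMillis.put(epr, nowMillis + leaseMillis);
    }

    boolean isActive(String epr, long nowMillis) {
        Long expiry = expiresAtMillis.get(epr);
        return expiry != null && nowMillis < expiry;
    }

    // Drops every subscription whose lease has run out; returns how many were removed.
    int purgeExpired(long nowMillis) {
        int removed = 0;
        for (Map.Entry<String, Long> entry : expiresAtMillis.entrySet()) {
            // remove(key, value) skips entries that were renewed concurrently
            if (entry.getValue() <= nowMillis
                    && expiresAtMillis.remove(entry.getKey(), entry.getValue())) {
                removed++;
            }
        }
        return removed;
    }
}
```

The sketch also shows the limitation named above: a consumer that disappears right after renewing remains active, and keeps costing delivery attempts, until its lease runs out.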
21. Solution 2: Blacklisting Scheme
● Consumers are uniquely identified by their EPR
● If message delivery fails (e.g. times out)
repeatedly for a consumer, it is blacklisted
● The system does not try to send subsequent messages
to blacklisted consumers (for a configurable time
period)
● Facts
● Minimizes the overhead incurred by message delivery
failures
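The blacklisting rule can be sketched as a failure counter plus a block-until timestamp per EPR. The names `ConsumerBlacklist`, `maxFailures`, and `blockMillis` are hypothetical, and the sketch assumes that "repeatedly" means consecutive failures, which the slides do not specify. The two maps are not updated atomically together, so this is an illustration rather than production code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: skip consumers whose deliveries keep failing.
class ConsumerBlacklist {
    private final int maxFailures;   // consecutive failures before blacklisting (assumed rule)
    private final long blockMillis;  // how long a blacklisted consumer is skipped
    private final Map<String, Integer> failures = new ConcurrentHashMap<>();
    private final Map<String, Long> blockedUntil = new ConcurrentHashMap<>();

    ConsumerBlacklist(int maxFailures, long blockMillis) {
        this.maxFailures = maxFailures;
        this.blockMillis = blockMillis;
    }

    // The delivery thread checks this before attempting a send.
    boolean isBlacklisted(String epr, long nowMillis) {
        Long until = blockedUntil.get(epr);
        if (until == null) {
            return false;
        }
        if (nowMillis >= until) {          // block period is over: give it another chance
            blockedUntil.remove(epr, until);
            return false;
        }
        return true;
    }

    void recordFailure(String epr, long nowMillis) {
        int count = failures.merge(epr, 1, Integer::sum);
        if (count >= maxFailures) {
            failures.remove(epr);
            blockedUntil.put(epr, nowMillis + blockMillis);
        }
    }

    // A successful delivery resets the consecutive-failure count.
    void recordSuccess(String epr) {
        failures.remove(epr);
    }
}
```

A delivery loop would call `isBlacklisted` first and skip the send when it returns true, so a timed-out consumer costs a map lookup instead of another full connection timeout.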
22. WS-Messenger
● Part of the NSF-funded "Open Grid Computing Environments
(OGCE)" project and fully open source.
● Implements WS-Eventing and WS-Notification.
● Supports topic-based and XPath-based subscriptions
● New version works on Axis2 http://ws.apache.org/axis2/
● Multiple deployment options
● Standalone distribution
● Embedded in a servlet container (e.g. Tomcat)
23. Future Directions
● Improved static thread pool based
parallelization
● Ensure equal thread utilization
● Implement a work stealing mechanism or hand
over jobs to idle threads.
● Analyze the performance impact of the
parallelization strategies when the number of
consumers is increased.
● Memory requirements
● Throughput, round-trip time
24. Important Info
● Open Grid Computing Environments
(www.collab-ogce.org) provides a SOA
based workflow suite for scientific use
cases.
● WS-Messenger
● http://www.collab-ogce.org/ogce/index.php/Messaging
● http://www.collab-ogce.org/ogce/index.php/Messaging_User_Guide