Wissbi is an open source toolset for building distributed event processing pipelines easily. It provides basic commands like wissbi-sub and wissbi-pub that allow receiving and sending messages. Filters can be written in any language and run in parallel as daemon processes configured through files. This allows constructing complex multi-stage data workflows. The ecosystem also includes tools like a log collector and metric collector that use Wissbi for transport. It aims to minimize operating effort through a simple design that relies mainly on filesystem operations and standard Unix tools and commands.
4. Success Stories
• "Wissbi is an easy-to-use tool for distributed
event processing"
-- Scott Wang, Zillians (奇群科技)
• "Wissbi provides great utilities for logging,
debugging, and monitoring your distributed
system"
-- An engineer surnamed Wang, formerly of Trend Micro
• "Wissbi lets you easily manage your data
workflow in a pipe and filter style with intuitive
commands"
-- lunastorm, Open Source Developer
5. The Past and Present of Wissbi
• Wissbi is a highly available and
scalable distributed message
routing framework like ZeroMQ
that took a different path from
traditional MQ middlewares.
• The toolkit supports an elegant
and intuitive integration model,
making event-driven applications
as easy to write as "$ tail -f
log | grep error > error.log".
• Scott Wang a.k.a. lunastorm, Sr.
Engineer at Zillians
• Trend Message Exchange (TME) is
a highly available and scalable
distributed message routing
framework that took a different path
from traditional MQ middlewares.
• TME's client toolkit, MIST, supports
an elegant and intuitive integration
model, making event-driven
applications as easy to write as "$ tail -f log |
grep error > error.log".
• Scott Wang a.k.a. lunastorm, Sr.
Engineer at Trend Micro
12. Deployment Guide
System Requirements
• A working ZooKeeper deployment is required, version newer than 3.3.2
• Hostname and hostname resolution must be set correctly on each machine: running hostname -i should return
an IP other than 127.0.0.1
• Create a user account TME on each machine, for example, execute useradd -m TME
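The hostname requirement can be checked with a small script. This is only a sketch: the function names are made up for illustration, and treating the whole 127.* range as loopback (not just 127.0.0.1) is an assumption that goes slightly beyond the requirement as stated.

```shell
#!/bin/sh
# is_loopback ADDRESS: succeed if the address is in the 127.* loopback range.
# (Assumption: any 127.* address is rejected, not only 127.0.0.1.)
is_loopback() {
  case "$1" in
    127.*) return 0 ;;
    *)     return 1 ;;
  esac
}

# check_host_ip ADDRESS: print a verdict for one resolved address.
check_host_ip() {
  if is_loopback "$1"; then
    echo "FAIL: hostname resolves to loopback address $1"
    return 1
  fi
  echo "OK: hostname resolves to $1"
}
```

On a real machine it could be called as check_host_ip "$(hostname -i | awk '{print $1}')".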
Standalone Deployment
Standalone means that all of the TME components (including ZooKeeper) are deployed on the same host. This is mostly
for development purposes. Because all default configurations work in standalone mode, you only have to install the
packages and bring up the daemons to test and develop.
CentOS
You will need the following 3rd party packages which are not in the official repository:
1. jdk (get RPM from http://www.oracle.com/technetwork/java/javase/downloads/index.html)
2. monit (http://pkgs.org/search/?keyword=monit)
3. nodejs (for portal, http://pkgs.org/search/?keyword=nodejs)
4. ruby (for portal, https://github.com/lunastorm/ruby19_centos/downloads, have to be at least 1.9.2, you can build from
https://github.com/imeyer/ruby-1.9.2-rpm)
5. ruby-bundler (for portal, https://github.com/lunastorm/ruby19_centos/downloads)
Then follow these steps to install:
1. install Sun's JDK first
2. download the dependency RPMs mentioned above
3. download TME RPM binaries and place them in the same folder with the dependencies
4. yum --nogpgcheck install *.rpm
Ubuntu
1. Grab all the deb files you would like to install
2. sudo dpkg -i tme-*.deb
3. sudo apt-get update
4. sudo apt-get -f install
The TME web portal only supports Ruby 1.9.2+, and Ubuntu 10.04 only ships Ruby 1.9.1.
Follow these steps to install Ruby 1.9.2 with RVM:
1. aptitude install build-essential libssl-dev libreadline5 libreadline5-dev zlib1g zlib1g-dev
2. bash -s stable < <(curl -s https://raw.github.com/wayneeseguin/rvm/master/binscripts/rvm-installer)
3. source /etc/profile.d/rvm.sh
4. rvm install 1.9.2 ; rvm default 1.9.2
5. Edit /opt/trend/tme/conf/portal-web/portal-web-conf.sh , add "source /etc/profile.d/rvm.sh"
The web portal requires a JavaScript runtime installed. For example, you can install Node.js on the machine.
On All Distributions
Finally, execute /opt/trend/tme/bin/create_zookeeper_nodes.sh 127.0.0.1:2181 /tme2 to initialize the essential information on
ZooKeeper before you start the components.
Distributed Deployment
You can choose to deploy different components on different machines of different hardware specs. Typically, the brokers are
required to be deployed on more powerful machines than others. They will handle a large amount of client connections and
deliver messages so they consume more memory and use more CPU. On the other hand, the clients need not use too much
computing power, but it depends on the applications' needs. The administration packages can be deployed on multiple
machines for redundancy.
1. First, you will probably have a distributed ZooKeeper set up, for example, zk1.mydomain:2181,zk2.mydomain:2181,zk3.mydomain:2181
2. Install the packages on the machines of your choice, following the same steps as in the standalone guide
above.
3. Initialize the essential information on ZooKeeper: execute /opt/trend/tme/bin/create_zookeeper_nodes.sh
zk1.mydomain:2181,zk2.mydomain:2181,zk3.mydomain:2181 /tme_root_prefix. The first argument is the ZooKeeper
quorum, and the second argument is a prefix path on ZooKeeper chosen by you. By separating the prefix path on
Service Start / Stop
You can choose one of the following ways to start or stop the components.
service and chkconfig
1. sudo service tme-broker {start / stop / restart}
2. sudo service tme-mistd {start / stop / restart}
3. sudo service tme-graph-editor {start / stop / restart}
4. sudo service tme-portal-collector {start / stop / restart}
5. sudo service tme-portal-web {start / stop / restart}
If you wish to start the services upon boot, then you can use chkconfig to turn on the services.
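As a sketch of the chkconfig step, the loop below prints one chkconfig command per service listed above. It is a dry run that only echoes the commands; the variable and function names are illustrative, and it assumes the chkconfig service names match the ones used with the service command.

```shell
#!/bin/sh
# Service names taken from the list above.
TME_SERVICES="tme-broker tme-mistd tme-graph-editor tme-portal-collector tme-portal-web"

# print_boot_commands: echo the command that would enable each service at boot.
# Nothing is executed, so the output can be reviewed before running it.
print_boot_commands() {
  for svc in $TME_SERVICES; do
    echo "sudo chkconfig $svc on"
  done
}

print_boot_commands
```

Piping the output to sh would apply the commands once they look right.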
monit
If you have installed and enabled monit, then you can use it to ensure the services are running.
Under the configuration folders of the components, there are monit watchdog scripts that can be modified to fulfill your need,
for example, send a notification when a daemon stops working.
You can use the helper scripts to start / stop the components:
1. sudo /opt/trend/tme/bin/{install|remove}_tme-broker.sh
2. sudo /opt/trend/tme/bin/{install|remove}_tme-mistd.sh
3. sudo /opt/trend/tme/bin/{install|remove}_tme-graph-editor.sh
4. sudo /opt/trend/tme/bin/{install|remove}_tme-portal-collector.sh
5. sudo /opt/trend/tme/bin/{install|remove}_tme-portal-web.sh
Verification
Messaging
After MIST daemon is started and configured correctly, execute mist-session --list to show session information:
$ mist-session -l
0 sessions
0 connections
You should get a response like the one above. If MIST daemon is not started correctly, you may receive the following error response:
$ mist-session -l
Error connecting to MIST daemon!
If MIST daemon is running correctly, then you can send your first Hello World message. Execute the script to send a message
to a queue named test:
$ session_id=`mist-session` && echo 'Hello World!' | mist-encode --wrap test --line | mist-sink $session_id --attach ; mist-session --destroy $session_id
destroyed 1453444792
Then execute the script to receive one message from the queue named test:
$ session_id=`mist-session` && mist-source $session_id --mount test && mist-source $session_id --attach --limit 1 | mist-decode --line ; mist-session --destroy $session_id
exchange queue:test mounted
Hello World!
destroyed 1453444796
Congratulations! You can now transmit the messages.
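The first verification step can be scripted by looking for the error text shown above. This is a sketch only: classify_mist_output is a made-up helper, and it relies solely on the two sample responses in this guide, not on any documented exit code of mist-session.

```shell
#!/bin/sh
# classify_mist_output: read the output of `mist-session -l` on stdin and
# print "down" if it contains the connection error, otherwise "up".
classify_mist_output() {
  if grep -q "Error connecting to MIST daemon!"; then
    echo "down"
    return 1
  fi
  echo "up"
}
```

A possible call is mist-session -l 2>&1 | classify_mist_output, where 2>&1 is used because it is not stated whether the error goes to stdout or stderr.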
Portal
Open a browser to access http://**portal.host**:**portal.port** to see if it shows correctly.
Graph Editor
I don’t always read
the deployment guide
When I screw up
something
I call the developers
21. Using Filesystem for
Directory Service
• Store metadata on the filesystem
• Follows the philosophy "Everything is a file"
• Use standard Unix commands to manage it
• mkdir, touch, rm, ln, ...
• Just like /proc and /sys
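A minimal sketch of the idea, using only the commands named above. The layout (one directory per destination, one empty file per subscriber) and the helper names are hypothetical and are not claimed to be Wissbi's actual on-disk format.

```shell
#!/bin/sh
# Hypothetical layout: one directory per destination, one empty file per
# subscriber endpoint. A temporary directory stands in for the metadata root.
META_DIR=$(mktemp -d)

add_destination()   { mkdir -p "$META_DIR/$1"; }        # register a destination
add_subscriber()    { touch "$META_DIR/$1/$2"; }        # register a subscriber
remove_subscriber() { rm -f "$META_DIR/$1/$2"; }        # unregister a subscriber
alias_destination() { ln -s "$1" "$META_DIR/$2"; }      # alias via a symlink
list_subscribers()  { ls "$META_DIR/$1/"; }             # directory lookup

add_destination test.in
add_subscriber test.in host-a
add_subscriber test.in host-b
alias_destination test.in alias.in
list_subscribers alias.in        # lists host-a and host-b through the symlink
remove_subscriber test.in host-a
list_subscribers test.in         # only host-b remains
```

Because the state is plain files, it can be inspected with ls or find at any time, much like /proc and /sys.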
22. Using Filesystem for
Directory Service
• You get authorization for free
• chown, chmod, ...
• Good for testing
• Launch any number of clusters just by using
different metadata directories
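Both points can be sketched with temporary directories. The directory layout and helper names below are assumptions for illustration; the only claim is that ordinary file permissions and separate directory roots are enough to get access control and isolated test clusters.

```shell
#!/bin/sh
# Two independent metadata roots stand in for two test clusters.
CLUSTER_A=$(mktemp -d)
CLUSTER_B=$(mktemp -d)

mkdir "$CLUSTER_A/test.in" "$CLUSTER_B/test.in"
touch "$CLUSTER_A/test.in/host-a"      # only cluster A gets a subscriber

# Authorization through file modes: the owner may modify the destination,
# everyone else may only read it.
chmod 755 "$CLUSTER_A/test.in"

# count_entries DIR: number of entries registered in a directory.
count_entries() { ls "$1" | wc -l | tr -d ' '; }
# mode_of PATH: the type and permission bits as printed by ls -ld.
mode_of() { ls -ld "$1" | cut -c1-10; }

count_entries "$CLUSTER_A/test.in"
count_entries "$CLUSTER_B/test.in"
mode_of "$CLUSTER_A/test.in"
```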
27. Plumber (Program)
• The plumber, in the Plan 9 from Bell Labs and Inferno operating systems, is a
mechanism for reliable uni- or multicast inter-process communication of
formatted textual messages. It uses the Plan 9 network file protocol, 9p, rather
than a special-purpose IPC mechanism.
• Any number of clients may listen on a named port (a file) for messages. Ports
and port routing are defined by plumbing rules. These rules are dynamic. Each
listening program receives a copy of matching messages. For example, if the
data /sys/lib/plumb/basic is plumbed with the standard rules, it is sent to the
edit port. The port will write a copy of the message to each listener. In this case,
all running editors will interpret this message as a file name, and open the file.
• The plumber is the 9P file server that provides this service. Clients may use
libplumb to format messages. Since the messages are 9P, they are network
transparent.
http://en.wikipedia.org/wiki/Plumber_(program)
29. Minimum Dependency
$ ldd wissbi-pub
linux-vdso.so.1
libpthread.so.0
libstdc++.so.6
libm.so.6
libgcc_s.so.1
libc.so.6
/lib64/ld-linux-x86-64.so.2
The only dependency is a compiler
which supports C++11!
35. You Will Have Multiple
Data Pipelines
[Diagram: repeated "sub → Filter → pub" stages]
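The shape of such pipelines can be sketched with ordinary pipes. Everything here is a stand-in so that the example runs without Wissbi: source_stage plays the role of the subscriber feeding a filter's stdin, sink_stage the publisher reading its stdout, and the two filters are arbitrary examples.

```shell
#!/bin/sh
# Stand-ins for the sub and pub ends of a pipeline (hypothetical names).
source_stage() { printf '%s\n' "info: started" "error: disk full" "info: done"; }
sink_stage()   { cat; }

# Filters are plain programs that read stdin and write stdout.
error_filter() { grep error; }
upper_filter() { tr 'a-z' 'A-Z'; }

# Two independent pipelines built from the same pieces.
source_stage | error_filter | sink_stage
source_stage | error_filter | upper_filter | sink_stage
```

Each filter only sees stdin and stdout, which is what lets stages be written in any language and rearranged freely.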
36. Daemonize Your Filters!
• Programming with config
• Daemon start / stop / restart / status
• start / stop upon boot / shutdown
• Watchdogs
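A minimal sketch of the start / stop / restart / status operations using a PID file. It is not Wissbi's implementation: the function names and PID file path are made up, and sleep 300 stands in for a real filter process.

```shell
#!/bin/sh
# PID file location is an arbitrary choice for this sketch.
PID_FILE="${TMPDIR:-/tmp}/filter-sketch.$$.pid"

# daemon_status: succeed if the PID file exists and the process is alive.
daemon_status() {
  [ -f "$PID_FILE" ] && kill -0 "$(cat "$PID_FILE")" 2>/dev/null
}

# daemon_start: launch the stand-in process in the background and record its PID.
daemon_start() {
  daemon_status && return 0            # already running
  sleep 300 >/dev/null 2>&1 &
  echo $! > "$PID_FILE"
}

# daemon_stop: terminate the recorded process and remove the PID file.
daemon_stop() {
  daemon_status || { rm -f "$PID_FILE"; return 0; }
  kill "$(cat "$PID_FILE")"
  rm -f "$PID_FILE"
}

daemon_restart() { daemon_stop; daemon_start; }
```

A watchdog can then be as simple as a periodic daemon_status || daemon_start.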
37. Programming with Config: The Java Edition
<?xml version="1.0"?>
<!DOCTYPE Configure PUBLIC "-//Mort Bay Consulting//DTD Configure//EN" "http://jetty.mortbay.org/configure.dtd">
<!-- =============================================================== -->
<!-- Configure the Jetty Server -->
<!-- -->
<!-- Documentation of this file format can be found at: -->
<!-- http://docs.codehaus.org/display/JETTY/jetty.xml -->
<!-- -->
<!-- =============================================================== -->
<Configure id="Server" class="org.mortbay.jetty.Server">
<!-- =========================================================== -->
<!-- Server Thread Pool -->
<!-- =========================================================== -->
<Set name="ThreadPool">
<!-- Default bounded blocking threadpool
-->
<New class="org.mortbay.thread.BoundedThreadPool">
<Set name="minThreads">10</Set>
<Set name="maxThreads">50</Set>
<Set name="lowThreads">25</Set>
</New>
<!-- New queued blocking threadpool : better scalability
<New class="org.mortbay.thread.QueuedThreadPool">
<Set name="minThreads">10</Set>
<Set name="maxThreads">25</Set>
<Set name="lowThreads">5</Set>
<Set name="SpawnOrShrinkAt">2</Set>
</New>
-->
38. Wissbi filter example
# How many filter instances will be run in parallel
WISSBI_FILTER_COUNT="1"
# How to run the filter
WISSBI_FILTER_CMD="sed --unbuffered -e \"s/^/[ / ; s/$/ ]/\""
# If you run multiple instances in parallel, you can use the instance id $i in the command
# WISSBI_FILTER_CMD="sed --unbuffered -e \"s/^/$i: [ / ; s/$/ ]/\""
# The message source's name, leave it empty if the filter is a message generator
WISSBI_FILTER_SOURCE="test.in"
# The message sink's name, leave it empty if the filter is a message terminal
WISSBI_FILTER_SINK="test.out"
WISSBI_FILTER_LOG_PREFIX="/tmp/filter-example"
WISSBI_FILTER_PID_PREFIX="/tmp/filter-example"
# If WISSBI_DEBUG_DUMP is set, message recording will be enabled, and the 50 messages
# before the filter is terminated are dumped to the specified file.
# If WISSBI_DEBUG_DUMP is set to empty, then a random dump filename will be used
#WISSBI_DEBUG_DUMP=""
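To make the roles of WISSBI_FILTER_COUNT, WISSBI_FILTER_CMD and the instance id $i concrete, here is a sketch of how a launcher might use them. The launcher logic is hypothetical and the real Wissbi filter runner may work differently. It assumes the command is evaluated once per instance with i set, so the command is written in single quotes to defer the expansion of $i, and the sed script is split into two -e expressions that do the same wrapping as the example.

```shell
#!/bin/sh
# Config values in the style of the example above (count raised to 2).
WISSBI_FILTER_COUNT="2"
# Single quotes keep $i unexpanded until the command is evaluated.
WISSBI_FILTER_CMD='sed -e "s/^/$i: [ /" -e "s/$/ ]/"'

# run_instance ID: run one filter instance, reading stdin and writing stdout.
run_instance() {
  i="$1"
  eval "$WISSBI_FILTER_CMD"
}

# Start the configured number of instances in parallel and wait for them.
# echo stands in for the message source; output order is not guaranteed.
n=1
while [ "$n" -le "$WISSBI_FILTER_COUNT" ]; do
  echo "hello" | run_instance "$n" &
  n=$((n + 1))
done
wait
```

In a real deployment the stdin of each instance would come from the source named in WISSBI_FILTER_SOURCE and its stdout would go to WISSBI_FILTER_SINK.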