We have this variety of data, and we need to build all these products around it.
Messaging: ActiveMQ
User Activity: In house log aggregation
Logging: Splunk
Metrics: JMX => Zenoss
Database data: Databus, custom ETL
ActiveMQ: it did not fly at our scale
Now you may be wondering why it works so well. For example, how can it be highly durable, persisting data to disk, while still maintaining high throughput?
Topic = message stream
Topic has partitions, partitions are distributed to brokers
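To make the topic/partition mapping concrete, here is a minimal sketch of keyed partitioning. It is an illustration only: Kafka's default Java partitioner actually uses murmur2, and `md5` here is just a dependency-free stand-in. The idea is the same: hash the message key modulo the partition count, so the same key always lands on the same partition (and therefore the same broker).

```python
# Sketch of key-based partition assignment (hash(key) mod partitions).
# NOT Kafka's real partitioner: the Java client uses murmur2; md5 is
# used here only to keep the example dependency-free.
import hashlib

def assign_partition(key: bytes, num_partitions: int) -> int:
    digest = hashlib.md5(key).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# The same key is always routed to the same partition.
p1 = assign_partition(b"user-42", 8)
p2 = assign_partition(b"user-42", 8)
assert p1 == p2
assert 0 <= p1 < 8
```

Because ordering is only guaranteed within a partition, routing all messages for one key to one partition is what gives per-key ordering.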
Do not be afraid of disks
File system caching
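The reason disks are not scary here is that each partition is an append-only file: all writes are sequential, and reads mostly come out of the OS page cache. A toy sketch of that storage model (the real broker adds message framing, offset indexes, and segment rolling; this only shows the append-forward/read-forward shape):

```python
# Toy sketch of a Kafka-style append-only log segment: messages are
# only ever appended (sequential disk I/O), and readers seek to a
# byte offset and read forward. Length-prefixed framing keeps it simple.
import os
import tempfile

class LogSegment:
    def __init__(self, path):
        self.path = path
        open(path, "wb").close()  # start with an empty segment file

    def append(self, message: bytes) -> int:
        """Append a message; return the byte offset it was written at."""
        offset = os.path.getsize(self.path)
        with open(self.path, "ab") as f:
            f.write(len(message).to_bytes(4, "big") + message)
        return offset

    def read(self, offset: int) -> bytes:
        """Read the single message stored at the given byte offset."""
        with open(self.path, "rb") as f:
            f.seek(offset)
            size = int.from_bytes(f.read(4), "big")
            return f.read(size)

path = os.path.join(tempfile.mkdtemp(), "00000000.log")
log = LogSegment(path)
o1 = log.append(b"first")
o2 = log.append(b"second")
assert log.read(o1) == b"first"
assert log.read(o2) == b"second"
```

Because appends never rewrite earlier bytes, the kernel's read-ahead and page cache do most of the work on the read path.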
And finally, after all these tricks, the client interface we expose to users is very simple.
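How simple? Roughly: producers send `(topic, message)` pairs, and consumers iterate forward over a topic from an offset. The following is an in-memory sketch of that API shape, not the real client library; there is no networking or persistence here, the point is only how small the surface area is.

```python
# In-memory sketch of the shape of Kafka's client interface:
# produce(topic, message) on one side, iterate-from-offset on the other.
# Illustrative only -- not the real Kafka client API.
from collections import defaultdict

class MiniBroker:
    def __init__(self):
        self.topics = defaultdict(list)

    def send(self, topic: str, message: bytes):
        self.topics[topic].append(message)

    def consume(self, topic: str, offset: int = 0):
        # Consumers track their own offset and just read forward.
        yield from self.topics[topic][offset:]

broker = MiniBroker()
broker.send("page-views", b"/home")
broker.send("page-views", b"/jobs")
assert list(broker.consume("page-views")) == [b"/home", b"/jobs"]
assert list(broker.consume("page-views", offset=1)) == [b"/jobs"]
```

Keeping consumers in charge of their own offsets is a large part of why the interface stays this small.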
Now I will switch gears and talk a little bit about Kafka usage at LinkedIn.
Non-Java / Scala
C / C++ / .NET
Go
Clojure
Ruby
Node.js
PHP
Python
Erlang
HTTP REST
Command line
etc ..
https://cwiki.apache.org/confluence/display/KAFKA/Clients
Python - Pure Python implementation with full protocol support. Consumer and Producer implementations included, GZIP and Snappy compression supported.
C - High performance C library with full protocol support
C++ - Native C++ library with protocol support for Metadata, Produce, Fetch, and Offset.
Go (aka golang) - Pure Go implementation with full protocol support. Consumer and Producer implementations included, GZIP and Snappy compression supported.
Ruby - Pure Ruby, Consumer and Producer implementations included, GZIP and Snappy compression supported. Ruby 1.9.3 and up (CI runs MRI 2.
Clojure - Clojure DSL for the Kafka API
JavaScript (NodeJS) - NodeJS client in a pure JavaScript implementation
stdin & stdout