SlideShare une entreprise Scribd logo
1  sur  46
Télécharger pour lire hors ligne
Kafka and Storm – event
processing in realtime
Guido Schmutz
WELCOME

Kafka and Storm – event
processing in realtime
Guido Schmutz

Jazoon 2013
23.10.2013

BASEL

2

BERN

LAUSANNE

ZÜRICH

DÜSSELDORF

FRANKFURT A.M.

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013

FREIBURG I.BR.

HAMBURG

MÜNCHEN

STUTTGART

WIEN

Guido Schmutz
• 
• 

Working for Trivadis for more than 16 years
Oracle ACE Director for Fusion Middleware and SOA

• 
• 

Co-Author of different books
Consultant, Trainer Software Architect for Java, Oracle, SOA,
EDA, BigData und FastData

• 
• 

Member of Trivadis Architecture Board
Technology Manager @ Trivadis

• 

More than 20 years of software development 

experience

• 

Contact: guido.schmutz@trivadis.com

• 
• 

Blog: http://guidoschmutz.wordpress.com
Twitter: gschmutz

3

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013
Our company
Trivadis is a market leader in IT consulting, system integration,
solution engineering and the provision of IT services focusing
on
and
technologies in Switzerland,
Germany and Austria.
We offer our services in the following strategic business fields:

OPERATION

Trivadis Services takes over the interacting operation of your IT systems.
4

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013
With over 600 specialists and IT experts in your region

Hamburg

Düsseldorf

Frankfurt

Stuttgart

Freiburg

Wien
München

Basel Brugg
Bern
Zurich
Lausanne

5
5

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013

12 Trivadis branches and more than
600 employees
 
200 Service Level Agreements
 
Over 4,000 training participants
 
Research and development budget:
CHF 5.0 / EUR 4 million
 
Financially self-supporting and
sustainably profitable
 
Experience from more than 1,900
projects per year at over 800
customers
Agenda
1.  Introduction
2.  What is Apache Kafka?
3.  What is Twitter Storm?
4.  Summary

6

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013
Big Data Definition (Gartner et al)

Characteristics of Big Data: Its
Volume, Velocity and Variety in
combination

Tera-, Peta-, Exa-, Zetta-, Yota- bytes and constantly growing

Velocity
“Traditional” computing in RDBMS 

is not scalable enough. 

We search for “linear scalability”

“Only … structured information 

is not enough” – “95% of produced data in
unstructured”

+ Veracity (IBM) - information uncertainty
+ Time to action ? – Big Data + Event Processing = Fast Data
7

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013
8

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013
Sample – Show the counts of tweets related to Jazoon (using
term jazoon) 

23.10.2013 – 22:20 not
showing #jazoon
http://54.217.159.208:8484/jazoon-restapi/

9

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013
Velocity
§  Velocity requirement examples:
§ 
§ 
§ 
§ 
§ 
§ 
§ 
§ 
§ 
§ 
§ 
§ 

10

Recommendation Engine
Predictive Analytics
Marketing Campaign Analysis
Customer Retention and Churn Analysis
Social Graph Analysis
Capital Markets Analysis
Risk Management
Rogue Trading
Fraud Detection
Retail Banking
Network Monitoring
Research and Development

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013
Agenda
1.  Introduction
2.  What is Apache Kafka?
3.  What is Twitter Storm?
4.  Summary

11

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013
Apache Kafka
•  A distributed publish-subscribe messaging system
•  Designed for processing of real time activity stream data (logs, metrics
collections, social media streams, …)
•  Initially developed at LinkedIn, now part of Apache
•  Does not follow JMS Standards and does not use JMS API
•  Kafka maintains feeds of messages in topics

12

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013
Sample (#1)
1) Use the Twitter Streaming API (Twitter Horsebird Client) to get all the
tweets containing the term jazoon
Twitter
Stream

13

tweet

Twitter to
Kafka

tweet

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013

Kafka
Topic

Create a Filter Stream for
tracking a list of terms
Twitter
Stream

Sample (#1)

tweet

Twitter to
Kafka

tweet

Kafka
Topic

Create a fully configured
Horsebird Client instance

Start the client in
multiple threads

14

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013
Twitter
Stream

Sample (#1)

tweet

Twitter to
Kafka

tweet

Kafka
Topic

Convert Twitter status
to Avro format

Create the Kafka producer

Send the message
to the Kafka topic
15

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013
Agenda
1.  Introduction
2.  What is Apache Kafka?
3.  What is Twitter Storm?
4.  Summary

16

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013
What is Twitter Storm ?
A platform for doing analysis on streams of data as they come in, so you
can react to data as it happens.
•  A highly distributed real-time computation system
•  Provides general primitives to do real-time computation
•  To simplify working with queues & workers
•  scalable and fault-tolerant
•  complementary to Hadoop
Originated at Backtype, acquired by Twitter in 2011
Open Sourced late 2011
Part of Apache Incubator since September 2013
17

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013
Hadoop vs. Storm
Hadoop

Storm

Batch processing
Jobs runs to completion
Stateful nodes
Scalable
Guarantees no data loss
Open Source
=> big batch processing
18

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013

!=
=

Real-time processing
Topologies run forever
Stateless nodes
Scalable
Guarantees no data loss
Open Source

=> Fast, reactive, real time procesing
Idea: Simplify dealing with queue & workers
Scaling is painful – queue partitioning & worker deploy
Operational overhead – worker failures & queue backups
No guarantees on data processing

19

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013
What does Storm provide?
§  At least once message processing
§  Fault-tolerant
§  Horizontal scalability
§  No intermediate queues
§  Less operational overhead
§  „Just works“
§  Broad set of use cases
§  Stream processing
§  Continous computation
§  Distributed RPC

20

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013
Main Concepts - Streams
Stream
•  an unbounded sequence of tuples
•  core abstraction in Storm
•  Defined with a schema that names the 

fields in the tuple
•  Value must be serializable
•  Every stream has an id

T

T

T

T

T

T

Subscribes: C & D
Emits: Source of
stream A

Source of
stream B

21

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013

T

Subscribes: A
Emits: C

Topology
•  graph where each node is a spout or bolt
•  edges indicating which bolt subscribes 

to which streams

T

Subscribes: A
Emits: D

Subscribes: A & B
Emits: -
Main Concepts – Worker, Executor, Task
Worker
•  executes a subset of a topology, may run one or more executors for one or
more components

Executor
•  thread spawned by worker

Task
•  performs the actual data processing

Worker Process
Task

Task

22

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013

Task

Task
Main Concepts - Spout
A spout is the component which acts as the source of streams in a
topology
Generally spouts read tuples from an external source and emit them into
the topology
Spouts can emit more than one stream

23

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013
Main Concepts - Bolt
Bolt is doing the processing of the message
•  filtering, functions, aggregations, joins, talking to databases….
•  Can do simple stream transformations
•  Complex transformations often require multiple steps and therefore
multiple bolts
Bolts can emit one or more streams

24

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013
Sample (#2)
2) Use a Spout to subscribe to the Kafka Topic to get the related tweets
Twitter
Stream
tweet

Twitter to
Kafka
tweet

Kafka
Topic

25

tweet

Kafka

Spout

tweet

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013
Kafka
Topic

Sample (#2)

Tweets

Kafka

Spout

tweet

Use existing implementation of the Kafka Spout from storm-kafka project

Create a Kafka configuration

26
2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013

Create Kafka spout to connect to the
topic
Sample (#3)
3) Use a bolt to retrieve the hashtags for each tweet
Twitter
Stream
tweet

Twitter to
Kafka
tweet

Kafka
Topic

27

tweet

Kafka

Spout

tweet

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013

Hashtag

Splitter

hashtag
Kafka
Topic

Sample (#3)

Tweets

Kafka

Spout

tweet

Hashtag

Splitter

hashtag

Execute is called for each
tuple arriving on the input
stream

Declares the output fields for the component
and is called when bolt is created

28

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013
Sample (#4)
4) Use a bolt to count the occurrences of hashtags into a Redis NoSQL DB
Twitter
Stream
Tweets

Twitter to
Kafka
Tweets

Kafka
Topic

29

Tweets

Kafka

Spout

tweet

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013

Hashtag

Splitter

hashtag

Hashtag

Counter
Sample (#4)

Kafka
Topic

Tweets

Kafka

Spout

tweet

Hashtag

Splitter

hashtag

Hashtag

Counter

Execute is called for each tuple
arriving on the input stream

Prepare is called when bolt
is created

30

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013
Main Concepts - Topology

Kafka
Topic

Tweets

Kafka

Spout

Hashtag

Splitter
Shuffle

Hashtag

Counter
Fields

Hashtag

Splitter

Hashtag

Counter

Each Spout or Bolt are running N instances in parallel
Shuffle grouping

is random grouping

Fields grouping

is grouped by value, such that equal value results in equal task

All grouping

replicates to all tasks

Global grouping

makes all tuples go to one task

None grouping

makes bolt run in the same thread as bold/spout it subscribes to

Direct grouping

producer (task that emits) controls which consumer will receive

31

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013
Sample – How does it work ?

My presentation
„Kafka and Storm –
event processing in
realtime“ tomorrow
12:00 at Jazoon
Zurich #jazoon
#storm #kafka CU
there!

Kafka
Topic

32

Kafka

Spout

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013

Hashtag

Splitter

Hashtag

Counter

Hashtag

Splitter

Hashtag

Counter
Sample – How does it work ?

My presentation
„Kafka and Storm –
event processing in
realtime“ tomorrow
12:00 at Jazoon
Zurich #jazoon
#storm #kafka CU
there!

Kafka
Topic

33

Kafka

Spout

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013

Hashtag

Splitter
Hashtag

Splitter

jazoon
storm
kafka

Hashtag

Counter
Hashtag

Counter
Sample – How does it work ?

My presentation
„Kafka and Storm –
event processing in
realtime“ tomorrow
12:00 at Jazoon
Zurich #jazoon
#storm #kafka CU
there!

Kafka
Topic

34

Kafka

Spout

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013

Hashtag

Splitter
Hashtag

Splitter

jazoon
storm
kafka

Hashtag

Counter
Hashtag

Counter

INCR
jazoon

INCR
storm
INCR
kafka

jazoon = 1
storm = 1
kafka = 1
Sample (#5)
5) Setup the topology with the necessary groupings (distribution)
Create Stream called „tweet-stream“
Register Kafka
spout and run by 1
worker
Subscribe to stream „tweetstream“ using shuffle grouping

Run this bolt
by 2 workers
Subscribe to „hashtag-splitter“ using
fields grouping on field „hashtag“
35

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013
Trident
•  High-Level abstraction on top of storm
•  Makes it easier to build topologies
•  Core data model is the stream
•  Processed as a series of batches
•  Stream is partitioned among nodes in cluster

•  5 kinds of operations in Trident
•  Operations that apply locally to each partition and cause no network
transfer
•  Repartitioning operations that don‘t change the contents
•  Aggregation operations that do network transfer
•  Operations on grouped streams
•  Merges and Joins

36

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013
Trident
Supports
• 
• 
• 
• 
• 

Joins
Aggregations
Grouping
Functions
Filters

Similar to Hadoop and Pig/Cascading

37

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013
Trident Concepts - Topology

Kafka
Topic

38

tweet

Kafka

Spout

tweet

Bolt
Hashtag

Splitter

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013

hashtag
local

Hashtag

Normalizer

hashtag
groupBy

Bolt
Persistent

Aggregate
Trident Concepts - Function
•  takes in a set of input fields and emits zero or more tuples as output
•  fields of the output tuple are appended to the original input tuple in the
stream
•  If a function emits no tuples, the original input tuple is filtered out
•  Otherwise the input tuple is duplicated for each output tuple

39

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013
Agenda
1.  Introduction
2.  What is Apache Kafka?
3.  What is Twitter Storm?
4.  Summary

40

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013
Summary
Kafka

Storm

•  Distributed Scalable Pub/Sub system
for Big Data

•  Express real-time processing naturally

•  Producer => Broker => Consumer of
message topics
•  Persists messages with ability to
rewind
•  Consumer decides what he as
consumed so far

•  Not a Hadoop/MapReduce competitor
•  Supports other languages
•  Hard to debug
•  Object Serialization
•  Didn‘t cover
§  Reliability through guaranteed message
processing
§  Distributed RPC
§  Storm cluster setup and deploy

http://54.217.159.208:8484/jazoon-restapi/
41

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013
Lambda Architecture

Big Data and Fast Data combined

Precompute

Precomputed
Views
information

All data

Batch
recompute

Incoming
Data

Serving Layer
batch view
batch view
Merge

Batch Layer

Speed Layer
Process stream

Incremented
information

Realtime
increment

query

real time view
real time view

Source: Marz, N. & Warren, J. (2013) Big Data. Manning.

42

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013
Lambda Architecture

Big Data and Fast Data combined
one possible product/framework mapping

Precompute

Precomputed
Views
information

All data

Batch
recompute

Incoming
Data

batch view
batch view

Speed Layer
Process stream

Incremented
information

Realtime
increment

43

Serving Layer

Merge

Batch Layer

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013

real time view
real time view

query
Resources
Sample Code: https://github.com/gschmutz/jazoon-storm-twitter-sample
Twitter Streaming API: https://dev.twitter.com/docs/streaming-apis
Twitter Horsebird Client (HBC): https://github.com/twitter/hbc
Apache Kafka: http://kafka.apache.org/
Storm Website: http://storm-project.net/
Storm Wiki: https://github.com/nathanmarz/storm/wiki
Storm Doc: https://github.com/nathanmarz/storm/wiki/Documentation

44

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013
Thank You!

BASEL

45

BERN

LAUSANNE

ZÜRICH

DÜSSELDORF

Trivadis AG
Guido Schmutz

guido.schmutz@trivadis.com

FRANKFURT A.M.

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013

FREIBURG I.BR.

HAMBURG

MÜNCHEN

STUTTGART

WIEN

Lambda Architecture

Big Data and Fast Data combined
Batch Layer

Serving Layer

Immutable
data

Batch
View

B
Incoming
Data

C

D

A

G

Speed Layer
Data
Stream
E

46

2013 © Trivadis
Kafka and Storm – event processing in realtime
22.10.2013

Realtime
View
F

Query

Contenu connexe

Tendances

Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & PartitioningApache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & PartitioningGuido Schmutz
 
NATS for Modern Messaging and Microservices
NATS for Modern Messaging and MicroservicesNATS for Modern Messaging and Microservices
NATS for Modern Messaging and MicroservicesApcera
 
Gluecon - Kafka and the service mesh
Gluecon - Kafka and the service meshGluecon - Kafka and the service mesh
Gluecon - Kafka and the service meshGwen (Chen) Shapira
 
DDoS Attacks in 2017: Beyond Packet Filtering
DDoS Attacks in 2017: Beyond Packet FilteringDDoS Attacks in 2017: Beyond Packet Filtering
DDoS Attacks in 2017: Beyond Packet FilteringQrator Labs
 
Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
Kafka Connect & Kafka Streams/KSQL - the ecosystem around KafkaKafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
Kafka Connect & Kafka Streams/KSQL - the ecosystem around KafkaGuido Schmutz
 
What is Reactive programming?
What is Reactive programming?What is Reactive programming?
What is Reactive programming?Kevin Webber
 
SF Python Meetup - Introduction to NATS Messaging with Python3
SF Python Meetup - Introduction to NATS Messaging with Python3SF Python Meetup - Introduction to NATS Messaging with Python3
SF Python Meetup - Introduction to NATS Messaging with Python3wallyqs
 
30 Minutes to the Analytics Platform with Infrastructure as Code
30 Minutes to the Analytics Platform with Infrastructure as Code30 Minutes to the Analytics Platform with Infrastructure as Code
30 Minutes to the Analytics Platform with Infrastructure as CodeGuido Schmutz
 
Meetup Microservices Commandments
Meetup Microservices CommandmentsMeetup Microservices Commandments
Meetup Microservices CommandmentsBill Zajac
 
Apache Kafka - A modern Stream Processing Platform
Apache Kafka - A modern Stream Processing PlatformApache Kafka - A modern Stream Processing Platform
Apache Kafka - A modern Stream Processing PlatformGuido Schmutz
 
Simple Solutions for Complex Problems
Simple Solutions for Complex Problems Simple Solutions for Complex Problems
Simple Solutions for Complex Problems Apcera
 
SSL for SaaS Providers
SSL for SaaS ProvidersSSL for SaaS Providers
SSL for SaaS ProvidersCloudflare
 
8 Lessons Learned from Using Kafka in 1000 Scala microservices - Scale by the...
8 Lessons Learned from Using Kafka in 1000 Scala microservices - Scale by the...8 Lessons Learned from Using Kafka in 1000 Scala microservices - Scale by the...
8 Lessons Learned from Using Kafka in 1000 Scala microservices - Scale by the...Natan Silnitsky
 
Running a Robust DNS Infrastructure with CloudFlare Virtual DNS
Running a Robust DNS Infrastructure with CloudFlare Virtual DNSRunning a Robust DNS Infrastructure with CloudFlare Virtual DNS
Running a Robust DNS Infrastructure with CloudFlare Virtual DNSCloudflare
 
Dataservices - Processing Big Data The Microservice Way
Dataservices - Processing Big Data The Microservice WayDataservices - Processing Big Data The Microservice Way
Dataservices - Processing Big Data The Microservice WayJosef Adersberger
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream ProcessingGuido Schmutz
 
NATS Connect Live | SwimOS & NATS
NATS Connect Live | SwimOS & NATSNATS Connect Live | SwimOS & NATS
NATS Connect Live | SwimOS & NATSNATS
 
利用 SDACK 架構分析資安事件大數據
利用 SDACK 架構分析資安事件大數據利用 SDACK 架構分析資安事件大數據
利用 SDACK 架構分析資安事件大數據Yu-Lun Chen
 
How Greta uses NATS to revolutionize data distribution on the Internet
How Greta uses NATS to revolutionize data distribution on the InternetHow Greta uses NATS to revolutionize data distribution on the Internet
How Greta uses NATS to revolutionize data distribution on the InternetApcera
 

Tendances (20)

Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & PartitioningApache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
Apache Kafka - Event Sourcing, Monitoring, Librdkafka, Scaling & Partitioning
 
NATS for Modern Messaging and Microservices
NATS for Modern Messaging and MicroservicesNATS for Modern Messaging and Microservices
NATS for Modern Messaging and Microservices
 
Gluecon - Kafka and the service mesh
Gluecon - Kafka and the service meshGluecon - Kafka and the service mesh
Gluecon - Kafka and the service mesh
 
DDoS Attacks in 2017: Beyond Packet Filtering
DDoS Attacks in 2017: Beyond Packet FilteringDDoS Attacks in 2017: Beyond Packet Filtering
DDoS Attacks in 2017: Beyond Packet Filtering
 
Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
Kafka Connect & Kafka Streams/KSQL - the ecosystem around KafkaKafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
Kafka Connect & Kafka Streams/KSQL - the ecosystem around Kafka
 
What is Reactive programming?
What is Reactive programming?What is Reactive programming?
What is Reactive programming?
 
Next-Gen DDoS Detection
Next-Gen DDoS DetectionNext-Gen DDoS Detection
Next-Gen DDoS Detection
 
SF Python Meetup - Introduction to NATS Messaging with Python3
SF Python Meetup - Introduction to NATS Messaging with Python3SF Python Meetup - Introduction to NATS Messaging with Python3
SF Python Meetup - Introduction to NATS Messaging with Python3
 
30 Minutes to the Analytics Platform with Infrastructure as Code
30 Minutes to the Analytics Platform with Infrastructure as Code30 Minutes to the Analytics Platform with Infrastructure as Code
30 Minutes to the Analytics Platform with Infrastructure as Code
 
Meetup Microservices Commandments
Meetup Microservices CommandmentsMeetup Microservices Commandments
Meetup Microservices Commandments
 
Apache Kafka - A modern Stream Processing Platform
Apache Kafka - A modern Stream Processing PlatformApache Kafka - A modern Stream Processing Platform
Apache Kafka - A modern Stream Processing Platform
 
Simple Solutions for Complex Problems
Simple Solutions for Complex Problems Simple Solutions for Complex Problems
Simple Solutions for Complex Problems
 
SSL for SaaS Providers
SSL for SaaS ProvidersSSL for SaaS Providers
SSL for SaaS Providers
 
8 Lessons Learned from Using Kafka in 1000 Scala microservices - Scale by the...
8 Lessons Learned from Using Kafka in 1000 Scala microservices - Scale by the...8 Lessons Learned from Using Kafka in 1000 Scala microservices - Scale by the...
8 Lessons Learned from Using Kafka in 1000 Scala microservices - Scale by the...
 
Running a Robust DNS Infrastructure with CloudFlare Virtual DNS
Running a Robust DNS Infrastructure with CloudFlare Virtual DNSRunning a Robust DNS Infrastructure with CloudFlare Virtual DNS
Running a Robust DNS Infrastructure with CloudFlare Virtual DNS
 
Dataservices - Processing Big Data The Microservice Way
Dataservices - Processing Big Data The Microservice WayDataservices - Processing Big Data The Microservice Way
Dataservices - Processing Big Data The Microservice Way
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream Processing
 
NATS Connect Live | SwimOS & NATS
NATS Connect Live | SwimOS & NATSNATS Connect Live | SwimOS & NATS
NATS Connect Live | SwimOS & NATS
 
利用 SDACK 架構分析資安事件大數據
利用 SDACK 架構分析資安事件大數據利用 SDACK 架構分析資安事件大數據
利用 SDACK 架構分析資安事件大數據
 
How Greta uses NATS to revolutionize data distribution on the Internet
How Greta uses NATS to revolutionize data distribution on the InternetHow Greta uses NATS to revolutionize data distribution on the Internet
How Greta uses NATS to revolutionize data distribution on the Internet
 

En vedette

"Hadoop: What we've learned in 5 years", Martin Oberhuber, Senior Data Scient...
"Hadoop: What we've learned in 5 years", Martin Oberhuber, Senior Data Scient..."Hadoop: What we've learned in 5 years", Martin Oberhuber, Senior Data Scient...
"Hadoop: What we've learned in 5 years", Martin Oberhuber, Senior Data Scient...Dataconomy Media
 
Spark with Elasticsearch
Spark with ElasticsearchSpark with Elasticsearch
Spark with ElasticsearchHolden Karau
 
Today’s reality Hadoop with Spark- How to select the best Data Science approa...
Today’s reality Hadoop with Spark- How to select the best Data Science approa...Today’s reality Hadoop with Spark- How to select the best Data Science approa...
Today’s reality Hadoop with Spark- How to select the best Data Science approa...huguk
 
Confessions of a Former Agile Methodologist (JFrog Edition)
Confessions of a Former Agile Methodologist (JFrog Edition)Confessions of a Former Agile Methodologist (JFrog Edition)
Confessions of a Former Agile Methodologist (JFrog Edition)Stephen Chin
 
Real time big data analytics with Storm by Ron Bodkin of Think Big Analytics
Real time big data analytics with Storm by Ron Bodkin of Think Big AnalyticsReal time big data analytics with Storm by Ron Bodkin of Think Big Analytics
Real time big data analytics with Storm by Ron Bodkin of Think Big AnalyticsData Con LA
 
"Startupbootcamp and Data startups", Angel Garcia, Co-Founder and Tech Mentor...
"Startupbootcamp and Data startups", Angel Garcia, Co-Founder and Tech Mentor..."Startupbootcamp and Data startups", Angel Garcia, Co-Founder and Tech Mentor...
"Startupbootcamp and Data startups", Angel Garcia, Co-Founder and Tech Mentor...Dataconomy Media
 
Webinar: Back to Basics: Thinking in Documents
Webinar: Back to Basics: Thinking in DocumentsWebinar: Back to Basics: Thinking in Documents
Webinar: Back to Basics: Thinking in DocumentsMongoDB
 
Developing Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaDeveloping Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaJoe Stein
 
Realtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and HadoopRealtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and HadoopDataWorks Summit
 

En vedette (9)

"Hadoop: What we've learned in 5 years", Martin Oberhuber, Senior Data Scient...
"Hadoop: What we've learned in 5 years", Martin Oberhuber, Senior Data Scient..."Hadoop: What we've learned in 5 years", Martin Oberhuber, Senior Data Scient...
"Hadoop: What we've learned in 5 years", Martin Oberhuber, Senior Data Scient...
 
Spark with Elasticsearch
Spark with ElasticsearchSpark with Elasticsearch
Spark with Elasticsearch
 
Today’s reality Hadoop with Spark- How to select the best Data Science approa...
Today’s reality Hadoop with Spark- How to select the best Data Science approa...Today’s reality Hadoop with Spark- How to select the best Data Science approa...
Today’s reality Hadoop with Spark- How to select the best Data Science approa...
 
Confessions of a Former Agile Methodologist (JFrog Edition)
Confessions of a Former Agile Methodologist (JFrog Edition)Confessions of a Former Agile Methodologist (JFrog Edition)
Confessions of a Former Agile Methodologist (JFrog Edition)
 
Real time big data analytics with Storm by Ron Bodkin of Think Big Analytics
Real time big data analytics with Storm by Ron Bodkin of Think Big AnalyticsReal time big data analytics with Storm by Ron Bodkin of Think Big Analytics
Real time big data analytics with Storm by Ron Bodkin of Think Big Analytics
 
"Startupbootcamp and Data startups", Angel Garcia, Co-Founder and Tech Mentor...
"Startupbootcamp and Data startups", Angel Garcia, Co-Founder and Tech Mentor..."Startupbootcamp and Data startups", Angel Garcia, Co-Founder and Tech Mentor...
"Startupbootcamp and Data startups", Angel Garcia, Co-Founder and Tech Mentor...
 
Webinar: Back to Basics: Thinking in Documents
Webinar: Back to Basics: Thinking in DocumentsWebinar: Back to Basics: Thinking in Documents
Webinar: Back to Basics: Thinking in Documents
 
Developing Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache KafkaDeveloping Real-Time Data Pipelines with Apache Kafka
Developing Real-Time Data Pipelines with Apache Kafka
 
Realtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and HadoopRealtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and Hadoop
 

Similaire à JAZOON'13 - Guide Schmutz - Kafka and Strom Event Processing In Realtime

Twitter Storm: Ereignisverarbeitung in Echtzeit
Twitter Storm: Ereignisverarbeitung in EchtzeitTwitter Storm: Ereignisverarbeitung in Echtzeit
Twitter Storm: Ereignisverarbeitung in EchtzeitGuido Schmutz
 
Meetup: Streaming Data Pipeline Development
Meetup:  Streaming Data Pipeline DevelopmentMeetup:  Streaming Data Pipeline Development
Meetup: Streaming Data Pipeline DevelopmentTimothy Spann
 
Meetup Streaming Data Pipeline Development
Meetup Streaming Data Pipeline DevelopmentMeetup Streaming Data Pipeline Development
Meetup Streaming Data Pipeline DevelopmentTimothy Spann
 
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023ssuser73434e
 
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and FlinkDBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and FlinkTimothy Spann
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream ProcessingGuido Schmutz
 
Event Broker (Kafka) in a Modern Data Architecture
Event Broker (Kafka) in a Modern Data ArchitectureEvent Broker (Kafka) in a Modern Data Architecture
Event Broker (Kafka) in a Modern Data ArchitectureGuido Schmutz
 
Unconference Round Table Notes
Unconference Round Table NotesUnconference Round Table Notes
Unconference Round Table NotesTimothy Spann
 
Implement a Universal Data Distribution Architecture to Manage All Streaming ...
Implement a Universal Data Distribution Architecture to Manage All Streaming ...Implement a Universal Data Distribution Architecture to Manage All Streaming ...
Implement a Universal Data Distribution Architecture to Manage All Streaming ...Timothy Spann
 
Spark Streaming the Industrial IoT
Spark Streaming the Industrial IoTSpark Streaming the Industrial IoT
Spark Streaming the Industrial IoTJim Haughwout
 
Apache Storm vs. Spark Streaming – two Stream Processing Platforms compared
Apache Storm vs. Spark Streaming – two Stream Processing Platforms comparedApache Storm vs. Spark Streaming – two Stream Processing Platforms compared
Apache Storm vs. Spark Streaming – two Stream Processing Platforms comparedGuido Schmutz
 
INTERFACE by apidays 2023 - Leveraging Event Streaming to Super-Charge your B...
INTERFACE by apidays 2023 - Leveraging Event Streaming to Super-Charge your B...INTERFACE by apidays 2023 - Leveraging Event Streaming to Super-Charge your B...
INTERFACE by apidays 2023 - Leveraging Event Streaming to Super-Charge your B...apidays
 
Azure tales: a real world CQRS and ES Deep Dive - Andrea Saltarello
Azure tales: a real world CQRS and ES Deep Dive - Andrea SaltarelloAzure tales: a real world CQRS and ES Deep Dive - Andrea Saltarello
Azure tales: a real world CQRS and ES Deep Dive - Andrea SaltarelloITCamp
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Guido Schmutz
 
Devoxx university - Kafka de haut en bas
Devoxx university - Kafka de haut en basDevoxx university - Kafka de haut en bas
Devoxx university - Kafka de haut en basFlorent Ramiere
 
Architecture of a Kafka camus infrastructure
Architecture of a Kafka camus infrastructureArchitecture of a Kafka camus infrastructure
Architecture of a Kafka camus infrastructuremattlieber
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream ProcessingGuido Schmutz
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Guido Schmutz
 
OpenWhisk: Where Did My Servers Go?
OpenWhisk: Where Did My Servers Go?OpenWhisk: Where Did My Servers Go?
OpenWhisk: Where Did My Servers Go?Carlos Santana
 
Event Hub (i.e. Kafka) in Modern Data (Analytics) Architecture
Event Hub (i.e. Kafka) in Modern Data (Analytics) ArchitectureEvent Hub (i.e. Kafka) in Modern Data (Analytics) Architecture
Event Hub (i.e. Kafka) in Modern Data (Analytics) ArchitectureGuido Schmutz
 

Similaire à JAZOON'13 - Guide Schmutz - Kafka and Strom Event Processing In Realtime (20)

Twitter Storm: Ereignisverarbeitung in Echtzeit
Twitter Storm: Ereignisverarbeitung in EchtzeitTwitter Storm: Ereignisverarbeitung in Echtzeit
Twitter Storm: Ereignisverarbeitung in Echtzeit
 
Meetup: Streaming Data Pipeline Development
Meetup:  Streaming Data Pipeline DevelopmentMeetup:  Streaming Data Pipeline Development
Meetup: Streaming Data Pipeline Development
 
Meetup Streaming Data Pipeline Development
Meetup Streaming Data Pipeline DevelopmentMeetup Streaming Data Pipeline Development
Meetup Streaming Data Pipeline Development
 
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023
Future of Data Milwaukee Meetup Streaming Data Pipeline Development 28 June 2023
 
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and FlinkDBA Fundamentals Group: Continuous SQL with Kafka and Flink
DBA Fundamentals Group: Continuous SQL with Kafka and Flink
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream Processing
 
Event Broker (Kafka) in a Modern Data Architecture
Event Broker (Kafka) in a Modern Data ArchitectureEvent Broker (Kafka) in a Modern Data Architecture
Event Broker (Kafka) in a Modern Data Architecture
 
Unconference Round Table Notes
Unconference Round Table NotesUnconference Round Table Notes
Unconference Round Table Notes
 
Implement a Universal Data Distribution Architecture to Manage All Streaming ...
Implement a Universal Data Distribution Architecture to Manage All Streaming ...Implement a Universal Data Distribution Architecture to Manage All Streaming ...
Implement a Universal Data Distribution Architecture to Manage All Streaming ...
 
Spark Streaming the Industrial IoT
Spark Streaming the Industrial IoTSpark Streaming the Industrial IoT
Spark Streaming the Industrial IoT
 
Apache Storm vs. Spark Streaming – two Stream Processing Platforms compared
Apache Storm vs. Spark Streaming – two Stream Processing Platforms comparedApache Storm vs. Spark Streaming – two Stream Processing Platforms compared
Apache Storm vs. Spark Streaming – two Stream Processing Platforms compared
 
INTERFACE by apidays 2023 - Leveraging Event Streaming to Super-Charge your B...
INTERFACE by apidays 2023 - Leveraging Event Streaming to Super-Charge your B...INTERFACE by apidays 2023 - Leveraging Event Streaming to Super-Charge your B...
INTERFACE by apidays 2023 - Leveraging Event Streaming to Super-Charge your B...
 
Azure tales: a real world CQRS and ES Deep Dive - Andrea Saltarello
Azure tales: a real world CQRS and ES Deep Dive - Andrea SaltarelloAzure tales: a real world CQRS and ES Deep Dive - Andrea Saltarello
Azure tales: a real world CQRS and ES Deep Dive - Andrea Saltarello
 
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
Spark (Structured) Streaming vs. Kafka Streams - two stream processing platfo...
 
Devoxx university - Kafka de haut en bas
Devoxx university - Kafka de haut en basDevoxx university - Kafka de haut en bas
Devoxx university - Kafka de haut en bas
 
Architecture of a Kafka camus infrastructure
Architecture of a Kafka camus infrastructureArchitecture of a Kafka camus infrastructure
Architecture of a Kafka camus infrastructure
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream Processing
 
Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !Apache Kafka - Scalable Message-Processing and more !
Apache Kafka - Scalable Message-Processing and more !
 
OpenWhisk: Where Did My Servers Go?
OpenWhisk: Where Did My Servers Go?OpenWhisk: Where Did My Servers Go?
OpenWhisk: Where Did My Servers Go?
 
Event Hub (i.e. Kafka) in Modern Data (Analytics) Architecture
Event Hub (i.e. Kafka) in Modern Data (Analytics) ArchitectureEvent Hub (i.e. Kafka) in Modern Data (Analytics) Architecture
Event Hub (i.e. Kafka) in Modern Data (Analytics) Architecture
 

Plus de jazoon13

JAZOON'13 - Joe Justice - Test First Saves The World
JAZOON'13 - Joe Justice - Test First Saves The WorldJAZOON'13 - Joe Justice - Test First Saves The World
JAZOON'13 - Joe Justice - Test First Saves The Worldjazoon13
 
JAZOON'13 - Ulrika Park- User Story telling The Lost Art of User Stories
JAZOON'13 - Ulrika Park- User Story telling The Lost Art of User StoriesJAZOON'13 - Ulrika Park- User Story telling The Lost Art of User Stories
JAZOON'13 - Ulrika Park- User Story telling The Lost Art of User Storiesjazoon13
 
JAZOON'13 - Bartosz Majsak - Git Workshop - Kung Fu
JAZOON'13 - Bartosz Majsak - Git Workshop - Kung FuJAZOON'13 - Bartosz Majsak - Git Workshop - Kung Fu
JAZOON'13 - Bartosz Majsak - Git Workshop - Kung Fujazoon13
 
JAZOON'13 - Thomas Hug & Bartosz Majsak - Git Workshop -Essentials
JAZOON'13 - Thomas Hug & Bartosz Majsak - Git Workshop -EssentialsJAZOON'13 - Thomas Hug & Bartosz Majsak - Git Workshop -Essentials
JAZOON'13 - Thomas Hug & Bartosz Majsak - Git Workshop -Essentialsjazoon13
 
JAZOON'13 - Oliver Zeigermann - Typed JavaScript with TypeScript
JAZOON'13 - Oliver Zeigermann - Typed JavaScript with TypeScriptJAZOON'13 - Oliver Zeigermann - Typed JavaScript with TypeScript
JAZOON'13 - Oliver Zeigermann - Typed JavaScript with TypeScriptjazoon13
 
JAZOON'13 - Andres Almiray - Spock: boldly go where no test has gone before
JAZOON'13 - Andres Almiray - Spock: boldly go where no test has gone beforeJAZOON'13 - Andres Almiray - Spock: boldly go where no test has gone before
JAZOON'13 - Andres Almiray - Spock: boldly go where no test has gone beforejazoon13
 
JAZOON'13 - Andres Almiray - Rocket Propelled Java
JAZOON'13 - Andres Almiray - Rocket Propelled JavaJAZOON'13 - Andres Almiray - Rocket Propelled Java
JAZOON'13 - Andres Almiray - Rocket Propelled Javajazoon13
 
JAZOON'13 - Sven Peters - How to do Kick-Ass Software Development
JAZOON'13 - Sven Peters - How to do Kick-Ass Software DevelopmentJAZOON'13 - Sven Peters - How to do Kick-Ass Software Development
JAZOON'13 - Sven Peters - How to do Kick-Ass Software Developmentjazoon13
 
JAZOON'13 - Nikita Salnikov-Tarnovski - Multiplatform Java application develo...
JAZOON'13 - Nikita Salnikov-Tarnovski - Multiplatform Java application develo...JAZOON'13 - Nikita Salnikov-Tarnovski - Multiplatform Java application develo...
JAZOON'13 - Nikita Salnikov-Tarnovski - Multiplatform Java application develo...jazoon13
 
JAZOON'13 - Pawel Wrzeszcz - Visibility Shift In Distributed Teams
JAZOON'13 - Pawel Wrzeszcz - Visibility Shift In Distributed TeamsJAZOON'13 - Pawel Wrzeszcz - Visibility Shift In Distributed Teams
JAZOON'13 - Pawel Wrzeszcz - Visibility Shift In Distributed Teamsjazoon13
 
JAZOON'13 - Kai Waehner - Hadoop Integration
JAZOON'13 - Kai Waehner - Hadoop IntegrationJAZOON'13 - Kai Waehner - Hadoop Integration
JAZOON'13 - Kai Waehner - Hadoop Integrationjazoon13
 
JAZOON'13 - Sam Brannen - Spring Framework 4.0 - The Next Generation
JAZOON'13 - Sam Brannen - Spring Framework 4.0 - The Next GenerationJAZOON'13 - Sam Brannen - Spring Framework 4.0 - The Next Generation
JAZOON'13 - Sam Brannen - Spring Framework 4.0 - The Next Generationjazoon13
 
JAZOON'13 - Andrej Vckovski - Go synchronized
JAZOON'13 - Andrej Vckovski - Go synchronizedJAZOON'13 - Andrej Vckovski - Go synchronized
JAZOON'13 - Andrej Vckovski - Go synchronizedjazoon13
 
JAZOON'13 - Paul Brauner - A backend developer meets the web: my Dart experience
JAZOON'13 - Paul Brauner - A backend developer meets the web: my Dart experienceJAZOON'13 - Paul Brauner - A backend developer meets the web: my Dart experience
JAZOON'13 - Paul Brauner - A backend developer meets the web: my Dart experiencejazoon13
 
JAZOON'13 - Anatole Tresch - Go for the money (JSR 354) !
JAZOON'13 - Anatole Tresch - Go for the money (JSR 354) !JAZOON'13 - Anatole Tresch - Go for the money (JSR 354) !
JAZOON'13 - Anatole Tresch - Go for the money (JSR 354) !jazoon13
 
JAZOON'13 - Abdelmonaim Remani - The Economies of Scaling Software
JAZOON'13 - Abdelmonaim Remani - The Economies of Scaling SoftwareJAZOON'13 - Abdelmonaim Remani - The Economies of Scaling Software
JAZOON'13 - Abdelmonaim Remani - The Economies of Scaling Softwarejazoon13
 
JAZOON'13 - Stefan Saasen - True Git: The Great Migration
JAZOON'13 - Stefan Saasen - True Git: The Great MigrationJAZOON'13 - Stefan Saasen - True Git: The Great Migration
JAZOON'13 - Stefan Saasen - True Git: The Great Migrationjazoon13
 
JAZOON'13 - Stefan Saasen - Real World Git Workflows
JAZOON'13 - Stefan Saasen - Real World Git WorkflowsJAZOON'13 - Stefan Saasen - Real World Git Workflows
JAZOON'13 - Stefan Saasen - Real World Git Workflowsjazoon13
 

Plus de jazoon13 (18)

JAZOON'13 - Joe Justice - Test First Saves The World
JAZOON'13 - Joe Justice - Test First Saves The WorldJAZOON'13 - Joe Justice - Test First Saves The World
JAZOON'13 - Joe Justice - Test First Saves The World
 
JAZOON'13 - Ulrika Park- User Story telling The Lost Art of User Stories
JAZOON'13 - Ulrika Park- User Story telling The Lost Art of User StoriesJAZOON'13 - Ulrika Park- User Story telling The Lost Art of User Stories
JAZOON'13 - Ulrika Park- User Story telling The Lost Art of User Stories
 
JAZOON'13 - Bartosz Majsak - Git Workshop - Kung Fu
JAZOON'13 - Bartosz Majsak - Git Workshop - Kung FuJAZOON'13 - Bartosz Majsak - Git Workshop - Kung Fu
JAZOON'13 - Bartosz Majsak - Git Workshop - Kung Fu
 
JAZOON'13 - Thomas Hug & Bartosz Majsak - Git Workshop -Essentials
JAZOON'13 - Thomas Hug & Bartosz Majsak - Git Workshop -EssentialsJAZOON'13 - Thomas Hug & Bartosz Majsak - Git Workshop -Essentials
JAZOON'13 - Thomas Hug & Bartosz Majsak - Git Workshop -Essentials
 
JAZOON'13 - Oliver Zeigermann - Typed JavaScript with TypeScript
JAZOON'13 - Oliver Zeigermann - Typed JavaScript with TypeScriptJAZOON'13 - Oliver Zeigermann - Typed JavaScript with TypeScript
JAZOON'13 - Oliver Zeigermann - Typed JavaScript with TypeScript
 
JAZOON'13 - Andres Almiray - Spock: boldly go where no test has gone before
JAZOON'13 - Andres Almiray - Spock: boldly go where no test has gone beforeJAZOON'13 - Andres Almiray - Spock: boldly go where no test has gone before
JAZOON'13 - Andres Almiray - Spock: boldly go where no test has gone before
 
JAZOON'13 - Andres Almiray - Rocket Propelled Java
JAZOON'13 - Andres Almiray - Rocket Propelled JavaJAZOON'13 - Andres Almiray - Rocket Propelled Java
JAZOON'13 - Andres Almiray - Rocket Propelled Java
 
JAZOON'13 - Sven Peters - How to do Kick-Ass Software Development
JAZOON'13 - Sven Peters - How to do Kick-Ass Software DevelopmentJAZOON'13 - Sven Peters - How to do Kick-Ass Software Development
JAZOON'13 - Sven Peters - How to do Kick-Ass Software Development
 
JAZOON'13 - Nikita Salnikov-Tarnovski - Multiplatform Java application develo...
JAZOON'13 - Nikita Salnikov-Tarnovski - Multiplatform Java application develo...JAZOON'13 - Nikita Salnikov-Tarnovski - Multiplatform Java application develo...
JAZOON'13 - Nikita Salnikov-Tarnovski - Multiplatform Java application develo...
 
JAZOON'13 - Pawel Wrzeszcz - Visibility Shift In Distributed Teams
JAZOON'13 - Pawel Wrzeszcz - Visibility Shift In Distributed TeamsJAZOON'13 - Pawel Wrzeszcz - Visibility Shift In Distributed Teams
JAZOON'13 - Pawel Wrzeszcz - Visibility Shift In Distributed Teams
 
JAZOON'13 - Kai Waehner - Hadoop Integration
JAZOON'13 - Kai Waehner - Hadoop IntegrationJAZOON'13 - Kai Waehner - Hadoop Integration
JAZOON'13 - Kai Waehner - Hadoop Integration
 
JAZOON'13 - Sam Brannen - Spring Framework 4.0 - The Next Generation
JAZOON'13 - Sam Brannen - Spring Framework 4.0 - The Next GenerationJAZOON'13 - Sam Brannen - Spring Framework 4.0 - The Next Generation
JAZOON'13 - Sam Brannen - Spring Framework 4.0 - The Next Generation
 
JAZOON'13 - Andrej Vckovski - Go synchronized
JAZOON'13 - Andrej Vckovski - Go synchronizedJAZOON'13 - Andrej Vckovski - Go synchronized
JAZOON'13 - Andrej Vckovski - Go synchronized
 
JAZOON'13 - Paul Brauner - A backend developer meets the web: my Dart experience
JAZOON'13 - Paul Brauner - A backend developer meets the web: my Dart experienceJAZOON'13 - Paul Brauner - A backend developer meets the web: my Dart experience
JAZOON'13 - Paul Brauner - A backend developer meets the web: my Dart experience
 
JAZOON'13 - Anatole Tresch - Go for the money (JSR 354) !
JAZOON'13 - Anatole Tresch - Go for the money (JSR 354) !JAZOON'13 - Anatole Tresch - Go for the money (JSR 354) !
JAZOON'13 - Anatole Tresch - Go for the money (JSR 354) !
 
JAZOON'13 - Abdelmonaim Remani - The Economies of Scaling Software
JAZOON'13 - Abdelmonaim Remani - The Economies of Scaling SoftwareJAZOON'13 - Abdelmonaim Remani - The Economies of Scaling Software
JAZOON'13 - Abdelmonaim Remani - The Economies of Scaling Software
 
JAZOON'13 - Stefan Saasen - True Git: The Great Migration
JAZOON'13 - Stefan Saasen - True Git: The Great MigrationJAZOON'13 - Stefan Saasen - True Git: The Great Migration
JAZOON'13 - Stefan Saasen - True Git: The Great Migration
 
JAZOON'13 - Stefan Saasen - Real World Git Workflows
JAZOON'13 - Stefan Saasen - Real World Git WorkflowsJAZOON'13 - Stefan Saasen - Real World Git Workflows
JAZOON'13 - Stefan Saasen - Real World Git Workflows
 

Dernier

2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Scott Andery
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 

Dernier (20)

2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 

JAZOON'13 - Guide Schmutz - Kafka and Strom Event Processing In Realtime

  • 1. Kafka and Storm – event processing in realtime Guido Schmutz
  • 2. WELCOME Kafka and Storm – event processing in realtime Guido Schmutz
 Jazoon 2013 23.10.2013 BASEL 2 BERN LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013 FREIBURG I.BR. HAMBURG MÜNCHEN STUTTGART WIEN

  • 3. Guido Schmutz •  •  Working for Trivadis for more than 16 years Oracle ACE Director for Fusion Middleware and SOA •  •  Co-Author of different books Consultant, Trainer Software Architect for Java, Oracle, SOA, EDA, BigData und FastData •  •  Member of Trivadis Architecture Board Technology Manager @ Trivadis •  More than 20 years of software development 
 experience •  Contact: guido.schmutz@trivadis.com •  •  Blog: http://guidoschmutz.wordpress.com Twitter: gschmutz 3 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013
  • 4. Our company Trivadis is a market leader in IT consulting, system integration, solution engineering and the provision of IT services focusing on and technologies in Switzerland, Germany and Austria. We offer our services in the following strategic business fields: OPERATION Trivadis Services takes over the interacting operation of your IT systems. 4 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013
  • 5. With over 600 specialists and IT experts in your region Hamburg Düsseldorf Frankfurt Stuttgart Freiburg Wien München Basel Brugg Bern Zurich Lausanne 5 5 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013 12 Trivadis branches and more than 600 employees   200 Service Level Agreements   Over 4,000 training participants   Research and development budget: CHF 5.0 / EUR 4 million   Financially self-supporting and sustainably profitable   Experience from more than 1,900 projects per year at over 800 customers
  • 6. Agenda 1.  Introduction 2.  What is Apache Kafka? 3.  What is Twitter Storm? 4.  Summary 6 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013
  • 7. Big Data Definition (Gartner et al) Characteristics of Big Data: Its Volume, Velocity and Variety in combination Tera-, Peta-, Exa-, Zetta-, Yota- bytes and constantly growing Velocity “Traditional” computing in RDBMS 
 is not scalable enough. 
 We search for “linear scalability” “Only … structured information 
 is not enough” – “95% of produced data in unstructured” + Veracity (IBM) - information uncertainty + Time to action ? – Big Data + Event Processing = Fast Data 7 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013
  • 8. 8 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013
  • 9. Sample – Show the counts of tweets related to Jazoon (using term jazoon) 
 23.10.2013 – 22:20 not showing #jazoon http://54.217.159.208:8484/jazoon-restapi/ 9 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013
  • 10. Velocity §  Velocity requirement examples: §  §  §  §  §  §  §  §  §  §  §  §  10 Recommendation Engine Predictive Analytics Marketing Campaign Analysis Customer Retention and Churn Analysis Social Graph Analysis Capital Markets Analysis Risk Management Rogue Trading Fraud Detection Retail Banking Network Monitoring Research and Development 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013
  • 11. Agenda 1.  Introduction 2.  What is Apache Kafka? 3.  What is Twitter Storm? 4.  Summary 11 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013
  • 12. Apache Kafka •  A distributed publish-subscribe messaging system •  Designed for processing of real time activity stream data (logs, metrics collections, social media streams, …) •  Initially developed at LinkedIn, now part of Apache •  Does not follow JMS Standards and does not use JMS API •  Kafka maintains feeds of messages in topics 12 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013
  • 13. Sample (#1) 1) Use the Twitter Streaming API (Twitter Horsebird Client) to get all the tweets containing the term jazoon Twitter Stream 13 tweet Twitter to Kafka tweet 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013 Kafka Topic Create a Filter Stream for tracking a list of terms
  • 14. Twitter Stream Sample (#1) tweet Twitter to Kafka tweet Kafka Topic Create a fully configured Horsebird Client instance Start the client in multiple threads 14 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013
  • 15. Twitter Stream Sample (#1) tweet Twitter to Kafka tweet Kafka Topic Convert Twitter status to Avro format Create the Kafka producer Send the message to the Kafka topic 15 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013
  • 16. Agenda 1.  Introduction 2.  What is Apache Kafka? 3.  What is Twitter Storm? 4.  Summary 16 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013
  • 17. What is Twitter Storm ? A platform for doing analysis on streams of data as they come in, so you can react to data as it happens. •  A highly distributed real-time computation system •  Provides general primitives to do real-time computation •  To simplify working with queues & workers •  scalable and fault-tolerant •  complementary to Hadoop Originated at Backtype, acquired by Twitter in 2011 Open Sourced late 2011 Part of Apache Incubator since September 2013 17 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013
  • 18. Hadoop vs. Storm Hadoop Storm Batch processing Jobs runs to completion Stateful nodes Scalable Guarantees no data loss Open Source => big batch processing 18 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013 != = Real-time processing Topologies run forever Stateless nodes Scalable Guarantees no data loss Open Source => Fast, reactive, real time procesing
  • 19. Idea: Simplify dealing with queue & workers Scaling is painful – queue partitioning & worker deploy Operational overhead – worker failures & queue backups No guarantees on data processing 19 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013
  • 20. What does Storm provide? §  At least once message processing §  Fault-tolerant §  Horizontal scalability §  No intermediate queues §  Less operational overhead §  „Just works“ §  Broad set of use cases §  Stream processing §  Continous computation §  Distributed RPC 20 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013
  • 21. Main Concepts - Streams Stream •  an unbounded sequence of tuples •  core abstraction in Storm •  Defined with a schema that names the 
 fields in the tuple •  Value must be serializable •  Every stream has an id T T T T T T Subscribes: C & D Emits: Source of stream A Source of stream B 21 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013 T Subscribes: A Emits: C Topology •  graph where each node is a spout or bolt •  edges indicating which bolt subscribes 
 to which streams T Subscribes: A Emits: D Subscribes: A & B Emits: -
  • 22. Main Concepts – Worker, Executor, Task Worker •  executes a subset of a topology, may run one or more executors for one or more components Executor •  thread spawned by worker Task •  performs the actual data processing Worker Process Task Task 22 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013 Task Task
  • 23. Main Concepts - Spout A spout is the component which acts as the source of streams in a topology Generally spouts read tuples from an external source and emit them into the topology Spouts can emit more than one stream 23 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013
  • 24. Main Concepts - Bolt Bolt is doing the processing of the message •  filtering, functions, aggregations, joins, talking to databases…. •  Can do simple stream transformations •  Complex transformations often require multiple steps and therefore multiple bolts Bolts can emit one or more streams 24 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013
  • 25. Sample (#2) 2) Use a Spout to subscribe to the Kafka Topic to get the related tweets Twitter Stream tweet Twitter to Kafka tweet Kafka Topic 25 tweet Kafka
 Spout tweet 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013
  • 26. Kafka Topic Sample (#2) Tweets Kafka
 Spout tweet Use existing implementation of the Kafka Spout from storm-kafka project Create a Kafka configuration 26 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013 Create Kafka spout to connect to the topic
  • 27. Sample (#3) 3) Use a bolt to retrieve the hashtags for each tweet Twitter Stream tweet Twitter to Kafka tweet Kafka Topic 27 tweet Kafka
 Spout tweet 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013 Hashtag
 Splitter hashtag
  • 28. Kafka Topic Sample (#3) Tweets Kafka
 Spout tweet Hashtag
 Splitter hashtag Execute is called for each tuple arriving on the input stream Declares the output fields for the component and is called when bolt is created 28 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013
  • 29. Sample (#4) 4) Use a bolt to count the occurrences of hashtags into a Redis NoSQL DB Twitter Stream Tweets Twitter to Kafka Tweets Kafka Topic 29 Tweets Kafka
 Spout tweet 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013 Hashtag
 Splitter hashtag Hashtag
 Counter
  • 30. Sample (#4) Kafka Topic Tweets Kafka
 Spout tweet Hashtag
 Splitter hashtag Hashtag
 Counter Execute is called for each tuple arriving on the input stream Prepare is called when bolt is created 30 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013
  • 31. Main Concepts - Topology Kafka Topic Tweets Kafka
 Spout Hashtag
 Splitter Shuffle Hashtag
 Counter Fields Hashtag
 Splitter Hashtag
 Counter Each Spout or Bolt are running N instances in parallel Shuffle grouping is random grouping Fields grouping is grouped by value, such that equal value results in equal task All grouping replicates to all tasks Global grouping makes all tuples go to one task None grouping makes bolt run in the same thread as bold/spout it subscribes to Direct grouping producer (task that emits) controls which consumer will receive 31 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013
  • 32. Sample – How does it work ? My presentation „Kafka and Storm – event processing in realtime“ tomorrow 12:00 at Jazoon Zurich #jazoon #storm #kafka CU there! Kafka Topic 32 Kafka
 Spout 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013 Hashtag
 Splitter Hashtag
 Counter Hashtag
 Splitter Hashtag
 Counter
  • 33. Sample – How does it work ? My presentation „Kafka and Storm – event processing in realtime“ tomorrow 12:00 at Jazoon Zurich #jazoon #storm #kafka CU there! Kafka Topic 33 Kafka
 Spout 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013 Hashtag
 Splitter Hashtag
 Splitter jazoon storm kafka Hashtag
 Counter Hashtag
 Counter
  • 34. Sample – How does it work ? My presentation „Kafka and Storm – event processing in realtime“ tomorrow 12:00 at Jazoon Zurich #jazoon #storm #kafka CU there! Kafka Topic 34 Kafka
 Spout 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013 Hashtag
 Splitter Hashtag
 Splitter jazoon storm kafka Hashtag
 Counter Hashtag
 Counter INCR jazoon INCR storm INCR kafka jazoon = 1 storm = 1 kafka = 1
  • 35. Sample (#5) 5) Setup the topology with the necessary groupings (distribution) Create Stream called „tweet-stream“ Register Kafka spout and run by 1 worker Subscribe to stream „tweetstream“ using shuffle grouping Run this bolt by 2 workers Subscribe to „hashtag-splitter“ using fields grouping on field „hashtag“ 35 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013
  • 36. Trident •  High-Level abstraction on top of storm •  Makes it easier to build topologies •  Core data model is the stream •  Processed as a series of batches •  Stream is partitioned among nodes in cluster •  5 kinds of operations in Trident •  Operations that apply locally to each partition and cause no network transfer •  Repartitioning operations that don‘t change the contents •  Aggregation operations that do network transfer •  Operations on grouped streams •  Merges and Joins 36 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013
  • 37. Trident Supports •  •  •  •  •  Joins Aggregations Grouping Functions Filters Similar to Hadoop and Pig/Cascading 37 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013
  • 38. Trident Concepts - Topology Kafka Topic 38 tweet Kafka
 Spout tweet Bolt Hashtag
 Splitter 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013 hashtag local Hashtag
 Normalizer hashtag groupBy Bolt Persistent
 Aggregate
  • 39. Trident Concepts - Function •  takes in a set of input fields and emits zero or more tuples as output •  fields of the output tuple are appended to the original input tuple in the stream •  If a function emits no tuples, the original input tuple is filtered out •  Otherwise the input tuple is duplicated for each output tuple 39 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013
  • 40. Agenda 1.  Introduction 2.  What is Apache Kafka? 3.  What is Twitter Storm? 4.  Summary 40 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013
  • 41. Summary Kafka Storm •  Distributed Scalable Pub/Sub system for Big Data •  Express real-time processing naturally •  Producer => Broker => Consumer of message topics •  Persists messages with ability to rewind •  Consumer decides what he as consumed so far •  Not a Hadoop/MapReduce competitor •  Supports other languages •  Hard to debug •  Object Serialization •  Didn‘t cover §  Reliability through guaranteed message processing §  Distributed RPC §  Storm cluster setup and deploy http://54.217.159.208:8484/jazoon-restapi/ 41 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013
  • 42. Lambda Architecture
 Big Data and Fast Data combined Precompute Precomputed Views information All data Batch recompute Incoming Data Serving Layer batch view batch view Merge Batch Layer Speed Layer Process stream Incremented information Realtime increment query real time view real time view Source: Marz, N. & Warren, J. (2013) Big Data. Manning. 42 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013
  • 43. Lambda Architecture
 Big Data and Fast Data combined one possible product/framework mapping Precompute Precomputed Views information All data Batch recompute Incoming Data batch view batch view Speed Layer Process stream Incremented information Realtime increment 43 Serving Layer Merge Batch Layer 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013 real time view real time view query
  • 44. Resources Sample Code: https://github.com/gschmutz/jazoon-storm-twitter-sample Twitter Streaming API: https://dev.twitter.com/docs/streaming-apis Twitter Horsebird Client (HBC): https://github.com/twitter/hbc Apache Kafka: http://kafka.apache.org/ Storm Website: http://storm-project.net/ Storm Wiki: https://github.com/nathanmarz/storm/wiki Storm Doc: https://github.com/nathanmarz/storm/wiki/Documentation 44 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013
  • 45. Thank You! BASEL 45 BERN LAUSANNE ZÜRICH DÜSSELDORF Trivadis AG Guido Schmutz
 guido.schmutz@trivadis.com FRANKFURT A.M. 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013 FREIBURG I.BR. HAMBURG MÜNCHEN STUTTGART WIEN

  • 46. Lambda Architecture
 Big Data and Fast Data combined Batch Layer Serving Layer Immutable data Batch View B Incoming Data C D A G Speed Layer Data Stream E 46 2013 © Trivadis Kafka and Storm – event processing in realtime 22.10.2013 Realtime View F Query