SlideShare une entreprise Scribd logo
1  sur  40
2013 © Trivadis
BASEL BERN LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MÜNCHEN STUTTGART WIEN

WELCOME Twitter Storm:
Ereignisverarbeitung in
Echtzeit
Guido Schmutz

Java Forum Stuttgart 2013
4.7.2013
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
1
2013 © Trivadis
Trivadis ist führend bei der IT-Beratung, der Systemintegration,
dem Solution-Engineering und der Erbringung von IT-Services
mit Fokussierung auf und Technologien
im D-A-CH-Raum.
Unsere Leistungen erbringen wir aus den strategischen Geschäftsfeldern:
Trivadis Services übernimmt den korrespondierenden Betrieb
Ihrer IT Systeme.
Unser Unternehmen
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
B E T R I E B
2
2013 © Trivadis
Mit über 600 IT- und Fachexperten bei Ihnen vor Ort
3
11 Trivadis Niederlassungen mit

über 600 Mitarbeitenden
200 Service Level Agreements
Mehr als 4'000 Trainingsteilnehmer
Forschungs- und Entwicklungs-
budget: CHF 5.0 / EUR 4 Mio.
Finanziell unabhängig und

nachhaltig profitabel
Erfahrung aus mehr als 1'900
Projekten pro Jahr bei über 800
Kunden
Stand 12/2012
Hamburg
Düsseldorf
Frankfurt
Freiburg
München
Wien
Basel
ZürichBern
Lausanne
3
Stuttgart
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
3
2013 © Trivadis
Guido Schmutz
•  Working for Trivadis for more than 16 years
•  Oracle ACE Director for Fusion Middleware and SOA
•  Co-Author of different books
•  Consultant, Trainer Software Architect for Java, Oracle, SOA,
EDA, BigData und FastData
•  Member of Trivadis Architecture Board
•  Technology Manager @ Trivadis
•  More than 20 years of software development 

experience
•  Contact: guido.schmutz@trivadis.com
•  Blog: http://guidoschmutz.wordpress.com
•  Twitter: gschmutz
4.7.2013
4
Twitter Storm: Ereignisverarbeitung in Echtzeit
2013 © Trivadis
Big Data Definition (Gartner et al)
14.02.2013
Big Data 4 Sales
5
Velocity
Tera-, Peta-, Exa-, Zetta-, Yota- bytes and constantly growing
“Traditional” computing in RDBMS 

is not scalable enough. 

We search for “linear scalability”
“Only … structured information 

is not enough” – “95% of produced data in
unstructured”
Characteristics of Big Data: Its
Volume, Velocity and Variety in
combination
+ Veracity (IBM) - information uncertainty
+ Time to action ? – Big Data + Event Processing = Fast Data
2013 © Trivadis
14.02.2013
Big Data 4 Sales
6
2013 © Trivadis
Velocity
24. April 2013
Big Data und Fast Data
7
§  Velocity requirement examples:
§  Recommendation Engine
§  Predictive Analytics
§  Marketing Campaign Analysis
§  Customer Retention and Churn Analysis
§  Social Graph Analysis
§  Capital Markets Analysis
§  Risk Management
§  Rogue Trading
§  Fraud Detection
§  Retail Banking
§  Network Monitoring
§  Research and Development
2013 © Trivadis
Agenda
1.  What is Twitter Storm
2.  Main Concepts
3.  Trident
4.  BigData and FastData combined – Lambda Architecture
5.  Summary
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
8
2013 © Trivadis
What is Twitter Storm ?
A platform for doing analysis on streams of data as they come in, so you
can react to data as it happens.
•  A highly distributed real-time computation system
•  Provides general primitives to do real-time computation
•  To simplify working with queues & workers
•  scalable and fault-tolerant
•  complementary to Hadoop
Originated at Backtype, acquired by Twitter in 2011
Open Sourced late 2011
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
9
2013 © Trivadis
Hadoop vs. Storm
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
10
Storm
Real-time processing
Topologies run forever
Stateless nodes
Scalable
Guarantees no data loss
Open Source
=> Fast, reactive, real time procesing
Hadoop
Batch processing
Jobs runs to completion
Stateful nodes
Scalable
Guarantees no data loss
Open Source
=> big batch processing
=
!=
2013 © Trivadis
Idea: Simplify dealing with queue & workers
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
11
Scaling is painful – queue partitioning & worker deploy
Operational overhead – worker failures & queue backups
No guarantees on data processing
2013 © Trivadis
What does Storm provide?
§  At least once message processing
§  Fault-tolerant
§  Horizontal scalability
§  No intermediate queues
§  Less operational overhead
§  „Just works“
§  Broad set of use cases
§  Stream processing
§  Continous computation
§  Distributed RPC
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
12
2013 © Trivadis
Agenda
1.  What is Twitter Storm
2.  Main Concepts
3.  Trident
4.  BigData and FastData combined – Lambda Architecture
5.  Summary
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
13
2013 © Trivadis
Main Concepts – Shown based on Twitter example
Show the tweets related to Java Forum (use hashtag #JFS2013) and show
the count of the hashtags used
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
14
not showing #JFS2013
2013 © Trivadis
Main Concepts - Streams
Stream
•  an unbounded sequence of tuples
•  core abstraction in Storm
•  Defined with a schema that names the 

fields in the tuple
•  Value must be serializable
•  Every stream has an id
Topology
•  graph where each node is a spout or bolt
•  edges indicating which bolt subscribes 

to which streams
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
15
Source of
stream A
Source of
stream B
Subscribes: A
Emits: C
Subscribes: A
Emits: D
Subscribes: A & B
Emits: -
Subscribes: C & D
Emits: -
T T T T T T T T
2013 © Trivadis
Worker
•  executes a subset of a topology, may run one or more executors for one or
more components
Executor
•  thread spawned by worker
Task
•  performs the actual data processing
Main Concepts – Worker, Executor, Task
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
16
Worker Process
Task
Task
Task
Task
2013 © Trivadis
Main Concepts - Spout
A spout is the component which acts as the source of streams in a
topology
Generally spouts read tuples from an external source and emit them into
the topology
Spouts can emit more than one stream
Step 1: Use a Spout to subscribe to the Twitter Filter Streaming API to get
all the tweets containing the hashtag #JFS2013
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
17
Twitter
Stream
Tweets tweet
Tweet

Stream
2013 © Trivadis
Main Concepts - Spout
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
18
Twitter
Stream
Tweets tweet
Tweet

Stream
By calling this method, Storm requests that the
Spout emits tuples to the output collector
Called when a task for this
component is initialized
Used to declare streams and their schmeas. it is possible to declare
serveral streams and specifying the one to use when emiting tuples
2013 © Trivadis
Bolt is doing the processing of the message
•  filtering, functions, aggregations, joins, talking to databases….
•  Can do simple stream transformations
•  Complex transformations often require multiple steps and therefore multiple
bolts
Bolts can emit one or more streams
Step 2: Use a bolt to retrieve the hashtags for each tweet
Main Concepts - Bolt
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
19
Twitter
Stream
Tweets tweet
Hashtag

Splitter
Tweet

Stream
hashtag
2013 © Trivadis
Main Concepts - Bolt
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
20
Twitter
Stream
Tweets tweet
Hashtag

Splitter
Tweet

Stream
hashtag
Execute is called for each tuple
arriving on the input stream
Declares the output fields for the component and
is called when bolt is created
2013 © Trivadis
Main Concepts - Bolt
Step 3: Use a bolt to count the occurences of the hashtag in Redis NoSQL
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
21
Twitter
Stream
Tweets tweet
Hashtag

Splitter
Tweet

Stream
hashtag Hashtag

Counter
2013 © Trivadis
Main Concepts - Bolt
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
22
Twitter
Stream
Tweets tweet
Hashtag

Splitter
Tweet

Stream
hashtag Hashtag

Counter
Prepare is called when bolt is
created
Execute is called for each tuple
arriving on the input stream
2013 © Trivadis
Main Concepts - Topology
Step 4: Setup the topology with the necessary groupings (distribution)
Each Spout or Bolt are running N instances in parallel
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
23
Twitter
Stream
Tweets
Hashtag

Splitter
Tweet

Stream
Hashtag

Counter
Hashtag

Splitter
Hashtag

Counter
Shuffle Fields
Shuffle grouping is random grouping
Fields grouping is grouped by value, such that equal value results in equal task
All grouping replicates to all tasks
Global grouping makes all tuples go to one task
None grouping makes bolt run in the same thread as bold/spout it subscribes to
Direct grouping producer (task that emits) controls which consumer will receive
2013 © Trivadis
Main Concepts - Topology
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
24
Create Stream called „tweet-stream“
Run this bolt by 2 workers
Run this spout
by 1 worker
Subscribe to stream „tweet-stream“
using shuffle grouping
Create stream called „hashtag-splitter“
Subscribe to stream „hashtag-splitter“
using fields grouping on field „hashtag“
2013 © Trivadis
Sample – How does it work ?
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
25
Twitter
Stream
Mein Präsentation
„Twitter Storm:
Ereignisverarbeitung
in Echtzeit“ Morgen
um 15:35 am Java
Forum Stuttgart
#JFS2013 #storm
#fastdata CU there!
Hashtag

Splitter
Tweet

Stream
Hashtag

Counter
Hashtag

Splitter
Hashtag

Counter
2013 © Trivadis
Sample – How does it work ?
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
26
Twitter
Stream
Mein Präsentation
„Twitter Storm:
Ereignisverarbeitung
in Echtzeit“ Morgen
um 15:35 am Java
Forum Stuttgart
#JFS2013 #storm
#fastdata CU there!
Hashtag

Splitter
Tweet

Stream
Hashtag

Counter
Hashtag

Splitter
Hashtag

Counter
storm
fastdata
jsf2013
2013 © Trivadis
Sample – How does it work ?
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
27
Twitter
Stream
Mein Präsentation
„Twitter Storm:
Ereignisverarbeitung
in Echtzeit“ Morgen
um 15:35 am Java
Forum Stuttgart
#JFS2013 #storm
#fastdata CU there!
Hashtag

Splitter
Tweet

Stream
Hashtag

Counter
Hashtag

Splitter
Hashtag

Counter
storm
fastdata
jfs2013
INCR
jfs2013
INCR
storm
INCR
fastdata storm = 1
fastdata =1
jfs2013= 1
2013 © Trivadis
Agenda
1.  What is Twitter Storm
2.  Main Concepts
3.  Trident
4.  BigData and FastData combined – Lambda Architecture
5.  Summary
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
28
2013 © Trivadis
What is Trident?
•  High-Level abstraction on top of storm
•  Makes it easier to build topologies
•  Core data model is the stream
•  Processed as a series of batches
•  Stream is partitioned among nodes in cluster
•  5 kinds of operations in Trident
•  Operations that apply locally to each partition and cause no network
transfer
•  Repartitioning operations that don‘t change the contents
•  Aggregation operations that do network transfer
•  Operations on grouped streams
•  Merges and Joins
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
29
2013 © Trivadis
What is Trident?
Supports
•  Joins
•  Aggregations
•  Grouping
•  Functions
•  Filters
Similar to Hadoop and Pig/Cascading
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
30
2013 © Trivadis
Trident Concepts - Topology
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
31
Twitter
Stream
Tweets tweet
Hashtag

Splitter
Tweet

Stream
hashtag Hashtag

Normalizer
Persistent

Aggregate
hashtag
groupBylocal
Bolt Bolt
2013 © Trivadis
Trident Concepts - Function
•  takes in a set of input fields and emits zero or more tuples as output
•  fields of the output tuple are appended to the original input tuple in the
stream
•  If a function emits no tuples, the original input tuple is filtered out
•  Otherwise the input tuple is duplicated for each output tuple
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
32
2013 © Trivadis
Agenda
1.  What is Twitter Storm
2.  Main Concepts
3.  Trident
4.  BigData and FastData combined – Lambda Architecture
5.  Summary
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
33
2013 © Trivadis
Lambda Architecture
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
34
Immutable
data
Batch
View
Query
Data
Stream
Realtime
View
Incoming
Data
Serving Layer
Speed Layer
Batch Layer
A
B
C D
E
F
G
2013 © Trivadis
Lambda Architecture
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
35
Speed Layer
Precompute
Views
query
Source: Marz, N. & Warren, J. (2013) Big Data. Manning.
Batch Layer
Precomputed
information
All data
Incremented
information
Process stream
Incoming
Data
Batch
recompute
Realtime
increment
Serving Layer
batch view
batch view
real time view
real time view
Merge
2013 © Trivadis
Lambda Architecture
24. April 2013
Big Data und Fast Data
36
one possible product/framework mapping
Speed Layer
Precompute
Views
query
Batch Layer
Precomputed
information
All data
Incremented
information
Process stream
Incoming
Data
Batch
recompute
Realtime
increment
Serving Layer
batch view
batch view
real time view
real time view
Merge
2013 © Trivadis
Agenda
1.  What is Twitter Storm
2.  Main Concepts
3.  Trident
4.  BigData and FastData combined – Lambda Architecture
5.  Summary
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
37
2013 © Trivadis
Summary
Express real-time processing naturally
Not a Hadoop/MapReduce competitor
Supports other languages
Hard to debug
Object Serialization
Didn‘t cover
§  Reliability through guaranteed message processing
§  Distributed RPC
§  Storm cluster setup and deploy
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
38
2013 © Trivadis
Resources
Website (http://storm-project.net/)
Wiki (https://github.com/nathanmarz/storm/wiki)
Doc (https://github.com/nathanmarz/storm/wiki/Documentation)
Storm-starter project (https://github.com/nathanmarz/storm-starter)
Mailing list (http://groups.google.com/group/storm-user)
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
39
2013 © Trivadis
BASEL BERN LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MÜNCHEN STUTTGART WIEN

Thank You!
Trivadis AG
Guido Schmutz

guido.schmutz@trivadis.com
4.7.2013
Twitter Storm: Ereignisverarbeitung in Echtzeit
40

Contenu connexe

Tendances

From Events to Networks: Time Series Analysis on Scale
From Events to Networks: Time Series Analysis on ScaleFrom Events to Networks: Time Series Analysis on Scale
From Events to Networks: Time Series Analysis on ScaleDr. Mirko Kämpf
 
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...SoftServe
 
Webinar: Fighting Fraud with Graph Databases
Webinar: Fighting Fraud with Graph DatabasesWebinar: Fighting Fraud with Graph Databases
Webinar: Fighting Fraud with Graph DatabasesDataStax
 
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020Databricks
 
Designing a Distributed Cloud Database for Dummies
Designing a Distributed Cloud Database for DummiesDesigning a Distributed Cloud Database for Dummies
Designing a Distributed Cloud Database for DummiesDataStax
 
How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...
How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...
How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...Kai Wähner
 
"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013
"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013
"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013Kai Wähner
 
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step JourneyWebinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step JourneyDataStax
 
Data Democratization at Nubank
 Data Democratization at Nubank Data Democratization at Nubank
Data Democratization at NubankDatabricks
 
Webinar - Fighting Bank Fraud with Real-time Graph Database
Webinar - Fighting Bank Fraud with Real-time Graph Database Webinar - Fighting Bank Fraud with Real-time Graph Database
Webinar - Fighting Bank Fraud with Real-time Graph Database DataStax
 
ProtectWise Revolutionizes Enterprise Network Security in the Cloud with Data...
ProtectWise Revolutionizes Enterprise Network Security in the Cloud with Data...ProtectWise Revolutionizes Enterprise Network Security in the Cloud with Data...
ProtectWise Revolutionizes Enterprise Network Security in the Cloud with Data...DataStax Academy
 
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...Mark Rittman
 
Finding the needle in the haystack: how Nestle is leveraging big data to defe...
Finding the needle in the haystack: how Nestle is leveraging big data to defe...Finding the needle in the haystack: how Nestle is leveraging big data to defe...
Finding the needle in the haystack: how Nestle is leveraging big data to defe...Big Data Spain
 
Webinar: Customer Experience in Banking - a CTO's Perspective
Webinar: Customer Experience in Banking - a CTO's PerspectiveWebinar: Customer Experience in Banking - a CTO's Perspective
Webinar: Customer Experience in Banking - a CTO's PerspectiveDataStax
 
Extending Data Lake using the Lambda Architecture June 2015
Extending Data Lake using the Lambda Architecture June 2015Extending Data Lake using the Lambda Architecture June 2015
Extending Data Lake using the Lambda Architecture June 2015DataWorks Summit
 
Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...
Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...
Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...Data Con LA
 
Data stax webinar cassandra and titandb insights into datastax graph strategy...
Data stax webinar cassandra and titandb insights into datastax graph strategy...Data stax webinar cassandra and titandb insights into datastax graph strategy...
Data stax webinar cassandra and titandb insights into datastax graph strategy...DataStax
 
Processing Twitter Stream with Oracle Event Processing (OEP)
Processing Twitter Stream with Oracle Event Processing (OEP)Processing Twitter Stream with Oracle Event Processing (OEP)
Processing Twitter Stream with Oracle Event Processing (OEP)Guido Schmutz
 

Tendances (20)

From Events to Networks: Time Series Analysis on Scale
From Events to Networks: Time Series Analysis on ScaleFrom Events to Networks: Time Series Analysis on Scale
From Events to Networks: Time Series Analysis on Scale
 
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
Big Data Analytics: Reference Architectures and Case Studies by Serhiy Haziye...
 
Webinar: Fighting Fraud with Graph Databases
Webinar: Fighting Fraud with Graph DatabasesWebinar: Fighting Fraud with Graph Databases
Webinar: Fighting Fraud with Graph Databases
 
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
Building a Streaming Microservices Architecture - Data + AI Summit EU 2020
 
Designing a Distributed Cloud Database for Dummies
Designing a Distributed Cloud Database for DummiesDesigning a Distributed Cloud Database for Dummies
Designing a Distributed Cloud Database for Dummies
 
How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...
How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...
How to Apply Machine Learning with R, H20, Apache Spark MLlib or PMML to Real...
 
"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013
"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013
"Big Data beyond Apache Hadoop - How to Integrate ALL your Data" - JavaOne 2013
 
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step JourneyWebinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
 
Data Democratization at Nubank
 Data Democratization at Nubank Data Democratization at Nubank
Data Democratization at Nubank
 
Webinar - Fighting Bank Fraud with Real-time Graph Database
Webinar - Fighting Bank Fraud with Real-time Graph Database Webinar - Fighting Bank Fraud with Real-time Graph Database
Webinar - Fighting Bank Fraud with Real-time Graph Database
 
ProtectWise Revolutionizes Enterprise Network Security in the Cloud with Data...
ProtectWise Revolutionizes Enterprise Network Security in the Cloud with Data...ProtectWise Revolutionizes Enterprise Network Security in the Cloud with Data...
ProtectWise Revolutionizes Enterprise Network Security in the Cloud with Data...
 
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
 
Smart data for a predictive bank
Smart data for a predictive bankSmart data for a predictive bank
Smart data for a predictive bank
 
Finding the needle in the haystack: how Nestle is leveraging big data to defe...
Finding the needle in the haystack: how Nestle is leveraging big data to defe...Finding the needle in the haystack: how Nestle is leveraging big data to defe...
Finding the needle in the haystack: how Nestle is leveraging big data to defe...
 
Webinar: Customer Experience in Banking - a CTO's Perspective
Webinar: Customer Experience in Banking - a CTO's PerspectiveWebinar: Customer Experience in Banking - a CTO's Perspective
Webinar: Customer Experience in Banking - a CTO's Perspective
 
Extending Data Lake using the Lambda Architecture June 2015
Extending Data Lake using the Lambda Architecture June 2015Extending Data Lake using the Lambda Architecture June 2015
Extending Data Lake using the Lambda Architecture June 2015
 
Intuit Analytics Cloud 101
Intuit Analytics Cloud 101Intuit Analytics Cloud 101
Intuit Analytics Cloud 101
 
Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...
Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...
Big Data Day LA 2016/ NoSQL track - Architecting Real Life IoT Architecture, ...
 
Data stax webinar cassandra and titandb insights into datastax graph strategy...
Data stax webinar cassandra and titandb insights into datastax graph strategy...Data stax webinar cassandra and titandb insights into datastax graph strategy...
Data stax webinar cassandra and titandb insights into datastax graph strategy...
 
Processing Twitter Stream with Oracle Event Processing (OEP)
Processing Twitter Stream with Oracle Event Processing (OEP)Processing Twitter Stream with Oracle Event Processing (OEP)
Processing Twitter Stream with Oracle Event Processing (OEP)
 

En vedette

HCatalog: Table Management for Hadoop - CHUG - 20120917
HCatalog: Table Management for Hadoop - CHUG - 20120917HCatalog: Table Management for Hadoop - CHUG - 20120917
HCatalog: Table Management for Hadoop - CHUG - 20120917Chicago Hadoop Users Group
 
Making Big Data Analytics Interactive and Real-­Time
 Making Big Data Analytics Interactive and Real-­Time Making Big Data Analytics Interactive and Real-­Time
Making Big Data Analytics Interactive and Real-­TimeSeven Nguyen
 
Monitoring MySQL with OpenTSDB
Monitoring MySQL with OpenTSDBMonitoring MySQL with OpenTSDB
Monitoring MySQL with OpenTSDBGeoffrey Anderson
 
Using Morphlines for On-the-Fly ETL
Using Morphlines for On-the-Fly ETLUsing Morphlines for On-the-Fly ETL
Using Morphlines for On-the-Fly ETLCloudera, Inc.
 
Deploying Apache Flume to enable low-latency analytics
Deploying Apache Flume to enable low-latency analyticsDeploying Apache Flume to enable low-latency analytics
Deploying Apache Flume to enable low-latency analyticsDataWorks Summit
 
A real time architecture using Hadoop and Storm @ FOSDEM 2013
A real time architecture using Hadoop and Storm @ FOSDEM 2013A real time architecture using Hadoop and Storm @ FOSDEM 2013
A real time architecture using Hadoop and Storm @ FOSDEM 2013Nathan Bijnens
 
HBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUpon
HBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUponHBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUpon
HBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUponCloudera, Inc.
 
Apache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data ApplicationsApache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data ApplicationsHortonworks
 
Integration of Hive and HBase
Integration of Hive and HBaseIntegration of Hive and HBase
Integration of Hive and HBaseHortonworks
 
Realtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and HadoopRealtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and HadoopDataWorks Summit
 
Storm@Twitter, SIGMOD 2014
Storm@Twitter, SIGMOD 2014Storm@Twitter, SIGMOD 2014
Storm@Twitter, SIGMOD 2014Karthik Ramasamy
 
Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm Chandler Huang
 
Storm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computationStorm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computationnathanmarz
 

En vedette (16)

HCatalog: Table Management for Hadoop - CHUG - 20120917
HCatalog: Table Management for Hadoop - CHUG - 20120917HCatalog: Table Management for Hadoop - CHUG - 20120917
HCatalog: Table Management for Hadoop - CHUG - 20120917
 
Hadoop basics
Hadoop basicsHadoop basics
Hadoop basics
 
Making Big Data Analytics Interactive and Real-­Time
 Making Big Data Analytics Interactive and Real-­Time Making Big Data Analytics Interactive and Real-­Time
Making Big Data Analytics Interactive and Real-­Time
 
Monitoring MySQL with OpenTSDB
Monitoring MySQL with OpenTSDBMonitoring MySQL with OpenTSDB
Monitoring MySQL with OpenTSDB
 
Using Morphlines for On-the-Fly ETL
Using Morphlines for On-the-Fly ETLUsing Morphlines for On-the-Fly ETL
Using Morphlines for On-the-Fly ETL
 
Intro to Pig UDF
Intro to Pig UDFIntro to Pig UDF
Intro to Pig UDF
 
Deploying Apache Flume to enable low-latency analytics
Deploying Apache Flume to enable low-latency analyticsDeploying Apache Flume to enable low-latency analytics
Deploying Apache Flume to enable low-latency analytics
 
A real time architecture using Hadoop and Storm @ FOSDEM 2013
A real time architecture using Hadoop and Storm @ FOSDEM 2013A real time architecture using Hadoop and Storm @ FOSDEM 2013
A real time architecture using Hadoop and Storm @ FOSDEM 2013
 
HBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUpon
HBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUponHBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUpon
HBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUpon
 
Apache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data ApplicationsApache Hadoop YARN - Enabling Next Generation Data Applications
Apache Hadoop YARN - Enabling Next Generation Data Applications
 
Spark and shark
Spark and sharkSpark and shark
Spark and shark
 
Integration of Hive and HBase
Integration of Hive and HBaseIntegration of Hive and HBase
Integration of Hive and HBase
 
Realtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and HadoopRealtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and Hadoop
 
Storm@Twitter, SIGMOD 2014
Storm@Twitter, SIGMOD 2014Storm@Twitter, SIGMOD 2014
Storm@Twitter, SIGMOD 2014
 
Introduction to Storm
Introduction to Storm Introduction to Storm
Introduction to Storm
 
Storm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computationStorm: distributed and fault-tolerant realtime computation
Storm: distributed and fault-tolerant realtime computation
 

Similaire à Twitter Storm: Ereignisverarbeitung in Echtzeit

Kafka and Storm - event processing in realtime
Kafka and Storm - event processing in realtimeKafka and Storm - event processing in realtime
Kafka and Storm - event processing in realtimeGuido Schmutz
 
JAZOON'13 - Guide Schmutz - Kafka and Strom Event Processing In Realtime
JAZOON'13 - Guide Schmutz - Kafka and Strom Event Processing In RealtimeJAZOON'13 - Guide Schmutz - Kafka and Strom Event Processing In Realtime
JAZOON'13 - Guide Schmutz - Kafka and Strom Event Processing In Realtimejazoon13
 
Processing Twitter Stream with Oracle Event Processing (OEP)
Processing Twitter Stream with Oracle Event Processing (OEP)Processing Twitter Stream with Oracle Event Processing (OEP)
Processing Twitter Stream with Oracle Event Processing (OEP)Trivadis
 
Blueprints for the analysis of social media
Blueprints for the analysis of social mediaBlueprints for the analysis of social media
Blueprints for the analysis of social mediaGuido Schmutz
 
Implement a Universal Data Distribution Architecture to Manage All Streaming ...
Implement a Universal Data Distribution Architecture to Manage All Streaming ...Implement a Universal Data Distribution Architecture to Manage All Streaming ...
Implement a Universal Data Distribution Architecture to Manage All Streaming ...Timothy Spann
 
Software Factory & TVD-REN the Vaadin framework of Trivadis
Software Factory & TVD-REN the Vaadin framework of TrivadisSoftware Factory & TVD-REN the Vaadin framework of Trivadis
Software Factory & TVD-REN the Vaadin framework of TrivadisClaude-Alain Glauser
 
Apache Storm vs. Spark Streaming – two Stream Processing Platforms compared
Apache Storm vs. Spark Streaming – two Stream Processing Platforms comparedApache Storm vs. Spark Streaming – two Stream Processing Platforms compared
Apache Storm vs. Spark Streaming – two Stream Processing Platforms comparedGuido Schmutz
 
Apache Storm vs. Spark Streaming - two stream processing platforms compared
Apache Storm vs. Spark Streaming - two stream processing platforms comparedApache Storm vs. Spark Streaming - two stream processing platforms compared
Apache Storm vs. Spark Streaming - two stream processing platforms comparedGuido Schmutz
 
How a Time Series Database Contributes to a Decentralized Cloud Object Storag...
How a Time Series Database Contributes to a Decentralized Cloud Object Storag...How a Time Series Database Contributes to a Decentralized Cloud Object Storag...
How a Time Series Database Contributes to a Decentralized Cloud Object Storag...InfluxData
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream ProcessingGuido Schmutz
 
Processing Twitter Events in Real-Time with Oracle Event Processing (OEP) 12c
Processing Twitter Events in Real-Time with Oracle Event Processing (OEP) 12cProcessing Twitter Events in Real-Time with Oracle Event Processing (OEP) 12c
Processing Twitter Events in Real-Time with Oracle Event Processing (OEP) 12cGuido Schmutz
 
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...Impetus Technologies
 
Curated "Cloud Design Patterns" for Call Center Platforms
Curated "Cloud Design Patterns" for Call Center PlatformsCurated "Cloud Design Patterns" for Call Center Platforms
Curated "Cloud Design Patterns" for Call Center PlatformsAlejandro Rios Peña
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream ProcessingGuido Schmutz
 
Evolution of Monitoring and Prometheus (Dublin 2018)
Evolution of Monitoring and Prometheus (Dublin 2018)Evolution of Monitoring and Prometheus (Dublin 2018)
Evolution of Monitoring and Prometheus (Dublin 2018)Brian Brazil
 
Event-Processing-und-BigData-kombiniert-guido_schmutz
Event-Processing-und-BigData-kombiniert-guido_schmutzEvent-Processing-und-BigData-kombiniert-guido_schmutz
Event-Processing-und-BigData-kombiniert-guido_schmutzTrivadis
 
S2DS London 2015 - Hadoop Real World
S2DS London 2015 - Hadoop Real WorldS2DS London 2015 - Hadoop Real World
S2DS London 2015 - Hadoop Real WorldSean Roberts
 
Multiple awr reports_parser
Multiple awr reports_parserMultiple awr reports_parser
Multiple awr reports_parserJacques Kostic
 

Similaire à Twitter Storm: Ereignisverarbeitung in Echtzeit (20)

Kafka and Storm - event processing in realtime
Kafka and Storm - event processing in realtimeKafka and Storm - event processing in realtime
Kafka and Storm - event processing in realtime
 
JAZOON'13 - Guide Schmutz - Kafka and Strom Event Processing In Realtime
JAZOON'13 - Guide Schmutz - Kafka and Strom Event Processing In RealtimeJAZOON'13 - Guide Schmutz - Kafka and Strom Event Processing In Realtime
JAZOON'13 - Guide Schmutz - Kafka and Strom Event Processing In Realtime
 
Processing Twitter Stream with Oracle Event Processing (OEP)
Processing Twitter Stream with Oracle Event Processing (OEP)Processing Twitter Stream with Oracle Event Processing (OEP)
Processing Twitter Stream with Oracle Event Processing (OEP)
 
Blueprints for the analysis of social media
Blueprints for the analysis of social mediaBlueprints for the analysis of social media
Blueprints for the analysis of social media
 
Implement a Universal Data Distribution Architecture to Manage All Streaming ...
Implement a Universal Data Distribution Architecture to Manage All Streaming ...Implement a Universal Data Distribution Architecture to Manage All Streaming ...
Implement a Universal Data Distribution Architecture to Manage All Streaming ...
 
Software Factory & TVD-REN the Vaadin framework of Trivadis
Software Factory & TVD-REN the Vaadin framework of TrivadisSoftware Factory & TVD-REN the Vaadin framework of Trivadis
Software Factory & TVD-REN the Vaadin framework of Trivadis
 
Apache Storm vs. Spark Streaming – two Stream Processing Platforms compared
Apache Storm vs. Spark Streaming – two Stream Processing Platforms comparedApache Storm vs. Spark Streaming – two Stream Processing Platforms compared
Apache Storm vs. Spark Streaming – two Stream Processing Platforms compared
 
Apache Storm vs. Spark Streaming - two stream processing platforms compared
Apache Storm vs. Spark Streaming - two stream processing platforms comparedApache Storm vs. Spark Streaming - two stream processing platforms compared
Apache Storm vs. Spark Streaming - two stream processing platforms compared
 
How a Time Series Database Contributes to a Decentralized Cloud Object Storag...
How a Time Series Database Contributes to a Decentralized Cloud Object Storag...How a Time Series Database Contributes to a Decentralized Cloud Object Storag...
How a Time Series Database Contributes to a Decentralized Cloud Object Storag...
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream Processing
 
Processing Twitter Events in Real-Time with Oracle Event Processing (OEP) 12c
Processing Twitter Events in Real-Time with Oracle Event Processing (OEP) 12cProcessing Twitter Events in Real-Time with Oracle Event Processing (OEP) 12c
Processing Twitter Events in Real-Time with Oracle Event Processing (OEP) 12c
 
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...
Real-time Streaming Analytics for Enterprises based on Apache Storm - Impetus...
 
Curated "Cloud Design Patterns" for Call Center Platforms
Curated "Cloud Design Patterns" for Call Center PlatformsCurated "Cloud Design Patterns" for Call Center Platforms
Curated "Cloud Design Patterns" for Call Center Platforms
 
Introduction to Stream Processing
Introduction to Stream ProcessingIntroduction to Stream Processing
Introduction to Stream Processing
 
Monitoring in 2017 - TIAD Camp Docker
Monitoring in 2017 - TIAD Camp DockerMonitoring in 2017 - TIAD Camp Docker
Monitoring in 2017 - TIAD Camp Docker
 
Evolution of Monitoring and Prometheus (Dublin 2018)
Evolution of Monitoring and Prometheus (Dublin 2018)Evolution of Monitoring and Prometheus (Dublin 2018)
Evolution of Monitoring and Prometheus (Dublin 2018)
 
Event-Processing-und-BigData-kombiniert-guido_schmutz
Event-Processing-und-BigData-kombiniert-guido_schmutzEvent-Processing-und-BigData-kombiniert-guido_schmutz
Event-Processing-und-BigData-kombiniert-guido_schmutz
 
S2DS London 2015 - Hadoop Real World
S2DS London 2015 - Hadoop Real WorldS2DS London 2015 - Hadoop Real World
S2DS London 2015 - Hadoop Real World
 
Multiple awr reports_parser
Multiple awr reports_parserMultiple awr reports_parser
Multiple awr reports_parser
 
Distributed Tracing
Distributed TracingDistributed Tracing
Distributed Tracing
 

Plus de Guido Schmutz

30 Minutes to the Analytics Platform with Infrastructure as Code
30 Minutes to the Analytics Platform with Infrastructure as Code30 Minutes to the Analytics Platform with Infrastructure as Code
30 Minutes to the Analytics Platform with Infrastructure as CodeGuido Schmutz
 
Event Broker (Kafka) in a Modern Data Architecture
Event Broker (Kafka) in a Modern Data ArchitectureEvent Broker (Kafka) in a Modern Data Architecture
Event Broker (Kafka) in a Modern Data ArchitectureGuido Schmutz
 
Big Data, Data Lake, Fast Data - Dataserialiation-Formats
Big Data, Data Lake, Fast Data - Dataserialiation-FormatsBig Data, Data Lake, Fast Data - Dataserialiation-Formats
Big Data, Data Lake, Fast Data - Dataserialiation-FormatsGuido Schmutz
 
ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!Guido Schmutz
 
Kafka as your Data Lake - is it Feasible?
Kafka as your Data Lake - is it Feasible?Kafka as your Data Lake - is it Feasible?
Kafka as your Data Lake - is it Feasible?Guido Schmutz
 
Event Hub (i.e. Kafka) in Modern Data Architecture
Event Hub (i.e. Kafka) in Modern Data ArchitectureEvent Hub (i.e. Kafka) in Modern Data Architecture
Event Hub (i.e. Kafka) in Modern Data ArchitectureGuido Schmutz
 
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaSolutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaGuido Schmutz
 
Event Hub (i.e. Kafka) in Modern Data (Analytics) Architecture
Event Hub (i.e. Kafka) in Modern Data (Analytics) ArchitectureEvent Hub (i.e. Kafka) in Modern Data (Analytics) Architecture
Event Hub (i.e. Kafka) in Modern Data (Analytics) ArchitectureGuido Schmutz
 
Building Event Driven (Micro)services with Apache Kafka
Building Event Driven (Micro)services with Apache KafkaBuilding Event Driven (Micro)services with Apache Kafka
Building Event Driven (Micro)services with Apache KafkaGuido Schmutz
 
Location Analytics - Real-Time Geofencing using Apache Kafka
Location Analytics - Real-Time Geofencing using Apache KafkaLocation Analytics - Real-Time Geofencing using Apache Kafka
Location Analytics - Real-Time Geofencing using Apache KafkaGuido Schmutz
 
Solutions for bi-directional integration between Oracle RDBMS and Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS and Apache KafkaSolutions for bi-directional integration between Oracle RDBMS and Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS and Apache KafkaGuido Schmutz
 
What is Apache Kafka? Why is it so popular? Should I use it?
What is Apache Kafka? Why is it so popular? Should I use it?What is Apache Kafka? Why is it so popular? Should I use it?
What is Apache Kafka? Why is it so popular? Should I use it?Guido Schmutz
 
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaSolutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaGuido Schmutz
 
Location Analytics Real-Time Geofencing using Kafka
Location Analytics Real-Time Geofencing using KafkaLocation Analytics Real-Time Geofencing using Kafka
Location Analytics Real-Time Geofencing using KafkaGuido Schmutz
 
Streaming Visualisation
Streaming VisualisationStreaming Visualisation
Streaming VisualisationGuido Schmutz
 
Kafka as an event store - is it good enough?
Kafka as an event store - is it good enough?Kafka as an event store - is it good enough?
Kafka as an event store - is it good enough?Guido Schmutz
 
Solutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache KafkaSolutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache KafkaGuido Schmutz
 
Fundamentals Big Data and AI Architecture
Fundamentals Big Data and AI ArchitectureFundamentals Big Data and AI Architecture
Fundamentals Big Data and AI ArchitectureGuido Schmutz
 
Location Analytics - Real-Time Geofencing using Kafka
Location Analytics - Real-Time Geofencing using Kafka Location Analytics - Real-Time Geofencing using Kafka
Location Analytics - Real-Time Geofencing using Kafka Guido Schmutz
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming VisualizationGuido Schmutz
 

Plus de Guido Schmutz (20)

30 Minutes to the Analytics Platform with Infrastructure as Code
30 Minutes to the Analytics Platform with Infrastructure as Code30 Minutes to the Analytics Platform with Infrastructure as Code
30 Minutes to the Analytics Platform with Infrastructure as Code
 
Event Broker (Kafka) in a Modern Data Architecture
Event Broker (Kafka) in a Modern Data ArchitectureEvent Broker (Kafka) in a Modern Data Architecture
Event Broker (Kafka) in a Modern Data Architecture
 
Big Data, Data Lake, Fast Data - Dataserialiation-Formats
Big Data, Data Lake, Fast Data - Dataserialiation-FormatsBig Data, Data Lake, Fast Data - Dataserialiation-Formats
Big Data, Data Lake, Fast Data - Dataserialiation-Formats
 
ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!ksqlDB - Stream Processing simplified!
ksqlDB - Stream Processing simplified!
 
Kafka as your Data Lake - is it Feasible?
Kafka as your Data Lake - is it Feasible?Kafka as your Data Lake - is it Feasible?
Kafka as your Data Lake - is it Feasible?
 
Event Hub (i.e. Kafka) in Modern Data Architecture
Event Hub (i.e. Kafka) in Modern Data ArchitectureEvent Hub (i.e. Kafka) in Modern Data Architecture
Event Hub (i.e. Kafka) in Modern Data Architecture
 
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaSolutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
 
Event Hub (i.e. Kafka) in Modern Data (Analytics) Architecture
Event Hub (i.e. Kafka) in Modern Data (Analytics) ArchitectureEvent Hub (i.e. Kafka) in Modern Data (Analytics) Architecture
Event Hub (i.e. Kafka) in Modern Data (Analytics) Architecture
 
Building Event Driven (Micro)services with Apache Kafka
Building Event Driven (Micro)services with Apache KafkaBuilding Event Driven (Micro)services with Apache Kafka
Building Event Driven (Micro)services with Apache Kafka
 
Location Analytics - Real-Time Geofencing using Apache Kafka
Location Analytics - Real-Time Geofencing using Apache KafkaLocation Analytics - Real-Time Geofencing using Apache Kafka
Location Analytics - Real-Time Geofencing using Apache Kafka
 
Solutions for bi-directional integration between Oracle RDBMS and Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS and Apache KafkaSolutions for bi-directional integration between Oracle RDBMS and Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS and Apache Kafka
 
What is Apache Kafka? Why is it so popular? Should I use it?
What is Apache Kafka? Why is it so popular? Should I use it?What is Apache Kafka? Why is it so popular? Should I use it?
What is Apache Kafka? Why is it so popular? Should I use it?
 
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache KafkaSolutions for bi-directional integration between Oracle RDBMS & Apache Kafka
Solutions for bi-directional integration between Oracle RDBMS & Apache Kafka
 
Location Analytics Real-Time Geofencing using Kafka
Location Analytics Real-Time Geofencing using KafkaLocation Analytics Real-Time Geofencing using Kafka
Location Analytics Real-Time Geofencing using Kafka
 
Streaming Visualisation
Streaming VisualisationStreaming Visualisation
Streaming Visualisation
 
Kafka as an event store - is it good enough?
Kafka as an event store - is it good enough?Kafka as an event store - is it good enough?
Kafka as an event store - is it good enough?
 
Solutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache KafkaSolutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
Solutions for bi-directional Integration between Oracle RDMBS & Apache Kafka
 
Fundamentals Big Data and AI Architecture
Fundamentals Big Data and AI ArchitectureFundamentals Big Data and AI Architecture
Fundamentals Big Data and AI Architecture
 
Location Analytics - Real-Time Geofencing using Kafka
Location Analytics - Real-Time Geofencing using Kafka Location Analytics - Real-Time Geofencing using Kafka
Location Analytics - Real-Time Geofencing using Kafka
 
Streaming Visualization
Streaming VisualizationStreaming Visualization
Streaming Visualization
 

Dernier

WomenInAutomation2024: AI and Automation for eveyone
WomenInAutomation2024: AI and Automation for eveyoneWomenInAutomation2024: AI and Automation for eveyone
WomenInAutomation2024: AI and Automation for eveyoneUiPathCommunity
 
Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsYoss Cohen
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...BookNet Canada
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesBernd Ruecker
 
A Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxA Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxAna-Maria Mihalceanu
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...Karmanjay Verma
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Nikki Chapple
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observabilityitnewsafrica
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
Français Patch Tuesday - Avril
Français Patch Tuesday - AvrilFrançais Patch Tuesday - Avril
Français Patch Tuesday - AvrilIvanti
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
Accelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessAccelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessWSO2
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxfnnc6jmgwh
 
Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Karmanjay Verma
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 

Dernier (20)

WomenInAutomation2024: AI and Automation for eveyone
WomenInAutomation2024: AI and Automation for eveyoneWomenInAutomation2024: AI and Automation for eveyone
WomenInAutomation2024: AI and Automation for eveyone
 
Infrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platformsInfrared simulation and processing on Nvidia platforms
Infrared simulation and processing on Nvidia platforms
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
 
QCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architecturesQCon London: Mastering long-running processes in modern architectures
QCon London: Mastering long-running processes in modern architectures
 
A Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxA Glance At The Java Performance Toolbox
A Glance At The Java Performance Toolbox
 
How Tech Giants Cut Corners to Harvest Data for A.I.
How Tech Giants Cut Corners to Harvest Data for A.I.How Tech Giants Cut Corners to Harvest Data for A.I.
How Tech Giants Cut Corners to Harvest Data for A.I.
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Français Patch Tuesday - Avril
Français Patch Tuesday - AvrilFrançais Patch Tuesday - Avril
Français Patch Tuesday - Avril
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
Accelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessAccelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with Platformless
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 
Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 

Twitter Storm: Ereignisverarbeitung in Echtzeit

  • 1. 2013 © Trivadis BASEL BERN LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MÜNCHEN STUTTGART WIEN
 WELCOME Twitter Storm: Ereignisverarbeitung in Echtzeit Guido Schmutz
 Java Forum Stuttgart 2013 4.7.2013 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 1
  • 2. 2013 © Trivadis Trivadis ist führend bei der IT-Beratung, der Systemintegration, dem Solution-Engineering und der Erbringung von IT-Services mit Fokussierung auf und Technologien im D-A-CH-Raum. Unsere Leistungen erbringen wir aus den strategischen Geschäftsfeldern: Trivadis Services übernimmt den korrespondierenden Betrieb Ihrer IT Systeme. Unser Unternehmen 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit B E T R I E B 2
  • 3. 2013 © Trivadis Mit über 600 IT- und Fachexperten bei Ihnen vor Ort 3 11 Trivadis Niederlassungen mit
 über 600 Mitarbeitenden 200 Service Level Agreements Mehr als 4'000 Trainingsteilnehmer Forschungs- und Entwicklungs- budget: CHF 5.0 / EUR 4 Mio. Finanziell unabhängig und
 nachhaltig profitabel Erfahrung aus mehr als 1'900 Projekten pro Jahr bei über 800 Kunden Stand 12/2012 Hamburg Düsseldorf Frankfurt Freiburg München Wien Basel ZürichBern Lausanne 3 Stuttgart 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 3
  • 4. 2013 © Trivadis Guido Schmutz •  Working for Trivadis for more than 16 years •  Oracle ACE Director for Fusion Middleware and SOA •  Co-Author of different books •  Consultant, Trainer Software Architect for Java, Oracle, SOA, EDA, BigData und FastData •  Member of Trivadis Architecture Board •  Technology Manager @ Trivadis •  More than 20 years of software development 
 experience •  Contact: guido.schmutz@trivadis.com •  Blog: http://guidoschmutz.wordpress.com •  Twitter: gschmutz 4.7.2013 4 Twitter Storm: Ereignisverarbeitung in Echtzeit
  • 5. 2013 © Trivadis Big Data Definition (Gartner et al) 14.02.2013 Big Data 4 Sales 5 Velocity Tera-, Peta-, Exa-, Zetta-, Yota- bytes and constantly growing “Traditional” computing in RDBMS 
 is not scalable enough. 
 We search for “linear scalability” “Only … structured information 
 is not enough” – “95% of produced data in unstructured” Characteristics of Big Data: Its Volume, Velocity and Variety in combination + Veracity (IBM) - information uncertainty + Time to action ? – Big Data + Event Processing = Fast Data
  • 7. 2013 © Trivadis Velocity 24. April 2013 Big Data und Fast Data 7 §  Velocity requirement examples: §  Recommendation Engine §  Predictive Analytics §  Marketing Campaign Analysis §  Customer Retention and Churn Analysis §  Social Graph Analysis §  Capital Markets Analysis §  Risk Management §  Rogue Trading §  Fraud Detection §  Retail Banking §  Network Monitoring §  Research and Development
  • 8. 2013 © Trivadis Agenda 1.  What is Twitter Storm 2.  Main Concepts 3.  Trident 4.  BigData and FastData combined – Lambda Architecture 5.  Summary 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 8
  • 9. 2013 © Trivadis What is Twitter Storm ? A platform for doing analysis on streams of data as they come in, so you can react to data as it happens. •  A highly distributed real-time computation system •  Provides general primitives to do real-time computation •  To simplify working with queues & workers •  scalable and fault-tolerant •  complementary to Hadoop Originated at Backtype, acquired by Twitter in 2011 Open Sourced late 2011 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 9
  • 10. 2013 © Trivadis Hadoop vs. Storm 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 10 Storm Real-time processing Topologies run forever Stateless nodes Scalable Guarantees no data loss Open Source => Fast, reactive, real time procesing Hadoop Batch processing Jobs runs to completion Stateful nodes Scalable Guarantees no data loss Open Source => big batch processing = !=
  • 11. 2013 © Trivadis Idea: Simplify dealing with queue & workers 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 11 Scaling is painful – queue partitioning & worker deploy Operational overhead – worker failures & queue backups No guarantees on data processing
  • 12. 2013 © Trivadis What does Storm provide? §  At least once message processing §  Fault-tolerant §  Horizontal scalability §  No intermediate queues §  Less operational overhead §  „Just works“ §  Broad set of use cases §  Stream processing §  Continous computation §  Distributed RPC 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 12
  • 13. 2013 © Trivadis Agenda 1.  What is Twitter Storm 2.  Main Concepts 3.  Trident 4.  BigData and FastData combined – Lambda Architecture 5.  Summary 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 13
  • 14. 2013 © Trivadis Main Concepts – Shown based on Twitter example Show the tweets related to Java Forum (use hashtag #JFS2013) and show the count of the hashtags used 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 14 not showing #JFS2013
  • 15. 2013 © Trivadis Main Concepts - Streams Stream •  an unbounded sequence of tuples •  core abstraction in Storm •  Defined with a schema that names the 
 fields in the tuple •  Value must be serializable •  Every stream has an id Topology •  graph where each node is a spout or bolt •  edges indicating which bolt subscribes 
 to which streams 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 15 Source of stream A Source of stream B Subscribes: A Emits: C Subscribes: A Emits: D Subscribes: A & B Emits: - Subscribes: C & D Emits: - T T T T T T T T
  • 16. 2013 © Trivadis Worker •  executes a subset of a topology, may run one or more executors for one or more components Executor •  thread spawned by worker Task •  performs the actual data processing Main Concepts – Worker, Executor, Task 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 16 Worker Process Task Task Task Task
  • 17. 2013 © Trivadis Main Concepts - Spout A spout is the component which acts as the source of streams in a topology Generally spouts read tuples from an external source and emit them into the topology Spouts can emit more than one stream Step 1: Use a Spout to subscribe to the Twitter Filter Streaming API to get all the tweets containing the hashtag #JFS2013 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 17 Twitter Stream Tweets tweet Tweet
 Stream
  • 18. 2013 © Trivadis Main Concepts - Spout 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 18 Twitter Stream Tweets tweet Tweet
 Stream By calling this method, Storm requests that the Spout emits tuples to the output collector Called when a task for this component is initialized Used to declare streams and their schmeas. it is possible to declare serveral streams and specifying the one to use when emiting tuples
  • 19. 2013 © Trivadis Bolt is doing the processing of the message •  filtering, functions, aggregations, joins, talking to databases…. •  Can do simple stream transformations •  Complex transformations often require multiple steps and therefore multiple bolts Bolts can emit one or more streams Step 2: Use a bolt to retrieve the hashtags for each tweet Main Concepts - Bolt 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 19 Twitter Stream Tweets tweet Hashtag
 Splitter Tweet
 Stream hashtag
  • 20. 2013 © Trivadis Main Concepts - Bolt 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 20 Twitter Stream Tweets tweet Hashtag
 Splitter Tweet
 Stream hashtag Execute is called for each tuple arriving on the input stream Declares the output fields for the component and is called when bolt is created
  • 21. 2013 © Trivadis Main Concepts - Bolt Step 3: Use a bolt to count the occurences of the hashtag in Redis NoSQL 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 21 Twitter Stream Tweets tweet Hashtag
 Splitter Tweet
 Stream hashtag Hashtag
 Counter
  • 22. 2013 © Trivadis Main Concepts - Bolt 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 22 Twitter Stream Tweets tweet Hashtag
 Splitter Tweet
 Stream hashtag Hashtag
 Counter Prepare is called when bolt is created Execute is called for each tuple arriving on the input stream
  • 23. 2013 © Trivadis Main Concepts - Topology Step 4: Setup the topology with the necessary groupings (distribution) Each Spout or Bolt are running N instances in parallel 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 23 Twitter Stream Tweets Hashtag
 Splitter Tweet
 Stream Hashtag
 Counter Hashtag
 Splitter Hashtag
 Counter Shuffle Fields Shuffle grouping is random grouping Fields grouping is grouped by value, such that equal value results in equal task All grouping replicates to all tasks Global grouping makes all tuples go to one task None grouping makes bolt run in the same thread as bold/spout it subscribes to Direct grouping producer (task that emits) controls which consumer will receive
  • 24. 2013 © Trivadis Main Concepts - Topology 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 24 Create Stream called „tweet-stream“ Run this bolt by 2 workers Run this spout by 1 worker Subscribe to stream „tweet-stream“ using shuffle grouping Create stream called „hashtag-splitter“ Subscribe to stream „hashtag-splitter“ using fields grouping on field „hashtag“
  • 25. 2013 © Trivadis Sample – How does it work ? 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 25 Twitter Stream Mein Präsentation „Twitter Storm: Ereignisverarbeitung in Echtzeit“ Morgen um 15:35 am Java Forum Stuttgart #JFS2013 #storm #fastdata CU there! Hashtag
 Splitter Tweet
 Stream Hashtag
 Counter Hashtag
 Splitter Hashtag
 Counter
  • 26. 2013 © Trivadis Sample – How does it work ? 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 26 Twitter Stream Mein Präsentation „Twitter Storm: Ereignisverarbeitung in Echtzeit“ Morgen um 15:35 am Java Forum Stuttgart #JFS2013 #storm #fastdata CU there! Hashtag
 Splitter Tweet
 Stream Hashtag
 Counter Hashtag
 Splitter Hashtag
 Counter storm fastdata jsf2013
  • 27. 2013 © Trivadis Sample – How does it work ? 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 27 Twitter Stream Mein Präsentation „Twitter Storm: Ereignisverarbeitung in Echtzeit“ Morgen um 15:35 am Java Forum Stuttgart #JFS2013 #storm #fastdata CU there! Hashtag
 Splitter Tweet
 Stream Hashtag
 Counter Hashtag
 Splitter Hashtag
 Counter storm fastdata jfs2013 INCR jfs2013 INCR storm INCR fastdata storm = 1 fastdata =1 jfs2013= 1
  • 28. 2013 © Trivadis Agenda 1.  What is Twitter Storm 2.  Main Concepts 3.  Trident 4.  BigData and FastData combined – Lambda Architecture 5.  Summary 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 28
  • 29. 2013 © Trivadis What is Trident? •  High-Level abstraction on top of storm •  Makes it easier to build topologies •  Core data model is the stream •  Processed as a series of batches •  Stream is partitioned among nodes in cluster •  5 kinds of operations in Trident •  Operations that apply locally to each partition and cause no network transfer •  Repartitioning operations that don‘t change the contents •  Aggregation operations that do network transfer •  Operations on grouped streams •  Merges and Joins 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 29
  • 30. 2013 © Trivadis What is Trident? Supports •  Joins •  Aggregations •  Grouping •  Functions •  Filters Similar to Hadoop and Pig/Cascading 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 30
  • 31. 2013 © Trivadis Trident Concepts - Topology 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 31 Twitter Stream Tweets tweet Hashtag
 Splitter Tweet
 Stream hashtag Hashtag
 Normalizer Persistent
 Aggregate hashtag groupBylocal Bolt Bolt
  • 32. 2013 © Trivadis Trident Concepts - Function •  takes in a set of input fields and emits zero or more tuples as output •  fields of the output tuple are appended to the original input tuple in the stream •  If a function emits no tuples, the original input tuple is filtered out •  Otherwise the input tuple is duplicated for each output tuple 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 32
  • 33. 2013 © Trivadis Agenda 1.  What is Twitter Storm 2.  Main Concepts 3.  Trident 4.  BigData and FastData combined – Lambda Architecture 5.  Summary 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 33
  • 34. 2013 © Trivadis Lambda Architecture 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 34 Immutable data Batch View Query Data Stream Realtime View Incoming Data Serving Layer Speed Layer Batch Layer A B C D E F G
  • 35. 2013 © Trivadis Lambda Architecture 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 35 Speed Layer Precompute Views query Source: Marz, N. & Warren, J. (2013) Big Data. Manning. Batch Layer Precomputed information All data Incremented information Process stream Incoming Data Batch recompute Realtime increment Serving Layer batch view batch view real time view real time view Merge
  • 36. 2013 © Trivadis Lambda Architecture 24. April 2013 Big Data und Fast Data 36 one possible product/framework mapping Speed Layer Precompute Views query Batch Layer Precomputed information All data Incremented information Process stream Incoming Data Batch recompute Realtime increment Serving Layer batch view batch view real time view real time view Merge
  • 37. 2013 © Trivadis Agenda 1.  What is Twitter Storm 2.  Main Concepts 3.  Trident 4.  BigData and FastData combined – Lambda Architecture 5.  Summary 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 37
  • 38. 2013 © Trivadis Summary Express real-time processing naturally Not a Hadoop/MapReduce competitor Supports other languages Hard to debug Object Serialization Didn‘t cover §  Reliability through guaranteed message processing §  Distributed RPC §  Storm cluster setup and deploy 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 38
  • 39. 2013 © Trivadis Resources Website (http://storm-project.net/) Wiki (https://github.com/nathanmarz/storm/wiki) Doc (https://github.com/nathanmarz/storm/wiki/Documentation) Storm-starter project (https://github.com/nathanmarz/storm-starter) Mailing list (http://groups.google.com/group/storm-user) 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 39
  • 40. 2013 © Trivadis BASEL BERN LAUSANNE ZÜRICH DÜSSELDORF FRANKFURT A.M. FREIBURG I.BR. HAMBURG MÜNCHEN STUTTGART WIEN
 Thank You! Trivadis AG Guido Schmutz
 guido.schmutz@trivadis.com 4.7.2013 Twitter Storm: Ereignisverarbeitung in Echtzeit 40