SlideShare une entreprise Scribd logo
1  sur  80
Télécharger pour lire hors ligne
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
A real-time Lambda Architecture using Hadoop & Storm
NoSQL Matters Cologne 2014 by Nathan Bijnens
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Speaker
Nathan Bijnens
Big Data Engineer @ Virdata
@nathan_gs
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Computing Trends
Past
Computation (CPUs)
Expensive
Disk Storage Expensive
Coordination Easy
(Latches Don’t Often Hit)
DRAM Expensive
Computation Cheap
(Many Core Computers)
Disk Storage Cheap
(Cheap Commodity Disks)
Coordination Hard
(Latches Stall a Lot, etc)
DRAM / SSD
Getting Cheap
Current
Source: Immutability Changes Everything - Pat Helland, RICON2012
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Credits
Nathan Marz
● Ex-Backtype & Twitter
● Startup in Stealthmode
Creator of
● Storm
● Cascalog
● ElephantDB
Coined the term Lambda Architecture.
manning.com/marz
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
a Data System
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Not all information is equal.
Some information is derived from other pieces of information.
Data is more than Information
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Eventually you will reach the most ‘raw’ form of
information.
This is the information you hold true, simply because it exists.
Let’s call this ‘data’, very similar to ‘event’.
Data is more than Information
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Events used to manipulate the master data.
Events: Before
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Today, events are the master data.
Events: After
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Let’s store everything.
Data System
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Data is Immutable.
Data System
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Data is Time Based.
Data System
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Capturing change
INSERT INTO contact (name, city) VALUES (‘Nathan’, ‘Antwerp’)
UPDATE contact SET city = ‘Cologne’ WHERE name = ‘Nathan’
Traditionally
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Capturing change
INSERT INTO contact (name, city, timestamp) VALUES (‘Nathan’, ‘Antwerp’, 2008-10-11 20:00Z)
INSERT INTO contact (name, city, timestamp) VALUES (‘Nathan’, ‘Cologne’, 2014-04-29 10:00Z)
in a Data System
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
The data you query is often transformed, aggregated, ...
Rarely used in it’s original form.
Query
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Query = function ( all data )
Query
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Query: Number of people living in each city
Person City Timestamp
Nathan Antwerp 2008-10-11
John Cologne 2010-01-23
Dirk Antwerp 2012-09-12
Nathan Cologne 2014-04-29
City Count
Antwerp 1
Cologne 2
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Query
All Data QueryPrecomputed
View
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Layered Architecture
Batch Layer
Speed Layer
Serving
Layer
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Layered Architecture
Hadoop ElephantDB
Incoming Data
Cassandra
Query
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Batch Layer
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Batch Layer
Hadoop ElephantDB
Incoming Data
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Batch Layer
The batch layer can calculate anything, given enough time...
Unrestrained computation.
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
No need to De-Normalize.
The batch layer stores the data normalized, the generated views are often, if not always denormalized.
Batch Layer
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Horizontally scalable.
Batch Layer
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
High Latency.
Let’s for now pretend the update latency doesn’t matter.
Batch Layer
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Functional computation, based on immutable inputs, is
idempotent.
Batch Layer
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Stores a master copy of the data set
Batch Layer
… append only
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Batch Layer
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Batch: view generation
Master Dataset
View #1
View #3
View #2
MapReduce
MapReduce
MapReduce
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
MapReduce
1. Take a large data set and divide it into subsets
2. Perform the same function on all subsets
3. Combine the output from all subsets
…
…
Output
DoWork() DoWork() DoWork() …
MAPREDUCE
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
MapReduce
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Serialization & Schema
Catch errors as quickly as they happen.
Validate on write vs on read.
Catch errors as quickly as they happen.
Validate on write vs on read.
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
CSV is actually a serialization language that is just
poorly defined.
Serialization & Schema
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Use a format with a schema
● Thrift
● Avro
● Protocolbuffers
Could be combined with Parquet.
Added bonus: it’s faster and uses less space.
Serialization & Schema
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Batch View Database
No random writes required.
Read Only database
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Every iteration produces the views from scratch.
Batch View Database
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Pure Lambda databases
● ElephantDB
● SploutSQL
Databases with a batch load & read only views
● Voldemort
Other databases that could be used
● ElasticSearch/Solr: generate the lucene indexes using MapReduce
● Cassandra: generate sstables
● ...
Batch View Databases
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Batch Layer
Without the associated complexities.
Eventually consistent
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Batch Layer
Data absorbed into Batch Views
Time
Now
We are not done yet…
Not yet absorbed.
Just a few hours of data.
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Speed Layer
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Speed Layer
Hadoop ElephantDB
Incoming Data
Cassandra
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Stream processing.
Speed Layer
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Continuous computation.
Speed Layer
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Storing a limited window of data.
Compensating for the last few hours of data.
Speed Layer
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
All the complexity is isolated in the Speed Layer.
If anything goes wrong, it’s auto-corrected.
Speed Layer
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
You have a choice between:
● Availability
○ Queries are eventual consistent
● Consistency
○ Queries are consistent
CAP
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Eventual accuracy
Some algorithms are hard to implement in real-time.
For those cases we could estimate the results.
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Storm
Speed Layer
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Message passing
Storm
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Distributed processing
Storm
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Horizontally scalable.
Storm
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Incremental algorithms
Storm
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Fast.
Storm
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Storm
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Storm
Tuple
Stream
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Storm
Spout
Bolt
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Storm
Grouping
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Data Ingestion
Queues & Pub/Sub models are a natural fit.
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
● Kafka
● Flume
● Scribe
● *MQ
● …
Data Ingestion
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Speed Layer Views
The views need to be stored in a random writable
database.
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
The logic behind a R/W database is much more
complex than a read-only view.
Speed Layer Views
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
The views are stored in a Read & Write database.
● Cassandra
● Hbase
● Redis
● SQL
● ElasticSearch
● ...
Speed Layer Views
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Serving Layer
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Serving Layer
Hadoop ElephantDB
Incoming Data
Cassandra
Query
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Serving Layer
Random reads.
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
This layer queries the batch & real-time views and
merges it.
Serving layer
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
How to query an Average?
Serving Layer
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Side note: CQRS
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
CQRS
Source: martinfowler.com/bliki/CQRS.html - Martin Fowler
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
CQRS & Event Sourcing
Event Sourcing
● Every command is a new event.
● The event store keeps all events, new events are
appended.
● Any query loops through all related events, even
to produce an aggregate.
source: CQRS Journey - Microsoft Patterns & Practices
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Lambda Architecture
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Lambda Architecture
The Lambda Architecture can discard any view, batch
and real-time, and just recreate everything from the
master data.
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Mistakes are corrected via recomputation.
Write bad data? Remove the data & recompute.
Bug in view generation? Just recompute the view.
Lambda Architecture
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Data storage is highly optimized.
Lambda Architecture
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Immutability changes everything.
Lambda Architecture
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Questions?
@nathan_gs #nosql14
nathan@nathan.gs / slideshare.net/nathan_gs
lambda-architecture.net / @LambdaArch / #LambdaArch
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Virdata is the cross-industry cloud service/platform for the Internet
of Things. Designed to elastically scale to monitor and manage an
unprecedented amount of devices and applications using
concurrent persistent connections, Virdata opens the door to
numerous new business opportunities.
Virdata combines Publish-Subscribe based Distributed Messaging,
Complex Event Processing and state-of-the-art Big Data paradigms
to enable both historical & real-time monitoring and near real-time
analytics with a scale required for the Internet of Things.
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Acknowledgements
I would like to thank Nathan Marz for writing a very insightful book, where most of the ideas in this presentation come from.
Parts of this presentation has been created while working for datacrunchers.eu, I thank them for the opportunities to speak about the
Lambda Architecture both at clients and at conferences. DataCrunchers is the first Big Data agency in Belgium.
Schema’s & Pictures:
Computing Trends: Immutability Changes Everything - Pat Helland, RICON2012
MapReduce #1: PolybasePass2012.pptx - David J. DeWitt, Microsoft Gray Systems Lab
MapReduce #2: Introduction to MapReduce and Hadoop - Shivnath Babu, Duke
CQRS: martinfowler.com/bliki/CQRS.html - Martin Fowler
CQRS & Event Sourcing: CQRS Journey - Adam Dymitruk, Josh Elster & Mark Seemann, Microsoft Patterns & Practices
NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14
Thank you
@nathan_gs
nathan@nathan.gs

Contenu connexe

Tendances

Real Time Data Processing Using Spark Streaming
Real Time Data Processing Using Spark StreamingReal Time Data Processing Using Spark Streaming
Real Time Data Processing Using Spark StreamingHari Shreedharan
 
Implementing the Lambda Architecture efficiently with Apache Spark
Implementing the Lambda Architecture efficiently with Apache SparkImplementing the Lambda Architecture efficiently with Apache Spark
Implementing the Lambda Architecture efficiently with Apache SparkDataWorks Summit
 
Streaming Analytics with Spark, Kafka, Cassandra and Akka
Streaming Analytics with Spark, Kafka, Cassandra and AkkaStreaming Analytics with Spark, Kafka, Cassandra and Akka
Streaming Analytics with Spark, Kafka, Cassandra and AkkaHelena Edelson
 
Spark Intro @ analytics big data summit
Spark  Intro @ analytics big data summitSpark  Intro @ analytics big data summit
Spark Intro @ analytics big data summitSujee Maniyam
 
Real time data viz with Spark Streaming, Kafka and D3.js
Real time data viz with Spark Streaming, Kafka and D3.jsReal time data viz with Spark Streaming, Kafka and D3.js
Real time data viz with Spark Streaming, Kafka and D3.jsBen Laird
 
Apache Spark 1.6 with Zeppelin - Transformations and Actions on RDDs
Apache Spark 1.6 with Zeppelin - Transformations and Actions on RDDsApache Spark 1.6 with Zeppelin - Transformations and Actions on RDDs
Apache Spark 1.6 with Zeppelin - Transformations and Actions on RDDsTimothy Spann
 
Analyzing Time Series Data with Apache Spark and Cassandra
Analyzing Time Series Data with Apache Spark and CassandraAnalyzing Time Series Data with Apache Spark and Cassandra
Analyzing Time Series Data with Apache Spark and CassandraPatrick McFadin
 
Real-Time Anomaly Detection with Spark MLlib, Akka and Cassandra
Real-Time Anomaly Detection  with Spark MLlib, Akka and  CassandraReal-Time Anomaly Detection  with Spark MLlib, Akka and  Cassandra
Real-Time Anomaly Detection with Spark MLlib, Akka and CassandraNatalino Busa
 
Kafka spark cassandra webinar feb 16 2016
Kafka spark cassandra   webinar feb 16 2016 Kafka spark cassandra   webinar feb 16 2016
Kafka spark cassandra webinar feb 16 2016 Hiromitsu Komatsu
 
Lambda Architecture with Spark
Lambda Architecture with SparkLambda Architecture with Spark
Lambda Architecture with SparkKnoldus Inc.
 
C* Summit 2013: Real-time Analytics using Cassandra, Spark and Shark by Evan ...
C* Summit 2013: Real-time Analytics using Cassandra, Spark and Shark by Evan ...C* Summit 2013: Real-time Analytics using Cassandra, Spark and Shark by Evan ...
C* Summit 2013: Real-time Analytics using Cassandra, Spark and Shark by Evan ...DataStax Academy
 
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...Brian O'Neill
 
Cassandra spark connector
Cassandra spark connectorCassandra spark connector
Cassandra spark connectorDuyhai Doan
 
Feeding Cassandra with Spark-Streaming and Kafka
Feeding Cassandra with Spark-Streaming and KafkaFeeding Cassandra with Spark-Streaming and Kafka
Feeding Cassandra with Spark-Streaming and KafkaDataStax Academy
 
Apache Cassandra and Python for Analyzing Streaming Big Data
Apache Cassandra and Python for Analyzing Streaming Big Data Apache Cassandra and Python for Analyzing Streaming Big Data
Apache Cassandra and Python for Analyzing Streaming Big Data prajods
 
Sa introduction to big data pipelining with cassandra & spark west mins...
Sa introduction to big data pipelining with cassandra & spark   west mins...Sa introduction to big data pipelining with cassandra & spark   west mins...
Sa introduction to big data pipelining with cassandra & spark west mins...Simon Ambridge
 
Real-Time Analytics with Apache Cassandra and Apache Spark
Real-Time Analytics with Apache Cassandra and Apache SparkReal-Time Analytics with Apache Cassandra and Apache Spark
Real-Time Analytics with Apache Cassandra and Apache SparkGuido Schmutz
 
Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...
Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...
Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...Codemotion
 
Architectural Patterns for Streaming Applications
Architectural Patterns for Streaming ApplicationsArchitectural Patterns for Streaming Applications
Architectural Patterns for Streaming Applicationshadooparchbook
 

Tendances (20)

Lambda architecture
Lambda architectureLambda architecture
Lambda architecture
 
Real Time Data Processing Using Spark Streaming
Real Time Data Processing Using Spark StreamingReal Time Data Processing Using Spark Streaming
Real Time Data Processing Using Spark Streaming
 
Implementing the Lambda Architecture efficiently with Apache Spark
Implementing the Lambda Architecture efficiently with Apache SparkImplementing the Lambda Architecture efficiently with Apache Spark
Implementing the Lambda Architecture efficiently with Apache Spark
 
Streaming Analytics with Spark, Kafka, Cassandra and Akka
Streaming Analytics with Spark, Kafka, Cassandra and AkkaStreaming Analytics with Spark, Kafka, Cassandra and Akka
Streaming Analytics with Spark, Kafka, Cassandra and Akka
 
Spark Intro @ analytics big data summit
Spark  Intro @ analytics big data summitSpark  Intro @ analytics big data summit
Spark Intro @ analytics big data summit
 
Real time data viz with Spark Streaming, Kafka and D3.js
Real time data viz with Spark Streaming, Kafka and D3.jsReal time data viz with Spark Streaming, Kafka and D3.js
Real time data viz with Spark Streaming, Kafka and D3.js
 
Apache Spark 1.6 with Zeppelin - Transformations and Actions on RDDs
Apache Spark 1.6 with Zeppelin - Transformations and Actions on RDDsApache Spark 1.6 with Zeppelin - Transformations and Actions on RDDs
Apache Spark 1.6 with Zeppelin - Transformations and Actions on RDDs
 
Analyzing Time Series Data with Apache Spark and Cassandra
Analyzing Time Series Data with Apache Spark and CassandraAnalyzing Time Series Data with Apache Spark and Cassandra
Analyzing Time Series Data with Apache Spark and Cassandra
 
Real-Time Anomaly Detection with Spark MLlib, Akka and Cassandra
Real-Time Anomaly Detection  with Spark MLlib, Akka and  CassandraReal-Time Anomaly Detection  with Spark MLlib, Akka and  Cassandra
Real-Time Anomaly Detection with Spark MLlib, Akka and Cassandra
 
Kafka spark cassandra webinar feb 16 2016
Kafka spark cassandra   webinar feb 16 2016 Kafka spark cassandra   webinar feb 16 2016
Kafka spark cassandra webinar feb 16 2016
 
Lambda Architecture with Spark
Lambda Architecture with SparkLambda Architecture with Spark
Lambda Architecture with Spark
 
C* Summit 2013: Real-time Analytics using Cassandra, Spark and Shark by Evan ...
C* Summit 2013: Real-time Analytics using Cassandra, Spark and Shark by Evan ...C* Summit 2013: Real-time Analytics using Cassandra, Spark and Shark by Evan ...
C* Summit 2013: Real-time Analytics using Cassandra, Spark and Shark by Evan ...
 
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
Re-envisioning the Lambda Architecture : Web Services & Real-time Analytics ...
 
Cassandra spark connector
Cassandra spark connectorCassandra spark connector
Cassandra spark connector
 
Feeding Cassandra with Spark-Streaming and Kafka
Feeding Cassandra with Spark-Streaming and KafkaFeeding Cassandra with Spark-Streaming and Kafka
Feeding Cassandra with Spark-Streaming and Kafka
 
Apache Cassandra and Python for Analyzing Streaming Big Data
Apache Cassandra and Python for Analyzing Streaming Big Data Apache Cassandra and Python for Analyzing Streaming Big Data
Apache Cassandra and Python for Analyzing Streaming Big Data
 
Sa introduction to big data pipelining with cassandra & spark west mins...
Sa introduction to big data pipelining with cassandra & spark   west mins...Sa introduction to big data pipelining with cassandra & spark   west mins...
Sa introduction to big data pipelining with cassandra & spark west mins...
 
Real-Time Analytics with Apache Cassandra and Apache Spark
Real-Time Analytics with Apache Cassandra and Apache SparkReal-Time Analytics with Apache Cassandra and Apache Spark
Real-Time Analytics with Apache Cassandra and Apache Spark
 
Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...
Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...
Anomaly Detection in Telecom with Spark - Tugdual Grall - Codemotion Amsterda...
 
Architectural Patterns for Streaming Applications
Architectural Patterns for Streaming ApplicationsArchitectural Patterns for Streaming Applications
Architectural Patterns for Streaming Applications
 

En vedette

Getting more out of your big data
Getting more out of your big dataGetting more out of your big data
Getting more out of your big dataNathan Bijnens
 
Hadoop Pig: MapReduce the easy way!
Hadoop Pig: MapReduce the easy way!Hadoop Pig: MapReduce the easy way!
Hadoop Pig: MapReduce the easy way!Nathan Bijnens
 
Lambda architecture for real time big data
Lambda architecture for real time big dataLambda architecture for real time big data
Lambda architecture for real time big dataTrieu Nguyen
 
Big Data and Fast Data - Lambda Architecture in Action
Big Data and Fast Data - Lambda Architecture in ActionBig Data and Fast Data - Lambda Architecture in Action
Big Data and Fast Data - Lambda Architecture in ActionGuido Schmutz
 
Lambda architecture: from zero to One
Lambda architecture: from zero to OneLambda architecture: from zero to One
Lambda architecture: from zero to OneSerg Masyutin
 
Lambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, Scala
Lambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, ScalaLambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, Scala
Lambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, ScalaHelena Edelson
 
Realtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and HadoopRealtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and HadoopDataWorks Summit
 
Kafka and Storm - event processing in realtime
Kafka and Storm - event processing in realtimeKafka and Storm - event processing in realtime
Kafka and Storm - event processing in realtimeGuido Schmutz
 
Real time hadoop + mapreduce intro
Real time hadoop + mapreduce introReal time hadoop + mapreduce intro
Real time hadoop + mapreduce introGeoff Hendrey
 
COMPLEMENTING HADOOP WITH REAL-TIME DATA ANALYSIS from Structure:Data 2013
COMPLEMENTING HADOOP WITH REAL-TIME DATA ANALYSIS from Structure:Data 2013COMPLEMENTING HADOOP WITH REAL-TIME DATA ANALYSIS from Structure:Data 2013
COMPLEMENTING HADOOP WITH REAL-TIME DATA ANALYSIS from Structure:Data 2013Gigaom
 
Url Shortening Services
Url Shortening ServicesUrl Shortening Services
Url Shortening ServicesAltan Khendup
 
Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...
Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...
Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...Altan Khendup
 
Lambda architecture with Spark
Lambda architecture with SparkLambda architecture with Spark
Lambda architecture with SparkVincent GALOPIN
 
Lambda Architecture in Practice
Lambda Architecture in PracticeLambda Architecture in Practice
Lambda Architecture in PracticeNavneet kumar
 
Lambda Architecture Using SQL
Lambda Architecture Using SQLLambda Architecture Using SQL
Lambda Architecture Using SQLSATOSHI TAGOMORI
 
National Weather Service Storm Spotter Training
National Weather Service Storm Spotter TrainingNational Weather Service Storm Spotter Training
National Weather Service Storm Spotter Trainingchowd
 
[USI] Lambda-Architecture : comment réconcilier BigData et temps-réel
[USI] Lambda-Architecture : comment réconcilier BigData et temps-réel[USI] Lambda-Architecture : comment réconcilier BigData et temps-réel
[USI] Lambda-Architecture : comment réconcilier BigData et temps-réelMathieu DESPRIEE
 
Lambda Architectures in Practice
Lambda Architectures in PracticeLambda Architectures in Practice
Lambda Architectures in PracticeC4Media
 
Attunity Hortonworks Webinar- Sept 22, 2016
Attunity Hortonworks Webinar- Sept 22, 2016Attunity Hortonworks Webinar- Sept 22, 2016
Attunity Hortonworks Webinar- Sept 22, 2016Hortonworks
 

En vedette (20)

Getting more out of your big data
Getting more out of your big dataGetting more out of your big data
Getting more out of your big data
 
Hadoop Pig: MapReduce the easy way!
Hadoop Pig: MapReduce the easy way!Hadoop Pig: MapReduce the easy way!
Hadoop Pig: MapReduce the easy way!
 
Lambda architecture for real time big data
Lambda architecture for real time big dataLambda architecture for real time big data
Lambda architecture for real time big data
 
Big Data and Fast Data - Lambda Architecture in Action
Big Data and Fast Data - Lambda Architecture in ActionBig Data and Fast Data - Lambda Architecture in Action
Big Data and Fast Data - Lambda Architecture in Action
 
Lambda architecture: from zero to One
Lambda architecture: from zero to OneLambda architecture: from zero to One
Lambda architecture: from zero to One
 
Lambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, Scala
Lambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, ScalaLambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, Scala
Lambda Architecture with Spark Streaming, Kafka, Cassandra, Akka, Scala
 
Realtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and HadoopRealtime Analytics with Storm and Hadoop
Realtime Analytics with Storm and Hadoop
 
Kafka and Storm - event processing in realtime
Kafka and Storm - event processing in realtimeKafka and Storm - event processing in realtime
Kafka and Storm - event processing in realtime
 
Real time hadoop + mapreduce intro
Real time hadoop + mapreduce introReal time hadoop + mapreduce intro
Real time hadoop + mapreduce intro
 
COMPLEMENTING HADOOP WITH REAL-TIME DATA ANALYSIS from Structure:Data 2013
COMPLEMENTING HADOOP WITH REAL-TIME DATA ANALYSIS from Structure:Data 2013COMPLEMENTING HADOOP WITH REAL-TIME DATA ANALYSIS from Structure:Data 2013
COMPLEMENTING HADOOP WITH REAL-TIME DATA ANALYSIS from Structure:Data 2013
 
Url Shortening Services
Url Shortening ServicesUrl Shortening Services
Url Shortening Services
 
Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...
Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...
Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...
 
Storm
StormStorm
Storm
 
Lambda architecture with Spark
Lambda architecture with SparkLambda architecture with Spark
Lambda architecture with Spark
 
Lambda Architecture in Practice
Lambda Architecture in PracticeLambda Architecture in Practice
Lambda Architecture in Practice
 
Lambda Architecture Using SQL
Lambda Architecture Using SQLLambda Architecture Using SQL
Lambda Architecture Using SQL
 
National Weather Service Storm Spotter Training
National Weather Service Storm Spotter TrainingNational Weather Service Storm Spotter Training
National Weather Service Storm Spotter Training
 
[USI] Lambda-Architecture : comment réconcilier BigData et temps-réel
[USI] Lambda-Architecture : comment réconcilier BigData et temps-réel[USI] Lambda-Architecture : comment réconcilier BigData et temps-réel
[USI] Lambda-Architecture : comment réconcilier BigData et temps-réel
 
Lambda Architectures in Practice
Lambda Architectures in PracticeLambda Architectures in Practice
Lambda Architectures in Practice
 
Attunity Hortonworks Webinar- Sept 22, 2016
Attunity Hortonworks Webinar- Sept 22, 2016Attunity Hortonworks Webinar- Sept 22, 2016
Attunity Hortonworks Webinar- Sept 22, 2016
 

Similaire à A real-time (lambda) architecture using Hadoop & Storm (NoSQL Matters Cologne '14)

New Analytics Toolbox DevNexus 2015
New Analytics Toolbox DevNexus 2015New Analytics Toolbox DevNexus 2015
New Analytics Toolbox DevNexus 2015Robbie Strickland
 
A real-time architecture using Hadoop & Storm - Nathan Bijnens & Geert Van La...
A real-time architecture using Hadoop & Storm - Nathan Bijnens & Geert Van La...A real-time architecture using Hadoop & Storm - Nathan Bijnens & Geert Van La...
A real-time architecture using Hadoop & Storm - Nathan Bijnens & Geert Van La...jaxLondonConference
 
SSTable Reader Cassandra Day Denver 2014
SSTable Reader Cassandra Day Denver 2014SSTable Reader Cassandra Day Denver 2014
SSTable Reader Cassandra Day Denver 2014Ben Vanberg
 
Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming Djamel Zouaoui
 
Announcing Spark Driver for Cassandra
Announcing Spark Driver for CassandraAnnouncing Spark Driver for Cassandra
Announcing Spark Driver for CassandraDataStax
 
Stream, stream, stream: Different streaming methods with Spark and Kafka
Stream, stream, stream: Different streaming methods with Spark and KafkaStream, stream, stream: Different streaming methods with Spark and Kafka
Stream, stream, stream: Different streaming methods with Spark and KafkaItai Yaffe
 
Cassandra Lunch 130: Recap of Cassandra Forward Talks
Cassandra Lunch 130: Recap of Cassandra Forward TalksCassandra Lunch 130: Recap of Cassandra Forward Talks
Cassandra Lunch 130: Recap of Cassandra Forward TalksAnant Corporation
 
Paris Spark Meetup (Feb2015) ccarbone : SPARK Streaming vs Storm / MLLib / Ne...
Paris Spark Meetup (Feb2015) ccarbone : SPARK Streaming vs Storm / MLLib / Ne...Paris Spark Meetup (Feb2015) ccarbone : SPARK Streaming vs Storm / MLLib / Ne...
Paris Spark Meetup (Feb2015) ccarbone : SPARK Streaming vs Storm / MLLib / Ne...Cedric CARBONE
 
2014 04-AMPlifying-docker-at-451-hcts-eu
2014 04-AMPlifying-docker-at-451-hcts-eu2014 04-AMPlifying-docker-at-451-hcts-eu
2014 04-AMPlifying-docker-at-451-hcts-euAlex Heneveld
 
Apache spark y cómo lo usamos en nuestros proyectos
Apache spark y cómo lo usamos en nuestros proyectosApache spark y cómo lo usamos en nuestros proyectos
Apache spark y cómo lo usamos en nuestros proyectosOpenSistemas
 
Cassandra Community Webinar: Apache Spark Analytics at The Weather Channel - ...
Cassandra Community Webinar: Apache Spark Analytics at The Weather Channel - ...Cassandra Community Webinar: Apache Spark Analytics at The Weather Channel - ...
Cassandra Community Webinar: Apache Spark Analytics at The Weather Channel - ...DataStax Academy
 
How R Developers Can Build and Share Data and AI Applications that Scale with...
How R Developers Can Build and Share Data and AI Applications that Scale with...How R Developers Can Build and Share Data and AI Applications that Scale with...
How R Developers Can Build and Share Data and AI Applications that Scale with...Databricks
 
Cleveland Hadoop Users Group - Spark
Cleveland Hadoop Users Group - SparkCleveland Hadoop Users Group - Spark
Cleveland Hadoop Users Group - SparkVince Gonzalez
 
Scalable data pipeline at Traveloka - Facebook Dev Bandung
Scalable data pipeline at Traveloka - Facebook Dev BandungScalable data pipeline at Traveloka - Facebook Dev Bandung
Scalable data pipeline at Traveloka - Facebook Dev BandungRendy Bambang Junior
 
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...Simplilearn
 
Apache spark-melbourne-april-2015-meetup
Apache spark-melbourne-april-2015-meetupApache spark-melbourne-april-2015-meetup
Apache spark-melbourne-april-2015-meetupNed Shawa
 

Similaire à A real-time (lambda) architecture using Hadoop & Storm (NoSQL Matters Cologne '14) (20)

New Analytics Toolbox DevNexus 2015
New Analytics Toolbox DevNexus 2015New Analytics Toolbox DevNexus 2015
New Analytics Toolbox DevNexus 2015
 
A real-time architecture using Hadoop & Storm - Nathan Bijnens & Geert Van La...
A real-time architecture using Hadoop & Storm - Nathan Bijnens & Geert Van La...A real-time architecture using Hadoop & Storm - Nathan Bijnens & Geert Van La...
A real-time architecture using Hadoop & Storm - Nathan Bijnens & Geert Van La...
 
SSTable Reader Cassandra Day Denver 2014
SSTable Reader Cassandra Day Denver 2014SSTable Reader Cassandra Day Denver 2014
SSTable Reader Cassandra Day Denver 2014
 
Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming
 
Announcing Spark Driver for Cassandra
Announcing Spark Driver for CassandraAnnouncing Spark Driver for Cassandra
Announcing Spark Driver for Cassandra
 
Stream, stream, stream: Different streaming methods with Spark and Kafka
Stream, stream, stream: Different streaming methods with Spark and KafkaStream, stream, stream: Different streaming methods with Spark and Kafka
Stream, stream, stream: Different streaming methods with Spark and Kafka
 
Cassandra Lunch 130: Recap of Cassandra Forward Talks
Cassandra Lunch 130: Recap of Cassandra Forward TalksCassandra Lunch 130: Recap of Cassandra Forward Talks
Cassandra Lunch 130: Recap of Cassandra Forward Talks
 
Paris Spark Meetup (Feb2015) ccarbone : SPARK Streaming vs Storm / MLLib / Ne...
Paris Spark Meetup (Feb2015) ccarbone : SPARK Streaming vs Storm / MLLib / Ne...Paris Spark Meetup (Feb2015) ccarbone : SPARK Streaming vs Storm / MLLib / Ne...
Paris Spark Meetup (Feb2015) ccarbone : SPARK Streaming vs Storm / MLLib / Ne...
 
2014 04-AMPlifying-docker-at-451-hcts-eu
2014 04-AMPlifying-docker-at-451-hcts-eu2014 04-AMPlifying-docker-at-451-hcts-eu
2014 04-AMPlifying-docker-at-451-hcts-eu
 
Apache spark y cómo lo usamos en nuestros proyectos
Apache spark y cómo lo usamos en nuestros proyectosApache spark y cómo lo usamos en nuestros proyectos
Apache spark y cómo lo usamos en nuestros proyectos
 
ASPgems - kappa architecture
ASPgems - kappa architectureASPgems - kappa architecture
ASPgems - kappa architecture
 
New Analytics Toolbox
New Analytics ToolboxNew Analytics Toolbox
New Analytics Toolbox
 
Cassandra Community Webinar: Apache Spark Analytics at The Weather Channel - ...
Cassandra Community Webinar: Apache Spark Analytics at The Weather Channel - ...Cassandra Community Webinar: Apache Spark Analytics at The Weather Channel - ...
Cassandra Community Webinar: Apache Spark Analytics at The Weather Channel - ...
 
Unlock the value of your big data infrastructure
Unlock the value of your big data infrastructureUnlock the value of your big data infrastructure
Unlock the value of your big data infrastructure
 
How R Developers Can Build and Share Data and AI Applications that Scale with...
How R Developers Can Build and Share Data and AI Applications that Scale with...How R Developers Can Build and Share Data and AI Applications that Scale with...
How R Developers Can Build and Share Data and AI Applications that Scale with...
 
Stratio big data spain
Stratio   big data spainStratio   big data spain
Stratio big data spain
 
Cleveland Hadoop Users Group - Spark
Cleveland Hadoop Users Group - SparkCleveland Hadoop Users Group - Spark
Cleveland Hadoop Users Group - Spark
 
Scalable data pipeline at Traveloka - Facebook Dev Bandung
Scalable data pipeline at Traveloka - Facebook Dev BandungScalable data pipeline at Traveloka - Facebook Dev Bandung
Scalable data pipeline at Traveloka - Facebook Dev Bandung
 
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
 
Apache spark-melbourne-april-2015-meetup
Apache spark-melbourne-april-2015-meetupApache spark-melbourne-april-2015-meetup
Apache spark-melbourne-april-2015-meetup
 

Plus de Nathan Bijnens

Data Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricData Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricNathan Bijnens
 
Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)Nathan Bijnens
 
Dataminds - ML in Production
Dataminds - ML in ProductionDataminds - ML in Production
Dataminds - ML in ProductionNathan Bijnens
 
Azure Databricks & Spark @ Techorama 2018
Azure Databricks & Spark @ Techorama 2018Azure Databricks & Spark @ Techorama 2018
Azure Databricks & Spark @ Techorama 2018Nathan Bijnens
 
Big Data Expo '18 - Microsoft AI
Big Data Expo '18 - Microsoft AIBig Data Expo '18 - Microsoft AI
Big Data Expo '18 - Microsoft AINathan Bijnens
 
Spark on Azure, a gentle introduction (nov 2015)
Spark on Azure, a gentle introduction (nov 2015)Spark on Azure, a gentle introduction (nov 2015)
Spark on Azure, a gentle introduction (nov 2015)Nathan Bijnens
 
Cloudera, Azure and Big Data at Cloudera Meetup '17
Cloudera, Azure and Big Data at Cloudera Meetup '17Cloudera, Azure and Big Data at Cloudera Meetup '17
Cloudera, Azure and Big Data at Cloudera Meetup '17Nathan Bijnens
 
Microsoft AI at SAI '17
Microsoft AI at SAI '17Microsoft AI at SAI '17
Microsoft AI at SAI '17Nathan Bijnens
 
Microsoft Advanced Analytics @ Data Science Ghent '16
Microsoft Advanced Analytics @ Data Science Ghent '16Microsoft Advanced Analytics @ Data Science Ghent '16
Microsoft Advanced Analytics @ Data Science Ghent '16Nathan Bijnens
 
A real-time architecture using Hadoop and Storm @ BigData.be
A real-time architecture using Hadoop and Storm @ BigData.beA real-time architecture using Hadoop and Storm @ BigData.be
A real-time architecture using Hadoop and Storm @ BigData.beNathan Bijnens
 

Plus de Nathan Bijnens (10)

Data Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricData Mesh using Microsoft Fabric
Data Mesh using Microsoft Fabric
 
Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)Data Mesh in Azure using Cloud Scale Analytics (WAF)
Data Mesh in Azure using Cloud Scale Analytics (WAF)
 
Dataminds - ML in Production
Dataminds - ML in ProductionDataminds - ML in Production
Dataminds - ML in Production
 
Azure Databricks & Spark @ Techorama 2018
Azure Databricks & Spark @ Techorama 2018Azure Databricks & Spark @ Techorama 2018
Azure Databricks & Spark @ Techorama 2018
 
Big Data Expo '18 - Microsoft AI
Big Data Expo '18 - Microsoft AIBig Data Expo '18 - Microsoft AI
Big Data Expo '18 - Microsoft AI
 
Spark on Azure, a gentle introduction (nov 2015)
Spark on Azure, a gentle introduction (nov 2015)Spark on Azure, a gentle introduction (nov 2015)
Spark on Azure, a gentle introduction (nov 2015)
 
Cloudera, Azure and Big Data at Cloudera Meetup '17
Cloudera, Azure and Big Data at Cloudera Meetup '17Cloudera, Azure and Big Data at Cloudera Meetup '17
Cloudera, Azure and Big Data at Cloudera Meetup '17
 
Microsoft AI at SAI '17
Microsoft AI at SAI '17Microsoft AI at SAI '17
Microsoft AI at SAI '17
 
Microsoft Advanced Analytics @ Data Science Ghent '16
Microsoft Advanced Analytics @ Data Science Ghent '16Microsoft Advanced Analytics @ Data Science Ghent '16
Microsoft Advanced Analytics @ Data Science Ghent '16
 
A real-time architecture using Hadoop and Storm @ BigData.be
A real-time architecture using Hadoop and Storm @ BigData.beA real-time architecture using Hadoop and Storm @ BigData.be
A real-time architecture using Hadoop and Storm @ BigData.be
 

Dernier

Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our WorldEduminds Learning
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
LLMs, LMMs, their Improvement Suggestions and the Path towards AGI
LLMs, LMMs, their Improvement Suggestions and the Path towards AGILLMs, LMMs, their Improvement Suggestions and the Path towards AGI
LLMs, LMMs, their Improvement Suggestions and the Path towards AGIThomas Poetter
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhThiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhYasamin16
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxaleedritatuxx
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max PrincetonTimothy Spann
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...Amil Baba Dawood bangali
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxUnduhUnggah1
 
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一F La
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPTBoston Institute of Analytics
 
Business Analytics using Microsoft Excel
Business Analytics using Microsoft ExcelBusiness Analytics using Microsoft Excel
Business Analytics using Microsoft Excelysmaelreyes
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Thomas Poetter
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 

Dernier (20)

Learn How Data Science Changes Our World
Learn How Data Science Changes Our WorldLearn How Data Science Changes Our World
Learn How Data Science Changes Our World
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
LLMs, LMMs, their Improvement Suggestions and the Path towards AGI
LLMs, LMMs, their Improvement Suggestions and the Path towards AGILLMs, LMMs, their Improvement Suggestions and the Path towards AGI
LLMs, LMMs, their Improvement Suggestions and the Path towards AGI
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhThiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
 
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptxmodul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
Real-Time AI Streaming - AI Max Princeton
Real-Time AI  Streaming - AI Max PrincetonReal-Time AI  Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
 
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
NO1 Certified Black Magic Specialist Expert Amil baba in Lahore Islamabad Raw...
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docx
 
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
办理(Vancouver毕业证书)加拿大温哥华岛大学毕业证成绩单原版一比一
 
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
Business Analytics using Microsoft Excel
Business Analytics using Microsoft ExcelBusiness Analytics using Microsoft Excel
Business Analytics using Microsoft Excel
 
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
Minimizing AI Hallucinations/Confabulations and the Path towards AGI with Exa...
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
 

A real-time (lambda) architecture using Hadoop & Storm (NoSQL Matters Cologne '14)

  • 1. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 A real-time Lambda Architecture using Hadoop & Storm NoSQL Matters Cologne 2014 by Nathan Bijnens
  • 2. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Speaker Nathan Bijnens Big Data Engineer @ Virdata @nathan_gs
  • 3. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Computing Trends Past Computation (CPUs) Expensive Disk Storage Expensive Coordination Easy (Latches Don’t Often Hit) DRAM Expensive Computation Cheap (Many Core Computers) Disk Storage Cheap (Cheap Commodity Disks) Coordination Hard (Latches Stall a Lot, etc) DRAM / SSD Getting Cheap Current Source: Immutability Changes Everything - Pat Helland, RICON2012
  • 4. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Credits Nathan Marz ● Ex-Backtype & Twitter ● Startup in Stealthmode Creator of ● Storm ● Cascalog ● ElephantDB Coined the term Lambda Architecture. manning.com/marz
  • 5. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 a Data System
  • 6. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Not all information is equal. Some information is derived from other pieces of information. Data is more than Information
  • 7. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Eventually you will reach the most ‘raw’ form of information. This is the information you hold true, simply because it exists. Let’s call this ‘data’, very similar to ‘event’. Data is more than Information
  • 8. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Events used to manipulate the master data. Events: Before
  • 9. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Today, events are the master data. Events: After
  • 10. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Let’s store everything. Data System
  • 11. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Data is Immutable. Data System
  • 12. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Data is Time Based. Data System
  • 13. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Capturing change INSERT INTO contact (name, city) VALUES (‘Nathan’, ‘Antwerp’) UPDATE contact SET city = ‘Cologne’ WHERE name = ‘Nathan’ Traditionally
  • 14. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Capturing change INSERT INTO contact (name, city, timestamp) VALUES (‘Nathan’, ‘Antwerp’, 2008-10-11 20:00Z) INSERT INTO contact (name, city, timestamp) VALUES (‘Nathan’, ‘Cologne’, 2014-04-29 10:00Z) in a Data System
  • 15. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 The data you query is often transformed, aggregated, ... Rarely used in it’s original form. Query
  • 16. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Query = function ( all data ) Query
  • 17. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Query: Number of people living in each city Person City Timestamp Nathan Antwerp 2008-10-11 John Cologne 2010-01-23 Dirk Antwerp 2012-09-12 Nathan Cologne 2014-04-29 City Count Antwerp 1 Cologne 2
  • 18. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Query All Data QueryPrecomputed View
  • 19. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Layered Architecture Batch Layer Speed Layer Serving Layer
  • 20. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Layered Architecture Hadoop ElephantDB Incoming Data Cassandra Query
  • 21. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Batch Layer
  • 22. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Batch Layer Hadoop ElephantDB Incoming Data
  • 23. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Batch Layer The batch layer can calculate anything, given enough time... Unrestrained computation.
  • 24. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 No need to De-Normalize. The batch layer stores the data normalized, the generated views are often, if not always denormalized. Batch Layer
  • 25. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Horizontally scalable. Batch Layer
  • 26. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 High Latency. Let’s for now pretend the update latency doesn’t matter. Batch Layer
  • 27. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Functional computation, based on immutable inputs, is idempotent. Batch Layer
  • 28. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Stores a master copy of the data set Batch Layer … append only
  • 29. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Batch Layer
  • 30. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Batch: view generation Master Dataset View #1 View #3 View #2 MapReduce MapReduce MapReduce
  • 31. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 MapReduce 1. Take a large data set and divide it into subsets 2. Perform the same function on all subsets 3. Combine the output from all subsets … … Output DoWork() DoWork() DoWork() … MAPREDUCE
  • 32. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 MapReduce
  • 33. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Serialization & Schema Catch errors as quickly as they happen. Validate on write vs on read. Catch errors as quickly as they happen. Validate on write vs on read.
  • 34. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 CSV is actually a serialization language that is just poorly defined. Serialization & Schema
  • 35. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Use a format with a schema ● Thrift ● Avro ● Protocolbuffers Could be combined with Parquet. Added bonus: it’s faster and uses less space. Serialization & Schema
  • 36. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Batch View Database No random writes required. Read Only database
  • 37. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Every iteration produces the views from scratch. Batch View Database
  • 38. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Pure Lambda databases ● ElephantDB ● SploutSQL Databases with a batch load & read only views ● Voldemort Other databases that could be used ● ElasticSearch/Solr: generate the lucene indexes using MapReduce ● Cassandra: generate sstables ● ... Batch View Databases
  • 39. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Batch Layer Without the associated complexities. Eventually consistent
  • 40. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Batch Layer Data absorbed into Batch Views Time Now We are not done yet… Not yet absorbed. Just a few hours of data.
  • 41. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Speed Layer
  • 42. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Speed Layer Hadoop ElephantDB Incoming Data Cassandra
  • 43. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Stream processing. Speed Layer
  • 44. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Continuous computation. Speed Layer
  • 45. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Storing a limited window of data. Compensating for the last few hours of data. Speed Layer
  • 46. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 All the complexity is isolated in the Speed Layer. If anything goes wrong, it’s auto-corrected. Speed Layer
  • 47. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 You have a choice between: ● Availability ○ Queries are eventual consistent ● Consistency ○ Queries are consistent CAP
  • 48. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Eventual accuracy Some algorithms are hard to implement in real-time. For those cases we could estimate the results.
  • 49. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Storm Speed Layer
  • 50. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Message passing Storm
  • 51. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Distributed processing Storm
  • 52. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Horizontally scalable. Storm
  • 53. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Incremental algorithms Storm
  • 54. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Fast. Storm
  • 55. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Storm
  • 56. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Storm Tuple Stream
  • 57. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Storm Spout Bolt
  • 58. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Storm Grouping
  • 59. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Data Ingestion Queues & Pub/Sub models are a natural fit.
  • 60. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 ● Kafka ● Flume ● Scribe ● *MQ ● … Data Ingestion
  • 61. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Speed Layer Views The views need to be stored in a random writable database.
  • 62. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 The logic behind a R/W database is much more complex than a read-only view. Speed Layer Views
  • 63. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 The views are stored in a Read & Write database. ● Cassandra ● Hbase ● Redis ● SQL ● ElasticSearch ● ... Speed Layer Views
  • 64. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Serving Layer
  • 65. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Serving Layer Hadoop ElephantDB Incoming Data Cassandra Query
  • 66. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Serving Layer Random reads.
  • 67. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 This layer queries the batch & real-time views and merges it. Serving layer
  • 68. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 How to query an Average? Serving Layer
  • 69. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Side note: CQRS
  • 70. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 CQRS Source: martinfowler.com/bliki/CQRS.html - Martin Fowler
  • 71. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 CQRS & Event Sourcing Event Sourcing ● Every command is a new event. ● The event store keeps all events, new events are appended. ● Any query loops through all related events, even to produce an aggregate. source: CQRS Journey - Microsoft Patterns & Practices
  • 72. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Lambda Architecture
  • 73. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Lambda Architecture The Lambda Architecture can discard any view, batch and real-time, and just recreate everything from the master data.
  • 74. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Mistakes are corrected via recomputation. Write bad data? Remove the data & recompute. Bug in view generation? Just recompute the view. Lambda Architecture
  • 75. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Data storage is highly optimized. Lambda Architecture
  • 76. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Immutability changes everything. Lambda Architecture
  • 77. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Questions? @nathan_gs #nosql14 nathan@nathan.gs / slideshare.net/nathan_gs lambda-architecture.net / @LambdaArch / #LambdaArch
  • 78. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Virdata is the cross-industry cloud service/platform for the Internet of Things. Designed to elastically scale to monitor and manage an unprecedented amount of devices and applications using concurrent persistent connections, Virdata opens the door to numerous new business opportunities. Virdata combines Publish-Subscribe based Distributed Messaging, Complex Event Processing and state-of-the-art Big Data paradigms to enable both historical & real-time monitoring and near real-time analytics with a scale required for the Internet of Things.
  • 79. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Acknowledgements I would like to thank Nathan Marz for writing a very insightful book, where most of the ideas in this presentation come from. Parts of this presentation has been created while working for datacrunchers.eu, I thank them for the opportunities to speak about the Lambda Architecture both at clients and at conferences. DataCrunchers is the first Big Data agency in Belgium. Schema’s & Pictures: Computing Trends: Immutability Changes Everything - Pat Helland, RICON2012 MapReduce #1: PolybasePass2012.pptx - David J. DeWitt, Microsoft Gray Systems Lab MapReduce #2: Introduction to MapReduce and Hadoop - Shivnath Babu, Duke CQRS: martinfowler.com/bliki/CQRS.html - Martin Fowler CQRS & Event Sourcing: CQRS Journey - Adam Dymitruk, Josh Elster & Mark Seemann, Microsoft Patterns & Practices
  • 80. NoSQL Matter 2014 - A real-time (Lambda) Architecture using Hadoop & Storm - #nosql14 Thank you @nathan_gs nathan@nathan.gs