SlideShare a Scribd company logo
1 of 28
1
© OCTO 2015© OCTO 2015
Event Driven Architecture
bluckbluck
2
© OCTO 2015
The first problem was how to transport data between systems
The second part of this problem was the need to do richer
analytical data processing with very low latency.
3
© OCTO 2015
The pipeline for log data was
scalable
but lossy and
could only deliver data with high latency.
The pipeline between Oracle instances was
fast,
exact, and
real-time,
but not available to any other systems.
4
© OCTO 2015
The pipeline of Oracle data for Hadoop was
periodic CSV dumps—
high throughput,
but batch.
The pipeline of data to our search system was
low latency,
but unscalable
and tied directly to the database.
The messaging systems were
low latency
but unreliable
and unscalable.
5
© OCTO 2015
6
© OCTO 2015
We added data centers geographically distributed around the world we had to
build out geographical replication for each of these data flows
he data was always unreliable.
Our reports were untrustworthy,
derived indexes and stores were questionable, and
everyone spent a lot of time battling data quality issues of all kinds
At the same time we weren't just shipping data from place to place; we also
wanted to do things with it
Hadoop had given us a platform for batch processing, data archival, and ad hoc
processing, and this had been enormously successful, but we lacked an
analogous platform for low-latency processing.
7
© OCTO 2015
Stream Data Plateform
8
© OCTO 2015
Stream Data Plateform
9
© OCTO 2015
Your database stores the current state of your data. But the current state is
always caused by some actions that took place in the past. The actions are the
events.
Much of what people refer to when they talk about "big data" is really the act of
capturing these events that previously weren't recorded anywhere and putting
them to use for analysis, optimization, and decision making
Event streams are an obvious fit for log data or things like "orders", "sales",
"clicks" or "trades" that are obviously event-like.
The Rise of Events and Event Streams
10
© OCTO 2015
data in databases can also be thought of as an event stream. The process of creating a
backup or standby copy of a database :
to dump out the contents
to take a "diff" of what has changed
Change capture : If we take our diffs more and more frequently what we will be left with is a
continuous sequence of single row changes.
By publishing the database changes into the stream data platform you add this to the other
set of event streams. You can use these streams to synchronize other systems like
Hadoop cluster,
a replica database, or
a search index, or
you can feed these changes into applications
or stream processors to directly compute new things off the changes.
Databases Are Event Streams
11
© OCTO 2015
A stream data platform has two primary uses:
Data Integration: The stream data platform captures streams of events or data changes and
feeds these to other data systems such as relational databases, key-value stores, Hadoop, or
the data warehouse.
Stream processing: It enables continuous, real-time processing and transformation of these
streams and makes the results available system-wide.
The stream data platform is a central hub for data streams.
t also acts as a buffer between these systems—the publisher of data doesn't need to be
concerned with the various systems that will eventually consume and load the data. This
means consumers of data can come and go and are fully decoupled from the source.
What Is a Stream Data Platform For?
12
© OCTO 2015
Hadoop wants to be able to maintain a full copy of all the data in your organization and act
as a "data lake" or "enterprise data hub".
Directly integrating each data source with HDFS is a hugely time consuming proposition
the end result only makes that data available to Hadoop.
This type of data capture isn't suitable for real-time processing or syncing other real-time
applications.
This same pipeline can run in reverse: Hadoop and the data warehouse environment can
publish out results that need to flow into appropriate systems for serving in customer-facing
applications.
What Is a Stream Data Platform For? Zoom
Hadoop
13
© OCTO 2015
The stream processing use case plays off the data integration use case.
The results of the stream processing are just a new, derived stream.
Stream processing acts as both a way to develop applications that need low-latency
transformations but it is also directly part of the data integration usage as well:
integrating systems often requires some munging of data streams in between.
What Is a Stream Data Platform For? Zoom
ETL
14
© OCTO 2015
A stream data platform is similar to an enterprise messaging system—it receives
messages and distributes them to interested subscribers. There are three
important differences:
Messaging systems are typically run in one-off deployments for different
applications. The purpose of the stream data platform is very much as a central data
hub.
Messaging systems do a poor job of supporting integration with batch systems, such
as a data warehouse or a Hadoop cluster, as they have limited data storage
capacity.
Messaging systems do not provide semantics that are easily compatible with rich
stream processing.
How Does a Stream Data Platform Relate To Existing Things
15
© OCTO 2015
In other words a data stream data platform is a messaging system whose role
has been rethought at a company-wide scale.
How Does a Stream Data Platform Relate To Existing Things
16
© OCTO 2015
A stream data platform is a true platform that any other system can choose to
tap into and many applications can build around.
by making data available in a uniform format in a single place with a common
stream abstraction, many of the routine data clean-up tasks can be avoided
entirely.
Data Integration Tools
17
© OCTO 2015
The advantage of a stream data platform is that transformation is fundamentally
decoupled from the stream itself.
This code can live in applications or stream processing tasks, allowing teams to
iterate at their own pace without a central bottleneck for application
development.
Enterprise Service Buses
18
© OCTO 2015
Databases have long had similar log mechanisms such as Golden Gate.
However these mechanisms are limited to database changes only and are not a
general purpose event capture platform.
Change Capture Systems
19
© OCTO 2015
A stream data platform doesn't replace your data warehouse; in fact, quite the
opposite: it feeds it data.
Data Warehouses and Hadoop
20
© OCTO 2015
They attempt to add richer processing semantics to subscribers and can make
implementing data transformation easier.
Stream Processing Systems
21
© OCTO 2015
everything from user activity to database changes to administrative actions like
restarting a process are captured in real-time streams that are subscribed to and
processed in real-time.
What Does This Look Like In Practice?
22
© OCTO 2015
part of the promise of this approach to data management is having a central repository with
the full set of data streams your organization generates. This works best when data is all in
the same place.
simplifying system architecture.
fewer integration points for data consumers,
fewer things to operate,
lower incremental cost for adding new applications,
makes it easier to reason about data flow.
But, there are several reasons to end up with multiple clusters
To keep activity local to a datacenter
For security reasons
For SLA control.
Rcommendations : Limit The Number of Clusters
23
© OCTO 2015
Apache Kafka does not enforce any particular format
If each individual or application chooses a representation of their own preference—say
some use JSON, others XML, and others CSV—the result is that any system or process
which uses multiple data streams has to munge and understand each of these.
Local optimization—choosing your favorite format for data you produce—leads to huge
global sub-optimization since now each system needs to write N adaptors, one for each
format it wants to ingest.
imagine how useless the Unix toolchain would be if each tool invented its own format: you
would have to translate between formats every time you wanted to pipe one command to
another.
Rcommendations : Pick A Single Data Format
24
© OCTO 2015
Connecting all systems directly would look
something like this
Whereas having this central stream data platform
looks something like this
Rcommendations : Pick A Single Data Format
25
© OCTO 2015
We think Avro is the best choice for a number of reasons:
1. It has a direct mapping to and from JSON
2. It has a very compact format. The bulk of JSON, repeating every field name with every single
record, is what makes JSON inefficient for high-volume usage.
3. It is very fast.
4. It has great bindings for a wide variety of programming languages so you can generate Java
objects that make working with event data easier, but it does not require code generation so
tools can be written generically for any data stream.
5. It has a rich, extensible schema language defined in pure JSON
6. It has the best notion of compatibility for evolving your data over time.
Rcommendations : Use Avro as Your Data Format
26
© OCTO 2015
Isn't the modern world of big data all about unstructured data, dumped in whatever form is
convenient, and parsed later when it is queried?
One of the primary advantages of this type of architecture where data is modeled as
streams is that applications are decoupled. Applications produce a stream of events
capturing what occurred without knowledge of which things subscribe to these streams.
The Need For Schemas
27
© OCTO 2015
Whenever you see a common activity across multiple systems try to use a common schema
for this activity.
An example of this that is common to all businesses is application errors.
Share Event Schemas
28
© OCTO 2015
MODELING SPECIFIC DATA TYPES IN KAFKA

More Related Content

What's hot

Building Event-Driven Services with Apache Kafka
Building Event-Driven Services with Apache KafkaBuilding Event-Driven Services with Apache Kafka
Building Event-Driven Services with Apache Kafkaconfluent
 
Introduction to Stream Processing with Apache Flink (2019-11-02 Bengaluru Mee...
Introduction to Stream Processing with Apache Flink (2019-11-02 Bengaluru Mee...Introduction to Stream Processing with Apache Flink (2019-11-02 Bengaluru Mee...
Introduction to Stream Processing with Apache Flink (2019-11-02 Bengaluru Mee...Timo Walther
 
Building Continuously Curated Ingestion Pipelines
Building Continuously Curated Ingestion PipelinesBuilding Continuously Curated Ingestion Pipelines
Building Continuously Curated Ingestion PipelinesArvind Prabhakar
 
A Practical Guide to Selecting a Stream Processing Technology
A Practical Guide to Selecting a Stream Processing Technology A Practical Guide to Selecting a Stream Processing Technology
A Practical Guide to Selecting a Stream Processing Technology confluent
 
Evolving from Messaging to Event Streaming
Evolving from Messaging to Event StreamingEvolving from Messaging to Event Streaming
Evolving from Messaging to Event Streamingconfluent
 
The Data Dichotomy- Rethinking the Way We Treat Data and Services
The Data Dichotomy- Rethinking the Way We Treat Data and ServicesThe Data Dichotomy- Rethinking the Way We Treat Data and Services
The Data Dichotomy- Rethinking the Way We Treat Data and Servicesconfluent
 
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...Big Data Spain
 
Designing and Implementing Information Systems with Event Modeling, Bobby Cal...
Designing and Implementing Information Systems with Event Modeling, Bobby Cal...Designing and Implementing Information Systems with Event Modeling, Bobby Cal...
Designing and Implementing Information Systems with Event Modeling, Bobby Cal...confluent
 
Data Lake and the rise of the microservices
Data Lake and the rise of the microservicesData Lake and the rise of the microservices
Data Lake and the rise of the microservicesBigstep
 
Monitoring and Troubleshooting a Real Time Pipeline
Monitoring and Troubleshooting a Real Time PipelineMonitoring and Troubleshooting a Real Time Pipeline
Monitoring and Troubleshooting a Real Time PipelineApache Apex
 
Building a Modern, Scalable Cyber Intelligence Platform with Apache Kafka | J...
Building a Modern, Scalable Cyber Intelligence Platform with Apache Kafka | J...Building a Modern, Scalable Cyber Intelligence Platform with Apache Kafka | J...
Building a Modern, Scalable Cyber Intelligence Platform with Apache Kafka | J...HostedbyConfluent
 
Leveraging Mainframe Data for Modern Analytics
Leveraging Mainframe Data for Modern AnalyticsLeveraging Mainframe Data for Modern Analytics
Leveraging Mainframe Data for Modern Analyticsconfluent
 
Using Machine Learning to Understand Kafka Runtime Behavior (Shivanath Babu, ...
Using Machine Learning to Understand Kafka Runtime Behavior (Shivanath Babu, ...Using Machine Learning to Understand Kafka Runtime Behavior (Shivanath Babu, ...
Using Machine Learning to Understand Kafka Runtime Behavior (Shivanath Babu, ...confluent
 
Introducing Events and Stream Processing into Nationwide Building Society (Ro...
Introducing Events and Stream Processing into Nationwide Building Society (Ro...Introducing Events and Stream Processing into Nationwide Building Society (Ro...
Introducing Events and Stream Processing into Nationwide Building Society (Ro...confluent
 
How To Use Kafka and Druid to Tame Your Router Data (Rachel Pedreschi, Imply ...
How To Use Kafka and Druid to Tame Your Router Data (Rachel Pedreschi, Imply ...How To Use Kafka and Druid to Tame Your Router Data (Rachel Pedreschi, Imply ...
How To Use Kafka and Druid to Tame Your Router Data (Rachel Pedreschi, Imply ...confluent
 

What's hot (20)

Building Event-Driven Services with Apache Kafka
Building Event-Driven Services with Apache KafkaBuilding Event-Driven Services with Apache Kafka
Building Event-Driven Services with Apache Kafka
 
Introduction to Stream Processing with Apache Flink (2019-11-02 Bengaluru Mee...
Introduction to Stream Processing with Apache Flink (2019-11-02 Bengaluru Mee...Introduction to Stream Processing with Apache Flink (2019-11-02 Bengaluru Mee...
Introduction to Stream Processing with Apache Flink (2019-11-02 Bengaluru Mee...
 
Building Continuously Curated Ingestion Pipelines
Building Continuously Curated Ingestion PipelinesBuilding Continuously Curated Ingestion Pipelines
Building Continuously Curated Ingestion Pipelines
 
A Practical Guide to Selecting a Stream Processing Technology
A Practical Guide to Selecting a Stream Processing Technology A Practical Guide to Selecting a Stream Processing Technology
A Practical Guide to Selecting a Stream Processing Technology
 
Evolving from Messaging to Event Streaming
Evolving from Messaging to Event StreamingEvolving from Messaging to Event Streaming
Evolving from Messaging to Event Streaming
 
The Data Dichotomy- Rethinking the Way We Treat Data and Services
The Data Dichotomy- Rethinking the Way We Treat Data and ServicesThe Data Dichotomy- Rethinking the Way We Treat Data and Services
The Data Dichotomy- Rethinking the Way We Treat Data and Services
 
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
 
Designing and Implementing Information Systems with Event Modeling, Bobby Cal...
Designing and Implementing Information Systems with Event Modeling, Bobby Cal...Designing and Implementing Information Systems with Event Modeling, Bobby Cal...
Designing and Implementing Information Systems with Event Modeling, Bobby Cal...
 
Data Lake and the rise of the microservices
Data Lake and the rise of the microservicesData Lake and the rise of the microservices
Data Lake and the rise of the microservices
 
Monitoring and Troubleshooting a Real Time Pipeline
Monitoring and Troubleshooting a Real Time PipelineMonitoring and Troubleshooting a Real Time Pipeline
Monitoring and Troubleshooting a Real Time Pipeline
 
The Evolution of Big Data Pipelines at Intuit
The Evolution of Big Data Pipelines at Intuit The Evolution of Big Data Pipelines at Intuit
The Evolution of Big Data Pipelines at Intuit
 
Building a Modern, Scalable Cyber Intelligence Platform with Apache Kafka | J...
Building a Modern, Scalable Cyber Intelligence Platform with Apache Kafka | J...Building a Modern, Scalable Cyber Intelligence Platform with Apache Kafka | J...
Building a Modern, Scalable Cyber Intelligence Platform with Apache Kafka | J...
 
Reliable and Scalable Data Ingestion at Airbnb
Reliable and Scalable Data Ingestion at AirbnbReliable and Scalable Data Ingestion at Airbnb
Reliable and Scalable Data Ingestion at Airbnb
 
Leveraging Mainframe Data for Modern Analytics
Leveraging Mainframe Data for Modern AnalyticsLeveraging Mainframe Data for Modern Analytics
Leveraging Mainframe Data for Modern Analytics
 
Using Machine Learning to Understand Kafka Runtime Behavior (Shivanath Babu, ...
Using Machine Learning to Understand Kafka Runtime Behavior (Shivanath Babu, ...Using Machine Learning to Understand Kafka Runtime Behavior (Shivanath Babu, ...
Using Machine Learning to Understand Kafka Runtime Behavior (Shivanath Babu, ...
 
Flink Streaming
Flink StreamingFlink Streaming
Flink Streaming
 
Introducing Events and Stream Processing into Nationwide Building Society (Ro...
Introducing Events and Stream Processing into Nationwide Building Society (Ro...Introducing Events and Stream Processing into Nationwide Building Society (Ro...
Introducing Events and Stream Processing into Nationwide Building Society (Ro...
 
How To Use Kafka and Druid to Tame Your Router Data (Rachel Pedreschi, Imply ...
How To Use Kafka and Druid to Tame Your Router Data (Rachel Pedreschi, Imply ...How To Use Kafka and Druid to Tame Your Router Data (Rachel Pedreschi, Imply ...
How To Use Kafka and Druid to Tame Your Router Data (Rachel Pedreschi, Imply ...
 
In Flux Limiting for a multi-tenant logging service
In Flux Limiting for a multi-tenant logging serviceIn Flux Limiting for a multi-tenant logging service
In Flux Limiting for a multi-tenant logging service
 
Accelerating Data Warehouse Modernization
Accelerating Data Warehouse ModernizationAccelerating Data Warehouse Modernization
Accelerating Data Warehouse Modernization
 

Viewers also liked

Event Driven Architectures - Phoenix Java Users Group 2013
Event Driven Architectures - Phoenix Java Users Group 2013Event Driven Architectures - Phoenix Java Users Group 2013
Event Driven Architectures - Phoenix Java Users Group 2013clairvoyantllc
 
Fraud Management Industry Update Webinar
Fraud Management Industry Update WebinarFraud Management Industry Update Webinar
Fraud Management Industry Update WebinarcVidya Networks
 
Detecting Opportunities and Threats with Complex Event Processing: Case St...
Detecting Opportunities and Threats with Complex Event Processing: Case St...Detecting Opportunities and Threats with Complex Event Processing: Case St...
Detecting Opportunities and Threats with Complex Event Processing: Case St...Tim Bass
 
Spring Batch - Lessons Learned out of a real life banking system.
Spring Batch - Lessons Learned out of a real life banking system.Spring Batch - Lessons Learned out of a real life banking system.
Spring Batch - Lessons Learned out of a real life banking system.Raffael Schmid
 
Event Driven Architecture (EDA), November 2, 2006
Event Driven Architecture (EDA), November 2, 2006Event Driven Architecture (EDA), November 2, 2006
Event Driven Architecture (EDA), November 2, 2006Tim Bass
 
The Impact of Messaging Standards on Event-Driven Architecture and IoT
The Impact of Messaging Standards on Event-Driven Architecture and IoTThe Impact of Messaging Standards on Event-Driven Architecture and IoT
The Impact of Messaging Standards on Event-Driven Architecture and IoTSolace
 
Event Driven Architecture - MeshU - Ilya Grigorik
Event Driven Architecture - MeshU - Ilya GrigorikEvent Driven Architecture - MeshU - Ilya Grigorik
Event Driven Architecture - MeshU - Ilya GrigorikIlya Grigorik
 
Complex Event Processing in Practice at jDays 2012
Complex Event Processing in Practice at jDays 2012Complex Event Processing in Practice at jDays 2012
Complex Event Processing in Practice at jDays 2012Peter Norrhall
 
Developing real-time data pipelines with Spring and Kafka
Developing real-time data pipelines with Spring and KafkaDeveloping real-time data pipelines with Spring and Kafka
Developing real-time data pipelines with Spring and Kafkamarius_bogoevici
 
Spoilt for Choice: How to Choose the Right Enterprise Service Bus (ESB)?
Spoilt for Choice: How to Choose the Right Enterprise Service Bus (ESB)?Spoilt for Choice: How to Choose the Right Enterprise Service Bus (ESB)?
Spoilt for Choice: How to Choose the Right Enterprise Service Bus (ESB)?Kai Wähner
 
Data Microservices with Spring Cloud Stream, Task, and Data Flow #jsug #spri...
Data Microservices with Spring Cloud Stream, Task,  and Data Flow #jsug #spri...Data Microservices with Spring Cloud Stream, Task,  and Data Flow #jsug #spri...
Data Microservices with Spring Cloud Stream, Task, and Data Flow #jsug #spri...Toshiaki Maki
 
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...Databricks
 
Introducing Kafka Streams, the new stream processing library of Apache Kafka,...
Introducing Kafka Streams, the new stream processing library of Apache Kafka,...Introducing Kafka Streams, the new stream processing library of Apache Kafka,...
Introducing Kafka Streams, the new stream processing library of Apache Kafka,...Michael Noll
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka StreamsGuozhang Wang
 

Viewers also liked (16)

Event Driven Architectures - Phoenix Java Users Group 2013
Event Driven Architectures - Phoenix Java Users Group 2013Event Driven Architectures - Phoenix Java Users Group 2013
Event Driven Architectures - Phoenix Java Users Group 2013
 
Fraud Management Industry Update Webinar
Fraud Management Industry Update WebinarFraud Management Industry Update Webinar
Fraud Management Industry Update Webinar
 
Detecting Opportunities and Threats with Complex Event Processing: Case St...
Detecting Opportunities and Threats with Complex Event Processing: Case St...Detecting Opportunities and Threats with Complex Event Processing: Case St...
Detecting Opportunities and Threats with Complex Event Processing: Case St...
 
Spring Batch - Lessons Learned out of a real life banking system.
Spring Batch - Lessons Learned out of a real life banking system.Spring Batch - Lessons Learned out of a real life banking system.
Spring Batch - Lessons Learned out of a real life banking system.
 
Kafka Utrecht Meetup
Kafka Utrecht MeetupKafka Utrecht Meetup
Kafka Utrecht Meetup
 
Event Driven Architecture (EDA), November 2, 2006
Event Driven Architecture (EDA), November 2, 2006Event Driven Architecture (EDA), November 2, 2006
Event Driven Architecture (EDA), November 2, 2006
 
The Impact of Messaging Standards on Event-Driven Architecture and IoT
The Impact of Messaging Standards on Event-Driven Architecture and IoTThe Impact of Messaging Standards on Event-Driven Architecture and IoT
The Impact of Messaging Standards on Event-Driven Architecture and IoT
 
Event Driven Architecture - MeshU - Ilya Grigorik
Event Driven Architecture - MeshU - Ilya GrigorikEvent Driven Architecture - MeshU - Ilya Grigorik
Event Driven Architecture - MeshU - Ilya Grigorik
 
Complex Event Processing in Practice at jDays 2012
Complex Event Processing in Practice at jDays 2012Complex Event Processing in Practice at jDays 2012
Complex Event Processing in Practice at jDays 2012
 
Kafka at scale facebook israel
Kafka at scale   facebook israelKafka at scale   facebook israel
Kafka at scale facebook israel
 
Developing real-time data pipelines with Spring and Kafka
Developing real-time data pipelines with Spring and KafkaDeveloping real-time data pipelines with Spring and Kafka
Developing real-time data pipelines with Spring and Kafka
 
Spoilt for Choice: How to Choose the Right Enterprise Service Bus (ESB)?
Spoilt for Choice: How to Choose the Right Enterprise Service Bus (ESB)?Spoilt for Choice: How to Choose the Right Enterprise Service Bus (ESB)?
Spoilt for Choice: How to Choose the Right Enterprise Service Bus (ESB)?
 
Data Microservices with Spring Cloud Stream, Task, and Data Flow #jsug #spri...
Data Microservices with Spring Cloud Stream, Task,  and Data Flow #jsug #spri...Data Microservices with Spring Cloud Stream, Task,  and Data Flow #jsug #spri...
Data Microservices with Spring Cloud Stream, Task, and Data Flow #jsug #spri...
 
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
Everyday I'm Shuffling - Tips for Writing Better Spark Programs, Strata San J...
 
Introducing Kafka Streams, the new stream processing library of Apache Kafka,...
Introducing Kafka Streams, the new stream processing library of Apache Kafka,...Introducing Kafka Streams, the new stream processing library of Apache Kafka,...
Introducing Kafka Streams, the new stream processing library of Apache Kafka,...
 
Introduction to Kafka Streams
Introduction to Kafka StreamsIntroduction to Kafka Streams
Introduction to Kafka Streams
 

Similar to Event Driven Architecture

Introduction to GCP Data Flow Presentation
Introduction to GCP Data Flow PresentationIntroduction to GCP Data Flow Presentation
Introduction to GCP Data Flow PresentationKnoldus Inc.
 
Introduction to GCP DataFlow Presentation
Introduction to GCP DataFlow PresentationIntroduction to GCP DataFlow Presentation
Introduction to GCP DataFlow PresentationKnoldus Inc.
 
Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)Denodo
 
Enterprise application characteristics
Enterprise application characteristicsEnterprise application characteristics
Enterprise application characteristicsSalegram Padhee
 
From Relational Database Management to Big Data: Solutions for Data Migration...
From Relational Database Management to Big Data: Solutions for Data Migration...From Relational Database Management to Big Data: Solutions for Data Migration...
From Relational Database Management to Big Data: Solutions for Data Migration...Cognizant
 
Data Infrastructure at LinkedIn
Data Infrastructure at LinkedIn Data Infrastructure at LinkedIn
Data Infrastructure at LinkedIn Amy W. Tang
 
Attunity Hortonworks Webinar- Sept 22, 2016
Attunity Hortonworks Webinar- Sept 22, 2016Attunity Hortonworks Webinar- Sept 22, 2016
Attunity Hortonworks Webinar- Sept 22, 2016Hortonworks
 
Webinar future dataintegration-datamesh-and-goldengatekafka
Webinar future dataintegration-datamesh-and-goldengatekafkaWebinar future dataintegration-datamesh-and-goldengatekafka
Webinar future dataintegration-datamesh-and-goldengatekafkaJeffrey T. Pollock
 
Big data an elephant business opportunities
Big data an elephant   business opportunitiesBig data an elephant   business opportunities
Big data an elephant business opportunitiesBigdata Meetup Kochi
 
Data warehouse concepts
Data warehouse conceptsData warehouse concepts
Data warehouse conceptsobieefans
 
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)Rittman Analytics
 
Fbdl enabling comprehensive_data_services
Fbdl enabling comprehensive_data_servicesFbdl enabling comprehensive_data_services
Fbdl enabling comprehensive_data_servicesCindy Irby
 
Effective Multi-stream Joining in Apache Samza Framework
Effective Multi-stream Joining in Apache Samza FrameworkEffective Multi-stream Joining in Apache Samza Framework
Effective Multi-stream Joining in Apache Samza FrameworkTao Feng
 
Idc analyst report a new breed of servers for digital transformation
Idc analyst report a new breed of servers for digital transformationIdc analyst report a new breed of servers for digital transformation
Idc analyst report a new breed of servers for digital transformationKaizenlogcom
 
Hd insight overview
Hd insight overviewHd insight overview
Hd insight overviewvhrocca
 
Sybase IQ ile Analitik Platform
Sybase IQ ile Analitik PlatformSybase IQ ile Analitik Platform
Sybase IQ ile Analitik PlatformSybase Türkiye
 

Similar to Event Driven Architecture (20)

ExecutiveWhitePaper
ExecutiveWhitePaperExecutiveWhitePaper
ExecutiveWhitePaper
 
Introduction to GCP Data Flow Presentation
Introduction to GCP Data Flow PresentationIntroduction to GCP Data Flow Presentation
Introduction to GCP Data Flow Presentation
 
Introduction to GCP DataFlow Presentation
Introduction to GCP DataFlow PresentationIntroduction to GCP DataFlow Presentation
Introduction to GCP DataFlow Presentation
 
Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)Future of Data Strategy (ASEAN)
Future of Data Strategy (ASEAN)
 
Benefits of a data lake
Benefits of a data lake Benefits of a data lake
Benefits of a data lake
 
ExecutiveWhitePaper
ExecutiveWhitePaperExecutiveWhitePaper
ExecutiveWhitePaper
 
Enterprise application characteristics
Enterprise application characteristicsEnterprise application characteristics
Enterprise application characteristics
 
From Relational Database Management to Big Data: Solutions for Data Migration...
From Relational Database Management to Big Data: Solutions for Data Migration...From Relational Database Management to Big Data: Solutions for Data Migration...
From Relational Database Management to Big Data: Solutions for Data Migration...
 
Data Infrastructure at LinkedIn
Data Infrastructure at LinkedIn Data Infrastructure at LinkedIn
Data Infrastructure at LinkedIn
 
Sdn in big data
Sdn in big dataSdn in big data
Sdn in big data
 
Attunity Hortonworks Webinar- Sept 22, 2016
Attunity Hortonworks Webinar- Sept 22, 2016Attunity Hortonworks Webinar- Sept 22, 2016
Attunity Hortonworks Webinar- Sept 22, 2016
 
Webinar future dataintegration-datamesh-and-goldengatekafka
Webinar future dataintegration-datamesh-and-goldengatekafkaWebinar future dataintegration-datamesh-and-goldengatekafka
Webinar future dataintegration-datamesh-and-goldengatekafka
 
Big data an elephant business opportunities
Big data an elephant   business opportunitiesBig data an elephant   business opportunities
Big data an elephant business opportunities
 
Data warehouse concepts
Data warehouse conceptsData warehouse concepts
Data warehouse concepts
 
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
 
Fbdl enabling comprehensive_data_services
Fbdl enabling comprehensive_data_servicesFbdl enabling comprehensive_data_services
Fbdl enabling comprehensive_data_services
 
Effective Multi-stream Joining in Apache Samza Framework
Effective Multi-stream Joining in Apache Samza FrameworkEffective Multi-stream Joining in Apache Samza Framework
Effective Multi-stream Joining in Apache Samza Framework
 
Idc analyst report a new breed of servers for digital transformation
Idc analyst report a new breed of servers for digital transformationIdc analyst report a new breed of servers for digital transformation
Idc analyst report a new breed of servers for digital transformation
 
Hd insight overview
Hd insight overviewHd insight overview
Hd insight overview
 
Sybase IQ ile Analitik Platform
Sybase IQ ile Analitik PlatformSybase IQ ile Analitik Platform
Sybase IQ ile Analitik Platform
 

Recently uploaded

Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...shivangimorya083
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightDelhi Call girls
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...amitlee9823
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 

Recently uploaded (20)

Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 

Event Driven Architecture

  • 1. 1 © OCTO 2015© OCTO 2015 Event Driven Architecture bluckbluck
  • 2. 2 © OCTO 2015 The first problem was how to transport data between systems The second part of this problem was the need to do richer analytical data processing with very low latency.
  • 3. 3 © OCTO 2015 The pipeline for log data was scalable but lossy and could only deliver data with high latency. The pipeline between Oracle instances was fast, exact, and real-time, but not available to any other systems.
  • 4. 4 © OCTO 2015 The pipeline of Oracle data for Hadoop was periodic CSV dumps— high throughput, but batch. The pipeline of data to our search system was low latency, but unscalable and tied directly to the database. The messaging systems were low latency but unreliable and unscalable.
  • 6. 6 © OCTO 2015 We added data centers geographically distributed around the world we had to build out geographical replication for each of these data flows he data was always unreliable. Our reports were untrustworthy, derived indexes and stores were questionable, and everyone spent a lot of time battling data quality issues of all kinds At the same time we weren't just shipping data from place to place; we also wanted to do things with it Hadoop had given us a platform for batch processing, data archival, and ad hoc processing, and this had been enormously successful, but we lacked an analogous platform for low-latency processing.
  • 7. 7 © OCTO 2015 Stream Data Plateform
  • 8. 8 © OCTO 2015 Stream Data Plateform
  • 9. 9 © OCTO 2015 Your database stores the current state of your data. But the current state is always caused by some actions that took place in the past. The actions are the events. Much of what people refer to when they talk about "big data" is really the act of capturing these events that previously weren't recorded anywhere and putting them to use for analysis, optimization, and decision making Event streams are an obvious fit for log data or things like "orders", "sales", "clicks" or "trades" that are obviously event-like. The Rise of Events and Event Streams
  • 10. 10 © OCTO 2015 data in databases can also be thought of as an event stream. The process of creating a backup or standby copy of a database : to dump out the contents to take a "diff" of what has changed Change capture : If we take our diffs more and more frequently what we will be left with is a continuous sequence of single row changes. By publishing the database changes into the stream data platform you add this to the other set of event streams. You can use these streams to synchronize other systems like Hadoop cluster, a replica database, or a search index, or you can feed these changes into applications or stream processors to directly compute new things off the changes. Databases Are Event Streams
  • 11. 11 © OCTO 2015 A stream data platform has two primary uses: Data Integration: The stream data platform captures streams of events or data changes and feeds these to other data systems such as relational databases, key-value stores, Hadoop, or the data warehouse. Stream processing: It enables continuous, real-time processing and transformation of these streams and makes the results available system-wide. The stream data platform is a central hub for data streams. t also acts as a buffer between these systems—the publisher of data doesn't need to be concerned with the various systems that will eventually consume and load the data. This means consumers of data can come and go and are fully decoupled from the source. What Is a Stream Data Platform For?
  • 12. 12 © OCTO 2015 Hadoop wants to be able to maintain a full copy of all the data in your organization and act as a "data lake" or "enterprise data hub". Directly integrating each data source with HDFS is a hugely time consuming proposition the end result only makes that data available to Hadoop. This type of data capture isn't suitable for real-time processing or syncing other real-time applications. This same pipeline can run in reverse: Hadoop and the data warehouse environment can publish out results that need to flow into appropriate systems for serving in customer-facing applications. What Is a Stream Data Platform For? Zoom Hadoop
  • 13. 13 © OCTO 2015 The stream processing use case plays off the data integration use case. The results of the stream processing are just a new, derived stream. Stream processing acts as both a way to develop applications that need low-latency transformations but it is also directly part of the data integration usage as well: integrating systems often requires some munging of data streams in between. What Is a Stream Data Platform For? Zoom ETL
  • 14. 14 © OCTO 2015 A stream data platform is similar to an enterprise messaging system—it receives messages and distributes them to interested subscribers. There are three important differences: Messaging systems are typically run in one-off deployments for different applications. The purpose of the stream data platform is very much as a central data hub. Messaging systems do a poor job of supporting integration with batch systems, such as a data warehouse or a Hadoop cluster, as they have limited data storage capacity. Messaging systems do not provide semantics that are easily compatible with rich stream processing. How Does a Stream Data Platform Relate To Existing Things
  • 15. 15 © OCTO 2015 In other words a data stream data platform is a messaging system whose role has been rethought at a company-wide scale. How Does a Stream Data Platform Relate To Existing Things
  • 16. 16 © OCTO 2015 A stream data platform is a true platform that any other system can choose to tap into and many applications can build around. by making data available in a uniform format in a single place with a common stream abstraction, many of the routine data clean-up tasks can be avoided entirely. Data Integration Tools
  • 17. 17 © OCTO 2015 The advantage of a stream data platform is that transformation is fundamentally decoupled from the stream itself. This code can live in applications or stream processing tasks, allowing teams to iterate at their own pace without a central bottleneck for application development. Enterprise Service Buses
  • 18. 18 © OCTO 2015 Databases have long had similar log mechanisms such as Golden Gate. However these mechanisms are limited to database changes only and are not a general purpose event capture platform. Change Capture Systems
  • 19. 19 © OCTO 2015 A stream data platform doesn't replace your data warehouse; in fact, quite the opposite: it feeds it data. Data Warehouses and Hadoop
  • 20. 20 © OCTO 2015 They attempt to add richer processing semantics to subscribers and can make implementing data transformation easier. Stream Processing Systems
  • 21. 21 © OCTO 2015 everything from user activity to database changes to administrative actions like restarting a process are captured in real-time streams that are subscribed to and processed in real-time. What Does This Look Like In Practice?
  • 22. 22 © OCTO 2015 part of the promise of this approach to data management is having a central repository with the full set of data streams your organization generates. This works best when data is all in the same place. simplifying system architecture. fewer integration points for data consumers, fewer things to operate, lower incremental cost for adding new applications, makes it easier to reason about data flow. But, there are several reasons to end up with multiple clusters To keep activity local to a datacenter For security reasons For SLA control. Rcommendations : Limit The Number of Clusters
  • 23. 23 © OCTO 2015 Apache Kafka does not enforce any particular format If each individual or application chooses a representation of their own preference—say some use JSON, others XML, and others CSV—the result is that any system or process which uses multiple data streams has to munge and understand each of these. Local optimization—choosing your favorite format for data you produce—leads to huge global sub-optimization since now each system needs to write N adaptors, one for each format it wants to ingest. imagine how useless the Unix toolchain would be if each tool invented its own format: you would have to translate between formats every time you wanted to pipe one command to another. Rcommendations : Pick A Single Data Format
  • 24. 24 © OCTO 2015 Connecting all systems directly would look something like this Whereas having this central stream data platform looks something like this Rcommendations : Pick A Single Data Format
  • 25. 25 © OCTO 2015 We think Avro is the best choice for a number of reasons: 1. It has a direct mapping to and from JSON 2. It has a very compact format. The bulk of JSON, repeating every field name with every single record, is what makes JSON inefficient for high-volume usage. 3. It is very fast. 4. It has great bindings for a wide variety of programming languages so you can generate Java objects that make working with event data easier, but it does not require code generation so tools can be written generically for any data stream. 5. It has a rich, extensible schema language defined in pure JSON 6. It has the best notion of compatibility for evolving your data over time. Rcommendations : Use Avro as Your Data Format
  • 26. 26 © OCTO 2015 Isn't the modern world of big data all about unstructured data, dumped in whatever form is convenient, and parsed later when it is queried? One of the primary advantages of this type of architecture where data is modeled as streams is that applications are decoupled. Applications produce a stream of events capturing what occurred without knowledge of which things subscribe to these streams. The Need For Schemas
  • 27. 27 © OCTO 2015 Whenever you see a common activity across multiple systems try to use a common schema for this activity. An example of this that is common to all businesses is application errors. Share Event Schemas
  • 28. 28 © OCTO 2015 MODELING SPECIFIC DATA TYPES IN KAFKA