Rediscovering the value of Apache Kafka® in modern data architectures
@riferrei | #kafkameetup | @CONFLUENTINC
About me
• Ricardo Ferreira
• Works for Confluent
• Developer advocate
• ricardo@confluent.io
• https://riferrei.net
Origins of Apache Kafka
“There were lots of databases and other systems built to store data, but what was missing in our architecture was something that would help us to handle continuous flows of data.” – Jay Kreps
First realization
Event: I changed my job from Oracle to Confluent
>
State: I work at Confluent
Events are both notification + state transfer
Event-driven application
Job change
Recommendation engine
Search engine
Email service
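A minimal sketch of the idea on these slides, under stated assumptions: `JobChanged`, `EventBus`, and the three handlers are hypothetical names chosen for illustration, and the bus is an in-memory stand-in rather than a Kafka client. The event both notifies subscribers that something happened and carries the new state, so each service can update itself without querying a shared database.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class JobChanged:
    """Hypothetical event: a notification (a job changed) plus state (the new employer)."""
    person: str
    previous_employer: str
    new_employer: str


class EventBus:
    """Toy in-memory stand-in for a topic; not a Kafka client."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[JobChanged], None]] = []

    def subscribe(self, handler: Callable[[JobChanged], None]) -> None:
        self._subscribers.append(handler)

    def publish(self, event: JobChanged) -> None:
        # Every subscriber receives the same event and uses the state it carries.
        for handler in self._subscribers:
            handler(event)


# Each service keeps its own derived state, updated only from events.
search_index: Dict[str, str] = {}
recommendations: Dict[str, str] = {}
outbox: List[str] = []

bus = EventBus()
bus.subscribe(lambda e: search_index.update({e.person: e.new_employer}))
bus.subscribe(lambda e: recommendations.update({e.person: f"people at {e.new_employer}"}))
bus.subscribe(lambda e: outbox.append(f"Congratulate {e.person} on joining {e.new_employer}"))

bus.publish(JobChanged("ricardo", "Oracle", "Confluent"))
```

After the single `publish` call, all three services hold their own view of the change, derived from the same event.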
Let’s implement this!
[Diagram labels: database, log, SQL (x3), recommendation engine, search engine, email service]
Second realization
database
1000x more volume
Non-transactional events
Transactional events
LOG
Databases, 30 years ago...
Developer
Databases, these days...
Databases are limited
Limited? Are you kidding me?
Are databases limited?
Yes, they are. Why do we have to move data from one DB to another just to do analytics?
Shared state = more DBs
Business line 1 Business line 2 Business line 3
Third realization
[Diagram labels: user tracking, historical data, operational metrics, NoSQL database, graph database, SQL database, microservices, ..., Hadoop, Elasticsearch, Grafana, machine learning, rec. engine, search, security, email, social graph]
“The truth is the log. The database is a cache of a subset of the log.” – Pat Helland, Immutability Changes Everything
http://cidrdb.org/cidr2015/Papers/CIDR15_Paper16.pdf
Log as first-class citizen
Database
Log
[Diagram: a log with offsets 0 1 2 3 4 5 6 7 8; writes go to the log; reads come from destination system A (time = 1) and destination system B (time = 3)]
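The log diagram can be sketched in a few lines of Python. This is an illustrative toy, not Kafka's implementation: `CommitLog`, `system_a`, and `system_b` are hypothetical names, and the log lives in memory. It shows the two properties the slide points at: writes only append, and each destination system reads from its own position without affecting the others.

```python
from typing import Any, Dict, List


class CommitLog:
    """Toy append-only log: records are never modified, only appended."""

    def __init__(self) -> None:
        self._records: List[Any] = []

    def append(self, record: Any) -> int:
        """Write a record at the end of the log and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset: int, max_records: int = 1) -> List[Any]:
        """Read up to max_records starting at offset; reading removes nothing."""
        return self._records[offset:offset + max_records]


log = CommitLog()
for n in range(9):  # offsets 0..8, as in the diagram
    log.append(f"event-{n}")

# Each destination tracks its own position in the log (hypothetical names).
positions: Dict[str, int] = {"system_a": 1, "system_b": 3}

next_for_a = log.read(positions["system_a"])
next_for_b = log.read(positions["system_b"])

# A destination added later can replay everything from offset 0.
replayed = log.read(0, max_records=9)
```

Because reads never consume records, a slow reader and a fast reader can share the same log, and a new system can rebuild its state by replaying from the start.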
Solution: build a commit log
[Diagram labels: commit log, plus the same systems listed under the third realization]
Do you THINK is a table
that you RUN QUERIES?
Streams and tables duality
Stream:
{"user":"riferrei","score":"1001"}
{"user":"riferrei","score":"1002"}
{"user":"riferrei","score":"1003"}
{"user":"riferrei","score":"1004"}
{"user":"riferrei","score":"1005"}
Table:
{"user":"riferrei","score":"1005"}
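The duality can be made concrete with a small fold, sketched here in plain Python rather than in Kafka Streams or ksqlDB. It uses five score events in the style of the slide; `to_table` is a hypothetical helper name. The stream keeps every event in order, while the table keeps only the latest value per key.

```python
import json
from typing import Dict, Iterable

# Five score events for one user, kept in arrival order.
stream = [
    '{"user":"riferrei","score":"1001"}',
    '{"user":"riferrei","score":"1002"}',
    '{"user":"riferrei","score":"1003"}',
    '{"user":"riferrei","score":"1004"}',
    '{"user":"riferrei","score":"1005"}',
]


def to_table(events: Iterable[str]) -> Dict[str, int]:
    """Fold a stream into a table: the latest value per key wins."""
    table: Dict[str, int] = {}
    for raw in events:
        event = json.loads(raw)
        table[event["user"]] = int(event["score"])
    return table


table = to_table(stream)
```

Replaying a prefix of the stream gives the table as it looked at that point in time, which is why the stream can always rebuild the table but not the other way around.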
Origins of Apache Kafka
“We’ve come to think of Kafka as a streaming platform: a system that lets you publish and subscribe to streams of data, store them, and process them, and that is exactly what Apache Kafka is built to be.” – Jay Kreps
Origins of Apache Kafka
Databases: batch, expensive, time consuming
Messaging: difficult to scale, no persistence after consumption, no replay
Highly scalable, durable, persistent, ordered, fast (low latency)
Origins of Apache Kafka
Distributed commit log: highly scalable, durable, persistent, ordered, fast (low latency)
Origins of Apache Kafka
Distributed streaming platform: highly scalable, durable, persistent, ordered, fast (low latency), stream processing, continuous flows, scalable integration
@riferrei | @confluentinc | @itau
Origins of Apache Kafka
“The ability to combine these three areas – to bring all the streams of data together across all the use cases – is what makes the idea of a streaming platform so appealing to people.” – Jay Kreps
What is Apache Kafka?
01 Well done messaging
02 Durable storage
03 Stream processing
Time for some fun
1. Get the game
2. Name yourself
Source code
https://github.com/confluentinc/demo-scene
<<Pacman-ccloud>>
Source: USER_GAME topic
Creating USER_GAME stream
Querying USER_GAME stream
Creating STATS_PER_USER table
Querying STATS_PER_USER table
Low-latency pull queries
Source: USER_LOSSES topic
Creating USER_LOSSES stream
Querying USER_LOSSES stream
Creating LOSSES_PER_USER table
Querying LOSSES_PER_USER table
Creating SCOREBOARD table
Querying SCOREBOARD table
Complete scoreboard
USER_GAME
USER_LOSSES
STATS_PER_USER
LOSSES_PER_USER
SCOREBOARD
storage process storage process storage
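The storage and process chain above can be approximated in plain Python to show how the pieces relate. This is a sketch under assumptions, not the demo's actual ksqlDB statements: the sample events and field names (`user`, `score`, `level`) are hypothetical, and the functions below only mimic the two aggregations and the join that produce the scoreboard.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical sample events; field names are assumptions, not the demo's schema.
user_game: List[dict] = [
    {"user": "riferrei", "score": 1001, "level": 1},
    {"user": "alice", "score": 500, "level": 1},
    {"user": "riferrei", "score": 1005, "level": 2},
]
user_losses: List[dict] = [{"user": "riferrei"}, {"user": "riferrei"}]


def stats_per_user(events: List[dict]) -> Dict[str, dict]:
    """First process step: highest score and level seen per user."""
    table: Dict[str, dict] = {}
    for e in events:
        row = table.setdefault(e["user"], {"highest_score": 0, "highest_level": 0})
        row["highest_score"] = max(row["highest_score"], e["score"])
        row["highest_level"] = max(row["highest_level"], e["level"])
    return table


def losses_per_user(events: List[dict]) -> Dict[str, int]:
    """First process step: number of losses per user."""
    return dict(Counter(e["user"] for e in events))


def scoreboard(stats: Dict[str, dict], losses: Dict[str, int]) -> List[dict]:
    """Second process step: left join stats with losses, best score first."""
    rows = [
        {"user": user, **row, "total_losses": losses.get(user, 0)}
        for user, row in stats.items()
    ]
    return sorted(rows, key=lambda r: r["highest_score"], reverse=True)


board = scoreboard(stats_per_user(user_game), losses_per_user(user_losses))


def pull_query(user: str) -> dict:
    """Point lookup against the materialized result, in the spirit of a pull query."""
    return next(r for r in board if r["user"] == user)
```

In the real pipeline each intermediate result is itself stored in a topic and updated continuously; here the functions run once over fixed lists, which is enough to see the shape of the data at each stage.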
How can I learn more?
Get Kafka: Confluent Cloud
Try free: https://cnfl.io/confluent-cloud
Get examples: Kafka Tutorials
https://cnfl.io/tutorials
Get books: O’Reilly bundle
https://cnfl.io/books
Join Kafka Summit
https://kafka-summit.org/events/kafka-summit-austin-2020
https://myeventi.events/kafka20/aus
Use 25% discount code: KSL20Meetup
Thank you

Contenu connexe

Tendances

Fundamentals Big Data and AI Architecture
Fundamentals Big Data and AI ArchitectureFundamentals Big Data and AI Architecture
Fundamentals Big Data and AI ArchitectureGuido Schmutz
 
Kafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid CloudKafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid CloudKai Wähner
 
Streamsheets and Apache Kafka – Interactively build real-time Dashboards and ...
Streamsheets and Apache Kafka – Interactively build real-time Dashboards and ...Streamsheets and Apache Kafka – Interactively build real-time Dashboards and ...
Streamsheets and Apache Kafka – Interactively build real-time Dashboards and ...confluent
 
IIoT with Kafka and Machine Learning for Supply Chain Optimization In Real Ti...
IIoT with Kafka and Machine Learning for Supply Chain Optimization In Real Ti...IIoT with Kafka and Machine Learning for Supply Chain Optimization In Real Ti...
IIoT with Kafka and Machine Learning for Supply Chain Optimization In Real Ti...Kai Wähner
 
Concepts and Patterns for Streaming Services with Kafka
Concepts and Patterns for Streaming Services with KafkaConcepts and Patterns for Streaming Services with Kafka
Concepts and Patterns for Streaming Services with KafkaQAware GmbH
 
Shared time-series-analysis-using-an-event-streaming-platform -_v2
Shared   time-series-analysis-using-an-event-streaming-platform -_v2Shared   time-series-analysis-using-an-event-streaming-platform -_v2
Shared time-series-analysis-using-an-event-streaming-platform -_v2confluent
 
Real-time processing of large amounts of data
Real-time processing of large amounts of dataReal-time processing of large amounts of data
Real-time processing of large amounts of dataconfluent
 
Streaming Visualisation
Streaming VisualisationStreaming Visualisation
Streaming VisualisationGuido Schmutz
 
Kafka summit SF 2019 - the art of the event-streaming app
Kafka summit SF 2019 - the art of the event-streaming appKafka summit SF 2019 - the art of the event-streaming app
Kafka summit SF 2019 - the art of the event-streaming appNeil Avery
 
Confluent Cloud for Apache Kafka® | Google Cloud Next ’19
Confluent Cloud for Apache Kafka® | Google Cloud Next ’19Confluent Cloud for Apache Kafka® | Google Cloud Next ’19
Confluent Cloud for Apache Kafka® | Google Cloud Next ’19confluent
 
Time Series Analysis Using an Event Streaming Platform
 Time Series Analysis Using an Event Streaming Platform Time Series Analysis Using an Event Streaming Platform
Time Series Analysis Using an Event Streaming PlatformDr. Mirko Kämpf
 
Bridge Your Kafka Streams to Azure Webinar
Bridge Your Kafka Streams to Azure WebinarBridge Your Kafka Streams to Azure Webinar
Bridge Your Kafka Streams to Azure Webinarconfluent
 
JUG Tirana - Introduction to data streaming
JUG Tirana - Introduction to data streamingJUG Tirana - Introduction to data streaming
JUG Tirana - Introduction to data streamingNicolas Fränkel
 
Fast Data – Fast Cars: Wie Apache Kafka die Datenwelt revolutioniert
Fast Data – Fast Cars: Wie Apache Kafka die Datenwelt revolutioniertFast Data – Fast Cars: Wie Apache Kafka die Datenwelt revolutioniert
Fast Data – Fast Cars: Wie Apache Kafka die Datenwelt revolutioniertconfluent
 
EDA Meets Data Engineering – What's the Big Deal?
EDA Meets Data Engineering – What's the Big Deal?EDA Meets Data Engineering – What's the Big Deal?
EDA Meets Data Engineering – What's the Big Deal?confluent
 
Fast data for fitness 10 nov 2020
Fast data for fitness 10 nov 2020Fast data for fitness 10 nov 2020
Fast data for fitness 10 nov 2020Timothy Spann
 
Event Driven Architecture: Mistakes, I've made a few...
Event Driven Architecture: Mistakes, I've made a few...Event Driven Architecture: Mistakes, I've made a few...
Event Driven Architecture: Mistakes, I've made a few...confluent
 
Real time analytics in Azure IoT
Real time analytics in Azure IoT Real time analytics in Azure IoT
Real time analytics in Azure IoT Sam Vanhoutte
 
Technical Deep Dive: Using Apache Kafka to Optimize Real-Time Analytics in Fi...
Technical Deep Dive: Using Apache Kafka to Optimize Real-Time Analytics in Fi...Technical Deep Dive: Using Apache Kafka to Optimize Real-Time Analytics in Fi...
Technical Deep Dive: Using Apache Kafka to Optimize Real-Time Analytics in Fi...confluent
 
Apache Kafka for Smart Grid, Utilities and Energy Production
Apache Kafka for Smart Grid, Utilities and Energy ProductionApache Kafka for Smart Grid, Utilities and Energy Production
Apache Kafka for Smart Grid, Utilities and Energy ProductionKai Wähner
 

Tendances (20)

Fundamentals Big Data and AI Architecture
Fundamentals Big Data and AI ArchitectureFundamentals Big Data and AI Architecture
Fundamentals Big Data and AI Architecture
 
Kafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid CloudKafka for Real-Time Replication between Edge and Hybrid Cloud
Kafka for Real-Time Replication between Edge and Hybrid Cloud
 
Streamsheets and Apache Kafka – Interactively build real-time Dashboards and ...
Streamsheets and Apache Kafka – Interactively build real-time Dashboards and ...Streamsheets and Apache Kafka – Interactively build real-time Dashboards and ...
Streamsheets and Apache Kafka – Interactively build real-time Dashboards and ...
 
IIoT with Kafka and Machine Learning for Supply Chain Optimization In Real Ti...
IIoT with Kafka and Machine Learning for Supply Chain Optimization In Real Ti...IIoT with Kafka and Machine Learning for Supply Chain Optimization In Real Ti...
IIoT with Kafka and Machine Learning for Supply Chain Optimization In Real Ti...
 
Concepts and Patterns for Streaming Services with Kafka
Concepts and Patterns for Streaming Services with KafkaConcepts and Patterns for Streaming Services with Kafka
Concepts and Patterns for Streaming Services with Kafka
 
Shared time-series-analysis-using-an-event-streaming-platform -_v2
Shared   time-series-analysis-using-an-event-streaming-platform -_v2Shared   time-series-analysis-using-an-event-streaming-platform -_v2
Shared time-series-analysis-using-an-event-streaming-platform -_v2
 
Real-time processing of large amounts of data
Real-time processing of large amounts of dataReal-time processing of large amounts of data
Real-time processing of large amounts of data
 
Streaming Visualisation
Streaming VisualisationStreaming Visualisation
Streaming Visualisation
 
Kafka summit SF 2019 - the art of the event-streaming app
Kafka summit SF 2019 - the art of the event-streaming appKafka summit SF 2019 - the art of the event-streaming app
Kafka summit SF 2019 - the art of the event-streaming app
 
Confluent Cloud for Apache Kafka® | Google Cloud Next ’19
Confluent Cloud for Apache Kafka® | Google Cloud Next ’19Confluent Cloud for Apache Kafka® | Google Cloud Next ’19
Confluent Cloud for Apache Kafka® | Google Cloud Next ’19
 
Time Series Analysis Using an Event Streaming Platform
 Time Series Analysis Using an Event Streaming Platform Time Series Analysis Using an Event Streaming Platform
Time Series Analysis Using an Event Streaming Platform
 
Bridge Your Kafka Streams to Azure Webinar
Bridge Your Kafka Streams to Azure WebinarBridge Your Kafka Streams to Azure Webinar
Bridge Your Kafka Streams to Azure Webinar
 
JUG Tirana - Introduction to data streaming
JUG Tirana - Introduction to data streamingJUG Tirana - Introduction to data streaming
JUG Tirana - Introduction to data streaming
 
Fast Data – Fast Cars: Wie Apache Kafka die Datenwelt revolutioniert
Fast Data – Fast Cars: Wie Apache Kafka die Datenwelt revolutioniertFast Data – Fast Cars: Wie Apache Kafka die Datenwelt revolutioniert
Fast Data – Fast Cars: Wie Apache Kafka die Datenwelt revolutioniert
 
EDA Meets Data Engineering – What's the Big Deal?
EDA Meets Data Engineering – What's the Big Deal?EDA Meets Data Engineering – What's the Big Deal?
EDA Meets Data Engineering – What's the Big Deal?
 
Fast data for fitness 10 nov 2020
Fast data for fitness 10 nov 2020Fast data for fitness 10 nov 2020
Fast data for fitness 10 nov 2020
 
Event Driven Architecture: Mistakes, I've made a few...
Event Driven Architecture: Mistakes, I've made a few...Event Driven Architecture: Mistakes, I've made a few...
Event Driven Architecture: Mistakes, I've made a few...
 
Real time analytics in Azure IoT
Real time analytics in Azure IoT Real time analytics in Azure IoT
Real time analytics in Azure IoT
 
Technical Deep Dive: Using Apache Kafka to Optimize Real-Time Analytics in Fi...
Technical Deep Dive: Using Apache Kafka to Optimize Real-Time Analytics in Fi...Technical Deep Dive: Using Apache Kafka to Optimize Real-Time Analytics in Fi...
Technical Deep Dive: Using Apache Kafka to Optimize Real-Time Analytics in Fi...
 
Apache Kafka for Smart Grid, Utilities and Energy Production
Apache Kafka for Smart Grid, Utilities and Energy ProductionApache Kafka for Smart Grid, Utilities and Energy Production
Apache Kafka for Smart Grid, Utilities and Energy Production
 

Similaire à Rediscovering the Value of Apache Kafka® in Modern Data Architecture

Learning How to Build Event Streaming Applications with Pac-Man
Learning How to Build Event Streaming Applications with Pac-ManLearning How to Build Event Streaming Applications with Pac-Man
Learning How to Build Event Streaming Applications with Pac-Manconfluent
 
Anatomy of a data driven architecture - Tamir Dresher
Anatomy of a data driven architecture - Tamir Dresher   Anatomy of a data driven architecture - Tamir Dresher
Anatomy of a data driven architecture - Tamir Dresher Tamir Dresher
 
Watching Pigs Fly with the Netflix Hadoop Toolkit (Hadoop Summit 2013)
Watching Pigs Fly with the Netflix Hadoop Toolkit (Hadoop Summit 2013)Watching Pigs Fly with the Netflix Hadoop Toolkit (Hadoop Summit 2013)
Watching Pigs Fly with the Netflix Hadoop Toolkit (Hadoop Summit 2013)Jeff Magnusson
 
What is Apache Kafka®?
What is Apache Kafka®?What is Apache Kafka®?
What is Apache Kafka®?confluent
 
Putting Lipstick on Apache Pig at Netflix
Putting Lipstick on Apache Pig at NetflixPutting Lipstick on Apache Pig at Netflix
Putting Lipstick on Apache Pig at NetflixJeff Magnusson
 
Netflix - Pig with Lipstick by Jeff Magnusson
Netflix - Pig with Lipstick by Jeff Magnusson Netflix - Pig with Lipstick by Jeff Magnusson
Netflix - Pig with Lipstick by Jeff Magnusson Hakka Labs
 
Azure tales: a real world CQRS and ES Deep Dive - Andrea Saltarello
Azure tales: a real world CQRS and ES Deep Dive - Andrea SaltarelloAzure tales: a real world CQRS and ES Deep Dive - Andrea Saltarello
Azure tales: a real world CQRS and ES Deep Dive - Andrea SaltarelloITCamp
 
Parallel-Ready Java Code: Managing Mutation in an Imperative Language
Parallel-Ready Java Code: Managing Mutation in an Imperative LanguageParallel-Ready Java Code: Managing Mutation in an Imperative Language
Parallel-Ready Java Code: Managing Mutation in an Imperative LanguageMaurice Naftalin
 
From Developer to Data Scientist - Gaines Kergosien
From Developer to Data Scientist - Gaines KergosienFrom Developer to Data Scientist - Gaines Kergosien
From Developer to Data Scientist - Gaines KergosienITCamp
 
Etl is Dead; Long Live Streams
Etl is Dead; Long Live StreamsEtl is Dead; Long Live Streams
Etl is Dead; Long Live Streamsconfluent
 
Building scalable data with kafka and spark
Building scalable data with kafka and sparkBuilding scalable data with kafka and spark
Building scalable data with kafka and sparkbabatunde ekemode
 
First in Class: Optimizing the Data Lake for Tighter Integration
First in Class: Optimizing the Data Lake for Tighter IntegrationFirst in Class: Optimizing the Data Lake for Tighter Integration
First in Class: Optimizing the Data Lake for Tighter IntegrationInside Analysis
 
ETL Is Dead, Long-live Streams
ETL Is Dead, Long-live StreamsETL Is Dead, Long-live Streams
ETL Is Dead, Long-live StreamsC4Media
 
Data-Driven Transformation: Leveraging Big Data at Showtime with Apache Spark
Data-Driven Transformation: Leveraging Big Data at Showtime with Apache SparkData-Driven Transformation: Leveraging Big Data at Showtime with Apache Spark
Data-Driven Transformation: Leveraging Big Data at Showtime with Apache SparkDatabricks
 
Tiny Batches, in the wine: Shiny New Bits in Spark Streaming
Tiny Batches, in the wine: Shiny New Bits in Spark StreamingTiny Batches, in the wine: Shiny New Bits in Spark Streaming
Tiny Batches, in the wine: Shiny New Bits in Spark StreamingPaco Nathan
 
OSDC 2018 - Distributed monitoring
OSDC 2018 - Distributed monitoringOSDC 2018 - Distributed monitoring
OSDC 2018 - Distributed monitoringGianluca Arbezzano
 
OSDC 2018 | Distributed Monitoring by Gianluca Arbezzano
OSDC 2018 | Distributed Monitoring by Gianluca ArbezzanoOSDC 2018 | Distributed Monitoring by Gianluca Arbezzano
OSDC 2018 | Distributed Monitoring by Gianluca ArbezzanoNETWAYS
 
Hadoop application architectures - using Customer 360 as an example
Hadoop application architectures - using Customer 360 as an exampleHadoop application architectures - using Customer 360 as an example
Hadoop application architectures - using Customer 360 as an examplehadooparchbook
 

Similaire à Rediscovering the Value of Apache Kafka® in Modern Data Architecture (20)

Learning How to Build Event Streaming Applications with Pac-Man
Learning How to Build Event Streaming Applications with Pac-ManLearning How to Build Event Streaming Applications with Pac-Man
Learning How to Build Event Streaming Applications with Pac-Man
 
Anatomy of a data driven architecture - Tamir Dresher
Anatomy of a data driven architecture - Tamir Dresher   Anatomy of a data driven architecture - Tamir Dresher
Anatomy of a data driven architecture - Tamir Dresher
 
Watching Pigs Fly with the Netflix Hadoop Toolkit (Hadoop Summit 2013)
Watching Pigs Fly with the Netflix Hadoop Toolkit (Hadoop Summit 2013)Watching Pigs Fly with the Netflix Hadoop Toolkit (Hadoop Summit 2013)
Watching Pigs Fly with the Netflix Hadoop Toolkit (Hadoop Summit 2013)
 
What is Apache Kafka®?
What is Apache Kafka®?What is Apache Kafka®?
What is Apache Kafka®?
 
Lipstick On Pig
Lipstick On Pig Lipstick On Pig
Lipstick On Pig
 
Putting Lipstick on Apache Pig at Netflix
Putting Lipstick on Apache Pig at NetflixPutting Lipstick on Apache Pig at Netflix
Putting Lipstick on Apache Pig at Netflix
 
Netflix - Pig with Lipstick by Jeff Magnusson
Netflix - Pig with Lipstick by Jeff Magnusson Netflix - Pig with Lipstick by Jeff Magnusson
Netflix - Pig with Lipstick by Jeff Magnusson
 
Azure tales: a real world CQRS and ES Deep Dive - Andrea Saltarello
Azure tales: a real world CQRS and ES Deep Dive - Andrea SaltarelloAzure tales: a real world CQRS and ES Deep Dive - Andrea Saltarello
Azure tales: a real world CQRS and ES Deep Dive - Andrea Saltarello
 
Parallel-Ready Java Code: Managing Mutation in an Imperative Language
Parallel-Ready Java Code: Managing Mutation in an Imperative LanguageParallel-Ready Java Code: Managing Mutation in an Imperative Language
Parallel-Ready Java Code: Managing Mutation in an Imperative Language
 
From Developer to Data Scientist - Gaines Kergosien
From Developer to Data Scientist - Gaines KergosienFrom Developer to Data Scientist - Gaines Kergosien
From Developer to Data Scientist - Gaines Kergosien
 
Etl is Dead; Long Live Streams
Etl is Dead; Long Live StreamsEtl is Dead; Long Live Streams
Etl is Dead; Long Live Streams
 
Building scalable data with kafka and spark
Building scalable data with kafka and sparkBuilding scalable data with kafka and spark
Building scalable data with kafka and spark
 
First in Class: Optimizing the Data Lake for Tighter Integration
First in Class: Optimizing the Data Lake for Tighter IntegrationFirst in Class: Optimizing the Data Lake for Tighter Integration
First in Class: Optimizing the Data Lake for Tighter Integration
 
Dev Ops Training
Dev Ops TrainingDev Ops Training
Dev Ops Training
 
ETL Is Dead, Long-live Streams
ETL Is Dead, Long-live StreamsETL Is Dead, Long-live Streams
ETL Is Dead, Long-live Streams
 
Data-Driven Transformation: Leveraging Big Data at Showtime with Apache Spark
Data-Driven Transformation: Leveraging Big Data at Showtime with Apache SparkData-Driven Transformation: Leveraging Big Data at Showtime with Apache Spark
Data-Driven Transformation: Leveraging Big Data at Showtime with Apache Spark
 
Tiny Batches, in the wine: Shiny New Bits in Spark Streaming
Tiny Batches, in the wine: Shiny New Bits in Spark StreamingTiny Batches, in the wine: Shiny New Bits in Spark Streaming
Tiny Batches, in the wine: Shiny New Bits in Spark Streaming
 
OSDC 2018 - Distributed monitoring
OSDC 2018 - Distributed monitoringOSDC 2018 - Distributed monitoring
OSDC 2018 - Distributed monitoring
 
OSDC 2018 | Distributed Monitoring by Gianluca Arbezzano
OSDC 2018 | Distributed Monitoring by Gianluca ArbezzanoOSDC 2018 | Distributed Monitoring by Gianluca Arbezzano
OSDC 2018 | Distributed Monitoring by Gianluca Arbezzano
 
Hadoop application architectures - using Customer 360 as an example
Hadoop application architectures - using Customer 360 as an exampleHadoop application architectures - using Customer 360 as an example
Hadoop application architectures - using Customer 360 as an example
 

Plus de confluent

Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...confluent
 
Santander Stream Processing with Apache Flink
Santander Stream Processing with Apache FlinkSantander Stream Processing with Apache Flink
Santander Stream Processing with Apache Flinkconfluent
 
Unlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insightsUnlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insightsconfluent
 
Workshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con FlinkWorkshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con Flinkconfluent
 
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...confluent
 
AWS Immersion Day Mapfre - Confluent
AWS Immersion Day Mapfre   -   ConfluentAWS Immersion Day Mapfre   -   Confluent
AWS Immersion Day Mapfre - Confluentconfluent
 
Eventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalkEventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalkconfluent
 
Q&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent CloudQ&A with Confluent Experts: Navigating Networking in Confluent Cloud
Q&A with Confluent Experts: Navigating Networking in Confluent Cloudconfluent
 
Citi TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep DiveCiti TechTalk Session 2: Kafka Deep Dive
Citi TechTalk Session 2: Kafka Deep Diveconfluent
 
Build real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with ConfluentBuild real-time streaming data pipelines to AWS with Confluent
Build real-time streaming data pipelines to AWS with Confluentconfluent
 
Q&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service MeshQ&A with Confluent Professional Services: Confluent Service Mesh
Q&A with Confluent Professional Services: Confluent Service Meshconfluent
 
Citi Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka MicroservicesCiti Tech Talk: Event Driven Kafka Microservices
Citi Tech Talk: Event Driven Kafka Microservicesconfluent
 
Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3Confluent & GSI Webinars series - Session 3
Confluent & GSI Webinars series - Session 3confluent
 
Citi Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging ModernizationCiti Tech Talk: Messaging Modernization
Citi Tech Talk: Messaging Modernizationconfluent
 
Citi Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time dataCiti Tech Talk: Data Governance for streaming and real time data
Citi Tech Talk: Data Governance for streaming and real time dataconfluent
 
Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2Confluent & GSI Webinars series: Session 2
Confluent & GSI Webinars series: Session 2confluent
 
Data In Motion Paris 2023
Data In Motion Paris 2023Data In Motion Paris 2023
Data In Motion Paris 2023confluent
 
Confluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with SynthesisConfluent Partner Tech Talk with Synthesis
Confluent Partner Tech Talk with Synthesisconfluent
 
The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023The Future of Application Development - API Days - Melbourne 2023
The Future of Application Development - API Days - Melbourne 2023confluent
 
The Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data StreamsThe Playful Bond Between REST And Data Streams
The Playful Bond Between REST And Data Streamsconfluent
 

Plus de confluent (20)

Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
Santander Stream Processing with Apache Flink
Santander Stream Processing with Apache FlinkSantander Stream Processing with Apache Flink
Santander Stream Processing with Apache Flink
 
Unlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insightsUnlocking the Power of IoT: A comprehensive approach to real-time insights
Unlocking the Power of IoT: A comprehensive approach to real-time insights
 
Workshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con FlinkWorkshop híbrido: Stream Processing con Flink
Workshop híbrido: Stream Processing con Flink
 
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
Industry 4.0: Building the Unified Namespace with Confluent, HiveMQ and Spark...
 
AWS Immersion Day Mapfre - Confluent
AWS Immersion Day Mapfre   -   ConfluentAWS Immersion Day Mapfre   -   Confluent
AWS Immersion Day Mapfre - Confluent
 
Eventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalkEventos y Microservicios - Santander TechTalk
Eventos y Microservicios - Santander TechTalk
 
Rediscovering the Value of Apache Kafka® in Modern Data Architecture

  • 1. Rediscovering the Value of Apache Kafka® in Modern Data Architectures @riferrei | #kafkameetup | @CONFLUENTINC
  • 2. About me: Ricardo Ferreira; works for Confluent; Developer Advocate; ricardo@confluent.io; https://riferrei.net
  • 3.
  • 4.
  • 5. Origins of Apache Kafka: “There were lots of databases and other systems built to store data, but what was missing in our architecture was something that would help us to handle continuous flows of data.” – Jay Kreps
  • 6.
  • 7. First realization: event > state. Event: “I changed my job from Oracle to Confluent.” State: “I work at Confluent.”
  • 8. Events are both notification + state transfer
  • 9. Event-driven application. Job change: recommendation engine, search engine, email service
  • 10. Let’s implement this! Database (log); recommendation engine, search engine, and email service, each using SQL
  • 11. Second realization. Database, log; transactional events; non-transactional events; 1000x more volume
  • 14. Databases are limited
  • 16. Are databases limited? Yes, they are. Why do we have to move data from one DB to another just to do analytics?
  • 17. Shared state = more DBs. Business line 1, business line 2, business line 3
  • 18. Third realization: user tracking, historical data, operational metrics, NoSQL database, graph database, SQL database, microservices, ..., Hadoop, Elasticsearch, Grafana, machine learning, rec. engine, search, security, email, social graph
  • 19. “The truth is the log. The database is a cache of a subset of the log.” – Pat Helland, Immutability Changes Everything, http://cidrdb.org/cidr2015/Papers/CIDR15_Paper16.pdf
  • 20. Log as first-class citizen. Database, log. Log offsets 0 to 8, with writes and reads; Destination System A (time = 1), Destination System B (time = 3)
  • 21. Solution: build a commit log. Commit log with: user tracking, historical data, operational metrics, NoSQL database, graph database, SQL database, microservices, ..., Hadoop, Elasticsearch, Grafana, machine learning, rec. engine, search, security, email, social graph
  • 23. Streams and tables duality. Stream: {"user":"riferrei","score":"1001"}, {"user":"riferrei","score":"1002"}, {"user":"riferrei","score":"1003"}, {"user":"riferrei","score":"1004"}, {"user":"riferrei","score":"1005"}. Table: {"user":"riferrei","score":"1005"}
  • 24. Origins of Apache Kafka: “We’ve come to think of Kafka as a streaming platform: a system that lets you publish and subscribe to streams of data, store them, and process them, and that is exactly what Apache Kafka is built to be.” – Jay Kreps
  • 25. Origins of Apache Kafka. Databases: batch, expensive, time consuming. Messaging: difficult to scale, no persistence after consumption, no replay. Highly scalable, durable, persistent, ordered, fast (low latency)
  • 26. Origins of Apache Kafka. Databases: batch, expensive, time consuming. Messaging: difficult to scale, no persistence after consumption, no replay. Distributed commit log: highly scalable, durable, persistent, ordered, fast (low latency)
  • 27. Origins of Apache Kafka. Databases: batch, expensive, time consuming. Messaging: difficult to scale, no persistence after consumption, no replay. Distributed streaming platform: highly scalable, durable, persistent, ordered, fast (low latency); stream processing, continuous flows, scalable integration
  • 29. Origins of Apache Kafka: “The ability to combine these three areas – to bring all the streams of data together across all the use cases – is what makes the idea of a streaming platform so appealing to people.” – Jay Kreps
  • 31. Time for some fun: 1. Get the game 2. Name yourself
  • 32. Source code: https://github.com/confluentinc/demo-scene (Pacman-ccloud)
  • 33. Source: USER_GAME topic
  • 34. Creating USER_GAME stream
  • 35. Querying USER_GAME stream
  • 36. Creating STATS_PER_USER table
  • 37. Querying STATS_PER_USER table
  • 38. Low-latency pull queries
  • 39. Source: USER_LOSSES topic
  • 40. Creating USER_LOSSES stream
  • 41. Querying USER_LOSSES stream
  • 42. Creating LOSSES_PER_USER table
  • 43. Querying LOSSES_PER_USER table
  • 44. Creating SCOREBOARD table
  • 45. Querying SCOREBOARD table
  • 46. Complete scoreboard: USER_GAME, USER_LOSSES (storage) > process > STATS_PER_USER, LOSSES_PER_USER (storage) > process > SCOREBOARD (storage)
  • 47. How can I learn more?
  • 48. Get Kafka: Confluent Cloud. Try free: https://cnfl.io/confluent-cloud
  • 49. Get examples: Kafka Tutorials, https://cnfl.io/tutorials
  • 50. Get books: O’Reilly bundle, https://cnfl.io/books
  • 51. Join Kafka Summit: https://kafka-summit.org/events/kafka-summit-austin-2020, https://myeventi.events/kafka20/aus. Use 25% discount code: KSL20Meetup
  • 52. Thank you
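
The ideas on slides 20 and 23 (a log read by several destination systems at their own positions, and a table derived from a stream) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not Kafka's API or the code used in the demo: the `Log` class and `to_table` function are hypothetical names, and the slide's "time = 1" and "time = 3" are read here as log offsets.

```python
from dataclasses import dataclass, field


@dataclass
class Log:
    """Hypothetical append-only log: records are added at the end and never changed."""
    records: list = field(default_factory=list)

    def append(self, record: dict) -> int:
        """Append a record and return its offset (its position in the log)."""
        self.records.append(record)
        return len(self.records) - 1

    def read(self, offset: int, max_records: int = 100) -> list:
        """Return up to max_records records starting at the given offset."""
        return self.records[offset:offset + max_records]


def to_table(stream: list, key: str) -> dict:
    """Fold a stream of events into a table that keeps the latest event per key."""
    table = {}
    for event in stream:
        table[event[key]] = event  # a later event overwrites the earlier one
    return table


# Writes: the five score events shown on slide 23.
log = Log()
for score in ("1001", "1002", "1003", "1004", "1005"):
    log.append({"user": "riferrei", "score": score})

# Reads: each destination system keeps its own position in the same log
# (assumption: the slide's "time" is treated as an offset).
offset_a = 1
offset_b = 3
pending_a = log.read(offset_a)  # records at offsets 1 to 4
pending_b = log.read(offset_b)  # records at offsets 3 and 4

# Replaying the whole stream from offset 0 yields the table.
table = to_table(log.read(0), key="user")
print(table)
```

Because readers only move their own offsets and never modify the log, a new destination system can be added later and replay the stream from offset 0 to build its own table.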