Now You See Me, Now You Compute: Building Event-Driven Architectures with Apache Kafka | Strata New York 2019

Talk URL: https://conferences.oreilly.com/strata/strata-ny/public/schedule/detail/77360

Abstract: Would you cross the street with traffic information that’s a minute old? Certainly not. Modern businesses have the same needs, whether because of competitive pressure or because their customers have much higher expectations of how they want to interact with a product or service. At the heart of this movement are events: in today’s digital age, events are everywhere. Every digital action, from online purchases to ride-sharing requests to bank deposits, creates a set of events around transaction amount, transaction time, user location, account balance, and much more. The technology that allows businesses to read, write, store, and process these events in real time is the event-streaming platform, and tens of thousands of companies, including Netflix, Audi, PayPal, Airbnb, Uber, and Pinterest, have picked Apache Kafka as the de facto choice to implement event-driven architectures and reshape their industries.

Michael Noll explores why and how you can use Apache Kafka and its growing ecosystem to build event-driven architectures that are elastic, scalable, robust, and fault tolerant, whether on-premises, in the cloud, on bare-metal machines, or in Kubernetes with Docker containers. Specifically, you’ll look at Kafka as the storage and publish-and-subscribe layer; Kafka’s Connect framework for integrating external data systems such as MySQL, Elastic, or S3 with Kafka; and, as the compute layer, Kafka’s Streams API (Java and Scala) and KSQL (streaming SQL) for implementing event-driven applications and microservices that process the events flowing through Kafka in real time. Michael provides an overview of the most relevant functionality, both current and upcoming, and shares best practices and typical use cases so you can tie it all together for your own needs.

  1. Now You See Me, Now You Compute: Building Event-Driven Architectures with Apache Kafka®. Michael G. Noll, Technologist, Office of the CTO, Confluent. @miguno
  2. Event Streaming: Why?
  3. The world is changing.
  4. The New Business Reality. Past: Technology was a support function. Innovation required for growth. Running the business on yesterday’s data was “good enough”. Today: Technology is the business. Innovation required for survival. Yesterday’s data = failure. Modern, real-time data infrastructure is required.
  5. The Rise of Event Streaming: 60% of Fortune 100 companies using Apache Kafka.
  6. Taxis become Software. 2 min
  7. The world is changing (Transportation). Then: Hardware product; Up-front purchase; Opaque; No data. Now: Hardware, Software, and Global Internet Service; On-demand; Real-time visibility; Built on a foundation of data.
  8. Transportation
  9. This transformation is happening everywhere
  10. Banking
  11. Retail
  12. What enables this transformation?
  13. Cloud; Machine Learning; Mobile; Event Streaming. Rethink Decision Making; Rethink User Experience; Rethink Data; Rethink Data Centers.
  14. Do you see me? Or: Would you blindly cross the street with traffic information that is 5 minutes old?
  15. Transportation: ETA; Real-time sensor diagnostics; Driver-rider match. Banking: Fraud detection; Trading and risk systems; Mobile applications / customer experience. Retail: Real-time inventory; Real-time POS reporting; Personalization. Entertainment: Real-time recommendations; Personalized news feed; In-app purchases.
  16. This is a fundamental paradigm shift... Cloud: Infrastructure as code; Future of the datacenter. Event Streaming: Data as continuous stream of events; Future of data.
  17. Event Streaming: The Paradigm
  18. Two Problems in Application Infrastructure. What’s the state of the world? Solution: Databases. What’s happening in the world? Solution: Messaging, RPC, ETL, etc.
  19. ETL/Data Integration: Batch, Expensive, Time Consuming; but High Throughput, Durable, Persistent, Maintains Order. Messaging: Difficult to Scale, No Persistence, Data Loss, No Replay; but Fast (Low Latency).
  20. ETL/Data Integration (Stored records): Batch, Expensive, Time Consuming; but High Throughput, Durable, Persistent, Maintains Order. Messaging (Transient Messages): Difficult to Scale, No Persistence, Data Loss, No Replay; but Fast (Low Latency).
  21. (no text on this slide)
  22. ETL/Data Integration (Stored records): Batch, Expensive, Time Consuming; but High Throughput, Durable, Persistent, Maintains Order. Messaging (Transient Messages): Difficult to Scale, No Persistence, Data Loss, No Replay; but Fast (Low Latency). Event Streaming Paradigm: High Throughput, Durable, Persistent, Maintains Order, Fast (Low Latency), Replay.
  23. Event Streaming Paradigm: To rethink data as neither stored records nor transient messages, but instead as a continuously updating Stream of Events.
  24. An Event records the fact that something happened: A good was sold. An invoice was issued. A payment was made. A new customer registered.
  25. A Stream represents history as a sequence of Events.
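
The definitions on slides 24 and 25 can be made concrete with a small in-memory model. This is only a sketch under assumed names (`Event`, `Stream`, `append`, `replay` are illustrative and not part of any Kafka API): an event is an immutable record of a fact, and a stream is an append-only sequence that can be read again from any offset.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Event:
    """An immutable record of the fact that something happened."""
    kind: str       # hypothetical field, e.g. "payment_made"
    subject: str    # hypothetical field: what the event is about
    timestamp: int  # logical time, kept simple for the sketch


class Stream:
    """Append-only sequence of events; history is never rewritten."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, event: Event) -> int:
        """Add an event at the end and return its offset."""
        self._events.append(event)
        return len(self._events) - 1

    def replay(self, from_offset: int = 0) -> List[Event]:
        """Return a copy of the history starting at from_offset."""
        return list(self._events[from_offset:])


stream = Stream()
stream.append(Event("good_sold", "order-1", 1))
stream.append(Event("invoice_issued", "order-1", 2))
stream.append(Event("payment_made", "order-1", 3))
history = [e.kind for e in stream.replay()]
```

Because events are frozen and the stream only grows, any reader can rebuild the same history by replaying from offset 0.
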
  26. Events change the way we think. Monolithic Approach: a database; a variable; a singleton; an RPC. Event-First Approach: an event; a stream; a ‘data’ flow; a stream processor. Diagram labels: Orders Service, Payments Service, Customers Service, Orders Service, Order Validation Service, Tax Service, Email Notification Service, DB, request, response, event streams.
  27. An Event Streaming Platform gives you three key functionalities: Publish & Subscribe to Events; Store Events; Process & Analyze Events.
  28. (no text on this slide)
  29. Event Streaming Platform: Universal Event Pipeline. Data Stores, Logs, 3rd Party Apps, Custom Apps/Microservices. ✓ Real-time but also persistent ✓ Elastic, scalable, reliable ✓ High throughput, low latency ✓ All apps and systems can now speak to each other for a complete view of data
  30. Event Streaming Platform: Universal Event Pipeline. Data Stores, Logs, 3rd Party Apps, Custom Apps/Microservices. Event-Driven Apps, with Historical Context: Real-Time Inventory, Real-Time Fraud Detection, Real-Time Customer 360, Machine Learning Models, Real-Time Data Transformation, ... ✓ Real-time but also persistent ✓ Elastic, scalable, reliable ✓ High throughput, low latency ✓ All apps and systems can now speak to each other for a complete view of data
  31. Why Combine Real-time With Historical Context? Event-Driven App (Location Tracking): only real-time events; Messaging Queues and Event Streaming Platforms can do this. Contextual Event-Driven App (ETA): real-time combined with stored data; only Event Streaming Platforms can do this. Questions shown: Where is my driver? When will my driver get here? 2 min
  32. The Event Streaming Platform is the Central Nervous System for today’s enterprises
  33. Event Streaming Architectures: How to Build With Kafka
  34. Apache Kafka is a distributed event streaming platform: Publish & Subscribe to Events; Store Events; Process & Analyze Events.
  35. 01 Stream your data in real-time as Events. 02 Store your Event Streams. 03 Process & Analyze your Event Streams.
  36. 01 Stream your data in real-time as Events. From apps, microservices: Use a Kafka producer client from your favorite language … and many more. From/to other systems: Use Kafka Connect plus a Connector for your system … and many more.
  37. From apps, microservices: producer example. Diagram labels: Python App, network, write, … and more.
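
The slide's actual producer code did not survive in the transcript, so the following is only a sketch of what the write path of a Python app might look like. It deliberately avoids any real Kafka client API: the topic name `payments`, the event fields, and the `send` callable are all assumptions made for illustration, and `send` stands in for whatever producer client the app would wrap.

```python
import json
from typing import Callable, Dict, List, Tuple

# Hypothetical topic name, chosen for illustration only.
TOPIC = "payments"


def serialize(event: Dict[str, object]) -> Tuple[bytes, bytes]:
    """Turn an event into the (key, value) byte pair a producer would write."""
    key = str(event["user"]).encode("utf-8")  # events of one user share a key
    value = json.dumps(event, sort_keys=True).encode("utf-8")
    return key, value


def publish(event: Dict[str, object],
            send: Callable[[str, bytes, bytes], None]) -> None:
    """Serialize the event and hand it to the injected send function."""
    key, value = serialize(event)
    send(TOPIC, key, value)


# Stand-in for the network write: collect the records in memory.
sent: List[Tuple[str, bytes, bytes]] = []
publish({"user": "ab123", "amount": 42},
        lambda topic, key, value: sent.append((topic, key, value)))
```

Injecting `send` keeps the serialization logic testable without a running cluster; a real application would pass a function that forwards to its Kafka producer client.
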
  38. From/to other systems: Kafka Connect … and more. Tip: Great option to gradually move workloads to Kafka while keeping production running!
  39. Kafka Connect ● Deployed standalone (development) or as a distributed cluster (production) ● Elastic service that works on bare-metal, VMs, containers, Kubernetes, ... ● The individual ‘Connector’ determines delivery guarantees, e.g., exactly-once
  40. Single Message Transforms for real-time ETL. Ingress: modify an Event before storing ● Obfuscate sensitive information, e.g. PII ● Add origin of event for lineage tracking ● Remove unnecessary data fields ● … and more. Egress: modify an Event on its way out ● Route high-priority events to faster stores ● Direct events to different Elasticsearch indexes ● Cast data types to match destination ● … and more. Example: { user: ab123, gender: female, ip: 1.2.3.95 } becomes { user: ab123, ip: 1.2.3.XXX }
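
The ingress example on slide 40 (drop the `gender` field, mask the last octet of `ip`) can be sketched as a chain of small functions applied to one event at a time. This is plain Python for illustration, not the Kafka Connect SMT interface; the function names `drop_field`, `mask_ip`, and `apply_transforms` are assumptions.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Transform = Callable[[Record], Record]


def drop_field(event: Record, field: str) -> Record:
    """Return a copy of the event without the given field."""
    return {k: v for k, v in event.items() if k != field}


def mask_ip(event: Record, field: str = "ip") -> Record:
    """Return a copy with the last octet of the IP replaced by XXX."""
    if field not in event:
        return dict(event)
    octets = event[field].split(".")
    masked = ".".join(octets[:-1] + ["XXX"])
    return {**event, field: masked}


def apply_transforms(event: Record, transforms: List[Transform]) -> Record:
    """Apply each single-message transform in order."""
    for transform in transforms:
        event = transform(event)
    return event


before = {"user": "ab123", "gender": "female", "ip": "1.2.3.95"}
after = apply_transforms(before, [lambda e: drop_field(e, "gender"), mask_ip])
```

Each transform returns a new dictionary, so the original event is left untouched and transforms can be reordered or reused independently.
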
  41. Where SMTs live (ingress example): Data Source, then inside Kafka Connect: Source Connector (generates events), SMT1 ... SMTn (transform), Converter (serializes: 10101 01010).
  42. Confluent Hub: Discover Connectors, SMTs, and converters. confluent.io/hub. Easy installation. Documentation, support, etc.
  43. 02 Store your Event Streams. Kafka Cluster (VM). Storage is Distributed, Scalable, Reliable, Durable, Performant.
  44. Kafka scales from S to XXL. Labels: Topics, Partitions, Messages / sec, Brokers. Values per deployment size: 10,000,000, 25,000, 1,000,000, 1,500; 250,000, 500, 25,000, 25; 1, 5, 300, 3.
  45. Store your Events as long as you want. Event Streaming Paradigm: Highly Scalable, Durable, Persistent, Maintains Order, Fast (Low Latency). Example: Kafka = Source of Truth, stores every article since 1851. Normalized assets (images, articles, bylines, etc.) are denormalized into “Content View”. https://www.confluent.io/blog/publishing-apache-kafka-new-york-times/
  46. Secure your Event Streams: Authentication; Authorization; Data Confidentiality.
  47. Achievement Data Unlocked: All Your Data Now Available as Streams of Events
  48. Independent access to Event Streams. Producer Alice writes to a stream with offsets 1 2 3 4 5 6 7 8 9. Consumer Bob reads at Offset = 3; Consumer Dina reads at Offset = 7.
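
Slide 48's point, that each consumer reads the same stream at its own pace, can be sketched with a log that tracks one position per consumer. The class `EventLog` and its methods are hypothetical names for this illustration; the sketch counts positions as "number of records already read" rather than reproducing Kafka's own offset semantics.

```python
from typing import Dict, List


class EventLog:
    """Shared append-only log with an independent read position per consumer."""

    def __init__(self) -> None:
        self._records: List[str] = []
        self._positions: Dict[str, int] = {}  # consumer name -> records read

    def write(self, record: str) -> None:
        """Append a record; existing records are never modified."""
        self._records.append(record)

    def read(self, consumer: str, max_records: int) -> List[str]:
        """Return the next batch for this consumer and advance only its position."""
        start = self._positions.get(consumer, 0)
        batch = self._records[start:start + max_records]
        self._positions[consumer] = start + len(batch)
        return batch

    def position(self, consumer: str) -> int:
        return self._positions.get(consumer, 0)


log = EventLog()
for n in range(1, 10):           # Producer Alice writes nine events
    log.write(f"event-{n}")

bob_batch = log.read("Bob", 3)   # Bob has read three events
dina_batch = log.read("Dina", 7) # Dina has read seven events
```

Reading does not remove anything from the log, which is why Bob can continue from his own position regardless of how far Dina has progressed.
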
  49. 03 Process & Analyze your Event Streams. With separate frameworks: … and more. With Streaming SQL: KSQL. With apps, microservices: Kafka Streams or Kafka consumer clients … and more.
  50. KSQL ● You write only SQL ● No Java, Python, or other boilerplate to wrap around it! ● Create KSQL User Defined Functions in Java when needed ● All you need is Kafka. Example:
        CREATE STREAM fraudulent_payments AS
          SELECT * FROM payments
          WHERE fraudProbability > 0.8
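
For readers less familiar with streaming SQL, the `fraudulent_payments` statement on slide 50 behaves like a filter that runs continuously over incoming events. A rough Python analogue, using a generator and made-up sample records (the `id` field and the sample values are assumptions), looks like this:

```python
from typing import Dict, Iterable, Iterator

Payment = Dict[str, object]


def fraudulent_payments(payments: Iterable[Payment],
                        threshold: float = 0.8) -> Iterator[Payment]:
    """Yield each payment whose fraudProbability exceeds the threshold."""
    for payment in payments:
        if payment["fraudProbability"] > threshold:
            yield payment


# Hypothetical sample events standing in for the `payments` stream.
payments = [
    {"id": "p1", "fraudProbability": 0.10},
    {"id": "p2", "fraudProbability": 0.93},
    {"id": "p3", "fraudProbability": 0.80},
]
flagged = [p["id"] for p in fraudulent_payments(payments)]
```

The generator processes one event at a time and never needs the whole input in memory, which mirrors how a streaming query keeps running as new payments arrive. Note that the comparison is strict, so a probability of exactly 0.8 is not flagged.
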
  51. Stream Processing with KSQL. Pick your favorite interface: 1 UI; 2 CLI (ksql>); 3 API (POST /query); 4 Headless.
  52. Where KSQL lives: KSQL Cluster (VM), network read/write. Elastic & Scalable; Fault-tolerant; Exactly-once; Kafka security; Aggregations; Windowing; Streams & Tables.
  53. Stream Processing with KSQL: Process event streams to create new, continuously updated streams or tables. Diagram labels: Stream 01, Stream 02, Stream 03, Table, Query. Streaming Query: CREATE TABLE OrderTotals AS SELECT * FROM ... EMIT CHANGES
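
The `OrderTotals` query on slide 53 is elided in the deck, so the aggregation below is an assumption made purely for illustration: it sums a hypothetical `amount` field per `region` (the `region` column does appear in the later pull-query slides). The sketch shows the general idea of a continuously updated table, where every incoming event emits a change and the table holds the latest value per key.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

Order = Dict[str, object]


def order_totals(orders: Iterable[Order]) -> Iterator[Tuple[str, float]]:
    """Emit (region, running_total) each time an order updates the table."""
    totals: Dict[str, float] = defaultdict(float)
    for order in orders:
        region = str(order["region"])
        totals[region] += float(order["amount"])
        yield region, totals[region]


# Hypothetical sample events standing in for an orders stream.
orders = [
    {"region": "Europe", "amount": 10.0},
    {"region": "Asia", "amount": 5.0},
    {"region": "Europe", "amount": 2.5},
]
changes = list(order_totals(orders))  # the stream of table updates
table = dict(changes)                 # latest value per key
```

Here `changes` plays the role of the changelog that a streaming query emits, while `table` is the materialized view another application could look up by key, for example `table["Europe"]`.
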
  54. Stream Processing with KSQL: Query tables in Kafka from other apps, similar to a relational database. Pull Query: SELECT * FROM OrderTotals WHERE region = 'Europe'. Diagram labels: Table, Query, Result. Upcoming feature (KLIP-8).
  55. Stream Processing with KSQL: Query tables in Kafka from other apps, similar to a relational database. Other Applications (Java, Go, Python, etc.) can directly query tables: request-response via network (KSQL REST API). Query: SELECT * FROM OrderTotals WHERE region = 'Europe'. Diagram labels: Table, Result. Upcoming feature (KLIP-8).
  56. KSQL integrates with Kafka Connect: Simplifies event streaming between Kafka and other systems. Upcoming feature (KLIP-7). Diagram label: controls. Example:
        CREATE SOURCE CONNECTOR my-postgres-jdbc WITH (
          connector.class = "io.confluent.connect.jdbc.JdbcSourceConnector",
          connection.url = "jdbc:postgresql://dbserver:5432/my-db",
          ...);
  57. KSQL example use case: Creating an event-driven dashboard from a customer database. customers table: Kafka Connect is streaming change events. Aggregations are computed in real-time. Elasticsearch: Results are continuously updating.
  58. Kafka Streams ● You write standard Java or Scala applications to process your events ● The Kafka Streams library makes these applications: elastic, scalable, fault-tolerant, and more ● All you need is Kafka
  59. Writing a Kafka Streams application. Add as dependency to your Java/Scala app:
        <dependency>
          <groupId>org.apache.kafka</groupId>
          <artifactId>kafka-streams</artifactId>
          <version>2.3.0</version>
        </dependency>
  60. Where your Kafka Streams apps live: KStreams Application (App instance 1 ... App instance n), VM, network read/write. Elastic & Scalable; Fault-tolerant; Exactly-once; Kafka security; Aggregations; Windowing; Streams & Tables.
  61. Stream Processing with Kafka Streams apps: Process event streams to create new, continuously updated streams or tables. Event-driven apps and services communicate through Kafka: Orders, Inventory, Shipping, Frontend. New apps can easily be added by tapping into existing event streams: Reporting.
  62. Stream Processing with Kafka Streams apps: Query your application’s tables and state from other apps. Other Applications (Java, Go, Python, etc.) can directly query tables: request-response via network (e.g. REST API). Diagram labels: Reporting App, App instance 1 ... App instance n, Table, Result.
  63. Apache Kafka is a distributed event streaming platform: Publish & Subscribe to Events; Store Events; Process & Analyze Events.
  64. Where to go from here for more details on event-driven architectures with Kafka
  65. THANK YOU. @miguno, michael@confluent.io. cnfl.io/meetups, cnfl.io/blog, cnfl.io/slack