Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, Streams vs. Databases (Google DevFest Switzerland 2017)

My talk at Google DevFest Switzerland, Fribourg, Oct 2017.

https://devfest.ch/schedule/day1?sessionId=118

Abstract:
Modern businesses have data at their core, and this data is changing continuously. How can we harness this torrent of information in real time? The answer is stream processing, and the technology that has become the core platform for streaming data is Apache Kafka.

Among the thousands of companies that use Kafka to transform and reshape their industries are the likes of Netflix, Uber, PayPal, and Airbnb, but also established players such as Goldman Sachs, Cisco, and Oracle. Unfortunately, today’s common architectures for real-time data processing at scale suffer from complexity: many technologies need to be stitched together and operated jointly, and each individual technology is often complex by itself. This has led to a strong discrepancy between how we, as engineers, would like to work and how we actually end up working in practice.

In this session we talk about how Apache Kafka helps you to radically simplify your data architectures. We cover how you can now build normal applications to serve your real-time processing needs, rather than building clusters or similar special-purpose infrastructure, and still benefit from properties such as high scalability, distributed computing, and fault tolerance, which are typically associated exclusively with cluster technologies. We walk through common use cases to show that stream processing in practice often requires database-like functionality, and how Kafka allows you to bridge the worlds of streams and databases when implementing your own core business applications (inventory management for large retailers, patient monitoring in healthcare, fleet tracking in logistics, etc.), for example in the form of event-driven, containerized microservices. We will also give a brief shout-out to the recently launched KSQL, a streaming SQL engine for Apache Kafka.
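
The bridge between streams and databases mentioned in the abstract is often described as a stream-table duality: a stream of change events can be replayed into a table, and a table can be turned back into a stream of changes. The following is a minimal plain-Java sketch of that idea, not Kafka code. The class name `DualitySketch`, its methods, and the sample keys are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: no Kafka classes are used, all names are hypothetical.
class DualitySketch {

    // Stream -> table: replay the change events in order,
    // keeping only the latest value seen for each key.
    static Map<String, Integer> toTable(List<Map.Entry<String, Integer>> changelog) {
        Map<String, Integer> table = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> change : changelog) {
            table.put(change.getKey(), change.getValue());
        }
        return table;
    }

    // Table -> stream: emit one change event per current table entry.
    static List<Map.Entry<String, Integer>> toStream(Map<String, Integer> table) {
        List<Map.Entry<String, Integer>> stream = new ArrayList<>();
        for (Map.Entry<String, Integer> row : table.entrySet()) {
            stream.add(Map.entry(row.getKey(), row.getValue()));
        }
        return stream;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> changelog = List.of(
            Map.entry("alice", 1),
            Map.entry("bob", 5),
            Map.entry("alice", 3)); // supersedes alice=1

        Map<String, Integer> table = toTable(changelog);
        System.out.println(table);                    // {alice=3, bob=5}
        System.out.println(toTable(toStream(table))); // {alice=3, bob=5}
    }
}
```

Replaying the stream derived from the table yields the same table again, which is the property that lets one system serve both views of the data.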


  1. Rethinking Stream Processing with Apache Kafka: Applications vs. Clusters, Streams vs. Databases. Michael G. Noll, Confluent (@miguno). Google GDG DevFest Switzerland, October 28-29, 2017
  2. Apache Kafka: birthed as a messaging system, now a streaming platform. Timeline (2012 to 2017): 0.7 Cluster mirroring; 0.8 Intra-cluster replication; 0.9 Data integration (Connect API); 0.10 Data processing (Streams API); 0.11 Exactly-once semantics
  13. (Does NOT run inside the Kafka brokers!)
  14. (Does NOT run inside the Kafka brokers!)
  18. http://docs.confluent.io/current/streams/kafka-streams-examples/docs/index.html
  20. Before
  21. Before; With Kafka’s Streams API
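  22. import org.apache.kafka.streams.kstream.KStream;
      import org.apache.kafka.streams.kstream.KTable;
      import org.apache.kafka.streams.processor.Processor;
      import org.apache.kafka.streams.processor.ProcessorContext;

      // `builder` is the application's topology builder, created elsewhere.
      KStream<Integer, Integer> input = builder.stream("numbers-topic");

      // Stateless computation: double every value
      KStream<Integer, Integer> doubled = input.mapValues(v -> v * 2);

      // Stateful computation: running sum of all odd values under a single key
      KTable<Integer, Integer> sumOfOdds = input
          .filter((k, v) -> v % 2 != 0)
          .selectKey((k, v) -> 1)
          .groupByKey()
          .reduce((v1, v2) -> v1 + v2, "sum-of-odds");

      // Custom processor that prints every value it receives
      class PrintToConsoleProcessor<K, V> implements Processor<K, V> {
          @Override
          public void init(ProcessorContext context) {}

          @Override
          public void process(K key, V value) {
              System.out.println("Got value " + value);
          }

          @Override
          public void punctuate(long timestamp) {}

          @Override
          public void close() {}
      }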
  24. Linux, Windows
  30. http://www.confluent.io/blog/introducing-kafka-streams-stream-processing-made-simple
      https://kafka.apache.org/documentation/streams#streams_duality
  43. …and many more…
  44. …and many more…
  47. Kafka Streams API in the wild. Timeline (2016 to 2017): first release of Kafka’s Streams API (0.10.0); today; Kafka 1.0*. In production at LINE Corp., Japan: 220+ million active users, processing millions of msg/s. “Applying Kafka Streams for internal message delivery pipeline”: https://engineering.linecorp.com/en/blog/detail/80
  49. Supported since Apache Kafka 0.11 (June 2017)
  58. …and more…
  60. $ curl -sXGET http://localhost:7070/kafka-music/charts/top-five
      [
        {
          "artist": "Subhumans",
          "album": "Live In A Dive",
          "name": "All Gone Dead",
          "plays": 126
        },
        {
          "artist": "Wheres The Pope?",
          "album": "PSI",
          "name": "Fear Of God",
          "plays": 115
        },
        ...
      ]
  64. https://kafka.apache.org/documentation/streams
      http://docs.confluent.io/current/streams/
      https://www.confluent.io/downloads/
  65. KSQL: a Streaming SQL Engine for Apache Kafka™ from Confluent
      ✓ No coding required, all you need is SQL
      ✓ No separate processing cluster required
      ✓ Powered by Kafka: elastic, scalable, distributed, battle-tested

      CREATE TABLE possible_fraud AS
        SELECT card_number, count(*)
        FROM authorization_attempts
        WINDOW TUMBLING (SIZE 5 SECONDS)
        GROUP BY card_number
        HAVING count(*) > 3;

      CREATE STREAM vip_actions AS
        SELECT userid, page, action
        FROM clickstream c
        LEFT JOIN users u ON c.userid = u.userid
        WHERE u.level = 'Platinum';

      KSQL is the simplest way to process streams of data in real-time
      ✓ Perfect for streaming ETL, anomaly detection, event monitoring, and more
      ✓ Part of Confluent Open Source
      https://github.com/confluentinc/ksql
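
To make the windowing in the first KSQL query on slide 65 more concrete, here is a rough plain-Java sketch of what a 5-second tumbling-window count with a "more than 3" filter computes over a finite batch of events. It is not how KSQL is implemented, and a real streaming engine updates its results continuously rather than in one pass. The class `TumblingWindowSketch`, the `Attempt` type, the `cardNumber@windowStart` result key, and the sample data are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: a batch approximation of the windowed count, all names hypothetical.
class TumblingWindowSketch {

    static final long WINDOW_SIZE_MS = 5_000L; // corresponds to SIZE 5 SECONDS

    // One authorization attempt: which card, and when (epoch millis, assumed >= 0).
    static final class Attempt {
        final String cardNumber;
        final long timestampMs;

        Attempt(String cardNumber, long timestampMs) {
            this.cardNumber = cardNumber;
            this.timestampMs = timestampMs;
        }
    }

    // Counts attempts per (card number, window) and keeps only groups with
    // more than 3 attempts. Result keys have the form "cardNumber@windowStartMs".
    static Map<String, Long> possibleFraud(List<Attempt> attempts) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (Attempt a : attempts) {
            // Tumbling windows do not overlap: each timestamp falls into exactly one.
            long windowStart = (a.timestampMs / WINDOW_SIZE_MS) * WINDOW_SIZE_MS;
            counts.merge(a.cardNumber + "@" + windowStart, 1L, Long::sum);
        }
        counts.values().removeIf(count -> count <= 3); // HAVING count(*) > 3
        return counts;
    }

    public static void main(String[] args) {
        List<Attempt> attempts = List.of(
            new Attempt("1234", 0L),
            new Attempt("1234", 1_000L),
            new Attempt("1234", 2_000L),
            new Attempt("1234", 3_000L),  // 4th attempt in window [0, 5000)
            new Attempt("1234", 6_000L),  // falls into the next window
            new Attempt("9999", 100L),
            new Attempt("9999", 200L),
            new Attempt("9999", 300L));   // only 3 attempts, not flagged

        System.out.println(possibleFraud(attempts)); // {1234@0=4}
    }
}
```

Only card 1234 in the window starting at 0 ms is flagged, because it is the only group with more than three attempts inside a single 5-second window.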