Ce diaporama a bien été signalé.
Nous utilisons votre profil LinkedIn et vos données d’activité pour vous proposer des publicités personnalisées et pertinentes. Vous pouvez changer vos préférences de publicités à tout moment.

Apache Kafka for the Hybrid IoT

177 vues

Publié le

Tons of devices to supervise the environment connected to the cloud, tons of data to take in as a river in flood, tons of messages to process for getting information from them. This is where IoT data streaming comes in—with the need to analyze this unbounded dataset in real time and react to events. At the same time, controlling these devices at the edge is a crucial part of the overall solution, but just one technology isn’t enough anymore. During this session, we’ll see how it’s possible to use Apache Kafka together with a “traditional” messaging protocol like AMQP 1.0, getting the strengths from both, for making the hybrid IoT a reality.

Publié dans : Logiciels
  • Soyez le premier à commenter

Apache Kafka for the Hybrid IoT

  1. 1. APACHE KAFKA FOR THE HYBRID IOT Data streaming platform and “traditional” messaging living together Paolo Patierno Principal Software Engineer, Messaging & IoT team 21/3/2019
  2. 2. INSERT DESIGNATOR, IF NEEDED2 ● Principal Software Engineer @ Red Hat ○ Messaging & IoT team ● Lead/Committer @ Eclipse Foundation ○ Hono, Paho and Vert.x projects ● Microsoft MVP Azure/IoT ● Hacking embedded/IoT devices in a previous life ● Dad of two, husband of one ● I love running, swimming, MotoGP, #VR46, ssc Napoli WHO AM I ? @ppatierno
  3. 3. INSERT DESIGNATOR, IF NEEDED3 IOT: FROM THE SENSORS AND BACK Ingestion Process & Analysis From raw data to the intelligence … and back Sensors Intelligence DATA INFORMATION KNOWLEDGE WISDOM Actuators COMMAND& CONTROL
  4. 4. INSERT DESIGNATOR, IF NEEDED4 ● Huge amount of data to ingest ● Unbounded dataset ● Need for real time analytics ● Need for real time reactions ● Sometimes the need to join : ○ with static data ○ with historical data IOT WORKLOAD
  6. 6. INSERT DESIGNATOR, IF NEEDED6 ● A type of data processing engine that is designed with infinite datasets in mind ● Working with data as they arrive ● Working with an infinite stream of data ○ Moving boundaries with data in “windows” ● Tradeoffs between latency/cost/correctness STREAM PROCESSING
  7. 7. INSERT DESIGNATOR, IF NEEDED7 ● Timing ○ Supporting “windowing” for getting analytics in a specific time window ○ Using “event time” (vs “processing time”) which makes much sense in IoT ○ Handling “out of order” data when the device is not able to guarantee order ● State management ○ Be fast on joining real time data with metadata to enrich information content ○ Be fast on data aggregation ○ Be able to join different streams of data, from different devices ● Re-processing ○ Supporting the possibility to re-read the stream of data ● Scalability/Partitioning ○ Handle data from more devices and burst of traffic STREAM PROCESSING & IOT What are the good parts for IoT?
  8. 8. INSERT DESIGNATOR, IF NEEDED8 STREAM PROCESSING & IOT Edge Gateways Devices Ingestion platform Applications Streaming data platform Streams processing framework
  9. 9. INSERT DESIGNATOR, IF NEEDED9 ● Streaming data platform ○ Apache Kafka ● Streams processing framework ○ Apache Spark (streaming) ○ Apache Samza ○ Apache Flink ○ … STREAM PROCESSING & IOT It’s a wild west out there
  11. 11. INSERT DESIGNATOR, IF NEEDED11 STREAM PROCESSING & IOT Edge Gateways Devices Apache Kafka + Kafka Streams API streams API
  12. 12. INSERT DESIGNATOR, IF NEEDED12 APACHE KAFKA What is that? “ … a publish/subscribe messaging system …” “ … a streaming data platform …” “ … a distributed, horizontally-scalable, fault-tolerant, commit log …”
  13. 13. INSERT DESIGNATOR, IF NEEDED13 APACHE KAFKA The Streams API ● Stream processing framework, just a Java lib! ● Streams are Kafka topics (as input and output) ● Scaling the stream application horizontally ● Creates a topology of processing nodes (filter, map, join etc) acting on a stream ○ Low level processor API ○ High level DSL ○ Using “internal” topics (when re-partitioning is needed or for “stateful” transformations) Source processor Sink processor Stream processor
  15. 15. INSERT DESIGNATOR, IF NEEDED15 DIFFERENT NEEDS … compared to the data ingestion ● Correlation … ○ … between the command and the related result ● Commands executed just one time … ○ … no need for re-playing a commands “history” ● Most of the times no need for storing … ○ … but executing commands only if the device in online ● Embedded devices are mostly low power … ○ …. and IoT/messaging protocols fit better
  16. 16. INSERT DESIGNATOR, IF NEEDED16 “TRADITIONAL” MESSAGING … with AMQP 1.0 ● Allows to handle request/reply pattern ○ Sending a command, receiving the execution status/result ● Most of the times, storing isn’t really needed ○ The command has to be executed now, not later… ○ … or using storage but with a TTL on the message command … ○ … and you can also check the status of delivery and cancel ● Filtering messages ○ Getting only specific messages based on headers values
  17. 17. INSERT DESIGNATOR, IF NEEDED17 “TRADITIONAL” MESSAGING Direct vs Store-and-forward ● Only if the device is connected ● The “sender” always knows that message has reached the final destination routers network producer consumer consumer routers network producer consumer consumer brokers ● Even if the device isn’t connected; TTL helps for stale messages ● The “senders” only knows that message has reached the intermediary
  19. 19. INSERT DESIGNATOR, IF NEEDED19 MOVING THE IOT INTO THE “FOG” The role of an IoT gateway ● The protocols Babel tower ○ Protocol translation (BLE, ZigBee, MQTT, AMQP 1.0, HTTP, …) ● Reducing cloud workload ○ Offload part of the processing at the edge ● Real time reaction ○ Avoiding the cloud latency ● Offline handling ○ Connection isn’t reliable or the bandwidth is low ● Security and privacy ○ Keep data at the edge, avoiding the cloud ● Reducing price ○ Paying for less devices connected, less data transferred
  20. 20. INSERT DESIGNATOR, IF NEEDED20 AMQP 1.0 ON THE GATEWAY ● Kafka protocol needs more connections to all brokers ○ The IoT gateway has to provide all these connection ○ Exposing all the brokers can be cumbersome ○ Request/reply doesn't fit so well ● With AMQP 1.0 … ○ … the IoT gateway can have just one connection to the cloud ○ … a dispatch router can be used to handle the scalability and failures ○ … with one connection, different sensors traffic can be multiplexed with “links” ○ … with one connection, different “links” for ingestion and control
  21. 21. INSERT DESIGNATOR, IF NEEDED21 DEMO An “hybrid” IoT pipeline streams API
  22. 22. THANK YOU! @ppatierno