IoT Sensor Analytics with
Apache Kafka, KSQL, TensorFlow and MQTT
Kai Waehner
Technology Evangelist
kontakt@kai-waehner.de
LinkedIn
@KaiWaehner
www.confluent.io
www.kai-waehner.de
Kafka-Native End-to-End IoT Data Integration and Processing
3
Agenda
1) IoT Use Cases
2) MQTT Standard
3) Apache Kafka Ecosystem
4) TensorFlow for IoT Scenarios
5) End-to-End IoT Integration Architecture(s)
6) IoT Data Processing
7) Live Demo: End-to-End Sensor Analytics
4
1) IoT Use Cases
6
Connected Intelligence (Cars, Machines, Robots, …)
7
Smart Cities
8
Smart Retail and Customer 360
9
Intelligent Applications (Early Part Scrapping, Predictive Maintenance, …)
10
Architecture (High Level)
[Diagram labels: Devices, Gateway, Connect w/ MQTT connector, Kafka Brokers (Streaming Platform), Edge, Data Center / Cloud]
Use cases: Device Tracking (Real Time), Predictive Maintenance (Near Real Time), Log Analytics (Batch)
How to integrate?
13
2) MQTT Standard
14
MQTT - Publish / subscribe messaging protocol
• Built on top of TCP/IP for constrained devices and unreliable networks
• Many (open source) broker implementations
• Many client libraries
• IoT-specific features for bad network / connectivity
• Widely used (mostly IoT, but also web and mobile apps via MQTT over WebSockets)
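Publish/subscribe routing in MQTT is driven by topic names and subscription filters. The following is a minimal sketch of filter matching, assuming the standard wildcards (`+` for one level, `#` for all remaining levels); the function name `topic_matches` and the example topics are illustrative, and special handling of topics starting with `$` is left out.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic name matches a subscription filter."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: only valid as the last level; it matches
            # the parent level and everything below it.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    # Without a trailing '#', filter and topic must have the same depth.
    return len(filter_levels) == len(topic_levels)
```

A processor interested in every car could subscribe with `+/car`, while one interested in a single device could use a filter such as `device42/#`.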
17
MQTT Architecture (large scale)
[Diagram labels: Load Balancer, MQTT Server 1 to 4, Processor 1 to 4; topic: [deviceid]/car]
18
MQTT Trade-Offs
Pros
• Lightweight
• Simple API
• Built for poor connectivity / high latency scenarios
• Many client connections (tens of thousands per MQTT server)
Cons
• Queuing, not stream processing
• Can’t handle usage surges (no buffering)
• No high scalability (true for most MQTT brokers)
• Very asynchronous processing (often offline for a long time)
• No good integration with the rest of the enterprise
• No reprocessing of events
19
3) Apache Kafka Ecosystem
20
Apache Kafka – The Rise of a Streaming Platform
[Diagram labels: The Log, Producer, Consumer, Connectors, Streaming Engine]
21
Log and Pub/Sub
23
Apache Kafka == Distributed Commit Log with Replication
25
Apache Kafka at Scale
https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/63921 (2018)
https://qconlondon.com/london2018/presentation/cloud-native-and-scalable-kafka-architecture (2018)
26
Kafka Trade-Offs (from IoT perspective)
Pros
• Stream processing, not just queuing
• High throughput
• Large scale
• High availability
• Long term storage and buffering
• Reprocessing of events
• Good integration with the rest of the enterprise
Cons
• Not built for tens of thousands of connections
• Requires stable network and good infrastructure
• No IoT-specific features like keep alive or last will and testament
27
(De facto) Standards for Processing IoT Data
MQTT + Apache Kafka = A Match Made In Heaven
28
4) TensorFlow for IoT Scenarios
29
TensorFlow
TensorFlow is an open source software library for high
performance numerical computation. Its flexible architecture
allows easy deployment of computation across a variety of
platforms (CPUs, GPUs, TPUs), and from desktops to clusters of
servers to mobile and edge devices. Originally developed by
researchers and engineers from the Google Brain team within
Google’s AI organization, it comes with strong support for
machine learning and deep learning and the flexible
numerical computation core is used across many other scientific
domains.
https://www.tensorflow.org/
30
The First Analytic Models
How to deploy the models
in production?
…real-time processing?
…at scale?
…24/7 zero downtime?
31
Hidden Technical Debt in Machine Learning Systems
https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf
32
Apache Kafka’s Open Source Ecosystem as Infrastructure for ML
33
Apache Kafka’s Open Ecosystem as Infrastructure for ML
[Diagram components: Kafka Streams, Kafka Connect, REST Proxy, Schema Registry, Go / .NET / Python, Kafka Producer, KSQL]
37
Replayability — a log never forgets!
[Diagram labels: Producer, Distributed Commit Log, Time, Model A, Model B, Model X, Google Cloud Storage, HDFS]
• Different models with same data
• Different ML frameworks
• AutoML compatible
• A/B testing
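The replay idea can be sketched with a tiny in-memory stand-in for the commit log. This is not the Kafka client API; `CommitLog`, `replay`, the temperature readings and the two threshold "models" are hypothetical names and values chosen for illustration. The point is that reading does not remove records, so any number of models can be evaluated on exactly the same data.

```python
class CommitLog:
    """Append-only in-memory log; reads never remove records."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Store a record and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def read_from(self, offset):
        """Return all records at or after the given offset."""
        return self._records[offset:]


def replay(log, model, from_offset=0):
    """Apply a model to every record from from_offset onward."""
    return [model(record) for record in log.read_from(from_offset)]


log = CommitLog()
for temperature in [85, 95, 105]:
    log.append(temperature)

# Two hypothetical models with different thresholds read the same data.
model_a = lambda t: t > 100
model_b = lambda t: t > 90

print(replay(log, model_a))  # [False, False, True]
print(replay(log, model_b))  # [False, True, True]
```

Because each reader only tracks its own offset, an A/B test amounts to running both models from the same starting offset and comparing the outputs.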
38
Analytic Model (Autoencoder for Anomaly Detection)
39
Model Deployment #1: RPC Communication to do Model Inference
[Diagram: a Kafka Streams application sends each Input Event as a Request via gRPC to Model Serving (TensorFlow Serving) and receives the Prediction in the Response]
40
Model Deployment #2: Model Inference Natively in the App
[Diagram: a Kafka Streams application turns each Input Event into a Prediction inside the application itself]
41
5) End-to-End IoT Integration Architecture(s)
43
Architecture (High Level) – Machine Learning Perspective
[Diagram labels: Devices, Gateway, Connect w/ MQTT connector, Kafka Brokers (Streaming Platform), Edge, Data Center / Cloud]
Use cases: Edge Analytics (Real Time Model Serving), Predictive Maintenance (Near Real Time Model Serving), Model Training (Batch)
46
Kafka-Native Integration Options between MQTT and Apache Kafka
Kafka Connect
MQTT Proxy
REST Proxy
49
Integration with Kafka Connect (Source and Sink)
[Diagram labels: Devices, MQTT, MQTT Broker, Connect w/ MQTT connector (two instances), Kafka Brokers, Kafka Consumer]
MQTT Broker
• Persistent + offers MQTT-specific features
• Consumes push data from IoT devices
Kafka Connect
• Kafka Consumer + Kafka Producer under the hood
• Pull-based (at its own pace, without overwhelming the source or getting overwhelmed by the source)
• Out-of-the-box scalability and integration features (like connectors, converters, SMTs)
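The pull-based behaviour can be illustrated with a small sketch. It is not the Kafka Connect API: `SourceBuffer` stands in for whatever holds pushed device messages, and `drain` and `max_records` are hypothetical names. The reader decides how many records it takes per poll, so a burst on the push side only grows the buffer rather than forcing work on the reader.

```python
from collections import deque


class SourceBuffer:
    """Holds messages pushed by devices until a reader polls for them."""

    def __init__(self):
        self._queue = deque()

    def push(self, message):
        """Called by the push side; never blocks on the reader."""
        self._queue.append(message)

    def poll(self, max_records):
        """Return at most max_records messages, oldest first."""
        batch = []
        while self._queue and len(batch) < max_records:
            batch.append(self._queue.popleft())
        return batch


def drain(source, max_records):
    """Poll repeatedly at the reader's own batch size until the source is empty."""
    batches = []
    while True:
        batch = source.poll(max_records)
        if not batch:
            return batches
        batches.append(batch)


source = SourceBuffer()
for reading in range(5):
    source.push(reading)

print(drain(source, max_records=2))  # [[0, 1], [2, 3], [4]]
```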
57
MQTT Proxy
[Diagram labels: Devices, MQTT, MQTT Proxy, Kafka Brokers, Kafka Consumer]
MQTT Proxy
• MQTT is push-based
• Horizontally scalable
• Consumes push data from IoT devices and forwards it to the Kafka Broker at low latency
• Kafka Producer under the hood
• No MQTT Broker needed
Kafka Broker
• Source of truth
• Responsible for persistence, high availability, reliability
60
Confluent REST Proxy
[Diagram: IoT Applications reach the REST Proxy via REST / HTTP(S); Native Kafka Applications (Java, C, Go, …) connect via TCP]
The "simple alternative" for IoT
• Simple and understood
• HTTP(S) Proxy → Push-based
• Security "easier"
• Scalable with standard load balancer (still synchronous HTTP)
• Not for very high throughput
• Implement Kafka Connect features in your client app
62
6) IoT Data Processing
63
Processing Options for MQTT Data with Apache Kafka
Streams
Kafka native vs. additional big data cluster and technology
(or others, you name it …)
64
IoT Data Processing
[Diagram labels: MQTT Device, Kafka Client, Kafka Cluster, Kafka Connect, Kafka Streams / KSQL, Batch Analytics System, Real Time Alerting System. Legend: Kafka Ecosystem, Other Components. Locations: At the edge, On premise DC / Cloud. Data flows: All Data, Process Data Continuously, Forward Processed Data.]
68
KSQL – Continuous Queries for Streaming ETL / Anomaly Detection
CREATE STREAM vip_actions AS
SELECT userid, page, action FROM clickstream c
LEFT JOIN users u ON c.userid = u.user_id
WHERE u.level = 'Platinum';
CREATE TABLE possible_fraud AS
SELECT card_number, count(*)
FROM authorization_attempts
WINDOW TUMBLING (SIZE 5 MINUTES)
GROUP BY card_number
HAVING count(*) > 3;
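The windowing logic of the `possible_fraud` query can be sketched in plain Python. This is only an illustration of tumbling-window counting over a finite list, whereas KSQL evaluates the query continuously on a stream; the input format (pairs of epoch milliseconds and card number) and the function name are assumptions.

```python
from collections import Counter

WINDOW_MS = 5 * 60 * 1000  # tumbling window size: 5 minutes


def possible_fraud(attempts, threshold=3):
    """Count attempts per (window, card) and keep counts above the threshold.

    attempts: iterable of (timestamp_ms, card_number) pairs.
    Returns {(window_start_ms, card_number): count}.
    """
    counts = Counter()
    for timestamp_ms, card_number in attempts:
        # Tumbling windows do not overlap: each event falls into exactly one.
        window_start = timestamp_ms - timestamp_ms % WINDOW_MS
        counts[(window_start, card_number)] += 1
    return {key: n for key, n in counts.items() if n > threshold}


attempts = [
    (0, "A"), (1_000, "A"), (2_000, "A"), (3_000, "A"),  # 4 in the first window
    (0, "B"), (1_000, "B"), (2_000, "B"),                # only 3, not flagged
    (300_000, "A"),                                      # next window, count 1
]
print(possible_fraud(attempts))  # {(0, 'A'): 4}
```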
69
7) Live Demo: End-to-End Sensor Analytics
70
KSQL and Deep Learning (Auto Encoder) for Anomaly Detection
[Diagram labels: Car Sensors, MQTT Proxy, Kafka Cluster, Kafka Connect, KSQL, Elasticsearch, Grafana, Real Time Emergency System. Legend: Kafka Ecosystem, Other Components. Locations: At the edge, On premise DC. Data flows: All Data, Potential Defect, Apply Analytic Model, Filter Anomalies.]
71
Model Training with Python, KSQL, TensorFlow, Keras and Jupyter
https://github.com/kaiwaehner/python-jupyter-apache-kafka-ksql-tensorflow-keras
72
Model Deployment with Apache Kafka, KSQL and TensorFlow
CREATE STREAM AnomalyDetection AS
SELECT sensor_id, detectAnomaly(sensor_values)
FROM car_engine;
detectAnomaly: User Defined Function (UDF)
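What such a UDF might compute can be sketched as follows, assuming the common autoencoder approach of flagging inputs with a high reconstruction error. The slides do not show the UDF's implementation or return type, so the boolean result, the threshold and the `mean_reconstruct` stand-in for a trained autoencoder are all assumptions made for illustration.

```python
def reconstruction_error(values, reconstructed):
    """Mean squared error between the input and its reconstruction."""
    return sum((v - r) ** 2 for v, r in zip(values, reconstructed)) / len(values)


def detect_anomaly(sensor_values, reconstruct, threshold):
    """Flag the input if the model reconstructs it poorly."""
    error = reconstruction_error(sensor_values, reconstruct(sensor_values))
    return error > threshold


def mean_reconstruct(values):
    """Hypothetical stand-in for a trained autoencoder: outputs the mean everywhere."""
    mean = sum(values) / len(values)
    return [mean] * len(values)


print(detect_anomaly([1.0, 1.0, 1.0, 1.0], mean_reconstruct, threshold=0.5))  # False
print(detect_anomaly([1.0, 1.0, 1.0, 9.0], mean_reconstruct, threshold=0.5))  # True
```

In a real deployment the `reconstruct` callable would wrap the trained model, and the threshold would be derived from errors observed on normal data.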
73
Live Demo
End-to-End Sensor Analytics…
Python, Jupyter Notebook, TensorFlow, Keras, Apache Kafka, KSQL and MQTT
75
Deep Learning UDF for KSQL for Streaming Anomaly Detection of MQTT IoT Sensor Data
https://github.com/kaiwaehner/ksql-udf-deep-learning-mqtt-iot
77
Kai Waehner
Technology Evangelist
kontakt@kai-waehner.de
@KaiWaehner
www.kai-waehner.de
www.confluent.io
LinkedIn
Questions? Feedback?
Please contact me!
Come to our booth to find out more about Kafka and Confluent

IoT Sensor Analytics with Python, Jupyter, TensorFlow, Keras, Apache Kafka, KSQL and MQTT

  • 1. 1Confidential IoT Sensor Analytics with Apache Kafka, KSQL, TensorFlow and MQTT Kai Waehner Technology Evangelist kontakt@kai-waehner.de LinkedIn @KaiWaehner www.confluent.io www.kai-waehner.de Kafka-Native End-to-End IoT Data Integration and Processing
  • 2. 3 Agenda 1) IoT Use Cases 2) MQTT Standard 3) Apache Kafka Ecosystem 4) TensorFlow for IoT Scenarios 5) End-to-End IoT Integration Architecture(s) 6) IoT Data Processing 7) Live Demo: End-to-End Sensor Analytics
  • 3. 4 Agenda 1) IoT Use Cases 2) MQTT Standard 3) Apache Kafka Ecosystem 4) TensorFlow for IoT Scenarios 5) End-to-End IoT Integration Architecture(s) 6) IoT Data Processing 7) Live Demo: End-to-End Sensor Analytics
  • 4. 6 Connected Intelligence (Cars, Machines, Robots, …)
  • 6. 8 Smart Retail and Customer 360
  • 7. 9 Intelligent Applications (Early Part Scrapping, Predictive Maintenance, …)
  • 8. 10 ? Architecture (High Level) Kafka BrokerKafka BrokerStreaming Platform Connect w/ MQTT connector GatewayDevicesDevicesDevicesDevice Device Tracking (Real Time) Predictive Maintenance (Near Real Time) Log Analytics (Batch) Edge Data Center / Cloud How to integrate?
  • 9. 13 Agenda 1) IoT Use Cases 2) MQTT Standard 3) Apache Kafka Ecosystem 4) TensorFlow for IoT Scenarios 5) End-to-End IoT Integration Architecture(s) 6) IoT Data Processing 7) Live Demo: End-to-End Sensor Analytics
  • 10. MQTT - Publish / subscribe messaging protocol • Built on top of TCP/IP for constrained devices and unreliable networks • Many (open source) broker implementations • Many client libraries • IoT-specific features for bad network / connectivity • Widely used (mostly IoT, but also web and mobile apps via MQTT over WebSockets)
  • 11. MQTT Architecture (large scale). Diagram components: Load Balancer; MQTT Server 1, 2, 3, 4; topic: [deviceid]/car ...; Processor 1, 2, 3, 4
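The `topic: [deviceid]/car` pattern on slide 11 gives every device its own topic. As a rough illustration that is not part of the deck, the sketch below implements MQTT topic-filter matching with the standard `+` (single-level) and `#` (multi-level) wildcards, so that one processor subscribed to `+/car` would receive messages from all devices. The function name `topic_matches` and the device ids are hypothetical, and the special handling of `$`-prefixed system topics is deliberately left out.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic name matches a subscription filter.

    '+' matches exactly one topic level; '#' matches all remaining levels
    (including none) and is only valid as the last level of the filter.
    Simplification: '$'-prefixed system topics are not treated specially.
    """
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: accept only if it closes the filter.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            # Filter is longer than the topic and no '#' absorbed the rest.
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    # Without '#', filter and topic must have the same number of levels.
    return len(filter_levels) == len(topic_levels)


# Hypothetical device ids following the [deviceid]/car pattern.
print(topic_matches("+/car", "device-42/car"))
print(topic_matches("+/car", "device-42/truck"))
```

A single-level wildcard is enough here because the device id occupies exactly one topic level.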
  • 12. MQTT Trade-Offs. Pros: • Lightweight • Simple API • Built for poor connectivity / high latency scenarios • Many client connections (tens of thousands per MQTT server). Cons: • Queuing, not stream processing • Can’t handle usage surges (no buffering) • No high scalability (true for most MQTT brokers) • Very asynchronous processing (devices are often offline for a long time) • No good integration with the rest of the enterprise • No reprocessing of events
  • 14. Apache Kafka – The Rise of a Streaming Platform: The Log, Connectors, Producer, Consumer, Streaming Engine
  • 16. Apache Kafka == Distributed Commit Log with Replication
  • 17. Apache Kafka at Scale: https://conferences.oreilly.com/strata/strata-ca/public/schedule/detail/63921 (2018), https://qconlondon.com/london2018/presentation/cloud-native-and-scalable-kafka-architecture (2018)
  • 18. Kafka Trade-Offs (from IoT perspective). Pros: • Stream processing, not just queuing • High throughput • Large scale • High availability • Long term storage and buffering • Reprocessing of events • Good integration with the rest of the enterprise. Cons: • Not built for tens of thousands of connections • Requires stable network and good infrastructure • No IoT-specific features like keep alive or last will and testament
  • 19. (De facto) Standards for Processing IoT Data: MQTT + Apache Kafka, A Match Made In Heaven
  • 21. TensorFlow: TensorFlow is an open source software library for high performance numerical computation. Its flexible architecture allows easy deployment of computation across a variety of platforms (CPUs, GPUs, TPUs), and from desktops to clusters of servers to mobile and edge devices. Originally developed by researchers and engineers from the Google Brain team within Google’s AI organization, it comes with strong support for machine learning and deep learning and the flexible numerical computation core is used across many other scientific domains. https://www.tensorflow.org/
  • 22. The First Analytic Models: How to deploy the models in production? …real-time processing? …at scale? …24/7 zero downtime?
  • 23. Hidden Technical Debt in Machine Learning Systems: https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf
  • 24. Apache Kafka’s Open Source Ecosystem as Infrastructure for ML
  • 25. Apache Kafka’s Open Ecosystem as Infrastructure for ML: Kafka Streams, Kafka Connect, REST Proxy, Schema Registry, Go/.NET/Python Kafka Producer, KSQL
  • 26. Replayability: a log never forgets! Diagram components: Time, Producer, Distributed Commit Log, Model A, Model B, Model X, Google Cloud Storage, HDFS. • Different models with same data • Different ML frameworks • AutoML compatible • A/B testing
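To make the replayability idea on slide 26 concrete, here is a minimal in-memory sketch, not Kafka itself: records stay in an append-only log after they are read, so several models can each start at offset 0 and be compared on identical data. The class `CommitLog`, the function `replay`, the sensor readings and both threshold "models" are hypothetical stand-ins.

```python
from typing import Callable, Dict, List


class CommitLog:
    """Toy append-only log: reading a record never removes it."""

    def __init__(self) -> None:
        self._records: List[float] = []

    def append(self, value: float) -> int:
        """Append a record and return its offset."""
        self._records.append(value)
        return len(self._records) - 1

    def read_from(self, offset: int = 0) -> List[float]:
        """Return all records starting at the given offset."""
        return self._records[offset:]


def replay(log: CommitLog,
           models: Dict[str, Callable[[float], bool]]) -> Dict[str, List[bool]]:
    """Feed the complete log to every model, each starting at offset 0."""
    return {name: [model(value) for value in log.read_from(0)]
            for name, model in models.items()}


log = CommitLog()
for reading in [20.0, 21.5, 95.0, 22.0]:
    log.append(reading)

# Two hypothetical models with different thresholds, scored on the same data.
results = replay(log, {
    "model_a": lambda value: value > 90.0,
    "model_b": lambda value: value > 21.0,
})
print(results)
```

Because neither model consumes the data destructively, a third model added later could be evaluated against the same history, which is the property A/B testing relies on.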
  • 27. Analytic Model (Autoencoder for Anomaly Detection)
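An autoencoder flags anomalies by how badly it reconstructs an input: a model trained on normal data reproduces normal inputs well and unusual inputs poorly. The sketch below shows only that scoring step, under loud assumptions: `clamp_reconstruct` is a stand-in for a trained autoencoder (it is not a neural network and is not the model from the deck), and the threshold of 0.05 is an arbitrary illustrative value.

```python
from typing import Callable, List, Sequence


def reconstruction_error(original: Sequence[float],
                         reconstructed: Sequence[float]) -> float:
    """Mean squared error between an input vector and its reconstruction."""
    if len(original) != len(reconstructed):
        raise ValueError("vectors must have the same length")
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def is_anomaly(values: Sequence[float],
               reconstruct: Callable[[Sequence[float]], List[float]],
               threshold: float) -> bool:
    """Flag the input when its reconstruction error exceeds the threshold."""
    return reconstruction_error(values, reconstruct(values)) > threshold


def clamp_reconstruct(values: Sequence[float]) -> List[float]:
    """Stand-in for a trained autoencoder (assumption): it can only
    reproduce values inside the range seen in training, here 0.0 to 1.0."""
    return [min(1.0, max(0.0, v)) for v in values]


normal_reading = [0.2, 0.5, 0.9]
faulty_reading = [0.2, 3.0, 0.9]  # one sensor value far outside the normal range

print(is_anomaly(normal_reading, clamp_reconstruct, threshold=0.05))
print(is_anomaly(faulty_reading, clamp_reconstruct, threshold=0.05))
```

In a real setup the `reconstruct` callable would wrap the trained model's prediction call, and the threshold would be chosen from the error distribution on known-good data.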
  • 28. Model Deployment #1: RPC Communication to do Model Inference. Diagram components: Streams, Input Event, Prediction, Request, Response, Model Serving, TensorFlow Serving, gRPC
  • 29. Model Deployment #2: Model Inference natively in the App. Diagram components: Streams, Input Event, Prediction
  • 31. Architecture (High Level). Diagram components: Devices, Gateway, Connect w/ MQTT connector, Kafka Brokers (Streaming Platform); use cases: Device Tracking (Real Time), Predictive Maintenance (Near Real Time), Log Analytics (Batch); zones: Edge, Data Center / Cloud. How to integrate?
  • 32. Architecture (High Level) – Machine Learning Perspective. Diagram components: Devices, Gateway, Connect w/ MQTT connector, Kafka Brokers (Streaming Platform); use cases: Edge Analytics (Real Time Model Serving), Predictive Maintenance (Near Real Time Model Serving), Model Training (Batch); zones: Edge, Data Center / Cloud
  • 33. Kafka-Native Integration Options between MQTT and Apache Kafka: Kafka Connect, MQTT Proxy, REST Proxy
  • 35. Integration with Kafka Connect (Source and Sink). Diagram components: Devices, MQTT, MQTT Broker, Connect w/ MQTT connector, Kafka Brokers, Kafka Consumer. MQTT Broker: • Persistent + offers MQTT-specific features • Consumes push data from IoT devices. Kafka Connect: • Kafka Consumer + Kafka Producer under the hood • Pull-based (at own pace, without overwhelming the source or getting overwhelmed by the source) • Out-of-the-box scalability and integration features (like connectors, converters, SMTs)
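For the Kafka Connect option on slide 35, a connector is typically registered by sending a JSON document to the Connect REST API (`POST /connectors`). The sketch below only builds that document. The property names follow the Confluent MQTT source connector as commonly documented and should be checked against the connector version in use; the connector name, broker URI and topic names are hypothetical, and licensing and converter settings are omitted.

```python
import json
from typing import Dict


def build_mqtt_source_connector(name: str, mqtt_uri: str,
                                mqtt_topics: str, kafka_topic: str) -> Dict:
    """Build the JSON body for registering an MQTT source connector.

    Property names are assumed from the Confluent MQTT source connector
    documentation; verify them for the version you deploy.
    """
    return {
        "name": name,
        "config": {
            "connector.class": "io.confluent.connect.mqtt.MqttSourceConnector",
            "tasks.max": "1",
            "mqtt.server.uri": mqtt_uri,   # MQTT broker to pull from
            "mqtt.topics": mqtt_topics,    # MQTT topic(s) to subscribe to
            "kafka.topic": kafka_topic,    # Kafka topic to write into
        },
    }


# Hypothetical host and topic names, for illustration only.
payload = json.dumps(
    build_mqtt_source_connector(
        name="mqtt-car-sensors",
        mqtt_uri="tcp://mqtt-broker:1883",
        mqtt_topics="car-sensors",
        kafka_topic="car_engine_raw",
    ),
    indent=2,
)
print(payload)
```

Keeping the configuration declarative is the point of this option: the pull-based consuming and producing happens inside Connect rather than in custom client code.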
  • 37. MQTT Proxy. Diagram components: Devices, MQTT, MQTT Proxy, Kafka Brokers, Kafka Consumer. MQTT Proxy: • MQTT is push-based • Horizontally scalable • Consumes push data from IoT devices and forwards it to Kafka Broker at low latency • Kafka Producer under the hood • No MQTT Broker needed. Kafka Broker: • Source of truth • Responsible for persistence, high availability, reliability
  • 39. Confluent REST Proxy. Diagram labels: REST Proxy, IoT Applications, Native Kafka Applications (Java, C, Go, …), REST / HTTP(S), TCP. The “simple alternative” for IoT • Simple and understood • HTTP(S) Proxy → Push-based • Security “easier” • Scalable with standard load balancer (still synchronous HTTP) • Not for very high throughput • Implement Kafka Connect features in your client app
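As a hedged illustration of the REST Proxy option on slide 39, the sketch below builds (but does not send) an HTTP request that produces one JSON record through the REST Proxy v2 API, which accepts `POST /topics/{topic}` with a `records` array. The base URL `http://localhost:8082`, the topic `car-sensors` and the record fields are assumptions made for the example.

```python
import json
import urllib.request
from typing import Any, List


def build_produce_request(base_url: str, topic: str,
                          values: List[Any]) -> urllib.request.Request:
    """Build a REST Proxy v2 produce request for JSON-encoded record values."""
    body = json.dumps({"records": [{"value": v} for v in values]}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/topics/{topic}",
        data=body,
        method="POST",
        headers={"Content-Type": "application/vnd.kafka.json.v2+json"},
    )


# Hypothetical endpoint, topic and payload.
request = build_produce_request(
    "http://localhost:8082",
    "car-sensors",
    [{"sensor_id": "car-1", "temperature": 88.5}],
)
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request)
# That call needs a running REST Proxy, so it is not executed here.
```

Each call is a synchronous HTTP round trip, which is why the slide lists this option as simple but not suited to very high throughput.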
  • 41. Processing Options for MQTT Data with Apache Kafka: Streams; Kafka native vs. additional big data cluster and technology (or others, you name it …)
  • 42. IoT Data Processing. Diagram components: MQTT Device, Kafka Connect, Kafka Cluster, Kafka Streams / KSQL, Kafka Client, Real Time System, Batch System, Analytics; flow labels: All Data, Process Data Continuously, Alerting, Forward Processed Data; zones: At the edge, On premise DC / Cloud; legend: Kafka Ecosystem, Other Components
  • 43. KSQL – Continuous Queries for Streaming ETL / Anomaly Detection: CREATE STREAM vip_actions AS SELECT userid, page, action FROM clickstream c LEFT JOIN users u ON c.userid = u.user_id WHERE u.level = 'Platinum'; CREATE TABLE possible_fraud AS SELECT card_number, count(*) FROM authorization_attempts WINDOW TUMBLING (SIZE 5 MINUTES) GROUP BY card_number HAVING count(*) > 3;
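The `possible_fraud` statement on slide 43 counts authorization attempts per card in non-overlapping 5-minute windows and keeps the cards with more than three attempts. The sketch below reproduces that logic over a finite list in plain Python to show what a tumbling window computes; it is a batch approximation, not a continuous query, and the function name, timestamps and card numbers are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

WINDOW_SIZE_S = 5 * 60  # 5-minute tumbling windows, as in the KSQL statement


def possible_fraud(attempts: Iterable[Tuple[int, str]],
                   min_count: int = 3) -> Dict[Tuple[int, str], int]:
    """Count attempts per (window start, card number); keep counts > min_count.

    Each attempt is a pair (timestamp in seconds, card_number).
    """
    counts: Counter = Counter()
    for timestamp, card_number in attempts:
        # A tumbling window is identified by its start time: windows never
        # overlap, so every event belongs to exactly one window.
        window_start = timestamp - (timestamp % WINDOW_SIZE_S)
        counts[(window_start, card_number)] += 1
    return {key: n for key, n in counts.items() if n > min_count}


attempts = [
    (10, "card-1"), (60, "card-1"), (120, "card-1"), (290, "card-1"),
    (310, "card-1"),                 # falls into the next window
    (20, "card-2"), (200, "card-2"),
]
flagged = possible_fraud(attempts)
print(flagged)
```

The fifth attempt for `card-1` is not added to the first window's count, which shows how window boundaries affect the `HAVING count(*) > 3` filter.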
  • 45. KSQL and Deep Learning (Auto Encoder) for Anomaly Detection. Diagram components: Car Sensors, MQTT Proxy, Kafka Cluster, KSQL, Kafka Connect, Elasticsearch, Grafana, Real Time Emergency System; flow labels: All Data, Potential Defect, Apply Analytic Model, Filter Anomalies; zones: At the edge, On premise DC; legend: Kafka Ecosystem, Other Components
  • 46. Model Training with Python, KSQL, TensorFlow, Keras and Jupyter: https://github.com/kaiwaehner/python-jupyter-apache-kafka-ksql-tensorflow-keras
  • 47. Model Deployment with Apache Kafka, KSQL and TensorFlow: “CREATE STREAM AnomalyDetection AS SELECT sensor_id, detectAnomaly(sensor_values) FROM car_engine;” User Defined Function (UDF)
  • 48. Live Demo: End-to-End Sensor Analytics with Python, Jupyter Notebook, TensorFlow, Keras, Apache Kafka, KSQL and MQTT
  • 49. Model Training with Python, KSQL, TensorFlow, Keras and Jupyter: https://github.com/kaiwaehner/python-jupyter-apache-kafka-ksql-tensorflow-keras
  • 50. Deep Learning UDF for KSQL for Streaming Anomaly Detection of MQTT IoT Sensor Data: https://github.com/kaiwaehner/ksql-udf-deep-learning-mqtt-iot
  • 51. Kai Waehner, Technology Evangelist, kontakt@kai-waehner.de, @KaiWaehner, www.kai-waehner.de, www.confluent.io, LinkedIn. Questions? Feedback? Please contact me! Come to our booth to find out more about Kafka and Confluent.