BUILDING
EVENT-DRIVEN SYSTEMS
WITH APACHE KAFKA
BRIAN RITCHIE
CTO, XEOHEALTH
2016
@brian_ritchie
brian.ritchie@gmail.com
http://www.dotnetpowered.com
BUILDING EVENT-DRIVEN SYSTEMS WITH APACHE KAFKA
EVENT-DRIVEN SYSTEMS
Definition
Event-driven architecture, also known as message-driven architecture, is
a software architecture pattern promoting the production, detection,
consumption of, and reaction to events. An event can be defined as "a
significant change in state".
https://en.wikipedia.org/wiki/Event-driven_architecture
EVENT-DRIVEN SYSTEMS ARE ABOUT UNLOCKING DATA
• Data is the driving force behind innovation
• Event-driven systems allow you to unlock the data –
and unlock the innovation.
EVENTS ARE THE “WHAT HAPPENED” DATA
• It’s about recording “what happened”, but not coupling it to the “how”
• It’s the “transactions” of your system
• Product Views
• Completed Sales
• Page Visits
• Site Logins
• Shipping Notifications
• Inventory Received
• IoT
• …and much more
EVENTS – A HEALTHCARE EXAMPLE
What happened? A patient received a set of services.
[Diagram: a Healthcare Claim event is published to the Event Stream, which feeds independent consumers]
• Fraud Detection
• Data Lake
• Archive
• Disease Trending
• Contract & Pricing
• More…
You don't need to integrate with consumers or even know about future uses of your data.
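The decoupling described on this slide can be sketched in a few lines of Python. This is only an in-memory illustration, not Kafka: the `ClaimEvent` fields, the `EventStream` class, and the subscriber names are all hypothetical. It shows a producer recording "what happened" without knowing who consumes it, and a consumer added later still seeing the earlier events.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ClaimEvent:
    """Hypothetical 'what happened' record: a patient received services."""
    claim_id: str
    patient_id: str
    services: tuple


class EventStream:
    """Toy in-memory stream: keeps every event and fans it out to subscribers."""

    def __init__(self) -> None:
        self.log: List[ClaimEvent] = []
        self.subscribers: Dict[str, Callable[[ClaimEvent], None]] = {}

    def subscribe(self, name: str, handler: Callable[[ClaimEvent], None]) -> None:
        # A late subscriber first replays the events recorded before it existed.
        for event in self.log:
            handler(event)
        self.subscribers[name] = handler

    def publish(self, event: ClaimEvent) -> None:
        # The producer only appends; it has no knowledge of the consumers.
        self.log.append(event)
        for handler in self.subscribers.values():
            handler(event)


stream = EventStream()
fraud_detection: List[ClaimEvent] = []
archive: List[ClaimEvent] = []

stream.subscribe("fraud-detection", fraud_detection.append)
first = ClaimEvent("c-1", "p-9", ("x-ray",))
stream.publish(first)

# A future use of the data: added after the first event was published.
stream.subscribe("archive", archive.append)
second = ClaimEvent("c-2", "p-9", ("cast",))
stream.publish(second)
```

Both lists end up holding both events, even though the archive consumer did not exist when the first claim was published.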
EVENT-DRIVEN SYSTEMS MAKE SCALABILITY EASIER
• Scalability of processing
• Scalability of design
• Scalability of change
EVENT-DRIVEN SYSTEMS REQUIRE INFRASTRUCTURE
• Queue / Stream
• Persistence
• Distribution
• Pub / Sub
APACHE KAFKA IS THE INFRASTRUCTURE
• Apache Kafka is publish-subscribe messaging rethought as a distributed
commit log.
• Developed by LinkedIn
• Written in Scala and Java
• Open sourced in 2011; graduated from the Apache Incubator in 2012
• Unique features of Kafka
• Super fast
• Distributed & Replicated out of the box
• Extremely low cost
WHO USES APACHE KAFKA?
A few small companies you might have heard of…
MICROSOFT SUPPORTS KAFKA
Microsoft ♥ Linux
Microsoft ♥ Open Source
Nearly 1 in 3 VMs are Linux
Microsoft moves to GitHub
Microsoft sponsors the Kafka Summit, releases a Kafka .NET driver on GitHub, and
even buys LinkedIn. That is some Kafka love.
APACHE KAFKA – PERFORMANCE
Kafka performs amazingly well on modest hardware.
https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
Test on the LinkedIn Engineering Blog, with producers and consumers accessing the cluster simultaneously:
- 3 machines in the Kafka cluster, 3 to generate load
- 6 SATA drives each, 32 GB RAM each
- 1 Gb Ethernet
APACHE KAFKA – PERFORMANCE
Microsoft has one of the largest Kafka installations called “Siphon”
http://www.confluent.io/kafka-summit-2016-users-siphon-near-rea-time-databus-using-kafka
• 1.3 million events per second at peak
• ~1 trillion events per day at peak
• 3.5 petabytes processed per day
• 1,300 production brokers
APACHE KAFKA – PERFORMANCE
Microsoft also publishes an availability and latency monitor for Kafka that uses canary messages:
https://github.com/Microsoft/Availability-Monitor-for-Kafka
APACHE KAFKA – ARCHITECTURE
[Diagram: producers write to a Kafka cluster of three brokers, which uses ZooKeeper; consumers read from the cluster]
• Producers publish messages to a Kafka topic
• Consumers subscribe to topics and process messages
• A Kafka cluster is made up of one or more brokers (nodes)
• Kafka uses ZooKeeper for configuration
APACHE KAFKA – ROLE OF ZOOKEEPER
What is ZooKeeper?
ZooKeeper is a centralized service for maintaining
configuration information, naming, providing distributed
synchronization, and providing group services to
distributed applications.
Role of ZooKeeper in Kafka
In the Kafka 0.8.x release used here, it is responsible for maintaining consumer offsets and
topic lists, leader election, and general state information.
Apache ZooKeeper
zk-web: Web UI for ZooKeeper
https://github.com/qiuxiafei/zk-web
Or get the Docker container
APACHE KAFKA – TOPICS
[Diagram: a Kafka topic split into Partition 0, Partition 1, and Partition 2, each a numbered sequence of messages (offsets 0, 1, 2, …). Producers write to the partitions; consumers read from them.]
Producers write messages to the end of a partition
• Messages can be load balanced round robin across partitions or assigned by a function.
Consumers read from the lowest offset to the highest
• Unlike most queuing systems, state is not maintained on the server. Each consumer tracks its own offset.
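To make the offset idea concrete, here is a minimal in-memory sketch in Python. It does not use a Kafka client; the `Partition` and `Consumer` classes are hypothetical stand-ins. It shows a partition as an append-only sequence in which each message's offset is its position, and two consumers that each track their own offset independently.

```python
from typing import List


class Partition:
    """Append-only list of messages; a message's offset is its list index."""

    def __init__(self) -> None:
        self.messages: List[str] = []

    def append(self, message: str) -> int:
        self.messages.append(message)
        return len(self.messages) - 1  # offset assigned to the new message

    def read(self, offset: int, max_messages: int) -> List[str]:
        return self.messages[offset:offset + max_messages]


class Consumer:
    """Tracks its own position; the partition stores no per-consumer state."""

    def __init__(self, partition: Partition) -> None:
        self.partition = partition
        self.offset = 0

    def poll(self, max_messages: int = 2) -> List[str]:
        batch = self.partition.read(self.offset, max_messages)
        self.offset += len(batch)  # advance only past what was actually read
        return batch


partition = Partition()
for text in ["a", "b", "c"]:
    partition.append(text)

fast, slow = Consumer(partition), Consumer(partition)
first_batch = fast.poll()                 # reads offsets 0 and 1
second_batch = fast.poll()                # reads offset 2
slow_batch = slow.poll(max_messages=1)    # the other consumer is still at the start
```

Because reading never mutates the partition, a slow consumer does not hold back a fast one, and either can re-read by resetting its own offset.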
APACHE KAFKA – MORE ON PARTITIONS
Partitions for scalability
• More partitions let more consumers read in parallel, which raises throughput when consuming data.
• Each partition must fit entirely on a single server.
Partitions for ordering
• Kafka only guarantees message order within the same partition.
• If you need strong ordering, make sure that related data is pinned to a single partition by
assigning it with a key (for example, all events for one patient share one key).
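The difference between key-based and round-robin assignment can be sketched as follows. This is an illustration only: the partition count, the patient keys, and the use of CRC32 as the hash are assumptions made for the example, and Kafka's own partitioner uses a different hash function.

```python
import itertools
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # assumed partition count for the illustration


def partition_for_key(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition with a stable hash (CRC32 as a stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


round_robin = itertools.cycle(range(NUM_PARTITIONS))

# Hypothetical events: (patient key, sequence number within that patient).
events = [("patient-1", 1), ("patient-2", 1), ("patient-1", 2),
          ("patient-2", 2), ("patient-1", 3)]

keyed = defaultdict(list)     # partition -> events, assigned by key
unkeyed = defaultdict(list)   # partition -> events, assigned round robin
for key, seq in events:
    keyed[partition_for_key(key)].append((key, seq))
    unkeyed[next(round_robin)].append((key, seq))

# With a key, every event of one patient sits in one partition, in order.
patient_1_partition = partition_for_key("patient-1")
patient_1_in_order = [seq for key, seq in keyed[patient_1_partition]
                      if key == "patient-1"]
```

With round robin, the three events of `patient-1` land in three different partitions, so no single consumer is guaranteed to see them in sequence.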
APACHE KAFKA – PERSISTENCE
[Diagram: a Kafka topic with Partition 0, Partition 1, and Partition 2, each holding messages at offsets 0, 1, 2, …]
All messages are written to disk and replicated.
Messages are not removed from Kafka when they are read from a topic.
A cleanup process removes old messages once they fall outside a configurable retention window.
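A rough sketch of time-based retention, in Python. The `Record` type, the timestamps, and the retention values are made up for the example, and it filters record by record for simplicity, whereas Kafka removes whole log segments. The point it illustrates is that reading leaves the log untouched and only age removes messages.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    offset: int
    timestamp: float  # seconds on an arbitrary clock
    value: str


def apply_retention(log: List[Record], now: float,
                    retention_seconds: float) -> List[Record]:
    """Keep only records inside the retention window; offsets are preserved."""
    cutoff = now - retention_seconds
    return [record for record in log if record.timestamp >= cutoff]


log = [Record(0, 100.0, "a"), Record(1, 200.0, "b"), Record(2, 300.0, "c")]

# Reading does not remove anything from the log.
values_read = [record.value for record in log]
size_after_read = len(log)

# Cleanup at time 350 with a 120-second window keeps records from 230 onward.
kept_short = apply_retention(log, now=350.0, retention_seconds=120.0)
# A longer 200-second window keeps records from 150 onward.
kept_long = apply_retention(log, now=350.0, retention_seconds=200.0)
```

Surviving records keep their original offsets, so a consumer's stored offset stays meaningful after cleanup.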
APACHE KAFKA – CONSUMER GROUPS
[Diagram: a Kafka topic with Partitions 0 to 3, read by the consumers of two consumer groups, Consumer Group 1 and Consumer Group 2]
• Each consumer group is a "logical subscriber".
• Messages are processed in parallel by the consumers in a group.
• Within a consumer group, each partition is assigned to only one consumer.
Note: consumers are responsible for handling duplicate messages. These could be caused by failures of another consumer in the group.
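Both points on this slide can be sketched in Python without a Kafka client. The function and class names below are hypothetical, the assignment strategy is a plain round robin chosen for illustration, and the sketch assumes at least one consumer in the group. The duplicate check keeps its state in memory; a real consumer would need that state to survive restarts.

```python
from typing import Dict, List, Set, Tuple


def assign_partitions(partitions: List[int],
                      consumers: List[str]) -> Dict[str, List[int]]:
    """Give every partition to exactly one consumer of the group (round robin)."""
    assignment: Dict[str, List[int]] = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(sorted(partitions)):
        assignment[consumers[index % len(consumers)]].append(partition)
    return assignment


class DeduplicatingHandler:
    """Skips messages whose (partition, offset) was already processed."""

    def __init__(self) -> None:
        self.seen: Set[Tuple[int, int]] = set()
        self.processed: List[str] = []

    def handle(self, partition: int, offset: int, value: str) -> bool:
        if (partition, offset) in self.seen:
            return False  # duplicate delivery, e.g. after another consumer failed
        self.seen.add((partition, offset))
        self.processed.append(value)
        return True


# Two consumers share four partitions; if one fails, the other takes them all.
group_of_two = assign_partitions([0, 1, 2, 3], ["consumer-1", "consumer-2"])
group_after_failure = assign_partitions([0, 1, 2, 3], ["consumer-1"])

handler = DeduplicatingHandler()
results = [
    handler.handle(0, 5, "claim-A"),
    handler.handle(0, 5, "claim-A"),  # redelivered after a reassignment
    handler.handle(0, 6, "claim-B"),
]
```

After a failure, the surviving consumer may receive messages the failed one had already handled, which is why the handler is written to tolerate repeats.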
APACHE KAFKA – SERIALIZATION
Pick a format!
• JSON
• BSON
http://bsonspec.org/implementations.html
• PROTOCOL BUFFERS
https://github.com/google/protobuf
• BOND
https://github.com/Microsoft/bond
• AVRO
https://avro.apache.org/index.html
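As a small illustration of the simplest option in the list, the sketch below serializes an event to JSON bytes and back using only the Python standard library. The event fields are invented for the example. The binary formats listed above trade this readability for smaller payloads and explicit schemas.

```python
import json
from typing import Any, Dict


def serialize(event: Dict[str, Any]) -> bytes:
    """Encode an event as compact UTF-8 JSON bytes with a stable key order."""
    return json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")


def deserialize(payload: bytes) -> Dict[str, Any]:
    """Decode UTF-8 JSON bytes back into a dictionary."""
    return json.loads(payload.decode("utf-8"))


# Hypothetical event; the field names are assumptions for the example.
event = {"type": "CompletedSale", "orderId": 42, "total": 19.99}
payload = serialize(event)
round_tripped = deserialize(payload)
```

Whatever format is chosen, producers and consumers have to agree on it, since the message itself is carried as bytes.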
APACHE KAFKA – GETTING STARTED
Install Kafka & ZooKeeper
https://dzone.com/articles/running-apache-kafka-on-windows-os
• Install JDK
• Install ZooKeeper
• Install Kafka
Start Kafka & ZooKeeper
Start ZooKeeper
C:\bin\zookeeper-3.4.8\bin>zkServer.cmd
Start Kafka
C:\bin\kafka_2.11-0.8.2.2>.\bin\windows\kafka-server-start.bat .\config\server.properties
APACHE KAFKA – GETTING STARTED
Create a topic
kafka-topics.bat --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic SampleTopic1
Other Useful Topic Commands
List Topics
• kafka-topics.bat --list --zookeeper localhost:2181
Describe Topics
• kafka-topics.bat --describe --zookeeper localhost:2181 --topic [Topic Name]
KAFKA MANAGER
https://github.com/yahoo/kafka-manager
A tool for managing Apache Kafka
created by Yahoo.
Or get the Docker container
DEMO
Producing and consuming messages in C#
Sample code:
https://github.com/dotnetpowered/StreamProcessingSample
APACHE SPARK
• Apache Spark is a fast and general engine for large-scale data
processing. It runs programs up to 100x faster than Hadoop MapReduce in
memory, or 10x faster on disk.
• Spark Streaming makes it easy to build scalable fault-tolerant streaming
applications.
https://spark.apache.org/streaming/
• Supports streaming directly from Apache Kafka.
http://spark.apache.org/docs/latest/streaming-kafka-integration.html
APACHE SPARK - FIRING UP THE CLUSTER
Spark is made up of master and slave processes…
• Start the master
spark-class org.apache.spark.deploy.master.Master
• Start one or more slaves
spark-class org.apache.spark.deploy.worker.Worker spark://spark-master:7077
• Access the Spark cluster via browser
http://spark-master:8080
APACHE SPARK WITH MOBIUS
Mobius is a .NET language binding for Spark. It is a Java wrapper for building
workers in C# and other CLR-based languages.
• Reference the Microsoft.SparkCLR NuGet package
• Build a console application utilizing the API
• Submit your program to Spark using the following script
sparkclr-submit.cmd
--master spark://spark-master:7077
--jars <path>\runtime\dependencies\spark-streaming-kafka-assembly_2.10-1.6.1.jar
--exe StreamingRulesEngineHost.exe
C:\src\StreamProcessing\StreamProcessingHost\bin\Debug
https://github.com/Microsoft/Mobius
DEMO
Consuming messages in C# using Spark
Sample code:
https://github.com/dotnetpowered/StreamProcessingSample
USING THE ELK STACK FOR INTEGRATION & VISUALIZATION
Use Logstash to ingest and/or consume events. It allows for "ETL" and
integration with tools such as Elasticsearch.
[Diagram: a Logstash shipper (for producers that are not Kafka-enabled) writes to Kafka; a Logstash indexer reads from Kafka and feeds search]
https://www.elastic.co/blog/just-enough-kafka-for-the-elastic-stack-part1
CONNECTING KAFKA TO ELASTICSEARCH
For consumers: Configure a Kafka input
input {
  kafka {
    zk_connect => "kafka:2181"
    group_id => "logstash"
    topic_id => "apache_logs"
    consumer_threads => 16
  }
}
Don't forget to select a codec for serialization!
Putting it all together:
C:\bin\logstash-2.3.2\bin>logstash -e "input { kafka { topic_id => 'SampleTopic2' } } output { elasticsearch { index => 'sample-%{+YYYY.MM.dd}' document_id => '%{docid}' } }"
LET’S REVIEW
• Event-driven systems are a key ingredient to
unlocking your organization’s potential. Make data
available to current and future apps, improve
scalability, and decrease complexity.
• Kafka is foundational infrastructure for event-driven
systems and is battle tested at scale.
• The ecosystem building around Kafka is rich -
allowing you to connect using various tools.
QUESTIONS?
THANK YOU!
BRIAN RITCHIE
CTO, XEOHEALTH
2016
@brian_ritchie
brian.ritchie@gmail.com
http://www.dotnetpowered.com
Sample code:
https://github.com/dotnetpowered/StreamProcessingSample

Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 

Dernier (20)

Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 

Building Event-Driven Systems with Apache Kafka

  • 1. BUILDING EVENT-DRIVEN SYSTEMS WITH APACHE KAFKA BRIAN RITCHIE CTO, XEOHEALTH 2016 @brian_ritchie brian.ritchie@gmail.com http://www.dotnetpowered.com
  • 2. EVENT-DRIVEN SYSTEMS Definition: Event-driven architecture, also known as message-driven architecture, is a software architecture pattern promoting the production, detection, consumption of, and reaction to events. An event can be defined as "a significant change in state". https://en.wikipedia.org/wiki/Event-driven_architecture
  • 3. EVENT-DRIVEN SYSTEMS ARE ABOUT UNLOCKING DATA • Data is the driving force behind innovation • Event-driven systems allow you to unlock the data, and unlock the innovation.
  • 4. EVENTS ARE THE “WHAT HAPPENED” DATA • It’s about recording “what happened”, but not coupling it to the “how” • It’s the “transactions” of your system • Product Views • Completed Sales • Page Visits • Site Logins • Shipping Notifications • Inventory Received • IoT • …and much more
  • 5. EVENTS – A HEALTHCARE EXAMPLE [Diagram: a Healthcare Claim event enters the Event Stream, which feeds Fraud Detection, Data Lake, Archive, Disease Trending, Contract & Pricing, and more…] What happened? A patient received a set of services. You don’t need to integrate with consumers or even know about future uses of your data.
  • 6. EVENT-DRIVEN SYSTEMS MAKE SCALABILITY EASIER • Scalability of processing • Scalability of design • Scalability of change
  • 7. EVENT-DRIVEN SYSTEMS REQUIRE INFRASTRUCTURE • Queue / Stream • Persistence • Distribution • Pub / Sub
  • 8. APACHE KAFKA IS THE INFRASTRUCTURE • Apache Kafka is publish-subscribe messaging rethought as a distributed commit log. • Developed by LinkedIn • Written in Scala and Java • Open sourced in 2011 and graduated from the Apache Incubator in 2012 • Unique features of Kafka: • Super fast • Distributed & replicated out of the box • Extremely low cost
  • 9. WHO USES APACHE KAFKA? A few small companies you might have heard of…
  • 10. MICROSOFT SUPPORTS KAFKA • Microsoft ♥ Linux • Microsoft ♥ Open Source • Nearly 1 in 3 VMs are Linux • Microsoft moves to GitHub • Microsoft sponsors the Kafka Summit, releases a Kafka .NET driver on GitHub, and even buys LinkedIn. That is some Kafka love.
  • 11. APACHE KAFKA – PERFORMANCE Kafka performs amazingly well on modest hardware. Test on the LinkedIn Engineering Blog, with producers and consumers simultaneously accessing the cluster: - 3 machines in the Kafka cluster, 3 to generate load - 6 SATA drives each, 32 GB RAM each - 1 Gb Ethernet https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
  • 12. APACHE KAFKA – PERFORMANCE Microsoft has one of the largest Kafka installations, called “Siphon”. http://www.confluent.io/kafka-summit-2016-users-siphon-near-rea-time-databus-using-kafka • 1.3 million events per second at peak • ~1 trillion events per day at peak • 3.5 petabytes processed per day • 1,300 production brokers
  • 13. APACHE KAFKA – PERFORMANCE Microsoft has one of the largest Kafka installations, called “Siphon”. http://www.confluent.io/kafka-summit-2016-users-siphon-near-rea-time-databus-using-kafka Availability & latency monitor for Kafka using canary messages: https://github.com/Microsoft/Availability-Monitor-for-Kafka
  • 14. APACHE KAFKA – ARCHITECTURE [Diagram: producers, a Kafka cluster of brokers, consumers, and ZooKeeper] • Producers publish messages to a Kafka topic • Consumers subscribe to topics and process messages • A Kafka cluster is made up of one or more brokers (nodes) • Kafka uses ZooKeeper for configuration
  • 15. APACHE KAFKA – ROLE OF ZOOKEEPER What is ZooKeeper? ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services to distributed applications. Role of ZooKeeper in Kafka: it is responsible for maintaining consumer offsets and topic lists, leader election, and general state information. zk-web: Web UI for ZooKeeper https://github.com/qiuxiafei/zk-web Or get the Docker container
  • 16. APACHE KAFKA – TOPICS [Diagram: a Kafka topic with Partition 0, Partition 1, and Partition 2, each a numbered sequence of offsets starting at 0; producers write to the partitions and consumers read from them] Producers write messages to the end of a partition • Messages can be round-robin load balanced across partitions or assigned by a function. Consumers read from the lowest offset to the highest • Unlike most queuing systems, state is not maintained on the server. Each consumer tracks its own offset.
  • 17. APACHE KAFKA – MORE ON PARTITIONS Partitions for scalability • The more partitions you have, the more throughput you get when consuming data. • Each partition must fit entirely on a single server. Partitions for ordering • Kafka only guarantees message order within the same partition. • If you need strong ordering, make sure that data is pinned to a single partition based on some sort of key.
  • 18. APACHE KAFKA – PERSISTENCE [Diagram: a Kafka topic with Partition 0, Partition 1, and Partition 2, each a numbered sequence of offsets starting at 0] • All messages are written to disk and replicated. • Messages are not removed from Kafka when they are read from a topic. • A cleanup process will remove old messages based on a sliding timeframe.
  • 19. APACHE KAFKA – CONSUMER GROUPS [Diagram: a Kafka topic with Partitions 0 to 3, read by consumers 1 to 4 organized into Consumer Group 1 and Consumer Group 2] • Each consumer group is a “logical subscriber” • Messages are processed in parallel by consumers • Only one consumer is assigned to a partition in a consumer group. Note: consumers are responsible for handling duplicate messages. These could be caused by failures of another consumer in the group.
  • 20. APACHE KAFKA – SERIALIZATION Pick a format! • JSON • BSON http://bsonspec.org/implementations.html • PROTOCOL BUFFERS https://github.com/google/protobuf • BOND https://github.com/Microsoft/bond • AVRO https://avro.apache.org/index.html
  • 21. APACHE KAFKA – GETTING STARTED Install Kafka & ZooKeeper (https://dzone.com/articles/running-apache-kafka-on-windows-os): • Install JDK • Install ZooKeeper • Install Kafka. Start Kafka & ZooKeeper: • Start ZooKeeper: C:\bin\zookeeper-3.4.8\bin>zkServer.cmd • Start Kafka: C:\bin\kafka_2.11-0.8.2.2>.\bin\windows\kafka-server-start.bat .\config\server.properties
  • 22. APACHE KAFKA – GETTING STARTED Create a topic: kafka-topics.bat --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic SampleTopic1 Other useful topic commands: • List topics: kafka-topics.bat --list --zookeeper localhost:2181 • Describe topics: kafka-topics.bat --describe --zookeeper localhost:2181 --topic [Topic Name]
  • 23. KAFKA MANAGER A tool for managing Apache Kafka created by Yahoo. https://github.com/yahoo/kafka-manager Or get the Docker container
  • 24. DEMO Producing and consuming messages in C# Sample code: https://github.com/dotnetpowered/StreamProcessingSample
  • 25. APACHE SPARK • Apache Spark is a fast and general engine for large-scale data processing. It runs programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk. • Spark Streaming makes it easy to build scalable fault-tolerant streaming applications. https://spark.apache.org/streaming/ • Supports streaming directly from Apache Kafka. http://spark.apache.org/docs/latest/streaming-kafka-integration.html
  • 26. APACHE SPARK - FIRING UP THE CLUSTER Spark is made up of master and slave processes… • Start the master: spark-class org.apache.spark.deploy.master.Master • Start one or more slaves: spark-class org.apache.spark.deploy.worker.Worker spark://spark-master:7077 • Access the Spark cluster via browser: http://spark-master:8080
  • 27. APACHE SPARK WITH MOBIUS Mobius is a .NET language binding for Spark. It is a Java wrapper for building workers in C# and other CLR-based languages. https://github.com/Microsoft/Mobius • Reference the Microsoft.SparkCLR NuGet package • Build a console application utilizing the API • Submit your program to Spark using the following script: sparkclr-submit.cmd --master spark://spark-master:7077 --jars <path>\runtime\dependencies\spark-streaming-kafka-assembly_2.10-1.6.1.jar --exe StreamingRulesEngineHost.exe C:\src\StreamProcessing\StreamProcessingHost\bin\Debug
  • 28. DEMO Consuming messages in C# using Spark Sample code: https://github.com/dotnetpowered/StreamProcessingSample
  • 29. USING THE ELK STACK FOR INTEGRATION & VISUALIZATION Use Logstash to ingest events and/or consume events. It allows for “ETL” and integration with tools such as Elasticsearch. [Diagram: Logstash as a Shipper (for non-Kafka-enabled producers) and as an Indexer feeding search] https://www.elastic.co/blog/just-enough-kafka-for-the-elastic-stack-part1
  • 30. CONNECTING KAFKA TO ELASTICSEARCH For consumers, configure a Kafka input: input { kafka { zk_connect => "kafka:2181" group_id => "logstash" topic_id => "apache_logs" consumer_threads => 16 } } Don’t forget to select a codec for serialization! Putting it all together: C:\bin\logstash-2.3.2\bin>logstash -e "input { kafka { topic_id => 'SampleTopic2' } } output { elasticsearch { index => 'sample-%{+YYYY.MM.dd}' document_id => '%{docid}' } }"
  • 31. LET’S REVIEW • Event-driven systems are a key ingredient to unlocking your organization’s potential. Make data available to current and future apps, improve scalability, and decrease complexity. • Kafka is foundational infrastructure for event-driven systems and is battle tested at scale. • The ecosystem building around Kafka is rich, allowing you to connect using various tools.
  • 32. QUESTIONS?
  • 33. THANK YOU! BRIAN RITCHIE CTO, XEOHEALTH 2016 @brian_ritchie brian.ritchie@gmail.com http://www.dotnetpowered.com Sample code: https://github.com/dotnetpowered/StreamProcessingSample
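
The partition and offset rules on slides 16 to 18 can be illustrated with a small in-memory sketch. This is not Kafka client code: the `Topic` and `Consumer` classes, the `partition_for` helper, and the patient keys are all hypothetical names made up for the illustration, and CRC32 is only a stand-in for whatever hash a real producer uses. It assumes that pinning a key to one partition is enough to preserve order for that key, and that each consumer keeps its own offset.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # assumed partition count for this illustration


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Map a key to a partition. CRC32 is a stand-in hash, not Kafka's own."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


class Topic:
    """Toy in-memory topic: one append-only list per partition."""

    def __init__(self, num_partitions=NUM_PARTITIONS):
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key, value):
        """Write to the end of the key's partition; return (partition, offset)."""
        p = partition_for(key, len(self.partitions))
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1

    def read(self, partition, offset):
        """Return messages from `offset` onward. Reading removes nothing."""
        return self.partitions[partition][offset:]


class Consumer:
    """Tracks its own next offset per partition; the topic holds no reader state."""

    def __init__(self, topic):
        self.topic = topic
        self.offsets = defaultdict(int)

    def poll(self, partition):
        batch = self.topic.read(partition, self.offsets[partition])
        self.offsets[partition] += len(batch)
        return batch


# Hypothetical healthcare events keyed by patient, so each patient's
# events land in one partition and keep their relative order.
topic = Topic()
for key, event in [("patient-42", "admitted"),
                   ("patient-7", "admitted"),
                   ("patient-42", "discharged")]:
    topic.append(key, event)

p = partition_for("patient-42")
consumer = Consumer(topic)
events = [value for key, value in consumer.poll(p) if key == "patient-42"]
```

A second `Consumer(topic)` polling the same partition would see the same messages again from offset 0, which mirrors the point that reads do not remove messages. Time-based cleanup is left out of the sketch.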

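The consumer-group rules on slide 19 can be sketched in the same spirit. The round-robin scheme below is an assumption chosen for simplicity and may differ from the assignment strategy a real Kafka client uses; `assign_partitions`, `IdempotentHandler`, and the consumer and message names are hypothetical. The sketch shows two things: within one group each partition goes to exactly one consumer, and a consumer can guard against redelivered messages by remembering the IDs it has already processed.

```python
def assign_partitions(partitions, consumers):
    """Round-robin assignment: each partition gets exactly one consumer in the group."""
    assignment = {consumer: [] for consumer in consumers}
    for i, partition in enumerate(sorted(partitions)):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment


class IdempotentHandler:
    """Skips any message whose ID was already processed (duplicate delivery)."""

    def __init__(self):
        self.seen = set()
        self.processed = []

    def handle(self, message_id, payload):
        """Process the payload once per message ID; return False for duplicates."""
        if message_id in self.seen:
            return False
        self.seen.add(message_id)
        self.processed.append(payload)
        return True


# Two consumers in one group share four partitions.
before = assign_partitions([0, 1, 2, 3], ["consumer-1", "consumer-2"])

# If consumer-2 fails, the survivor takes over its partitions and may
# see messages that consumer-2 had already handled.
after = assign_partitions([0, 1, 2, 3], ["consumer-1"])

handler = IdempotentHandler()
first = handler.handle("claim-1001", {"patient": "patient-42"})
again = handler.handle("claim-1001", {"patient": "patient-42"})
```

Keeping the seen-ID set in memory is only for illustration; a real consumer would need durable storage for it, or processing that is naturally idempotent.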
Editor's notes

  1. http://blog.underdog.io/post/107602021862/inside-datadogs-tech-stack
  2. https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines