SlideShare une entreprise Scribd logo
1  sur  53
Télécharger pour lire hors ligne
Introducing Kafka-on-Pulsar:
Bring native Kafka protocol support
to Apache Pulsar
Sijie Guo / Pierre Zemb
2020-03-31
Who are we?
● Sijie Guo (@sijieg)
● Co-Founder & CEO, StreamNative
● PMC Member of Pulsar/BookKeeper
● Ex Co-Founder, Streamlio
● Ex-Twitter, Ex-Yahoo
● Work on messaging and streaming data
technologies for many years
Who are we?
● Pierre Zemb (@PierreZ)
● Tech lead
● Working around distributed systems
● Newcomer as an Apache contributor
● Involved into local communities
Poll Time!
Agenda
● What is Apache Pulsar?
● Why KoP?
● Introduction of protocol handler
● Kafka VS Pulsar, the protocol version
● How we implement KoP
● Demo
● Roadmap
● Q&A
What is Apache Pulsar?
Flexible pub/sub messaging
backed by durable log storage
Flexible pub/sub messaging
backed by durable log storage
Cloud-Native Event Streaming
Apache Pulsar
● Publish-subscribe: unified messaging model (streaming + queueing)
● Infinite event stream storage: Apache BookKeeper + Tiered Storage
● Connectors: ingest events without writing code
● Process events in real-time
○ Pulsar Functions for serverless / lightweight computation
○ Spark / Flink for unified data processing
○ Presto for interactive queries
Pulsar Highlights
● Multi-tenancy
● Unified messaging (queuing + streaming)
● Layered Architecture
● Tiered Storage
● Built-in schema support
● Built-in geo-replication
The Need of KoP
● Adoptions
● Inbound requests
● Migration
The Existing Efforts
● Kafka Java Wrapper
● Pulsar IO Connector
Implement Kafka protocol on Pulsar?
● Proxy / Gateway
● Implement Kafka protocol on Pulsar broker
KoP as a proxy, OVHcloud version
We first implemented KoP has a proxy PoC in Rust:
● Rust async was out in nightly compiler when we started
● We wanted no GC on proxy layers
● Rust has awesome libraries at TCP-level
Our goal was to convert TCP frames from Kafka to Pulsar
KoP as a proxy, OVHcloud version
Proxy layer
"Hyperion"
clients
KoP as a proxy, OVHcloud version
Everything is a state-machine:
● TCP cnx from Kafka clients
● TCP cnx to Pulsar brokers
Those event-driven finite-state machines were triggered by TCP frames
from their respective protocol.
A third one was above the two to provide synchronization
KoP as a proxy, OVHcloud version
Proxy layer
"Hyperion"
clients
State machines
KoP as a proxy, OVHcloud version
Pros
● Working at TCP layer enables
performance
● nice PoC to discover both protocols
● Rust is blazing fast
● Proxify production is easy
● We could bump old version of Kafka
frames for old Kafka clients
Cons
● Rewrite everything
● Some things were hard to proxify:
○ Group coordinator
○ Offsets management
● Difficult to open-source (different
language)
The group-coordinator/offsets problem
In Kafka, the group coordinator is an
elected actor within the cluster
responsible for:
● assigning partitions to consumers of a
consumer group
● managing offsets for each consumer
group
In Pulsar, partition assignment is managed
by broker on a per-partition basis.
Offset management is done by storing the
acknowledgements in cursors by the
owner broker of that partition.
The group-coordinator/offsets problem
In Kafka, the group coordinator is an
elected actor within the cluster
responsible for:
● assigning partitions to consumers of a
consumer group
● managing offsets for each consumer
group
In Pulsar, partition assignment is managed
by broker on a per-partition basis.
Offset management is done by storing the
acknowledgements in cursors by the
owner broker of that partition.
Simulate this at proxy-level is hard
(missing low-level info)
And then we saw this 😍
Which lead to our collaboration 🤝
What is Apache Pulsar??
How Pulsar implements its protocol
Protocol Handler
What is the protocol handler?
What is the protocol handler?
How to load plugins in a jvm without using classpath?
Pulsar is using NAR to load plugins!
- Pulsar Function
- Pulsar Connector
- Pulsar Offloader
- Pulsar Protocol Handler
How-to load protocol handlers?
1. Upgrade your cluster to 2.5
2. Set the following configurations:
3. Configure each protocol handlers
4. Restart your broker
5. Enjoy!
Kafka-on-Pulsar Protocol Handler
The KoP Implementation
● “distributedlog”
○ Kafka and Pulsar are built on same data structure
● Similarities
○ Topic Lookup
○ Topic / Partitions / Messages / Offset
○ Produce
○ Consume
○ Consumption State
KoP Implementation
● Topic flat map: Brokers set `kafkaNamespace`
● MessageID and offset: LedgerId + EntryId
● Message: Convert key/value/timestamps/headers
● Topic Lookup: Pulsar admin topic lookup -> owner broker
● Produce: Convert message, then PulsarTopic.publishMessage
● Consume: Convert requests, then nonDurableCursor.readEntries
● Group Coordinator: Keep group information in topic
`public/__kafka/__offsets`
KoP for production
❏ What Pulsar Provides
❏ Multi-Tenancy
❏ Security
❏ TLS Encryption
❏ Authentication, Authorization
❏ Data Encryption
❏ Geo-replication
❏ Tiered storage
❏ Schema
❏ Integrations with big data ecosystem (Flink / Spark / Presto)
KoP for production
❏ What Pulsar Provides
❏ Multi-Tenancy
❏ Security
❏ TLS Encryption
❏ Authentication, Authorization
❏ Data Encryption
✓ Geo-replication
✓ Tiered storage
❏ Schema
❏ Integrations with big data ecosystem (Flink / Spark / Presto)
KoP for production
❏ What Pulsar Provides
✓ Multi-Tenancy
✓ Security
✓ TLS Encryption
✓ Authentication, Authorization
❏ Data Encryption
✓ Geo-replication
✓ Tiered storage
❏ Schema
❏ Integrations with big data ecosystem (Flink / Spark / Presto)
KoP for production
❏ What Pulsar Provides
✓ Multi-Tenancy
✓ Security
✓ TLS Encryption
✓ Authentication, Authorization
❏ Data Encryption
✓ Geo-replication
✓ Tiered storage
❏ Schema
❏ Integrations with big data ecosystem (Flink / Spark / Presto)
KoP multi-tenancy
Pulsar has great support for multi-tenancy, how-to use it in KoP?
SASL-PLAIN is used to inject info:
● The username of Kafka JAAS is the tenant/namespace
● The password must be your classic Pulsar token authentication
parameters
TLS can be added over SASL-PLAIN
KoP multi-tenancy
String tenant = "ns1/tenant1";
String password = "token:xxx";
String jaasTemplate =
"org.apache.kafka.common.security.plain.PlainLoginModule required
username="%s" password="%s";";
String jaasCfg = String.format(jaasTemplate, tenant, password);
props.put("sasl.jaas.config", jaasCfg);
props.put("security.protocol", "SASL_PLAINTEXT");
props.put("sasl.mechanism", "PLAIN");
KoP Compatibility checklist
Integrations tests are runned with Kafka official Java client and popular
Kafka clients in other languages
Golang
● https://github.com/Shopify/sarama
● https://github.com/confluentinc/confluent-kafka-go
Rust
● https://github.com/fede1024/rust-rdkafka
NodeJS
● https://github.com/Blizzard/node-rdkafka
Demo time!
Demo
● K/P-Producer -> K/P-Consumer
○ TLS & SASL-PLAIN
● K-Producer -> Pulsar Functions
● P-Producer -> Kafka Connect
* All demos run with TLS and SASL-PLAIN
https://hackmd.io/nLj5M9BEQIacKcZsNrDxmQ
Demo 1: K-Producer -> K-Consumer
Demo 2: P-Producer -> K-Consumer
Demo 3: K-Producer -> P-Consumer
Demo 4: Kafka Connect
Demo 5: Pulsar Functions
Apache Pulsar + Apache Kafka
Roadmap / Future work
● KoP Proxy
● Schema
● Kafka transaction (Waiting for Pulsar transaction)
● Kafka 1.X support
● Kafka > 2.0 support
Try it now!
● Download and try it out today!
● https://github.com/streamnative/kop
More protocol handlers are coming!
● AoP - AMQP on Pulsar ← First week of May
● MoP - MQTT on Pulsar
● ...
2020 Pulsar User Survey Report
https://streamnative.io/whitepaper/sn-apache-pulsar-user-survey-report-2020/
TGI Pulsar Weekly Live Stream
https://www.youtube.com/channel/UCywxUI5HlIyc0VEKYR4X9Pg/live
Q & A
Follow us!
● Follow us at Twitter
○ Pierre Zemb (@PierreZ)
○ Sijie Guo (@sijieg)
○ Apache Pulsar (@apache_pulsar)
○ StreamNative (@streamnativeio)
○ OVHcloud (@OVHcloud)
● Join us at #kop channel on Pulsar slack!

Contenu connexe

Tendances

Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...
Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...
Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...
StreamNative
 
Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...
Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...
Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...
StreamNative
 

Tendances (20)

Query Pulsar Streams using Apache Flink
Query Pulsar Streams using Apache FlinkQuery Pulsar Streams using Apache Flink
Query Pulsar Streams using Apache Flink
 
Interactive querying of streams using Apache Pulsar_Jerry peng
Interactive querying of streams using Apache Pulsar_Jerry pengInteractive querying of streams using Apache Pulsar_Jerry peng
Interactive querying of streams using Apache Pulsar_Jerry peng
 
What's new in apache pulsar 2.4.0
What's new in apache pulsar 2.4.0What's new in apache pulsar 2.4.0
What's new in apache pulsar 2.4.0
 
How Orange Financial combat financial frauds over 50M transactions a day usin...
How Orange Financial combat financial frauds over 50M transactions a day usin...How Orange Financial combat financial frauds over 50M transactions a day usin...
How Orange Financial combat financial frauds over 50M transactions a day usin...
 
Introduction Apache Kafka
Introduction Apache KafkaIntroduction Apache Kafka
Introduction Apache Kafka
 
Apache Pulsar and Github
Apache Pulsar and GithubApache Pulsar and Github
Apache Pulsar and Github
 
Building a FaaS with pulsar
Building a FaaS with pulsarBuilding a FaaS with pulsar
Building a FaaS with pulsar
 
From Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
From Message to Cluster: A Realworld Introduction to Kafka Capacity PlanningFrom Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
From Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
 
Building a Messaging Solutions for OVHcloud with Apache Pulsar_Pierre Zemb
Building a Messaging Solutions for OVHcloud with Apache Pulsar_Pierre ZembBuilding a Messaging Solutions for OVHcloud with Apache Pulsar_Pierre Zemb
Building a Messaging Solutions for OVHcloud with Apache Pulsar_Pierre Zemb
 
Apache Pulsar at Yahoo! Japan
Apache Pulsar at Yahoo! JapanApache Pulsar at Yahoo! Japan
Apache Pulsar at Yahoo! Japan
 
Scaling customer engagement with apache pulsar
Scaling customer engagement with apache pulsarScaling customer engagement with apache pulsar
Scaling customer engagement with apache pulsar
 
Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...
Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...
Unify Storage Backend for Batch and Streaming Computation with Apache Pulsar_...
 
Integrating Apache Pulsar with Big Data Ecosystem
Integrating Apache Pulsar with Big Data EcosystemIntegrating Apache Pulsar with Big Data Ecosystem
Integrating Apache Pulsar with Big Data Ecosystem
 
Apache Kafka at LinkedIn
Apache Kafka at LinkedInApache Kafka at LinkedIn
Apache Kafka at LinkedIn
 
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
 
Getting Pulsar Spinning_Addison Higham
Getting Pulsar Spinning_Addison HighamGetting Pulsar Spinning_Addison Higham
Getting Pulsar Spinning_Addison Higham
 
Apache Kafka - Martin Podval
Apache Kafka - Martin PodvalApache Kafka - Martin Podval
Apache Kafka - Martin Podval
 
Kafka blr-meetup-presentation - Kafka internals
Kafka blr-meetup-presentation - Kafka internalsKafka blr-meetup-presentation - Kafka internals
Kafka blr-meetup-presentation - Kafka internals
 
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013
 
Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...
Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...
Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar - Pulsar Sum...
 

Similaire à Introducing Kafka-on-Pulsar: bring native Kafka protocol support to Apache Pulsar

ApacheCon2022_Deep Dive into Building Streaming Applications with Apache Pulsar
ApacheCon2022_Deep Dive into Building Streaming Applications with Apache PulsarApacheCon2022_Deep Dive into Building Streaming Applications with Apache Pulsar
ApacheCon2022_Deep Dive into Building Streaming Applications with Apache Pulsar
Timothy Spann
 
[March sn meetup] apache pulsar + apache nifi for cloud data lake
[March sn meetup] apache pulsar + apache nifi for cloud data lake[March sn meetup] apache pulsar + apache nifi for cloud data lake
[March sn meetup] apache pulsar + apache nifi for cloud data lake
Timothy Spann
 
(Current22) Let's Monitor The Conditions at the Conference
(Current22) Let's Monitor The Conditions at the Conference(Current22) Let's Monitor The Conditions at the Conference
(Current22) Let's Monitor The Conditions at the Conference
Timothy Spann
 
[AI Dev World 2022] Build ML Enhanced Event Streaming
[AI Dev World 2022] Build ML Enhanced Event Streaming[AI Dev World 2022] Build ML Enhanced Event Streaming
[AI Dev World 2022] Build ML Enhanced Event Streaming
Timothy Spann
 

Similaire à Introducing Kafka-on-Pulsar: bring native Kafka protocol support to Apache Pulsar (20)

Kafka on Pulsar:bringing native Kafka protocol support to Pulsar_Sijie&Pierre
Kafka on Pulsar:bringing native Kafka protocol support to Pulsar_Sijie&PierreKafka on Pulsar:bringing native Kafka protocol support to Pulsar_Sijie&Pierre
Kafka on Pulsar:bringing native Kafka protocol support to Pulsar_Sijie&Pierre
 
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
14th Athens Big Data Meetup - Landoop Workshop - Apache Kafka Entering The St...
 
Structured Streaming with Kafka
Structured Streaming with KafkaStructured Streaming with Kafka
Structured Streaming with Kafka
 
Big mountain data and dev conference apache pulsar with mqtt for edge compu...
Big mountain data and dev conference   apache pulsar with mqtt for edge compu...Big mountain data and dev conference   apache pulsar with mqtt for edge compu...
Big mountain data and dev conference apache pulsar with mqtt for edge compu...
 
ApacheCon2022_Deep Dive into Building Streaming Applications with Apache Pulsar
ApacheCon2022_Deep Dive into Building Streaming Applications with Apache PulsarApacheCon2022_Deep Dive into Building Streaming Applications with Apache Pulsar
ApacheCon2022_Deep Dive into Building Streaming Applications with Apache Pulsar
 
OSS EU: Deep Dive into Building Streaming Applications with Apache Pulsar
OSS EU:  Deep Dive into Building Streaming Applications with Apache PulsarOSS EU:  Deep Dive into Building Streaming Applications with Apache Pulsar
OSS EU: Deep Dive into Building Streaming Applications with Apache Pulsar
 
Cloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azureCloud lunch and learn real-time streaming in azure
Cloud lunch and learn real-time streaming in azure
 
CODEONTHEBEACH_Streaming Applications with Apache Pulsar
CODEONTHEBEACH_Streaming Applications with Apache PulsarCODEONTHEBEACH_Streaming Applications with Apache Pulsar
CODEONTHEBEACH_Streaming Applications with Apache Pulsar
 
[March sn meetup] apache pulsar + apache nifi for cloud data lake
[March sn meetup] apache pulsar + apache nifi for cloud data lake[March sn meetup] apache pulsar + apache nifi for cloud data lake
[March sn meetup] apache pulsar + apache nifi for cloud data lake
 
Machine Intelligence Guild_ Build ML Enhanced Event Streaming Applications wi...
Machine Intelligence Guild_ Build ML Enhanced Event Streaming Applications wi...Machine Intelligence Guild_ Build ML Enhanced Event Streaming Applications wi...
Machine Intelligence Guild_ Build ML Enhanced Event Streaming Applications wi...
 
Data science online camp using the flipn stack for edge ai (flink, nifi, pu...
Data science online camp   using the flipn stack for edge ai (flink, nifi, pu...Data science online camp   using the flipn stack for edge ai (flink, nifi, pu...
Data science online camp using the flipn stack for edge ai (flink, nifi, pu...
 
A Tour of Apache Kafka
A Tour of Apache KafkaA Tour of Apache Kafka
A Tour of Apache Kafka
 
bigdata 2022_ FLiP Into Pulsar Apps
bigdata 2022_ FLiP Into Pulsar Appsbigdata 2022_ FLiP Into Pulsar Apps
bigdata 2022_ FLiP Into Pulsar Apps
 
Pulsar summit asia 2021 apache pulsar with mqtt for edge computing
Pulsar summit asia 2021   apache pulsar with mqtt for edge computingPulsar summit asia 2021   apache pulsar with mqtt for edge computing
Pulsar summit asia 2021 apache pulsar with mqtt for edge computing
 
Let’s Monitor Conditions at the Conference With Timothy Spann & David Kjerrum...
Let’s Monitor Conditions at the Conference With Timothy Spann & David Kjerrum...Let’s Monitor Conditions at the Conference With Timothy Spann & David Kjerrum...
Let’s Monitor Conditions at the Conference With Timothy Spann & David Kjerrum...
 
(Current22) Let's Monitor The Conditions at the Conference
(Current22) Let's Monitor The Conditions at the Conference(Current22) Let's Monitor The Conditions at the Conference
(Current22) Let's Monitor The Conditions at the Conference
 
DBCC 2021 - FLiP Stack for Cloud Data Lakes
DBCC 2021 - FLiP Stack for Cloud Data LakesDBCC 2021 - FLiP Stack for Cloud Data Lakes
DBCC 2021 - FLiP Stack for Cloud Data Lakes
 
Introduction to Apache Kafka- Part 1
Introduction to Apache Kafka- Part 1Introduction to Apache Kafka- Part 1
Introduction to Apache Kafka- Part 1
 
Creating Connector to Bridge the Worlds of Kafka and gRPC at Wework (Anoop Di...
Creating Connector to Bridge the Worlds of Kafka and gRPC at Wework (Anoop Di...Creating Connector to Bridge the Worlds of Kafka and gRPC at Wework (Anoop Di...
Creating Connector to Bridge the Worlds of Kafka and gRPC at Wework (Anoop Di...
 
[AI Dev World 2022] Build ML Enhanced Event Streaming
[AI Dev World 2022] Build ML Enhanced Event Streaming[AI Dev World 2022] Build ML Enhanced Event Streaming
[AI Dev World 2022] Build ML Enhanced Event Streaming
 

Plus de StreamNative

Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
StreamNative
 

Plus de StreamNative (20)

Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
Is Using KoP (Kafka-on-Pulsar) a Good Idea? - Pulsar Summit SF 2022
 
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
 
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
 
Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...
 
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
 
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
 
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
 
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
 
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
 
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
 
Understanding Broker Load Balancing - Pulsar Summit SF 2022
Understanding Broker Load Balancing - Pulsar Summit SF 2022Understanding Broker Load Balancing - Pulsar Summit SF 2022
Understanding Broker Load Balancing - Pulsar Summit SF 2022
 
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
 
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
 
Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022
 
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
 
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
 
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
 
Welcome and Opening Remarks - Pulsar Summit SF 2022
Welcome and Opening Remarks - Pulsar Summit SF 2022Welcome and Opening Remarks - Pulsar Summit SF 2022
Welcome and Opening Remarks - Pulsar Summit SF 2022
 
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
 
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
 

Dernier

Rohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Call Girls In Delhi Whatsup 9873940964 Enjoy Unlimited Pleasure
 
Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...
Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...
Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...
Call Girls In Delhi Whatsup 9873940964 Enjoy Unlimited Pleasure
 
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
soniya singh
 
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
soniya singh
 
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝
soniya singh
 
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptxAWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
ellan12
 
Hot Service (+9316020077 ) Goa Call Girls Real Photos and Genuine Service
Hot Service (+9316020077 ) Goa  Call Girls Real Photos and Genuine ServiceHot Service (+9316020077 ) Goa  Call Girls Real Photos and Genuine Service
Hot Service (+9316020077 ) Goa Call Girls Real Photos and Genuine Service
sexy call girls service in goa
 
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Call Girls In Delhi Whatsup 9873940964 Enjoy Unlimited Pleasure
 

Dernier (20)

Rohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 6 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 
Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...
Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...
Dwarka Sector 26 Call Girls | Delhi | 9999965857 🫦 Vanshika Verma More Our Se...
 
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Saket Delhi 💯Call Us 🔝8264348440🔝
 
Hot Call Girls |Delhi |Hauz Khas ☎ 9711199171 Book Your One night Stand
Hot Call Girls |Delhi |Hauz Khas ☎ 9711199171 Book Your One night StandHot Call Girls |Delhi |Hauz Khas ☎ 9711199171 Book Your One night Stand
Hot Call Girls |Delhi |Hauz Khas ☎ 9711199171 Book Your One night Stand
 
Russian Call girl in Ajman +971563133746 Ajman Call girl Service
Russian Call girl in Ajman +971563133746 Ajman Call girl ServiceRussian Call girl in Ajman +971563133746 Ajman Call girl Service
Russian Call girl in Ajman +971563133746 Ajman Call girl Service
 
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
DDoS In Oceania and the Pacific, presented by Dave Phelan at NZNOG 2024
 
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Pratap Nagar Delhi 💯Call Us 🔝8264348440🔝
 
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark WebGDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
GDG Cloud Southlake 32: Kyle Hettinger: Demystifying the Dark Web
 
@9999965857 🫦 Sexy Desi Call Girls Laxmi Nagar 💓 High Profile Escorts Delhi 🫶
@9999965857 🫦 Sexy Desi Call Girls Laxmi Nagar 💓 High Profile Escorts Delhi 🫶@9999965857 🫦 Sexy Desi Call Girls Laxmi Nagar 💓 High Profile Escorts Delhi 🫶
@9999965857 🫦 Sexy Desi Call Girls Laxmi Nagar 💓 High Profile Escorts Delhi 🫶
 
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝
Call Girls In Model Towh Delhi 💯Call Us 🔝8264348440🔝
 
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptxAWS Community DAY Albertini-Ellan Cloud Security (1).pptx
AWS Community DAY Albertini-Ellan Cloud Security (1).pptx
 
Moving Beyond Twitter/X and Facebook - Social Media for local news providers
Moving Beyond Twitter/X and Facebook - Social Media for local news providersMoving Beyond Twitter/X and Facebook - Social Media for local news providers
Moving Beyond Twitter/X and Facebook - Social Media for local news providers
 
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call GirlVIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
VIP 7001035870 Find & Meet Hyderabad Call Girls LB Nagar high-profile Call Girl
 
Hot Service (+9316020077 ) Goa Call Girls Real Photos and Genuine Service
Hot Service (+9316020077 ) Goa  Call Girls Real Photos and Genuine ServiceHot Service (+9316020077 ) Goa  Call Girls Real Photos and Genuine Service
Hot Service (+9316020077 ) Goa Call Girls Real Photos and Genuine Service
 
Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.
Call Now ☎ 8264348440 !! Call Girls in Shahpur Jat Escort Service Delhi N.C.R.
 
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
Hire↠Young Call Girls in Tilak nagar (Delhi) ☎️ 9205541914 ☎️ Independent Esc...
 
Russian Call Girls in %(+971524965298 )# Call Girls in Dubai
Russian Call Girls in %(+971524965298  )#  Call Girls in DubaiRussian Call Girls in %(+971524965298  )#  Call Girls in Dubai
Russian Call Girls in %(+971524965298 )# Call Girls in Dubai
 
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 26 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 
VVVIP Call Girls In Connaught Place ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...
VVVIP Call Girls In Connaught Place ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...VVVIP Call Girls In Connaught Place ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...
VVVIP Call Girls In Connaught Place ➡️ Delhi ➡️ 9999965857 🚀 No Advance 24HRS...
 
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445
All Time Service Available Call Girls Mg Road 👌 ⏭️ 6378878445
 

Introducing Kafka-on-Pulsar: bring native Kafka protocol support to Apache Pulsar

  • 1. Introducing Kafka-on-Pulsar: Bring native Kafka protocol support to Apache Pulsar Sijie Guo / Pierre Zemb 2020-03-31
  • 2. Who are we? ● Sijie Guo (@sijieg) ● Co-Founder & CEO, StreamNative ● PMC Member of Pulsar/BookKeeper ● Ex Co-Founder, Streamlio ● Ex-Twitter, Ex-Yahoo ● Work on messaging and streaming data technologies for many years
  • 3. Who are we? ● Pierre Zemb (@PierreZ) ● Tech lead ● Working around distributed systems ● Newcomer as an Apache contributor ● Involved into local communities
  • 5. Agenda ● What is Apache Pulsar? ● Why KoP? ● Introduction of protocol handler ● Kafka VS Pulsar, the protocol version ● How we implement KoP ● Demo ● Roadmap ● Q&A
  • 6. What is Apache Pulsar?
  • 7. Flexible pub/sub messaging backed by durable log storage
  • 8. Flexible pub/sub messaging backed by durable log storage
  • 10. Apache Pulsar ● Publish-subscribe: unified messaging model (streaming + queueing) ● Infinite event stream storage: Apache BookKeeper + Tiered Storage ● Connectors: ingest events without writing code ● Process events in real-time ○ Pulsar Functions for serverless / lightweight computation ○ Spark / Flink for unified data processing ○ Presto for interactive queries
  • 11. Pulsar Highlights ● Multi-tenancy ● Unified messaging (queuing + streaming) ● Layered Architecture ● Tiered Storage ● Built-in schema support ● Built-in geo-replication
  • 12. The Need of KoP ● Adoptions ● Inbound requests ● Migration
  • 13. The Existing Efforts ● Kafka Java Wrapper ● Pulsar IO Connector
  • 14. Implement Kafka protocol on Pulsar? ● Proxy / Gateway ● Implement Kafka protocol on Pulsar broker
  • 15. KoP as a proxy, OVHcloud version We first implemented KoP has a proxy PoC in Rust: ● Rust async was out in nightly compiler when we started ● We wanted no GC on proxy layers ● Rust has awesome libraries at TCP-level Our goal was to convert TCP frames from Kafka to Pulsar
  • 16. KoP as a proxy, OVHcloud version Proxy layer "Hyperion" clients
  • 17. KoP as a proxy, OVHcloud version Everything is a state-machine: ● TCP cnx from Kafka clients ● TCP cnx to Pulsar brokers Those event-driven finite-state machines were triggered by TCP frames from their respective protocol. A third one was above the two to provide synchronization
  • 18. KoP as a proxy, OVHcloud version Proxy layer "Hyperion" clients State machines
  • 19. KoP as a proxy, OVHcloud version Pros ● Working at TCP layer enables performance ● nice PoC to discover both protocols ● Rust is blazing fast ● Proxify production is easy ● We could bump old version of Kafka frames for old Kafka clients Cons ● Rewrite everything ● Some things were hard to proxify: ○ Group coordinator ○ Offsets management ● Difficult to open-source (different language)
  • 20. The group-coordinator/offsets problem In Kafka, the group coordinator is an elected actor within the cluster responsible for: ● assigning partitions to consumers of a consumer group ● managing offsets for each consumer group In Pulsar, partition assignment is managed by broker on a per-partition basis. Offset management is done by storing the acknowledgements in cursors by the owner broker of that partition.
  • 21. The group-coordinator/offsets problem In Kafka, the group coordinator is an elected actor within the cluster responsible for: ● assigning partitions to consumers of a consumer group ● managing offsets for each consumer group In Pulsar, partition assignment is managed by broker on a per-partition basis. Offset management is done by storing the acknowledgements in cursors by the owner broker of that partition. Simulate this at proxy-level is hard (missing low-level info)
  • 22. And then we saw this 😍
  • 23. Which lead to our collaboration 🤝
  • 24. What is Apache Pulsar??
  • 25. How Pulsar implements its protocol
  • 27. What is the protocol handler?
  • 28. What is the protocol handler? How to load plugins in a jvm without using classpath? Pulsar is using NAR to load plugins! - Pulsar Function - Pulsar Connector - Pulsar Offloader - Pulsar Protocol Handler
  • 29. How-to load protocol handlers? 1. Upgrade your cluster to 2.5 2. Set the following configurations: 3. Configure each protocol handlers 4. Restart your broker 5. Enjoy!
  • 31. The KoP Implementation ● “distributedlog” ○ Kafka and Pulsar are built on same data structure ● Similarities ○ Topic Lookup ○ Topic / Partitions / Messages / Offset ○ Produce ○ Consume ○ Consumption State
  • 32. KoP Implementation ● Topic flat map: Brokers set `kafkaNamespace` ● MessageID and offset: LedgerId + EntryId ● Message: Convert key/value/timestamps/headers ● Topic Lookup: Pulsar admin topic lookup -> owner broker ● Produce: Convert message, then PulsarTopic.publishMessage ● Consume: Convert requests, then nonDurableCursor.readEntries ● Group Coordinator: Keep group information in topic `public/__kafka/__offsets`
  • 33. KoP for production ❏ What Pulsar Provides ❏ Multi-Tenancy ❏ Security ❏ TLS Encryption ❏ Authentication, Authorization ❏ Data Encryption ❏ Geo-replication ❏ Tiered storage ❏ Schema ❏ Integrations with big data ecosystem (Flink / Spark / Presto)
  • 34. KoP for production ❏ What Pulsar Provides ❏ Multi-Tenancy ❏ Security ❏ TLS Encryption ❏ Authentication, Authorization ❏ Data Encryption ✓ Geo-replication ✓ Tiered storage ❏ Schema ❏ Integrations with big data ecosystem (Flink / Spark / Presto)
  • 35. KoP for production ❏ What Pulsar Provides ✓ Multi-Tenancy ✓ Security ✓ TLS Encryption ✓ Authentication, Authorization ❏ Data Encryption ✓ Geo-replication ✓ Tiered storage ❏ Schema ❏ Integrations with big data ecosystem (Flink / Spark / Presto)
  • 36. KoP for production ❏ What Pulsar Provides ✓ Multi-Tenancy ✓ Security ✓ TLS Encryption ✓ Authentication, Authorization ❏ Data Encryption ✓ Geo-replication ✓ Tiered storage ❏ Schema ❏ Integrations with big data ecosystem (Flink / Spark / Presto)
  • 37. KoP multi-tenancy Pulsar has great support for multi-tenancy, how-to use it in KoP? SASL-PLAIN is used to inject info: ● The username of Kafka JAAS is the tenant/namespace ● The password must be your classic Pulsar token authentication parameters TLS can be added over SASL-PLAIN
  • 38. KoP multi-tenancy String tenant = "ns1/tenant1"; String password = "token:xxx"; String jaasTemplate = "org.apache.kafka.common.security.plain.PlainLoginModule required username="%s" password="%s";"; String jaasCfg = String.format(jaasTemplate, tenant, password); props.put("sasl.jaas.config", jaasCfg); props.put("security.protocol", "SASL_PLAINTEXT"); props.put("sasl.mechanism", "PLAIN");
  • 39. KoP Compatibility checklist Integrations tests are runned with Kafka official Java client and popular Kafka clients in other languages Golang ● https://github.com/Shopify/sarama ● https://github.com/confluentinc/confluent-kafka-go Rust ● https://github.com/fede1024/rust-rdkafka NodeJS ● https://github.com/Blizzard/node-rdkafka
  • 41. Demo ● K/P-Producer -> K/P-Consumer ○ TLS & SASL-PLAIN ● K-Producer -> Pulsar Functions ● P-Producer -> Kafka Connect * All demos run with TLS and SASL-PLAIN https://hackmd.io/nLj5M9BEQIacKcZsNrDxmQ
  • 42. Demo 1: K-Producer -> K-Consumer
  • 43. Demo 2: P-Producer -> K-Consumer
  • 44. Demo 3: K-Producer -> P-Consumer
  • 45. Demo 4: Kafka Connect
  • 46. Demo 5: Pulsar Functions
  • 47. Apache Pulsar + Apache Kafka
  • 48. Roadmap / Future work ● KoP Proxy ● Schema ● Kafka transaction (Waiting for Pulsar transaction) ● Kafka 1.X support ● Kafka > 2.0 support
  • 49. Try it now! ● Download and try it out today! ● https://github.com/streamnative/kop More protocol handlers are coming! ● AoP - AMQP on Pulsar ← First week of May ● MoP - MQTT on Pulsar ● ...
  • 50. 2020 Pulsar User Survey Report https://streamnative.io/whitepaper/sn-apache-pulsar-user-survey-report-2020/
  • 51. TGI Pulsar Weekly Live Stream https://www.youtube.com/channel/UCywxUI5HlIyc0VEKYR4X9Pg/live
  • 52. Q & A
  • 53. Follow us! ● Follow us at Twitter ○ Pierre Zemb (@PierreZ) ○ Sijie Guo (@sijieg) ○ Apache Pulsar (@apache_pulsar) ○ StreamNative (@streamnativeio) ○ OVHcloud (@OVHcloud) ● Join us at #kop channel on Pulsar slack!