©︎2021 Yahoo Japan Corporation All rights reserved.
Reduce redundant producers from partitioned producer
Yuri Mizushima - Yahoo Japan Corporation
Agenda
1. Background of the issue
2. Solution
3. Benchmark
4. Conclusion
1. Background of the issue
Apache Pulsar in Yahoo! JAPAN
At Yahoo! JAPAN, Apache Pulsar is used in many use cases.
• Notification of content updates
• Job queuing
• etc.
One use case: metrics/logs streaming pipeline
Pulsar is also used in a metrics and logs streaming pipeline.
One use case: metrics/logs streaming pipeline
In this case, an unspecified number of producers connect to partitioned topics. This causes two problems:
1. The number of producers exceeds the limit
2. Redundant producers are created per instance
The number of producers exceeds the limit
Pulsar has a config maxProducersPerTopic. "Ill-behaved" clients that keep creating producers without bound can fill the topic up to this limit.
To solve this issue,
• Limit the number of producers/consumers that can connect per topic for each IP address (https://github.com/apache/pulsar/pull/10188)
• Even if a client keeps increasing its number of producers, the impact is confined to a single IP address
• We don’t explain the details in this session
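The idea of a per-address limit can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the broker's implementation: the class `TopicProducerLimiter` and the parameter `max_producers_per_address` are hypothetical names invented for illustration; only maxProducersPerTopic comes from the slide.

```python
from collections import Counter


class TopicProducerLimiter:
    """Toy model of a topic that caps producers in total and per client address.

    Hypothetical class for illustration; not Pulsar broker code.
    """

    def __init__(self, max_producers_per_topic, max_producers_per_address):
        self.max_producers_per_topic = max_producers_per_topic
        self.max_producers_per_address = max_producers_per_address
        self.producers_by_address = Counter()

    def try_add_producer(self, address):
        """Return True if the producer is admitted, False if a limit rejects it."""
        if sum(self.producers_by_address.values()) >= self.max_producers_per_topic:
            return False
        if self.producers_by_address[address] >= self.max_producers_per_address:
            return False
        self.producers_by_address[address] += 1
        return True


topic = TopicProducerLimiter(max_producers_per_topic=10, max_producers_per_address=3)
# An ill-behaved client keeps opening producers from a single address.
admitted = sum(topic.try_add_producer("10.0.0.1") for _ in range(100))
# A well-behaved client on another address can still connect.
other_ok = topic.try_add_producer("10.0.0.2")
```

In this model the ill-behaved address is stopped at its own cap, so the topic-wide limit is not exhausted by one client.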
Redundant producers are created per instance
When a producer connects to a partitioned topic, it sometimes holds redundant internal producers.
1. Relatively "low-rate" producers
• The number of partitions needs to be increased according to total throughput
• However, creating internal producers for all partitions is inefficient for producers whose throughput is small enough to be handled by a few partitions
2. SinglePartition routing mode
• Each producer uses only one partition
• The other internal producers are redundant
Next...
In this session, we focus on the second of the two problems:
1. The number of producers exceeds the limit
2. Redundant producers are created per instance
2. Solution
Concepts
Reduce the number of producers so that system resources (e.g. client heap) are used more efficiently.
Detail - limiting by number of internal producers
When a partitioned producer connects to the topic, the client can randomly choose a limited number of partitions and create internal producers only for those partitions.
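The random choice of a limited set of partitions might look like the following minimal sketch. The function name `choose_partitions` and its parameters are assumptions made for illustration; the slide does not show how the client actually performs the selection.

```python
import random


def choose_partitions(num_partitions, limit, rng=random):
    """Pick at most `limit` distinct partition indexes uniformly at random.

    Hypothetical helper for illustration; not the Pulsar client API.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    count = min(limit, num_partitions)
    return sorted(rng.sample(range(num_partitions), count))


# 30 partitions, limited to 3 internal producers (as in the benchmark setup).
chosen = choose_partitions(num_partitions=30, limit=3, rng=random.Random(0))
```

When the limit is larger than the number of partitions, this sketch simply falls back to using every partition.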
Detail - lazy-loading
First, at the initialization step, a partitioned producer connects to only one of the partitions, instead of all of them, for authentication (authn) and authorization (authz).
When that internal producer is created, authn is validated, and authz is validated for the topic.
Detail - lazy-loading
Second, at the message-sending step, partitions are chosen by the message router. Each internal producer is created the first time its partition is chosen by the message router.
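The two lazy-loading steps can be sketched as a small model. Everything here is an assumption for illustration: the class `LazyPartitionedProducer`, the `router` and `connect` callables, and the `first_partition` parameter are hypothetical, and a plain list stands in for an internal producer. It is not the Pulsar client implementation.

```python
class LazyPartitionedProducer:
    """Toy model: internal producers are created only when first needed."""

    def __init__(self, topic, num_partitions, router, connect, first_partition=0):
        self.topic = topic
        self.num_partitions = num_partitions
        self.router = router      # callable: (message, num_partitions) -> index
        self.connect = connect    # callable: partition topic name -> producer
        self.internal_producers = {}
        # Step 1: connect to a single partition so that authn/authz
        # problems surface at initialization time.
        self._producer_for(first_partition)

    def _producer_for(self, index):
        # Step 2: create the internal producer on first use, then reuse it.
        if index not in self.internal_producers:
            name = f"{self.topic}-partition-{index}"
            self.internal_producers[index] = self.connect(name)
        return self.internal_producers[index]

    def send(self, message):
        index = self.router(message, self.num_partitions)
        self._producer_for(index).append(message)
        return index


connected = []


def fake_connect(partition_topic):
    """Record the connection and return a list as a stand-in producer."""
    connected.append(partition_topic)
    return []


producer = LazyPartitionedProducer(
    "my-topic", 30,
    router=lambda message, n: len(message) % n,  # deterministic toy router
    connect=fake_connect,
)
after_init = list(connected)
for message in ["a", "bb", "c"]:
    producer.send(message)
```

In this run only three of the 30 partitions are ever connected: one at initialization and two more when the router first selects them.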
Detail - partial round-robin
We also add a new custom routing mode, PartialRoundRobinMessageRouter. This mode supports round-robin over a limited number of partitions.
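Round-robin over a limited subset of partitions can be modelled in a few lines. This is a simplified sketch, not the PartialRoundRobinMessageRouter code: the class name `PartialRoundRobinModel` and its constructor arguments are hypothetical.

```python
import itertools
import random


class PartialRoundRobinModel:
    """Toy router: cycle over a fixed random subset of the partitions."""

    def __init__(self, num_partitions, limit, rng=random):
        count = min(limit, num_partitions)
        # The subset is fixed once, so only `count` partitions are ever used.
        self.partitions = rng.sample(range(num_partitions), count)
        self._cycle = itertools.cycle(self.partitions)

    def choose_partition(self):
        """Return the next partition index in round-robin order."""
        return next(self._cycle)


router = PartialRoundRobinModel(num_partitions=30, limit=3, rng=random.Random(42))
picks = [router.choose_partition() for _ in range(9)]
```

Combined with lazy-loading, a router like this means internal producers are only ever created for the partitions in the chosen subset.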
Detail - partitioned producer stats
Partitioned producer stats are accumulated across all partitions. The accumulation procedure assumes that “all the partitions have the same producer”.
Detail - partitioned producer stats
We introduce producerStatsKey. Publisher stats with the same value for this property are accumulated as the same producer.
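Accumulating by key rather than by position can be sketched as follows. The function `aggregate_publisher_stats`, the dictionary layout, and the sample values are assumptions made up for illustration; only the property name producerStatsKey is taken from the slide.

```python
from collections import defaultdict


def aggregate_publisher_stats(partition_stats):
    """Sum publisher stats across partitions, grouped by producerStatsKey.

    `partition_stats` is a list (one entry per partition) of lists of
    publisher dicts. Hypothetical data layout for illustration.
    """
    totals = defaultdict(lambda: {"msgRateIn": 0.0, "partitions": 0})
    for publishers in partition_stats:
        for publisher in publishers:
            entry = totals[publisher["producerStatsKey"]]
            entry["msgRateIn"] += publisher["msgRateIn"]
            entry["partitions"] += 1
    return dict(totals)


# With lazy-loading, partitions no longer hold the same set of producers,
# so publishers cannot be matched up by their position in each list.
partition_stats = [
    [{"producerStatsKey": "A", "msgRateIn": 10.0},
     {"producerStatsKey": "B", "msgRateIn": 5.0}],
    [{"producerStatsKey": "B", "msgRateIn": 7.0}],
    [{"producerStatsKey": "A", "msgRateIn": 2.5}],
]
merged = aggregate_publisher_stats(partition_stats)
```

Grouping by key keeps producer A and producer B separate even though they appear at different positions, and on different partitions, in the per-partition lists.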
Next...
Compare the performance of the proposed and existing implementations.
3. Benchmark
Criteria
• Broker heap usage, CPU percentage
• Client heap usage, CPU percentage
• Number of TCP connections between client and broker
• Producer initialization time
• Average 99th-percentile latency
Assumptions
• Send messages without batching
• Send messages with PartialRoundRobinMessageRouter (proposed) and RoundRobinPartition (existing)
• Create a partitioned topic with 30 partitions
• Send 1024-byte messages at 5000 rps
• Existing: commit hash 0b71b13
Procedure and Variables
• Procedure
1. Run b broker servers
2. Create a producer limited to l partitions
3. Send messages for 10 minutes
4. (Check stats)
• Variables
• b=5, l=3
• b=5, l=5
• b=10, l=5
Environment
• macOS 10.15.7
• 2.5GHz dual-core Intel Core i7
• 16 GB 2133 MHz LPDDR3
• AdoptOpenJDK-11.0.11+9
Result - b=5, l=3, proposed
• Number of TCP connections between client and broker: 3
• Producer initialization time: 463 [ms]
Result - b=5, l=3, proposed
[Figure: client heap/CPU, broker heap/CPU]
Result - b=5, l=3, existing
• Number of TCP connections between client and broker: 5
• Producer initialization time: 1389 [ms]
Result - b=5, l=3, existing
[Figure: client heap/CPU, broker heap/CPU]
Result - b=5, l=3
[Figure: average 99th-percentile latency]
Result - b=5, l=5, proposed
• Number of TCP connections between client and broker: 5
• Producer initialization time: 862 [ms]
Result - b=5, l=5, proposed
[Figure: client heap/CPU, broker heap/CPU]
Result - b=5, l=5
[Figure: average 99th-percentile latency]
Result - b=10, l=5, proposed
• Number of TCP connections between client and broker: 4
• Producer initialization time: 1070 [ms]
Result - b=10, l=5, proposed
[Figure: client heap/CPU, broker heap/CPU]
Result - b=10, l=5, existing
• Number of TCP connections between client and broker: 5
• Producer initialization time: 1147 [ms]
Result - b=10, l=5, existing
[Figure: client heap/CPU, broker heap/CPU]
Result - b=10, l=5
[Figure: average 99th-percentile latency]
Consideration
• Client side
• Producer initialization time is shorter than with “existing”
• Number of TCP connections is less than or equal to that of “existing”
• Suppose a cluster has enough brokers. If we use the feature, the number of TCP connections is less than or equal to “limit+1”
• Heap usage is lower than with “existing”
• In some cases, CPU percentage is more stable than with “existing”
Consideration
                Number of TCP connections    Producer initialization time [ms]
                proposed    existing         proposed    existing
b=5, l=3        3           5                463         1389
b=5, l=5        5           5                862         1389
b=10, l=5       4           5                1070        1147
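To make the initialization-time comparison easier to read, the relative reduction can be computed directly from the figures in the table. This is a small worked calculation, not part of the original benchmark; the helper name `reduction_percent` is introduced here for illustration.

```python
# (proposed_ms, existing_ms) taken from the table above.
results = {
    "b=5, l=3": (463, 1389),
    "b=5, l=5": (862, 1389),
    "b=10, l=5": (1070, 1147),
}


def reduction_percent(proposed_ms, existing_ms):
    """Relative reduction of proposed vs. existing, in percent (1 decimal)."""
    return round(100.0 * (existing_ms - proposed_ms) / existing_ms, 1)


reductions = {
    case: reduction_percent(proposed, existing)
    for case, (proposed, existing) in results.items()
}
```

With these inputs the reduction is roughly 67% for b=5, l=3, about 38% for b=5, l=5, and about 7% for b=10, l=5. These are single measurements from a toy setup, so the percentages should be read as indicative only.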
Consideration
• Broker side
• No significant effect on heap usage
• CPU percentage of some brokers is lower than with “existing” at “b=5, l=3”
• Maybe because topic load isn’t distributed to all brokers
• CPU percentage per broker is higher than with “existing” at “b=10, l=5”
• Maybe because topic load isn’t distributed to all brokers
• The number of running brokers (whose CPU/heap usage increased) is lower than with "existing"
• The number of running brokers depends on the "active topics" to which producers are actually connected
Consideration
Number of brokers whose system resources increased
                proposed    existing
b=5, l=3        2           4
b=5, l=5        4           4
b=10, l=5       3           4

Number of topics loaded per broker (b=5, l=3)
                proposed    existing
broker0         0           8
broker1         1           0
broker2         2           7
broker3         0           7
broker4         0           8
4. Conclusion
In conclusion,
• Implemented producer lazy-loading and the partial round-robin feature
• and fixed the partitioned producer stats collection logic
• Checked the performance with a toy example
• Future tasks
• Also implement the feature in other clients
• e.g. C++, Go, etc.
I’m really excited to be involved in Apache Pulsar.
Appendix
Consideration - latency
• The latency result for the existing implementation looks strange
• Retest it and reproduce the strange behavior

Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
 
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
Blue-green deploys with Pulsar & Envoy in an event-driven microservice ecosys...
 
Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...Distributed Database Design Decisions to Support High Performance Event Strea...
Distributed Database Design Decisions to Support High Performance Event Strea...
 
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
Simplify Pulsar Functions Development with SQL - Pulsar Summit SF 2022
 
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
 
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
Validating Apache Pulsar’s Behavior under Failure Conditions - Pulsar Summit ...
 
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
Cross the Streams! Creating Streaming Data Pipelines with Apache Flink + Apac...
 
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
Message Redelivery: An Unexpected Journey - Pulsar Summit SF 2022
 
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
Unlocking the Power of Lakehouse Architectures with Apache Pulsar and Apache ...
 
Understanding Broker Load Balancing - Pulsar Summit SF 2022
Understanding Broker Load Balancing - Pulsar Summit SF 2022Understanding Broker Load Balancing - Pulsar Summit SF 2022
Understanding Broker Load Balancing - Pulsar Summit SF 2022
 
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
Building an Asynchronous Application Framework with Python and Pulsar - Pulsa...
 
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
Pulsar's Journey in Yahoo!: On-prem, Cloud and Hybrid - Pulsar Summit SF 2022
 
Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022Event-Driven Applications Done Right - Pulsar Summit SF 2022
Event-Driven Applications Done Right - Pulsar Summit SF 2022
 
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
Pulsar @ Scale. 200M RPM and 1K instances - Pulsar Summit SF 2022
 
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
Data Democracy: Journey to User-Facing Analytics - Pulsar Summit SF 2022
 
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
Beam + Pulsar: Powerful Stream Processing at Scale - Pulsar Summit SF 2022
 
Welcome and Opening Remarks - Pulsar Summit SF 2022
Welcome and Opening Remarks - Pulsar Summit SF 2022Welcome and Opening Remarks - Pulsar Summit SF 2022
Welcome and Opening Remarks - Pulsar Summit SF 2022
 
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
Log System As Backbone – How We Built the World’s Most Advanced Vector Databa...
 
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
MoP(MQTT on Pulsar) - a Powerful Tool for Apache Pulsar in IoT - Pulsar Summi...
 

Dernier

TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...apidays
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusZilliz
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 

Dernier (20)

TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 

Reduce Redundant Producers from Partitioned Producer - Pulsar Summit NA 2021

  • 1. ©︎2021 Yahoo Japan Corporation All rights reserved. Reduce redundant producers from partitioned producer Yuri Mizushima - Yahoo Japan Corporation
  • 2. Agenda: 1. Background of the issue 2. Solution 3. Benchmark 4. Conclusion
  • 3. Agenda: 1. Background of the issue 2. Solution 3. Benchmark 4. Conclusion
  • 4. Apache Pulsar in Yahoo! JAPAN: In Yahoo! JAPAN, Apache Pulsar is used in many use cases, for example notification of content updates and job queuing.
  • 5. One of the use cases, a metrics/logs streaming pipeline: Pulsar is also used in a metrics and logs streaming pipeline.
  • 6. One of the use cases, a metrics/logs streaming pipeline: In this case, an unspecified number of producers connect to partitioned topics. This causes two issues: 1. The number of producers exceeds the limit 2. Redundant producers are created per instance
  • 7. The number of producers exceeds the limit: Pulsar has a config maxProducersPerTopic. "Ill-behaved" clients that keep increasing their producers without bound can make the topic producer-full. To solve this issue, limit the number of producers/consumers that can connect per topic for each IP address (https://github.com/apache/pulsar/pull/10188). Even if a client increases its number of producers, the impact is confined to a single IP address. The details are not covered in this session.
  • 8. Redundant producers are created per instance: When a producer connects to a partitioned topic, it sometimes has redundant internal producers. 1. Relatively "low-rate" producers: the number of partitions needs to be increased according to total throughput, but creating internal producers for all partitions is inefficient for producers whose throughput is small enough to be handled by a few partitions. 2. SinglePartition routing mode: each producer uses only one partition, so the other internal producers are redundant.
  • 9. Next...: Of the two issues (1. The number of producers exceeds the limit 2. Redundant producers are created per instance), this session covers the second.
  • 10. Agenda: 1. Background of the issue 2. Solution 3. Benchmark 4. Conclusion
  • 11. Concepts: Reduce the number of producers to use system resources (e.g. client heap) more efficiently.
  • 12. Detail - limiting by number of internal producers: When a partitioned producer connects to the topic, the client can randomly choose a limited number of internal producers, and therefore a limited number of partitions.
  • 13. Detail - lazy-loading: First, at the initialization step, a partitioned producer connects to only one of the partitions, instead of all partitions, for authentication (authn) and authorization (authz). When this internal producer is created, authn and authz are validated for the topic.
  • 14. Detail - lazy-loading: Second, at the message sending step, partitions are chosen by the message router. Each internal producer is created the first time its partition is chosen by the message router.
  • 15. Detail - partial round-robin: We also add a new custom routing mode, PartialRoundRobinMessageRouter. This mode supports round-robin over a limited number of partitions.
  • 16. Detail - partitioned producer stats: Partitioned producer stats are accumulated over all partitions. The accumulation procedure assumes that “all the partitions have the same producer”.
  • 17. Detail - partitioned producer stats: We introduce producerStatsKey. Publisher stats with the same value for this property are accumulated as the same producer.
  • 18. Next...: Compare the performance of the proposed and existing implementations.
  • 19. Agenda: 1. Background of the issue 2. Solution 3. Benchmark 4. Conclusion
  • 20. Criterion: Broker heap usage and CPU percentage; client heap usage and CPU percentage; number of TCP connections between client and broker; producer initialization time; 99pct latency average.
  • 21. Assumption: Messages are sent without batching. Messages are sent with PartialRoundRobinMessageRouter (proposed) and RoundRobinPartition (existing). A partitioned topic with 30 partitions is created. 1024-byte messages are sent at 5000 rps. Existing: commit hash 0b71b13.
  • 22. Procedure and Variables: Procedure: 1. Run b broker servers 2. Create a producer limited to l partitions 3. Send messages for 10 minutes 4. (Check stats). Variables: b=5, l=3; b=5, l=5; b=10, l=5.
  • 23. Environment: macOS 10.15.7; 2.5 GHz dual-core Intel Core i7; 16 GB 2133 MHz LPDDR3; AdoptOpenJDK-11.0.11+9.
  • 24. Result - b=5, l=3, proposed: Number of TCP connections between client and broker: 3. Producer initialization time: 463 [ms].
  • 25. Result - b=5, l=3, proposed: [charts: client heap/CPU, broker heap/CPU]
  • 26. Result - b=5, l=3, existing: Number of TCP connections between client and broker: 5. Producer initialization time: 1389 [ms].
  • 27. Result - b=5, l=3, existing: [charts: client heap/CPU, broker heap/CPU]
  • 28. Result - b=5, l=3: [chart: 99pct latency average]
  • 29. Result - b=5, l=5, proposed: Number of TCP connections between client and broker: 5. Producer initialization time: 862 [ms].
  • 30. Result - b=5, l=5, proposed: [charts: client heap/CPU, broker heap/CPU]
  • 31. Result - b=5, l=5: [chart: 99pct latency average]
  • 32. Result - b=10, l=5, proposed: Number of TCP connections between client and broker: 4. Producer initialization time: 1070 [ms].
  • 33. Result - b=10, l=5, proposed: [charts: client heap/CPU, broker heap/CPU]
  • 34. Result - b=10, l=5, existing: Number of TCP connections between client and broker: 5. Producer initialization time: 1147 [ms].
  • 35. Result - b=10, l=5, existing: [charts: client heap/CPU, broker heap/CPU]
  • 36. Result - b=10, l=5: [chart: 99pct latency average]
  • 37. Consideration, client side: Producer initialization time is shorter than “existing”. The number of TCP connections is less than or equal to “existing”. Assuming the cluster has enough brokers, the number of TCP connections with this feature is less than or equal to “limit+1”. Heap usage is lower than “existing”. In some cases, CPU percentage is more stable than “existing”.
  • 38. Consideration:
      Case      | TCP connections (proposed) | TCP connections (existing) | Init time [ms] (proposed) | Init time [ms] (existing)
      b=5, l=3  | 3                          | 5                          | 463                       | 1389
      b=5, l=5  | 5                          | 5                          | 862                       | 1389
      b=10, l=5 | 4                          | 5                          | 1070                      | 1147
  • 39. Consideration, broker side: No significant effect on heap usage. At “b=5, l=3”, CPU percentage on some brokers is lower than “existing”, maybe because topic load isn’t distributed to all brokers. At “b=10, l=5”, CPU percentage per broker is higher than “existing”, maybe because topic load isn’t distributed to all brokers. The number of running brokers (those whose CPU/heap usage increased) is smaller than "existing". The number of running brokers depends on the "active topics" to which producers are actually connected.
  • 40. Consideration:
      Number of brokers whose system resource usage increased
      Case      | proposed | existing
      b=5, l=3  | 2        | 4
      b=5, l=5  | 4        | 4
      b=10, l=5 | 3        | 4
      Number of topics loaded per broker (b=5, l=3)
      Broker  | proposed | existing
      broker0 | 0        | 8
      broker1 | 1        | 0
      broker2 | 2        | 7
      broker3 | 0        | 7
      broker4 | 0        | 8
  • 41. Agenda: 1. Background of the issue 2. Solution 3. Benchmark 4. Conclusion
  • 42. Conclusion: In conclusion, we implemented producer lazy-loading and the partial round-robin feature, fixed the partitioned producer stats collection logic, and checked the performance with a toy example. Future tasks: implement the feature in other clients as well (e.g. C++, Go). I’m really excited to be involved in Apache Pulsar.
  • 43. ©︎2021 Yahoo Japan Corporation All rights reserved.
  • 44. Consideration - latency (Appendix): The latency result for the existing implementation is strange. Retest it and reproduce the strange behavior.
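
Slides 13 to 15 describe lazy-loading and partial round-robin in words. The following is a minimal, self-contained sketch of how the two ideas could fit together. It does not use the Pulsar client library: `InternalProducer`, `PartialRoundRobinRouter`, and `LazyPartitionedProducer` are hypothetical stand-ins, and connecting to the router's first partition at initialization is an assumption made for this illustration, not a statement about the actual implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

public class LazyPartialRoundRobinSketch {

    /** Hypothetical stand-in for a per-partition internal producer. */
    static final class InternalProducer {
        final int partition;
        int sent = 0;

        InternalProducer(int partition) {
            this.partition = partition;
        }

        void send(String message) {
            sent++; // a real producer would publish to the broker here
        }
    }

    /** Round-robin over a randomly chosen subset of at most `limit` partitions. */
    static final class PartialRoundRobinRouter {
        private final List<Integer> chosen;
        private int next = 0;

        PartialRoundRobinRouter(int numPartitions, int limit, Random random) {
            List<Integer> all = new ArrayList<>();
            for (int p = 0; p < numPartitions; p++) {
                all.add(p);
            }
            Collections.shuffle(all, random);
            chosen = new ArrayList<>(all.subList(0, Math.min(limit, numPartitions)));
        }

        /** First partition of the subset, without advancing the round-robin. */
        int firstPartition() {
            return chosen.get(0);
        }

        int choosePartition() {
            int partition = chosen.get(next);
            next = (next + 1) % chosen.size();
            return partition;
        }
    }

    /** Creates an internal producer only when its partition is first chosen. */
    static final class LazyPartitionedProducer {
        private final Map<Integer, InternalProducer> producers = new LinkedHashMap<>();
        private final PartialRoundRobinRouter router;

        LazyPartitionedProducer(PartialRoundRobinRouter router) {
            this.router = router;
            // Initialization step: connect to a single partition only, which is
            // where authentication and authorization would be validated.
            // Using the router's first partition is an assumption of this sketch.
            int first = router.firstPartition();
            producers.put(first, new InternalProducer(first));
        }

        void send(String message) {
            // Sending step: the router picks the partition; the internal
            // producer is created on first use.
            int partition = router.choosePartition();
            producers.computeIfAbsent(partition, InternalProducer::new).send(message);
        }

        int internalProducerCount() {
            return producers.size();
        }
    }

    public static void main(String[] args) {
        // 30 partitions and a limit of 3, matching the "l=3" benchmark case.
        PartialRoundRobinRouter router = new PartialRoundRobinRouter(30, 3, new Random(42));
        LazyPartitionedProducer producer = new LazyPartitionedProducer(router);
        System.out.println("internal producers after init: " + producer.internalProducerCount());
        for (int i = 0; i < 100; i++) {
            producer.send("message-" + i);
        }
        System.out.println("internal producers after sending: " + producer.internalProducerCount());
    }
}
```

In this sketch the producer holds one internal producer after initialization and never more than the limit afterwards, which is the resource saving that the slides aim for.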

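Slides 16 and 17 explain that stats accumulation assumed every partition has the same producer, and that producerStatsKey groups publisher stats that belong to the same partitioned producer. The sketch below illustrates grouping by such a key. `PublisherStats` and its fields are hypothetical names for this example and are not taken from the Pulsar codebase.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ProducerStatsKeySketch {

    /** Hypothetical per-partition publisher stats entry. */
    static final class PublisherStats {
        final String producerStatsKey;
        final double msgRateIn;

        PublisherStats(String producerStatsKey, double msgRateIn) {
            this.producerStatsKey = producerStatsKey;
            this.msgRateIn = msgRateIn;
        }
    }

    /** Sums message rates across partitions, grouping publishers by stats key. */
    static Map<String, Double> accumulate(List<List<PublisherStats>> perPartition) {
        Map<String, Double> totals = new LinkedHashMap<>();
        for (List<PublisherStats> partition : perPartition) {
            for (PublisherStats stats : partition) {
                totals.merge(stats.producerStatsKey, stats.msgRateIn, Double::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // producer-a is connected to partitions 0 and 2, producer-b to 0 and 1,
        // so no partition needs to see every producer.
        List<List<PublisherStats>> perPartition = List.of(
                List.of(new PublisherStats("producer-a", 100.0),
                        new PublisherStats("producer-b", 50.0)),
                List.of(new PublisherStats("producer-b", 50.0)),
                List.of(new PublisherStats("producer-a", 100.0)));
        System.out.println(accumulate(perPartition));
    }
}
```

Because entries are matched by key rather than by position in each partition's publisher list, a producer that is connected to only some partitions still gets a correct total.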
Editor's notes

  1. I’m glad to attend this great summit as a speaker. Now, let’s start the talk, “Reduce redundant producers from partitioned producer”. (If we have enough time, add a demonstration step.)
  2. Here is today’s agenda. First, I will talk about the background of the issue with producer connections. Second is the solution to the issue. Third is a benchmark of the solution. Last is the conclusion.
  3. Let’s start with the background of the issue.
  4. In Yahoo! JAPAN, Apache Pulsar is used in many use cases. For example, notification of content updates, job queuing, etc.
  5. Pulsar is also used in a metrics and logs streaming pipeline. Both metrics and logs are sent to the topic from computing instances such as IaaS, PaaS, and CaaS. They are received by the metrics or logging platform.
  6. In this case, an unspecified number of producers connect to a partitioned topic from computing instances. This causes two issues. First, the number of producers exceeds the limit. Second, redundant producers are created per computing instance.
  7. I’ll explain the first issue. Pulsar has a config maxProducersPerTopic. "Ill-behaved" clients that keep increasing their producers without bound can make the topic producer-full. To solve this issue, we introduced a config that restricts the number of producers and consumers for each IP address. We don’t explain the details in this session because it is already merged. If you are interested, please check this link.
  8. I’ll explain the second issue. When a producer connects to a partitioned topic, it sometimes has redundant internal producers. There are two cases. First, relatively "low-rate" producers. The number of partitions needs to be increased according to total throughput. However, creating internal producers for all partitions is inefficient for producers whose throughput is small enough to be handled by a few partitions. Second, the single partition routing mode. A partitioned producer creates internal producers for all partitions. In this mode, each producer uses only one partition. Therefore, the other internal producers are redundant.
  9. In this session, I will talk about the second one.
  10. Now, let’s talk about the solution.
  11. Here is the concept for solving the issue. Reduce the number of producers to use system resources more efficiently. As you can see, each partitioned producer connects to only some of the partitions.
  12. When a partitioned producer connects to the topic, the client can randomly choose a limited number of internal producers, and therefore a limited number of partitions. To implement this, I introduce a producer lazy-loading feature and a custom routing mode. From here, I will explain the solution in detail.
  13. First, at the initialization step, a partitioned producer connects to only one of the partitions, instead of all partitions, for authentication and authorization. When this internal producer is created, authentication and authorization are validated for the topic.
  14. Second, at the message sending step, partitions are chosen by the message router. Each internal producer is created the first time its partition is chosen by the message router. Therefore, the number of internal producers depends on the message routing policy.
  15. We also add a new custom routing mode, PartialRoundRobinMessageRouter. This mode supports round-robin over a limited number of partitions. Also, when a producer is created with the SinglePartition routing mode, only one internal producer is created.
  16. With the changes described so far, a partitioned producer may connect to only some of the partitions. This causes another issue with partitioned producer stats. Partitioned producer stats are accumulated over all partitions. The accumulation procedure assumes that “all partitions have the same producer”. Therefore, we couldn’t get correct stats for a partial producer like the one on the right. We would like to get correct stats not only for a total producer but also for a partial producer.
  17. To solve this issue, we introduce producerStatsKey. Publisher stats with the same value for this property are accumulated as the same producer. We also make a partitioned producer set the same producerStatsKey on its internal producers. With this feature, we can get correct partitioned producer stats.
  18. Next, I will talk about benchmark results with and without this feature, using a toy example.
  19. Now, let’s talk about the benchmarking.
  20. The criteria of this benchmark are here.
  21. The assumptions of this benchmark are here.
  22. The procedure and variables of this benchmark are here.
  23. The environment of this benchmark is here. In this benchmark, each broker and the client run as processes on a single laptop.
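  24. Here is the result for b=5, l=3 with the proposed implementation.
  25. The result continues here.
  26. Here is the result for b=5, l=3 with the existing implementation.
  27. The result continues here.
  28. Here is the latency for b=5, l=3.
  29. Here is the result for b=5, l=5 with the proposed implementation.
  30. The result continues here.
  31. Here is the latency for b=5, l=5. For the existing implementation, the conditions are the same as for b=5, l=3, so the same result is reused.
  32. Here is the result for b=10, l=5 with the proposed implementation.
  33. The result continues here.
  34. Here is the result for b=10, l=5 with the existing implementation.
  35. The result continues here.
  36. Here is the latency for b=10, l=5.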
  37. Now, let’s consider the results. On the client side, the results suggest the following behavior. In particular, the number of TCP connections is less than or equal to that of “existing”. With this feature, the number of TCP connections is at most “limit+1”, which is expected by design.
  38. Here is part of the results. Please look at the b=5, l=3 row. As you can see, both the number of TCP connections and the producer initialization time are lower than with “existing”.
  39. On the broker side, the results suggest the following behavior. In particular, the number of brokers on which these values increased is smaller than with “existing”, probably because the number of brokers doing work depends on the active topics. Moreover, CPU usage per broker is higher than with “existing” on some brokers, possibly because the topic load is not distributed across all brokers.
  40. Here is part of the results. Please look at the one on the right. This matrix shows the number of topics loaded by each broker. As you can see, with the proposed implementation the topics are not evenly distributed; the ratio is 1 to 2. In contrast, with the existing implementation they are evenly distributed. The uneven distribution biases load and system resource usage across brokers. Therefore, before using this feature, we should choose the partition limit carefully, or implement a smarter message router, for example one that considers the current load.
  41. Next, I will talk about the conclusion.
  42. In conclusion, I talked about the implementation of producer lazy-loading and the partial round-robin feature, and about their performance in a toy example. One of the future tasks is to implement the feature in other clients such as C++ and Go. I’m really excited to be involved in Apache Pulsar.
  43. Thank you for your attention. That is the end of my talk.
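
The lazy-loading and partial round-robin ideas from notes 13 to 15 can be illustrated with a small, self-contained Java sketch. It is not the Pulsar client code: the class LazyPartialProducerSketch, its methods, and the seed parameter are hypothetical names chosen for illustration, and an internal producer is modeled as a plain string rather than a broker connection. The sketch assumes the router picks a random subset of at most "limit" partitions once, then cycles through that subset, and that an internal producer is created only when its partition is first chosen.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration only; not the Apache Pulsar implementation. */
class LazyPartialProducerSketch {
    // Partitions this producer is allowed to use (a random subset of size <= limit).
    private final List<Integer> allowedPartitions;
    // Internal producers, created lazily the first time a partition is chosen.
    private final Map<Integer, String> internalProducers = new HashMap<>();
    private final AtomicInteger counter = new AtomicInteger();

    LazyPartialProducerSketch(int numPartitions, int limit, long seed) {
        if (numPartitions < 1 || limit < 1) {
            throw new IllegalArgumentException("numPartitions and limit must be >= 1");
        }
        List<Integer> all = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            all.add(i);
        }
        // Assumption: the subset is chosen once, at random, when the producer is built.
        Collections.shuffle(all, new Random(seed));
        this.allowedPartitions = new ArrayList<>(all.subList(0, Math.min(limit, numPartitions)));
    }

    /** Round-robin over the limited subset of partitions. */
    int choosePartition() {
        int index = Math.floorMod(counter.getAndIncrement(), allowedPartitions.size());
        return allowedPartitions.get(index);
    }

    /** Routes a message and creates the internal producer on first use. */
    int send(String message) {
        int partition = choosePartition();
        internalProducers.computeIfAbsent(partition, p -> "internal-producer-" + p);
        return partition;
    }

    int createdProducers() {
        return internalProducers.size();
    }

    List<Integer> allowedPartitions() {
        return Collections.unmodifiableList(allowedPartitions);
    }

    public static void main(String[] args) {
        // 5 partitions with a limit of 3.
        LazyPartialProducerSketch producer = new LazyPartialProducerSketch(5, 3, 42L);
        for (int i = 0; i < 10; i++) {
            producer.send("msg-" + i);
        }
        // prints "internal producers created: 3"
        System.out.println("internal producers created: " + producer.createdProducers());
    }
}
```

In this sketch no internal producer exists until the first send, and the count never exceeds the limit. The initial connection used for authentication and authorization in note 13 is not modeled here.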