The document discusses reducing redundant Apache Pulsar producers created by partitioned producers. It presents a solution that limits the number of internal producers per partitioned producer and lazily loads them. Benchmark results show the proposed approach reduces client-side resource usage, such as heap and the number of TCP connections, compared to the existing approach, while having little impact on broker-side resources. The conclusion is that these producer changes can improve efficiency.
I’m glad to attend this great summit as a speaker. Now, let’s start talking about “Reduce redundant producers from partitioned producer”.
(If we have enough time, add a demonstration step.)
Here is today’s agenda. First, I will talk about the background of the issue with producer connections. Second is the solution to the issue. Third is benchmarking of the solution. And last is the conclusion.
Let’s start with the background of the issue.
In Yahoo! JAPAN, Apache Pulsar is used for many use cases, for example, content-update notifications, job queuing, etc.
Also, Pulsar is used in metrics and logs streaming pipeline.
Both metrics and logs are sent to topics from computing instances such as IaaS, PaaS, CaaS, etc. They are received by the metrics or logging platform.
In this case, an unspecified number of producers connect to the partitioned topic from computing instances.
This causes two issues. First, the number of producers exceeds the limit. Second, redundant producers are created per computing instance.
I’ll explain the first issue. Pulsar has a config, maxProducersPerTopic. "Ill-behaved" clients that create producers without limit can make the topic producer-full for everyone.
To solve this issue, we introduced a config that restricts the number of producers and consumers per IP address. I won’t explain the details in this session because it is already merged; if you are interested, please check this link.
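For reference, the merged change exposes broker-side settings along these lines. The property names below are from my reading of broker.conf (maxSameAddressProducersPerTopic and maxSameAddressConsumersPerTopic) and should be checked against your Pulsar version:

```properties
# Limit how many producers/consumers a single IP address may create on one topic.
# 0 means unlimited. Names assumed from broker.conf; verify for your version.
maxSameAddressProducersPerTopic=10
maxSameAddressConsumersPerTopic=10
```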
I’ll explain the second issue. When a producer connects to a partitioned topic, it sometimes has redundant internal producers. Some cases are as below.
First, relatively "low-rate" producers. The number of partitions needs to be increased according to the total throughput. However, creating internal producers for all partitions is inefficient for producers whose throughput is small enough to be handled by a few partitions.
Second, producers using the single-partition routing mode. A partitioned producer creates internal producers for all partitions, but in this case each producer uses only one partition. Therefore, the other internal producers are redundant.
In this session, I will talk about the second one.
Now, let’s talk about the solution.
Here is the concept for solving the issue: reduce the number of producers so that system resources are used more efficiently. As you can see, each partitioned producer connects to only part of the partitions.
When a partitioned producer connects to the topic, the client can randomly choose a limited number of internal producers, and therefore a limited number of partitions as well.
To implement this, I introduced a producer lazy-loading feature and a custom routing mode. From here, I will explain the solutions in detail.
First, at the initialization step, a partitioned producer connects to only one of the partitions for authentication and authorization, instead of all partitions. When each internal producer is created, authentication and authorization are validated for that topic.
Second, at the message-sending step, partitions are chosen by the message router, and each internal producer is created the first time its partition is chosen by the router.
Therefore, the number of internal producers depends on the message routing policy.
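The lazy-loading flow above can be sketched as a toy model. The class and names below are purely illustrative, not the actual Pulsar client code: the point is that an internal producer is created only the first time the router picks its partition.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntUnaryOperator;

// Toy model of producer lazy-loading (illustrative, not Pulsar's real classes).
class LazyPartitionedProducer {
    private final int numPartitions;
    private final IntUnaryOperator router; // message sequence number -> partition index
    private final Map<Integer, String> internalProducers = new ConcurrentHashMap<>();
    private int msgCount = 0;

    LazyPartitionedProducer(int numPartitions, IntUnaryOperator router) {
        this.numPartitions = numPartitions;
        this.router = router;
    }

    // The internal producer for a partition is created only on its first use.
    String send(String msg) {
        int partition = router.applyAsInt(msgCount++) % numPartitions;
        return internalProducers.computeIfAbsent(partition, p -> "internal-producer-" + p);
    }

    // Number of partitions this partitioned producer actually connected to.
    int connectedPartitions() {
        return internalProducers.size();
    }
}
```

With a router that always returns the same index (the SinglePartition case), only one internal producer is ever created, no matter how many partitions the topic has.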
We also added a new custom routing mode, PartialRoundRobinMessageRouter. This mode supports round-robin over a limited number of partitions. In addition, when a producer is created with the SinglePartition routing mode, only one internal producer is created.
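Here is a minimal sketch of the idea behind partial round-robin routing: each producer fixes a random subset of at most `limit` partitions and round-robins over that subset only. This is my own simplified version; the real router in the Java client may differ in details.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of partial round-robin routing (illustrative, simplified).
class PartialRoundRobinRouter {
    private final List<Integer> chosen;          // random subset, fixed per producer
    private final AtomicInteger counter = new AtomicInteger();

    PartialRoundRobinRouter(int numPartitions, int limit) {
        List<Integer> all = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) all.add(i);
        Collections.shuffle(all);                // each producer picks its own subset
        chosen = all.subList(0, Math.min(limit, numPartitions));
    }

    // Round-robin over the chosen subset only, never over all partitions.
    int choosePartition() {
        return chosen.get(counter.getAndIncrement() % chosen.size());
    }
}
```

Because different producers shuffle independently, the partitions still get covered across many producers, while each single producer touches at most `limit` of them.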
With the implementation described so far, a partitioned producer can connect to only part of the partitions. This causes another issue, related to partitioned producer stats.
Partitioned producer stats are accumulated over all partitions, and the accumulation procedure assumes that "all partitions have the same producer".
Therefore, we couldn’t get correct stats for a partial producer like the one on the right. We would like to get correct stats not only for the total producer but also for partial producers.
To solve this issue, we introduced producerStatsKey. Publisher stats with the same value for this property are accumulated as the same producer.
We also added behavior so that a partitioned producer sets the same producerStatsKey on each of its internal producers.
With this feature, we can get correct partitioned producer stats.
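The accumulation rule can be sketched like this. The field and method names are simplified stand-ins, not Pulsar's real stats classes: per-partition publisher entries that share a producerStatsKey are summed as one logical producer.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of stats accumulation keyed by producerStatsKey (illustrative names).
class StatsAggregator {
    // Each entry: producerStatsKey of one internal producer -> its msgRateIn.
    // Entries sharing the same key are summed as a single logical producer.
    static Map<String, Double> accumulate(List<Map.Entry<String, Double>> perPartitionStats) {
        Map<String, Double> total = new HashMap<>();
        for (Map.Entry<String, Double> e : perPartitionStats) {
            total.merge(e.getKey(), e.getValue(), Double::sum);
        }
        return total;
    }
}
```

A partial producer that touches only some partitions still shows up as one row per producerStatsKey, so its totals are no longer mixed with other producers on the same topic.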
Next, I will talk about the benchmarking results with and without this feature, using a toy example.
Now, let’s talk about the benchmarking.
The criteria of this benchmark are here.
The assumptions of this benchmark are here.
The procedure and variables of this benchmark are here.
The environment of this benchmark is here.
In this benchmark, each broker and the client are run as processes on a single laptop.
The results for b=5, l=3 (proposed) are here,
and here.
The results for b=5, l=3 (existing) are here,
and here.
The latency for b=5, l=3 is here.
The results for b=5, l=5 (proposed) are here,
and here.
The latency for b=5, l=5 is here.
For the existing side, the conditions are the same as for b=5, l=3, so we reuse that result.
The results for b=10, l=5 (proposed) are here,
and here.
The results for b=10, l=5 (existing) are here,
and here.
The latency for b=10, l=5 is here.
Now, let’s consider the results. For the client side, the results suggest the following behavior. In particular, the number of TCP connections is less than or equal to that of "existing". If we use the feature, the number of TCP connections is at most "limit + 1", which follows directly from the design.
Here is part of the results. Please look at the b=5, l=3 row. As you can see, both the number of TCP connections and the producer initialization time are smaller than for "existing".
For the broker side, the results suggest the following behavior. In particular, the number of brokers in which these values increased is smaller than for "existing", probably because the number of active brokers depends on the active topics.
Moreover, the CPU percentage per broker is greater than for "existing" on some brokers, perhaps because the topic load isn’t distributed across all brokers.
Here is part of the results. Please look at the right one. This matrix shows the number of topics loaded by each broker. As you can see, the proposed one is not completely distributed; the ratio is 1 to 2. In contrast, the existing one is evenly distributed.
This causes load and system-resource bias between brokers. Therefore, before using this feature, we should take care with the config that limits the number of partitions, or implement a smarter message router, for example one that considers the current load.
Next, I will talk about conclusion.
In conclusion, I talked about the implementation of the producer lazy-loading and partial round-robin features, and about their performance in a toy example.
One of the future tasks is to implement this feature in the other clients, such as C++, Go, etc.
I’m really excited to be involved in Apache Pulsar.
Thank you for your attention. That’s all for my talk.