Apache Pulsar is being used for an increasingly broad array of data ingestion tasks. When operating at scale, it's very important to ensure that the system can make use of all the available resources. Karthik Ramasamy and Matteo Merli share insights into the design decisions and the implementation techniques that allow Pulsar to achieve high performance with strong durability guarantees.
3. WHAT IS APACHE PULSAR?
• Multi-tenancy: a single cluster can support many tenants and use cases
• Ordering: guaranteed ordering
• Durability: data replicated and synced to disk
• Delivery guarantees: at least once, at most once, and effectively once
• Highly scalable: can support millions of topics
• Unified messaging model: supports both topic and queue semantics in a single model
• Geo-replication: out-of-the-box support for geographically distributed applications
• High throughput: can reach 1.8 M messages/s in a single partition
• Low latency: publish latency of 5 ms at the 99th percentile
6. DISTRIBUTED VS VERTICAL
• The focus is very different
• While both target overall system performance, distributed systems generally focus on distributed performance
7. DISTRIBUTED SYSTEMS PERFORMANCE
• Key factors:
• How different components interact
• How data is replicated
• Can each component make progress while waiting for the others?
8. STATEFUL SYSTEMS
• Macro optimizations typically yield orders of magnitude differences
• Ensure throughput is not bottlenecked by waiting
• In failure path (eg: how to replace a failed node)
• In ops tasks (eg: how to expand a cluster)
9. STATEFUL SYSTEMS
• Stateful systems can become unbalanced when traffic changes
• The system needs to be designed to allow for quick reaction,
distributing the load across all nodes
10. VERTICAL OPTIMIZATIONS
• Ensure a single machine can give the max throughput
• Optimize thread access
• Concurrent data structures
• Micro-Profiling
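To make the thread-access point above concrete, here is a minimal, hypothetical sketch (the class and names are illustrative, not Pulsar's code): all work for a given key is pinned to the same single-threaded executor, so the state it touches never needs a mutex.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: operations for a given key (e.g. a topic name) are
// hashed onto the same single-threaded executor, so per-key state is only
// ever touched by one thread and no lock is needed on the hot path.
public class KeyOrderedExecutor {
    private final ExecutorService[] executors;

    public KeyOrderedExecutor(int numThreads) {
        executors = new ExecutorService[numThreads];
        for (int i = 0; i < numThreads; i++) {
            executors[i] = Executors.newSingleThreadExecutor();
        }
    }

    public void execute(String key, Runnable task) {
        // Same key -> same thread -> operations on that key are serialized
        int idx = (key.hashCode() & Integer.MAX_VALUE) % executors.length;
        executors[idx].execute(task);
    }
}
```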
12. SEGMENT CENTRIC STORAGE
• In addition to partitioning, messages are stored in segments (based on time and size)
• Segments are independent from each other and spread across all storage nodes
13. SEGMENT CENTRIC
• Unbounded log storage
• Instant scaling without data rebalancing
• High write and read availability via maximized data placement options
• Fast replica repair — many-to-many read
15. COMPARISON WITH APACHE KAFKA
• In Kafka, partitions are sticky to brokers
• A single partition is stored entirely in a single node
• Retention is limited by a single node storage capacity
• Failure recovery and capacity expansion require “rebalancing”
• Rebalancing has a big impact on the system, affecting regular traffic
17. DATA PATH
2 — Broker writes in parallel to N replicas
18. DATA PATH
3 — Wait for a quorum of acks from bookies
19. DATA PATH
4 — Send ack to producer — Dispatch to consumer
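The steps above can be summarized in a small sketch of the broker-side write path. The Replica interface and QuorumWriter class below are hypothetical stand-ins, not Pulsar's actual types; the point is only the parallel write followed by a quorum of acks.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal sketch of the quorum-ack write path: the broker sends the entry to
// all N replicas in parallel and completes the write as soon as Q acks arrive.
// "Replica" is a hypothetical interface standing in for a bookie connection.
interface Replica {
    CompletableFuture<Void> write(byte[] entry);
}

class QuorumWriter {
    private final List<Replica> replicas; // N storage nodes
    private final int ackQuorum;          // Q acks required before acking the producer

    QuorumWriter(List<Replica> replicas, int ackQuorum) {
        this.replicas = replicas;
        this.ackQuorum = ackQuorum;
    }

    CompletableFuture<Void> writeEntry(byte[] entry) {
        CompletableFuture<Void> done = new CompletableFuture<>();
        AtomicInteger acks = new AtomicInteger();
        for (Replica r : replicas) {
            // 2 - write in parallel to all replicas
            r.write(entry).thenRun(() -> {
                // 3 - once a quorum of acks is received...
                if (acks.incrementAndGet() == ackQuorum) {
                    // 4 - ...ack the producer and dispatch to consumers
                    done.complete(null);
                }
            });
        }
        return done;
    }
}
```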
20. BOOKKEEPER REPLICATION MODEL
• Single writer (Pulsar broker)
• Write in parallel to multiple storage nodes
• Wait for a configurable number of acks
• Supports quorum writes (eg: write 3 nodes — wait 2 acks)
• Perform recovery only after the writer crashes
• Establish what the last committed entry was and “seal” the segment
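This model maps directly onto the BookKeeper client API. A minimal sketch, assuming a local ZooKeeper at localhost:2181 and example quorum sizes:

```java
import org.apache.bookkeeper.client.BookKeeper;
import org.apache.bookkeeper.client.LedgerHandle;

// Minimal sketch of the single-writer quorum model with the BookKeeper client.
// The ZooKeeper address, password and quorum sizes are example values.
public class QuorumWriteExample {
    public static void main(String[] args) throws Exception {
        BookKeeper bk = new BookKeeper("localhost:2181");

        // ensemble=3, writeQuorum=3, ackQuorum=2:
        // entries are written to 3 bookies, the add completes after 2 acks.
        LedgerHandle ledger = bk.createLedger(3, 3, 2,
                BookKeeper.DigestType.CRC32, "password".getBytes());

        long entryId = ledger.addEntry("hello".getBytes());
        System.out.println("committed entry " + entryId);

        // Closing the ledger "seals" it: the last committed entry id is
        // written to metadata, so readers agree on where the segment ends.
        ledger.close();
        bk.close();
    }
}
```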
22. LIMITATIONS OF KAFKA REPLICATION
• When followers are “in-sync”, the leader will have to wait for them —
cannot prune the slowest follower
• Leader election happens per partition
• To ensure ordering, only 1 message (or batch) can be outstanding —
limits throughput
• Reads can only happen from the leader broker
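On the client side, the ordering constraint corresponds to capping in-flight requests. A minimal sketch of the relevant producer properties (the broker address is an example value, and this assumes idempotence is not enabled):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;

// Minimal sketch: with retries enabled and idempotence off, ordering is only
// guaranteed when at most one request is in flight per connection, which is
// the throughput limitation described above.
public class OrderedProducerConfig {
    public static Properties build() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // example value
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        props.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE);
        props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 1);
        return props;
    }
}
```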
24. STORAGE
• Disk access patterns can lead to order-of-magnitude differences
• Systems that rely on the page cache have unpredictable performance
• The page cache is RAM-speed until the system is under stress
• After that, memory accesses can take hundreds of milliseconds
26. BOOKKEEPER STORAGE
• IO isolation between write and read operations
• Slow consumers won’t impact latency
• Very effective IO patterns:
• Journal — append only and no reads
• Storage device — bulk write and sequential reads
• The number of files is independent of the number of topics
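In a bookie's configuration the journal and the ledger storage can be pointed at separate devices, which is what gives these IO patterns their isolation. A minimal sketch using BookKeeper's ServerConfiguration, with example mount points:

```java
import org.apache.bookkeeper.conf.ServerConfiguration;

// Minimal sketch: put the bookie's append-only journal and its ledger
// storage on separate devices so writes and reads do not compete for IO.
// The mount points are example values.
public class BookieIoIsolation {
    public static ServerConfiguration build() {
        ServerConfiguration conf = new ServerConfiguration();
        conf.setJournalDirName("/mnt/journal");                 // append only, no reads in steady state
        conf.setLedgerDirNames(new String[] {"/mnt/storage"});  // bulk writes, sequential reads
        return conf;
    }
}
```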
27. KAFKA STORAGE
• Multiple files per partition — each segment has its own data
• Lots of file descriptors needed
• Page cache only works well until the active data set exceeds RAM size
• IO is scattered throughout the disk
29. OPTIMIZATIONS
• Payload Buffer pooling — Direct memory — No heap pollution
• Object pooling in data path — minimize GC work
• Serialize operations to a thread to avoid mutex contention
• Pulsar brokers act as a “proxy” — payloads are forwarded with zero copies from producers to storage and consumers
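A rough sketch of the pooling techniques listed above, using Netty's pooled direct buffers and Recycler. The SendContext class is illustrative, not Pulsar's actual data path:

```java
import io.netty.buffer.ByteBuf;
import io.netty.buffer.PooledByteBufAllocator;
import io.netty.util.Recycler;

// Minimal sketch of pooling on the hot path: payloads live in pooled direct
// buffers (off-heap, no GC pressure) and the per-message context object is
// recycled instead of reallocated. Illustrative only, not Pulsar's code.
public class PoolingSketch {

    // Per-message context, recycled via Netty's Recycler to minimize GC work.
    static final class SendContext {
        private static final Recycler<SendContext> RECYCLER = new Recycler<SendContext>() {
            @Override
            protected SendContext newObject(Handle<SendContext> handle) {
                return new SendContext(handle);
            }
        };

        private final Recycler.Handle<SendContext> handle;
        ByteBuf payload;

        private SendContext(Recycler.Handle<SendContext> handle) {
            this.handle = handle;
        }

        static SendContext get(ByteBuf payload) {
            SendContext ctx = RECYCLER.get();
            ctx.payload = payload;
            return ctx;
        }

        void recycle() {
            payload = null;
            handle.recycle(this);
        }
    }

    public static void main(String[] args) {
        // Direct (off-heap) pooled buffer: payload bytes never touch the heap.
        ByteBuf payload = PooledByteBufAllocator.DEFAULT.directBuffer(1024);
        payload.writeBytes("hello".getBytes());

        SendContext ctx = SendContext.get(payload);
        // ... hand ctx to the IO thread; when the write completes:
        ctx.payload.release();   // return the buffer to the pool
        ctx.recycle();           // return the context object to the pool
    }
}
```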
32. BENCHMARK FRAMEWORK
• Designed to measure performance of distributed messaging systems
• Supports various “drivers” (Kafka, Pulsar, RocketMQ, RabbitMQ)
• Automated deployment in EC2
• Configure workloads through a YAML file
34. BENCHMARK RESULTS
• Testing goals
• Throughput & latency under different conditions
• Min 2 guaranteed copies
• Running on 3 EC2 VMs with local SSDs
35. KAFKA SETTINGS
• Topic settings
replicationFactor=3
min.insync.replicas=2
log.flush.interval.ms= # Using default: means no fsyncs
• Kafka producer config
acks=all
linger.ms=1
batch.size=131072
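The settings above map to the Kafka clients roughly as in the following sketch; the topic name, partition count, bootstrap servers and serializers are example values, not part of the original benchmark configuration.

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.ByteArraySerializer;

// Minimal sketch mapping the listed benchmark settings onto the Kafka clients.
public class KafkaBenchmarkSetup {
    public static void main(String[] args) throws Exception {
        // Topic settings: replication factor 3, min.insync.replicas=2
        Properties adminProps = new Properties();
        adminProps.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (AdminClient admin = AdminClient.create(adminProps)) {
            NewTopic topic = new NewTopic("benchmark-topic", 1, (short) 3)
                    .configs(Map.of("min.insync.replicas", "2"));
            admin.createTopics(List.of(topic)).all().get();
        }

        // Producer settings: acks=all, linger.ms=1, batch.size=131072
        Properties producerProps = new Properties();
        producerProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        producerProps.put(ProducerConfig.ACKS_CONFIG, "all");
        producerProps.put(ProducerConfig.LINGER_MS_CONFIG, 1);
        producerProps.put(ProducerConfig.BATCH_SIZE_CONFIG, 131072);
        producerProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());
        producerProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class.getName());

        try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(producerProps)) {
            // produce benchmark messages here
        }
    }
}
```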
36. PULSAR / BOOKKEEPER SETTINGS
• Use ensemble=3 write=3 ack=2
• Write to 3 bookies and wait for 2 acks
• Data is synced to disk before the ack
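One way to express this ensemble/write/ack configuration is per namespace through the Pulsar admin API. A minimal sketch; the admin URL and namespace name are assumptions:

```java
import org.apache.pulsar.client.admin.PulsarAdmin;
import org.apache.pulsar.common.policies.data.PersistencePolicies;

// Minimal sketch: set ensemble=3, writeQuorum=3, ackQuorum=2 for a namespace.
// The admin URL and namespace are example values.
public class SetPersistence {
    public static void main(String[] args) throws Exception {
        PulsarAdmin admin = PulsarAdmin.builder()
                .serviceHttpUrl("http://localhost:8080")
                .build();

        // (ensemble, writeQuorum, ackQuorum, markDeleteRate)
        admin.namespaces().setPersistence("public/default",
                new PersistencePolicies(3, 3, 2, 0.0));

        admin.close();
    }
}
```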
43. OPTIMIZING FOR LOW LATENCY
• Testing at a smaller throughput for sub-millisecond latency
• Tested on a bare-metal server
• Single machine to isolate impact of slow networks
44. HARDWARE SETUP
• 1 Machine — Bare metal
• 12 CPU cores — Intel(R) Xeon(R) CPU E5-2687W v4 @ 3.00GHz
• 128 GB RAM
• 2 x 1.2TB NVMe disks