SlideShare une entreprise Scribd logo
1  sur  29
High performance
queues with Cassandra
Mikalai Alimenkou
http://xpinjection.com
@xpinjection
Kiev, Ukraine
#евромайдан
How to
process
data, master
?

Asynchronously!
Queues usage scenario
More
realistic
scenario
More
realistic
scenario
Are you crazy?
Cassandra for
queues?
So many cool MQ providers
Initial expected loading
Some specific requirements
1

Message
Queue

External Service
Provider
Some specific requirements
Message
Queue

1

External Service
Provider
Some specific requirements
2

Message
Queue

External Service
Provider
Some specific requirements
Message
Queue

2
1
Redeliver
after 1 hour

External Service
Provider
Some specific requirements
2

Message
Queue

External Service
Provider

Redeliver
after 1 hour

Redelivery business logic and external service
hourly based usage limits
Idea came from railways
“Body flow” in regular life
Message batches “station”
QUEUE
NOW

ALMOST READY

+1 HOUR

WAITING

+6 HOURS

WAITING

+12 HOURS

WAITING
System components: Message
MESSAGE = REAL REQUEST DATA WITH UNIQUE ID
1ST FIELD

3rd FIELD

{ field data}

MESSAGE ID

2nd FIELD
{ field data}

{ field data}

ID FORMAT ALLOWS 4096 MESSAGES
PER MILLISECOND FROM ONE NODE
Timestamp

44 bits

Counter

12 bits

Cluster node ID

8 bits
System components: Batch
• Open for at least 1 second
• Closing if opened for > 10 seconds
• Closing if has > 100 messages
Ascending columns ordering
1ST MESSSAGE ID

2nd MESSAGE ID

3rd MESSAGE ID

BATCH ID { opt message data} { opt message data} { opt message data}

ID FORMAT REQUIRE BATCH TO BE OPENED FOR > 1 SECOND
Timestamp
Rounded to seconds

Cluster node ID + Batch Type
Last 3 digits
System components: Queue
• Similar to batch
• Unlimited
• May have batches with past time

Ascending columns ordering
1ST BATCH ID
QUEUE NAME

2nd BATCH ID

3rd BATCH ID

{ processed at }

{ processed at }

{ processed at }
System components: Broker
batches polling
BROKER
check batch time
process batch
PROCESSOR

QUEUE

lock batch for
processing

ZOOKEEPER

•
•
•
•
•

Natural pre-fetch thanks to batches
Easy to control messages processing
Simple concurrency model
Easy scalable between nodes
No high loading on Cassandra
System components: Processor
PROCESSOR
System components: Processor
PROCESSOR
OK
System components: Processor
PROCESSOR
System components: Processor
PROCESSOR

redeliver
on failure
ANOTHER BATCH

•
•
•
•

Tries to process messages as quickly as possible
On error just redeliver message
Messages are processed concurrently
Any redelivery business logic is easy to implement
Warnings and benefits
• Message and batch must be
checked before processing
• Hard to explain “queue” size
• Separate columns for status
tracking of message
• Perform correct compaction
from time to time

• Expected loading is handled
with single node
• Everything works on
commodity hardware
• Single storage for all data
• System is easily scalable and
reliable (no message was
lost)
Show me the code!
@xpinjection
http://xpinjection.com
mikalai.alimenkou@xpinjection.com

Contenu connexe

Tendances

Using LLVM to accelerate processing of data in Apache Arrow
Using LLVM to accelerate processing of data in Apache ArrowUsing LLVM to accelerate processing of data in Apache Arrow
Using LLVM to accelerate processing of data in Apache Arrow
DataWorks Summit
 

Tendances (20)

Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Storm@Twitter, SIGMOD 2014
Storm@Twitter, SIGMOD 2014Storm@Twitter, SIGMOD 2014
Storm@Twitter, SIGMOD 2014
 
Dawid Weiss- Finite state automata in lucene
 Dawid Weiss- Finite state automata in lucene Dawid Weiss- Finite state automata in lucene
Dawid Weiss- Finite state automata in lucene
 
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with  Apache Pulsar and Apache PinotBuilding a Real-Time Analytics Application with  Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
 
Consuming RealTime Signals in Solr
Consuming RealTime Signals in Solr Consuming RealTime Signals in Solr
Consuming RealTime Signals in Solr
 
Understanding and Improving Code Generation
Understanding and Improving Code GenerationUnderstanding and Improving Code Generation
Understanding and Improving Code Generation
 
High Performance Data Lake with Apache Hudi and Alluxio at T3Go
High Performance Data Lake with Apache Hudi and Alluxio at T3GoHigh Performance Data Lake with Apache Hudi and Alluxio at T3Go
High Performance Data Lake with Apache Hudi and Alluxio at T3Go
 
Apache Pulsar with MQTT for Edge Computing - Pulsar Summit Asia 2021
Apache Pulsar with MQTT for Edge Computing - Pulsar Summit Asia 2021Apache Pulsar with MQTT for Edge Computing - Pulsar Summit Asia 2021
Apache Pulsar with MQTT for Edge Computing - Pulsar Summit Asia 2021
 
카프카, 산전수전 노하우
카프카, 산전수전 노하우카프카, 산전수전 노하우
카프카, 산전수전 노하우
 
Advanced nGrinder
Advanced nGrinderAdvanced nGrinder
Advanced nGrinder
 
FSI301 An Architecture for Trade Capture and Regulatory Reporting
FSI301 An Architecture for Trade Capture and Regulatory ReportingFSI301 An Architecture for Trade Capture and Regulatory Reporting
FSI301 An Architecture for Trade Capture and Regulatory Reporting
 
Using LLVM to accelerate processing of data in Apache Arrow
Using LLVM to accelerate processing of data in Apache ArrowUsing LLVM to accelerate processing of data in Apache Arrow
Using LLVM to accelerate processing of data in Apache Arrow
 
Event Sourcing from the Trenches (with examples from .NET)
Event Sourcing from the Trenches (with examples from .NET)Event Sourcing from the Trenches (with examples from .NET)
Event Sourcing from the Trenches (with examples from .NET)
 
ksqlDB로 시작하는 스트림 프로세싱
ksqlDB로 시작하는 스트림 프로세싱ksqlDB로 시작하는 스트림 프로세싱
ksqlDB로 시작하는 스트림 프로세싱
 
Presto At Treasure Data
Presto At Treasure DataPresto At Treasure Data
Presto At Treasure Data
 
카프카(kafka) 성능 테스트 환경 구축 (JMeter, ELK)
카프카(kafka) 성능 테스트 환경 구축 (JMeter, ELK)카프카(kafka) 성능 테스트 환경 구축 (JMeter, ELK)
카프카(kafka) 성능 테스트 환경 구축 (JMeter, ELK)
 
Spark 의 핵심은 무엇인가? RDD! (RDD paper review)
Spark 의 핵심은 무엇인가? RDD! (RDD paper review)Spark 의 핵심은 무엇인가? RDD! (RDD paper review)
Spark 의 핵심은 무엇인가? RDD! (RDD paper review)
 
Stephan Ewen - Scaling to large State
Stephan Ewen - Scaling to large StateStephan Ewen - Scaling to large State
Stephan Ewen - Scaling to large State
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 
Concord: Simple & Flexible Stream Processing on Apache Mesos: Data By The Bay...
Concord: Simple & Flexible Stream Processing on Apache Mesos: Data By The Bay...Concord: Simple & Flexible Stream Processing on Apache Mesos: Data By The Bay...
Concord: Simple & Flexible Stream Processing on Apache Mesos: Data By The Bay...
 

Similaire à High performance queues with Cassandra

Donatas Mažionis, Building low latency web APIs
Donatas Mažionis, Building low latency web APIsDonatas Mažionis, Building low latency web APIs
Donatas Mažionis, Building low latency web APIs
Tanya Denisyuk
 
Melbourne Big Data Meetup Talk: Scaling a Real-Time Anomaly Detection Applica...
Melbourne Big Data Meetup Talk: Scaling a Real-Time Anomaly Detection Applica...Melbourne Big Data Meetup Talk: Scaling a Real-Time Anomaly Detection Applica...
Melbourne Big Data Meetup Talk: Scaling a Real-Time Anomaly Detection Applica...
Paul Brebner
 
ApacheCon2019 Talk: Kafka, Cassandra and Kubernetes at Scale – Real-time Ano...
ApacheCon2019 Talk: Kafka, Cassandra and Kubernetesat Scale – Real-time Ano...ApacheCon2019 Talk: Kafka, Cassandra and Kubernetesat Scale – Real-time Ano...
ApacheCon2019 Talk: Kafka, Cassandra and Kubernetes at Scale – Real-time Ano...
Paul Brebner
 

Similaire à High performance queues with Cassandra (20)

Donatas Mažionis, Building low latency web APIs
Donatas Mažionis, Building low latency web APIsDonatas Mažionis, Building low latency web APIs
Donatas Mažionis, Building low latency web APIs
 
Network protocols and vulnerabilities
Network protocols and vulnerabilitiesNetwork protocols and vulnerabilities
Network protocols and vulnerabilities
 
Debunking Common Myths in Stream Processing
Debunking Common Myths in Stream ProcessingDebunking Common Myths in Stream Processing
Debunking Common Myths in Stream Processing
 
Stephan Ewen - Experiences running Flink at Very Large Scale
Stephan Ewen -  Experiences running Flink at Very Large ScaleStephan Ewen -  Experiences running Flink at Very Large Scale
Stephan Ewen - Experiences running Flink at Very Large Scale
 
Prometheus Is Good for Your Small Startup - ShuttleCloud Corp. - 2016
Prometheus Is Good for Your Small Startup - ShuttleCloud Corp. - 2016Prometheus Is Good for Your Small Startup - ShuttleCloud Corp. - 2016
Prometheus Is Good for Your Small Startup - ShuttleCloud Corp. - 2016
 
EXPERTALKS: Nov 2012 - Web Application Clustering
EXPERTALKS: Nov 2012 - Web Application ClusteringEXPERTALKS: Nov 2012 - Web Application Clustering
EXPERTALKS: Nov 2012 - Web Application Clustering
 
Openstack meetup lyon_2017-09-28
Openstack meetup lyon_2017-09-28Openstack meetup lyon_2017-09-28
Openstack meetup lyon_2017-09-28
 
Kafka overview v0.1
Kafka overview v0.1Kafka overview v0.1
Kafka overview v0.1
 
Kafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQL
Kafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQLKafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQL
Kafka Summit SF 2017 - Kafka Stream Processing for Everyone with KSQL
 
(BDT403) Best Practices for Building Real-time Streaming Applications with Am...
(BDT403) Best Practices for Building Real-time Streaming Applications with Am...(BDT403) Best Practices for Building Real-time Streaming Applications with Am...
(BDT403) Best Practices for Building Real-time Streaming Applications with Am...
 
3.2 Streaming and Messaging
3.2 Streaming and Messaging3.2 Streaming and Messaging
3.2 Streaming and Messaging
 
Ingestion and Dimensions Compute and Enrich using Apache Apex
Ingestion and Dimensions Compute and Enrich using Apache ApexIngestion and Dimensions Compute and Enrich using Apache Apex
Ingestion and Dimensions Compute and Enrich using Apache Apex
 
2.communcation in distributed system
2.communcation in distributed system2.communcation in distributed system
2.communcation in distributed system
 
Scalable IoT platform
Scalable IoT platformScalable IoT platform
Scalable IoT platform
 
OpenStack High Availability
OpenStack High AvailabilityOpenStack High Availability
OpenStack High Availability
 
OpenStack HA
OpenStack HAOpenStack HA
OpenStack HA
 
Melbourne Big Data Meetup Talk: Scaling a Real-Time Anomaly Detection Applica...
Melbourne Big Data Meetup Talk: Scaling a Real-Time Anomaly Detection Applica...Melbourne Big Data Meetup Talk: Scaling a Real-Time Anomaly Detection Applica...
Melbourne Big Data Meetup Talk: Scaling a Real-Time Anomaly Detection Applica...
 
ApacheCon2019 Talk: Kafka, Cassandra and Kubernetes at Scale – Real-time Ano...
ApacheCon2019 Talk: Kafka, Cassandra and Kubernetesat Scale – Real-time Ano...ApacheCon2019 Talk: Kafka, Cassandra and Kubernetesat Scale – Real-time Ano...
ApacheCon2019 Talk: Kafka, Cassandra and Kubernetes at Scale – Real-time Ano...
 
Performance
PerformancePerformance
Performance
 
Data Stream Processing with Apache Flink
Data Stream Processing with Apache FlinkData Stream Processing with Apache Flink
Data Stream Processing with Apache Flink
 

Plus de Mikalai Alimenkou

Бытовая классификация тестировщиков с точки зрения разработчика
Бытовая классификация тестировщиков с точки зрения разработчикаБытовая классификация тестировщиков с точки зрения разработчика
Бытовая классификация тестировщиков с точки зрения разработчика
Mikalai Alimenkou
 

Plus de Mikalai Alimenkou (20)

Rise and fall of Story Points. Capacity based planning from the trenches.
Rise and fall of Story Points. Capacity based planning from the trenches.Rise and fall of Story Points. Capacity based planning from the trenches.
Rise and fall of Story Points. Capacity based planning from the trenches.
 
Static analysis tools as the best friend of QA
Static analysis tools as the best friend of QAStatic analysis tools as the best friend of QA
Static analysis tools as the best friend of QA
 
Modern CI/CD in the microservices world with Kubernetes
Modern CI/CD in the microservices world with KubernetesModern CI/CD in the microservices world with Kubernetes
Modern CI/CD in the microservices world with Kubernetes
 
Saga about distributed business transactions in microservices world
Saga about distributed business transactions in microservices worldSaga about distributed business transactions in microservices world
Saga about distributed business transactions in microservices world
 
Effectiveness tips from Kubernetes trenches by Captain Obvious
Effectiveness tips from Kubernetes trenches by Captain ObviousEffectiveness tips from Kubernetes trenches by Captain Obvious
Effectiveness tips from Kubernetes trenches by Captain Obvious
 
Ride the database in JUnit tests with Database Rider
Ride the database in JUnit tests with Database RiderRide the database in JUnit tests with Database Rider
Ride the database in JUnit tests with Database Rider
 
Wastful waste or why everything is so slow in development
Wastful waste or why everything is so slow in developmentWastful waste or why everything is so slow in development
Wastful waste or why everything is so slow in development
 
Hexagonal architecture with Spring Boot
Hexagonal architecture with Spring BootHexagonal architecture with Spring Boot
Hexagonal architecture with Spring Boot
 
Wastful waste or why everything is so slow in development
Wastful waste or why everything is so slow in developmentWastful waste or why everything is so slow in development
Wastful waste or why everything is so slow in development
 
DevOps checklist or how to understand where is your team in DevOps landscape ...
DevOps checklist or how to understand where is your team in DevOps landscape ...DevOps checklist or how to understand where is your team in DevOps landscape ...
DevOps checklist or how to understand where is your team in DevOps landscape ...
 
DevOps checklist or how to understand where is your team in DevOps landscape
DevOps checklist or how to understand where is your team in DevOps landscapeDevOps checklist or how to understand where is your team in DevOps landscape
DevOps checklist or how to understand where is your team in DevOps landscape
 
Практические трудности в разработке Медкарты для целой страны
Практические трудности в разработке Медкарты для целой страныПрактические трудности в разработке Медкарты для целой страны
Практические трудности в разработке Медкарты для целой страны
 
Hexagonal architecture with Spring Boot [EPAM Java online conference]
Hexagonal architecture with Spring Boot [EPAM Java online conference]Hexagonal architecture with Spring Boot [EPAM Java online conference]
Hexagonal architecture with Spring Boot [EPAM Java online conference]
 
Bro, manage test data like a pro! [QA Fest 2018]
Bro, manage test data like a pro! [QA Fest 2018]Bro, manage test data like a pro! [QA Fest 2018]
Bro, manage test data like a pro! [QA Fest 2018]
 
Agile antipatterns: review after 10 years of practice
Agile antipatterns: review after 10 years of practiceAgile antipatterns: review after 10 years of practice
Agile antipatterns: review after 10 years of practice
 
Hexagonal architecture with Spring Boot
Hexagonal architecture with Spring BootHexagonal architecture with Spring Boot
Hexagonal architecture with Spring Boot
 
Bro, manage test data like a pro!
Bro, manage test data like a pro!Bro, manage test data like a pro!
Bro, manage test data like a pro!
 
Бытовая классификация тестировщиков с точки зрения разработчика
Бытовая классификация тестировщиков с точки зрения разработчикаБытовая классификация тестировщиков с точки зрения разработчика
Бытовая классификация тестировщиков с точки зрения разработчика
 
Code Review tool for personal effectiveness and waste analysis
Code Review tool for personal effectiveness and waste analysisCode Review tool for personal effectiveness and waste analysis
Code Review tool for personal effectiveness and waste analysis
 
Funny stories and anti-patterns from DevOps landscape
Funny stories and anti-patterns from DevOps landscapeFunny stories and anti-patterns from DevOps landscape
Funny stories and anti-patterns from DevOps landscape
 

Dernier

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Dernier (20)

Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 

High performance queues with Cassandra