SlideShare une entreprise Scribd logo
1  sur  24
Télécharger pour lire hors ligne
Building Next
Generation Drivers:
Optimizing Performance
in Go and Rust
Piotr Grabowski, Software Team Leader
Piotr Grabowski
■ Software Team Leader at ScyllaDB responsible for all
ScyllaDB drivers, ScyllaDB Kafka Connectors (ScyllaDB
Sink Connector and ScyllaDB CDC Source Connector)
■ Joined ScyllaDB 2.5 year ago
■ Introduction
■ ScyllaDB Go Driver
■ ScyllaDB Rust Driver
■ Bindings to ScyllaDB Rust Driver
Agenda
Introduction
Drivers 101
■ Drivers (in this presentation) - libraries that allow sending queries to ScyllaDB
■ Primary protocol: CQL (Cassandra Query Language) protocol
■ TCP
■ ScyllaDB supports CQL v4
■ Frame-based protocol, supporting multiple streams
■ Supports LZ4 and Snappy compression
■ ScyllaDB drivers support shard awareness:
■ Driver can connect to a specific shard of ScyllaDB
Drivers 101 - Role of Drivers
■ The role of drivers:
■ Serialization/deserialization of CQL frames
■ Serialization/deserialization of ScyllaDB types
■ Querying and maintaining metadata about tables/nodes
■ Routing requests to correct nodes (and shards)
■ Sending request across network
■ Conveniently constructing and executing queries in your language of choice:
■ gocqlx
■ Java Driver’s Mapper interface
Drivers 101 - Performance
■ How can the driver improve performance?
■ Shard awareness: sending the query to a correct shard
■ Partitioners: ScyllaDB’s CDC (Change Data Capture) implements a custom
partitioner which determines a node to send the query to
■ LWT Optimization: consistently prefer a single replica when executing a LWT
query to avoid Paxos conflicts
■ Optimizing hot paths in the driver:
■ Serialization/deserialization
■ Routing code
■ Avoiding copies, allocations and locks
ScyllaDB Go Driver
■ A new ScyllaDB driver for the Go language
■ Developed by a University of Warsaw student team for their Bachelor’s
thesis
■ Focus of the project: performance
■ Avoiding mutexes
■ Avoiding allocations - reducing GC impact
■ Faster B-trees with generics
ScyllaDB Go Driver
ScyllaDB Go Driver - Mutexes
gocql ScyllaDB Go Driver
■ Approach inspired by ScyllaDB’s shared-nothing architecture
■ Techniques:
■ Better overall picture of the architecture
■ Clear, efficient communication between components
■ Using idiomatic Go channels and atomics
■ Result:
■ The driver only uses a single mutex,
local to a TCP connection
■ Reducing race conditions
ScyllaDB Go Driver - Mutexes
■ Initial benchmarks of gocql have shown that the performance was heavily
influenced by the Go’s Garbage Collector
■ ScyllaDB Go Driver performs 5x less allocations compared to gocql
■ The result of:
■ Meticulously designing each component of the code
■ Faster B-Trees with Generics
■ Reusing buffers owned by writer/reader
ScyllaDB Go Driver - Memory
■ Go 1.18 introduced support for generics
■ Using generics in B-tree implementation yielded 40% performance
improvement compared to Google’s btree implementation
■ The implementation performs zero allocations in all operations (except
cloning)
■ By replacing interface usage with generics, this approach
avoids variables escaping to the heap
■ Google’s btree maintainers implemented BTreeG
Faster B-Trees with Generics
■ During profiling of different drivers, we noticed a difference in number of
syscalls different drivers perform
■ Difference in network syscalls:
■ Observed using syscount and socksize tools
■ Fewer sendmsg (or similar) syscalls…
■ …but with larger data buffers
■ Instead of sending each CQL frame separately,
better to send multiple ones at a time
ScyllaDB Go Driver - Request Coalescing
ScyllaDB Go Driver - Request Coalescing
ScyllaDB Go Driver - Benchmark Results
ScyllaDB Rust Driver
■ The idea was born during a hackathon in 2020
■ Over the last 3 years we continued the development
■ Uses Tokio framework
■ The driver is now feature complete, supporting many advanced features:
■ Shard awareness
■ Asynchronous interface with support for large concurrency
■ Compression
■ All CQL types
■ Speculative execution
■ TLS support
ScyllaDB Rust Driver
■ Issue raised by the author of latte - a benchmark tool for ScyllaDB
and Cassandra
■ The driver had problems scaling with high concurrency of requests
■ We managed to identify a root cause in the implementation of
FuturesUnordered, a utility to gather many futures and wait for them
■ Due to cooperative scheduling in Tokio, it was possible for
FuturesUnordered to iterate over all futures each time
it was polled
■ A fix was merged to Tokio to limit the number of
Futures iterated over in each poll
ScyllaDB Rust Driver - O(N²) in Tokio?
■ Rack-aware load balancing
■ Reduce the cost of querying ScyllaDB nodes in other racks (corresponding for example to
AWS Availability Zones)
■ Reduce the latency by querying the nearest rack
■ Iterator-based deserialization
■ The current implementation deserializes row data
into equivalent of Vec<Vec<Option<CqlValue>>
■ Skip materializing all rows into vector,
deserialize on-the-fly
■ Load balancing refactor
■ Main goal: reduce number of allocations and atomic
operations while building the query plan,
especially on the happy path
ScyllaDB Rust Driver - Ongoing Efforts
Bindings to ScyllaDB
Rust Driver
■ When benchmarking ScyllaDB Rust Driver against other drivers, we
measured it was the most performant driver, beating the C++ driver
■ Why not develop a way to use ScyllaDB Rust Driver from C++ code?
■ Benefits of a unified core:
■ Higher performance
■ Easier maintenance
■ Fewer bugs
Bindings to ScyllaDB Rust Driver
■ We started development for the C/C++ language
■ C++ bindings to the Rust driver; the same API as the original C++ driver
■ Drop-in replacement (just replacing .so file)
■ The resulting project has an order-of-magnitude fewer LoC
■ Better stability, fewer problems compared to the original C++ driver
Bindings to ScyllaDB Rust Driver - C/C++
Thank You
Stay in Touch
Piotr Grabowski
avelanarius

Contenu connexe

Similaire à Building Next Generation Drivers: Optimizing Performance in Go and Rust

DevEx | there’s no place like k3s
DevEx | there’s no place like k3sDevEx | there’s no place like k3s
DevEx | there’s no place like k3sHaggai Philip Zagury
 
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsNoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsScyllaDB
 
Seattle Spark Meetup Mobius CSharp API
Seattle Spark Meetup Mobius CSharp APISeattle Spark Meetup Mobius CSharp API
Seattle Spark Meetup Mobius CSharp APIshareddatamsft
 
Seastar / ScyllaDB, or how we implemented a 10-times faster Cassandra
Seastar / ScyllaDB,  or how we implemented a 10-times faster CassandraSeastar / ScyllaDB,  or how we implemented a 10-times faster Cassandra
Seastar / ScyllaDB, or how we implemented a 10-times faster CassandraTzach Livyatan
 
Optimizing Performance in Rust for Low-Latency Database Drivers
Optimizing Performance in Rust for Low-Latency Database DriversOptimizing Performance in Rust for Low-Latency Database Drivers
Optimizing Performance in Rust for Low-Latency Database DriversScyllaDB
 
DX12 & Vulkan: Dawn of a New Generation of Graphics APIs
DX12 & Vulkan: Dawn of a New Generation of Graphics APIsDX12 & Vulkan: Dawn of a New Generation of Graphics APIs
DX12 & Vulkan: Dawn of a New Generation of Graphics APIsAMD Developer Central
 
State of serverless
State of serverlessState of serverless
State of serverlessAnurag Saran
 
Migrating ETL Workflow to Apache Spark at Scale in Pinterest
Migrating ETL Workflow to Apache Spark at Scale in PinterestMigrating ETL Workflow to Apache Spark at Scale in Pinterest
Migrating ETL Workflow to Apache Spark at Scale in PinterestDatabricks
 
Build Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBBuild Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBScyllaDB
 
Survey of High Performance NoSQL Systems
Survey of High Performance NoSQL SystemsSurvey of High Performance NoSQL Systems
Survey of High Performance NoSQL SystemsScyllaDB
 
AdGear Use Case with Scylla - 1M Queries Per Second with Single-Digit Millise...
AdGear Use Case with Scylla - 1M Queries Per Second with Single-Digit Millise...AdGear Use Case with Scylla - 1M Queries Per Second with Single-Digit Millise...
AdGear Use Case with Scylla - 1M Queries Per Second with Single-Digit Millise...ScyllaDB
 
Our Multi-Year Journey to a 10x Faster Confluent Cloud
Our Multi-Year Journey to a 10x Faster Confluent CloudOur Multi-Year Journey to a 10x Faster Confluent Cloud
Our Multi-Year Journey to a 10x Faster Confluent CloudHostedbyConfluent
 
Introducing Scylla Open Source 4.0
Introducing Scylla Open Source 4.0Introducing Scylla Open Source 4.0
Introducing Scylla Open Source 4.0ScyllaDB
 
Scala & Spark(1.6) in Performance Aspect for Scala Taiwan
Scala & Spark(1.6) in Performance Aspect for Scala TaiwanScala & Spark(1.6) in Performance Aspect for Scala Taiwan
Scala & Spark(1.6) in Performance Aspect for Scala TaiwanJimin Hsieh
 
Using Kafka to scale database replication
Using Kafka to scale database replicationUsing Kafka to scale database replication
Using Kafka to scale database replicationVenu Ryali
 
LCU14 310- Cisco ODP v2
LCU14 310- Cisco ODP v2LCU14 310- Cisco ODP v2
LCU14 310- Cisco ODP v2Linaro
 
MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...
MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...
MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...ScyllaDB
 
Stream, stream, stream: Different streaming methods with Spark and Kafka
Stream, stream, stream: Different streaming methods with Spark and KafkaStream, stream, stream: Different streaming methods with Spark and Kafka
Stream, stream, stream: Different streaming methods with Spark and KafkaItai Yaffe
 

Similaire à Building Next Generation Drivers: Optimizing Performance in Go and Rust (20)

DevEx | there’s no place like k3s
DevEx | there’s no place like k3sDevEx | there’s no place like k3s
DevEx | there’s no place like k3s
 
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration LogisticsNoSQL Database Migration Masterclass - Session 3: Migration Logistics
NoSQL Database Migration Masterclass - Session 3: Migration Logistics
 
Seattle Spark Meetup Mobius CSharp API
Seattle Spark Meetup Mobius CSharp APISeattle Spark Meetup Mobius CSharp API
Seattle Spark Meetup Mobius CSharp API
 
Seastar / ScyllaDB, or how we implemented a 10-times faster Cassandra
Seastar / ScyllaDB,  or how we implemented a 10-times faster CassandraSeastar / ScyllaDB,  or how we implemented a 10-times faster Cassandra
Seastar / ScyllaDB, or how we implemented a 10-times faster Cassandra
 
Optimizing Performance in Rust for Low-Latency Database Drivers
Optimizing Performance in Rust for Low-Latency Database DriversOptimizing Performance in Rust for Low-Latency Database Drivers
Optimizing Performance in Rust for Low-Latency Database Drivers
 
DX12 & Vulkan: Dawn of a New Generation of Graphics APIs
DX12 & Vulkan: Dawn of a New Generation of Graphics APIsDX12 & Vulkan: Dawn of a New Generation of Graphics APIs
DX12 & Vulkan: Dawn of a New Generation of Graphics APIs
 
NoSQL Databases
NoSQL DatabasesNoSQL Databases
NoSQL Databases
 
State of serverless
State of serverlessState of serverless
State of serverless
 
Migrating ETL Workflow to Apache Spark at Scale in Pinterest
Migrating ETL Workflow to Apache Spark at Scale in PinterestMigrating ETL Workflow to Apache Spark at Scale in Pinterest
Migrating ETL Workflow to Apache Spark at Scale in Pinterest
 
Build Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDBBuild Low-Latency Applications in Rust on ScyllaDB
Build Low-Latency Applications in Rust on ScyllaDB
 
Survey of High Performance NoSQL Systems
Survey of High Performance NoSQL SystemsSurvey of High Performance NoSQL Systems
Survey of High Performance NoSQL Systems
 
ArangoDB
ArangoDBArangoDB
ArangoDB
 
AdGear Use Case with Scylla - 1M Queries Per Second with Single-Digit Millise...
AdGear Use Case with Scylla - 1M Queries Per Second with Single-Digit Millise...AdGear Use Case with Scylla - 1M Queries Per Second with Single-Digit Millise...
AdGear Use Case with Scylla - 1M Queries Per Second with Single-Digit Millise...
 
Our Multi-Year Journey to a 10x Faster Confluent Cloud
Our Multi-Year Journey to a 10x Faster Confluent CloudOur Multi-Year Journey to a 10x Faster Confluent Cloud
Our Multi-Year Journey to a 10x Faster Confluent Cloud
 
Introducing Scylla Open Source 4.0
Introducing Scylla Open Source 4.0Introducing Scylla Open Source 4.0
Introducing Scylla Open Source 4.0
 
Scala & Spark(1.6) in Performance Aspect for Scala Taiwan
Scala & Spark(1.6) in Performance Aspect for Scala TaiwanScala & Spark(1.6) in Performance Aspect for Scala Taiwan
Scala & Spark(1.6) in Performance Aspect for Scala Taiwan
 
Using Kafka to scale database replication
Using Kafka to scale database replicationUsing Kafka to scale database replication
Using Kafka to scale database replication
 
LCU14 310- Cisco ODP v2
LCU14 310- Cisco ODP v2LCU14 310- Cisco ODP v2
LCU14 310- Cisco ODP v2
 
MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...
MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...
MongoDB vs Scylla: Production Experience from Both Dev & Ops Standpoint at Nu...
 
Stream, stream, stream: Different streaming methods with Spark and Kafka
Stream, stream, stream: Different streaming methods with Spark and KafkaStream, stream, stream: Different streaming methods with Spark and Kafka
Stream, stream, stream: Different streaming methods with Spark and Kafka
 

Plus de ScyllaDB

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
What Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLWhat Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLScyllaDB
 
Low Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsLow Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasScyllaDB
 
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBBeyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasScyllaDB
 
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...ScyllaDB
 
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...ScyllaDB
 
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaDatabase Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaScyllaDB
 
Replacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBReplacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBScyllaDB
 
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityPowering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityScyllaDB
 
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptxScyllaDB
 
Getting the most out of ScyllaDB
Getting the most out of ScyllaDBGetting the most out of ScyllaDB
Getting the most out of ScyllaDBScyllaDB
 
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationNoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationScyllaDB
 
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesNoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesScyllaDB
 
ScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB
 
DBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsDBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsScyllaDB
 
NoSQL Data Modeling 101
NoSQL Data Modeling 101NoSQL Data Modeling 101
NoSQL Data Modeling 101ScyllaDB
 
Top NoSQL Data Modeling Mistakes
Top NoSQL Data Modeling MistakesTop NoSQL Data Modeling Mistakes
Top NoSQL Data Modeling MistakesScyllaDB
 
NoSQL Data Modeling Foundations — Introducing Concepts & Principles
NoSQL Data Modeling Foundations — Introducing Concepts & PrinciplesNoSQL Data Modeling Foundations — Introducing Concepts & Principles
NoSQL Data Modeling Foundations — Introducing Concepts & PrinciplesScyllaDB
 

Plus de ScyllaDB (20)

Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
What Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQLWhat Developers Need to Unlearn for High Performance NoSQL
What Developers Need to Unlearn for High Performance NoSQL
 
Low Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & PitfallsLow Latency at Extreme Scale: Proven Practices & Pitfalls
Low Latency at Extreme Scale: Proven Practices & Pitfalls
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
 
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDBBeyond Linear Scaling: A New Path for Performance with ScyllaDB
Beyond Linear Scaling: A New Path for Performance with ScyllaDB
 
Dissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance DilemmasDissecting Real-World Database Performance Dilemmas
Dissecting Real-World Database Performance Dilemmas
 
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
Database Performance at Scale Masterclass: Workload Characteristics by Felipe...
 
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
Database Performance at Scale Masterclass: Database Internals by Pavel Emelya...
 
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr SarnaDatabase Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
Database Performance at Scale Masterclass: Driver Strategies by Piotr Sarna
 
Replacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDBReplacing Your Cache with ScyllaDB
Replacing Your Cache with ScyllaDB
 
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear ScalabilityPowering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
Powering Real-Time Apps with ScyllaDB_ Low Latency & Linear Scalability
 
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx7 Reasons Not to Put an External Cache in Front of Your Database.pptx
7 Reasons Not to Put an External Cache in Front of Your Database.pptx
 
Getting the most out of ScyllaDB
Getting the most out of ScyllaDBGetting the most out of ScyllaDB
Getting the most out of ScyllaDB
 
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a MigrationNoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
NoSQL Database Migration Masterclass - Session 2: The Anatomy of a Migration
 
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and ChallengesNoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
NoSQL Data Migration Masterclass - Session 1 Migration Strategies and Challenges
 
ScyllaDB Virtual Workshop
ScyllaDB Virtual WorkshopScyllaDB Virtual Workshop
ScyllaDB Virtual Workshop
 
DBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & TradeoffsDBaaS in the Real World: Risks, Rewards & Tradeoffs
DBaaS in the Real World: Risks, Rewards & Tradeoffs
 
NoSQL Data Modeling 101
NoSQL Data Modeling 101NoSQL Data Modeling 101
NoSQL Data Modeling 101
 
Top NoSQL Data Modeling Mistakes
Top NoSQL Data Modeling MistakesTop NoSQL Data Modeling Mistakes
Top NoSQL Data Modeling Mistakes
 
NoSQL Data Modeling Foundations — Introducing Concepts & Principles
NoSQL Data Modeling Foundations — Introducing Concepts & PrinciplesNoSQL Data Modeling Foundations — Introducing Concepts & Principles
NoSQL Data Modeling Foundations — Introducing Concepts & Principles
 

Dernier

Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 

Dernier (20)

Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 

Building Next Generation Drivers: Optimizing Performance in Go and Rust

  • 1. Building Next Generation Drivers: Optimizing Performance in Go and Rust Piotr Grabowski, Software Team Leader
  • 2. Piotr Grabowski ■ Software Team Leader at ScyllaDB responsible for all ScyllaDB drivers, ScyllaDB Kafka Connectors (ScyllaDB Sink Connector and ScyllaDB CDC Source Connector) ■ Joined ScyllaDB 2.5 year ago
  • 3. ■ Introduction ■ ScyllaDB Go Driver ■ ScyllaDB Rust Driver ■ Bindings to ScyllaDB Rust Driver Agenda
  • 5. Drivers 101 ■ Drivers (in this presentation) - libraries that allow sending queries to ScyllaDB ■ Primary protocol: CQL (Cassandra Query Language) protocol ■ TCP ■ ScyllaDB supports CQL v4 ■ Frame-based protocol, supporting multiple streams ■ Supports LZ4 and Snappy compression ■ ScyllaDB drivers support shard awareness: ■ Driver can connect to a specific shard of ScyllaDB
  • 6. Drivers 101 - Role of Drivers ■ The role of drivers: ■ Serialization/deserialization of CQL frames ■ Serialization/deserialization of ScyllaDB types ■ Querying and maintaining metadata about tables/nodes ■ Routing requests to correct nodes (and shards) ■ Sending request across network ■ Conveniently constructing and executing queries in your language of choice: ■ gocqlx ■ Java Driver’s Mapper interface
  • 7. Drivers 101 - Performance ■ How can the driver improve performance? ■ Shard awareness: sending the query to a correct shard ■ Partitioners: ScyllaDB’s CDC (Change Data Capture) implements a custom partitioner which determines a node to send the query to ■ LWT Optimization: consistently prefer a single replica when executing a LWT query to avoid Paxos conflicts ■ Optimizing hot paths in the driver: ■ Serialization/deserialization ■ Routing code ■ Avoiding copies, allocations and locks
  • 9. ■ A new ScyllaDB driver for the Go language ■ Developed by a University of Warsaw student team for their Bachelor’s thesis ■ Focus of the project: performance ■ Avoiding mutexes ■ Avoiding allocations - reducing GC impact ■ Faster B-trees with generics ScyllaDB Go Driver
  • 10. ScyllaDB Go Driver - Mutexes gocql ScyllaDB Go Driver
  • 11. ■ Approach inspired by ScyllaDB’s shared-nothing architecture ■ Techniques: ■ Better overall picture of the architecture ■ Clear, efficient communication between components ■ Using idiomatic Go channels and atomics ■ Result: ■ The driver only uses a single mutex, local to a TCP connection ■ Reducing race conditions ScyllaDB Go Driver - Mutexes
  • 12. ■ Initial benchmarks of gocql have shown that the performance was heavily influenced by the Go’s Garbage Collector ■ ScyllaDB Go Driver performs 5x less allocations compared to gocql ■ The result of: ■ Meticulously designing each component of the code ■ Faster B-Trees with Generics ■ Reusing buffers owned by writer/reader ScyllaDB Go Driver - Memory
  • 13. ■ Go 1.18 introduced support for generics ■ Using generics in B-tree implementation yielded 40% performance improvement compared to Google’s btree implementation ■ The implementation performs zero allocations in all operations (except cloning) ■ By replacing interface usage with generics, this approach avoids variables escaping to the heap ■ Google’s btree maintainers implemented BTreeG Faster B-Trees with Generics
  • 14. ■ During profiling of different drivers, we noticed a difference in number of syscalls different drivers perform ■ Difference in network syscalls: ■ Observed using syscount and socksize tools ■ Fewer sendmsg (or similar) syscalls… ■ …but with larger data buffers ■ Instead of sending each CQL frame separately, better to send multiple ones at a time ScyllaDB Go Driver - Request Coalescing
  • 15. ScyllaDB Go Driver - Request Coalescing
  • 16. ScyllaDB Go Driver - Benchmark Results
  • 18. ■ The idea was born during a hackathon in 2020 ■ Over the last 3 years we continued the development ■ Uses Tokio framework ■ The driver is now feature complete, supporting many advanced features: ■ Shard awareness ■ Asynchronous interface with support for large concurrency ■ Compression ■ All CQL types ■ Speculative execution ■ TLS support ScyllaDB Rust Driver
  • 19. ■ Issue raised by the author of latte - a benchmark tool for ScyllaDB and Cassandra ■ The driver had problems scaling with high concurrency of requests ■ We managed to identify a root cause in the implementation of FuturesUnordered, a utility to gather many futures and wait for them ■ Due to cooperative scheduling in Tokio, it was possible for FuturesUnordered to iterate over all futures each time it was polled ■ A fix was merged to Tokio to limit the number of Futures iterated over in each poll ScyllaDB Rust Driver - O(N²) in Tokio?
  • 20. ■ Rack-aware load balancing ■ Reduce the cost of querying ScyllaDB nodes in other racks (corresponding for example to AWS Availability Zones) ■ Reduce the latency by querying the nearest rack ■ Iterator-based deserialization ■ The current implementation deserializes row data into equivalent of Vec<Vec<Option<CqlValue>> ■ Skip materializing all rows into vector, deserialize on-the-fly ■ Load balancing refactor ■ Main goal: reduce number of allocations and atomic operations while building the query plan, especially on the happy path ScyllaDB Rust Driver - Ongoing Efforts
  • 22. ■ When benchmarking ScyllaDB Rust Driver against other drivers, we measured it was the most performant driver, beating the C++ driver ■ Why not develop a way to use ScyllaDB Rust Driver from C++ code? ■ Benefits of a unified core: ■ Higher performance ■ Easier maintenance ■ Fewer bugs Bindings to ScyllaDB Rust Driver
  • 23. ■ We started development for the C/C++ language ■ C++ bindings to the Rust driver; the same API as the original C++ driver ■ Drop-in replacement (just replacing .so file) ■ The resulting project has an order-of-magnitude fewer LoC ■ Better stability, fewer problems compared to the original C++ driver Bindings to ScyllaDB Rust Driver - C/C++
  • 24. Thank You Stay in Touch Piotr Grabowski avelanarius