There are two key choices when scaling a NoSQL data store:
choosing between hash-based and range-based sharding, and choosing the right shard key. Each choice is a trade-off between the scalability of read, append, and update workloads.
In this talk I will present the standard scaling techniques,
some non-universal sharding tricks, less obvious reasons for
hotspots, as well as techniques to avoid them.
1. Brought to you by
Avoiding Data Hotspots
At Scale
Konstantin Osipov
Engineering at
2. Konstantin Osipov
Director of Engineering
■ Worked on lightweight transactions in Scylla
■ Rarely happy with the status quo (AKA the stubborn one)
■ A very happy father
■ Career and public speaking coach
4. What this talk is not
● Replication
● Re-sharding and re-balancing data
● Distributed queries & jobs
This talk will focus on the principles of data distribution only.
6. Define sharding
Sharding is the horizontal partitioning of data across multiple servers. It can be used to
scale the capacity and (possibly) the throughput of the database. Three key challenges:
● Choosing a way to split data across nodes
● Re-balancing data and maintaining location information
● Routing queries to the data
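The first and third challenges can be sketched in a few lines: a router hashes the shard key to pick a node. A minimal Python sketch, assuming a hypothetical four-node cluster (the key format and shard count are illustrative):

```python
import hashlib

NUM_SHARDS = 4

def shard_for(key: str) -> int:
    # Deterministically map a shard key to one of NUM_SHARDS nodes,
    # so every router sends a given key to the same node.
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

print(shard_for("user:42"))  # always routes to the same shard
```

Real systems layer re-balancing and location metadata on top of this, but the routing principle is the same.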
13. mongodb
For queries that don’t include the shard key, mongos must query all shards, wait
for their response and then return the result to the application. These
“scatter/gather” queries can be long running operations.
However, range based partitioning can result in an uneven distribution of data,
which may negate some of the benefits of sharding. For example, if the shard key
is a linearly increasing field, such as time, then all requests for a given time range
will map to the same chunk, and thus the same shard. In this situation, a small set
of shards may receive the majority of requests and the system would not scale
very well.
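The difference between a targeted query and a scatter/gather query can be sketched in Python (shard count, field names, and documents here are all illustrative, not MongoDB internals):

```python
import hashlib

NUM_SHARDS = 4
shards = [dict() for _ in range(NUM_SHARDS)]  # shard -> {user_id: document}

def shard_of(user_id: int) -> int:
    return int(hashlib.md5(str(user_id).encode()).hexdigest(), 16) % NUM_SHARDS

def insert(user_id: int, doc: dict) -> None:
    shards[shard_of(user_id)][user_id] = doc

def find_by_user_id(user_id: int):
    # Targeted query: the router contacts exactly one shard.
    return shards[shard_of(user_id)].get(user_id)

def find_by_city(city: str):
    # Scatter/gather: every shard is queried and the results merged,
    # so latency is bounded by the slowest shard.
    return [d for s in shards for d in s.values() if d.get("city") == city]
```

A query that includes the shard key touches one node; one that does not must fan out to all of them.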
14. spanner
One cause of hotspots is having a column whose value monotonically increases
as the first key part, because this results in all inserts occurring at the end of your
key space. This pattern is undesirable because Cloud Spanner divides data among
servers by key ranges, which means all your inserts will be directed at a single
server that will end up doing all the work.
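One common remedy is to stop the first key part from being monotonic, for example by prefixing the key with a hash of a non-monotonic column. A hedged sketch (the column names are hypothetical, and this is one of several possible fixes):

```python
import hashlib

def monotonic_key(ts: int, user_id: int):
    # Anti-pattern: timestamp first, so every insert lands at the end
    # of the key space, i.e. on a single server.
    return (ts, user_id)

def spread_key(ts: int, user_id: int):
    # Lead with a short hash of a non-monotonic column so consecutive
    # inserts scatter across key ranges (and therefore across servers).
    prefix = hashlib.sha1(str(user_id).encode()).hexdigest()[:4]
    return (prefix, ts, user_id)
```

The trade-off is that range scans over time now have to fan out across the hash prefixes.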
19. voltdb
To further optimize performance, VoltDB allows selected tables to be replicated
on all partitions of the cluster. This strategy minimizes cross-partition join
operations. For example, a retail merchandising database that uses product codes
as the primary key may have one table that simply correlates the product code
with the product's category and full name. Since this table is relatively small and
does not change frequently (unlike inventory and orders) it can be replicated to all
partitions. This way stored procedures can retrieve and return user-friendly
product information when searching by product code without impacting the
performance of order and inventory updates and searches.
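The idea of replicating a small, rarely changing table to every partition can be sketched in Python (the catalog contents, partition count, and function names are illustrative, not VoltDB's API):

```python
import hashlib

NUM_PARTITIONS = 4
CATALOG = {"P100": ("Headphones", "Electronics")}  # small, rarely changes

# Replicate the small catalog to every partition; partition the large
# orders table by product code.
partitions = [{"catalog": dict(CATALOG), "orders": []}
              for _ in range(NUM_PARTITIONS)]

def partition_of(code: str) -> int:
    return int(hashlib.md5(code.encode()).hexdigest(), 16) % NUM_PARTITIONS

def add_order(code: str, qty: int) -> None:
    partitions[partition_of(code)]["orders"].append((code, qty))

def order_summary(code: str):
    # Single-partition "procedure": joins orders against the local
    # catalog copy, with no cross-partition traffic.
    p = partitions[partition_of(code)]
    name, category = p["catalog"][code]
    total = sum(q for c, q in p["orders"] if c == code)
    return name, category, total
```

Because every partition holds the full catalog, the join never leaves the partition that owns the order rows.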
20. Good and bad shard keys
■ Good: user session, shopping order
■ Maybe: user_id (if user data isn’t too thick)
■ Better: (user_id, post_id)
■ Bad: inventory item, order date
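Why the compound key is better can be shown with a sketch: hashing user_id alone pins every post of one user to a single shard, while hashing (user_id, post_id) spreads even a very active user across the cluster (shard count and key format are illustrative):

```python
import hashlib

def _h(s: str) -> int:
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

def shard_by_user(user_id: int, post_id: int, n: int = 8) -> int:
    # All of one user's posts land on a single shard; a very active
    # ("thick") user can turn that shard into a hotspot.
    return _h(str(user_id)) % n

def shard_by_user_and_post(user_id: int, post_id: int, n: int = 8) -> int:
    # Hashing the compound key spreads even a heavy user's posts
    # across shards.
    return _h(f"{user_id}:{post_id}") % n
```

The cost is that reading all posts of one user becomes a scatter/gather query, which is the trade-off the summary table below captures.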
23. Scaling in a data warehouse
■ Data warehouses usually don’t check unique constraints
■ Data is sorted multiple times, according to multiple dimensions
■ Sharding can be done according to a hash of multiple fields
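Hashing several fields together can be sketched as follows (the column names and shard count are hypothetical, since the slide names no specific schema):

```python
import hashlib

def shard_of(row: dict, fields: tuple, num_shards: int = 16) -> int:
    # Combine several dimension columns into one hash key so that no
    # single dimension dominates the distribution.
    key = "|".join(str(row[f]) for f in fields)
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % num_shards

row = {"region": "EU", "product": "P100", "day": "2020-01-15"}
sid = shard_of(row, ("region", "product", "day"))
```

Because the warehouse keeps multiple sorted copies of the data anyway, the hash only has to balance writes, not serve point lookups.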
25. Summary: design choices

Workload                                  Hash             Range
Write-heavy / monotonic / time series     Linear scaling   Hotspots
Primary key read                          Linear scaling   Linear scaling
Partial key read                          Hotspots         Linear scaling
Indexed range read                        Hotspots         Linear scaling
Non-indexed read                          Hotspots         Hotspots
26. Brought to you by
Konstantin Osipov
kostja@scylladb.com
@kostja_osipov