Playlists at Spotify
Using Cassandra to store version
controlled objects at large scale
Jimmy Mårdell <yarin@spotify.com>

#CassandraEU

October 18, 2013
Intro

About me
• Jimmy Mårdell
• Software Engineer
• 3 years at Spotify


About Spotify
• 24 million active users
– 6 million paying subscribers
• 4 000 servers in 4 data centers
• Over 1 billion playlists created

Contents
• Why version control?
• Playlists at Spotify
• Cassandra data model
• Lessons learned

Why version control?


What is version control?
• “Version control is the management of changes to documents” (Wikipedia)
• Stand-alone (most common)
– Git, Subversion, etc.

• Embedded
– Google Docs


Embedded usage
• Collaborative editing
• Undo functionality
• Performance
• Business logic depends on document history

Playlists at Spotify

Playlists


Playlist challenges
• More than 1 billion playlists
• >40 000 requests/second at peak
• Offline mode
• Concurrent changes


Playlist client-server
• Every playlist is a version controlled object
• All playlists are synced on login
– Fetch all new changes


Playlist client-server
• Local queue of playlist modifications
– Clients optimistically accept changes - fast UI

• Queue flushed to server when possible
– Offline changes
– Fault tolerant


Playlist version control

3,038f...: REM(from=2, len=1)          playlist: A, C
2,19ca...: MOV(from=2, to=1, len=1)    playlist: A, C, B
1,4ed2...: ADD(ix=0, track=A,B,C)      playlist: A, B, C
0,ROOT

Representation of a playlist in the backend
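
The revision chain above can be replayed from ROOT to rebuild the playlist. The following is a minimal sketch, not Spotify's implementation: the class names `Add`, `Mov` and `Rem` and the helpers `apply_op` and `replay` are hypothetical, and it assumes that the `to` index of a MOV refers to the list after the moved tracks have been taken out (the slide does not say which convention is used, but this one reproduces the states shown).

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Add:
    ix: int
    tracks: Tuple[str, ...]


@dataclass(frozen=True)
class Mov:
    frm: int      # "from" is a Python keyword, hence the shortened name
    to: int
    length: int


@dataclass(frozen=True)
class Rem:
    frm: int
    length: int


Op = Union[Add, Mov, Rem]


def apply_op(tracks: List[str], op: Op) -> List[str]:
    """Return a new track list with `op` applied; the input is left untouched."""
    result = list(tracks)
    if isinstance(op, Add):
        result[op.ix:op.ix] = op.tracks
    elif isinstance(op, Mov):
        moved = result[op.frm:op.frm + op.length]
        del result[op.frm:op.frm + op.length]
        # Assumption: `to` indexes the list after the moved tracks were removed.
        result[op.to:op.to] = moved
    elif isinstance(op, Rem):
        del result[op.frm:op.frm + op.length]
    else:
        raise TypeError(f"unknown operation: {op!r}")
    return result


def replay(ops: List[Op]) -> List[str]:
    """Rebuild a playlist by applying every change in order, starting from ROOT."""
    tracks: List[str] = []
    for op in ops:
        tracks = apply_op(tracks, op)
    return tracks


# The three revisions from the slide, oldest first.
history = [Add(0, ("A", "B", "C")), Mov(2, 1, 1), Rem(2, 1)]
print(replay(history[:2]))  # state at revision 2
print(replay(history))      # state at revision 3
```

Replaying only a prefix of the history gives the playlist at an older revision, which is what makes each revision addressable.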

Playlist branching
• Concurrent changes
– Offline
[Diagram: two concurrent changes, A and B, branch from the same revision]


Playlist branching
• Concurrent changes
– Offline

• Conflict resolution
– Operational Transformation

• Clients oblivious of branches

[Diagram: branches A and B merge via the transformed changes A’ and B’]
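
To illustrate the idea behind Operational Transformation, here is a small sketch restricted to two concurrent ADD operations. It is an assumption-laden toy, not the playlist service's algorithm: the `Add` class, the `transform` function and its `op_wins_ties` tie-breaking flag are hypothetical names introduced for this example. The point it shows is that A followed by B’ and B followed by A’ converge to the same playlist.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Add:
    ix: int
    tracks: Tuple[str, ...]


def apply_add(tracks: List[str], op: Add) -> List[str]:
    """Return a new list with the tracks of `op` inserted at `op.ix`."""
    result = list(tracks)
    result[op.ix:op.ix] = op.tracks
    return result


def transform(op: Add, against: Add, op_wins_ties: bool) -> Add:
    """Rewrite `op` so that it can be applied after `against` has been applied.

    If `against` inserted at an earlier position, `op` shifts right by the
    number of inserted tracks. Equal positions are resolved with a fixed
    priority (hypothetical rule) so that both sides make the same choice.
    """
    if against.ix < op.ix or (against.ix == op.ix and not op_wins_ties):
        return Add(op.ix + len(against.tracks), op.tracks)
    return op


base = ["A", "B", "C"]
a = Add(1, ("X",))   # change made on one client
b = Add(2, ("Y",))   # concurrent change made on another client

a_prime = transform(a, b, op_wins_ties=True)    # A rewritten to follow B
b_prime = transform(b, a, op_wins_ties=False)   # B rewritten to follow A

via_a = apply_add(apply_add(base, a), b_prime)
via_b = apply_add(apply_add(base, b), a_prime)
print(via_a == via_b, via_a)
```

A full implementation would also need transformation rules for every pair of ADD, MOV and REM, which is where most of the complexity lies.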


Cassandra data model


Cassandra at Spotify
• Playlist was the first system to use Cassandra
– Now we use it a lot...

• Started with Cassandra 0.7
• Using a limited set of Cassandra features
– No super columns
– No CQL


Planning a data model
• Start with the queries!
• Three common playlist queries
– SYNC: Get all changes since a particular revision
– GET: Get the most recent snapshot
– APPEND: Add/move/delete tracks


Playlist data model
CF playlist_change
• Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
– 1,4ed2...: parent=0,ROOT; op=ADD(ix=0, track=A,B,C)
– 2,19ca...: parent=1,4ed2...; op=MOV(from=2, to=1, len=1)
– 3,038f...: parent=2,19ca...; op=REM(from=2, len=1)
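
With one row per playlist and one column per revision, a SYNC request ("get all changes since a particular revision") amounts to following parent links back from the newest revision. The sketch below models the row as a plain Python dict rather than using a Cassandra client; `changes_since` is a hypothetical helper, the truncated revision ids from the slide are used as-is as placeholder keys, and a linear history is assumed.

```python
from typing import Dict, List, Tuple

ROOT = "0,ROOT"

# In-memory stand-in for one row of the playlist_change column family:
# column name (revision) -> column value (parent revision and operation).
row: Dict[str, Dict[str, str]] = {
    "1,4ed2": {"parent": ROOT, "op": "ADD(ix=0, track=A,B,C)"},
    "2,19ca": {"parent": "1,4ed2", "op": "MOV(from=2, to=1, len=1)"},
    "3,038f": {"parent": "2,19ca", "op": "REM(from=2, len=1)"},
}


def changes_since(row: Dict[str, Dict[str, str]],
                  head: str,
                  known_revision: str) -> List[Tuple[str, str]]:
    """Return (revision, op) pairs the client is missing, oldest first.

    Walks parent links from `head` back to `known_revision`. Raises KeyError
    if `known_revision` is not an ancestor of `head`.
    """
    missing: List[Tuple[str, str]] = []
    revision = head
    while revision != known_revision:
        if revision == ROOT:
            raise KeyError(f"{known_revision} is not an ancestor of {head}")
        change = row[revision]
        missing.append((revision, change["op"]))
        revision = change["parent"]
    missing.reverse()
    return missing


# A client that last saw revision 1 needs revisions 2 and 3.
for revision, op in changes_since(row, head="3,038f", known_revision="1,4ed2"):
    print(revision, op)
```

A client that is already at the head gets an empty list back, which matches the common case of an up-to-date playlist.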


Playlist data model
CF playlist_change
• Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
– 1,4ed2...: parent=0,ROOT; op=ADD(ix=0, track=A,B,C)
– 2,19ca...: parent=1,4ed2...; op=MOV(from=2, to=1, len=1)
– 3,038f...: parent=2,19ca...; op=REM(from=2, len=1)

• Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
– 1,8a20...: prnt=0,ROOT; op=...
– 2,b783...: prnt=1,8a20...; op=...
– 2,dd07...: prnt=1,8a20...; op=...
– 3,39ef...: prnt=2,dd07...; op=...
– 3,5a9c...: prnt=2,b783...; op=...
– 4,03fc...: prnt=2,39ef..., prnt=3,5a9c...

Playlists in Cassandra
• Which revision is the latest?
– Changes with no children

• Multiple heads possible!
– Heads may appear anywhere within the row
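
"Changes with no children" can be computed directly from a row: a head is any revision that no other revision names as its parent. The sketch below is an in-memory illustration with a hypothetical `find_heads` helper; it uses a shortened version of the `yarin` row containing only its first three changes, and the truncated revision ids serve as placeholder keys.

```python
from typing import Dict, List

ROOT = "0,ROOT"

# Revision -> list of parent revisions (a merge would list more than one).
spotify_row: Dict[str, List[str]] = {
    "1,4ed2": [ROOT],
    "2,19ca": ["1,4ed2"],
    "3,038f": ["2,19ca"],
}

yarin_row: Dict[str, List[str]] = {
    "1,8a20": [ROOT],
    "2,b783": ["1,8a20"],   # two concurrent changes on top of 1,8a20
    "2,dd07": ["1,8a20"],
}


def find_heads(row: Dict[str, List[str]]) -> List[str]:
    """Return every revision that is not the parent of another revision."""
    referenced = set()
    for parents in row.values():
        referenced.update(parents)
    return sorted(revision for revision in row if revision not in referenced)


print(find_heads(spotify_row))  # a single head
print(find_heads(yarin_row))    # two heads: the playlist has branched
```

Note that this scans the whole row, which is the cost that a separate, small index of heads avoids.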


Playlist data model
CF playlist_change
• Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
– 1,4ed2...: prnt=0,ROOT; op=...
– 2,19ca...: prnt=1,4ed2...; op=...
– 3,038f...: prnt=2,19ca...; op=...

CF playlist_head
• Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
– 3,038f...


Playlist data model
CF playlist_change
• Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
– 1,4ed2...: prnt=0,ROOT; op=...
– 2,19ca...: prnt=1,4ed2...; op=...
– 3,038f...: prnt=2,19ca...; op=...

• Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
– 1,8a20...: prnt=0,ROOT; op=...
– 2,b783...: prnt=1,8a20...; op=...
– 2,dd07...: prnt=1,8a20...; op=...

CF playlist_head
• Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
– 3,038f...

• Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
– 2,b783...
– 2,dd07...

Playlist data model
CF playlist_change
• Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
– 1,4ed2...: prnt=0,ROOT; op=...
– 2,19ca...: prnt=1,4ed2...; op=...
– 3,038f...: prnt=2,19ca...; op=...

• Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
– 1,8a20...: prnt=0,ROOT; op=...
– 2,b783...: prnt=1,8a20...; op=...
– 2,dd07...: prnt=1,8a20...; op=...
– 3,39ef...: prnt=2,dd07...; op=...
– 3,5a9c...: prnt=2,b783...; op=...
– 4,03fc...: prnt=2,39ef..., prnt=3,5a9c...

CF playlist_head
• Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
– 3,038f...

• Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
– 4,03fc...


Playlist heads
• playlist_head is a small CF
– Fits in RAM

• 95% of playlist requests only read from playlist_head
– Most playlists are already up-to-date


Playlist snapshots
• playlist_change works well when syncing playlists
• Not so well for fetching new playlists
– Snapshot cache


Playlist data model
CF playlist_change
• Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
– 1,4ed2...: prnt=0,ROOT; op=...
– 2,19ca...: prnt=1,4ed2...; op=...
– 3,038f...: prnt=2,19ca...; op=...

• Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
– 1,8a20...: prnt=0,ROOT; op=...
– 2,b783...: prnt=1,8a20...; op=...
– 2,dd07...: prnt=1,8a20...; op=...

CF playlist_snapshot
• Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
– cache: version=3,038f...; contents=A,C

• Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
– cache: version=2,b783...; contents=...


Updating playlists
• Validate change
– Locate snapshot
– Client may append to old version

• Update all tables
– playlist_head last
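
The write order above can be sketched with three in-memory dicts standing in for the column families. This is an illustration under assumptions, not the service's code: `append_change` is a hypothetical function, the snapshot is assumed to be rewritten on every append, and the rationale in the comments (a reader that sees a head can expect the change it names to exist already) is an inference from "playlist_head last" rather than something the slides state.

```python
from typing import Dict, List, Set

ROOT = "0,ROOT"

# In-memory stand-ins for the three column families, keyed by playlist row key.
playlist_change: Dict[str, Dict[str, Dict[str, str]]] = {}
playlist_snapshot: Dict[str, Dict[str, object]] = {}
playlist_head: Dict[str, Set[str]] = {}


def append_change(key: str, revision: str, parent: str,
                  op: str, contents: List[str]) -> None:
    """Store one validated change, touching playlist_head last."""
    # 1. Record the change itself.
    playlist_change.setdefault(key, {})[revision] = {"parent": parent, "op": op}
    # 2. Refresh the snapshot cache (assumed to happen on every write).
    playlist_snapshot[key] = {"version": revision, "contents": list(contents)}
    # 3. Only now move the head: the parent stops being a head and the new
    #    revision becomes one. The removal is a delete in playlist_head.
    heads = playlist_head.setdefault(key, set())
    heads.discard(parent)
    heads.add(revision)


key = "spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA"
append_change(key, "1,4ed2", ROOT, "ADD(ix=0, track=A,B,C)", ["A", "B", "C"])
append_change(key, "2,19ca", "1,4ed2", "MOV(from=2, to=1, len=1)", ["A", "C", "B"])
print(playlist_head[key], playlist_snapshot[key]["version"])
```

If a second client appended to `1,4ed2` as well, `discard` would find nothing to remove and the row would end up with two heads, which is the branching case.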


Cassandra consistency levels
• Replication factor 3
• All writes using CL_QUORUM
• Reads from playlist_head
– CL_QUORUM

• Reads from playlist_change and playlist_snapshot
– CL_ONE but may fall back to CL_QUORUM
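
The arithmetic behind these choices is the usual overlap rule: a read is guaranteed to reach at least one replica that acknowledged the latest write when writes acknowledged plus replicas read exceeds the replication factor. The helper names below are hypothetical; the sketch only evaluates the numbers from the slide.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas in a quorum: a strict majority."""
    return replication_factor // 2 + 1


def read_sees_latest_write(replication_factor: int,
                           write_replicas: int,
                           read_replicas: int) -> bool:
    """True when every read set must overlap every acknowledged write set."""
    return write_replicas + read_replicas > replication_factor


RF = 3
W = quorum(RF)          # all writes use CL_QUORUM

# playlist_head: quorum reads overlap quorum writes.
print(read_sees_latest_write(RF, W, quorum(RF)))

# playlist_change / playlist_snapshot: a CL_ONE read may hit a replica that
# has not received the write yet, hence the fallback to CL_QUORUM.
print(read_sees_latest_write(RF, W, 1))
```

With replication factor 3, a quorum is 2 replicas, so quorum reads and writes always overlap, while a CL_ONE read trades that guarantee for lower latency.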


Lessons learned


Optimizations
• Leveled compaction
– Improved performance a lot
• Compression
– Not as impressive
– CRC checks


Optimizations
• Trusted the Linux page cache to keep playlist_head in RAM
– Didn’t work

• Tried Cassandra row cache
– NO!

• mlock to the rescue


An enterprise ready solution
bash# while true; do
    vmtouch -m 10000000000 -l *head* & sleep 10m
    kill %vmtouch
done


No moving parts
• Flash disks are awesome
• Reduced size of cluster from 60 to 30 nodes
– Thanks FusionIO!

• IOPS no longer the bottleneck


Tombstone hell
• Noticed requests to playlist_head took several seconds
– Huh?

• Every change causes a value to be deleted in playlist_head
• playlist_head is essentially a queue
– Well-known anti-pattern
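
Why a queue-like row gets slow can be shown with a toy model. In Cassandra a delete does not remove data immediately; it writes a tombstone that later reads have to skip over until compaction purges it. The class below is a deliberately simplified, hypothetical model (`ToyHeadRow` and its methods are not Cassandra APIs, and it ignores SSTables and the grace period that real compaction honors before dropping tombstones); it only counts how many entries a read has to look at.

```python
from typing import Dict, List, Optional, Tuple


class ToyHeadRow:
    """One playlist_head row: column name -> True (live) or False (tombstone)."""

    def __init__(self) -> None:
        self.columns: Dict[str, bool] = {}

    def set_head(self, old: Optional[str], new: str) -> None:
        """Move the head: delete the old column, insert the new one."""
        if old is not None:
            self.columns[old] = False   # the delete leaves a tombstone behind
        self.columns[new] = True

    def read_heads(self) -> Tuple[List[str], int]:
        """Return the live heads and the number of entries scanned to find them."""
        scanned = len(self.columns)
        live = [name for name, alive in self.columns.items() if alive]
        return live, scanned

    def major_compaction(self) -> None:
        """Drop all tombstones (the toy model purges them unconditionally)."""
        self.columns = {name: alive for name, alive in self.columns.items() if alive}


row = ToyHeadRow()
previous: Optional[str] = None
for n in range(1, 1001):            # 1,000 changes to the same playlist
    current = f"{n},rev"
    row.set_head(previous, current)
    previous = current

heads_before, scanned_before = row.read_heads()
row.major_compaction()
heads_after, scanned_after = row.read_heads()
print(scanned_before, scanned_after, heads_after)
```

The row only ever holds one live head, yet the scan cost grows with every change until the tombstones are purged.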


Tombstone hell
• We had rows with >500,000 tombstones
• Solution: major compaction
– Relatively fast since playlist_head is in RAM


And more...
• Large rows in playlist_change
– Modify version graph

• Reduce number of requests
– Group playlists by owner

Sounds interesting? We’re hiring!

Questions?
