Playlists at Spotify - Using Cassandra to store version controlled objects
1. Playlists at Spotify
Using Cassandra to store version controlled objects at large scale
Jimmy Mårdell <yarin@spotify.com>
#CassandraEU
October 18, 2013
3. Intro
About Spotify
• 24 million active users
– 6 million paying subscribers
• 4 000 servers in 4 data centers
• Over 1 billion playlists created
5. Why version control?
What is version control?
• “Version control is the management of changes to documents” (Wikipedia)
• Stand-alone (most common)
– Git, Subversion, etc.
• Embedded
– Google Docs
6. Why version control?
Embedded usage
• Collaborative editing
• Undo functionality
• Performance
• Business logic depends on document history
9. Playlists at Spotify
Playlist challenges
• More than 1 billion playlists
• >40 000 requests/second at peak
• Offline mode
• Concurrent changes
10. Playlists at Spotify
Playlist client-server
• Every playlist is a version controlled object
• All playlists are synced on login
– Fetch all new changes
11. Playlists at Spotify
Playlist client-server
• Local queue of playlist modifications
– Clients optimistically accept changes - fast UI
• Queue flushed to server when possible
– Offline changes
– Fault tolerant
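The client-side queue described above could look roughly like the sketch below. It is an illustration only, not Spotify's client code: `PendingQueue`, `apply_locally`, `flush`, and the `send` callback are hypothetical names, and the transport is assumed to raise `ConnectionError` while the client is offline.

```python
from collections import deque
from typing import Callable, Deque, List


class PendingQueue:
    """Hypothetical client-side queue of playlist modifications."""

    def __init__(self, send: Callable[[str], None]) -> None:
        self._send = send                    # assumed transport; raises ConnectionError when offline
        self._pending: Deque[str] = deque()  # changes not yet accepted by the server
        self.local_view: List[str] = []      # what the UI shows right away

    def apply_locally(self, change: str) -> None:
        # Optimistically accept the change so the UI stays fast,
        # then remember it for a later flush.
        self.local_view.append(change)
        self._pending.append(change)

    def flush(self) -> int:
        """Try to send queued changes in order; return how many were sent."""
        sent = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                break                        # still offline: keep the rest queued
            self._pending.popleft()          # drop only after a successful send
            sent += 1
        return sent

    def __len__(self) -> int:
        return len(self._pending)
```

Changes made offline stay in the queue and are sent in their original order once `flush` succeeds, so a failed send loses nothing.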
12. Playlists at Spotify
Playlist version control
Representation of a playlist in the backend (newest revision first, each shown with the resulting track list):
• 3,038f...: REM(from=2, len=1) → A, C
• 2,19ca...: MOV(from=2, to=1, len=1) → A, C, B
• 1,4ed2...: ADD(ix=0, track=A,B,C) → A, B, C
• 0,ROOT
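To make the revision chain concrete, the sketch below replays the three operations from the slide on an empty list. The tuple encoding of the operations and the function name `apply_op` are assumptions for illustration. The slide does not say whether `to` in MOV is counted before or after the moved tracks are taken out; this sketch assumes after.

```python
from typing import List


def apply_op(tracks: List[str], op: tuple) -> List[str]:
    """Return a new track list with one ADD, MOV or REM applied."""
    kind = op[0]
    result = list(tracks)
    if kind == "ADD":
        _, ix, new_tracks = op
        result[ix:ix] = new_tracks
    elif kind == "MOV":
        _, src, dst, length = op
        moved = result[src:src + length]
        del result[src:src + length]
        # Assumption: `to` indexes the list after the moved tracks are taken out.
        result[dst:dst] = moved
    elif kind == "REM":
        _, src, length = op
        del result[src:src + length]
    else:
        raise ValueError(f"unknown op: {kind}")
    return result


history = [
    ("ADD", 0, ["A", "B", "C"]),   # revision 1
    ("MOV", 2, 1, 1),              # revision 2
    ("REM", 2, 1),                 # revision 3
]

tracks: List[str] = []             # revision 0, ROOT
states = []
for op in history:
    tracks = apply_op(tracks, op)
    states.append(tracks)
# states holds the track list after revisions 1, 2 and 3
```

Replaying the history yields A, B, C, then A, C, B, then A, C, matching the track lists shown for each revision.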
16. Cassandra data model
Cassandra at Spotify
• Playlist was the first system to use Cassandra
– Now we use it a lot...
• Started with Cassandra 0.7
• Using limited set of Cassandra features
– No super columns
– No CQL
17. Cassandra data model
Planning a data model
• Start with the queries!
• Three common playlist queries
– SYNC: Get all changes since a particular revision
– GET: Get the most recent snapshot
– APPEND: Add/move/delete tracks
18. Cassandra data model
Playlist data model
CF playlist_change
• Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
• Columns (one per revision):
– 1,4ed2...: parent=0,ROOT; op=ADD(ix=0, track=A,B,C)
– 2,19ca...: parent=1,4ed2...; op=MOV(from=2, to=1, len=1)
– 3,038f...: parent=2,19ca...; op=REM(from=2, len=1)
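With one column per revision, a SYNC request can be thought of as a column slice: everything stored after the revision the client already has. The sketch below models a single row in memory. It assumes columns sort by sequence number first (the slides do not state the comparator) and a linear history. The names `Revision`, `row`, and `sync` are illustrative.

```python
from bisect import bisect_right
from typing import Dict, List, Tuple

Revision = Tuple[int, str]            # (sequence number, truncated hash) as on the slide

# One row of playlist_change, modelled as a list of columns kept sorted by name.
row: List[Tuple[Revision, Dict[str, str]]] = [
    ((1, "4ed2"), {"parent": "0,ROOT", "op": "ADD(ix=0, track=A,B,C)"}),
    ((2, "19ca"), {"parent": "1,4ed2", "op": "MOV(from=2, to=1, len=1)"}),
    ((3, "038f"), {"parent": "2,19ca", "op": "REM(from=2, len=1)"}),
]


def sync(columns, since: Revision):
    """Return every change stored after `since` (a column slice in Cassandra terms)."""
    names = [name for name, _ in columns]
    start = bisect_right(names, since)   # first column strictly after `since`
    return columns[start:]


changes = sync(row, (1, "4ed2"))         # the client already has revision 1
```

A client that is already at the newest revision gets an empty slice back.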
20. Cassandra data model
Playlists in Cassandra
• Which revision is the latest?
– Changes with no children
• Multiple heads possible!
– Heads may appear anywhere within the row
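The "changes with no children" rule can be sketched as a set difference over the parent pointers. The function name `find_heads` and the forked revision `2,77aa` are made up for illustration. The other identifiers are the truncated ones from the slides.

```python
from typing import Dict, List


def find_heads(parents: Dict[str, str]) -> List[str]:
    """Return revisions that no other revision names as its parent."""
    referenced = set(parents.values())
    return sorted(rev for rev in parents if rev not in referenced)


# Linear history: revision -> parent.
linear = {"1,4ed2": "0,ROOT", "2,19ca": "1,4ed2", "3,038f": "2,19ca"}

# Hypothetical concurrent change: a second child of revision 1.
forked = dict(linear)
forked["2,77aa"] = "1,4ed2"

linear_heads = find_heads(linear)
forked_heads = find_heads(forked)
```

Computing heads this way means inspecting every column in the row, so it gets more expensive as the history grows.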
21. Cassandra data model
Playlist data model
CF playlist_change
• Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
• Columns:
– 1,4ed2...: prnt=0,ROOT; op=...
– 2,19ca...: prnt=1,4ed2...; op=...
– 3,038f...: prnt=2,19ca...; op=...
CF playlist_head
• Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
• Columns:
– 3,038f...
24. Cassandra data model
Playlist heads
• playlist_head is a small CF
– Fits in RAM
• 95% of playlist requests read only from playlist_head
– Most playlists are already up-to-date
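The fast path behind that figure might look like the sketch below: compare the client's revision with the stored head and touch playlist_change only on a mismatch. The dictionaries stand in for the two column families, and `sync_request`, the `reads` counter, and the key `playlist-1` are hypothetical.

```python
from typing import Dict, List

# In-memory stand-ins for the two column families (illustrative data only).
playlist_head: Dict[str, List[str]] = {"playlist-1": ["3,038f"]}
playlist_change: Dict[str, List[str]] = {"playlist-1": ["1,4ed2", "2,19ca", "3,038f"]}
reads = {"playlist_head": 0, "playlist_change": 0}


def sync_request(playlist: str, client_revision: str) -> List[str]:
    """Return the revisions the client is missing, reading playlist_change only if needed."""
    reads["playlist_head"] += 1
    if playlist_head[playlist] == [client_revision]:
        return []                                  # already up to date: head lookup was enough
    reads["playlist_change"] += 1
    revisions = playlist_change[playlist]
    if client_revision in revisions:
        return revisions[revisions.index(client_revision) + 1:]
    return list(revisions)                         # unknown revision: send everything
```

An up-to-date client costs a single read from the small, memory-resident column family.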
25. Cassandra data model
Playlist snapshots
• playlist_change works well when syncing playlists
• Not so well for fetching new playlists
– Snapshot cache
27. Cassandra data model
Updating playlists
• Validate change
– Locate snapshot
– Client may append to old version
• Update all tables
– playlist_head last
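One reason to write playlist_head last is that a reader who starts from the head never sees a pointer to a change that has not been stored yet. The sketch below simulates a crash between the two writes. `append`, the `fail_before_head` flag, and the simplified validation are assumptions, and merging a change made against an old version is left out.

```python
from typing import Dict

# In-memory stand-ins for the two column families of a single playlist.
playlist_change: Dict[str, dict] = {"1,4ed2": {"parent": "0,ROOT"}}
playlist_head = {"revision": "1,4ed2"}


def append(revision: str, parent: str, fail_before_head: bool = False) -> None:
    """Write the change first and move the head last."""
    if parent != "0,ROOT" and parent not in playlist_change:
        raise ValueError("unknown parent revision")   # validation step, greatly simplified
    playlist_change[revision] = {"parent": parent}
    if fail_before_head:
        raise RuntimeError("simulated crash before the head update")
    playlist_head["revision"] = revision
```

After a simulated crash the new change exists but the head still points at the previous revision, so readers keep seeing a consistent, slightly older playlist.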
28. Cassandra data model
Cassandra consistency levels
• Replication factor 3
• All writes using CL_QUORUM
• Reads from playlist_head
– CL_QUORUM
• Reads from playlist_change and playlist_snapshot
– CL_ONE, but may fall back to CL_QUORUM
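The arithmetic behind these levels is short. A quorum is floor(RF / 2) + 1 replicas, and a read is guaranteed to see the latest write only when the read and write replica counts add up to more than RF. The helper names `quorum` and `overlaps` below are illustrative.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that make up a quorum."""
    return replication_factor // 2 + 1


def overlaps(read_replicas: int, write_replicas: int, replication_factor: int) -> bool:
    """True when every read set must intersect every write set."""
    return read_replicas + write_replicas > replication_factor


RF = 3
Q = quorum(RF)                               # 2 of the 3 replicas

quorum_read_sees_write = overlaps(Q, Q, RF)  # 2 + 2 > 3
one_read_sees_write = overlaps(1, Q, RF)     # 1 + 2 is not > 3
```

With RF 3, quorum reads of playlist_head always overlap the quorum writes. A CL_ONE read may land on the one replica that missed a write, which is a plausible reason for the fallback to CL_QUORUM.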
33. Lessons learned
No moving parts
• Flash disks are awesome
• Reduced size of cluster from 60 to 30 nodes
– Thanks FusionIO!
• IOPS no longer the bottleneck
34. Lessons learned
Tombstone hell
• Noticed requests to playlist_head took several seconds
– Huh?
• Every change causes a value to be deleted in playlist_head
• playlist_head is essentially a queue
– Well-known anti-pattern
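A toy model shows why the queue pattern hurts. Every head move deletes the old column, and each delete leaves a tombstone that later reads of the row have to step over until compaction purges it. `HeadRow` and its methods are hypothetical, and the tombstone grace period is ignored.

```python
from typing import Dict, List, Optional, Tuple


class HeadRow:
    """Toy model of one playlist_head row: column name -> live flag."""

    def __init__(self) -> None:
        self.columns: Dict[str, bool] = {}   # True = live, False = tombstone

    def move_head(self, old: Optional[str], new: str) -> None:
        if old is not None:
            self.columns[old] = False        # a delete leaves a tombstone behind
        self.columns[new] = True

    def read_heads(self) -> Tuple[List[str], int]:
        """Return (live heads, number of tombstones scanned to find them)."""
        live = [name for name, alive in self.columns.items() if alive]
        skipped = sum(1 for alive in self.columns.values() if not alive)
        return live, skipped

    def compact(self) -> None:
        # Stand-in for compaction purging tombstones (grace period ignored here).
        self.columns = {name: True for name, alive in self.columns.items() if alive}


row = HeadRow()
previous = None
for i in range(1, 1001):                     # 1000 changes to the same playlist
    current = f"rev-{i}"
    row.move_head(previous, current)
    previous = current

heads, skipped = row.read_heads()
```

After 1000 changes the row holds one live head and 999 tombstones. Calling `compact` brings the scan back to a single column.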
35. Lessons learned
Tombstone hell
• We had rows with >500,000 tombstones
• Solution: major compaction
– Relatively fast since playlist_head is in RAM
36. Lessons learned
And more...
• Large rows in playlist_change
– Modify version graph
• Reduce number of requests
– Group playlists by owner
Sounds interesting? We’re hiring!