Pressing play



                                        Niklas Gustavsson
                                               ngn@spotify.com
                                                    @protocol7

Tuesday, April 17, 12
Who am I?
      • ngn@spotify.com
      • @protocol7
      • Spotify backend dev based in Göteborg
      • Mainly from a JVM background, working on
        various stuff over the years
      • Apache Software Foundation member




What’s Spotify all about?
      •       A big catalogue, tons of music
      •       Available everywhere
      •       Great user experience
      •       More convenient than piracy
      •       Fast, reliable, always available
      •       Scalable for many, many users
      •       Ad-supported or paid-for service




Pressing play


Where’s Spotify?
      • Let’s start the client, but where should it connect
        to?




Aside: SRV records
      • Example SRV
      _spotify-mac-client._tcp.spotify.com. 242 IN    SRV 10   8      4070 C8.spotify.com.
      _spotify-mac-client._tcp.spotify.com. 242 IN    SRV 10   16     4070 C4.spotify.com.
      name                                  TTL class     prio weight port host




      • GeoDNS used
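
      To make the column legend above concrete, here is a minimal sketch that
      splits the two example lines into named fields. The SrvRecord type and
      the parse_srv helper are illustrative names, not part of any Spotify
      code.

```python
from typing import NamedTuple


class SrvRecord(NamedTuple):
    name: str
    ttl: int
    record_class: str
    priority: int
    weight: int
    port: int
    host: str


def parse_srv(line: str) -> SrvRecord:
    """Parse one zone-file style SRV line into its fields."""
    name, ttl, record_class, rtype, priority, weight, port, host = line.split()
    if rtype != "SRV":
        raise ValueError(f"not an SRV record: {rtype}")
    return SrvRecord(name, int(ttl), record_class,
                     int(priority), int(weight), int(port), host)


# The two example records from the slide.
EXAMPLE_LINES = [
    "_spotify-mac-client._tcp.spotify.com. 242 IN SRV 10 8 4070 C8.spotify.com.",
    "_spotify-mac-client._tcp.spotify.com. 242 IN SRV 10 16 4070 C4.spotify.com.",
]
records = [parse_srv(line) for line in EXAMPLE_LINES]
```

      Both records share priority 10, so the weights (8 and 16) decide how
      often each host is tried first.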




What does that record really point to?
      • accesspoint
      • Handles authentication state, logging, routing,
        rate limiting and much more
      • Protocol between client and AP uses a single,
        encrypted multiplexed socket over TCP
      • Written in C++




Find something to play
      • Let’s search




Services
      • Probably close to 100 backend services, most
        small, handling a single task
      • UNIX philosophy
      • Many autonomous
      • Deployed on commodity servers
      • Always redundant




Services
      • Mostly written in Python, a few in Java and C
      • Storage optimized for each service, mostly
        PostgreSQL, Cassandra and Tokyo Cabinet
      • Many services use in-memory caching, for
        example /dev/shm or memcached
      • Usually a small daemon, talking HTTP or Hermes
        • Got our own supervisor which keeps services
           running




Aside: Hermes
      •       ZeroMQ for transport, protobuf for envelope and payload
      •       HTTP-like verbs and caching
      •       Request-reply and publish/subscribe
      •       Very performant and introspectable




How does the accesspoint find search?
      • Everything has an SRV DNS record:
        • One record with same name for each service
          instance
        • Clients resolve to find servers providing that
          service
        • Lowest priority record is chosen with weighted
          shuffle
        • Clients retry other instances in case of failures
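
      The lookup steps above can be sketched roughly as follows. This is an
      assumption about the details, written in the spirit of standard SRV
      priority and weight semantics. The order_candidates and call_with_retry
      names, and the tuple layout, are hypothetical.

```python
import random
from typing import Callable, List, Optional, Tuple

# (priority, weight, host, port); the layout is an assumption for this sketch.
Srv = Tuple[int, int, str, int]


def order_candidates(records: List[Srv], rng: random.Random) -> List[Srv]:
    """Sort by ascending priority; weighted shuffle within each priority."""
    ordered: List[Srv] = []
    for priority in sorted({r[0] for r in records}):
        pool = [r for r in records if r[0] == priority]
        while pool:
            weights = [r[1] for r in pool]
            if sum(weights) == 0:
                # All weights zero: fall back to a uniform pick.
                choice = rng.choice(pool)
            else:
                # Pick with probability proportional to weight.
                choice = rng.choices(pool, weights=weights, k=1)[0]
            pool.remove(choice)
            ordered.append(choice)
    return ordered


def call_with_retry(records: List[Srv],
                    request: Callable[[str, int], str],
                    rng: Optional[random.Random] = None) -> str:
    """Try instances in selection order, moving on when one fails."""
    if rng is None:
        rng = random.Random()
    last_error: Optional[Exception] = None
    for _, _, host, port in order_candidates(records, rng):
        try:
            return request(host, port)
        except ConnectionError as exc:
            last_error = exc
    raise ConnectionError("all instances failed") from last_error
```

      A caller passes the resolved records and a function that performs the
      actual request. Higher-priority-number records are only tried once
      every lower-numbered one has failed.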




Read-only services
      •       Stateless
      •       Writes are hard
      •       Simple to scale, just add more servers
      •       Services can be restarted as needed
      •       Indexes prefabricated, distributed to live servers




Read-write services
      • User generated content, e.g. playlists
      • Hard to ensure consistency of data across instances

      Solutions:
      • Eventual consistency:
         • Reads of just written data not guaranteed to be up-to-date
      • Locking, atomic operations
          • Creating globally unique keys, e.g. usernames
          • Transactions, e.g. billing


Sharding
      • Some services use Dynamo inspired DHTs
        • Each request has a key
        • Each service node is responsible for a range of
          hash keys
        • Data is distributed among service nodes
        • Redundancy is ensured by writing to a replica
          node
        • Data must be transitioned when ring changes
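
      A minimal sketch of such a ring, assuming one ring position per node
      and SHA-256 as the hash. Both choices, and the HashRing name, are
      assumptions for illustration. The slides do not say how Spotify's
      services implement this.

```python
import bisect
import hashlib
from typing import List


class HashRing:
    """Toy hash ring: each node owns the key range ending at its position."""

    def __init__(self, nodes: List[str], replicas: int = 2) -> None:
        self.replicas = replicas
        self._ring = sorted((self._hash(node), node) for node in nodes)
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value: str) -> int:
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def nodes_for(self, key: str) -> List[str]:
        """Return the owner of `key` followed by the next nodes as replicas."""
        size = len(self._ring)
        start = bisect.bisect_right(self._points, self._hash(key)) % size
        count = min(self.replicas, size)
        return [self._ring[(start + i) % size][1] for i in range(count)]


ring = HashRing(["node-a", "node-b", "node-c"], replicas=2)
owners = ring.nodes_for("user:alice")  # owner first, then one replica

# Adding a node changes the ring: only keys that now fall in the new
# node's range change owner, and those are the keys to transition.
bigger = HashRing(["node-a", "node-b", "node-c", "node-d"], replicas=2)
keys = [f"user:{i}" for i in range(200)]
moved = [k for k in keys if ring.nodes_for(k)[0] != bigger.nodes_for(k)[0]]
```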




DHT example




search
      • Java service
      • Lucene storage
        • New index published daily
      • Doesn’t store any metadata in itself, returns a list
        of identifiers

      • (Search suggestions are served from a separate
        service, optimized for speed)




Metadata services
      •       Multiple read-only services
      •       60 GB indices
      •       Responds to metadata requests
      •       Decorates metadata onto other service responses
              • We’re most likely moving away from this model
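
      As a rough illustration of decoration: search returns only
      identifiers, and a metadata lookup fills in the rest. The index
      contents, field names and the decorate helper are all made up for
      this sketch.

```python
from typing import Dict, List

# Hypothetical stand-in for a read-only metadata index keyed by identifier.
METADATA_INDEX: Dict[str, Dict[str, str]] = {
    "track:1": {"title": "First example", "artist": "Artist A"},
    "track:2": {"title": "Second example", "artist": "Artist B"},
}


def decorate(identifiers: List[str],
             index: Dict[str, Dict[str, str]]) -> List[Dict[str, str]]:
    """Attach metadata to each identifier; unknown ones keep only their id."""
    return [{"id": ident, **index.get(ident, {})} for ident in identifiers]


# A search response is just an ordered list of identifiers.
search_hits = ["track:2", "track:1", "track:404"]
decorated = decorate(search_hits, METADATA_INDEX)
```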




Another aside: How does stuff get into Spotify?
      • >15 million tracks, we can’t maintain all that
        ourselves
      • Ingest audio, images and metadata from labels
        • Receive, transform, transcode, merge
      • All ends up in a metadata database from which
        indices are generated and distributed to services




The Kent bug
      • Much of the metadata lacks identifiers which
        leaves us with heuristics.




Play


Audio encodings and files
      • Spotify supports multiple audio encodings
        • Ogg Vorbis at 96 (-q2), 160 (-q5) and 320 kbps (-q9)
        • MP3 at 320 kbps (downloads)
      • For each track, a file for each encoding/bitrate is
        listed in the returned metadata
      • The client picks an appropriate choice
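
      One way the client-side choice could look, as a sketch only. The
      selection policy (highest bitrate under a cap, otherwise the lowest
      available), the AudioFile type and the file identifiers are
      assumptions. The slides do not describe the real rule.

```python
from typing import List, NamedTuple, Optional


class AudioFile(NamedTuple):
    codec: str
    bitrate_kbps: int
    file_id: str


def pick_file(files: List[AudioFile], codec: str,
              max_bitrate_kbps: int) -> Optional[AudioFile]:
    """Pick the best file for `codec` that fits under the bitrate cap."""
    candidates = [f for f in files if f.codec == codec]
    if not candidates:
        return None
    within_cap = [f for f in candidates if f.bitrate_kbps <= max_bitrate_kbps]
    if within_cap:
        return max(within_cap, key=lambda f: f.bitrate_kbps)
    # Nothing fits under the cap: fall back to the smallest file.
    return min(candidates, key=lambda f: f.bitrate_kbps)


# File list as it might appear in track metadata (identifiers are made up).
track_files = [
    AudioFile("vorbis", 96, "file-a"),
    AudioFile("vorbis", 160, "file-b"),
    AudioFile("vorbis", 320, "file-c"),
    AudioFile("mp3", 320, "file-d"),
]
chosen = pick_file(track_files, "vorbis", max_bitrate_kbps=200)
```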




Get the audio data
      • The client now must fetch the actual audio data
      • Latency kills




Cache
      •       Player caches tracks it has played
      •       Caches are large (56% are over 5 GB)
      •       Least Recently Used policy for cache eviction
      •       50% of data comes from local cache
      •       Cached files are served in P2P overlay
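
      Least Recently Used eviction can be sketched with an ordered
      dictionary, as below. The TrackCache class and its byte-based capacity
      are assumptions for illustration, not the client's actual cache
      implementation.

```python
from collections import OrderedDict
from typing import Optional


class TrackCache:
    """Size-bounded cache that evicts the least recently used track first."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._tracks: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, track_id: str) -> Optional[bytes]:
        data = self._tracks.get(track_id)
        if data is not None:
            # A hit makes this track the most recently used.
            self._tracks.move_to_end(track_id)
        return data

    def put(self, track_id: str, data: bytes) -> None:
        if track_id in self._tracks:
            self.used_bytes -= len(self._tracks.pop(track_id))
        self._tracks[track_id] = data
        self.used_bytes += len(data)
        # Evict from the least recently used end until we fit again.
        while self.used_bytes > self.capacity_bytes and self._tracks:
            _, evicted = self._tracks.popitem(last=False)
            self.used_bytes -= len(evicted)


cache = TrackCache(capacity_bytes=10)
cache.put("a", b"aaaa")
cache.put("b", b"bbbb")
cache.get("a")           # "a" is now more recently used than "b"
cache.put("c", b"cccc")  # over capacity, so "b" is evicted
```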




Streaming
      • Request first piece from Spotify storage
      • Meanwhile, search peer-to-peer (P2P) for
        remainder
      • Switch back and forth between Spotify storage
        and peers as needed
      • Towards end of a track, start prefetching next one
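
      A simplified, sequential sketch of that switching logic. The real
      client presumably works concurrently and with timeouts. The
      fetch_track function and the two fetcher callbacks are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

# A fetcher takes a piece index and returns its bytes, or None if unavailable.
Fetcher = Callable[[int], Optional[bytes]]


def fetch_track(piece_count: int, from_storage: Fetcher,
                from_peers: Fetcher) -> Tuple[List[bytes], List[str]]:
    """Fetch all pieces, recording which source served each one."""
    pieces: List[bytes] = []
    sources: List[str] = []
    for index in range(piece_count):
        data: Optional[bytes] = None
        source = "p2p"
        if index > 0:
            # After the first piece, prefer peers when they can deliver.
            data = from_peers(index)
        if data is None:
            # First piece, or no peer had it: fall back to storage.
            data = from_storage(index)
            source = "storage"
        if data is None:
            raise IOError(f"piece {index} unavailable from any source")
        pieces.append(data)
        sources.append(source)
    return pieces, sources


def demo_storage(index: int) -> Optional[bytes]:
    return f"s{index}".encode()


def demo_peers(index: int) -> Optional[bytes]:
    # Pretend no peer has piece 2.
    return None if index == 2 else f"p{index}".encode()


demo_pieces, demo_sources = fetch_track(4, demo_storage, demo_peers)
```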




P2P
      • All peers are equals (no supernodes)
      • A user only downloads data she needs
      • tracker service keeps peers for each track
      • P2P network becomes (weakly) clustered by
        interest
      • Oblivious to network architecture
      • Does not enforce fairness
      • Mobile clients do not participate in P2P



                        http://www.csc.kth.se/~gkreitz/spotify/kreitz-spotify_kth11.pdf
Success!




YAA: Hadoop
      • We run analysis using Hadoop which feeds back
        into the previously described process, e.g. track
        popularity is used for weighing search results and
        toplists




Development at Spotify
      • Uses almost exclusively open source software
        • Git, Debian, Munin, Zabbix, Puppet, TeamCity...
      • Developers use whatever development tools they are
        comfortable with
      • Scrum or Kanban in three-week iterations
      • DevOps heavy. Freaking awesome ops
      • Monitor and measure all the things!




Development at Spotify
      •        Development hubs in Stockholm, Göteborg and NYC
      •        All in all, >220 people in tech
      •        Very talented team
      •        Hackdays and system owner days in each iteration
      •        Hangs out on IRC
      •        Growing and hiring




Languages at Spotify




Questions?



Thank you

                           Want to work at Spotify?
                        http://www.spotify.com/jobs/



Contenu connexe

Tendances

The Evolution of Big Data at Spotify
The Evolution of Big Data at SpotifyThe Evolution of Big Data at Spotify
The Evolution of Big Data at SpotifyJosh Baer
 
Playlists at Spotify - Using Cassandra to store version controlled objects
Playlists at Spotify - Using Cassandra to store version controlled objectsPlaylists at Spotify - Using Cassandra to store version controlled objects
Playlists at Spotify - Using Cassandra to store version controlled objectsJimmy Mårdell
 
Eventing Things - A Netflix Original! (Nitin Sharma, Netflix) Kafka Summit SF...
Eventing Things - A Netflix Original! (Nitin Sharma, Netflix) Kafka Summit SF...Eventing Things - A Netflix Original! (Nitin Sharma, Netflix) Kafka Summit SF...
Eventing Things - A Netflix Original! (Nitin Sharma, Netflix) Kafka Summit SF...confluent
 
Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetu...
Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetu...Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetu...
Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetu...Shirshanka Das
 
Building an Observability platform with ClickHouse
Building an Observability platform with ClickHouseBuilding an Observability platform with ClickHouse
Building an Observability platform with ClickHouseAltinity Ltd
 
Monitoring Apache Kafka with Confluent Control Center
Monitoring Apache Kafka with Confluent Control Center   Monitoring Apache Kafka with Confluent Control Center
Monitoring Apache Kafka with Confluent Control Center confluent
 
Etsy Activity Feeds Architecture
Etsy Activity Feeds ArchitectureEtsy Activity Feeds Architecture
Etsy Activity Feeds ArchitectureDan McKinley
 
Introduction to memcached
Introduction to memcachedIntroduction to memcached
Introduction to memcachedJurriaan Persyn
 
Building real time analytics applications using pinot : A LinkedIn case study
Building real time analytics applications using pinot : A LinkedIn case studyBuilding real time analytics applications using pinot : A LinkedIn case study
Building real time analytics applications using pinot : A LinkedIn case studyKishore Gopalakrishna
 
Big Data At Spotify
Big Data At SpotifyBig Data At Spotify
Big Data At SpotifyAdam Kawa
 
Storm at Spotify
Storm at SpotifyStorm at Spotify
Storm at SpotifyNeville Li
 
(BDT318) How Netflix Handles Up To 8 Million Events Per Second
(BDT318) How Netflix Handles Up To 8 Million Events Per Second(BDT318) How Netflix Handles Up To 8 Million Events Per Second
(BDT318) How Netflix Handles Up To 8 Million Events Per SecondAmazon Web Services
 
Playlist Recommendations @ Spotify
Playlist Recommendations @ SpotifyPlaylist Recommendations @ Spotify
Playlist Recommendations @ SpotifyNikhil Tibrewal
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to RedisDvir Volk
 
Thrift vs Protocol Buffers vs Avro - Biased Comparison
Thrift vs Protocol Buffers vs Avro - Biased ComparisonThrift vs Protocol Buffers vs Avro - Biased Comparison
Thrift vs Protocol Buffers vs Avro - Biased ComparisonIgor Anishchenko
 

Tendances (20)

The Evolution of Big Data at Spotify
The Evolution of Big Data at SpotifyThe Evolution of Big Data at Spotify
The Evolution of Big Data at Spotify
 
Playlists at Spotify - Using Cassandra to store version controlled objects
Playlists at Spotify - Using Cassandra to store version controlled objectsPlaylists at Spotify - Using Cassandra to store version controlled objects
Playlists at Spotify - Using Cassandra to store version controlled objects
 
Eventing Things - A Netflix Original! (Nitin Sharma, Netflix) Kafka Summit SF...
Eventing Things - A Netflix Original! (Nitin Sharma, Netflix) Kafka Summit SF...Eventing Things - A Netflix Original! (Nitin Sharma, Netflix) Kafka Summit SF...
Eventing Things - A Netflix Original! (Nitin Sharma, Netflix) Kafka Summit SF...
 
Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetu...
Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetu...Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetu...
Apache Gobblin: Bridging Batch and Streaming Data Integration. Big Data Meetu...
 
Building an Observability platform with ClickHouse
Building an Observability platform with ClickHouseBuilding an Observability platform with ClickHouse
Building an Observability platform with ClickHouse
 
Attacking REST API
Attacking REST APIAttacking REST API
Attacking REST API
 
Monitoring Apache Kafka with Confluent Control Center
Monitoring Apache Kafka with Confluent Control Center   Monitoring Apache Kafka with Confluent Control Center
Monitoring Apache Kafka with Confluent Control Center
 
Data at Spotify
Data at SpotifyData at Spotify
Data at Spotify
 
Etsy Activity Feeds Architecture
Etsy Activity Feeds ArchitectureEtsy Activity Feeds Architecture
Etsy Activity Feeds Architecture
 
Introduction to memcached
Introduction to memcachedIntroduction to memcached
Introduction to memcached
 
Building real time analytics applications using pinot : A LinkedIn case study
Building real time analytics applications using pinot : A LinkedIn case studyBuilding real time analytics applications using pinot : A LinkedIn case study
Building real time analytics applications using pinot : A LinkedIn case study
 
Big Data At Spotify
Big Data At SpotifyBig Data At Spotify
Big Data At Spotify
 
Nifi workshop
Nifi workshopNifi workshop
Nifi workshop
 
Global Netflix Platform
Global Netflix PlatformGlobal Netflix Platform
Global Netflix Platform
 
Storm at Spotify
Storm at SpotifyStorm at Spotify
Storm at Spotify
 
(BDT318) How Netflix Handles Up To 8 Million Events Per Second
(BDT318) How Netflix Handles Up To 8 Million Events Per Second(BDT318) How Netflix Handles Up To 8 Million Events Per Second
(BDT318) How Netflix Handles Up To 8 Million Events Per Second
 
Playlist Recommendations @ Spotify
Playlist Recommendations @ SpotifyPlaylist Recommendations @ Spotify
Playlist Recommendations @ Spotify
 
Nifi
NifiNifi
Nifi
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
Thrift vs Protocol Buffers vs Avro - Biased Comparison
Thrift vs Protocol Buffers vs Avro - Biased ComparisonThrift vs Protocol Buffers vs Avro - Biased Comparison
Thrift vs Protocol Buffers vs Avro - Biased Comparison
 

Similaire à Spotify architecture - Pressing play

Spotify: Playing for millions, tuning for more
Spotify: Playing for millions, tuning for moreSpotify: Playing for millions, tuning for more
Spotify: Playing for millions, tuning for moreNick Barkas
 
Spotify: P2P music-on-demand streaming
Spotify: P2P music-on-demand streamingSpotify: P2P music-on-demand streaming
Spotify: P2P music-on-demand streamingRicardo Vice Santos
 
Is Disk Now a Viable Solution for Archive - Jon Toigo
Is Disk Now a Viable Solution for Archive - Jon ToigoIs Disk Now a Viable Solution for Archive - Jon Toigo
Is Disk Now a Viable Solution for Archive - Jon Toigospectralogic
 
The Background Noise of the Internet
The Background Noise of the InternetThe Background Noise of the Internet
The Background Noise of the InternetAndrew Morris
 
DNS in IR: Collection, Analysis and Response
DNS in IR: Collection, Analysis and ResponseDNS in IR: Collection, Analysis and Response
DNS in IR: Collection, Analysis and Responsepm123008
 
Apache Arrow -- Cross-language development platform for in-memory data
Apache Arrow -- Cross-language development platform for in-memory dataApache Arrow -- Cross-language development platform for in-memory data
Apache Arrow -- Cross-language development platform for in-memory dataWes McKinney
 
Scaling Pinterest
Scaling PinterestScaling Pinterest
Scaling PinterestC4Media
 
ProjectTox: Free as in freedom Skype replacement
ProjectTox: Free as in freedom Skype replacementProjectTox: Free as in freedom Skype replacement
ProjectTox: Free as in freedom Skype replacementWei-Ning Huang
 
Puppet Keynote
Puppet KeynotePuppet Keynote
Puppet KeynotePuppet
 
Coursera amazon cloudsearch presentation
Coursera amazon cloudsearch presentation Coursera amazon cloudsearch presentation
Coursera amazon cloudsearch presentation Michael Bohlig
 
How to Write the Fastest JSON Parser/Writer in the World
How to Write the Fastest JSON Parser/Writer in the WorldHow to Write the Fastest JSON Parser/Writer in the World
How to Write the Fastest JSON Parser/Writer in the WorldMilo Yip
 
Using ~300 Billion DNS Queries to Analyse the TLD Name Collision Problem
Using ~300 Billion DNS Queries to Analyse the TLD Name Collision ProblemUsing ~300 Billion DNS Queries to Analyse the TLD Name Collision Problem
Using ~300 Billion DNS Queries to Analyse the TLD Name Collision ProblemAPNIC
 
PlayNice.ly: Using Redis to store all our data, hahaha (Redis London Meetup)
PlayNice.ly: Using Redis to store all our data, hahaha (Redis London Meetup)PlayNice.ly: Using Redis to store all our data, hahaha (Redis London Meetup)
PlayNice.ly: Using Redis to store all our data, hahaha (Redis London Meetup)Adam Charnock
 
20130714 php matsuri - highly available php
20130714   php matsuri - highly available php20130714   php matsuri - highly available php
20130714 php matsuri - highly available phpGraham Weldon
 
Internet Week 2018: 1.1.1.0/24 A report from the (anycast) trenches
Internet Week 2018: 1.1.1.0/24 A report from the (anycast) trenchesInternet Week 2018: 1.1.1.0/24 A report from the (anycast) trenches
Internet Week 2018: 1.1.1.0/24 A report from the (anycast) trenchesAPNIC
 
Approaches to debugging mixed-language HPC apps
Approaches to debugging mixed-language HPC appsApproaches to debugging mixed-language HPC apps
Approaches to debugging mixed-language HPC appsRogue Wave Software
 

Similaire à Spotify architecture - Pressing play (20)

Spotify: Playing for millions, tuning for more
Spotify: Playing for millions, tuning for moreSpotify: Playing for millions, tuning for more
Spotify: Playing for millions, tuning for more
 
Spotify: P2P music-on-demand streaming
Spotify: P2P music-on-demand streamingSpotify: P2P music-on-demand streaming
Spotify: P2P music-on-demand streaming
 
Spotify: P2P music streaming
Spotify: P2P music streamingSpotify: P2P music streaming
Spotify: P2P music streaming
 
Is Disk Now a Viable Solution for Archive - Jon Toigo
Is Disk Now a Viable Solution for Archive - Jon ToigoIs Disk Now a Viable Solution for Archive - Jon Toigo
Is Disk Now a Viable Solution for Archive - Jon Toigo
 
The Background Noise of the Internet
The Background Noise of the InternetThe Background Noise of the Internet
The Background Noise of the Internet
 
DNS in IR: Collection, Analysis and Response
DNS in IR: Collection, Analysis and ResponseDNS in IR: Collection, Analysis and Response
DNS in IR: Collection, Analysis and Response
 
Apache Arrow -- Cross-language development platform for in-memory data
Apache Arrow -- Cross-language development platform for in-memory dataApache Arrow -- Cross-language development platform for in-memory data
Apache Arrow -- Cross-language development platform for in-memory data
 
Scaling Pinterest
Scaling PinterestScaling Pinterest
Scaling Pinterest
 
ProjectTox: Free as in freedom Skype replacement
ProjectTox: Free as in freedom Skype replacementProjectTox: Free as in freedom Skype replacement
ProjectTox: Free as in freedom Skype replacement
 
Puppet Keynote
Puppet KeynotePuppet Keynote
Puppet Keynote
 
Compression talk
Compression talkCompression talk
Compression talk
 
ION Krakow - A Global IPv6 Deployment Update
ION Krakow - A Global IPv6 Deployment UpdateION Krakow - A Global IPv6 Deployment Update
ION Krakow - A Global IPv6 Deployment Update
 
Coursera amazon cloudsearch presentation
Coursera amazon cloudsearch presentation Coursera amazon cloudsearch presentation
Coursera amazon cloudsearch presentation
 
How to Write the Fastest JSON Parser/Writer in the World
How to Write the Fastest JSON Parser/Writer in the WorldHow to Write the Fastest JSON Parser/Writer in the World
How to Write the Fastest JSON Parser/Writer in the World
 
Spotify: Data center & Backend buildout
Spotify: Data center & Backend buildoutSpotify: Data center & Backend buildout
Spotify: Data center & Backend buildout
 
Using ~300 Billion DNS Queries to Analyse the TLD Name Collision Problem
Using ~300 Billion DNS Queries to Analyse the TLD Name Collision ProblemUsing ~300 Billion DNS Queries to Analyse the TLD Name Collision Problem
Using ~300 Billion DNS Queries to Analyse the TLD Name Collision Problem
 
PlayNice.ly: Using Redis to store all our data, hahaha (Redis London Meetup)
PlayNice.ly: Using Redis to store all our data, hahaha (Redis London Meetup)PlayNice.ly: Using Redis to store all our data, hahaha (Redis London Meetup)
PlayNice.ly: Using Redis to store all our data, hahaha (Redis London Meetup)
 
20130714 php matsuri - highly available php
20130714   php matsuri - highly available php20130714   php matsuri - highly available php
20130714 php matsuri - highly available php
 
Internet Week 2018: 1.1.1.0/24 A report from the (anycast) trenches
Internet Week 2018: 1.1.1.0/24 A report from the (anycast) trenchesInternet Week 2018: 1.1.1.0/24 A report from the (anycast) trenches
Internet Week 2018: 1.1.1.0/24 A report from the (anycast) trenches
 
Approaches to debugging mixed-language HPC apps
Approaches to debugging mixed-language HPC appsApproaches to debugging mixed-language HPC apps
Approaches to debugging mixed-language HPC apps
 

Plus de Niklas Gustavsson (11)

Spotify services - Leetspeak 2014
Spotify services - Leetspeak 2014Spotify services - Leetspeak 2014
Spotify services - Leetspeak 2014
 
Spotify services (SDC 2013)
Spotify services (SDC 2013)Spotify services (SDC 2013)
Spotify services (SDC 2013)
 
Real-time web
Real-time webReal-time web
Real-time web
 
RESTful web services
RESTful web servicesRESTful web services
RESTful web services
 
Not only SQL
Not only SQL Not only SQL
Not only SQL
 
HTML5
HTML5HTML5
HTML5
 
The future is bright
The future is brightThe future is bright
The future is bright
 
CouchDB
CouchDBCouchDB
CouchDB
 
Oredev 2009 JAX-RS
Oredev 2009 JAX-RSOredev 2009 JAX-RS
Oredev 2009 JAX-RS
 
Apachecon Eu 2008 Mina
Apachecon Eu 2008 MinaApachecon Eu 2008 Mina
Apachecon Eu 2008 Mina
 
REST made simple with Java
REST made simple with JavaREST made simple with Java
REST made simple with Java
 

Dernier

WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 

Dernier (20)

WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 

Spotify architecture - Pressing play

  • 1. Pressing play Niklas Gustavsson ngn@spotify.com @protocol7 Tuesday, April 17, 12
  • 2. Who am I? • ngn@spotify.com • @protocol7 • Spotify backend dev based in Göteborg • Mainly from a JVM background, working on various stuff over the years • Apache Software Foundation member
  • 3. What’s Spotify all about? • A big catalogue, tons of music • Available everywhere • Great user experience • More convenient than piracy • Fast, reliable, always available • Scalable for many, many users • Ad-supported or paid-for service
  • 5. Where’s Spotify? • Let’s start the client, but where should it connect to?
  • 6. Aside: SRV records • Example SRV records (fields: name, TTL, class, type, priority, weight, port, host): _spotify-mac-client._tcp.spotify.com. 242 IN SRV 10 8 4070 C8.spotify.com. and _spotify-mac-client._tcp.spotify.com. 242 IN SRV 10 16 4070 C4.spotify.com. • GeoDNS used
  • 7. What does that record really point to? • accesspoint • Handles authentication state, logging, routing, rate limiting and much more • Protocol between client and AP uses a single, encrypted, multiplexed socket over TCP • Written in C++
  • 9. Find something to play • Let’s search
  • 10. Services • Probably close to 100 backend services, most small, handling a single task • UNIX philosophy • Many autonomous • Deployed on commodity servers • Always redundant
  • 11. Services • Mostly written in Python, a few in Java and C • Storage optimized for each service, mostly PostgreSQL, Cassandra and Tokyo Cabinet • Many services use in-memory caching, for example /dev/shm or memcached • Usually a small daemon, talking HTTP or Hermes • We have our own supervisor, which keeps services running
  • 12. Aside: Hermes • ZeroMQ for transport, protobuf for envelope and payload • HTTP-like verbs and caching • Request-reply and publish/subscribe • Very performant and introspectable
  • 13. How does the accesspoint find search? • Everything has an SRV DNS record: • One record with the same name for each service instance • Clients resolve to find servers providing that service • Among the lowest-priority records, one is chosen by weighted shuffle • Clients retry other instances in case of failures
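
The selection rule on slide 13 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Spotify's client code: `SrvRecord`, `weighted_shuffle`, `candidates` and `call_with_retry` are hypothetical names, the records are passed in as plain tuples rather than resolved from DNS, and falling back to higher-priority groups after the lowest one is exhausted follows the usual SRV convention rather than anything stated on the slide.

```python
import random
from typing import NamedTuple


class SrvRecord(NamedTuple):
    """Hypothetical container for the fields shown on slide 6."""
    priority: int
    weight: int
    port: int
    host: str


def weighted_shuffle(records, rng=random):
    """Return the records in random order, biased towards higher weights."""
    pool = list(records)
    ordered = []
    while pool:
        total = sum(r.weight for r in pool)
        if total == 0:
            # Only zero-weight records remain: plain uniform shuffle.
            rng.shuffle(pool)
            ordered.extend(pool)
            break
        pick = rng.randrange(total)
        running = 0
        for i, record in enumerate(pool):
            running += record.weight
            if pick < running:
                ordered.append(pool.pop(i))
                break
    return ordered


def candidates(records, rng=random):
    """Lowest priority value first; each priority group in weighted-shuffle order."""
    by_priority = {}
    for record in records:
        by_priority.setdefault(record.priority, []).append(record)
    ordered = []
    for priority in sorted(by_priority):
        ordered.extend(weighted_shuffle(by_priority[priority], rng))
    return ordered


def call_with_retry(records, request, rng=random):
    """Try each candidate instance in turn until one request succeeds."""
    last_error = None
    for record in candidates(records, rng):
        try:
            return request(record.host, record.port)
        except OSError as exc:  # assumption: failures surface as OSError
            last_error = exc
    raise ConnectionError("no service instance answered") from last_error
```

With the two example records from slide 6 (weights 8 and 16), `C4.spotify.com.` would come first roughly two times out of three, and a failed request simply moves on to the next candidate.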
  • 14. Read-only services • Stateless • Writes are hard • Simple to scale, just add more servers • Services can be restarted as needed • Indexes are prefabricated and distributed to live servers
  • 15. Read-write services • User-generated content, e.g. playlists • Hard to ensure consistency of data across instances • Solutions: • Eventual consistency: reads of just-written data are not guaranteed to be up to date • Locking, atomic operations • Creating globally unique keys, e.g. usernames • Transactions, e.g. billing
  • 16. Sharding • Some services use Dynamo-inspired DHTs • Each request has a key • Each service node is responsible for a range of hash keys • Data is distributed among service nodes • Redundancy is ensured by writing to a replica node • Data must be transitioned when the ring changes
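
To make the sharding bullets on slide 16 concrete, here is a minimal hash-ring sketch. It is a simplification built on assumptions: `HashRing`, `hash_key` and the node names are hypothetical, each node gets a single position on the ring (no virtual nodes), SHA-256 stands in for whatever hash the real services use, and the replica is simply the next node clockwise.

```python
import bisect
import hashlib


def hash_key(key: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Each node owns the range of hash keys that ends at its own position."""

    def __init__(self, nodes, replicas=2):
        self.replicas = replicas
        self._ring = sorted((hash_key(node), node) for node in nodes)
        self._positions = [position for position, _ in self._ring]

    def nodes_for(self, key):
        """Return the owning node followed by the nodes holding replicas."""
        if not self._ring:
            raise ValueError("ring has no nodes")
        size = len(self._ring)
        start = bisect.bisect_left(self._positions, hash_key(key)) % size
        count = min(self.replicas, size)
        return [self._ring[(start + i) % size][1] for i in range(count)]
```

Comparing `HashRing(["a", "b", "c"])` with `HashRing(["a", "b", "c", "d"])` shows why data has to be transitioned when the ring changes: every key whose owner differs between the two rings is now owned by the new node `d`, while all other keys stay where they were.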
  • 18. search • Java service • Lucene storage • New index published daily • Doesn’t store any metadata in itself, returns a list of identifiers • (Search suggestions are served from a separate service, optimized for speed)
  • 19. Metadata services • Multiple read-only services • 60 GB indices • Responds to metadata requests • Decorates metadata onto other service responses • We’re most likely moving away from this model
  • 21. Another aside: How does stuff get into Spotify? • >15 million tracks, we can’t maintain all that ourselves • Ingest audio, images and metadata from labels • Receive, transform, transcode, merge • All ends up in a metadata database from which indices are generated and distributed to services
  • 23. The Kent bug • Much of the metadata lacks identifiers, which leaves us with heuristics.
  • 25. Audio encodings and files • Spotify supports multiple audio encodings • Ogg Vorbis at 96 (-q2), 160 (-q5) and 320 kbit/s (-q9) • MP3 at 320 kbit/s (downloads) • For each track, a file for each encoding/bitrate is listed in the returned metadata • The client picks an appropriate choice
  • 26. Get the audio data • The client now must fetch the actual audio data • Latency kills
  • 27. Cache • Player caches tracks it has played • Caches are large (56% are over 5 GB) • Least Recently Used policy for cache eviction • 50% of data comes from local cache • Cached files are served in P2P overlay
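
The Least Recently Used policy on slide 27 can be illustrated with a small size-bounded cache. This is a sketch under assumptions, not the client's actual cache: `TrackCache` and its methods are hypothetical, tracks are held in memory as `bytes` rather than as files on disk, and the capacity is counted in bytes.

```python
from collections import OrderedDict


class TrackCache:
    """Size-bounded cache that evicts the least recently used track first."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._tracks = OrderedDict()  # track_id -> bytes, least recent first

    def get(self, track_id):
        """Return the cached data, or None, and mark the track as recently used."""
        data = self._tracks.get(track_id)
        if data is not None:
            self._tracks.move_to_end(track_id)
        return data

    def put(self, track_id, data):
        """Store a track, evicting least recently used tracks to make room."""
        if track_id in self._tracks:
            self.used_bytes -= len(self._tracks.pop(track_id))
        if len(data) > self.capacity_bytes:
            return  # larger than the whole cache, so do not store it
        self._tracks[track_id] = data
        self.used_bytes += len(data)
        while self.used_bytes > self.capacity_bytes:
            _, evicted = self._tracks.popitem(last=False)
            self.used_bytes -= len(evicted)
```

Playing a track counts as a use, so a track that was just played survives eviction longer than one that was cached earlier and never touched again.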
  • 28. Streaming • Request first piece from Spotify storage • Meanwhile, search peer-to-peer (P2P) for remainder • Switch back and forth between Spotify storage and peers as needed • Towards end of a track, start prefetching next one
  • 29. P2P • All peers are equals (no supernodes) • A user only downloads data she needs • A tracker service keeps a list of peers for each track • P2P network becomes (weakly) clustered by interest • Oblivious to network architecture • Does not enforce fairness • Mobile clients do not participate in P2P • http://www.csc.kth.se/~gkreitz/spotify/kreitz-spotify_kth11.pdf
  • 33. YAA: Hadoop • We run analysis using Hadoop, which feeds back into the previously described process, e.g. track popularity is used for weighting search results and toplists
  • 35. Development at Spotify • Uses almost exclusively open source software • Git, Debian, Munin, Zabbix, Puppet, TeamCity... • Developers use whatever development tools they are comfortable with • Scrum or Kanban in three-week iterations • DevOps heavy. Freaking awesome ops • Monitor and measure all the things!
  • 36. Development at Spotify • Development hubs in Stockholm, Göteborg and NYC • All in all, >220 people in tech • Very talented team • Hackdays and system owner days in each iteration • We hang out on IRC • Growing and hiring
  • 39. Thank you • Want to work at Spotify? http://www.spotify.com/jobs/