behind the scenes	


                 Spotify	

             Ricardo Santos	

                @ricardovice
“spotifiera”, anyone?
Main goals	

•  A big catalogue, tons of music	

•  Available everywhere	

•  Great user experience	

•  More convenient than piracy	

•  Fast	

•  Reliable, high availability	

•  Scalable to many, many users
Business idea	

•  Free ad-funded version	

•  Paid subscription where users get:	

  •  No advertisements	

  •  Mobile access	

  •  Offline playback	

  •  API access
“music itself is going to become
like running water or electricity”	

          David Bowie, 2002
Accessibility	

•  People should always be able to access
  music	

  •  Whenever they want	

  •  Wherever they are
The catalogue	

•  All content is delivered by labels	

•  Currently over 10 million tracks	

•  Growing every day, around 10k per day	

•  96-320 kbps audio streams; most are Ogg
  Vorbis q5 at 160 kbps
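
As a rough back-of-the-envelope illustration of what these bitrates mean per track, the sketch below estimates the size of one stream. The four-minute track length is an assumption for the example, not a figure from the talk, and Vorbis is variable-bitrate, so 160 kbps is only a nominal value.

```python
def stream_size_mb(bitrate_kbps: float, duration_s: float) -> float:
    """Approximate size in megabytes (10**6 bytes) of a stream at a nominal bitrate."""
    bits = bitrate_kbps * 1000 * duration_s
    return bits / 8 / 1_000_000

# Assumed example: a four-minute track at the common 160 kbps quality.
size_mb = stream_size_mb(160, 240)
```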
that all sounds cool,
but let’s talk engineering!
“It’s Easy, Really.” 	

    Blaine Cook, 2007
Handling Growth	

•  Scaling is not an exact science	

•  There is no such thing as a magic formula	

•  Usage patterns differ	

•  There is always a limit to what you can
   handle	

•  Fail gracefully	

•  Continuous evolution process
Usage patterns	

Typically, some services are more demanding
than others. This can be due to:	

•  Higher popularity	

•  Higher complexity	

•  Both combined
Decoupling	

•  Divide and conquer!	

•  Resources assigned individually	

•  Using the right tools to address each
   problem	

•  Organization and delegation	

•  Problems are isolated	

•  Easier to handle growth
Decoupling	

Spotify’s internal services include:	

•  Access Point	

•  User	

•  Playlist	

•  Search	

•  Browse	

Can you guess which one is the most complex?
Playlist!
Playlist!	

Though it may sound simple, it is by far the
 most demanding:	

•  For each user there are several playlists	

•  Push notifications	

•  Offline writing	

•  Conflict resolution without user interaction
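
The talk does not say how conflicts are resolved. One possible approach, shown here purely as an illustration, is to record offline edits as operations and replay all of them in a deterministic order so that every client ends up with the same playlist. The `Op` type, its fields and the ordering rule are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Op:
    timestamp: int  # hypothetical logical clock value
    client: str     # tie-breaker when timestamps are equal
    kind: str       # "add" or "remove"
    track: str

def merge(base: list[str], *logs: list[Op]) -> list[str]:
    """Replay every client's offline edits in one deterministic order."""
    playlist = list(base)  # do not mutate the caller's list
    ops = sorted((op for log in logs for op in log),
                 key=lambda op: (op.timestamp, op.client))
    for op in ops:
        if op.kind == "add" and op.track not in playlist:
            playlist.append(op.track)
        elif op.kind == "remove" and op.track in playlist:
            playlist.remove(op.track)
    return playlist
```

Because the operations are sorted before being applied, the result does not depend on which client's log arrives first, so no user has to be asked to pick a winner.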
Metadata services	

Search and Browse allow users to find music	

•  Both handle read requests	

•  But their usage and responses differ	

•  Data sources should be optimized for each
   of these, called indices	

•  These are hard to maintain, easier to
   regenerate
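
A minimal sketch of the idea that each read pattern gets its own index, rebuilt from the source data rather than patched in place. The track records and both index layouts are invented for illustration.

```python
from collections import defaultdict

# Hypothetical source-of-truth metadata.
TRACKS = [
    {"id": 1, "artist": "Artist A", "title": "Blue Sky"},
    {"id": 2, "artist": "Artist B", "title": "Blue Moon"},
    {"id": 3, "artist": "Artist A", "title": "Red Sun"},
]

def build_search_index(tracks):
    """Word -> set of track ids, suited to free-text queries."""
    index = defaultdict(set)
    for track in tracks:
        for word in track["title"].lower().split():
            index[word].add(track["id"])
    return index

def build_browse_index(tracks):
    """Artist -> list of track ids, suited to navigating by artist."""
    index = defaultdict(list)
    for track in tracks:
        index[track["artist"]].append(track["id"])
    return index
```

Both functions rebuild their index from scratch, which mirrors the point that regenerating is easier than maintaining.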
Speed thrills
Latency matters	

•  High latency is a problem, not only in First
  Person Shooters	

•  Increasing the latency of Google searches by
  100 to 400 ms decreased usage by 0.2 to 0.6%
  (Jake Brutlag, 2009)	

•  Slow performance is one of the major
  reasons users abandon services	

•  Users don't come back
Focus on low latency	

•  Our SLA is maintained by monitoring
  latency on the client side	

•  On average, the human notion of
  “instantly” is 200ms	

•  The current median latency to begin to
  play a track in Spotify is 265ms	

•  Due to disk lookup, at times it's actually
  faster to start playing a track from network
  than from disk
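
A small sketch of how client-side latency samples could be summarised against the 200 ms "instant" threshold. The sample values and the function name are assumptions, not Spotify's monitoring code.

```python
import statistics

def latency_report(samples_ms, target_ms=200):
    """Summarise playback-start latencies reported by clients."""
    median = statistics.median(samples_ms)
    within = sum(1 for s in samples_ms if s <= target_ms) / len(samples_ms)
    return {"median_ms": median, "share_within_target": within}

# Invented samples; one slow outlier barely moves the median.
report = latency_report([120, 180, 265, 300, 900])
```

Reporting the median rather than the mean keeps a few very slow starts from dominating the figure.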
Playing a track	

•  Check local cache	

•  Request first piece from Spotify servers	

•  Meanwhile, search P2P for remainder	

•  Switch between servers and P2P as needed	

•  Towards the end of a track, start pre-
  fetching the next one via P2P rather than
  our servers
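
The steps above can be sketched as a simple planning function. It is a simplified illustration: the piece count, the source labels and the all-or-nothing choice of P2P for the remainder are assumptions, and real clients switch sources dynamically.

```python
def plan_fetch(track_id, cache, peers_with_track, total_pieces=10):
    """Return a list of (piece_index, source) pairs for one track."""
    if track_id in cache:
        return [(p, "cache") for p in range(total_pieces)]
    # First piece comes from the servers so playback can start quickly.
    plan = [(0, "server")]
    # The remainder comes from peers when any are known, else from servers.
    source = "p2p" if peers_with_track else "server"
    plan += [(p, source) for p in range(1, total_pieces)]
    return plan
```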
When to start playing?	

•  Trade-off between stutter and latency	

•  Look at last 15 min of transfer rates	

•  Model as Markov chain and simulate	

•  Coupled with some heuristics
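
A minimal sketch of the Markov-chain idea, assuming the recent transfer rates have already been bucketed into a few values (the states) and sampled once per second. The function names, units and parameters are illustrative, and the heuristics mentioned above are not modelled.

```python
import random
from collections import defaultdict

def fit_transitions(rates):
    """Map each observed rate to the list of rates that followed it."""
    followers = defaultdict(list)
    for current, following in zip(rates, rates[1:]):
        followers[current].append(following)
    return followers

def stutter_probability(rates, buffered_kb, remaining_kb, playback_kb_per_s,
                        runs=1000, seed=0):
    """Estimate the chance of a buffer underrun if playback starts now.

    rates: recent download rates in kB/s, one sample per second.
    playback_kb_per_s must be positive.
    """
    rng = random.Random(seed)
    followers = fit_transitions(rates)
    stutters = 0
    for _ in range(runs):
        state, buffer_kb, left_kb = rates[-1], buffered_kb, remaining_kb
        while left_kb > 0:
            received = min(state, left_kb)
            left_kb -= received
            buffer_kb += received - playback_kb_per_s
            if buffer_kb < 0:      # playback caught up with the download
                stutters += 1
                break
            options = followers.get(state)
            if options:            # otherwise stay in the same state
                state = rng.choice(options)
    return stutters / runs
```

A client could start playback once the estimated probability drops below some chosen threshold, which is where the trade-off against start-up latency comes in.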
Production storage	

•  Production storage is a cache with fast
  drives and lots of RAM	

•  Serves the most popular content	

•  A cache miss generates a request to
  master storage, at slightly higher latency	

•  Production storage is available in several
  data centers to ensure closeness to the
  user (latency wise)
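
The cache-in-front-of-master arrangement can be sketched as follows. The least-recently-used eviction policy is an assumption chosen for brevity (the talk only says the cache serves the most popular content), and a plain dict stands in for master storage.

```python
from collections import OrderedDict

class ProductionCache:
    """Small LRU cache that falls back to master storage on a miss."""

    def __init__(self, master, capacity):
        self.master = master        # dict standing in for master storage
        self.capacity = capacity
        self.items = OrderedDict()
        self.misses = 0

    def get(self, track_id):
        if track_id in self.items:
            self.items.move_to_end(track_id)   # mark as recently used
            return self.items[track_id]
        self.misses += 1                       # slower path: ask master storage
        data = self.master[track_id]
        self.items[track_id] = data
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)     # evict least recently used
        return data
```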
Master storage	

•  Works as a DHT, with some redundancy	

•  Contains all available tracks but has slower
  drives and access	

•  Tracks are kept in several formats, adding
  up to around 290TB
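
The talk gives no details of the DHT. As a generic illustration of "a DHT with some redundancy", the sketch below places a key on a hash ring and stores it on the next few nodes. The node names, the use of SHA-1 and the replica count are all assumptions.

```python
import hashlib
from bisect import bisect_right

def _ring_position(value: str) -> int:
    return int(hashlib.sha1(value.encode()).hexdigest(), 16)

def replicas_for(key, nodes, copies=2):
    """Pick the nodes that should hold `key`, walking clockwise on the ring."""
    ring = sorted(nodes, key=_ring_position)
    points = [_ring_position(node) for node in ring]
    start = bisect_right(points, _ring_position(key)) % len(ring)
    return [ring[(start + i) % len(ring)] for i in range(min(copies, len(ring)))]
```

Any client that knows the node list can compute the same placement, so no central lookup table is needed to find a track.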
P2P helps	

•  Easier to scale	

•  Fewer servers	

•  Less bandwidth	

•  Better uptime	

•  Lower costs	

•  Fun!
P2P overview	

•  Not a piracy network; all tracks are added
  by Spotify	

•  Used on all desktop clients (no mobile)	

•  Each client is connected to at most 60 others	

•  All nodes are equals (no super nodes)	

•  A track is downloaded from several peers
P2P custom protocol	

•  Ask for most urgent pieces first	

•  If a peer is slow, re-request from new
  peers	

•  When buffers run low, download from
  central servers	

•  If loading from servers, estimate at what
  point P2P will catch up	

•  If buffers are very low, stop uploading
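
Three of these rules (urgent pieces first, server fallback when buffers run low, no uploading when they are very low) can be sketched as one decision function. The thresholds in seconds and the returned field names are hypothetical.

```python
def next_action(buffer_s, missing_pieces, playhead, low_s=5.0, very_low_s=2.0):
    """Decide what to request next, given seconds of audio buffered."""
    # Pieces closest to the playhead are the most urgent.
    request_order = sorted(p for p in missing_pieces if p >= playhead)
    return {
        "request_order": request_order,
        "use_servers": buffer_s < low_s,         # fall back to central servers
        "allow_upload": buffer_s >= very_low_s,  # stop uploading when starved
    }
```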
P2P finding peers	

•  Partial central tracker (BitTorrent-style)	

•  Broadcast query in small neighborhood
  (Gnutella-style)	

•  Two mechanisms result in higher
  availability	

•  Limited broadcast for local (LAN) peer
  discovery (cherry on top...)
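
A toy sketch of combining the two discovery mechanisms: ask a tracker that may only know some peers, then query direct neighbours and merge the answers. The data structures are stand-ins, not the real protocol.

```python
def find_peers(track_id, tracker, neighbours):
    """Merge tracker results with answers from a small neighbourhood query.

    tracker: dict of track id -> list of peer ids (possibly incomplete).
    neighbours: dict of peer id -> set of track ids that neighbour holds.
    """
    found = list(tracker.get(track_id, []))
    for peer, tracks in neighbours.items():
        if track_id in tracks and peer not in found:
            found.append(peer)
    return found
```

If the tracker has no entry for a track, the neighbourhood query can still return peers, which is one way two mechanisms improve availability.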
P2P security	

•  The P2P network needs to be a safe and
  trusted one	

•  All exchanged files have to come originally
  from Spotify	

•  All peers should be trusted Spotify clients
Security through
          obscurity	

•  Our client needs to be able to read
  metadata and play music	

•  At the same time we have to prevent
  reverse engineers from doing the same	

•  Therefore, we can't openly discuss the
  details
but…	

•  Closed environment	

•  Integrity of downloaded files is checked	

•  Data transfers are encrypted	

•  Usernames are not exposed in the P2P
  network; all peers are assigned a pseudonym	

•  Software obfuscation makes life difficult for
  reverse engineers
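
The talk deliberately leaves the details out, so the following is only a generic illustration of an integrity check on a downloaded piece, not Spotify's scheme. The choice of SHA-256 and the idea of a trusted expected digest are assumptions.

```python
import hashlib
import hmac

def piece_is_intact(data: bytes, expected_sha256_hex: str) -> bool:
    """Compare a downloaded piece against a digest obtained from a trusted source."""
    actual = hashlib.sha256(data).hexdigest()
    return hmac.compare_digest(actual, expected_sha256_hex)
```

A piece that fails the check would simply be discarded and requested again from another source.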
Software obfuscation
So, what's the
           outcome?	

•  At over 10 million users the responses are	

  •  55.4% from client cache	

  •  35.8% from the P2P network	

  •  8.8% from the servers
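
A quick check of these figures shows how much traffic never reaches the servers. Only the three percentages come from the slide; the variable names are illustrative.

```python
shares = {"client cache": 55.4, "P2P network": 35.8, "servers": 8.8}

total = sum(shares.values())                                  # sanity check
offloaded = shares["client cache"] + shares["P2P network"]    # not served centrally
```

The three shares add up to 100%, and a little over 91% of responses are served without touching Spotify's servers.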
Oh, and
we have
cake as
well! :D

spotify.com/jobs
jobs@spotify.com
I'd like to know more...	

•  Get in touch with us	

•  Check out Gunnar Kreitz's slides and
  academic papers on the subject:	

http://www.csc.kth.se/~gkreitz/spotify-p2p10/
Thanks!	

http://commons.wikimedia.org/wiki/File:Surprised_young_cat.JPG	


http://commons.wikimedia.org/wiki/File:Chicken_February_2009-1.jpg	


http://xkcd.com/257/

 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 

Spotify: behind the scenes

  • 1. behind the scenes Spotify Ricardo Santos @ricardovice
  • 3. Main goals •  A big catalogue, tons of music •  Available everywhere •  Great user experience •  More convenient than piracy •  Fast •  Reliable, high availability •  Scalable to many, many users
  • 4. Business idea •  Free ad-funded version •  Paid subscription where users get: •  No advertisements •  Mobile access •  Offline playback •  API access
  • 5. “music itself is going to become like running water or electricity” David Bowie, 2002
  • 6. Accessibility •  People should always be able to access music •  Whenever they want •  Wherever they are
  • 7.
  • 8.
  • 9.
  • 10. The catalogue •  All content is delivered by labels •  Currently over 10 million tracks •  Growing every day, around 10k per day •  96-320 kbps audio streams, most are Ogg Vorbis q5, 160kbps
  • 11. that all sounds cool, but let’s talk engineering!
  • 12. “It’s Easy, Really.” Blaine Cook, 2007
  • 13. Handling Growth •  Scaling is not an exact science •  There is no such thing as a magic formula •  Usage patterns differ •  There is always a limit to what you can handle •  Fail gracefully •  Continuous evolution process
  • 14. Usage patterns Typically, some services are more demanding than others, this can be due to: •  Higher popularity •  Higher complexity •  Both combined
  • 15. Decoupling •  Divide and conquer! •  Resources assigned individually •  Using the right tools to address each problem •  Organization and delegation •  Problems are isolated •  Easier to handle growth
  • 16. Decoupling Spotify’s internal services include: •  Access Point •  User •  Playlist •  Search •  Browse Can you guess which one is the most complex?
  • 18. Playlist! Though it may sound simple, by far the most demanding: •  For each user there are several playlists •  Push notifications •  Offline writing •  Conflict resolution without user interaction
  • 19. Metadata services Search and Browse allow users to find music •  Both handle read requests •  But their usage and responses differ •  Data sources should be optimized for each of these, called indices •  These are hard to maintain, easier to regenerate
  • 20.
  • 22. Latency matters •  High latency is a problem, not only in First Person Shooters •  Increased latency of Google searches by 100 – 400ms decreased usage by 0.2 – 0.6% (Jake Brutlag, 2009) •  Slow performance is one of the major reasons users abandon services •  Users don't come back
  • 23. Focus on low latency •  Our SLA is maintained by monitoring latency on the client side •  On average, the human notion of “instantly” is 200ms •  The current median latency to begin to play a track in Spotify is 265ms •  Due to disk lookup, at times it's actually faster to start playing a track from network than from disk
  • 24. Playing a track •  Check local cache •  Request first piece from Spotify servers •  Meanwhile, search P2P for remainder •  Switch between servers and P2P as needed •  Towards the end of a track, start prefetching the next one via P2P rather than our servers
  • 25. When to start playing? •  Trade-off between stutter and latency •  Look at last 15 min of transfer rates •  Model as Markov chain and simulate •  Coupled with some heuristics
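The start-of-playback decision on slide 25 can be sketched roughly as follows. This is only an illustration, not Spotify's implementation: the function names, the KB/s units, the one-second time step, and the first-order chain built directly from observed samples are all assumptions made for the example. The idea is to replay plausible future transfer rates from the recent history and count how often the buffer would run dry.

```python
import random
from collections import defaultdict


def build_chain(rates):
    """Map each observed rate to the list of rates that followed it.

    Sampling uniformly from that list approximates the empirical
    transition probabilities of a first-order Markov chain.
    """
    transitions = defaultdict(list)
    for current, following in zip(rates, rates[1:]):
        transitions[current].append(following)
    return transitions


def simulate_stutter(transitions, start_rate, buffered_kb, bitrate_kbps,
                     remaining_s, rng):
    """Simulate one playback run; return True if the buffer underruns."""
    rate = start_rate
    buffer_kb = buffered_kb
    for _ in range(remaining_s):
        # Each second we download `rate` KB and play bitrate/8 KB.
        buffer_kb += rate - bitrate_kbps / 8
        if buffer_kb < 0:
            return True
        choices = transitions.get(rate)
        if choices:  # stay at the current rate if no successor was observed
            rate = rng.choice(choices)
    return False


def stutter_probability(rates, buffered_kb, bitrate_kbps, remaining_s,
                        runs=1000, seed=0):
    """Estimate the chance of a stutter if playback started right now.

    `rates` is a hypothetical history of discretized transfer rates in
    KB/s (for example, one sample per second over the last 15 minutes).
    """
    rng = random.Random(seed)
    transitions = build_chain(rates)
    start_rate = rates[-1]
    stutters = sum(
        simulate_stutter(transitions, start_rate, buffered_kb,
                         bitrate_kbps, remaining_s, rng)
        for _ in range(runs)
    )
    return stutters / runs
```

A client could then start playback once the estimated probability drops below some threshold, combined with the heuristics the slide mentions; the threshold itself is not given in the slides.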
  • 26.
  • 27. Production storage •  Production storage is a cache with fast drives and lots of RAM •  Serves the most popular content •  A cache miss will generate a request to master storage, slightly higher latency •  Production storage is available in several data centers to ensure closeness to the user (latency-wise)
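The relationship between production and master storage on slide 27 resembles a read-through cache. The sketch below illustrates that pattern only; the `ProductionCache` class, the least-recently-used eviction policy, and the dictionary standing in for master storage are assumptions, since the slides do not say how popular content is selected or evicted.

```python
from collections import OrderedDict


class ProductionCache:
    """Read-through cache in front of a slower master store (illustrative)."""

    def __init__(self, master, capacity):
        self.master = master          # any mapping of track_id -> bytes
        self.capacity = capacity      # maximum number of cached tracks
        self.cache = OrderedDict()    # oldest entry first
        self.hits = 0
        self.misses = 0

    def get(self, track_id):
        if track_id in self.cache:
            # Cache hit: mark the track as most recently used.
            self.cache.move_to_end(track_id)
            self.hits += 1
            return self.cache[track_id]
        # Cache miss: fall back to master storage (higher latency).
        self.misses += 1
        data = self.master[track_id]
        self.cache[track_id] = data
        if len(self.cache) > self.capacity:
            # Evict the least recently used track (assumed policy).
            self.cache.popitem(last=False)
        return data
```

With a skewed popularity distribution, most requests hit the cache and only the long tail reaches master storage.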
  • 28. Master storage •  Works as a DHT, with some redundancy •  Contains all available tracks but has slower drives and access •  Tracks are kept in several formats, adding up to around 290TB
  • 29.
  • 30. P2P helps •  Easier to scale •  Fewer servers •  Less bandwidth •  Better uptime •  Lower costs •  Fun!
  • 31. P2P overview •  Not a piracy network, all tracks are added by Spotify •  Used on all desktop clients (no mobile) •  Each client connected to ≤ 60 others •  All nodes are equals (no super nodes) •  A track is downloaded from several peers
  • 32. P2P custom protocol •  Ask for most urgent pieces first •  If a peer is slow, re-request from new peers •  When buffers run low, download from central servers •  If loading from servers, estimate at what point P2P will catch up •  If buffers are very low, stop uploading
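The rules on slide 32 can be read as a small per-request decision. The following sketch is a simplified illustration under stated assumptions: the thresholds `LOW_BUFFER_S` and `VERY_LOW_BUFFER_S`, the `Decision` type, and the `next_request` function are hypothetical, and the real protocol's details are not public.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds, in seconds of buffered audio.
LOW_BUFFER_S = 3.0
VERY_LOW_BUFFER_S = 1.0


@dataclass
class Decision:
    piece: Optional[int]   # index of the piece to request next, if any
    source: str            # "p2p" or "server"
    allow_upload: bool     # whether we keep serving other peers


def next_request(missing_pieces, playhead_piece, buffered_s):
    """Pick the most urgent missing piece and where to fetch it from."""
    # Most urgent = the missing piece closest ahead of the playhead.
    upcoming = [p for p in missing_pieces if p >= playhead_piece]
    piece = min(upcoming) if upcoming else None
    # When buffers run low, fall back to the central servers.
    source = "server" if buffered_s < LOW_BUFFER_S else "p2p"
    # When buffers are very low, stop uploading to save bandwidth.
    allow_upload = buffered_s >= VERY_LOW_BUFFER_S
    return Decision(piece, source, allow_upload)
```

Re-requesting from new peers when one is slow, and estimating when P2P will catch up, would sit on top of this decision and are left out of the sketch.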
  • 33. P2P finding peers •  Partial central tracker (BitTorrent-style) •  Broadcast query in small neighborhood (Gnutella-style) •  Two mechanisms result in higher availability •  Limited broadcast for local (LAN) peer discovery (cherry on top...)
  • 34. P2P security •  The P2P network needs to be a safe and trusted one •  All exchanged files have to come originally from Spotify •  All peers should be trusted Spotify clients
  • 35. Security through obscurity •  Our client needs to be able to read metadata and play music •  At the same time we have to prevent reverse engineers from doing the same •  Therefore, we can't openly discuss the details
  • 36. but… •  Closed environment •  Integrity of downloaded files is checked •  Data transfers are encrypted •  Usernames are not exposed in the P2P network, all peers are assigned a pseudonym •  Software obfuscation makes life difficult for reverse engineers
  • 38. So, what's the outcome? •  At over 10 million users the responses are •  55.4% from client cache •  35.8% from the P2P network •  8.8% from the servers
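A quick calculation makes the figures on slide 38 easier to interpret. It assumes the three shares are measured in the same unit and cover all responses, which the slide implies but does not state; the variable names are illustrative.

```python
# Shares of responses by source, as given on the slide (percent).
shares = {"client cache": 55.4, "p2p": 35.8, "servers": 8.8}

# Everything that misses the local cache has to come over the network.
network = shares["p2p"] + shares["servers"]

# Fraction of that network traffic served by peers instead of servers.
p2p_of_network = shares["p2p"] / network

print(f"Served over the network: {network:.1f}%")
print(f"Of which from peers:     {p2p_of_network:.1%}")
```

Under these assumptions, peers cover roughly four fifths of what the local cache cannot serve, leaving the central servers with under a tenth of all responses.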
  • 39.
  • 40. Oh, and we have cake as well! :D spotify.com/jobs jobs@spotify.com
  • 41. I'd like to know more... •  Get in touch with us •  Check out Gunnar Kreitz's slides and academic papers on the subject: http://www.csc.kth.se/~gkreitz/spotify-p2p10/