October 17, 2015
Data Pipelines for Music Recommendations @ Spotify
Vidhya Murali
@vid052
Who Am I?
2
•Areas of Interest: Data & Machine Learning
•Data Engineer @Spotify
•Masters Student from the University of Wisconsin Madison
aka Happy Badger for life!
3
“Torture the data, and it will confess!”
– Ronald Coase, Nobel Prize Laureate
Spotify’s Big Data
4
•Started in 2006, now available in 58 countries
• 70+ million active users, 20+ million paid subscribers
• 30+ million songs in our catalog, ~20K added every day
• 1.5 billion playlists so far and counting
• 1 TB of user data logged every day
• Hadoop cluster with 1500 nodes
• ~20,000 Hadoop jobs per day
Music Recommendations at Spotify
Features:
Discover
Discover Weekly
Moments
Radio
Related Artists
5
6
30 million tracks…
What to recommend?
Approaches 7
•Manual curation by Experts
•Editorial Tagging
•Metadata (e.g. Label provided data, NLP over News,
Blogs)
•Audio Signals
•Collaborative Filtering Model
Collaborative Filtering Model 8
•Find patterns from user’s past behavior to generate
recommendations
•Domain independent
•Scalable
•Accuracy (Collaborative Model) >= Accuracy (Content
Based Model)
Definition of CF
9
Hey,
I like tracks P, Q, R, S!
Well,
I like tracks Q, R, S, T!
Then you should check out
track P!
Nice! Btw try track T!
Legacy Slide of Erik Bernhardsson
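The exchange above can be sketched in a few lines. This is only a toy illustration of the idea, assuming each listener is a set of liked tracks; the function names are hypothetical and not from the talk.

```python
def recommend_from_neighbor(mine, theirs):
    """Tracks the other listener likes that I have not listed yet."""
    return sorted(theirs - mine)


def overlap(a, b):
    """Jaccard overlap between two listeners' liked-track sets."""
    return len(a & b) / len(a | b)


listener_1 = {"P", "Q", "R", "S"}
listener_2 = {"Q", "R", "S", "T"}

# Listener 2 is told to check out P, listener 1 is told to try T.
print(recommend_from_neighbor(listener_2, listener_1))
print(recommend_from_neighbor(listener_1, listener_2))
print(overlap(listener_1, listener_2))
```

The shared tracks (Q, R, S) are what justify trusting the other listener's remaining picks; the larger the overlap, the stronger that signal.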
The YoLo Problem
10
•YoLo Problem: “You Only Listen Once” to judge recommendations
•Goal: Predict if users will listen to new music (new to user)
•Challenges
•Scale of catalog (30M songs + ~20K added every day)
•Repeated consumption of music is fairly common
•Music is niche
•Music consumption is heavily influenced by user’s lifestyle
•Input: Feedback is implicit through streaming behavior, collection adds,
browse history, search history etc
User Plays to Track Recs 11
1. Weighted play counts from logs
2. Train Model using the input signals
3. Generate recs from
the trained model
4. Post process the
recommendations
12
Step 1: ETL of Logs
•Extract and transform the anonymized logs to training data set
•Case: Logs -> (user, track, wt.count)
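A minimal sketch of the Logs -> (user, track, wt.count) step, assuming each anonymized log record is a (user, track, event) tuple. The event names and their weights are hypothetical; the slides do not say how plays are weighted.

```python
from collections import defaultdict

# Hypothetical per-event weights (assumption, not from the talk).
EVENT_WEIGHTS = {"stream": 1.0, "skip": 0.0, "collection_add": 2.0}


def weighted_counts(log_records):
    """Aggregate (user, track, event) records into (user, track, weighted count)."""
    totals = defaultdict(float)
    for user, track, event in log_records:
        totals[(user, track)] += EVENT_WEIGHTS.get(event, 0.0)
    # Drop pairs whose total weight is zero: they carry no positive signal.
    return [(u, t, w) for (u, t), w in sorted(totals.items()) if w > 0]


logs = [
    ("u1", "burn", "stream"),
    ("u1", "burn", "stream"),
    ("u1", "lights", "skip"),
    ("u2", "burn", "collection_add"),
]
print(weighted_counts(logs))
```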
Step 2: Construct Big Matrix! 13
[Figure: user-track matrix with Users (m) as rows and Tracks (n) as columns; example entry: user Vidhya, track "Burn" by Ellie Goulding]
Order of 70M x 30M!
Latent Factor Models 14
•Use a “small” representation for each user and item (track): f-dimensional vectors (here, f = 2)
[Figure: the User Track Matrix (m x n) is approximated by the User Vector Matrix X (m x f) and the Track Vector Matrix Y (n x f); example row: user Vidhya, example column: track "Burn"]
Matrix Factorization using Implicit Feedback 15
•User Track Play Count Matrix
•User Track Preference Matrix: binary label (1 => played, 0 => not played)
•Weights Matrix: weights based on play count and smoothing
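A small sketch of how one play count could be turned into the binary preference and the weight described above. The slides only say "weights based on play count and smoothing"; the log-smoothed form below is one choice seen in the implicit-feedback literature, and `alpha` and `eps` are assumed values.

```python
import math


def preference_and_weight(play_count, alpha=40.0, eps=1.0):
    """Return (binary preference, confidence weight) for one (user, track) cell.

    alpha and eps are illustrative hyperparameters (assumptions).
    """
    preference = 1 if play_count > 0 else 0
    # Log smoothing keeps very heavy listeners from dominating the loss.
    weight = 1.0 + alpha * math.log(1.0 + play_count / eps)
    return preference, weight


for plays in (0, 1, 10, 100):
    print(plays, preference_and_weight(plays))
```

Unplayed cells keep a weight of 1, so they still count as weak negative evidence rather than being ignored.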
Equation(s) Alert!
16
Implicit Matrix Factorization 17
[Figure: binary preference matrix (rows: Users, columns: Tracks), approximated by the product of X and Y]
1 0 0 0 1 0 0 1
0 0 1 0 0 1 0 0
1 0 1 0 0 0 1 1
0 1 0 0 0 1 0 0
0 0 1 0 0 1 0 0
1 0 0 0 1 0 0 1
•Aggregate all (user, track) streams into a large matrix
•Goal: Approximate the binary preference matrix by the inner product of 2 smaller matrices, minimizing the weighted RMSE (root mean squared error) with a function of total plays as the weight
•Why?: Once learned, the top recommendations for a user are the top inner products between their latent factor vector in X and the track latent factor vectors in Y.
•$\beta_u$ = bias for user
•$\beta_i$ = bias for item
•$\lambda$ = regularization parameter
•$p_{ui}$ = 1 if user streamed track else 0
•$c_{ui}$ = weight, a function of total plays
•$x_u$ = user latent factor vector
•$y_i$ = item latent factor vector
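The equation image on this slide did not survive extraction. A plausible form of the objective, pieced together from the legend and the stated goal (weighted squared error with user and item biases plus regularization), is sketched below. The symbol names $\beta_u$, $\beta_i$, $\lambda$, $p_{ui}$, $c_{ui}$, $x_u$ and the exact shape of the regularizer are assumptions; only $y_i$ is visible in the extracted text.

```latex
\min_{X,\,Y,\,\beta}\;
\sum_{u,i} c_{ui}\,\bigl(p_{ui} - x_u^{\top} y_i - \beta_u - \beta_i\bigr)^{2}
\;+\; \lambda\Bigl(\sum_{u} \lVert x_u \rVert^{2} + \sum_{i} \lVert y_i \rVert^{2}\Bigr)
```

Here $c_{ui}$ is the weight derived from total plays and $p_{ui}$ is the binary preference, so heavily played tracks pull the inner product $x_u^{\top} y_i$ toward 1 more strongly than unplayed ones pull it toward 0.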
Alternating Least Squares 18
•Fix tracks, solve for users
•Fix users, solve for tracks
•Repeat until convergence…
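The alternating steps can be sketched with NumPy on the small example matrix from the slides. This is a dense toy version under stated assumptions: bias terms are left out, the weight is taken as 1 + alpha * count, and `alpha`, `lam`, `f` and the iteration count are illustrative values, not the production settings.

```python
import numpy as np


def objective(counts, X, Y, alpha, lam):
    """Weighted squared error plus L2 regularization (biases omitted)."""
    P = (counts > 0).astype(float)
    C = 1.0 + alpha * counts
    err = P - X @ Y.T
    return float((C * err ** 2).sum() + lam * ((X ** 2).sum() + (Y ** 2).sum()))


def solve_rows(fixed, C, P, lam):
    """For each row, solve a weighted ridge regression against the fixed factors."""
    f = fixed.shape[1]
    out = np.empty((C.shape[0], f))
    for r in range(C.shape[0]):
        weighted = fixed.T * C[r]                 # f x n, columns scaled by weights
        A = weighted @ fixed + lam * np.eye(f)    # f x f normal-equation matrix
        b = weighted @ P[r]
        out[r] = np.linalg.solve(A, b)
    return out


def als(counts, f=2, alpha=40.0, lam=0.1, iters=10, seed=0):
    m, n = counts.shape
    P = (counts > 0).astype(float)
    C = 1.0 + alpha * counts
    rng = np.random.default_rng(seed)
    X = rng.normal(scale=0.1, size=(m, f))
    Y = rng.normal(scale=0.1, size=(n, f))
    for _ in range(iters):
        X = solve_rows(Y, C, P, lam)        # fix tracks, solve for users
        Y = solve_rows(X, C.T, P.T, lam)    # fix users, solve for tracks
    return X, Y


toy = np.array([
    [1, 0, 0, 0, 1, 0, 0, 1],
    [0, 0, 1, 0, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 1, 0, 0],
    [1, 0, 0, 0, 1, 0, 0, 1],
], dtype=float)

X, Y = als(toy)
scores = X @ Y.T                       # predicted preference for every (user, track)
print(np.argsort(-scores[0])[:3])      # top-3 track indices for the first user
```

Each half-step is an exact least-squares solve, so the objective cannot increase from one iteration to the next; that is what makes "repeat until convergence" a safe stopping rule.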
Vectors
•“Compact” representation for users and items(tracks) in the same space
Why Vectors? 25
•Vectors encode higher order dependencies
•Users and Items in the same vector space!
•Use vector similarity to compute:
•Item-Item similarities
•User-Item recommendations
•Linear complexity: order of number of latent factors
•Easy to scale up
Step 3: Compute Recs! 26
•Compute track similarities and track recommendations for users using a similarity measure:
•Euclidean Distance
•Cosine Similarity
•Pearson Correlation
Recommendations via Cosine Similarity 27
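A minimal sketch of ranking tracks for a user by cosine similarity, assuming user and track vectors live in the same latent space. The vectors and track names below are made up for illustration.

```python
import math


def cosine(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(user_vec, track_vecs, k):
    """Names of the k tracks whose vectors point most nearly the user's way."""
    ranked = sorted(track_vecs.items(),
                    key=lambda item: cosine(user_vec, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]


user = (1.0, 0.0)
tracks = {"a": (2.0, 0.0), "b": (0.0, 1.0), "c": (1.0, 1.0)}
print(top_k(user, tracks, 2))
```

The same `cosine` function applied to two track vectors gives item-item similarity, which is how one vector model can serve both use cases.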
Annoy 28
•70 million users, at least 4 million tracks for candidates per user
•Brute Force Approach:
•O(70M x 4M x 10) ~= O(3 peta-operations)!
• Approximate Nearest Neighbor Oh Yeah!
• Uses Locality Sensitive Hashing
• Clone: https://github.com/spotify/annoy
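The brute-force estimate can be checked with a few lines of arithmetic. The factor of 10 is taken from the slide; reading it as the rough per-comparison cost is an assumption.

```python
users = 70_000_000
candidates_per_user = 4_000_000
ops_per_comparison = 10  # slide's factor of 10; its exact meaning is assumed

total_ops = users * candidates_per_user * ops_per_comparison
peta_ops = total_ops / 1e15

print(f"{total_ops:.1e} operations, about {peta_ops:.1f} peta-operations")
```

That cost is for a single pass over all users, which is why an approximate nearest neighbor index that inspects only a small fraction of candidates per query is attractive.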
Step 4: Post Processing 29
•Apply Filters
•Interacted music
•Holiday music anyone?
•Factor for:
•Diversity
•Freshness
•Popularity
•Demographics
•Seasonality
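A rough sketch of two of the post-processing ideas: filtering out tracks the user has already interacted with or that carry an excluded tag (for example out-of-season holiday music), and a simple per-artist cap as a stand-in for diversity. The record layout, tag names and cap are hypothetical.

```python
def post_process(candidates, interacted, excluded_tags, max_per_artist=1):
    """Filter and diversify scored candidates; returns track ids, best first.

    candidates: list of dicts with keys 'track', 'artist', 'tags' (a set), 'score'.
    """
    kept = []
    per_artist = {}
    for cand in sorted(candidates, key=lambda c: c["score"], reverse=True):
        if cand["track"] in interacted:
            continue  # already played or saved by this user
        if excluded_tags & cand["tags"]:
            continue  # e.g. holiday music outside the season
        seen = per_artist.get(cand["artist"], 0)
        if seen >= max_per_artist:
            continue  # diversity: limit tracks from one artist
        per_artist[cand["artist"]] = seen + 1
        kept.append(cand["track"])
    return kept


candidates = [
    {"track": "t1", "artist": "a", "tags": set(), "score": 0.9},
    {"track": "t2", "artist": "a", "tags": set(), "score": 0.8},
    {"track": "t3", "artist": "b", "tags": {"holiday"}, "score": 0.7},
    {"track": "t4", "artist": "c", "tags": set(), "score": 0.6},
    {"track": "t5", "artist": "d", "tags": set(), "score": 0.5},
]
print(post_process(candidates, interacted={"t5"}, excluded_tags={"holiday"}))
```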
30
70 Million users x 30
Million tracks. How to
scale?
Matrix Factorization with MapReduce 31
[Figure: all log entries are split into K x L blocks keyed by (u % K, i % L), from (0, 0) to (K-1, L-1). In the map step, each block is paired with the user vectors for its u % K value and the item vectors for its i % L value; the reduce step groups the results by u % K = 0, 1, ..., K-1.]
•Split the matrix up into K x L blocks.
•Each mapper gets a different block, sums up intermediate terms, then keys by user (or item) to reduce to the final user (or item) vector.
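The K x L blocking can be sketched as a plain grouping function, assuming integer user and item ids. The function names are illustrative.

```python
from collections import defaultdict


def block_key(user_id, item_id, K, L):
    """Block coordinates for one log entry: (u % K, i % L)."""
    return (user_id % K, item_id % L)


def partition(log_entries, K, L):
    """Group (user, item, count) tuples into at most K * L blocks."""
    blocks = defaultdict(list)
    for user_id, item_id, count in log_entries:
        blocks[block_key(user_id, item_id, K, L)].append((user_id, item_id, count))
    return dict(blocks)


entries = [(1, 0, 3), (5, 3, 1), (9, 6, 2), (2, 4, 7)]
for key, block in sorted(partition(entries, K=4, L=3).items()):
    print(key, block)
```

Because a block only ever needs the user vectors with one value of u % K and the item vectors with one value of i % L, each mapper can hold its slice of both factor matrices in memory.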
Matrix Factorization with MapReduce 32
[Figure: one map task. Map input: tuples (u, i, count) where u % K = x and i % L = y. Distributed cache: all user vectors where u % K = x and all item vectors where i % L = y. The mapper emits contributions; the reducer produces the new vector.]
•Input to Mapper is a list of (user, item, count) tuples
– user modulo K is the same for all users in the block
– item modulo L is the same for all items in the block
– Mapper aggregates intermediate contributions for each user (or item)
– E.g.: K=4, Mapper #1 gets users 1, 5, 9, 13, etc.
– Reducer keys by user (or item), aggregates intermediate mapper sums and solves the closed form for the final user (or item) vector
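A sketch of the mapper and reducer for the user-vector update, written as plain Python functions rather than real MapReduce jobs. It assumes the weight is 1 + alpha * count, leaves out the bias terms, and assumes the Gram matrix of all item vectors (`YtY`) is computed in a separate pass; none of those details are stated in the slides.

```python
import numpy as np


def mapper(block_entries, item_vecs, alpha=40.0):
    """Sum each user's contributions from one block of (u, i, count) tuples."""
    f = len(next(iter(item_vecs.values())))
    partial = {}
    for u, i, count in block_entries:
        y = item_vecs[i]
        c = 1.0 + alpha * count                     # assumed weight form
        if u not in partial:
            partial[u] = [np.zeros((f, f)), np.zeros(f)]
        partial[u][0] += (c - 1.0) * np.outer(y, y)  # extra weight on played items
        partial[u][1] += c * y                       # preference is 1 for played items
    return partial


def reducer(partials_for_user, YtY, lam=0.1):
    """Combine one user's partial sums and solve the closed form for x_u."""
    f = YtY.shape[0]
    A = YtY + lam * np.eye(f)   # baseline weight 1 on every item, plus ridge term
    b = np.zeros(f)
    for A_part, b_part in partials_for_user:
        A = A + A_part
        b = b + b_part
    return np.linalg.solve(A, b)


item_vecs = {0: np.array([1.0, 0.0]), 1: np.array([0.0, 1.0]),
             2: np.array([1.0, 1.0]), 3: np.array([0.5, -1.0])}
Y = np.array([item_vecs[i] for i in range(4)])

# User 1's plays fall into two item blocks (i % 2 == 0 and i % 2 == 1).
part_a = mapper([(1, 0, 2.0)], item_vecs)[1]
part_b = mapper([(1, 3, 1.0)], item_vecs)[1]
x_user_1 = reducer([part_a, part_b], Y.T @ Y)
print(x_user_1)
```

Only played (user, item) pairs flow through the mappers; the unplayed cells are covered by the shared `YtY` term, which keeps the per-user work proportional to the number of plays rather than the catalog size.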
Music Recommendations Data Flow
33
34
Source:
Revisiting YOLO!
35
“You Only Listen Once to judge
recommendations” problem
Optimizing for the Yolo Problem
•OFFLINE TESTING:
•Experts’ Inputs
•Measure accuracy
•A/B TESTS: control vs a/b group. Some useful metrics we consider:
•DAU / WAU / MAU
•Retention
•Session Length
•Skip Rate
36
Challenge Accepted!
•Cold start problem for both users and new music/upcoming artists:
•Content based signals, real time recommendation
•Measuring recommendation quality:
•A/B test metrics
•Active forums for getting user feedback
•Scam Attacks:
•Rule based model to detect scammers
•Human choices are not always predictable:
•Faith in humanity
37
What Next?
•Personalize user experience on Spotify for every moment:
•Right Now
•Recommend other media formats:
•Podcasts
•Video
•Power music recommendations on other platforms:
•Google Now
38
Join the
Band!
We are hiring!
39
Thank You!
You can reach me @
Email: vidhya@spotify.com
Twitter: @vid052

The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 

Dernier (20)

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 

Building Data Pipelines for Music Recommendations at Spotify

  • 1. Data Pipelines for Music Recommendations @ Spotify. Vidhya Murali (@vid052), October 17, 2015
  • 2. Who Am I? (Vidhya Murali)
      • Areas of interest: data and machine learning
      • Data Engineer @Spotify
      • Master's student from the University of Wisconsin-Madison, aka Happy Badger for life!
  • 3. “Torture the data, and it will confess!” – Ronald Coase, Nobel Prize Laureate
  • 4. Spotify’s Big Data
      • Started in 2006, now available in 58 countries
      • 70+ million active users, 20+ million paid subscribers
      • 30+ million songs in our catalog, ~20K added every day
      • 1.5 billion playlists so far and counting
      • 1 TB of user data logged every day
      • Hadoop cluster with 1500 nodes
      • ~20,000 Hadoop jobs per day
  • 5. Music Recommendations at Spotify
      • Features: Discover, Discover Weekly, Moments, Radio, Related Artists
  • 7-8. Approaches
      • Manual curation by experts
      • Editorial tagging
      • Metadata (e.g. label-provided data, NLP over news and blogs)
      • Audio signals
      • Collaborative filtering model
  • 9. Collaborative Filtering Model
      • Find patterns in users’ past behavior to generate recommendations
      • Domain independent
      • Scalable
      • Accuracy (Collaborative Model) >= Accuracy (Content Based Model)
  • 10. Definition of CF (legacy slide by Erik Bernhardsson)
      • “Hey, I like tracks P, Q, R, S!”
      • “Well, I like tracks Q, R, S, T!”
      • “Then you should check out track P!”
      • “Nice! Btw try track T!”
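The exchange on this slide can be sketched in a few lines of Python. This is a toy illustration only: the names `a`, `b`, and `recommend_from_neighbor` are hypothetical, and a real collaborative filtering model compares many users rather than a single pair.

```python
def recommend_from_neighbor(my_tracks, neighbor_tracks):
    """Suggest tracks a similar listener likes that I have not played yet."""
    return sorted(set(neighbor_tracks) - set(my_tracks))


# The two listeners from the slide.
a = {"P", "Q", "R", "S"}
b = {"Q", "R", "S", "T"}

# The shared taste is what makes the two listeners comparable.
overlap = a & b

recs_for_b = recommend_from_neighbor(b, a)  # what listener a suggests to b
recs_for_a = recommend_from_neighbor(a, b)  # what listener b suggests to a
```

The larger the overlap between two listeners, the more weight one listener's remaining tracks would plausibly carry as suggestions for the other.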
  • 12-14. The YoLo Problem
      • YoLo Problem: “You Only Listen Once” to judge recommendations
      • Goal: Predict whether users will listen to new music (new to the user)
      • Challenges:
          • Scale of catalog (30M songs + ~20K added every day)
          • Repeated consumption of music is fairly common
          • Music is niche
          • Music consumption is heavily influenced by the user’s lifestyle
      • Input: Feedback is implicit, through streaming behavior, collection adds, browse history, search history, etc.
  • 15-19. User Plays to Track Recs
      1. Weighted play counts from logs
      2. Train the model using the input signals
      3. Generate recs from the trained model
      4. Post-process the recommendations
  • 20. Step 1: ETL of Logs
      • Extract and transform the anonymized logs into a training data set
      • Case: Logs -> (user, track, wt.count)
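A minimal sketch of the Logs -> (user, track, wt.count) transformation. The record layout `(user, track, ms_played)` and the weighting rule (plays shorter than 30 seconds count as zero) are assumptions made for illustration; the slides do not say how the weights are actually computed.

```python
from collections import defaultdict

MIN_MS = 30_000  # assumed threshold: shorter plays are treated as skips


def play_weight(ms_played):
    """Hypothetical weighting: a full play counts 1.0, a skip counts 0.0."""
    return 1.0 if ms_played >= MIN_MS else 0.0


def logs_to_training_rows(log_records):
    """Aggregate raw play events into (user, track, weighted_count) rows."""
    totals = defaultdict(float)
    for user, track, ms_played in log_records:
        totals[(user, track)] += play_weight(ms_played)
    # Drop pairs whose weighted count is zero: they carry no positive signal.
    return sorted((u, t, w) for (u, t), w in totals.items() if w > 0)


logs = [
    ("u1", "burn", 200_000),
    ("u1", "burn", 5_000),
    ("u1", "burn", 180_000),
    ("u2", "burn", 4_000),
    ("u2", "halo", 240_000),
]
rows = logs_to_training_rows(logs)
```

At the scale described earlier in the deck, this aggregation would run as a distributed job rather than in a single process; the per-record logic stays the same.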
  • 21-22. Step 2: Construct Big Matrix!
      • Rows are users (m), columns are tracks (n); example entry: user Vidhya, track “Burn” by Ellie Goulding
      • Order of 70M x 30M!
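A rough sketch of why the matrix has to be stored sparsely. The dict-of-dicts layout and the sample rows are illustrative assumptions, not the storage format used in the pipeline.

```python
def build_sparse_matrix(rows):
    """Store only the observed (user, track) cells as nested dicts."""
    matrix = {}
    for user, track, count in rows:
        matrix.setdefault(user, {})[track] = count
    return matrix


rows = [("vidhya", "burn", 3.0), ("vidhya", "halo", 1.0), ("erik", "burn", 2.0)]
matrix = build_sparse_matrix(rows)
stored_cells = sum(len(tracks) for tracks in matrix.values())

# A dense 70M x 30M matrix would need this many cells, almost all of them zero.
users, tracks = 70_000_000, 30_000_000
dense_cells = users * tracks
```

Each user plays only a small part of the catalog, so the number of stored cells grows with the number of observed (user, track) pairs rather than with m x n.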
  • 23-27. Latent Factor Models
      • Use a “small” representation for each user and item (track): f-dimensional vectors (here, f = 2)
      • User Track Matrix: (m x n)
      • User Vector Matrix: X: (m x f)
      • Track Vector Matrix: Y: (n x f)
      [Figure: the m x n user-track matrix (example row Vidhya, example column Burn) shown alongside the smaller matrices X and Y]
  • 28-31. Matrix Factorization using Implicit Feedback
      • User Track Play Count Matrix
      • User Track Preference Matrix: binary label, 1 => played, 0 => not played
      • Weights Matrix: weights based on play count and smoothing
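A small sketch of deriving the preference and weights matrices from play counts. The slide only says the weights are based on play count and smoothing; the logarithmic form and the constant `ALPHA` below are assumptions chosen for illustration.

```python
import math

ALPHA = 40.0  # assumed scaling constant, not taken from the slides


def preference(play_count):
    """Binary label: 1 if the user played the track at all, else 0."""
    return 1 if play_count > 0 else 0


def weight(play_count, alpha=ALPHA):
    """Assumed smoothing: weight grows with the log of the play count."""
    return 1.0 + alpha * math.log1p(play_count)


# Rows are users, columns are tracks.
play_counts = [
    [3, 0],
    [0, 1],
]
preferences = [[preference(c) for c in row] for row in play_counts]
weights = [[weight(c) for c in row] for row in play_counts]
```

Unplayed cells keep a small baseline weight of 1.0, so the model still treats them as weak negatives instead of ignoring them.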
  • 33. Implicit Matrix Factorization
      • Aggregate all (user, track) streams into a large matrix
      • Goal: Approximate the binary preference matrix by the inner product of 2 smaller matrices, minimizing the weighted RMSE (root mean squared error) with a function of total plays as the weight
      • Why?: Once learned, the top recommendations for a user are the top inner products between their latent factor vector in X and the track latent factor vectors in Y.
      • Terms in the objective: bias for user; bias for item; regularization parameter; binary preference (1 if user streamed track, else 0); user latent factor vector; item latent factor vector y_i
      [Figure: binary users x tracks preference matrix approximated by the product of the user matrix X and the track matrix Y]
  • 34-39. Alternating Least Squares (same figure, goal, and terms as slide 33)
      • Fix tracks, solve for users
      • Fix users, solve for tracks
      • Repeat until convergence…
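The objective described on these slides can be written out as follows. The symbols are assumptions that follow the commonly used implicit-feedback formulation, since the extracted text only preserves $y_i$: $p_{ui}$ is the binary preference, $c_{ui}$ the play-count-based weight, $x_u$ and $y_i$ the user and item latent factor vectors, $\beta_u$ and $\beta_i$ the user and item biases, and $\lambda$ the regularization parameter.

```latex
\min_{X,\,Y,\,\beta}\;
\sum_{u,i} c_{ui}\,\bigl(p_{ui} - x_u^{\top} y_i - \beta_u - \beta_i\bigr)^{2}
\;+\; \lambda \Bigl(\sum_{u} \lVert x_u \rVert^{2} + \sum_{i} \lVert y_i \rVert^{2}\Bigr)

% Simplified alternating updates, dropping the bias terms for brevity.
% C^{u} = \mathrm{diag}(c_{u1}, \dots, c_{un}) and p_u = (p_{u1}, \dots, p_{un})^{\top};
% C^{i} and p_i are defined the same way over users.

% Fix tracks (Y), solve for each user:
x_u = \bigl(Y^{\top} C^{u} Y + \lambda I\bigr)^{-1} Y^{\top} C^{u} p_u

% Fix users (X), solve for each track:
y_i = \bigl(X^{\top} C^{i} X + \lambda I\bigr)^{-1} X^{\top} C^{i} p_i
```

With one side fixed, the objective is quadratic in the other side, so each update is an ordinary regularized least-squares solve over an f x f system. That is what makes the alternating scheme practical to parallelize per user and per track.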
  • 40. Vectors
      • “Compact” representation for users and items (tracks) in the same space
  • 41. Why Vectors?
      • Vectors encode higher order dependencies
      • Users and items in the same vector space!
      • Use vector similarity to compute:
          • Item-item similarities
          • User-item recommendations
      • Linear complexity: order of number of latent factors
      • Easy to scale up
  • 42-43. Step 3: Compute Recs!
      • Compute track similarities and track recommendations for users using a similarity measure:
          • Euclidean distance
          • Cosine similarity
          • Pearson correlation
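The three measures named on this slide can be sketched with the standard library alone. The function names are illustrative; a production pipeline would more likely use a vectorized numerical library.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a, b):
    """Straight-line distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between the vectors; ignores their lengths."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def pearson_correlation(a, b):
    """Cosine similarity of the mean-centered vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_similarity(centered_a, centered_b)
```

Cosine similarity and Pearson correlation are scale-invariant, which may matter when very popular tracks end up with longer vectors than niche ones.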
  • 44-45. Recommendations via Cosine Similarity
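A brute-force sketch of ranking tracks for one user by cosine similarity. The track names and two-dimensional vectors are made up for illustration, and `top_k_tracks` is a hypothetical helper.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_tracks(user_vec, track_vecs, k, exclude=()):
    """Score every candidate track against the user vector and keep the best k."""
    scored = [
        (cosine(user_vec, vec), track)
        for track, vec in track_vecs.items()
        if track not in exclude
    ]
    scored.sort(reverse=True)
    return [track for _, track in scored[:k]]


track_vecs = {"burn": [0.9, 0.1], "halo": [0.7, 0.3], "fugue": [0.0, 1.0]}
user_vec = [1.0, 0.0]

recs = top_k_tracks(user_vec, track_vecs, k=2)
recs_without_burn = top_k_tracks(user_vec, track_vecs, k=2, exclude={"burn"})
```

Scoring every track for every user this way is linear in the catalog size per user, which is the cost that approximate nearest neighbor search is meant to avoid.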
  • 46. Annoy
      • 70 million users, at least 4 million candidate tracks per user
      • Brute force approach: O(70M x 4M x 10) ~= O(3 peta-operations)!
      • Approximate Nearest Neighbor Oh Yeah!
      • Uses Locality-Sensitive Hashing
      • Clone: https://github.com/spotify/annoy
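The brute-force estimate on this slide, plus a generic random-hyperplane hashing sketch, can be written as below. This is not Annoy's implementation or API: it only illustrates the idea of bucketing vectors by hashed direction so that a query is compared against one bucket instead of the whole catalog. All function names are hypothetical, and reading the factor of 10 as a per-pair cost is an assumption.

```python
import random


def brute_force_ops(users, candidates, ops_per_pair):
    """Total operations if every user is scored against every candidate."""
    return users * candidates * ops_per_pair


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw random hyperplane normals; each one contributes one signature bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vec, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(h * x for h, x in zip(plane, vec)) >= 0 else 0
        for plane in hyperplanes
    )


total_ops = brute_force_ops(70_000_000, 4_000_000, 10)  # about 2.8e15

planes = make_hyperplanes(num_bits=8, dim=3)
sig_a = signature([1.0, 2.0, 3.0], planes)
sig_scaled = signature([2.0, 4.0, 6.0], planes)  # same direction as the first
```

Vectors pointing in the same direction always share a signature, and vectors separated by a small angle share most bits with high probability, so the signature works as a bucket key for candidate lookup.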
  • 47. Step 4: Post Processing
      • Apply filters:
          • Interacted music
          • Holiday music anyone?
      • Factor for:
          • Diversity
          • Freshness
          • Popularity
          • Demographics
          • Seasonality
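A minimal sketch of the filtering part of post-processing. The track record layout (`id`, `artist`, `tags`), the `holiday` tag, and the one-track-per-artist cap used as a stand-in for diversity are all assumptions for illustration.

```python
def post_process(ranked_tracks, already_played, blocked_tags, max_per_artist=1):
    """Filter a ranked candidate list while preserving its order."""
    kept = []
    per_artist = {}
    for track in ranked_tracks:
        if track["id"] in already_played:
            continue  # filter: music the user has already interacted with
        if blocked_tags & set(track["tags"]):
            continue  # filter: e.g. holiday music outside the season
        seen = per_artist.get(track["artist"], 0)
        if seen >= max_per_artist:
            continue  # diversity: cap the number of tracks per artist
        per_artist[track["artist"]] = seen + 1
        kept.append(track["id"])
    return kept


ranked = [
    {"id": "t1", "artist": "a1", "tags": []},
    {"id": "t2", "artist": "a1", "tags": []},
    {"id": "t3", "artist": "a2", "tags": ["holiday"]},
    {"id": "t4", "artist": "a3", "tags": []},
    {"id": "t5", "artist": "a4", "tags": []},
]
result = post_process(ranked, already_played={"t5"}, blocked_tags={"holiday"})
```

Factors such as freshness, popularity, demographics, and seasonality would more naturally be applied as score adjustments before this filtering pass than as hard filters.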
  • 48. 70 Million users x 30 Million tracks. How to scale?
  • 49. Matrix Factorization with MapReduce
      • Split the matrix up into K x L blocks.
      • Each mapper gets a different block and sums up intermediate terms; the output is then keyed by user (or item) so the reducer can compute the final user (or item) vector.
      [Figure: in the map step, all log entries are split into blocks keyed by (u % K, i % L), from (0, 0) to (K-1, L-1); each block reads the user vectors for its u % K and the item vectors for its i % L; the reduce step groups results by u % K]
  • 50. Matrix Factorization with MapReduce: one map task
      • Map input: tuples (u, i, count) where u % K = x and i % L = y
      • Distributed cache: all user vectors where u % K = x, and all item vectors where i % L = y
      • Mapper emits contributions; reducer produces the new vector
      • Input to the mapper is a list of (user, item, count) tuples:
          – user modulo K is the same for all users in the block
          – item modulo L is the same for all items in the block
          – Mapper aggregates intermediate contributions for each user (or item)
          – E.g. K=4: Mapper #1 gets users 1, 5, 9, 13, etc.
          – Reducer keys by user (or item), aggregates intermediate mapper sums, and solves the closed form for the final user (or item) vector
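The block assignment described on these slides can be simulated in a single process. This sketch covers only the partitioning step: the mapper's intermediate sums and the reducer's closed-form solve are omitted, and the function names and sample tuples are illustrative.

```python
from collections import defaultdict


def partition_into_blocks(tuples, K, L):
    """Group (user, item, count) tuples into K x L blocks by modulo."""
    blocks = defaultdict(list)
    for u, i, count in tuples:
        blocks[(u % K, i % L)].append((u, i, count))
    return dict(blocks)


def users_in_block_row(user_ids, K, x):
    """All users whose tuples land in block row x."""
    return [u for u in user_ids if u % K == x]


tuples = [(1, 0, 3), (5, 2, 1), (9, 1, 2), (2, 0, 4), (13, 4, 1)]
blocks = partition_into_blocks(tuples, K=4, L=2)

# With K = 4, block row 1 holds users 1, 5, 9, 13, matching the slide's example.
row_one_users = users_in_block_row(range(1, 14), K=4, x=1)
```

Because a block only needs the user vectors for one residue of K and the item vectors for one residue of L, each mapper can hold its slice of both factor matrices in memory.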
  • 53. Revisiting YOLO! The “You Only Listen Once to judge recommendations” problem
  • 54. Optimizing for the YoLo Problem
      • Offline testing:
          • Experts’ inputs
          • Measure accuracy
      • A/B tests: control vs a/b group. Some useful metrics we consider:
          • DAU / WAU / MAU
          • Retention
          • Session length
          • Skip rate
  • 55. Challenge Accepted!
      • Cold start problem for both users and new music/upcoming artists:
          • Content based signals, real time recommendation
      • Measuring recommendation quality:
          • A/B test metrics
          • Active forums for getting user feedback
      • Scam attacks:
          • Rule based model to detect scammers
      • Human choices are not always predictable:
          • Faith in humanity
  • 56. What Next?
      • Personalize user experience on Spotify for every moment:
          • Right Now
      • Recommend other media formats:
          • Podcasts
          • Video
      • Power music recommendations on other platforms:
          • Google Now
  • 57. Join the Band! We are hiring!
  • 58. Thank You! You can reach me at:
      • Email: vidhya@spotify.com
      • Twitter: @vid052