2. The Data
“The files are in the computer.” – Derek Zoolander
Proprietary & Confidential
3. Pandora
200+ million registered users
70+ million active monthly users
The average Pandora listener listens for 17 hours a month
More than 80% of listening occurs on mobile and other connected devices
8.06% of total US radio listening hours
4. Pandora
1.47+ billion listening hours in October
30+ billion thumbs
5+ billion stations
Approximately one out of every two US smartphone users has listened to Pandora in the past month
5. Experimentation & Metrics
“It doesn’t matter how beautiful your theory is, it doesn’t matter how smart you
are. If it doesn’t agree with experiment, it’s wrong.” – Richard Feynman
6. A/B Testing
All improvements begin as a hypothesis.
Hypotheses beget experiments.
Experiments are tried against real Pandora listeners.
When an experiment beats the current algorithm, ship it!
Rinse, wash, repeat.
A/B testing is how you leverage scale. More data lets you build stronger
models and try fancy data-intensive algorithms, but the big win comes
from unlocking A/B testing. Online evaluation > offline evaluation.
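As a rough illustration of judging one experiment against the current algorithm, the sketch below runs a two-proportion z-test on a binary outcome such as "listener returned the following week". The metric, the function name `two_proportion_z_test`, and the sample numbers are assumptions for illustration only; the slides do not say which statistical test Pandora uses.

```python
import math

def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Return (z, two_sided_p) for the difference in success rate, B minus A.

    A is the control (current algorithm), B is the experiment.
    Uses the pooled-variance normal approximation, which is reasonable
    only when both groups are large.
    """
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical numbers: 10,000 listeners per arm.
z, p = two_proportion_z_test(5000, 10000, 5200, 10000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With large listener populations even small differences become detectable, which is one way to read the claim that scale unlocks A/B testing.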
7. Metrics
How you judge experiments shapes where you are headed.
Choose the wrong measuring stick and you wind up in the wrong place.
Choose the right measuring stick and progress is inevitable.
Improvements come both from better hypotheses to test and from better
measuring sticks.
Incremental improvements tend to come from hypotheses.
Leapfrog improvements tend to come from better measuring sticks.
8. Evolution of Big Picture Metrics
Thumb up percentage
Total listening hours
Listener return rate
Machine learning doesn’t exist in a vacuum.
Make sure you’re optimizing the right thing.
Approach problems by first deciding what you’re trying to achieve, then think about technology.
If ML isn’t the right tool for the job, don’t use it.
9. Deeper Metrics
Relevance
Prediction accuracy
Musical diversity
Novelty / Surprisal
Awesomeness
These metrics all support our big picture goal at Pandora:
Connecting people with music they love.
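The slides do not define these metrics. As one possible reading of "Novelty / Surprisal", the sketch below scores a track by its information content, -log2(p), where p is the track's share of all plays. The function names and the play counts are hypothetical.

```python
import math

def surprisal_bits(play_counts, track):
    """Surprisal of a track in bits: rarer tracks score higher.

    play_counts maps track id -> number of plays (an assumed input format).
    """
    total = sum(play_counts.values())
    p = play_counts[track] / total
    return -math.log2(p)

def mean_surprisal(play_counts, playlist):
    """Average surprisal over the tracks in a playlist."""
    return sum(surprisal_bits(play_counts, t) for t in playlist) / len(playlist)

# Hypothetical play counts.
counts = {"hit": 8, "deep_cut": 1, "b_side": 1}
print(surprisal_bits(counts, "hit"))
print(surprisal_bits(counts, "deep_cut"))
```

A metric like this rewards unfamiliar tracks, so it would presumably be balanced against relevance rather than maximized on its own.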
10. How It Works
“Truth is what works.” – William James
12. Ensemble Recommendations
The Music Genome Project
People are truly unique
No single approach to music recommendations works for everybody
Using a variety of recommendation techniques and combining them intelligently works – Pandora uses 50+ algorithms
The more varied the individual techniques, the stronger the ensemble – seek orthogonality
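The slides do not say how the 50+ algorithms are combined. A minimal sketch of one common approach, a weighted sum of per-recommender scores, is shown below. The recommender names, the weights, and the function `combine_scores` are hypothetical.

```python
from collections import defaultdict

def combine_scores(recommender_scores, weights):
    """Blend per-track scores from several recommenders into one ranking.

    recommender_scores: {recommender name: {track id: score}}
    weights:            {recommender name: weight}; missing names get 0.
    Returns track ids sorted from highest to lowest combined score.
    """
    combined = defaultdict(float)
    for name, scores in recommender_scores.items():
        w = weights.get(name, 0.0)
        for track, score in scores.items():
            combined[track] += w * score
    return sorted(combined, key=lambda t: combined[t], reverse=True)

# Hypothetical scores from two recommenders.
scores = {
    "content": {"a": 0.9, "b": 0.2},
    "item_item": {"b": 0.8, "c": 0.5},
}
print(combine_scores(scores, {"content": 0.5, "item_item": 0.5}))
```

Track "b" ranks first here because both recommenders support it, even though neither rates it highest on its own. Blending helps most when the individual recommenders make different kinds of errors, which is the point behind seeking orthogonality.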
13. Content-Based Recommendations
The Music Genome Project
25 music analysts
13 years in development
Up to 450 attributes identified per track – everything from the melody, harmony, and instrumentation to rhythm, vocals, and lyrics
As yet, the human ear still understands music better than machines do
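Once each track is described by a vector of analyst-assigned attributes, a content-based recommender can compare tracks by vector similarity. The sketch below uses cosine similarity as an assumed choice; the slides do not state which distance measure Pandora applies, and the three-attribute vectors are made up for illustration.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length attribute vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # Convention for an all-zero vector.
    return dot / (norm_u * norm_v)

# Hypothetical attribute vectors (real tracks would have up to 450 entries).
track_x = [0.8, 0.1, 0.5]
track_y = [0.7, 0.2, 0.6]
track_z = [0.0, 0.9, 0.1]
print(cosine_similarity(track_x, track_y))
print(cosine_similarity(track_x, track_z))
```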
14. Collaborative Filtering
At small scale, matrix factorization techniques work wonders
At Pandora scale, MF techniques make less sense for many problems
Don’t waste cycles doing something fancy when scale allows you to simply measure
Simple item-item recommenders win at scale
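To make "simply measure" concrete, here is a minimal item-item sketch that counts how often two tracks are thumbed up by the same listener and recommends the most frequent co-occurrences. The input format and function names are assumptions; a production system would also normalize for track popularity.

```python
from collections import Counter, defaultdict
from itertools import permutations

def build_cooccurrence(liked_by_listener):
    """Count, for each ordered track pair, the listeners who liked both.

    liked_by_listener: {listener id: iterable of liked track ids}
    """
    co = defaultdict(Counter)
    for liked in liked_by_listener.values():
        for a, b in permutations(set(liked), 2):
            co[a][b] += 1
    return co

def similar_items(co, track, k=3):
    """Return up to k tracks most often liked together with `track`."""
    return [t for t, _ in co[track].most_common(k)]

# Hypothetical thumbs-up data.
likes = {
    "u1": ["a", "b"],
    "u2": ["a", "b", "c"],
    "u3": ["a", "b"],
    "u4": ["a", "c"],
}
co = build_cooccurrence(likes)
print(similar_items(co, "a", k=2))
```

No model is fitted here. With enough listeners the raw counts are already a reliable signal, which is the argument the slide makes against matrix factorization at this scale.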
15. Collective Intelligence – reinforcement learning
Our listeners know what they want (most of the time)
Pandora is a platform for listeners to cooperate in making the music better for themselves
We build, grow, measure, and enhance this ecosystem – but mostly we stay out of the way
Pandora is awesome because our listeners are awesome