3. Deezer overview
RecSysFr #3
● 420 employees in 20 cities
● 5M albums
● 40M tracks
● 100M playlists
● 16M MAU
● 6M subscribers
4. Some technical numbers
● ~500 servers
● 4.5 PB storage for audio files
● 1.5 TB of logs / day
● ~1B requests / day
● ~30k new albums each week
● Hadoop cluster with 1.5 PB storage, 4 TB RAM, 1000+ vcores
10. Architecture overview
Content data:
- Tags
- Popularity
User data:
- Taste model
- Hot tracks
- Behaviors
Build tracklist
- Data cache
- User action history
- Update user models
- Consolidate tags data
- Build indexes
Action logs
11. A/B test evaluation metrics
● % of users listening for more than 10 min
● % of users who reconnected on more than 3 days in the last week
● % of users who give a like / dislike
=> take care of statistical confidence!
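One common way to check statistical confidence on percentage metrics like these is a two-proportion z-test. The sketch below is an illustration only, not necessarily the test used at Deezer. The function name and the sample counts are hypothetical.

```python
import math

def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Return (z, two-sided p-value) for H0: both variants share the same rate."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled rate under the null hypothesis that A and B do not differ.
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("pooled rate is 0 or 1; the test is undefined")
    z = (p_b - p_a) / se
    # Two-sided p-value of the standard normal distribution, via erfc.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: users listening for more than 10 min in each variant.
z, p_value = two_proportion_z_test(4200, 10_000, 4350, 10_000)
print(f"z = {z:.3f}, p = {p_value:.4f}")
```

A small p-value suggests the observed difference is unlikely to be noise alone. Running several metrics or variants at once calls for a multiple-comparison correction, which this sketch does not handle.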
12. Offline testing / benchmarking
● A/B tests are costly and slow
● We want to test more cases
Offline testing:
● Set up a benchmarking methodology
● Freeze the data and evaluate algorithms against users' future actions
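The "freeze data, evaluate on future actions" idea can be sketched as a temporal split: train on events before a cutoff, then score recommendations against what users did afterwards. This is a minimal sketch under stated assumptions. The event format, the most-popular baseline, and precision@k as the metric are all illustrative choices, not the benchmark described in the talk.

```python
from collections import Counter, defaultdict

def temporal_split(events, cutoff):
    """Split (user_id, track_id, timestamp) events at a frozen cutoff."""
    past = [e for e in events if e[2] < cutoff]
    future = [e for e in events if e[2] >= cutoff]
    return past, future

def most_popular(past, k):
    """Baseline recommender: the same k most-played past tracks for everyone."""
    counts = Counter(track for _, track, _ in past)
    top = [track for track, _ in counts.most_common(k)]
    return lambda user: top

def precision_at_k(recommend, future, k):
    """Mean share of the top-k recommendations that each user played later."""
    relevant = defaultdict(set)
    for user, track, _ in future:
        relevant[user].add(track)
    if not relevant:
        return 0.0
    total = 0.0
    for user, tracks in relevant.items():
        recs = recommend(user)[:k]
        total += len(set(recs) & tracks) / k
    return total / len(relevant)

# Hypothetical listening events: (user_id, track_id, timestamp).
events = [
    ("u1", "a", 1), ("u2", "a", 2), ("u3", "a", 3),
    ("u1", "b", 4), ("u2", "b", 5), ("u3", "c", 6),
    ("u1", "c", 11), ("u2", "b", 12), ("u2", "d", 13),
]
past, future = temporal_split(events, cutoff=10)
score = precision_at_k(most_popular(past, k=2), future, k=2)
print(f"precision@2 = {score:.2f}")
```

Because the split is by time, the recommender never sees the actions it is scored on, and several algorithms can be compared on the same frozen data without running a live A/B test.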