Tutorials
• T1: The Recommender Problem Revisited
• T2: Personalized Location Recommendation on Location-based Social Networks
• T3: Cross-Domain Recommender Systems
• T4: Social Recommender Systems
Keynotes
• K1: Neil Hunt - Quantifying the Value of Better Recommendations
• K2: Jeff Dean - Large Scale Machine Learning for Predictive Tasks
• K3: Hector Garcia-Molina - Thoughts on the Future of Recommender Systems
8th ACM Conference on Recommender Systems
Main Sessions
• S1: Novel Applications
• S2: Novel Setups
• S3: Cold Start and Hybrid Recommenders
• S4: Metrics and Evaluation
• S5: Diversity, Novelty and Serendipity
• S6: Recommendation Methods and Theory
• S7: Ranking and Top-N Recommendation
• S8: Matrix Factorization
Workshops
• W1: Controlled Experimentation in Recommendation, Ranking & Response Prediction
• W2: Recommender Systems and the Social Web
• W3: Crowdsourcing and Human Computation for Recommender Systems
• W4: Interfaces and Human Decision Making for Recommender Systems
• W5: New Trends in Content-based Recommender Systems
• W6: RecSys Challenge
• W7: Recommender Systems Evaluation
• W8: Recommendation Systems for Television and Movies
• W9: Large Scale Recommender Systems
Industry Sessions
• Facebook
› news recommendation
› page recommendation
• LinkedIn: A/B testing
• Shopkick
• StitchFix
• The Climate Corporation
T1: The Recommender Problem Revisited
Source: http://www.slideshare.net/xamat/recsys-2014-tutorial-the-recommender-problem-revisited
• Legacy from Netflix Prize: Singular Value Decomposition++ (SVD++), Restricted Boltzmann Machines (RBM)
• Ranking > RMSE: learning to rank is better than optimizing for RMSE
• Ranking measures: NDCG, MRR, FCP (Fraction of Concordant Pairs)
• Limitation of Collaborative Filtering: Cold Start, Popularity Bias
• Limitation of Content-Based Filtering: Requires meaningful features, difficult to implement serendipity, easy to overfit
• Hybrid approaches (weighting, switching, mixing, feature combination, cascade, feature augmentation, meta-level)
• Clustering: LSH (Locality Sensitive Hashing) for NN, K-Means, Spectral Clustering, LDA (Latent Dirichlet Allocation),
Association Rules
• Features: Serendipity, Diversity, Awareness, Explanation
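Of the ranking measures above, FCP is the simplest to state: the share of item pairs whose predicted ordering agrees with the true rating ordering. A minimal sketch (function and variable names are mine, not from the tutorial):

```python
def fcp(predicted, actual):
    """Fraction of Concordant Pairs: share of item pairs whose
    predicted ordering agrees with the actual rating ordering."""
    concordant = discordant = 0
    n = len(actual)
    for i in range(n):
        for j in range(i + 1, n):
            if actual[i] == actual[j]:
                continue  # ties carry no ordering information
            if (predicted[i] - predicted[j]) * (actual[i] - actual[j]) > 0:
                concordant += 1
            else:
                discordant += 1
    return concordant / (concordant + discordant)

# Perfectly ordered predictions -> 1.0; fully reversed -> 0.0
print(fcp([0.2, 0.5, 0.9], [1, 3, 5]))  # 1.0
```

NDCG and MRR additionally weight list positions, so unlike FCP they reward getting the top of the list right.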
• Supervised learning to rank
› Pointwise: Linear/Logistic Regression, SVM, GBDT
› Pairwise: RankSVM, RankBoost, RankNet, FRank
› Listwise: RankCosine, ListNet, CLiMF, TFMAP, SVM-MAP, AdaRank
• Tensor Factorization:
› Models: HOSVD, FM, Gradient Boosting FM
› Solvers: ALS, SGD, Adaptive SGD, MCMC
• Deep Learning: CF and CBF, training on GPUs and AWS (Spotify)
• Social Recommendations: Advogato, Appleseed, MoleTrust, TidalTrust
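The pairwise family above reduces ranking to classifying item pairs. A RankNet-style sketch of the pairwise logistic loss on score differences (names are illustrative, not from the tutorial):

```python
import math

def ranknet_pair_loss(s_i, s_j):
    """RankNet-style pairwise loss for a pair where item i should be
    ranked above item j: -log sigmoid(s_i - s_j)."""
    return math.log(1.0 + math.exp(-(s_i - s_j)))

def ranknet_pair_grad(s_i, s_j):
    """Gradient of the loss w.r.t. s_i (the negation applies to s_j)."""
    return -1.0 / (1.0 + math.exp(s_i - s_j))

# The loss shrinks as the preferred item's score pulls ahead
print(ranknet_pair_loss(2.0, 0.0) < ranknet_pair_loss(0.5, 0.0))  # True
```

Listwise methods like ListNet generalize this by defining the loss over permutations of the whole list rather than over independent pairs.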
Page composition
• Accuracy vs. Diversity
• Discovery vs. Continuation
• Depth vs. Coverage
• Freshness vs. Stability
• Recommendations vs. Tasks
Exploration vs. Exploitation
• Multi-armed Bandits (MAB)
• ε-greedy strategy: explore with probability ε (e.g. 5%), exploit with probability 1 − ε
• Choose an item/algorithm (MAB testing)
• Upper Confidence Bound (using variance)
• Thompson Sampling (posterior distribution)
Hastagiri P. Vanchinathan, Isidor Nikolic, Fabio De Bona, and Andreas Krause. 2014. Explore-exploit in top-N recommender systems via Gaussian processes.
Negar Hariri, Bamshad Mobasher, and Robin Burke. 2014. Context adaptation in interactive recommender systems.
Lihong Li, Wei Chu, John Langford, and Robert E. Schapire. 2010. A contextual-bandit approach to personalized news article recommendation.
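A minimal simulation of the ε-greedy strategy described above, with made-up Bernoulli arms; the 5% exploration rate follows the slide, everything else is illustrative:

```python
import random

def epsilon_greedy(means, epsilon, rng):
    """Pick an arm: with probability epsilon explore a uniformly random
    arm, otherwise exploit the current best empirical mean."""
    if rng.random() < epsilon:
        return rng.randrange(len(means))
    return max(range(len(means)), key=lambda a: means[a])

# Simulation with three invented Bernoulli arms (true rates 0.2/0.5/0.8)
rng = random.Random(42)
true_p = [0.2, 0.5, 0.8]
counts = [0, 0, 0]
means = [0.0, 0.0, 0.0]
for _ in range(5000):
    a = epsilon_greedy(means, 0.05, rng)
    reward = 1.0 if rng.random() < true_p[a] else 0.0
    counts[a] += 1
    means[a] += (reward - means[a]) / counts[a]  # incremental mean update
print(max(range(3), key=lambda a: counts[a]))  # the best arm collects most pulls
```

UCB replaces the fixed ε with a confidence bonus per arm, and Thompson Sampling draws each arm's rate from its posterior instead of using the point estimate.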
T3: Cross-Domain Recommender Systems
Task and goal (share of surveyed work):
• Multi-domain: cross-selling, diversity, serendipity (20%)
• Linked-domain: accuracy (55%)
• Cross-domain: cold-start, new users, new items (25%)
Source: http://recsys.acm.org/wp-content/uploads/2014/10/recsys2014-tutorial-cross_domain.pdf
Domain notion (share of surveyed work):
• Attribute (e.g. comedy vs. thriller): 12%
• Type (e.g. movies vs. books): 9%
• Item (e.g. movies vs. restaurants): 55%
• System (e.g. Netflix vs. MovieLens): 24%
Goal (share of surveyed work):
• Cold-start: 5%, new user: 15%, new item: 5%
• Improving accuracy: 55%
• Diversity: 5%, privacy: 5%, user model: 10%
• Learning approaches
› Linking/aggregating knowledge (merging user preferences, mediating user modeling data)
› Sharing/transferring knowledge (sharing latent features, transferring patterns: Code Book Transfer)
• Difficulties:
› Strongly depends on overlapping data
› Sometimes noisy and useless
• Research issues:
› Cross-domain vs. contextual models
› User model elicitation effort
› Real-life datasets
• About Netflix: 7B hours/quarter, 50M users, 90mins/day, 150M choices/day
• Current trend: Content discovery ("There are no bad shows, just shows with small audiences")
• Consumer: "I don't need suggestions, just show me the good stuff", "Don't hide the items, let me evaluate them"
• Oracle vs. advisor? "Trust me, you'll love this" vs. "Based upon your …, you'll probably enjoy this"
• Future TV: Personalized channel for each user, 20-50 personalized choices, unlimited catalog, recommended ads
• Metrics: Distribution of frequency vs. hours of viewing
• Moment of truth: 1-2 minutes to find something, 20-50 chances to connect
• Filter Bubbles and Echo Chambers: Recommendations reinforce existing taste and don't expose users to the new, unexpected or different
• Diversity, serendipity and explanation matter, but their impact is hard to measure
K1: Neil Hunt - Quantifying the Value of Better Recommendations
Source: http://www.slideshare.net/ndhunt/recsys-2014-the-value-of-better-recommendations
• Required: history, aggregated behavior of other users, understanding user contexts, understanding texts
• Deep learning is a hot topic (DNNs): embedding functions into ~1000-dimensional spaces, densifying data, space visualization; very effective for a wide range of tasks
• Learning analogies, e.g. hotter − hot + big ≈ bigger, fell-fall-fallen
• Interesting stats about Google Speech: 5 days training, 800 machines
• Text processing: bag of words -> topic modeling -> sequential neural networks (RNN, LSTM)
• Sentiment Analysis (Stanford Treebank) 1
• Paragraph Vector Model 2
• Time > Accuracy (Google), simple algorithms, scalability (patience threshold)
K2: Jeff Dean - Large Scale Machine Learning for Predictive Tasks
1 R. Socher, A. Perelygin, J. Wu, J. Chuang, C. Manning, et al. 2013. Recursive Deep Models for Semantic Compositionality over a Sentiment Treebank. pp. 1631-1642.
2 Quoc V. Le, Tomas Mikolov. 2014. Distributed Representations of Sentences and Documents. 31st International Conference on Machine Learning, Beijing, China, June 2014.
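The analogy bullet above can be illustrated with plain vector arithmetic over embeddings. The tiny 2-d "embeddings" below are invented for the example, not trained:

```python
import math

# Toy, hand-made vectors standing in for trained word embeddings
emb = {
    "hot":    [1.0, 0.0],
    "hotter": [1.0, 1.0],
    "big":    [0.0, 0.2],
    "bigger": [0.0, 1.2],
    "small":  [0.5, 0.1],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def nearest(vec, exclude):
    """Closest vocabulary word by cosine, excluding the query words."""
    return max((w for w in emb if w not in exclude),
               key=lambda w: cosine(emb[w], vec))

# hotter - hot + big ~= bigger
query = [h - o + b for h, o, b in zip(emb["hotter"], emb["hot"], emb["big"])]
print(nearest(query, exclude={"hotter", "hot", "big"}))  # bigger
```

With real embeddings (e.g. word2vec) the same offset arithmetic recovers many syntactic and semantic analogies, which is the point of the slide.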
Source: Stanford Treebank: http://nlp.stanford.edu/sentiment/index.html
• Convergence: recommendations + search + advertising
• Case study: CourseRank
› Recommendations: Additional courses
› Search: Listing courses
› Advertising: Books
› Control: Data visualization about courses
› Important to understand the domain and users
• Case study: DataSift
› Search engine (that can handle rich queries) used as a recommender engine
› Activating the crowd to fix knowledge that cannot be modeled by collaborative filtering (e.g. images)
› Crowds are slow
K3: H. Garcia-Molina – Thoughts on the Future of Rec. Systems
W8K1: Brendan Kitts – Addressable Ad Targeting for TV
• TV is dead?
• Positive trend:
› TV
› Direct mail
› Radio
• Negative trend:
› Yellow pages
› Newspaper
› Magazines
[Chart, 1952-2012: advertising trends for radio, direct mail, yellow pages, magazines, newspapers and total TV]
Source: B. Kitts - Tectonic Shifts in Television Advertising (RecSys 2014)
McDonough, P. (2012), The Evolution of The Video Consumer, Audience Measurement 7 presentation, Nielsen Corporation
[Charts: viewing time by age group (K2-11 through A65+) for video on mobile, video on internet, time-shifted TV and traditional TV]
Source: B. Kitts - Tectonic Shifts in Television Advertising (RecSys 2014)
W8K1: Brendan Kitts – Addressable Ad Targeting for TV
• Demographic based targeting
• 3500+ demographic values per person (highest vehicle value, jewelry, has children)
• Targeting: Pr(Buyer|Media)
• Media heatmap
• Automatic target creation (enrichment, profiling, clustering, thresholds)
• Automated media scoring (scores for targets)
• Schedule impact monitoring
Source: B. Kitts - Tectonic Shifts in Television Advertising (RecSysTV 2014)
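The Pr(Buyer|Media) targeting above boils down to estimating a conversion rate per media slot. A hedged sketch with invented counts; the add-one smoothing is my choice, not from the talk:

```python
def buyer_given_media(impressions, conversions):
    """Estimate Pr(Buyer | Media) per media slot from impression and
    purchase counts, with add-one smoothing for sparse slots."""
    return {slot: (conversions.get(slot, 0) + 1) / (impressions[slot] + 2)
            for slot in impressions}

# Invented example counts for two hypothetical media slots
impressions = {"golf_sunday": 1000, "news_morning": 50000}
conversions = {"golf_sunday": 40, "news_morning": 200}

scores = buyer_given_media(impressions, conversions)
best = max(scores, key=scores.get)
print(best)  # golf_sunday: higher conversion rate despite far fewer viewers
```

Ranking slots this way is what turns the "media heatmap" into automated media scoring: each slot gets a score for each target audience.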
• Future: Targeted advertising
• Samsung Smart TV: Automatic Content Recognition (ACR) -> log: time, IP address, object
• User metadata: income, age, gender
• More than half of the items are new, so cold start is significant
• Data processing: Extracting words from description
• Matrix Factorization on watching duration, learning to rank
• CF+CBF with some tricks, but basically CF is better than CBF
• Big Data Architecture: Computing, Machine Learning | Spark | YARN | Hadoop
W8K3: J. Hu - Building Large-Scale Recommender Systems for TV
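A rough sketch of the "matrix factorization on watching duration" idea in the implicit-feedback style: preference is 1 for watched pairs and confidence grows with duration. The SGD solver, hyperparameters and confidence scale below are illustrative assumptions, not details from the talk:

```python
import random

def train_implicit_mf(durations, n_users, n_items, k=8, epochs=200,
                      lr=0.02, reg=0.01, seed=0):
    """SGD matrix factorization on implicit watch durations:
    target preference is 1 for observed pairs, weighted by confidence."""
    rng = random.Random(seed)
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for (u, i), dur in durations.items():
            conf = 1.0 + 0.1 * dur                  # confidence from duration
            pred = sum(P[u][f] * Q[i][f] for f in range(k))
            err = conf * (1.0 - pred)               # preference target = 1
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

# Invented (user, item) -> minutes watched
durations = {(0, 0): 30.0, (0, 1): 2.0, (1, 0): 25.0}
P, Q = train_implicit_mf(durations, n_users=2, n_items=2)
score = lambda u, i: sum(P[u][f] * Q[i][f] for f in range(len(Q[i])))
print(score(0, 0))  # close to 1 after training
```

Production systems typically use ALS on the full confidence-weighted matrix instead of SGD over observed pairs, and then apply a learning-to-rank stage on top, as the bullet suggests.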
• Roberto Turrin et al. (ContentWise)
• Properties: strong context-awareness, implicit feedback, stream of data, time-constrained catalog
• Time-based weighting, 1-hour windows (intraday and intraweek neighbors)
• Data: channel switching aggregated into 1-minute bins (events shorter than 1 minute are dropped)
• Catalog size: 56K schedules per month
• Item metadata: channel, genre, subgenre
• Channel-based recommendations, join/leave
• Results are not outstanding, but better than baselines.
• Genre and subgenre metadata seem to be less relevant.
W8P2: Time-based TV Programs Prediction
• W8P1: A Graph-based Collaborative and Context-aware Rec. System for TV Programs (E. Şamdan et al.)
› Graph representation of data (user, item, tag, genre, actor + relations)
› Similar users: finite random walks
› Factorization of whole graph (different weights for different relations)
• W8P3: Augmented Matrix Factorization with Explicit Labels for Recommender Systems ( J. Zhou et al.)
› Matrix-tri-factorization: User Profile X Label Profile X Label-Item Matrix
› Scalability considerations, focusing on memory and time efficient solutions
› Proposed model: Augmented Matrix Factorization with Block Coordinate Descent solver
RecSysTV other papers
• W8P4: Item-Based Collaborative Filtering Using the Big Five Personality Traits (H. Alharthi et al.)
› Cross domain recommendation without overlapping data
› Learning personality traits used by psychologists: openness, conscientiousness, extraversion, agreeableness, neuroticism
› Questionnaire data (comment: expensive and less practical)
› Item personality profile
• W8P6: A Mood-based Genre Classification of Television Content (H. Corona Pampín et al.)
› Russell's model of affect, 3 dimensions: valence, arousal, dominance
› Modelling channels by these dimensions (Channels: discovery, CN, E, Fox, Syfy, MSNBC, Fox news, CNN)
› Channel clustering
› Easy to cluster: news and animation
› Hard to cluster: horror
RecSysTV other papers
• Factorization of Markov Decision Process probability distribution on topics (fMDP) 1
• Change detection in context change 2
• Question recommendation for online courses (load balancing with Concave Cost Flow) 3
• Attacking item-based recommender (fake reviewing/pushing, power item attack (PIA) model) 4
• Learning private attributes using MF (e.g. how to find out the gender) 5
› Questionnaire-based active learning (10 questions are enough)
› Interesting for recommendation strategies
S2: Novel Setups
1 M. Tavakol, U. Brefeld. Factored MDPs for Detecting Topics of User Sessions
2 N. Hariri, B. Mobasher, R. Burke. Context Adaptation in Interactive Recommender Systems
3 D. Yang, D. Adamson , C. Rose. Question Recommendation with Constraints for Massive Open Online Courses
4 C. Seminario, D. Wilson. Attacking Item-Based Recommender Systems with Power Items
5 S. Bhagat, U. Weinsberg, S. Ioannidis, N. Taft. Recommending with an Agenda: Active Learning of Private Attributes using Matrix Factorization
• News recommendation 1
› Referer is important
› Session length: Homepage 3.0, Google: 1.8, Twitter: 1.2, Reddit: 1.1
› Graph representation, edge-based next item prediction
› Hybrid filtering
• Job recommendation (LinkedIn) 2
› Skill-based, top-K features, Bayes' rule, skill relevance weighting
› Cold jobs first, warm jobs later
• Ratings + reviews based item modeling, LDA-based topic detection 3
S3: Cold Start and Hybrid Recommenders
1 M. Trevisiol, L. M. Aiello, R. Schifanella, A. Jaimes. Cold-start News Recommendation with Domain-dependent Browse Graph
2 H. Liu, A. Goyal, T. Walker, A. Bhasin. Improving The Discriminative Power Of Inferred Content Information Using Segmented Virtual Profile
3 G. Ling, M. Lyu, I. King. Ratings Meet Reviews, a Combined Approach to Recommend
• Each item is recommended to N users (diversity; gives rare items more chance to be recommended) 1
• Probabilistic Neighborhood Selection (weighted sampling of k-neighbors) 2
› Hubness: "the tendency of high-dimensional data to contain points (hubs) that frequently occur in k-nearest-neighbor lists of other points"
• User Perception of Differences in Movie Recommendation Algorithms 3
› Survey about subjective properties of factors (diversity, novelty, satisfactions)
› Correlations: diversity-satisfaction: positive | novelty-satisfaction: negative | first impression-choice: weakly positive
• News recommendation key features: relevancy, popularity and freshness 4
S5: Diversity, Novelty and Serendipity
1 S. Vargas, P. Castells. Improving Sales Diversity by Recommending Users to Items
2 P. Adamopoulos , A. Tuzhilin. On Over-Specialization and Concentration Biases of Recommendations: Prob. Neighborhood Selection in Coll. Filtering Systems
3 M. Ekstrand, F. M. Harper, M. Willemsen, J. Konstan. User Perception of Differences in Movie Recommendation Algorithms
4 F. Garcin, B. Faltings, O. Donatsch, A. Alazzawi, C. Bruttin, A. Huber. Offline and Online Evaluation of News Recommender Systems at swissinfo.ch
• Genre Diversity for Recommender Systems 1
› Intra-list diversity optimization; metrics: sum of similarities (ILS), sum of weighted marginals of genres (MIA)
› Genre selection with binomial distribution
› Binomial diversity is an interesting metric: Coverage*(1-Redundancy) -> greedy optimizer
› Binomial framework for genre diversity. Coverage, redundancy and recommendation list size-awareness.
• MF+LDA incorporation for cold-start dynamic (offline+online) recommendations (book, movie, music sets) 2
• Processing recommendation list based feedback by Gaussian kernel (user,item,position) -> CGPRank model 3
• Tag list size matters for auto content tagging (parameter free length-based optimized tagging for images) 4
S7: Ranking and Top-N Recommendation
1 S. Vargas, L. Baltrunas, A. Karatzoglou, P. Castells. Coverage, Redundancy and Size-Awareness in Genre Diversity for Recommender Systems
2 X. Liu. Towards a Dynamic Top-N Recommendation Framework
3 H. P. Vanchinathan, I. Nikolic, F. De Bona, A. Krause. Explore-Exploit in Top-N Recommender Systems via Gaussian Processes
4 M. Gueye, T. Abdessalem, H. Naacke. A Parameter-free Algorithm for an Optimized Tag Recommendation List Size
• GASGD: Distributed Asynchronous SGD (Graph representation, low communication overhead, fast convergence) 1
• Post-processing MF models, neighbor searching 2
› Problem transformation: Inner Product -> Euclidean distance
› Methods: Nearest Neighbors, PCA, NN-boosting
• GBFM: Gradient Boosting Factorization Machines 3
› Greedy feature selection, Taylor expansion estimation
› Linear complexity
• Social influence in music scrobbling (Last.fm) 4
S8: Matrix Factorization
1 F. Petroni, L. Querzoni. GASGD: Stochastic Gradient Descent for Distributed Asynchronous Matrix Completion via Graph Partitioning
2 Y. Bachrach et al. Speeding Up the Xbox Recommender System Using a Euclidean Transformation for Inner-Product Spaces
3 Chen Cheng, Fen Xia, Tong Zhang, Irwin King and Michael Lyu. Gradient Boosting Factorization Machines
4 R. Palovics, A. Benczur, T. Kiss, L. Kocsis and E. Frigo. Exploiting Temporal Influence in Online Recommendation
• 1B impressions per day
• Stats: 800M active users, 25% mobile, 40+minutes/day in US
• Feed Ranking Machine. Input: clicks, likes, comments, shares
• Expected feed value: event confidence/probability to occur (scoring comes from experience, A/B testing)
• Counters: personal (time, clicks), post (recency, device, number of users)
• Prediction: millions of counters, thousands of weights, logistic regression
• Bumping:
› Penalizing already seen stories
› New stories are put at the top
› Result: +5% likes, 57%-70% of stories are seen
Facebook – news recommendation
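The prediction step above (counters in, logistic regression out) can be sketched in a toy form; the counter names and weights below are invented for illustration, not Facebook's actual features:

```python
import math

def feed_score(counters, weights, bias=0.0):
    """Logistic-regression style score: predicted probability of
    engagement from a story's counter values (illustrative weights)."""
    z = bias + sum(weights.get(name, 0.0) * value
                   for name, value in counters.items())
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical learned weights for three counters
weights = {"friend_affinity": 2.0, "post_age_hours": -0.1, "like_count": 0.05}

fresh = feed_score({"friend_affinity": 0.8, "post_age_hours": 1,
                    "like_count": 30}, weights)
stale = feed_score({"friend_affinity": 0.8, "post_age_hours": 48,
                    "like_count": 30}, weights)
print(fresh > stale)  # True: a negative age weight pushes old stories down
```

The "bumping" bullets then act on top of these scores: already-seen stories are penalized and new ones are promoted regardless of their raw score.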
LinkedIn – A/B testing
• Story: a CTR drop caused by a 3-pixel difference on the homepage
• XLNT: LinkedIn's A/B testing system; several parallel random A/B tests, experiment management system
• A/B test significance matters; they use a 95% confidence level (p-value, interval)
• Several metrics to measure (but they didn't say how they select them)
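The 95% confidence check mentioned above is commonly a two-proportion z-test on click-through rates; a sketch with made-up counts (the function name and example numbers are mine):

```python
import math

def two_proportion_z(clicks_a, n_a, clicks_b, n_b):
    """Two-proportion z-test statistic for an A/B click-through comparison."""
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    p = (clicks_a + clicks_b) / (n_a + n_b)            # pooled rate
    se = math.sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))  # pooled standard error
    return (p_a - p_b) / se

# |z| > 1.96 corresponds to the 95% confidence level used at LinkedIn
z = two_proportion_z(520, 10000, 440, 10000)
print(abs(z) > 1.96)  # True for these counts
```

Running several parallel tests, as XLNT does, also requires correcting for multiple comparisons, which the talk did not go into.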
Shopkick
• Online shopping recommendation
• Trending buzzwords: hyper-locational, context-sensitive, geofencing
• Deal recommendation: logistic function (deal attributes, deal trend metrics, user retailer score, user category score, external factors)
• Gender distribution in science team: 50%-50%
LinkedIn & Shopkick
StitchFix
• Estimating the proper size of clothes
• Human vs. machine, different capabilities: computing eigenvalues vs. finding the angry dog
• Machines are good at structured data; humans are better at unstructured data, context and image processing
• Recommendation usage statistics: Amazon: 35%, LinkedIn: 50%, Netflix: 75%
• Algorithms: MF, PCA/SVM, clustering
• Applies both human and machine resources for different tasks
The Climate Corporation
• Agriculture-related data science
• Yield monitor data (14B), remote sensing data (260B), weather data (20B)
• Challenges: spatio-temporal, sparse, missing, noisy data, size, scalability, long term decisions
• Multi-armed bandits -> decision optimization
StitchFix & The Climate Corporation
• Learning to rank and page optimization
• Context is still a trending topic
• Diversity and serendipity are important
• Deep learning is a hot topic
• Exploration vs. exploitation with Multi-Armed Bandits
• Preference-Popularity model
• Cross-domain recommendation at different levels
• Hybrid filtering for TV recommendation, age-device correlation
• Silicon Valley is a good place with a lot of talented people. Uber is cool ☺
What we learnt