2. Evaluating Recommender Systems
• General error metrics
  • Mean absolute error (MAE)
  • Root mean squared error (RMSE)
• General rank-based metrics
  • Precision/Recall
  • Mean Reciprocal Rank (MRR)
• Directly/indirectly optimised
• What does improving these metrics mean? (see the sketch below)
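A minimal sketch of the metrics above in plain Python (illustrative only; library implementations such as scikit-learn's handle edge cases differently):

```python
import math

def mae(y_true, y_pred):
    # Mean absolute error over predicted ratings.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    # Root mean squared error: penalises large errors more than MAE.
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def precision_at_k(recommended, relevant, k):
    # Fraction of the top-k recommendations that are relevant.
    return len(set(recommended[:k]) & set(relevant)) / k

def recall_at_k(recommended, relevant, k):
    # Fraction of all relevant items that appear in the top-k.
    return len(set(recommended[:k]) & set(relevant)) / len(relevant)

def mrr(ranked_lists, relevant_sets):
    # Mean reciprocal rank: average of 1/rank of the first relevant item.
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        for rank, item in enumerate(ranked, start=1):
            if item in relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)
```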
3. Goal driven vs Metric driven design
• Metric driven (error- or rank-based)
  • Metric(s) are abstract and general (e.g. RMSE)
  • Objective of the algorithm is abstract
  • Competitions (e.g. Kaggle, the Netflix Prize) encourage this approach
• Goal driven
  • Metric(s) depend on the goal
  • More focused algorithm design
  • Fitting the goal to the data (not the model to the data)
4. Goal driven design
• User/system perspective
  • Define and consider user satisfaction or system-related objectives as a priority
• Internal/external goals
  • Consider algorithmic or non-algorithmic solutions
• Time dependent goals
  • Identify the various objectives and find the optimal solution
5. User/System perspective
• System perspective
  • Performance
  • Cost
  • Update time
  • Profit margin
• User perspective
  • User satisfaction (how to measure?)
  • Diversity (one measurable formulation is sketched below)
  • Novelty
  • Serendipity
  • Context
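User-side notions such as diversity only become actionable once they are measurable. A common (but by no means the only) formulation is intra-list diversity: the average pairwise dissimilarity of the items in a recommendation list. A minimal sketch, assuming item feature vectors are available:

```python
import math

def cosine_similarity(a, b):
    # Standard cosine similarity between two feature vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def intra_list_diversity(item_vectors):
    # Average pairwise dissimilarity (1 - cosine) over a recommendation list.
    n = len(item_vectors)
    if n < 2:
        return 0.0
    total = sum(
        1.0 - cosine_similarity(item_vectors[i], item_vectors[j])
        for i in range(n) for j in range(i + 1, n)
    )
    return total / (n * (n - 1) / 2)
```

Novelty and serendipity need analogous, goal-specific definitions (e.g. novelty as inverse item popularity), which is exactly why "how to measure?" is flagged above.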
6. User vs System perspective
[Figure: two diagrams relating system-focused, general, and user-focused metrics. Under collaborative system-user goals the three metrics overlap; under opposite system-user goals they sit apart.]
7. External/Internal goal
• External goals
  • System treated as a black box
  • Plug in any algorithm
  • Post-/pre-filtering or an independent algorithmic solution (sketched below)
  • Easier to evaluate (modularised)
• Internal goals
  • Goal is built into the algorithm
  • Goal is directly optimised
  • Difficult to evaluate the different components
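Because an external goal treats the recommender as a black box, it can be realised as a thin re-ranking layer around any base algorithm. A hypothetical post-filtering sketch (the function and parameter names are illustrative, not from the slides):

```python
def post_filter(base_scores, allowed_items, boost=None, k=10):
    """Re-rank black-box predictions to enforce an external goal.

    base_scores: dict item -> predicted score from any plugged-in algorithm.
    allowed_items: items satisfying a hard constraint (e.g. in stock).
    boost: optional dict item -> additive score adjustment (soft goal).
    """
    boost = boost or {}
    candidates = {i: s + boost.get(i, 0.0)
                  for i, s in base_scores.items() if i in allowed_items}
    return sorted(candidates, key=candidates.get, reverse=True)[:k]
```

Swapping the base algorithm requires no change to this layer, which is what makes external goals easier to evaluate in isolation.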
8. Approach to goal-driven design
[Figure: data-to-prediction pipeline. External goals act on the data before, and on the predictions after, the recommender system; the internal goal is optimised inside the algorithm itself.]
9. Time dependent goals
• User side
  • Change in taste
  • Seasonal trends
• System side
  • Cost over time (e.g. peak vs off-peak)
  • Demand
10. Time dependent algorithms
• Internal
  • Exponential decay (to model temporal change; sketched below)
  • Survival analysis
  • Time-series models
• External
  • Time-sensitive post-filtering
  • Control theory
  • Online learning approaches
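For the internal, time-dependent case, exponential decay down-weights old interactions so the model tracks changing taste. A minimal sketch of decayed rating weights (half_life_days is an assumed tuning parameter, not a value from the slides):

```python
import math

def decay_weight(age_days, half_life_days=30.0):
    # Weight halves every half_life_days, so recent events dominate.
    return math.exp(-math.log(2.0) * age_days / half_life_days)

def weighted_mean_rating(ratings_with_age):
    # ratings_with_age: iterable of (rating, age_in_days) pairs.
    num = den = 0.0
    for rating, age in ratings_with_age:
        w = decay_weight(age)
        num += w * rating
        den += w
    return num / den if den else 0.0
```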
11. Approach to goal-driven design
Goal(s) → user or system perspective → external or internal → time dependency → performance measures
12. User perspective
• Internal goals
  • Static: Diversification/long-tail items (promote diverse items)
  • Temporal: Optimal control theory (cold-start problem)
• External goals
  • Static: Nudging and serendipity (promote serendipitous items)
  • Temporal: Balanced control theory (improve prediction per user)
13. System perspective
• Internal goals
  • Static: Stock control (promote items that are in stock)
  • Temporal: Optimal control theory (estimate/maximise profit)
• External goals
  • Static: Optimised content delivery (pre-cache liked items)
  • Temporal: Balanced control theory (stabilise resource allocation)
14. Example: Diversification/long tail items
Goal: Promote diverse items
Challenges: How to measure diversity?
Scope: Goal is optimised within the algorithm
Algorithm: Matrix factorisation with convex optimisation
Evaluation: Measure the share of diverse items in top positions
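One way to operationalise this evaluation is to count long-tail items reaching top positions. A sketch, assuming the long tail is everything outside the most popular fraction of the catalogue (the 20% head split is an assumption, not from the slides):

```python
def long_tail_items(popularity, head_fraction=0.2):
    # popularity: dict item -> interaction count.
    ranked = sorted(popularity, key=popularity.get, reverse=True)
    head = set(ranked[:int(len(ranked) * head_fraction)])
    return set(ranked) - head

def long_tail_share_at_k(recommended, tail, k=10):
    # Fraction of the top-k recommendations drawn from the long tail.
    return sum(1 for item in recommended[:k] if item in tail) / k
```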
15. Example: Stock availability
Goal: Recommend items that are in stock
Challenges: Keeping stock availability up to date
Scope: Goal is optimised within the algorithm
Algorithm: Matrix factorisation with convex optimisation
Evaluation: Measure waiting list for items
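The slides embed stock awareness inside the factorisation objective itself; as a simplified stand-in for that convex formulation, the sketch below adds a soft out-of-stock penalty to plain SGD matrix factorisation (all names and the penalty form are assumptions):

```python
import random

def train_stock_aware_mf(ratings, in_stock, n_users, n_items, k=8,
                         lr=0.01, reg=0.05, stock_penalty=0.5,
                         epochs=20, seed=0):
    # ratings: list of (user, item, rating) triples with 0-based ids.
    # The extra regularisation on out-of-stock items shrinks their latent
    # factors, pushing them down the ranking without hard-filtering them.
    rng = random.Random(seed)
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            extra = 0.0 if i in in_stock else stock_penalty
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - (reg + extra) * qi)
    return P, Q
```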
16. Example: Improve prediction per user
Goal: Request more data for difficult users; use only useful data to train the model
Challenges: Define the noise/signal split for data points
Scope: Goal is optimised over time, for each user independently
Algorithm: Control theory
Evaluation: Performance measured on a per-user basis
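A hypothetical feedback loop for this example: track each user's prediction error over time and request more ratings while the error stays above a reference level. The reference and gain values below are assumptions, not from the slides:

```python
def data_request_controller(error_history, reference=0.8, gain=5.0):
    # Proportional controller: map per-user error to a data request size.
    # error_history: recent per-user RMSE values, most recent last.
    deviation = error_history[-1] - reference
    return max(0, round(gain * deviation))

# e.g. data_request_controller([1.4, 1.2, 1.1]) -> 2 extra ratings requested,
# while a user already at the reference error gets no request at all.
```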
17. Example: Resource allocation
Goal: Improve resource allocation (e.g. CPU time); maximise use of the available resources (e.g. a fixed cluster)
Challenges: Define system dynamics
Scope: Goal is optimised over time
Algorithm: Control theory
Evaluation: Stability, divergence from reference
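A minimal sketch of the control-theoretic approach for this example: a discrete PI controller steering utilisation towards a reference, with evaluation as divergence from that reference over time (the gains kp and ki are assumed tuning parameters):

```python
def pi_control(reference, measurements, kp=0.5, ki=0.1):
    # Discrete PI controller: one allocation adjustment per time step.
    integral = 0.0
    adjustments = []
    for measured in measurements:
        error = reference - measured
        integral += error          # accumulated error (I term)
        adjustments.append(kp * error + ki * integral)
    return adjustments

def divergence(reference, measurements):
    # Evaluation as in the slides: mean deviation from the reference.
    return sum(abs(reference - m) for m in measurements) / len(measurements)
```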
18. Summary
• The goal defines the algorithm and the performance measure
• Multiple goals can be integrated into a single system
  • Modularity to evaluate goals separately (external)
  • Optimisation to merge goals (internal)