NoTube: Recommendations (Collaborative)
1. WP3: User profiling & Recommendation (Part 3)
BBC, Pronetics, VUA
Wednesday, March 28, 12
2. Contents
Overview
User profiling
General goal & approach
From activity streams to profile
Issues
Analytics
Beancounter
Recommendations
General goal & approach
Semantic recommendation
Statistical recommendation
Hybrid recommendation
Exploitation
Conclusions
26-27 March 2012 NoTube 3rd Review 2
3. Overview
[Architecture diagram. Labels: EPG Metadata (BBC); TV Program Enrichment; RDF Graph TV Programs; Semantic Content Patterns for TV Programs; User Ratings & Demographics (BBC EPG Data); User Data Analysis; Similarity Clusters of Programs; Recommendation Service, with Semantic Pattern-based, Statistical Similarity-based and Hybrid Recommendation Strategies; End Users.]
4. Overview
[Architecture diagram repeated from slide 3, with a "Beancounter" label overlaid.]
5. Statistical recommendations
• We had privileged access to two bulk user ratings datasets
from the BBC
• From these, we used the Apache Mahout toolkit to derive
"item to item" similarity measures between each pair of items
• With the larger dataset (20k users) this worked well; with the
smaller one (1k users), less well
• With the BBC, we are investigating publication of these
behaviour-derived similarity measures
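The "item to item" step above can be pictured with a small sketch. This is not the Mahout pipeline the project used; it is a stand-in that uses a plain Tanimoto (Jaccard) coefficient over the sets of users who rated each item, and the function name and sample data are made up for illustration.

```python
from collections import defaultdict
from itertools import combinations


def item_similarities(ratings):
    """Tanimoto similarity for every pair of items that share a user.

    `ratings` is an iterable of (user, item) pairs. Rating values are
    ignored here: only the fact that a user rated an item is used.
    """
    users_by_item = defaultdict(set)
    for user, item in ratings:
        users_by_item[item].add(user)

    sims = {}
    for a, b in combinations(sorted(users_by_item), 2):
        shared = len(users_by_item[a] & users_by_item[b])
        if shared:  # skip pairs with no users in common
            union = len(users_by_item[a] | users_by_item[b])
            sims[(a, b)] = shared / union
    return sims


# Hypothetical toy data, not taken from the BBC datasets.
sample = [("u1", "news"), ("u2", "news"), ("u1", "quiz"), ("u3", "drama")]
similar_items = item_similarities(sample)
```

With few users, most item pairs share no raters at all, which is one plausible reading of why the smaller dataset worked less well.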
6. Hybrid models:
factual paths and statistical similarity
(and not to mention ‘@wossy’ is on Twitter with 1 million followers...)
13. TV Preference Data is very sparse
• Even for a single service (e.g. Netflix), data is
‘overwhelmingly sparse’
• For NoTube’s open systems, challenges multiply:
– often no global view, only per-user data
– many ways of identifying the same content item
– many ways of identifying the same user
– never mind other entities (actors, directors, ...)
• Q: Can we tell a story about how organizations with such
privileged overviews can contribute, in a privacy-respecting
way, to the public commons of linked data? (A: yes! see WP4)
17. Statistical recommendation:
Process
• Build on best-in-class open-source code, rather than reinvent
• Big-data ready (Hadoop-based)
• Of various options, LogLikelihoodSimilarity generally gave
best results (standard 'withhold some ratings' evaluation
strategy)
• Other explorations: including large-scale (half a billion tweets)
Twitter analysis, Spectral Clustering, using
demographics, ...
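For readers unfamiliar with the measure, here is a minimal sketch of a log-likelihood-ratio similarity between two items, based on Dunning's G² statistic over a 2x2 table of user counts. It is written from the general definition and is not copied from Mahout; the mapping of the ratio into [0, 1) via 1 - 1/(1 + LLR) is an assumption about how such a score is commonly normalised, and all function names are hypothetical.

```python
import math


def x_log_x(x):
    # Convention: 0 * log(0) is treated as 0.
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    # Unnormalised entropy of a set of counts.
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def log_likelihood_ratio(k11, k12, k21, k22):
    """G^2 statistic for a 2x2 contingency table of user counts.

    k11: users who rated both items, k12: only the first item,
    k21: only the second item,      k22: neither item.
    """
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, 2.0 * (row + col - mat))


def llr_similarity(users_a, users_b, n_users):
    """Map the ratio into [0, 1); 0 means the items look independent."""
    k11 = len(users_a & users_b)
    k12 = len(users_a - users_b)
    k21 = len(users_b - users_a)
    k22 = n_users - k11 - k12 - k21
    llr = log_likelihood_ratio(k11, k12, k21, k22)
    return 1.0 - 1.0 / (1.0 + llr)
```

The statistic measures how surprising the overlap is given how often each item is rated overall, so it does not simply reward items that are popular. A 'withhold some ratings' evaluation would hide a fraction of each user's ratings, recommend from the rest, and check how many hidden items are recovered.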
18. Exploitation & Further
Development
Beancounter:
• Pronetics’ user profiling SaaS
• integration in the e-commerce technological solution
• making it more general purpose
• making it capable of big data management
• a SaaS playground for Semantic Web researchers
• open source licensing
• community extensions
19. Exploitation & Further
Development
Recommendations:
• further explore the combination of demographic
stereotypes & semantics in a hybrid approach to learn a
prediction model for the shows a user is most likely
to be interested in
• integrate in personalized semantic search frameworks
• extend with additional LOD sources
• further test the measures for diversity, serendipity and
predictability
• open source licensing
• community extensions
20. Acknowledgements