Tag And Tag Based Recommender
1. Tag & Tag-based Recommenders
IBM Research – China
Presenter: Xiatian Zhang (张夏天)
Team:
Shiwan Zhao (赵石顽), Xiatian Zhang (张夏天), Quan Yuan (袁泉)
2. About Me
2000-2004, B.S. Math, Central South University
2004-2007, M.S. Computer Science, BUPT
2007-Present, Researcher, Working on Recommender Systems and
Data Mining
3. Agenda
Social Tagging System and Its Features
Tag Recommender
Tag-based Recommender
4. Social Tagging
A folksonomy is a system of classification derived from the practice
and method of collaboratively creating and managing tags to annotate
and categorize content; this practice is also known as collaborative
tagging, social classification, social indexing, and social tagging.
Folksonomy is a portmanteau of folk and taxonomy.
Social tagging has boomed since 2004, with the wave of Web 2.0.
– Delicious
– CiteULike
– BibSonomy
– YouTube
– Flickr
– Dogear – An internal social bookmarking system at IBM
– …
5. Some Insights of Tagging System
Shilad Sen et al., "tagging, communities, vocabulary, evolution",
CSCW '06
– Modeling vocabulary evolution
– Tagging system features
– Based on the MovieLens recommender system
– Personal tendency and community influence
– Tag displaying strategies and their effects
– Tag utility
7. Tagging System Features
Design Features
– Tag Sharing
– Tag Selection
– Item Ownership
– Tag Scope
– Broad
– Narrow
Tag Class
– Factual Tag
– Subjective Tag
– Personal Tag
9. Personal Tendency
How strongly do investment and habit affect personal tagging behavior?
– 1. Habit and investment influence a user's tag applications.
– 2. The influence of habit and investment grows stronger as users apply more tags.
– 3. Habit and investment cannot be the only factors that contribute to vocabulary evolution.
10. Community Influence
How does the tagging community influence personal vocabulary?
– 1. Community influence affects a user's personal vocabulary.
– 2. Community influence on a user's first tag is stronger for users who have seen more tags.
13. Tag Recommender
Purpose
– Encourage users to tag more frequently, apply more tags to an
individual resource, reuse common tags
– Prompt users to apply tags they had not previously considered
– Eliminate redundant tags
– Promote a core tag vocabulary steering the user toward adopting
certain tags while not imposing any strict rules.
– Avoid ambiguous tags in favor of tags that offer greater information
value.
14. Tag Recommender – Technologies
Naive Methods
– Most Popular Tags on Resources
– Most Popular Tags on Users
– Most Popular Tags on Resources and Users
Classical Collaborative Filtering
– User-KNN
– Item-KNN
Adapted KNN Methods
– Extend User-Item Matrix
– Degrade User-Item-Tag Relationship
Content-based Method
Tensor Method
– Tensor Factorization
Graph Based
– FolkRank
Our Work
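The naive "most popular tags" baselines at the top of this slide can be sketched roughly as follows. This is a minimal illustration, not the presenters' implementation: the toy `ASSIGNMENTS` data, the function names, and the 50/50 blending `weight` are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical toy folksonomy: (user, resource, tag) assignments.
ASSIGNMENTS = [
    ("alice", "r1", "python"),
    ("bob", "r1", "python"),
    ("bob", "r1", "code"),
    ("carol", "r1", "tutorial"),
    ("alice", "r2", "python"),
    ("alice", "r2", "tutorial"),
    ("alice", "r3", "tutorial"),
]

def _top(scores, k):
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]

def most_popular_by_resource(assignments, resource, k=3):
    """Tags most often applied to the given resource."""
    return _top(Counter(t for _, r, t in assignments if r == resource), k)

def most_popular_by_user(assignments, user, k=3):
    """Tags most often applied by the given user."""
    return _top(Counter(t for u, _, t in assignments if u == user), k)

def most_popular_mix(assignments, user, resource, k=3, weight=0.5):
    """Blend of normalized resource-side and user-side tag frequencies.

    The linear blend and its weight are assumptions of this sketch.
    """
    by_res = Counter(t for _, r, t in assignments if r == resource)
    by_user = Counter(t for u, _, t in assignments if u == user)
    res_total = sum(by_res.values()) or 1
    user_total = sum(by_user.values()) or 1
    scores = {}
    for tag in set(by_res) | set(by_user):
        scores[tag] = (weight * by_res[tag] / res_total
                       + (1 - weight) * by_user[tag] / user_total)
    return _top(scores, k)
```

For example, `most_popular_mix(ASSIGNMENTS, "carol", "r2", k=1)` prefers "tutorial", because it is both one of the tags on r2 and the only tag carol has used.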
16. Adapted KNN – Degrade User-Item-Tag relationship
Process
– TF-IDF weighting on the pairwise relations UI (user-item), UT (user-tag), and IT (item-tag)
– P-core processing
– Remove noisy data
– Extract User Model by
Hebbian Deflation
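The p-core step in the process above can be sketched as an iterative pruning loop. This is a simplified illustration: it counts occurrences over individual (user, item, tag) triples rather than over posts, and the function name `p_core` and the toy data are assumptions of the sketch.

```python
from collections import Counter

def p_core(triples, p):
    """Keep only triples whose user, item, and tag each occur in >= p triples.

    Removing one triple can push other users, items, or tags below the
    threshold, so the pruning repeats until nothing changes.
    """
    triples = set(triples)
    while True:
        users = Counter(u for u, _, _ in triples)
        items = Counter(i for _, i, _ in triples)
        tags = Counter(t for _, _, t in triples)
        kept = {
            (u, i, t)
            for u, i, t in triples
            if users[u] >= p and items[i] >= p and tags[t] >= p
        }
        if kept == triples:
            return kept
        triples = kept

# Hypothetical toy data: u3 survives the first pass but not the second.
CORE = {
    ("u1", "i1", "a"), ("u1", "i2", "a"),
    ("u2", "i1", "a"), ("u2", "i2", "a"),
}
NOISY = CORE | {("u3", "i3", "a"), ("u3", "i1", "a")}
```

With `p=2`, the triple on `i3` is dropped first (the item occurs only once), which leaves `u3` with a single triple that is dropped on the next pass.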
18. FolkRank
PageRank
$$PR(p_i) = \frac{1-d}{N} + d \sum_{p_j \in M(p_i)} \frac{PR(p_j)}{L(p_j)} \qquad (1)$$
Personalized PageRank
$$PR(p_i) = (1-d)\,p[i] + d \sum_{p_j \in M(p_i)} \frac{PR(p_j)}{L(p_j)} \qquad (2)$$
FolkRank
1. Compute global PageRank by (1)
2. Then, for each <user, item> pair, compute the personalized PageRank by (2)
– Preference vector: p[i] = 1 for every node i, except p[u] = 1 + |U| for the given user u and p[r] = 1 + |R| for the given item r.
3. FolkRank = Personalized PageRank - PageRank
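The three steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the reference FolkRank implementation: the folksonomy is modeled as an undirected graph whose edge weights are co-occurrence counts, the preference vector is normalized to sum to 1 (so the uniform case reduces to formula (1)), the damping factor `d=0.7` and the fixed iteration count are arbitrary choices, and the toy `TRIPLES` data is hypothetical.

```python
from collections import defaultdict

def build_graph(triples):
    """Undirected user/tag/resource graph; weight = co-occurrence count."""
    adj = defaultdict(lambda: defaultdict(float))
    for user, tag, res in triples:
        nodes = (("u", user), ("t", tag), ("r", res))
        for a in nodes:
            for b in nodes:
                if a != b:
                    adj[a][b] += 1.0
    return adj

def pagerank(adj, pref, d=0.7, iters=100):
    """Iterate rank[i] = (1-d)*pref[i] + d*sum_j rank[j]*w(j,i)/out(j)."""
    total = sum(pref.values())
    pref = {n: v / total for n, v in pref.items()}  # normalization is an assumption
    out_weight = {n: sum(nbrs.values()) for n, nbrs in adj.items()}
    rank = dict(pref)
    for _ in range(iters):
        rank = {
            n: (1 - d) * pref[n]
            + d * sum(rank[m] * w / out_weight[m] for m, w in adj[n].items())
            for n in adj
        }
    return rank

def folkrank_tags(triples, user, res, k=3, d=0.7):
    """Rank tags by personalized PageRank minus global PageRank."""
    adj = build_graph(triples)
    if ("u", user) not in adj or ("r", res) not in adj:
        raise KeyError("user and resource must occur in the folksonomy")
    n_users = sum(1 for kind, _ in adj if kind == "u")
    n_res = sum(1 for kind, _ in adj if kind == "r")
    uniform = {n: 1.0 for n in adj}
    pref = dict(uniform)
    pref[("u", user)] = 1.0 + n_users   # p[u] = 1 + |U|
    pref[("r", res)] = 1.0 + n_res      # p[r] = 1 + |R|
    base = pagerank(adj, uniform, d)
    personal = pagerank(adj, pref, d)
    scores = {n[1]: personal[n] - base[n] for n in adj if n[0] == "t"}
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]

# Hypothetical toy folksonomy with two unrelated topic clusters.
TRIPLES = [
    ("alice", "python", "r1"),
    ("bob", "python", "r2"),
    ("bob", "code", "r1"),
    ("carol", "cooking", "r3"),
    ("dave", "recipes", "r4"),
    ("dave", "cooking", "r4"),
]
```

Calling `folkrank_tags(TRIPLES, "alice", "r2", k=1)` ranks "python" first: it is adjacent to both boosted nodes, while tags in the unrelated cooking cluster lose rank relative to the global baseline.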
19. Our Work
Explored and Exploring Methods
– Non-classical Tensor Fusion Factorization
– Multi-label classification with Random Decision Trees (high speed)
– Both methods perform close to FolkRank
Current Progress
– Shiwan developed a simple graph model
– Best precision and recall on several datasets compared to other
methods
– We are writing a paper targeting ACM RecSys 2010
20. Tag-based Recommender
Our Work
– IUI 2008 Paper, Improved Recommendation based on Collaborative
Tagging Behaviors
– Explored Methods
– Tensor Factorization
– Non-classical Tensor and Matrix Fusion Factorization
Other Works
– Shilad Sen, Jesse Vig, and John Riedl, Tagommenders: Connecting
Users to Items through Tags, WWW 2009
21. IUI 2008 Paper Overview
We propose a new collaborative filtering approach, TBCF (Tag-based Collaborative
Filtering), which uses the semantic distance among tags assigned by different users
to improve the effectiveness of neighbor selection.
That is, two users are considered similar not only if they rate items
similarly, but also if they have a similar understanding of those items.
Example
– Both Bob and Tom may rate the movie Avatar with 5 stars, which indicates that
both like this movie very much.
– Nevertheless, as a 3D fan, Bob appreciates the movie for its high-quality 3D
animation, while Tom may think of it as a wonderful action movie.
22. Tag-based Collaborative Filtering
Tag-based User-Item Matrix
         Item1                 Item2            Item3             Item4
Alice    Art, photo            Home, Products   Writing, Design   Learning, Education
Daniel   Photo, Album, Image   Ø                Typewriter        Tutorial, Training
Sherry   Ø                     Cleaning         Ø                 Language, Study
Maggie   Photography           Ø                Ovens             Ø
Steps
1. Calculate the semantic similarity of tags based on WordNet (for the tags not
included in WordNet, calculate the edit-distance instead)
2. Calculate the similarity between tag sets
3. Calculate the similarity between user u and v by summing up the similarity of tag
sets on common pages (tagged by both u & v)
4. Find the top-N nearest neighbors of the active user to make the prediction
5. Return the top-M predicted items to the active user
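Steps 3 and 4 can be sketched on the example matrix from this slide. This is only an illustration: the tag-set similarity here is plain Jaccard overlap on lowercased tags, a deliberately simple stand-in for the WordNet-based measure of steps 1 and 2, and the names `PROFILES`, `user_similarity`, and `nearest_neighbors` are assumptions of the sketch.

```python
# Tag-based user-item matrix from the slide; untagged cells (Ø) are omitted.
PROFILES = {
    "Alice": {
        "Item1": {"Art", "photo"},
        "Item2": {"Home", "Products"},
        "Item3": {"Writing", "Design"},
        "Item4": {"Learning", "Education"},
    },
    "Daniel": {
        "Item1": {"Photo", "Album", "Image"},
        "Item3": {"Typewriter"},
        "Item4": {"Tutorial", "Training"},
    },
    "Sherry": {"Item2": {"Cleaning"}, "Item4": {"Language", "Study"}},
    "Maggie": {"Item1": {"Photography"}, "Item3": {"Ovens"}},
}

def jaccard(tags_a, tags_b):
    """Stand-in tag-set similarity: exact-match overlap, case-insensitive."""
    a = {t.lower() for t in tags_a}
    b = {t.lower() for t in tags_b}
    return len(a & b) / len(a | b) if a | b else 0.0

def user_similarity(profiles, u, v, set_sim=jaccard):
    """Step 3: sum tag-set similarity over items tagged by both u and v."""
    common = profiles[u].keys() & profiles[v].keys()
    return sum(set_sim(profiles[u][i], profiles[v][i]) for i in common)

def nearest_neighbors(profiles, active, n, set_sim=jaccard):
    """Step 4: the top-N users most similar to the active user."""
    sims = [
        (other, user_similarity(profiles, active, other, set_sim))
        for other in profiles
        if other != active
    ]
    return sorted(sims, key=lambda pair: (-pair[1], pair[0]))[:n]
```

With exact matching, Alice and Daniel share only "photo" on Item1 (similarity 0.25), and Maggie's "Photography" scores 0 against Alice's tags. That gap is what a semantic tag similarity is meant to close; a different measure can be plugged in through the `set_sim` parameter.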
23. Tag Similarity Calculation
Tag similarity
– WordNet
– LSA/PLSA
Tag set similarity
– Hungarian method
WordNet Concept Tree
Word similarity in WordNet
If x and y are contained in WordNet, dis(x,y) is the shortest path length between x and y.
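For tags outside WordNet, the edit-distance fallback and the tag-set matching can be sketched as below. Several choices here are assumptions of the sketch rather than details from the slides: the distance is converted to a similarity as 1 - distance / longer length, the best one-to-one matching is found by brute force over permutations (the Hungarian method finds the same optimum far more efficiently on larger sets), and the total is normalized by the size of the larger tag set.

```python
from itertools import permutations

def edit_distance(a, b):
    """Levenshtein distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = cur
    return prev[-1]

def tag_similarity(x, y):
    """Similarity in [0, 1]; the normalization is an assumption."""
    x, y = x.lower(), y.lower()
    longest = max(len(x), len(y))
    return 1.0 if longest == 0 else 1.0 - edit_distance(x, y) / longest

def tag_set_similarity(tags_a, tags_b):
    """Best one-to-one tag matching, normalized by the larger set size.

    Brute force is only practical for small tag sets.
    """
    small, large = sorted((list(tags_a), list(tags_b)), key=len)
    if not small:
        return 0.0
    best = max(
        sum(tag_similarity(s, t) for s, t in zip(small, perm))
        for perm in permutations(large, len(small))
    )
    return best / len(large)
```

For instance, matching {"Art", "photo"} against {"Photo", "Album", "Image"} pairs "photo" with "Photo" (similarity 1.0) and "Art" with either remaining tag (similarity 0.2), giving about 0.4 after normalization.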
24. Experimental Evaluation
Data Set
Extracted a total of 8000 users, 5315 pages, and 7670 tags from web logs.
Algorithm   Average Precision   Average Ranking
TBCF        0.27                2.8
cosine      0.13                1.5

Randomly generated subset   Average Precision (TBCF)   Average Precision (cosine)
500                         0.208                      0.121
2000                        0.182                      0.118
4000                        0.202                      0.173
6000                        0.209                      0.180