Exploiting distributional semantics for
Content-Based and Context-Aware
Recommendation
PhD in Artificial Intelligence
Victor Codina
Advisor: Luigi Ceccaroni
Universitat Politècnica de Catalunya
June, 2014
Information and choice overload problem
2
Recommender Systems help users to find
the right items through recommendations
3
Recommender Systems are a widely
adopted technology in many domains
4
Recommender system’s components
5
Knowledge
base
Recommender
Engine
User
Interface
Item
data
User
data
Main families of recommendation models
6
Collaborative
Filtering (CF)
Content-Based
(CB) Filtering
Context-aware
Recommendation
(CARS)
Item metadata
Ratings
Context
Target
user
Target
item
Predicted rating
LIMITATION:
Low accuracy in
data-sparsity
scenarios
 Exploitation of explicit semantic relationships
 to mitigate the data-sparsity problem
Existing solution: use the knowledge
contained in domain ontologies
7
Semantically-Enhanced
CB Filtering
Semantically-Enhanced
CARS
Item ontology
attribute
similarities
Context ontology
condition
similarities
castle monastery
Historic building
is-a
sunny cloudy
Weather
is-a
 Building and maintaining ontologies is expensive
 Ontologies are bounded by fixed representations
 They may not suit the data
Limitations of domain ontologies
8
ontology (built by a domain expert) ≠ rating data
 Similarities automatically derived from the data itself
 Advantages:
 Collecting rating data is cheaper than building ontologies
 Not bounded by a fixed knowledge representation
 Fine-grained semantic similarities can be identified
Key idea: exploit distributional semantics
derived from rating data
9
rating data
semantic similarities
 Question 1: Is it possible to enhance content-based
recommendation by exploiting the distributional
semantics of item attributes?
 Question 2: Is it possible to enhance contextual
recommendation by exploiting the distributional
semantics of contextual conditions?
Research questions
10
Outline
11
Novel content-based approach (SCB)
Novel context-aware approach (SPF)
Distributional Semantics
Outline
12
Distributional Semantics
Distributional hypothesis
Semantic vector representation
Distributional similarity measures
Novel content-based approach (SCB)
Novel context-aware approach (SPF)
 The meaning of a concept is captured by its usage
Distributional Hypothesis:
“concepts that share similar usages share similar meaning”
 In Linguistics usages are regions of text:
• document
• paragraph
• sentence
Distributional hypothesis
13
Word s1 s2 s3 s4 s5 s6 s7
glass 2 1 0 1 0 2 0
wine 2 1 1 0 1 2 0
spoon 0 0 1 1 0 0 2
Semantic vector representation
14
frequency-based weight
“sentence 1”
 Cosine similarity is the most popular measure
 good accuracy in high-dimensional vector spaces
 Advantage: it can be used in combination with
dimensionality reduction techniques (SVD)
Distributional similarity measure
15
Glass
Wine
Spoon
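To make the vector-space idea concrete, here is a minimal Python sketch (not part of the original slides) that computes cosine similarities over the toy word-by-sentence counts from the previous slide; in practice the vectors could first be compressed with SVD, as noted above.

```python
import numpy as np

# Toy word-by-sentence co-occurrence counts from the previous slide (s1..s7).
vectors = {
    "glass": np.array([2, 1, 0, 1, 0, 2, 0], dtype=float),
    "wine":  np.array([2, 1, 1, 0, 1, 2, 0], dtype=float),
    "spoon": np.array([0, 0, 1, 1, 0, 0, 2], dtype=float),
}

def cosine(a, b):
    # Cosine of the angle between two semantic vectors.
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Wine and glass share more usages than wine and spoon, so the first score is higher.
print(cosine(vectors["wine"], vectors["glass"]))
print(cosine(vectors["wine"], vectors["spoon"]))
# A dimensionality reduction such as SVD could be applied to the matrix of
# vectors before computing the similarities.
```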
Outline
16
Novel content-based approach (SCB)
Novel context-aware approach (SPF)
Distributional Semantics
Outline
17
Novel content-based approach (SCB)
Limitations of traditional item-to-user profile matching
Semantic item-to-user profile matching
Experimental evaluation
Content-based recommendation
User-dependent distributional semantics
Novel context-aware approach (SPF)
Distributional Semantics
 IDEA: “show me more of the same I’ve liked”
Content-based Recommendation
18
Pipeline: target user's ratings + item metadata → Profile Learner → user profile; user profile + target item profile → Profile Matching → predicted rating
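As a rough illustration of the linear profile learner described in the editor's notes, the following Python sketch builds a user profile as a rating-weighted average of the attribute vectors of rated items; the binary attribute encoding and the exact weighting scheme are assumptions for illustration, not necessarily the method used in the thesis.

```python
import numpy as np

# Toy item metadata: binary attribute vectors over attributes a1..a5 (assumed encoding).
item_attrs = {
    "item1": np.array([1, 0, 1, 0, 0], dtype=float),
    "item2": np.array([0, 1, 0, 0, 1], dtype=float),
}
# Target user's ratings on a -1..1 scale (illustrative values).
user_ratings = {"item1": 1.0, "item2": -0.5}

def learn_profile(ratings, attrs):
    # Linear profile learner: each attribute weight is the rating-weighted
    # average over the items that contain the attribute.
    num = sum(r * attrs[i] for i, r in ratings.items())
    den = sum(attrs[i] for i in ratings) + 1e-9   # avoid division by zero
    return num / den

user_profile = learn_profile(user_ratings, item_attrs)
print(user_profile)  # one interest weight per attribute
```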
 Lack of semantics exploitation
 Syntactically different attribute pairs are not considered
 Hypothesis: profile matching can be enhanced by
exploiting similarities between attributes
Traditional item-to-user profile matching
19
Item profile (a1..a5): 0.2 1 0.5 0 1
User profile (a1..a5): 0 0.7 0 1 0
score = 1 x 0.7 (only the exact match on a2 contributes)
 Hypothesis: best-pairs is better for rating prediction
and all-pairs is better for ranking prediction
Semantic item-to-user profile matching
20
Item profile (a1..a5): 0.2 1 0.5 0 1
User profile (a1..a5): 0 0.7 0 1 0
Best-pairs strategy: each non-zero item attribute is paired with its single best-matching non-zero user attribute.
All-pairs strategy: all non-zero attribute pairs across the two profiles are aggregated.
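The following Python sketch illustrates the two matching strategies under stated assumptions: the attribute similarity matrix `sim` is illustrative, and the similarity-weighted aggregation is a simplified version of the strategies described above, not the exact formulation of the thesis.

```python
import numpy as np

def all_pairs_score(item_vec, user_vec, sim):
    # Aggregate every attribute pair present in both profiles,
    # weighted by its semantic similarity.
    score = 0.0
    for i, wi in enumerate(item_vec):
        for j, wj in enumerate(user_vec):
            if wi != 0 and wj != 0:
                score += sim[i, j] * wi * wj
    return score

def best_pairs_score(item_vec, user_vec, sim):
    # For each non-zero item attribute, aggregate only its best-matching
    # non-zero user-profile attribute.
    score = 0.0
    for i, wi in enumerate(item_vec):
        if wi == 0:
            continue
        candidates = [(sim[i, j], wj) for j, wj in enumerate(user_vec) if wj != 0]
        if candidates:
            s, wj = max(candidates)      # highest-similarity pair
            score += s * wi * wj
    return score

# Profiles from the slide (a1..a5) and an illustrative similarity matrix
# (identity only = traditional matching; the off-diagonal value is assumed).
item_vec = np.array([0.2, 1.0, 0.5, 0.0, 1.0])
user_vec = np.array([0.0, 0.7, 0.0, 1.0, 0.0])
sim = np.eye(5)
sim[1, 3] = sim[3, 1] = 0.6   # e.g. a2 and a4 distributionally similar
print(best_pairs_score(item_vec, user_vec, sim))
print(all_pairs_score(item_vec, user_vec, sim))
```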
 Assumption: two attributes are similar if several
users are interested in them similarly
Attribute User1 User2 User3 User4 User5 User6 User7
action 1 -0.7 0 0.9 0.1 -1 0
Bruce Willis 0.7 -0.8 0.5 0.8 0.4 -0.2 0
comedy -0.5 0.7 0.2 -1 0.9 0.8 0.5
Distributional semantics of item’s
attributes derived from rating data
21
User6’s degree of interest in action movies
(“-1” = strong dislike, “1” = strong like)
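A minimal sketch of how such user-based attribute vectors can be compared: the rows of the table above are treated as semantic vectors (one interest weight per user) and compared with cosine similarity. The numbers are the toy values from the slide.

```python
import numpy as np

# User-interest weights per attribute (rows) and user (columns), from the slide.
attribute_vectors = {
    "action":       np.array([ 1.0, -0.7, 0.0,  0.9, 0.1, -1.0, 0.0]),
    "Bruce Willis": np.array([ 0.7, -0.8, 0.5,  0.8, 0.4, -0.2, 0.0]),
    "comedy":       np.array([-0.5,  0.7, 0.2, -1.0, 0.9,  0.8, 0.5]),
}

def cosine(a, b):
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Attributes that users like or dislike in a similar way end up close in this space.
print(cosine(attribute_vectors["action"], attribute_vectors["Bruce Willis"]))  # relatively high
print(cosine(attribute_vectors["action"], attribute_vectors["comedy"]))        # low / negative
```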
 Rating data set statistics before and after pruning:
Evaluation using MovieLens data set
22
Original Pruned
Users 2,113 2,113
Movies 10,197 1,646
Attributes 6 4
Attribute values 13,367 3,105
Ratings per user 404 235
Sparsity 96% 86%
Best-pairs Vs. All-pairs
23
% = Improvement with respect to the traditional CB profile matching
Best-pairs All-pairs (the higher, the better)
Rating prediction Ranking prediction
Distributional Vs. Ontology semantics
24
Ranking prediction
% = Improvement with respect to the traditional CB profile matching
(the higher, the better)
SCB Vs. State of the art
25
SCB (proposed method) SVD++ BPR-MF
Rating prediction Ranking prediction
% = Improvement with respect to the traditional CB profile matching
Outline
26
Novel content-based approach (SCB)
Novel context-aware approach (SPF)
Distributional Semantics
Outline
27
Novel context-aware approach (SPF)
Limitations of traditional contextual pre-filtering
Semantic pre-filtering approach
Experimental evaluation
Context-aware recommendation
Rating-based distributional semantics of conditions
Novel content-based approach (SCB)
Distributional Semantics
Context matters
28
 Assumption: the user's experience depends on context
29
Context-aware recommendation
 Context as additional dimension for estimation
 Three main context-aware recommender families
Inputs: target user, target item, target context, in-context ratings → Prediction model → predicted rating
Pre-filtering, Contextual modeling, Post-filtering
 Main limitation: its lack of flexibility
 Only ratings acquired in exactly the same context are used
 Hypothesis: ratings filtering can be enhanced by
exploiting semantic similarities between contexts
Traditional contextual pre-filtering
30
Pipeline: in-context ratings + target context → Ratings filtering → local ratings → Prediction model → predicted rating
 Key idea: reuse ratings acquired in similar contexts
Semantic contextual pre-filtering
31
Pipeline: in-context ratings + target context + semantic similarities (≈ / ≠) + global threshold → Ratings filtering → local ratings → Prediction model → predicted rating
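A sketch of the semantic pre-filtering step, assuming a context-similarity function and a global threshold as described above; the names (semantic_prefilter, toy_sim) and the similarity values are hypothetical, used only to illustrate the idea.

```python
def semantic_prefilter(ratings, target_ctx, ctx_sim, threshold):
    """Select ratings acquired in the target context or in a context whose
    semantic similarity to the target is at least the global threshold.

    ratings:   list of (user, item, rating, context) tuples
    ctx_sim:   function (context_a, context_b) -> similarity score
    threshold: global similarity threshold; a threshold of 1.0 reduces this
               to traditional exact-match pre-filtering
    """
    return [(u, i, r) for (u, i, r, ctx) in ratings
            if ctx == target_ctx or ctx_sim(ctx, target_ctx) >= threshold]

# Toy usage with a hypothetical similarity function between single conditions.
def toy_sim(a, b):
    table = {frozenset({"sunny", "family"}): 0.8, frozenset({"sunny", "sad"}): -0.3}
    return table.get(frozenset({a, b}), 0.0)

ratings = [("u1", "i1", 4, "sunny"), ("u2", "i1", 5, "family"), ("u3", "i2", 2, "sad")]
local = semantic_prefilter(ratings, "sunny", toy_sim, threshold=0.5)
print(local)  # keeps the "sunny" and "family" ratings, drops the "sad" one
# The selected local ratings are then fed to a context-free model (e.g. MF).
```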
Distributional semantics of contextual
conditions derived from rating data
32
 Assumption: two contexts are similar if their
composing conditions influence ratings similarly
Condition User1 User2 User3 User4 User5 User6 User7
1 -0.7 0 0.9 0.1 -0.6 0
0.7 -0.8 0.5 0.8 0.4 -0.2 0
-0.5 0.7 0.2 -1 0.9 0.8 0.5
Influence of family context in User6’s ratings
(“<0” = negative, “0” = neutral, “>0” = positive)
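Following the description in the editor's notes (average deviation between in-context ratings and a context-free baseline, from the user perspective), this is a rough sketch of how condition vectors and their cosine similarities could be computed; the function names and data layout are assumptions, not the thesis implementation.

```python
from collections import defaultdict
import numpy as np

def condition_vectors(ratings, baseline, users):
    """Semantic vector of each condition: per-user average deviation between
    the observed in-context ratings and a context-free baseline prediction.

    ratings:  list of (user, item, rating, conditions) tuples
    baseline: function (user, item) -> context-free rating estimate
    users:    ordered list of user ids defining the vector dimensions
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for user, item, rating, conditions in ratings:
        dev = rating - baseline(user, item)
        for cond in conditions:
            sums[cond][user] += dev
            counts[cond][user] += 1
    return {
        cond: np.array([sums[cond][u] / counts[cond][u] if counts[cond][u] else 0.0
                        for u in users])
        for cond in sums
    }

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b) / denom if denom else 0.0

# Situations composed of several conditions would first be averaged into one
# vector, which is then compared with the target context using cosine similarity.
```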
 Six in-context rating data sets on diverse domains:
Evaluation data sets
UMAP – June 2013, Rome, Italy 33
Datasets Ratings Conditions
Context
granularity
Music 4013 26 1
Tourism 1358 57 3
Adom 1464 14 3
Comoda 2296 49 12
Movie 2190 29 2
Library 609K 149 4
Semantic Vs. traditional pre-filtering
34
% = MAE reduction with respect to a context-free MF model
(the higher, the better)
Semantic Traditional
SPF Vs. State of the art
35
% = MAE reduction with respect to the context-free MF model
(the higher, the better)
SPF (proposed method) UI-Splitting CAMF
Main contributions
36
Novel content-based approach (SCB)
Novel context-aware approach (SPF)
Distributional Semantics
 Method for computing the
distributional semantics of item’s
attributes
 Two strategies for exploiting the
semantic similarities during profile
matching
Main contributions (II)
37
Semantic Content-Based filtering (SCB)
 Better accuracy than state of the art in new user scenarios
Main contributions (III)
38
Semantic Content-Based filtering (SCB)
SCB (proposed method)
38
Rating prediction Ranking prediction
 Method for computing the
distributional semantics of
contextual conditions
 Novel semantic pre-filtering
method that reuses ratings in
semantically similar contexts
Main contributions (IV)
39
Semantic Contextual Pre-filtering (SPF)
 Better accuracy than state of the art
Main contributions (V)
40
Semantic Contextual Pre-filtering (SPF)
SPF
Question 1?
YES. It is possible to enhance content-based
recommendation by exploiting the distributional
semantics of item’s attributes
Question 2?
YES. It is possible to enhance context-aware
recommendation by exploiting the distributional
semantics of contextual conditions
Conclusions
41
 Conference papers:
 CCIA 2010: Codina, V. & Ceccaroni, L. Taking advantage of semantics…
 DCAI 2010: Codina, V., & Ceccaroni, L. A Recommendation System for the…
 CCIA 2011: Codina, V., & Ceccaroni, L. Extending Recommendation Systems with…
 CCIA 2012: Codina, V., & Ceccaroni, L. Semantically-Enhanced Recommenders
 CARR 2013: Codina et al. Semantically-enhanced pre-filtering for…
 UMAP 2013: Codina et al. Exploiting the Semantic Similarity of Contextual…
 RecSys 2013: Codina et al. Local Context Modeling with Semantic Pre-filtering
 Journal paper:
 UMUAI (User Modeling and User-Adapted Interaction journal): Codina et al.
Distributional Semantic Pre-filtering in Context-Aware Recommender Systems.
2012 Impact Factor: 1.600 (current status: accepted)
Publications related to the thesis
42

Similar to PhD defense - Exploiting distributional semantics for content-based and context-aware recommendation (20)
Extending Recommendation Systems With Semantics And Context Awareness (Victor Codina)
Synthese Recommender System (Andre Vellino)
PhD defense
Social Recommendation a Review.pptx
Content Recommendation Through Linked Data
Dynamic personalized recommendation on sparse data
Bayesian Phylogenetics - Systematics.pptx
"Content-based RecSys: problems, challenges & research directions"-UMAP'10, I...
Dynamic personalized recommendation on sparse data
acmsigtalkshare-121023190142-phpapp01.pptx
An enhanced kernel weighted collaborative recommended system to alleviate spa... (IJECEIAES)
Discriminate2Rec: A Discriminative Temporal Interest-based Content-based Reco...
Invited talk @Roma La Sapienza, April '07
Invited talk @Aberdeen, '07: Modelling and computing the quality of informati... (Paolo Missier)
Overview of recommender system
Rokach-GomaxSlides.pptx
Rokach-GomaxSlides (1).pptx
Linked Administrative Data and Adaptive Design
Slides sem on pls-complete
GPS for Chemical Space - Digital Assistants to Support Molecule Design - Chem...


Editor's notes

  1. Today I'm going to present the main contributions of my research in the field of recommender systems. This work has been carried out at UPC, with the support of the KEMLG research group, and it has been supervised by Dr. Luigi Ceccaroni.
  2. We are living in an era of information and choice overload, with access to an overwhelming number of alternatives for almost every type of product or service we are interested in. Although having such a variety of options is usually seen as something beneficial, it also has the negative effect of making the decision-making process harder, leading us to make poor decisions when we don't have the necessary knowledge.
  3. A natural way of solving this information overload problem is to rely on the recommendations of other people, and this simple observation is what motivated the development of RSs. The goal of RSs is therefore to help users find the right items through recommendations adapted to their preferences. Here you can see an example of personalized movie recommendations provided by the popular movie rental service Netflix.
  4. Nowadays, the success of many popular sites in a large variety of domains strongly depends on RSs. Amazon, eBay, Netflix, Spotify, Yahoo News and LinkedIn are some popular examples. They use RSs to add value to their information services, improving the user's experience and, as a consequence, their business.
  5. Recommender systems are composed of three main components: the knowledge base, which stores information about the items to recommend and historical user data, that is, previous user-item interactions that show what users liked or disliked in the past; the recommendation engine, where one or several recommendation models exploiting the knowledge base are used to make recommendations; and finally the user interface component, which is responsible for presenting the recommendations in a proper way and also for collecting new feedback about the recommended items. My thesis has focused on improving the accuracy of existing recommendation models.
  6. The recommendation task is commonly formulated as a rating prediction problem, that is, the problem of estimating how much a target user will like or dislike a certain candidate item. Depending on the type of information exploited, recommendation models are commonly classified into three main families: CF approaches, which make predictions for a user based on the ratings of others, so they only require rating data; CB approaches, whose predictions are based on the metadata of the items the target user rated in the past and of the candidate ones; and finally, the context-aware approaches, which in addition to the ratings also incorporate contextual information into their processes. A common limitation shared by the three recommendation approaches is that they perform poorly (in terms of accuracy) in data-sparsity scenarios, and although this is a well-known limitation in the research community, it is still an open and relevant issue.
  7. A reason for this low accuracy of CB and CA approaches in data-sparsity scenarios is that their models lack semantic intelligence. Therefore, several works have addressed this limitation by exploiting the explicit semantic relationships about item content and contextual information available in domain ontologies. In CB approaches, these explicit similarities between item attributes are commonly used to infer new user interests. For example, a CB recommendation model exploiting this item ontology could infer that users who like castles are also interested in monasteries, and vice versa, because these two concepts are hierarchically related. In CA approaches, the hierarchical relationships between contextual conditions are commonly used to generalize the context when it is too fine-grained to make meaningful contextual recommendations.
  8. However, using ontologies as a knowledge source has its limitations. On the one hand, the process of building and maintaining expressive ontologies is expensive. This limits their use in many domains, and the number of publicly available domain-specific ontologies is small; most of them consist of general taxonomies that are limited in terms of expressiveness and richness. In addition, another major limitation of ontologies is the fact that they are predefined specifications of a domain based on the criteria of human experts. For this reason, it may happen that the ontology doesn't fit the data which is actually used for making recommendations, and therefore exploiting this knowledge is not really useful for improving the prediction accuracy.
  9. In order to overcome this limitation of ontology-based semantics, in this thesis I have investigated the use of distributional semantics derived from rating data to improve recommendations. Unlike similarities derived from ontologies, distributional semantic similarities are automatically derived from the data itself, and consequently this source of semantics doesn't suffer from the previously mentioned limitations of ontologies. On the one hand, user data is cheaper and easier to obtain than ontologies. They are not bound to static knowledge representations. Finally, distributional semantic similarity measures can capture finer-grained similarities that might only be detected from the data.
  10. My research has therefore focused on investigating how distributional semantics derived from rating data can be exploited in existing CB and CA recommendation models in order to improve their accuracy. These are the two research questions of this thesis. To answer these questions I have implemented and empirically evaluated two recommendation models: (1) a novel content-based approach enhanced with the distributional semantics of item attributes, and (2) a novel context-aware approach enhanced by using the distributional semantics of contextual conditions.
  11. This is the outline I will follow during the presentation. First, before presenting each of the two proposed approaches, I'm going to introduce the concept of distributional semantics and its mathematical foundations, which come from Computational Linguistics. In the second part, I'll present the novel content-based approach enhanced with distributional semantics and its evaluation results; the state of the art will be presented in each of the sections. And in the last part, I'll talk about the novel contextual pre-filtering approach and also its performance results.
  12. In Distributional Semantics the meaning of a concept is captured by the usage or distributional properties of the concept, which are automatically derived from the corpus of data where the concept is used. The fundamental idea behind this way of extracting semantic similarities between domain concepts is the so-called distributional hypothesis, which claims that concepts repeatedly co-occurring in the same context or usage tend to be related. Distributional semantics has been mainly studied in Linguistics, where usages or contexts are defined by specific regions of text that can have different granularities: for instance, the whole document, a paragraph or a sentence.
  13. A common method for measuring distributional similarities between words consists of employing a vector-space representation of concept meaning, and then measuring the similarity in terms of proximity in that vector space. This matrix shows an example of such a representation, where rows represent the semantic vectors of these words, and each of the elements (the columns) indicates whether the concept was used in the linguistic context (which in this example is assumed to be a text sentence). Commonly these values are calculated by means of a weighting scheme over the occurrence frequency of the concept in the specific region of text. In this example, we can see that the concept wine has a better overlap with glass than with spoon because they co-occur more frequently.
  14. Once the semantic vectors, or co-occurrence matrix, have been computed, we are ready to calculate semantic similarities between words. To do so we need to employ a specific similarity measure. In the literature there are several types of similarity measures that can be used for this purpose, such as set-theory measures and probabilistic measures. However, for computing similarities in the vector space, the cosine similarity is one of the most commonly used because of its proven reliability, especially when dealing with high-dimensional vector spaces. Here I'm showing in a 2D space the main idea of the cosine similarity, which is calculated as the cosine of the angle between the vectors. The smaller the angle, the more similar the semantic vectors. Therefore in this case the cosine similarity between glass and wine is larger than the one between wine and spoon. Additionally, it has the advantage that it can be used in combination with dimensionality reduction techniques like SVD. These techniques are useful when the dimensionality of the semantic vectors is too high and the vectors are sparse, because they can produce a more compact and informative semantic representation. This usually improves the accuracy of the similarity assessments.
  15. This is the outline I will follow during the presentation. First, before presenting each of the two proposed approaches, I'm going to introduce the concept of distributional semantics and its mathematical foundations, which come from Computational Linguistics. In the second part, I'll present the novel content-based approach enhanced with distributional semantics and its evaluation results; the state of the art will be presented in each of the sections. And in the last part, I'll talk about the novel contextual pre-filtering approach and also its performance results.
  16. The main assumption of CB recommendation approaches is that users tend to like items with attributes similar to those of the items they already liked in the past. As illustrated in this graphic, CB models first build a model of the user's interests in the same attribute space as items, and then use this user profile to recommend new items whose attributes match the user's interests. In domains where explicit ratings are available, the use of linear CB models is common, in which user profiles are represented as weighted vectors, each value quantifying the degree of interest in a certain item attribute based on the ratings given to the items containing the attribute; the predictions are then computed by directly comparing the user and item vector representations.
  17. Commonly the item-to-user profile matching is computed by means of the dot product or the cosine similarity, which are methods that rely only on the "syntactic" evidence of attribute relatedness. That is, syntactically different attributes do not contribute to the similarity value. In the example (where 0 means that the attribute does not appear in the profile), using the dot product to compute the matching score between this user profile and item profile, only the weights of attribute a2 would be aggregated. Therefore, these methods lack semantic intelligence in this sense, which limits the accuracy of the prediction, especially if the user profiles are based on few ratings and consequently there is little knowledge about the user's interests. My hypothesis was that traditional profile matching could be enhanced by exploiting the distributional similarities of syntactically different item attributes in addition to the exact matches.
  18. In particular, I proposed two profile matching strategies based on pairwise comparison that exploit the distributional semantic similarities between item attributes: a best-pairs and an all-pairs strategy. The best-pairs strategy aggregates, in addition to the exact attribute matchings, the best-matching attribute pairs, so each attribute in the item profile with a value different from 0 is compared with only one attribute in the user profile different from zero. In this example… The all-pairs strategy, as its name indicates, aggregates all the possible attribute-pair combinations appearing in both profiles, so for the same user and item profile comparison the number of aggregated values is doubled. In both strategies the aggregated attribute pairs are weighted according to their semantic similarity value, so that the weaker similarities contribute less to the predicted score. I experimented with these two strategies because my hypothesis was that they might perform differently depending on the recommendation task. In particular, the all-pairs strategy is supposed to perform better in ranking prediction, where what matters most is the order of the recommended items and not how close the predicted and true ratings are. In contrast, given that the best-pairs strategy is more selective, it should be more adequate for rating prediction, where the exact predicted score is relevant.
  19. So far I have explained the methods for exploiting semantic similarities during the item-to-user profile matching. Now I'm going to talk about how we calculated these similarities based on the distributional semantics of item attributes derived from rating data. The main assumption of the proposed method for computing such distributional similarities is that two attributes are semantically related if several users are interested in them in a similar way. Based on this assumption, to measure user-dependent distributional similarities we first need to compute the user-dependent semantic vectors, where each element stores a user interest weight. That is, the attributes' semantic vectors are built with respect to the attribute-based user profiles generated by the CB profile learner. In this example I show the semantic vectors of three movie attributes with respect to six users of the system. If we analyze the number of co-occurrences between pairs of attributes, it is easy to observe for the <Bruce Willis, action> pair that several users tend to be interested in them similarly, whereas there is only one such case between Bruce Willis and comedy. Finally, based on this semantic representation, we calculate the distributional similarity between two attributes by comparing their semantic vectors. We experimented with several measures but, as expected, the cosine similarity was the one performing better in general.
  20. For the evaluation of SCB we used an extension of the popular movie rating data set collected by the MovieLens recommender, which contains over 10 million ratings from 2K users on 10K movies. We used this data set because it included a large variety of movie attributes such as genres, directors, actors, countries of origin, filming locations and user tags; some of them extracted from IMDb. In order to avoid the introduction of non-informative movie metadata into the CB models which could degrade predictions we discarded some of them, especially the least popular actors and user tags. We also removed all the movies with less than five tags as well as the ratings associated to them.
  21. Here I illustrate the % improvement achieved when using the proposed pairwise strategies exploiting the user-based distributional similarities, compared to the traditional profile matching strategy. The baseline and the enhanced CB approaches employed the same user profile learning method; in these experiments we employed a sophisticated user-profile learning method based on the rating average. MAE and RMSE are well-known metrics for measuring how accurately the models predict unknown ratings, and Recall and NDCG are metrics that measure the accuracy of the models at making personalized rankings. "All" means that the results are averaged over all the users, and "New" over the set of new users; in our experiments we considered as new users the 10% of users with the lowest number of ratings. We can see that the new-user scenario is where the two variants perform significantly differently and are particularly effective. On the one hand, the best-pairs strategy is better than the all-pairs one for rating prediction; in contrast, the all-pairs strategy clearly outperforms the best-pairs in terms of ranking precision. These results support the hypothesis that the all-pairs strategy is more effective for ranking, given that for this task what matters most is the order of the items and not the closeness between the predicted and the true rating, and that the best-pairs strategy, which is a more selective matching strategy, is better for rating prediction.
  22. Here I compare the ranking accuracy of the all-pairs strategy when exploiting different sources of semantic similarities: the blue bars correspond to the user-based distributional semantics; the yellow bars to the distributional semantics derived from item-based co-occurrences (that is, in the item-based representation rating data is not considered, only the item metadata); and the red bars are similarities derived from an ontology. The ontology-based semantics were derived from the hierarchical relationships defined in the Amazon.com movie taxonomy. As can be observed, using distributional semantics the overall accuracy is better than when using ontology-based semantics, the user-based being slightly better than the item-based. In the new-user scenario results are quite different: in this case, the item-based similarities are clearly the least effective, and the user-based and ontology-based ones have similar accuracy results. Considering the accuracy on both sets of users, the results validate the hypothesis that user-based semantics, derived from rating data, can be more effective for improving prediction accuracy than the other types.
  23. In this other slide I show the improvement achieved by using the proposed novel CB method (the orange bar) compared to two state-of-the-art CF approaches based on Matrix Factorization, a popular CF method. The yellow bars correspond to SVD++, an MF model which was part of the winning solution in the Netflix prize and is therefore especially effective for rating prediction, and the red bars correspond to BPR-MF, another MF model which is designed for recommending rankings and therefore is not able to make rating predictions. Possibly you have noted that the gain achieved for rating prediction is much smaller than the gain achieved for ranking. This is because in rating prediction the space for improvement is more limited, as was demonstrated during the Netflix challenge, where a 1M dollar prize was offered for reducing the RMSE of their approach by 10% and 3 years of research were needed to achieve it. If we look at the overall results (the "All" columns), we can see that the CF approaches are clearly better: SVD++ is the best model for rating prediction and BPR-MF for ranking. However, for new users the new CB method outperforms the best CF approach for each recommendation task; differences are especially significant in terms of MAE and NDCG. This proves that our method is effective for improving CB recommendation in general, and for improving on state-of-the-art CF methods in data-sparsity scenarios such as the new-user one.
  24. This is the outline I will follow during the presentation. First, before presenting each of the two proposed approaches, I'm going to introduce the concept of distributional semantics and its mathematical foundations, which come from Computational Linguistics. In the second part, I'll present the novel content-based approach enhanced with distributional semantics and its evaluation results; the state of the art will be presented in each of the sections. And in the last part, I'll talk about the novel contextual pre-filtering approach and also its performance results.
  25. The main assumption of CARS is that items can be experienced differently by the users depending on the current contextual situation, and as a result, user evaluations or ratings can also be different. A clear example where context matters is in the tourism domain, where the same recommendations to the same users can be considered as good or bad depending on the weather conditions.
  26. For this reason, context-aware recommendation approaches incorporate contextual information into their processes. Typically, CARS extend existing CB and CF techniques with context-awareness, and depending on how they incorporate context into the recommendation process, three main families of context-aware approaches can be identified: pre-filtering, post-filtering and contextual modeling. Pre-filtering approaches exploit contextual information to discard the user's ratings that are not relevant in the context in which the user is asking for a recommendation; then, a context-free CB or CF approach is used to make recommendations based on the subset of relevant ratings. On the contrary, post-filtering approaches use contextual information once recommendations are made by a context-free model to adjust them, for instance by applying some kind of rescoring. Finally, contextual modeling approaches incorporate context into the recommendation model, representing the user's interests and other model parameters as a function of context. Because context-aware approaches require a large number of ratings from users for items in several contexts, they are more affected by the data-sparsity problem than the context-free ones. Contextual pre-filtering is the approach that typically suffers most from this limitation, and for this reason my research has focused on this paradigm.
  27. Traditional contextual pre-filtering is known as the reduction-based approach because, for each target contextual situation, it builds a strict local model where only the ratings acquired in exactly the same situation as the target one are used for recommending. The main limitation of this approach is its lack of flexibility, because it always uses the maximum level of contextualization, and therefore it fails when the target situations are too specific and not relevant, or when there are not enough ratings in that situation for generating a robust local prediction model. With this example I'm showing how the traditional contextual pre-filtering works. Each of these circles represents the set of training ratings tagged with one of three syntactically different situations, s1, s2 and s3. Assuming that the target context is s3, the method would discard all the ratings acquired in s1 and s2. Finally it builds a local prediction model based on the selected ratings. My hypothesis is that it is possible to overcome this lack of flexibility by exploiting the semantic similarities between contexts during the rating pre-filtering process.
  28. To validate this hypothesis, we proposed a novel pre-filtering approach that, in addition to the ratings acquired in exactly the same context, also reuses ratings acquired in contexts semantically similar to the target one. Following the same example, let's assume now that the system knows that the target context sunny is semantically related to when users travel with family but not to when users are sad. In this case the semantic pre-filtering would also reuse the ratings acquired in the family context for building the local prediction model used to make predictions in the target context sunny. Our approach employs a global similarity threshold to select those situations that are similar enough to be considered reusable: the larger the threshold, the sharper the contextualization, that is, the more similar the local models are to the strict models generated by the traditional approach.
  29. So far I have assumed the existence of semantic similarities between contexts, and now I'm going to explain how we compute these similarities with respect to the rating data. In particular, our method computes distributional semantic similarities between contextual situations based on the assumption that two situations are similar if their composing conditions influence users' ratings in a similar way. For this reason, in this case the semantic vectors of contextual conditions contain estimates of their influence on the given ratings. We estimated this influence as the average deviation between the observed ratings when the condition holds and a context-free rating estimated by using a baseline predictor. The average deviation can be calculated either from the item perspective or the user perspective, and depending on the rating data one is more appropriate than the other; here I'm showing an example using the user-based perspective. In this case the -1 indicates that the family condition influences the ratings of User6 negatively. Once the semantic vectors of the conditions have been computed, we calculate their similarities by comparing the vectors using the cosine similarity: the more similar the influence, the more similar the conditions. In this example, we can see that family and sunny are more similar than sunny and sad, given that there are more users for whom the influence is similar, whereas between sunny and sad there is only one. In the case that the situation is defined by several conditions, we first compute the semantic vector of the situation by averaging the vectors of its composing conditions, and then we compute the cosine similarity.
  30. For the evaluation we have considered 6 data sets of contextually tagged ratings on diverse domains and with different characteristics. Here I'm showing some of them: conditions refers to the total number of conditions captured by the system, and context granularity is calculated as the average number of conditions per contextual situation; the larger the number, the more specific the contexts. The Music data set contains ratings for music tracks collected by an in-car music recommender. The Tourism data set contains ratings for POIs in the region of South Tyrol. Adom, Comoda and Movie are all movie rating data sets, and Library is about book ratings. As you can see, Library is the biggest data set with more than 600k ratings, and Comoda is the one with the most fine-grained contextual situations.
  31. Here I'm showing the MAE reduction with respect to a context-free MF model when using the proposed semantic pre-filtering (the orange bars) and the traditional one (the yellow ones). The larger the percentage, the better the rating prediction accuracy. As you can see, in all the data sets the semantic pre-filtering is clearly superior to the traditional one, proving the effectiveness of our method of exploiting distributional semantic similarities between contextual situations during pre-filtering to improve accuracy. The traditional pre-filtering is even worse than the context-free model in some data sets. This poor performance in Tourism, Music and Comoda is due to the lack of flexibility of this approach, which always builds a strict local model; in some cases the target situations are so specific that there is not enough training data to build robust local models, and therefore their accuracy is worse than that of the global context-free model.
  32. Here I'm showing the results of the proposed semantic pre-filtering (the orange bars) compared to two state-of-the-art context-aware approaches. The blue bars correspond to another pre-filtering approach that, unlike the reduction-based approach that builds local models for each target context, modifies the original rating matrix by splitting the rating vectors associated with the users and items into virtual vectors based on the contextual condition that influences the rating the most, and then builds a global model based on the new rating matrix. The red bars correspond to CAMF, a contextual modeling approach that extends the standard MF model with additional parameters that model the influence of context with respect to the items or the users; in this case the context is modeled as part of the MF model. An advantage of pre-filtering approaches is that they can use any context-free recommendation technique to build the local models. However, to properly compare the performance of the pre-filtering approaches to CAMF, we use them in combination with the standard context-free MF. In other words, our method uses MF to build the local models, as does the global model of the splitting approach. As can be observed, the three context-aware prediction models significantly outperform MF in all the data sets, confirming that contextual information is relevant for improving the rating predictions. On the other hand, the new method is the most effective at exploiting the context, since it outperforms the other approaches in all the data sets, and the differences are especially large in the Tourism, Adom, Comoda and Movie data sets.
  33. Building block: distributional semantics. Key idea: content-based and context-aware recommendation can be enhanced by exploiting distributional semantics derived from rating data.
  34. User-based distributional semantics of attributes: based on how users are interested in them; more effective than item-based and ontology-based semantics; 50% gain in ranking accuracy; 7% gain in rating prediction.
  35. User-based distributional semantics of attributes: based on how users are interested in them; more effective than item-based and ontology-based semantics; 50% gain in ranking accuracy; 7% gain in rating prediction.
  36. Based on how conditions influence the users’ ratings
  37. Based on how conditions influence the users’ ratings
  38. Question 1: Is it possible to enhance CB recommendation by exploiting distributional semantic similarities between item attributes? Yes: semantic similarities between attributes are useful for enhancing the profile matching. Question 2: Is it possible to enhance contextual recommendation by exploiting distributional semantic similarities between contextual conditions?
  39. Many results reported in this thesis have already been presented at several international conferences, some of them of significant impact in the field of Recommender Systems, such as the UMAP and RecSys conferences. Additionally, the main results of this thesis have also been published in a highly ranked journal related to the field.