WWW 2015 – 7th International Workshop on Web Intelligence & Communities
May 18, 2015 in Florence, Italy
An evaluation of SimRank and Personalized PageRank to
build a recommender system for the Web of Data
Phuong T. Nguyen, Paolo Tomeo, Tommaso Di Noia, Eugenio Di Sciascio
{phuong.nguyen, paolo.tomeo, tommaso.dinoia, eugenio.disciascio}@poliba.it
Polytechnic University of Bari - Bari (ITALY)
Outline
• Introduction
• Recommender systems
• Linked Open Data
• SimRank and Personalized PageRank
• Experimental Evaluation
• Conclusions
Introduction
• Content-based recommender systems are based on the notion of similarity between items, so they need content describing those items
• The Web of Data is an opportunity to foster data-intensive applications
• We investigated how effectively SimRank and Personalized PageRank work with Linked Data in the recommendation task
Recommender Systems
Recommender Systems (RSs) are software tools and techniques providing suggestions for items to be of use to a user.
[F. Ricci, L. Rokach, B. Shapira, and P. B. Kantor, editors. Recommender Systems Handbook. Springer, 2011.]
• Content-based filtering
• Collaborative filtering
Content-based RSs
Content-based RSs try to recommend items similar to those a given user has liked in the past. The recommendations are based upon a description of the items and a profile of the user's interests.
[Diagram: User profile, Items description → Recommender System → Personalized Recommendations]
Main drawback: Limited Content Analysis
We need domain knowledge and rich descriptions of the items
No suitable suggestion without enough information
The quality of CB recommendations depends on the quantity and quality of the features explicitly associated with the items
P. Lops, M. de Gemmis, G. Semeraro. Content-based Recommender Systems: State of the Art and Trends. In Recommender Systems Handbook: A Complete Guide for Research Scientists & Practitioners, Chapter 3, Springer, 2010.
Linked Open Data
• Initiative for publishing and connecting data on the Web using Semantic Web technologies
• More than 30 billion RDF triples from hundreds of data sources
• A good resource for mining existing information and deducing new knowledge
Florence in DBpedia
Example of facts in triple form (subject - predicate - object):
Florence - dcterms:subject - Former capitals of Italy
Florence - dbpedia-owl:region - Tuscany
Florence - dbpedia-owl:country - Italy
...
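As a minimal sketch (not part of the slides), the Florence facts can be held in Python as (subject, predicate, object) tuples and grouped into outgoing edges per subject. The helper name `outgoing` is hypothetical and only illustrates the triple structure.

```python
from collections import defaultdict

# The three facts about Florence from the slide, as (subject, predicate, object).
triples = [
    ("Florence", "dcterms:subject", "Former capitals of Italy"),
    ("Florence", "dbpedia-owl:region", "Tuscany"),
    ("Florence", "dbpedia-owl:country", "Italy"),
]

def outgoing(triples):
    """Group triples by subject: subject -> list of (predicate, object)."""
    edges = defaultdict(list)
    for s, p, o in triples:
        edges[s].append((p, o))
    return dict(edges)

graph = outgoing(triples)
```

Each subject then maps to its labeled outgoing edges, which is the graph view that node-similarity algorithms operate on.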
Enrich Data model
[Diagram: Catalog Items mapped to a Knowledge Graph]
1 – Mapping (Entity linking)
2 – Subgraph extraction
Each item is represented by a part of this subgraph.
We need algorithms able to compute similarity between graphs.
SimRank
SimRank computes similarity between nodes in a graph using the
structural context: two nodes are similar if they are referenced by
similar nodes.
G. Jeh and J. Widom. Simrank: A measure of structural-context similarity. In ACM KDD ’02, pages 538–543, 2002.
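Given $k \geq 0$, $R^{(k)}(\alpha, \beta) = 1$ if $\alpha = \beta$, and $R^{(0)}(\alpha, \beta) = 0$ if $\alpha \neq \beta$.
Otherwise, the general formula is
$$R^{(k+1)}(\alpha, \beta) = \frac{C}{|I(\alpha)|\,|I(\beta)|} \sum_{i=1}^{|I(\alpha)|} \sum_{j=1}^{|I(\beta)|} R^{(k)}\big(I_i(\alpha), I_j(\beta)\big)$$
where $I(\alpha)$ is the set of nodes pointing to $\alpha$ and $C \in (0, 1)$ is a decay factor (notation as in Jeh and Widom).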
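A minimal Python sketch of the iterative SimRank computation described by Jeh and Widom may help here. It is not the authors' implementation: the decay factor `C=0.8`, the iteration count, and the toy graph are assumptions chosen for illustration, and pairs where either node has no in-neighbors are scored 0.

```python
from itertools import product

def simrank(in_neighbors, C=0.8, iterations=5):
    """Iterative SimRank. in_neighbors maps each node to the nodes pointing to it."""
    nodes = list(in_neighbors)
    # R(0): 1 on the diagonal, 0 everywhere else.
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, nodes)}
    for _ in range(iterations):
        new = {}
        for a, b in product(nodes, nodes):
            if a == b:
                new[(a, b)] = 1.0
                continue
            ia, ib = in_neighbors[a], in_neighbors[b]
            if not ia or not ib:
                # No in-neighbors on one side: similarity is 0.
                new[(a, b)] = 0.0
                continue
            total = sum(sim[(x, y)] for x in ia for y in ib)
            new[(a, b)] = C * total / (len(ia) * len(ib))
        sim = new
    return sim

# Toy graph (hypothetical): both albums are referenced by the same genre node.
in_neighbors = {
    "genre": [],
    "album_a": ["genre"],
    "album_b": ["genre"],
}
scores = simrank(in_neighbors)
```

This version stores a score for every node pair, so it only suits small graphs.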
Personalized PageRank
A node receives an amount of rank from every node which points to
it and in turn transfers an amount of its rank to the nodes it refers to.
T. H. Haveliwala. Topic-sensitive pagerank. In ACM WWW ’02, pages 517–526, 2002. ACM.
The similarity between two items α and β, represented by the vectors $\alpha = \{a_i\}_{i=1,\dots,n}$ and $\beta = \{b_i\}_{i=1,\dots,n}$, is computed as the inner product of the two vectors
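The rank propagation and the vector comparison can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the damping value 0.85, the rule that dangling nodes return their rank to the seed, and the toy graph are all hypothetical, and the similarity is the plain inner product without any normalization.

```python
def personalized_pagerank(out_links, seed, damping=0.85, iterations=50):
    """Power iteration for PageRank with all teleportation mass on `seed`."""
    nodes = list(out_links)
    rank = {n: 1.0 if n == seed else 0.0 for n in nodes}
    for _ in range(iterations):
        new = {n: 0.0 for n in nodes}
        for n in nodes:
            targets = out_links[n]
            if targets:
                # A node transfers its rank in equal shares to the nodes it refers to.
                share = damping * rank[n] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # Assumption: a dangling node returns its rank to the seed.
                new[seed] += damping * rank[n]
        new[seed] += 1.0 - damping
        rank = new
    return rank

def inner_product(a, b):
    """Inner product of two rank vectors defined over the same nodes."""
    return sum(a[n] * b[n] for n in a)

# Toy graph (hypothetical): items a and b share the neighbor x; c is isolated.
out_links = {"a": ["x"], "b": ["x"], "x": ["a", "b"], "c": []}
vec_a = personalized_pagerank(out_links, "a")
vec_b = personalized_pagerank(out_links, "b")
vec_c = personalized_pagerank(out_links, "c")
```

Here each item is described by the rank vector obtained when it is the seed, so items that reach the same nodes get a larger inner product.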
K-Nearest Neighbors
The K-Nearest Neighbors (k-NN) algorithm finds the set neighbors(α) containing the k entities β most similar to a given item α, using a similarity function sim(α, β).
A k-NN recommendation algorithm predicts the rating of item α for a user u, taking into account the neighbors of α and the profile of u.
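A rough Python sketch of this scheme follows. The slides do not give the prediction formula, so the similarity-weighted average used in `predict` is an assumption (a common choice for item-based k-NN), and the similarity table and user profile are made-up toy data.

```python
def neighbors(item, sim, items, k):
    """Return the k items most similar to `item`, excluding the item itself."""
    others = [b for b in items if b != item]
    return sorted(others, key=lambda b: sim(item, b), reverse=True)[:k]

def predict(item, profile, sim, items, k):
    """Assumed rule: similarity-weighted average of the user's ratings
    over the neighbors of `item` that appear in the user's profile."""
    rated = [(sim(item, b), profile[b])
             for b in neighbors(item, sim, items, k) if b in profile]
    weight = sum(s for s, _ in rated)
    if weight == 0:
        return 0.0
    return sum(s * r for s, r in rated) / weight

# Toy data (hypothetical): similarities of item "d" to the other items.
table = {
    frozenset(("d", "a")): 0.9,
    frozenset(("d", "b")): 0.5,
    frozenset(("d", "c")): 0.1,
}

def sim(x, y):
    return table.get(frozenset((x, y)), 0.0)

items = ["a", "b", "c", "d"]
profile = {"a": 5.0, "b": 3.0, "c": 1.0}  # ratings given by user u
nearest = neighbors("d", sim, items, k=2)
score = predict("d", profile, sim, items, k=2)
```

Any similarity function, such as SimRank or the Personalized PageRank inner product, can be passed in as `sim`.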
Evaluation Methodology
• Top-N item recommendation task, for different values of N (from top-1 to top-50)
• 60-40% holdout split for each user
• Baselines: Vector Space Model (VSM) and a simplified variation of the algorithm proposed in [T. Di Noia, R. Mirizzi, V. C. Ostuni, D. Romito, and M. Zanker. Linked open data to support content-based recommender systems. In ACM I-SEMANTICS ’12. ACM]
• The results for the four algorithms were computed with 10, 20, 40, 60, and 80 neighbors
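A per-user 60-40 holdout split could look like the Python sketch below. The slides do not say how ratings were assigned to each part, so the random shuffle with a fixed seed is an assumption, and `holdout_split` is a hypothetical helper name.

```python
import random

def holdout_split(ratings_by_user, train_fraction=0.6, seed=0):
    """Split each user's ratings into a training and a test part."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = {}, {}
    for user, ratings in ratings_by_user.items():
        shuffled = list(ratings)
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_fraction)
        train[user] = shuffled[:cut]
        test[user] = shuffled[cut:]
    return train, test

# Toy data (hypothetical): item ids rated by two users.
ratings_by_user = {
    "u1": list(range(10)),
    "u2": list(range(5)),
}
train, test = holdout_split(ratings_by_user)
```

Splitting per user keeps every user represented in both parts, which a single global split would not guarantee.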
Evaluation Metrics
Accuracy: Precision (P@N), Recall (R@N)
Sales diversity: Catalog coverage, Entropy, Gini index
Novelty: Long-tail percentage, Expected Popularity Complement (EPC@N)
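Some of these metrics can be sketched in Python as below. The slides do not give the exact formulas, so these are common textbook definitions used as assumptions: precision divides the hits by N, recall by the number of relevant items, coverage is the share of the catalog recommended at least once, and entropy is Shannon entropy over recommendation counts.

```python
import math
from collections import Counter

def precision_at_n(recommended, relevant, n):
    """Fraction of the top-n recommendations that are relevant."""
    return sum(1 for i in recommended[:n] if i in relevant) / n

def recall_at_n(recommended, relevant, n):
    """Fraction of the relevant items that appear in the top-n."""
    if not relevant:
        return 0.0
    return sum(1 for i in recommended[:n] if i in relevant) / len(relevant)

def catalog_coverage(rec_lists, catalog_size, n):
    """Share of catalog items recommended to at least one user."""
    seen = {i for recs in rec_lists for i in recs[:n]}
    return len(seen) / catalog_size

def entropy(rec_lists, n):
    """Shannon entropy of how recommendations are spread over items."""
    counts = Counter(i for recs in rec_lists for i in recs[:n])
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())

# Toy data (hypothetical).
p = precision_at_n(["a", "b", "c", "d"], {"a", "c", "e"}, 2)
r = recall_at_n(["a", "b", "c", "d"], {"a", "c", "e"}, 2)
cov = catalog_coverage([["a", "b"], ["a", "c"]], catalog_size=10, n=2)
ent = entropy([["a", "b"], ["a", "c"]], n=2)
```

Lower entropy and coverage mean the recommendations are concentrated on fewer items.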
Dataset
Subset of Last.fm mapped to DBpedia
• 1,867 users
• 700 most popular artists and bands
• 47,330 ratings
• 113,386 extracted triples
Mappings: http://sisinflab.poliba.it/semanticweb/lod/recsys/datasets/
Properties extracted
Properties extracted from DBpedia (113,386 extracted triples)
Inbound: dbo:producer, dbo:artist, dbo:writer, dbo:associatedBand, dbo:associatedMusicalArtist, dbo:musicalArtist, dbo:musicComposer, dbo:bandMember, dbo:formerBandMember, dbo:starring, dbo:composer
Outbound: dcterms:subject, dbo:genre, dbo:associatedBand, dbo:associatedMusicalArtist, dbo:instrument, dbo:occupation, dbo:birthPlace, dbo:background
Accuracy with 40 neighbors
PageRank and SimRank more accurate
Catalog coverage with 40 neighbors
Worst coverage
Distribution with 40 neighbors
Recommendations concentrated on a few items
Novelty Results
Conclusions
SimRank and Personalized PageRank obtained interesting results compared to the two baselines in terms of precision, recall and novelty, even though catalog coverage decreases and recommendations concentrate on fewer items.
Future work:
Same experiments on two more datasets: MovieLens and LibraryThing
Same experiments using other graph similarity metrics
Thanks for your
attention!
Q & A

Predictive Analysis for Loan Default  Presentation : Data Analysis Project PPTPredictive Analysis for Loan Default  Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 

An evaluation of SimRank and Personalized PageRank to build a recommender system for the Web of Data

  • 1. WWW 2015 – 7th International Workshop on Web Intelligence & Communities May 18, 2015 in Florence, Italy An evaluation of SimRank and Personalized PageRank to build a recommender system for the Web of Data Phuong T. Nguyen, Paolo Tomeo, Tommaso Di Noia, Eugenio Di Sciascio {phuong.nguyen, paolo.tomeo, tommaso.dinoia, eugenio.disciascio}@poliba.it Polytechnic University of Bari - Bari (ITALY)
  • 2. Outline: Introduction; Recommender systems; Linked Open Data; SimRank and Personalized PageRank; Experimental Evaluation; Conclusions
  • 3. Introduction. Content-based recommender systems rely on the notion of similarity between items, so they need content describing those items. The Web of Data is an opportunity to foster data-intensive applications. We investigated how effectively SimRank and Personalized PageRank work with Linked Data in the recommendation task.
  • 4. Recommender Systems. Recommender Systems (RSs) are software tools and techniques providing suggestions for items to be of use to a user. [F. Ricci, L. Rokach, B. Shapira, and P. B. Kantor, editors. Recommender Systems Handbook. Springer, 2011.]
  • 5. Recommender Systems. Recommender Systems (RSs) are software tools and techniques providing suggestions for items to be of use to a user. [F. Ricci, L. Rokach, B. Shapira, and P. B. Kantor, editors. Recommender Systems Handbook. Springer, 2011.] • Content-based filtering • Collaborative filtering
  • 6. Content-based RSs. Content-based RSs try to recommend items similar to those a given user has liked in the past. The recommendations are based on a description of the items and a profile of the user's interests. Diagram labels: User profile, Items description, Recommender System, Personalized Recommendations.
  • 7. Main drawback: Limited Content Analysis. We need domain knowledge and rich descriptions of the items. No suitable suggestion is possible without enough information. The quality of CB recommendations depends on the quantity and quality of the features explicitly associated with the items. [P. Lops, M. de Gemmis, G. Semeraro. Content-based Recommender Systems: State of the Art and Trends. In Recommender Systems Handbook: A Complete Guide for Research Scientists & Practitioners, Chapter 3, Springer, 2010.]
  • 8. Linked Open Data • Initiative for publishing and connecting data on the Web using Semantic Web technologies • More than 30 billion RDF triples from hundreds of data sources • Good resource to mine existing information and deduce new knowledge
  • 9. Florence in DBpedia. Example of facts in triple form (subject, predicate, object): (Florence, dcterms:subject, Former capitals of Italy); (Florence, dbpedia-owl:region, Tuscany); (Florence, dbpedia-owl:country, Italy); ...
  • 10. Enrich Data model. Catalog Items, Knowledge Graph. 1. Mapping (entity linking) 2. Subgraph extraction
  • 11. Enrich Data model. Catalog Items, Knowledge Graph. Each item is represented by a part of this subgraph. We need algorithms able to compute similarity between graphs.
  • 12. SimRank. SimRank computes similarity between nodes in a graph using the structural context: two nodes are similar if they are referenced by similar nodes. [G. Jeh and J. Widom. SimRank: A measure of structural-context similarity. In ACM KDD '02, pages 538–543, 2002.] Given $k \ge 0$, $R^{(k)}(\alpha, \beta) = 1$ when $\alpha = \beta$, and $R^{(k)}(\alpha, \beta) = 0$ when $k = 0$ and $\alpha \ne \beta$. Otherwise, the general formula, following Jeh and Widom, is $R^{(k+1)}(\alpha, \beta) = \frac{C}{|I(\alpha)|\,|I(\beta)|} \sum_{i=1}^{|I(\alpha)|} \sum_{j=1}^{|I(\beta)|} R^{(k)}(I_i(\alpha), I_j(\beta))$, where $I(\cdot)$ denotes the set of in-neighbors of a node and $C \in (0, 1)$ is a decay factor.
  • 13. Personalized PageRank. A node receives an amount of rank from every node that points to it and in turn transfers an amount of its rank to the nodes it refers to. [T. H. Haveliwala. Topic-sensitive PageRank. In ACM WWW '02, pages 517–526, 2002.] The similarity between two items $\alpha$ and $\beta$, represented by vectors $\alpha = \{a_i\}_{i=1,\dots,n}$ and $\beta = \{b_i\}_{i=1,\dots,n}$, is computed as the inner product of the two vectors.
  • 14. K-Nearest Neighbors. The K-Nearest Neighbors (k-NN) algorithm finds the set neighbors(α) containing the k entities β most similar to a given item α, using a similarity function sim(α, β). A k-NN recommendation algorithm predicts the rating of item α for a user u, taking into account the neighbors of α and the profile of u.
  • 15. Evaluation Methodology • Top-N item recommendation task, for different values of N (from top-1 to top-50) • 60-40% holdout split for each user • Baselines: Vector Space Model (VSM) and a simplified variation of the algorithm proposed in [T. Di Noia, R. Mirizzi, V. C. Ostuni, D. Romito, and M. Zanker. Linked open data to support content-based recommender systems. In ACM I-SEMANTICS '12. ACM] • The results for the four algorithms were computed with 10, 20, 40, 60, and 80 neighbors
  • 16. Evaluation Metrics. Accuracy: Precision (P@N), Recall (R@N). Sales diversity: Catalog coverage, Entropy, Gini index. Novelty: Long-tail percentage, Expected Popularity Complement (EPC@N).
  • 17. Dataset. Subset of Last.fm mapped to DBpedia • 1,867 users • 700 most popular artists and bands • 47,330 ratings • 113,386 extracted triples. Mappings: http://sisinflab.poliba.it/semanticweb/lod/recsys/datasets/
  • 18. Properties extracted from DBpedia (113,386 extracted triples). Inbound: dbo:producer, dbo:artist, dbo:writer, dbo:associatedBand, dbo:associatedMusicalArtist, dbo:musicalArtist, dbo:musicComposer, dbo:bandMember, dbo:formerBandMember, dbo:starring, dbo:composer. Outbound: dcterms:subject, dbo:genre, dbo:associatedBand, dbo:associatedMusicalArtist, dbo:instrument, dbo:occupation, dbo:birthPlace, dbo:background.
  • 19. Accuracy with 40 neighbors. PageRank and SimRank are more accurate.
  • 20. Catalog coverage with 40 neighbors. Worst coverage.
  • 21. Distribution with 40 neighbors. Recommendations are concentrated on a few items.
  • 22. Novelty Results
  • 23. Conclusions. SimRank and Personalized PageRank obtained interesting results compared to the two baselines in terms of precision, recall, and novelty, even though catalog coverage and item distribution decrease. Future work: the same experiments on two more datasets (Movielens and TheLibraryThing); the same experiments using other graph similarity metrics.
  • 24. Thanks for your attention! Q & A
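
The SimRank recursion summarized on slide 12 can be illustrated with a short sketch. It follows the iterative definition from Jeh and Widom; the toy graph, the node names, the decay factor of 0.8, and the number of iterations are assumptions chosen for illustration and are not taken from the slides.

```python
from itertools import product


def simrank(in_neighbors, decay=0.8, iterations=5):
    """Iterative SimRank on a directed graph given as {node: [in-neighbors]}.

    `decay` and `iterations` are illustrative defaults, not the authors' settings.
    """
    nodes = list(in_neighbors)
    # R^(0): 1 for identical nodes, 0 for every other pair.
    scores = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, repeat=2)}
    for _ in range(iterations):
        updated = {}
        for a, b in product(nodes, repeat=2):
            if a == b:
                updated[(a, b)] = 1.0
                continue
            in_a, in_b = in_neighbors[a], in_neighbors[b]
            if not in_a or not in_b:
                # A node with no in-neighbors has no structural context to compare.
                updated[(a, b)] = 0.0
                continue
            # Average similarity over all pairs of in-neighbors, scaled by the decay.
            total = sum(scores[(x, y)] for x in in_a for y in in_b)
            updated[(a, b)] = decay * total / (len(in_a) * len(in_b))
        scores = updated
    return scores


# Hypothetical toy graph: each artist is referenced by one genre node.
graph = {
    "genre_rock": [],
    "genre_pop": [],
    "artist_a": ["genre_rock"],
    "artist_b": ["genre_rock"],
    "artist_c": ["genre_pop"],
}
scores = simrank(graph)
print(scores[("artist_a", "artist_b")], scores[("artist_a", "artist_c")])
```

In this toy graph the two artists referenced by the same genre node end up more similar to each other than to the artist referenced by a different genre, which is the "referenced by similar nodes" intuition from the slide.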

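The Personalized PageRank similarity described on slide 13 can be sketched in the same spirit. The sketch assumes that each item gets its own rank vector by restarting the random walk at that item, and that two items are compared with a plain inner product of those vectors. The damping factor of 0.85, the handling of dangling nodes, the toy graph, and the absence of any vector normalization are all assumptions; the slides do not state these details.

```python
def personalized_pagerank(out_links, seed, damping=0.85, iterations=50):
    """Power iteration for PageRank with all restart mass on `seed`.

    `out_links` maps each node to the list of nodes it points to.
    `damping` and `iterations` are illustrative defaults.
    """
    nodes = list(out_links)
    rank = {n: 1.0 if n == seed else 0.0 for n in nodes}
    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        for n in nodes:
            targets = out_links[n]
            if targets:
                # A node transfers its rank evenly to the nodes it refers to.
                share = damping * rank[n] / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # Assumption: a dangling node returns its rank to the seed.
                nxt[seed] += damping * rank[n]
        # Restart: the remaining mass always goes back to the seed item.
        nxt[seed] += 1.0 - damping
        rank = nxt
    return rank


def inner_product(u, v):
    """Inner product of two rank vectors defined over the same node set."""
    return sum(u[n] * v[n] for n in u)


# Hypothetical toy graph with links in both directions between artists and genres.
graph = {
    "artist_a": ["rock"],
    "artist_b": ["rock"],
    "artist_c": ["pop"],
    "rock": ["artist_a", "artist_b"],
    "pop": ["artist_c"],
}
vec_a = personalized_pagerank(graph, "artist_a")
vec_b = personalized_pagerank(graph, "artist_b")
vec_c = personalized_pagerank(graph, "artist_c")
print(inner_product(vec_a, vec_b), inner_product(vec_a, vec_c))
```

Because the walk restarted at one artist never reaches the disconnected pop component, the rank vectors of the rock and pop artists do not overlap and their inner product is zero, while the two rock artists share rank mass on common nodes.
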
Editor's notes

  4. RSs are software tools able to suggest relevant items or information to the user in a personalized way, in an attempt to deal with the information overload problem. These systems must be able to predict items that the user may have an interest in.
  5. Typically, they present a list of recommendations produced in one of two ways: content-based filtering or collaborative filtering. In this work we are interested in the content-based approach.
  6. In general, content-based approaches have some limitations with respect to other approaches such as collaborative filtering, but they also have some advantages, such as dealing with cold-start situations. One of the main drawbacks is limited content analysis: if there are no item descriptions, the system cannot compute recommendations. To apply a CB RS we need domain knowledge. Furthermore, the better the quality of the item descriptions, the better the quality of the recommendations.