Slides from my talk on Personalised Access to Linked Data, presented at the EKAW 2014 conference. The poster for this paper won the best poster award at the conference!
1. Personalised Access to Linked Data
Milan Dojchinovski and Tomas Vitvar
Web Intelligence Research Group
Czech Technical University in Prague
The 19th International Conference on Knowledge Engineering
and Knowledge Management (EKAW 2014)
November 24-28, 2014, Linköping, Sweden
Milan Dojchinovski
milan.dojchinovski@fit.cvut.cz - @m1ci - http://dojchinovski.mk
Except where otherwise noted, the content of this presentation is licensed under
Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported
2. Outline
Personalised Access to Linked Data - @m1ci - http://dojchinovski.mk
• Introduction
• Personalised Resource Recommendations
• Experiments and Results
• Conclusion and Future Work
3. Introduction
LOD cloud stats [1]:
• 294 datasets in Sep 2011
• 1,091 datasets in Apr 2014
• 271% growth
• Finding relevant information in the LOD cloud is not easy
- SPARQL queries, manually dereferencing URIs, …
• … or ask other people for recommendations and get
personalised recommendations of resources
• Linked Data based recommenders can help
[1] M. Schmachtenberg et al., Adoption of linked data best practices in different topical domains, ISWC 2014.
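The 271% figure follows directly from the two dataset counts on this slide. A quick check, with a helper name (`percent_growth`) that is ours and not from the talk:

```python
def percent_growth(old: int, new: int) -> float:
    """Relative growth from `old` to `new`, expressed in percent."""
    return (new - old) / old * 100.0


# Dataset counts quoted on the slide: 294 (Sep 2011) and 1,091 (Apr 2014).
growth = percent_growth(294, 1091)
print(f"{growth:.0f}% growth")
```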
4. Related Work
• dbRec (Passant, 2010): semantic distance measure
- function of direct and indirect links
• Content-based LD recommender (Di Noia et al., 2012)
- movies domain, max resource distance: 2
• Lookup Explore Discovery (Mirizzi et al., 2010)
- user input required
- recommendations related to the entities occurring in the query
• Discovery Hub (Marie et al., 2013)
- based on spreading activation
- utilizes only a small portion of the information in DBpedia
• Aemoo (Musetti et al., 2012)
- Encyclopedic Knowledge Patterns over DBpedia
5. Introduction
• Method for personalised Linked Data recommendations
- apply the collaborative filtering technique to Linked Data
- recommendations from users with similar resource interests
• Two novel metrics:
- resource similarity and resource relevance
• Considered aspects:
- Resource Commonalities
- how much information two resources share
- Resource Informativeness
- how informative the resources are
- Resource Connectivity
- how well the resources are connected
6. Outline
• Introduction
• Personalised Resource Recommendations
- Resource Similarity
- Resource Relevance
• Experiments and Results
• Conclusion and Future Work
7. Resource Recommendation In a Nutshell
• Input: RDF graph (including user profiles)
• Step 1: evaluate user similarities
- e.g. similarity between resources representing users
- instances of the foaf:Person class
• Step 2: recommend resources from similar users
- compute the relevance of each resource candidate
- incorporate the resource (user) similarities
Figure: example RDF graph with resources such as #Alfredo, #mlachwani, #Hashtagram, #FriendLynx, #Instagram and #Twitter-API, connected through dc:creator, ls:usedAPI, ls:tag and ls:category links.
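The two steps can be sketched on a toy graph. This is only an illustration under stated assumptions: the triples, the `user:`/`mashup:`/`api:` names and the function names are hypothetical, and plain Jaccard overlap of used APIs stands in for the resource similarity and relevance metrics of the talk.

```python
from collections import defaultdict

# Hypothetical toy triples (subject, predicate, object); not the real dataset.
TRIPLES = [
    ("mashup:1", "dc:creator", "user:a"),
    ("mashup:1", "ls:usedAPI", "api:x"),
    ("mashup:1", "ls:usedAPI", "api:y"),
    ("mashup:2", "dc:creator", "user:b"),
    ("mashup:2", "ls:usedAPI", "api:y"),
    ("mashup:2", "ls:usedAPI", "api:z"),
    ("mashup:3", "dc:creator", "user:c"),
    ("mashup:3", "ls:usedAPI", "api:w"),
]


def apis_by_user(triples):
    """Map each user to the set of APIs used by the mashups they created."""
    creator = {s: o for s, p, o in triples if p == "dc:creator"}
    used = defaultdict(set)
    for s, p, o in triples:
        if p == "ls:usedAPI" and s in creator:
            used[creator[s]].add(o)
    return dict(used)


def jaccard(a, b):
    """Stand-in similarity (assumption): overlap of two sets of APIs."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(triples, user, top_n=3):
    used = apis_by_user(triples)
    own = used.get(user, set())
    scores = defaultdict(float)
    for other, apis in used.items():
        if other == user:
            continue
        sim = jaccard(own, apis)      # Step 1: user similarity
        if sim == 0.0:
            continue
        for api in apis - own:        # Step 2: candidates from similar users
            scores[api] += sim
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


print(recommend(TRIPLES, "user:a"))
```

In this toy graph only `user:b` shares an API with `user:a`, so the single candidate is the API that `user:b` used and `user:a` did not.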
8. Outline
• Introduction
• Personalised Resource Recommendations
- Resource Similarity
- Resource Relevance
• Experiments and Results
• Conclusion and Future Work
9. Resource Similarity Computation
• Assumption 1: the more information two resources share, the more similar they are
• 6 resources in the shared context graph
Figure: the example RDF graph (#Alfredo, #mlachwani, #Hashtagram, #FriendLynx, #Twitter-API and others) with links of type dc:creator, ls:usedAPI, ls:tag and ls:category.
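One way to read "shared context graph" is as the overlap of the neighbourhoods of two resources. The sketch below assumes that a resource context is everything reachable within a fixed number of hops in the undirected graph; the adjacency data and the function names are hypothetical.

```python
from collections import deque

# Hypothetical undirected toy graph as an adjacency map.
ADJ = {
    "u1": {"m1"},
    "u2": {"m2"},
    "m1": {"u1", "api1", "api2"},
    "m2": {"u2", "api2", "api3"},
    "api1": {"m1"},
    "api2": {"m1", "m2"},
    "api3": {"m2"},
}


def context(adj, start, max_dist):
    """Resources reachable from `start` within `max_dist` hops (BFS)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_dist:
            continue  # do not expand beyond the context distance
        for nb in adj.get(node, ()):
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return set(dist)


def shared_context(adj, a, b, max_dist=2):
    """Resources that appear in the contexts of both `a` and `b`."""
    return (context(adj, a, max_dist) & context(adj, b, max_dist)) - {a, b}


print(shared_context(ADJ, "u1", "u2", max_dist=2))
```

Under Assumption 1, a larger shared context means the two resources are more similar; the choice of `max_dist` controls how much of the graph counts as context.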
10. Resource Similarity Computation (cont.)
• Assumption 2: less probable shared resources carry more similarity information than more common ones
• Resource Information Content (IC): evaluated by computing the node degree value
- Microsoft-Bing-API (deg. 40) is more informative than Twitter-API (deg. 799)
Figure: the example RDF graph (#Alfredo, #mlachwani, #Hashtagram, #FriendLynx, #Twitter-API and others) with links of type dc:creator, ls:usedAPI, ls:tag and ls:category.
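A minimal sketch of a degree-based information content. The exact formula is not given on the slide, so this assumes the usual IC = -log p, with the probability p of a resource estimated as its degree divided by the sum of all degrees; the function name is hypothetical, and treating the two quoted degrees as the whole graph is for illustration only.

```python
import math


def information_content(degrees):
    """IC per resource, assuming p(r) = deg(r) / sum of all degrees."""
    total = sum(degrees.values())
    return {r: -math.log(d / total) for r, d in degrees.items() if d > 0}


# Degrees quoted on the slide.
ic = information_content({"Microsoft-Bing-API": 40, "Twitter-API": 799})
for resource, value in sorted(ic.items(), key=lambda kv: -kv[1]):
    print(f"{resource}: IC = {value:.3f}")
```

With any monotonically decreasing function of the degree, the rarely linked Microsoft-Bing-API ranks above the very common Twitter-API, which is the point of Assumption 2.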
11. Resource Similarity Computation (cont.)
• Assumption 3: better connected shared resources carry more similarity information
• Measured by the number of simple paths between the resources
- 2 simple paths between #Alfredo and #Twitter-API
Figure: the example RDF graph (#Alfredo, #mlachwani, #Hashtagram, #FriendLynx, #Twitter-API and others) with links of type dc:creator, ls:usedAPI, ls:tag and ls:category.
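Counting simple paths (paths that never revisit a node) can be sketched with a depth-first search. The graph below is a hypothetical stand-in in which a user reaches an API through two different mashups; the length bound `max_len` is our assumption to keep the search finite on large graphs.

```python
# Hypothetical undirected toy graph: user "u" created two mashups
# that both use API "t".
ADJ = {
    "u": {"m1", "m2"},
    "m1": {"u", "t"},
    "m2": {"u", "t"},
    "t": {"m1", "m2"},
}


def count_simple_paths(adj, source, target, max_len):
    """Number of simple paths from source to target with at most max_len edges."""

    def dfs(node, visited, length):
        if node == target:
            return 1
        if length == max_len:
            return 0
        total = 0
        for nb in adj.get(node, ()):
            if nb not in visited:
                total += dfs(nb, visited | {nb}, length + 1)
        return total

    return dfs(source, {source}, 0)


print(count_simple_paths(ADJ, "u", "t", max_len=3))
```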
12. Outline
• Introduction
• Personalised Resource Recommendations
- Resource Similarity
- Resource Relevance
• Experiments and Results
• Conclusion and Future Work
13. Resource Relevance Computation
• Recommending resources of type Web API for a user
• Recommendations from similar users
- connectivity between the similar user and the resource candidate
- number of simple paths
- informativeness of each resource in these paths
Figure: the example RDF graph with similar users (e.g. #mlachwani) marked, and links of type dc:creator, ls:usedAPI, ls:tag and ls:category.
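The ingredients listed above can be combined in many ways; the slide does not give the formula. The sketch below is one possible combination, stated as an assumption: each simple path from a similar user to the candidate contributes the mean information content of the resources on it, weighted by that user's similarity. All names and numbers are hypothetical.

```python
def simple_paths(adj, source, target, max_len):
    """Yield simple paths (lists of nodes) with at most max_len edges."""
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node == target:
            yield path
            continue
        if len(path) - 1 == max_len:
            continue
        for nb in adj.get(node, ()):
            if nb not in path:
                stack.append((nb, path + [nb]))


def relevance(adj, ic, similar_users, candidate, max_len=3):
    """Assumed score: sum over similar users and paths of sim * mean path IC."""
    score = 0.0
    for user, sim in similar_users.items():
        for path in simple_paths(adj, user, candidate, max_len):
            on_path = path[1:]  # every resource after the user itself
            mean_ic = sum(ic.get(r, 0.0) for r in on_path) / len(on_path)
            score += sim * mean_ic
    return score


# Hypothetical toy data: similar user "v" reaches API "t" via two mashups.
ADJ = {"v": {"m1", "m2"}, "m1": {"v", "t"}, "m2": {"v", "t"}, "t": {"m1", "m2"}}
IC = {"m1": 1.0, "m2": 2.0, "t": 0.5}
print(relevance(ADJ, IC, {"v": 0.5}, "t"))
```

More paths and more informative resources on those paths both raise the score, which matches the aspects named on the slide without claiming to reproduce the exact metric.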
14. Outline
• Introduction
• Personalised Resource Recommendations
- Resource Similarity
- Resource Relevance
• Experiments and Results
• Conclusion and Future Work
15. Experimental Setup
• Linked Web APIs dataset
- RDF representation of ProgrammableWeb.com
- largest service and mashup repository
• Evaluated accuracy and usefulness of recommendations
• Accuracy:
- precision/recall, AUC, NDCG, MAP, MRR
• Usefulness:
- serendipity: how surprising the recommendations are
- diversity: how diverse the recommendations are
• Evaluated methods:
- User-KNN, Item-KNN, Most popular, Random
- LD with RIC, LD without RIC
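For readers unfamiliar with the accuracy metrics, here is a minimal sketch of their standard per-user definitions with binary relevance. This is not the evaluation code used in the experiments; the function names are ours. MAP and MRR are obtained by averaging average precision and reciprocal rank over all users.

```python
import math


def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k recommendations that are relevant."""
    return sum(1 for r in ranked[:k] if r in relevant) / k


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items that appear in the top k."""
    if not relevant:
        return 0.0
    return sum(1 for r in ranked[:k] if r in relevant) / len(relevant)


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item (0 if none is found)."""
    for i, r in enumerate(ranked, start=1):
        if r in relevant:
            return 1.0 / i
    return 0.0


def average_precision(ranked, relevant):
    """Mean of precision@i over the ranks i that hold a relevant item."""
    hits, total = 0, 0.0
    for i, r in enumerate(ranked, start=1):
        if r in relevant:
            hits += 1
            total += hits / i
    return total / len(relevant) if relevant else 0.0


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG: DCG of the ranking over DCG of an ideal one."""
    dcg = sum(1.0 / math.log2(i + 1)
              for i, r in enumerate(ranked[:k], start=1) if r in relevant)
    ideal = sum(1.0 / math.log2(i + 1)
                for i in range(1, min(k, len(relevant)) + 1))
    return dcg / ideal if ideal else 0.0


ranked, relevant = ["a", "b", "c", "d"], {"b", "d"}
print(precision_at_k(ranked, relevant, 2), recall_at_k(ranked, relevant, 2))
```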
16. Accuracy Evaluation
• Taking into account resource informativeness makes sense
• Item-KNN and User-KNN do not work well
- … at least in the Web services domain
Figure: precision-recall curves (axes: Recall, Precision) for the Linked Data based method with RIC, the Linked Data based method without RIC, User-KNN, Item-KNN, Most popular and Random.
17. Serendipity and Diversity Evaluation
• Serendipity score = avg. distance between the user and the recommended resources
• Diversity score = avg. dissimilarity between all pairs of recommended resources
Serendipity
@top-N    Random    Most Popular  User-KNN  Item-KNN  LD without RIC  LD with RIC
@top-5    2.97752   2.66810       2.59197   2.68006   3.18881         3.03271
@top-10   2.98455   2.67465       2.65514   2.70402   3.54821         3.26700
@top-15   2.98364   2.65816       2.68101   2.71267   3.73117         3.36509
@top-20   2.98455   2.65184       2.69780   2.70968   3.84142         3.42444

Diversity
@top-N    Random    Most Popular  User-KNN  Item-KNN  LD without RIC  LD with RIC
@top-5    0.65339   0.58347       0.62092   0.63349   0.83417         0.81949
@top-10   0.65317   0.61354       0.62411   0.64392   0.86044         0.82912
@top-15   0.65370   0.60374       0.63159   0.64558   0.87511         0.82884
@top-20   0.65347   0.60719       0.63276   0.64287   0.88435         0.83114
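The two usefulness scores can be sketched as follows. The assumptions are ours: "distance" is taken as shortest-path length in the undirected graph, and dissimilarity as 1 minus some similarity function supplied by the caller. The toy graph and names are hypothetical.

```python
from collections import deque
from itertools import combinations


def distance(adj, a, b):
    """Shortest-path length between a and b, or None if unreachable (BFS)."""
    dist = {a: 0}
    queue = deque([a])
    while queue:
        node = queue.popleft()
        if node == b:
            return dist[node]
        for nb in adj.get(node, ()):
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return None


def serendipity(adj, user, recommended):
    """Average distance between the user and the reachable recommendations."""
    dists = [d for d in (distance(adj, user, r) for r in recommended)
             if d is not None]
    return sum(dists) / len(dists) if dists else 0.0


def diversity(recommended, similarity):
    """Average dissimilarity (1 - similarity) over all recommendation pairs."""
    pairs = list(combinations(recommended, 2))
    if not pairs:
        return 0.0
    return sum(1.0 - similarity(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy graph: a chain u - m - a - n - b.
ADJ = {"u": {"m"}, "m": {"u", "a"}, "a": {"m", "n"}, "n": {"a", "b"}, "b": {"n"}}
print(serendipity(ADJ, "u", ["a", "b"]))
print(diversity(["a", "b"], lambda x, y: 0.25))
```

Higher values mean recommendations that lie further from the user in the graph (more surprising) and that resemble each other less (more diverse).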
19. Outline
• Introduction and Motivation
• Personalised Resource Recommendations
- Resource Similarity
- Resource Relevance
• Experiments and Results
• Conclusion and Future Work
20. Conclusion
• Method for personalised access to Linked Data
- recommendations based on the collaborative filtering technique
• Considered aspects:
- resources’ commonalities
- resources’ informativeness
- resources’ connectivity
• Validated on a dataset from the Web services domain
- Linked Web APIs dataset
• Future work:
- consider other multi-domain datasets
- automatic determination of optimal resource context distances
- publish the Linked Web APIs dataset to the LOD cloud