The power of search is without doubt one of the main reasons for the success of the Web. Current Web search engines return results with high precision. Nevertheless, if we limit our attention to lookup search, we miss another important search task. In exploratory search, the user wants not only to find documents relevant to her query but also to learn, discover and understand novel knowledge on complex and sometimes unknown topics.
In this paper we address this issue by presenting LED, a web-based system that aims to improve (lookup) Web search by enabling users to properly explore the knowledge associated with their query. We rely on DBpedia to explore the semantics of the keywords within the query, thus suggesting potentially interesting related topics/keywords to the user.
From Exploratory Search to Web Search and back - PIKM 2010
1. PIKM 2010 – Workshop for Ph.D. Students in Information and Knowledge Management
October 30, 2010 – Fairmont Royal York, Toronto, Canada
FROM EXPLORATORY SEARCH
TO WEB SEARCH AND BACK
Politecnico di Bari
Via Orabona, 4
70125 Bari (ITALY)
Roberto Mirizzi, Tommaso Di Noia
mirizzi@deemail.poliba.it, t.dinoia@poliba.it
Outline
Tags to improve Web Search
Exploratory Search
LED (Lookup Explore Discover): exploratory search in the Web (of Data)
DBpediaRanker: RDF ranking in DBpedia
Conclusion and Future work
Why do we use tags?
and many
more…
What is Exploratory Search?
[Gary Marchionini. Exploratory Search: From Finding to Understanding. Communications of the ACM, 49(4):41-46, 2006]
Can Semantic tags support Exploratory search?
Plugged into the Web 3.0
Disambiguation
Relations among tags
Machine understandable
Semantic-aided query refinement
LED: Lookup Explore Discover
http://sisinflab.poliba.it/led/
If semantic tags helped 10% of Internet users save 10 minutes per month on their searches, this would save over 4,000,000 working hours per year globally
LED: Lookup Explore Discover
Objectives
Enable users to properly explore the semantics of a keyword
Guide users to refine a query by suggesting related topics/keywords
Improve lookup search to explore knowledge
What is behind LED? (i)
What is behind LED? (ii)
Comments
DBpedia resources are highly interconnected in the RDF graph
Not all the relevant resources for a given node are its direct neighbors
1. Explore the neighborhood of a resource to discover new relevant resources not directly connected to it
2. Rank the results
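The exploration step can be sketched as a breadth-first visit that stops after a fixed number of hops. This is only an illustration under stated assumptions: the function name `explore_neighborhood`, the `max_depth` limit and the toy adjacency list are hypothetical and are not taken from the LED implementation, which works on the DBpedia RDF graph.

```python
from collections import deque


def explore_neighborhood(graph, start, max_depth=2):
    """Return {resource: hop distance} for resources within max_depth hops of start.

    graph is a plain adjacency list (dict of lists); max_depth is an
    assumed cut-off, not a value given in the slides.
    """
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbor in graph.get(node, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    del distances[start]  # the start resource is not a discovery
    return distances


# Toy graph with invented edges, for illustration only.
toy_graph = {
    "RDFa": ["Semantic_Web"],
    "Semantic_Web": ["RDFa", "Resource_Description_Framework"],
    "Resource_Description_Framework": ["Semantic_Web"],
}
discovered = explore_neighborhood(toy_graph, "RDFa", max_depth=2)
```

In this toy graph, `Resource_Description_Framework` is reached at distance 2 even though it is not a direct neighbor of `RDFa`; the discovered resources would then be passed to the ranking step.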
DBpedia graph exploration in LED
[Figure: portion of the DBpedia graph explored by LED. Node labels: Semantic_Web, XML-based_standards, Knowledge_representation, Data_management, Internet_architecture, Triplestores, Folksonomy, XML, Computer_and_telecommunication_standards, Web_services, User_interface_markup_languages, Scalable_Vector_Graphics, Microformats, Resource Description Framework, Microformat, RDFa, …]
Legend: skos:subject, skos:broader, Category, Article
The functional architecture
[Diagram labels: Back-end, Query engine, Storage, GUI, Ext. Info Sources, DBpedia, Lookup Service, Interface, Delicious, Yahoo!, Bing, Google, Graph Explorer, SPARQL, Context Analyzer, Ranker, Tag Cloud Generator, Meta-search engine]
Offline computation
1. Linked Data graph exploration
2. Rank nodes exploiting external information
3. Store results as pairs of nodes together with their similarity
Runtime search
1. Start typing a query
2. Query the system for relevant tags (corresponding to DBpedia resources) and aggregate results
3. Show the semantic tag cloud and the results
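The split between offline computation and runtime search can be illustrated with a small in-memory store: pairs of resources are saved with their similarity offline, and the most similar resources are looked up at query time. The class `SimilarityStore`, its method names, the symmetric storage and the `top_k` parameter are assumptions made for this sketch; the slides do not describe the actual storage layer.

```python
from collections import defaultdict


class SimilarityStore:
    """Hypothetical store for (resource, resource, similarity) triples."""

    def __init__(self):
        self._related = defaultdict(dict)

    def add_pair(self, r1, r2, similarity):
        # Offline step: store the pair in both directions (an assumption).
        self._related[r1][r2] = similarity
        self._related[r2][r1] = similarity

    def related_tags(self, resource, top_k=10):
        # Runtime step: return the top_k related resources, best first.
        candidates = self._related.get(resource, {})
        ranked = sorted(candidates.items(), key=lambda item: item[1], reverse=True)
        return ranked[:top_k]


store = SimilarityStore()
# Similarity values below are illustrative placeholders.
store.add_pair("RDFa", "Resource_Description_Framework", 0.39)
store.add_pair("RDFa", "Microformat", 0.20)
top_tag = store.related_tags("RDFa", top_k=1)
```

The list returned by `related_tags` is the kind of input a tag cloud generator could use, for example by scaling the font size of each tag with its similarity.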
DBpediaRanker: ranking
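[Diagram: ?r1 isSimilar ?r2, hasValue v]
Ranking based on external sources
$$\mathit{sim}(r_1, r_2)_{info\_source} = \left( \frac{f(r_1, r_2)}{f(r_1)} + \frac{f(r_1, r_2)}{f(r_2)} \right)_{info\_source}$$
Graph-based and text-based ranking
$$\mathit{wikilinkScore}(r_1, r_2) = \begin{cases} 0, & \text{no wikilink between } r_1 \text{ and } r_2 \\ 1, & \text{wikilink between } r_1 \text{ and } r_2 \text{ or vice versa} \\ 2, & \text{wikilink between } r_1 \text{ and } r_2 \text{ and vice versa} \end{cases}$$
$$\mathit{abstractScore}(r_1, r_2) = \frac{l(r_2, r_1)}{l(r_2)}$$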
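A minimal sketch of the graph-based score, reading the three cases as: 0 when neither resource links to the other, 1 when exactly one direction exists, 2 when both directions exist. The representation of wikilinks as a set of directed `(source, target)` pairs and the function name `wikilink_score` are assumptions for illustration; the link data below is invented.

```python
def wikilink_score(links, r1, r2):
    """Count the directions in which a wikilink connects r1 and r2 (0, 1 or 2).

    links is an assumed set of directed (source, target) pairs.
    """
    forward = (r1, r2) in links
    backward = (r2, r1) in links
    return int(forward) + int(backward)


# Invented link data, for illustration only.
toy_links = {
    ("RDFa", "Resource_Description_Framework"),
    ("Resource_Description_Framework", "RDFa"),
    ("RDFa", "Microformat"),
}
both_ways = wikilink_score(toy_links, "RDFa", "Resource_Description_Framework")
one_way = wikilink_score(toy_links, "RDFa", "Microformat")
no_link = wikilink_score(toy_links, "Microformat", "Resource_Description_Framework")
```

Under this reading the score is symmetric in its two arguments.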
DBpediaRanker: an example (i)
wikilinkScore(RDFa, Resource_Description_Framework) = 2
abstractScore(RDFa, Resource_Description_Framework) = 1.0
DBpediaRanker: an example (ii)
sim(RDFa, Resource_Description_Framework)_Google = 1.67e5 / 4.42e5 + 1.67e5 / 1.19e7 ≈ 0.39
delicious
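The Google example can be reproduced with a few lines of code. The sketch assumes that the three numbers are result counts returned by the external source for each resource alone and for the two resources together; the function name `external_similarity` and the handling of zero counts are assumptions, not part of the slides.

```python
def external_similarity(count_r1, count_r2, count_both):
    """Sum of the two ratios count_both/count_r1 and count_both/count_r2.

    Returns 0.0 when either single count is zero (an assumed convention).
    """
    if count_r1 == 0 or count_r2 == 0:
        return 0.0
    return count_both / count_r1 + count_both / count_r2


# Counts taken from the example: RDFa alone, RDF alone, both together.
sim_google = external_similarity(4.42e5, 1.19e7, 1.67e5)
```

Rounded to two decimals, `sim_google` matches the 0.39 reported in the example. Since each ratio is at most 1 when the joint count does not exceed the single counts, the value lies between 0 and 2.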
DBpediaRanker: context analysis
The same similarity measure is used in the context analysis
[Diagram: ?r1 belongsTo the context C = {?c1, ?c2, ?c…, ?cN}, hasValue v]
Example:
C = {Programming Languages, Databases, Software}
Does Dennis Ritchie belong to the given context?
Algorithm:
If (v > THRESHOLD) then
    r1 belongs to the context;
    add r1 to the graph exploration queue
Else
    r1 does not belong to the context;
    exclude r1 from graph exploration
EndIf
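The threshold test can be sketched as follows. The slides do not say how v is aggregated over the resources in C, so the mean of the pairwise similarities is used here as an assumption; the threshold value, the function names and the toy scores are also hypothetical.

```python
from collections import deque
from statistics import mean

THRESHOLD = 0.5  # hypothetical value; the slides do not give one


def belongs_to_context(resource, context, similarity, threshold=THRESHOLD):
    """Return True if the aggregated similarity v exceeds the threshold.

    v is computed as the mean similarity to the context resources
    (an assumption); context must not be empty.
    """
    v = mean(similarity(resource, c) for c in context)
    return v > threshold


def filter_for_exploration(candidates, context, similarity, threshold=THRESHOLD):
    """Queue only the candidates that belong to the context."""
    queue = deque()
    for resource in candidates:
        if belongs_to_context(resource, context, similarity, threshold):
            queue.append(resource)
    return queue


# Toy scores, invented for illustration.
context = ["Programming Languages", "Databases", "Software"]
toy_scores = {
    ("Dennis_Ritchie", "Programming Languages"): 0.8,
    ("Dennis_Ritchie", "Databases"): 0.3,
    ("Dennis_Ritchie", "Software"): 0.7,
}


def toy_similarity(resource, context_resource):
    return toy_scores.get((resource, context_resource), 0.0)


exploration_queue = filter_for_exploration(
    ["Dennis_Ritchie", "Toronto"], context, toy_similarity
)
```

With these toy scores, `Dennis_Ritchie` is queued for further exploration and `Toronto` is excluded.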
Evaluation (i)
http://sisinflab.poliba.it/evaluation
Comparison of 5 different algorithms
50 volunteers
Researchers in the ICT area
244 votes collected (on average 5 votes per user)
Average time to vote: 1 min 40 s
Evaluation (ii)
http://sisinflab.poliba.it/evaluation/data
3.91 - Good
Conclusion
LED: a system for exploratory search and query refinement on the (Semantic) Web
DBpediaRanker: ranking algorithms for resources in DBpedia
Future work
Expose a RESTful API for building novel mashups and for comparing with different systems
Improve ranking algorithms
Deal with cases where a single knowledge base is not sufficient
Combine a content-based recommendation and a collaborative-filtering approach
FROM EXPLORATORY SEARCH TO WEB SEARCH AND BACK (PIKM 2010)
If you're interested in learning more…
1. Roberto Mirizzi, Azzurra Ragone, Tommaso Di Noia, Eugenio Di Sciascio. Semantic tags generation and retrieval for online advertising. 19th ACM International Conference on Information and Knowledge Management (CIKM 2010)
2. Roberto Mirizzi, Azzurra Ragone, Tommaso Di Noia, Eugenio Di Sciascio. Ranking the Linked Data: the case of DBpedia. 10th International Conference on Web Engineering (ICWE 2010)
3. Roberto Mirizzi, Azzurra Ragone, Tommaso Di Noia, Eugenio Di Sciascio. Semantic tag cloud generation via DBpedia. 11th International Conference on Electronic Commerce and Web Technologies (EC-Web 2010)
4. Roberto Mirizzi, Azzurra Ragone, Tommaso Di Noia, Eugenio Di Sciascio. Semantic tagging for crowd computing. 18th Italian Symposium on Advanced Database Systems (SEBD 2010)
5. Roberto Mirizzi, Azzurra Ragone, Tommaso Di Noia, Eugenio Di Sciascio. Semantic Wonder Cloud: exploratory search in DBpedia. 2nd International Workshop on Semantic Web Information Management (SWIM 2010) - Best Workshop Paper at International Conference on Web Engineering (ICWE 2010)
Roberto Mirizzi - mirizzi@deemail.poliba.it
Thanks for your attention!