SEMashup - ENsEN in Aimashup2014 by M.Alsarem and P.Portier
1. SEMashup -Mazen Alsarem & Pierre-Edouard Portier 1
How to enhance Web snippets with Linked Data?
Mazen Alsarem & Pierre-Edouard Portier
Laboratory LIRIS, INSA de Lyon, France
SEMashup
Given the query “epimenides knossos paradox”,
among the first results returned by the Google
search engine, we find these snippets:
Our snippet highlights an alternative excerpt to
better summarize the conceptual content of the
document.
Our snippet also accentuates concepts that are
present in the document and related to the user's
information need as expressed by her query.
A mashup of Web of Data services
We use the DBpedia Spotlight service to extract
concepts from the document.
We query a DBpedia SPARQL endpoint to find
existing triples between the concepts.
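As an illustration of this step, here is a minimal sketch of a SPARQL query that asks for every triple linking two concepts of a given set. The function name `triples_between` is hypothetical, and the slides do not say which endpoint or query shape was actually used; a public endpoint such as https://dbpedia.org/sparql is only one possible target. The sketch builds the query text and does not send it.

```python
def triples_between(concept_uris):
    """Build a SPARQL query returning every triple whose subject and
    object both belong to the given set of concept URIs."""
    values = " ".join(f"<{uri}>" for uri in concept_uris)
    return (
        "SELECT ?s ?p ?o WHERE {\n"
        f"  VALUES ?s {{ {values} }}\n"
        f"  VALUES ?o {{ {values} }}\n"
        "  ?s ?p ?o .\n"
        "  FILTER(?s != ?o)\n"  # ignore self-loops
        "}"
    )

# Concepts from the running example, written as full DBpedia resource URIs.
concepts = [
    "http://dbpedia.org/resource/Epimenides",
    "http://dbpedia.org/resource/Knossos",
    "http://dbpedia.org/resource/Paradox",
]
query = triples_between(concepts)
print(query)
```

The two VALUES clauses restrict both ends of each triple to the extracted concepts, so the result is the small graph that connects them.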
In order to benefit from the Linked Data, we need
to select the concepts to extend.
We propose to rank the concepts by their
importance relative to the user's information
need.
To do this efficiently, we cannot rely only on the
small graph we built, but we need to go back to
the textual content of the document.
Therefore, we introduce a new iterative SVD
algorithm.
To each concept, we associate a text made of its
abstract and of the sentences of the document
that contain its instances.
We build a concept-stem matrix whose entries
are frequencies.
We compute a first SVD.
We give more importance to the concepts and the
stems close to the query, then we compute a
second SVD.
In the reduced SVD space, we measure how the
norms of the concepts and the stems evolved.
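The steps above can be sketched with NumPy as follows. This is an assumption-laden illustration, not the authors' implementation: the names `reduced_norms`, `stress` and `norm_evolution`, the rank `k`, the multiplicative `factor`, and the toy frequencies are all hypothetical, and "giving more importance" is modeled here simply as scaling the chosen rows and columns.

```python
import numpy as np

def reduced_norms(matrix, k):
    """Norms of the concepts (rows) and stems (columns) in the rank-k SVD space."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    concept_norms = np.linalg.norm(u[:, :k] * s[:k], axis=1)
    stem_norms = np.linalg.norm(vt[:k, :].T * s[:k], axis=1)
    return concept_norms, stem_norms

def stress(matrix, concept_idx, stem_idx, factor=2.0):
    """Assumed weighting: scale the rows and columns judged close to the query.
    An entry in both a stressed row and a stressed column is scaled twice."""
    out = np.array(matrix, dtype=float)  # copy, the input is left untouched
    out[concept_idx, :] *= factor
    out[:, stem_idx] *= factor
    return out

def norm_evolution(matrix, concept_idx, stem_idx, k=2, factor=2.0):
    """Change of the reduced-space norms between the first and second SVD."""
    base = np.array(matrix, dtype=float)
    before_c, before_s = reduced_norms(base, k)
    after_c, after_s = reduced_norms(stress(base, concept_idx, stem_idx, factor), k)
    return after_c - before_c, after_s - before_s

# Toy concept-stem frequency matrix: 3 concepts (rows) x 4 stems (columns).
freq = [[2, 1, 0, 0],
        [0, 1, 2, 0],
        [0, 0, 1, 2]]
delta_concepts, delta_stems = norm_evolution(freq, concept_idx=[0], stem_idx=[0])
print(delta_concepts, delta_stems)
```

The two returned vectors are the quantities the next step looks at: how far each concept and each stem moved once the query-related entries were stressed.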
[Figure: evolution of the norms of the concepts in the
reduced SVD space, between iterations 1 and 2. Labelled
concepts: dbp:Epimenides, dbp:Knossos, dbp:Paradox.]
The stems and the concepts that moved the most
will be stressed at the next iteration; the stems that
barely moved will be removed.
Concepts linked by a predicate to concepts
selected to be stressed will also be stressed.
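A minimal sketch of this selection rule, under stated assumptions: the function name `next_iteration` and the thresholds `top_n` and `min_move` are hypothetical, the slides give no numeric cut-offs, and a link is followed in both directions because the slides do not specify one.

```python
def next_iteration(concept_moves, stem_moves, triples, top_n=1, min_move=0.05):
    """Decide what to stress and what to drop for the next iteration.

    concept_moves, stem_moves: dict mapping a name to the absolute change
    of its norm between two iterations.
    triples: iterable of (subject, predicate, object) between concepts.
    top_n, min_move: illustrative thresholds, not values from the slides.
    """
    def by_move(moves):
        return sorted(moves, key=moves.get, reverse=True)

    # Concepts that moved the most are selected to be stressed.
    selected = set(by_move(concept_moves)[:top_n])
    # Concepts linked by a predicate to a selected concept are stressed too.
    linked = {o for s, _, o in triples if s in selected}
    linked |= {s for s, _, o in triples if o in selected}
    stressed_concepts = selected | linked

    # Stems that barely moved are removed; the top movers are stressed.
    removed_stems = {t for t, move in stem_moves.items() if move < min_move}
    kept = {t: m for t, m in stem_moves.items() if t not in removed_stems}
    stressed_stems = set(by_move(kept)[:top_n])
    return stressed_concepts, stressed_stems, removed_stems

# Made-up movements and a placeholder predicate, for illustration only.
concepts, stems, dropped = next_iteration(
    {"dbp:Epimenides": 0.9, "dbp:Knossos": 0.1, "dbp:Paradox": 0.2},
    {"paradox": 0.5, "island": 0.01},
    [("dbp:Epimenides", "ex:relatedTo", "dbp:Knossos")],
)
print(concepts, stems, dropped)
```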
We use a DBpedia SPARQL endpoint to find new
triples about the most important resources.
In a pre-processing step, we kept only the
DBpedia predicates that carry enough information
(we discarded the predicates whose objects, when
concatenated, had a low entropy).
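The entropy filter can be sketched as below. The slides do not say whether the entropy is computed over characters or words, nor which threshold was used, so this sketch assumes whitespace-separated tokens and a hypothetical `min_entropy` cut-off; the function names are also illustrative.

```python
import math
from collections import Counter

def object_entropy(objects):
    """Shannon entropy, in bits, of the tokens of the concatenated objects."""
    tokens = " ".join(objects).split()
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    total = len(tokens)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def informative_predicates(objects_by_predicate, min_entropy=1.0):
    """Keep the predicates whose objects carry at least min_entropy bits."""
    return {
        predicate
        for predicate, objects in objects_by_predicate.items()
        if object_entropy(objects) >= min_entropy
    }
```

A predicate whose objects are nearly always the same value scores close to zero and is discarded, while a predicate with varied objects is kept.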
In order to rank the triples of the extended graph
and build the snippet, we do a tensor
decomposition (CP) of the graph.
In order to take into account the types of the
predicates, we choose to do a tensor
decomposition instead of a decomposition of the
adjacency matrix (each horizontal slice of the
tensor represents the adjacency matrix for one
given predicate).
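To make the tensor view concrete, here is a small NumPy sketch that builds one adjacency slice per predicate and factorizes the tensor with a plain alternating-least-squares CP. It is only an illustration under assumptions: the function names, the rank, the iteration count, the placeholder predicates, and the idea of ranking triples by their reconstructed score are not taken from the slides.

```python
import numpy as np

def build_tensor(triples, entities, predicates):
    """x[p, s, o] = 1 when the triple (s, p, o) exists, so x[p] is the
    adjacency matrix of predicate p."""
    e = {name: i for i, name in enumerate(entities)}
    p = {name: i for i, name in enumerate(predicates)}
    x = np.zeros((len(predicates), len(entities), len(entities)))
    for s, pred, o in triples:
        x[p[pred], e[s], e[o]] = 1.0
    return x

def khatri_rao(u, v):
    """Column-wise Kronecker product; row index is i * len(v) + j."""
    return (u[:, None, :] * v[None, :, :]).reshape(-1, u.shape[1])

def cp_als(x, rank, n_iter=100, seed=0):
    """Alternating least squares for a 3-way CP decomposition."""
    rng = np.random.default_rng(seed)
    factors = [rng.random((dim, rank)) for dim in x.shape]
    for _ in range(n_iter):
        for mode in range(3):
            a, b = [factors[m] for m in range(3) if m != mode]
            unfolded = np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)
            gram = (a.T @ a) * (b.T @ b)
            factors[mode] = unfolded @ khatri_rao(a, b) @ np.linalg.pinv(gram)
    return factors

def triple_scores(factors):
    """Reconstructed tensor: score[p, s, o] = sum_r P[p, r] * S[s, r] * O[o, r]."""
    p, s, o = factors
    return np.einsum("pr,sr,or->pso", p, s, o)

entities = ["dbp:Epimenides", "dbp:Knossos", "dbp:Paradox"]
predicates = ["ex:p1", "ex:p2"]  # placeholder predicate names
triples = [("dbp:Epimenides", "ex:p1", "dbp:Knossos"),
           ("dbp:Epimenides", "ex:p2", "dbp:Paradox")]
scores = triple_scores(cp_als(build_tensor(triples, entities, predicates), rank=2))
ranked = sorted(
    triples,
    key=lambda t: -scores[predicates.index(t[1]), entities.index(t[0]), entities.index(t[2])],
)
```

Keeping one slice per predicate is what lets the factorization distinguish two resources linked by different predicates, which a single adjacency matrix would merge.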
Thank you!
And, please, come see the live demo!
http://demo.ensen-insa.org