Presentation given at IESD2014: http://iesd14.wordpress.com/
Paper: Survey of linked data based exploration systems: https://hal.inria.fr/hal-01057035
Abstract. Linked datasets now constitute a valuable background knowledge for supporting exploration and discovery objectives through browsers, recommenders and exploratory search systems in particular. Today there is a need to look at the achievements and tendencies in this rapidly developing field in order to better orient the future research works. In this paper we propose a survey of such systems from the earliest semantic browsers to more recent and innovative ones.
Good morning everybody. I am a PhD student in the field of linked data based exploratory search systems. During my thesis I made a state-of-the-art review of the systems that allow exploration and discovery by leveraging semantic data, and linked data in particular. I will present the results of this survey today; a longer state-of-the-art review will be available in the thesis manuscript.
OK, first some elements of context.
I will start by briefly recalling some important milestones, both in terms of technology and in terms of usage.
In 2001 Tim Berners-Lee published the paper called « The Semantic Web » in Scientific American. Of course the semantic web vision and the first technical propositions were published earlier, in the nineties, but this article is an iconic one that received a lot of attention from the research community. The first development phase of the semantic web started at that time. During this period the semantic data were often domain-constrained and made up of small datasets with low heterogeneity. In other words they were difficult to leverage to build advanced end-user applications.
In 2007 the Linked Open Data initiative emerged and proposed a new way to massively publish huge real-world linked datasets: identifying datasets under a free licence and republishing them using the semantic web standards. So it was a grassroots approach. The LOD initiative was very successful and led to the massive publication of huge datasets, including in particular DBpedia. The availability of such datasets gave a fresh boost to the semantic search community. New forms of processing appeared, notably the linked data based recommenders. We will discuss this point in detail later.
About the usage context. In 2006 Gary Marchionini published a paper: “Exploratory search: from finding to understanding”. The topic of exploratory search was not new, and similar topics had been investigated before: information foraging, the berry-picking search approach, sense-making, information seeking and cognitive information retrieval. This paper was an important inspiration for a community interested in the topic of data and information exploration. Right after, in 2007, the Human-Computer Information Retrieval workshop started. And in 2012 the IESD workshop, focused on the use of semantic technologies for exploration, started in turn. A lot of systems and approaches were proposed in these workshops and in other conferences. In other words, a substantial number of systems exist today.
It is also important to notice that the 3 major search engines started to deploy knowledge graph based functionalities: the knowledge panels. Such functionalities can be considered basic exploratory search ones, as they display structured data about topics, propose recommendations and assist navigation thanks to breadcrumb functionalities in the case of Google.
It is an interesting shift that we need to follow as it impacts the lay users. It familiarizes them with exploratory search and semantic search to a certain extent.
Here we are, there is a need now to look at the achievements and tendencies in the field of exploration and discovery based on linked data in order to get a better understanding of the domain and to better orient the future research works.
Now let's talk about the method. What did we do?
We reviewed the linked data based exploration and discovery systems within 3 broad areas of classification:
The first category was browsers. Browsers allow the users to manually navigate in an information space. Browsing is a movement in a connected space. It involves a relatively high user engagement.
The second category of systems was recommenders. Recommenders are systems that aim to predict the utility of an item. In this case the users are more passive: the recommender requires only minimal or no interaction to produce a result.
The third category of systems we reviewed is exploratory search systems, which are often the most advanced systems in terms of exploration and discovery. These are systems that were built in the spirit of the Gary Marchionini paper and that often reference it. The problem, for us, with exploratory search systems is…
That they are very difficult to conceive, to evaluate, to compare and to review of course.
The problem is that to this day there are no clear good practices or very strong guidelines to conceive such systems. So their designs are very heterogeneous. They implement very different sets of functionalities. There is also a lack in terms of
So the first thing we did was to identify the important criteria to review.
For this we started from the exploratory search task characteristics that are commonly cited in the literature and we identified the main « desired effects of the exploratory search systems ». We identified the following ones.
This model should be refined by the community. I encourage you to contact me if you have some ideas about it.
To complete our previous model we also linked the desired effects of the systems to the widespread exploratory search features. You can notice here that several functionalities can be used to obtain the same effect: for example, faceted interfaces and breadcrumb navigation features both encourage the users to explore multiple browsing paths.
Again this model should be refined by the community.
Here is the final result. I show it to you because I think it is interesting to see how the column « desired effects of the systems » constitutes an interface between the exploratory search task characteristics and the functionalities.
The important point to remember is that the interfaces of exploratory search systems are often rich in terms of functionalities. Such functionalities are gathered in an interaction model. The result forms a complex alchemy that is particularly subject to the tension between the intuitiveness and the precision of the interactions.
We started to build a matrix in order to have a first synthetic view over the systems. The matrix detailed the fourteen most advanced systems (in our opinion) against a set of criteria concerning important information retrieval and human-computer interaction aspects.
This matrix constituted an important support during the analysis. It was also refined as we discovered new important criteria or ways to classify the systems. The matrix is presented in the paper.
An important outcome of this state-of-the-art review was the synthesis of the evolution of research over time.
During the first development phase of the semantic web (2001-2007) several types of browsing paradigms were investigated. Text-based browsers inspired by the classic web browsing experience appeared. In their simplest form they allowed users to navigate one resource at a time using the resources' outgoing relations. During this period visual (graph) and faceted browsers were also investigated. The small size and the relative homogeneity of the available datasets were favorable to such approaches.
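To make the "one resource at a time" idea concrete, here is a minimal sketch of browsing by following outgoing relations. It is only an illustration under my own assumptions: the triples, the `ex:` identifiers and the `outgoing` and `browse` helpers are all hypothetical and do not come from any of the surveyed browsers.

```python
# Toy in-memory triple store: (subject, predicate, object).
# All identifiers are invented for this illustration.
TRIPLES = [
    ("ex:Paris", "ex:capitalOf", "ex:France"),
    ("ex:Paris", "ex:locatedIn", "ex:Europe"),
    ("ex:France", "ex:memberOf", "ex:EuropeanUnion"),
]


def outgoing(resource, triples=TRIPLES):
    """Return the (predicate, object) pairs that leave `resource`."""
    return [(p, o) for s, p, o in triples if s == resource]


def browse(start, choices, triples=TRIPLES):
    """Follow one outgoing link per step.

    `choices` holds, for each step, the index of the link the user clicks.
    Browsing stops early when the current resource has no outgoing link.
    """
    path = [start]
    current = start
    for index in choices:
        links = outgoing(current, triples)
        if not links:
            break
        _, current = links[index]
        path.append(current)
    return path


if __name__ == "__main__":
    print(outgoing("ex:Paris"))
    print(browse("ex:Paris", [0, 0]))
```

Each step shows only the links of the current resource, which is why this style of browsing asks for a relatively high user engagement.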
In 2007 the Linked Open Data initiative renewed the research. The quality, size and coverage of generic datasets like DBpedia and Freebase opened the door to innovative processing and interaction models. New browsing paradigms like set-based, multi-pivoting and hierarchical faceting ones were investigated.
Similarity measure computation and linked data based recommenders emerged at the same time. The similarity computation was domain-constrained at the beginning. Then cross-domain and lateral approaches were researched. Some of these recommenders constituted the basis of linked data based exploratory search systems.
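As a rough illustration of what a similarity computation over linked data can look like, here is a small sketch that compares resources by the property-value pairs they share. The Jaccard coefficient is my own choice for the example, not the measure used by any particular surveyed system, and the film descriptions and the `jaccard` and `recommend` helpers are hypothetical.

```python
# Each resource is described by a set of (property, value) pairs.
# The data below is invented for this illustration.
DESCRIPTIONS = {
    "ex:FilmA": {("ex:director", "ex:Person1"), ("ex:genre", "ex:Drama"), ("ex:country", "ex:France")},
    "ex:FilmB": {("ex:director", "ex:Person1"), ("ex:genre", "ex:Drama"), ("ex:country", "ex:Italy")},
    "ex:FilmC": {("ex:director", "ex:Person2"), ("ex:genre", "ex:Comedy"), ("ex:country", "ex:Italy")},
}


def jaccard(a, b, descriptions=DESCRIPTIONS):
    """Share of property-value pairs that two resources have in common."""
    left, right = descriptions[a], descriptions[b]
    union = left | right
    return len(left & right) / len(union) if union else 0.0


def recommend(seed, descriptions=DESCRIPTIONS):
    """Rank every other resource by decreasing similarity to `seed`."""
    others = [r for r in descriptions if r != seed]
    return sorted(others, key=lambda r: jaccard(seed, r, descriptions), reverse=True)


if __name__ == "__main__":
    print(jaccard("ex:FilmA", "ex:FilmB"))
    print(recommend("ex:FilmA"))
```

Because all the resources here are films, this toy example corresponds to the domain-constrained case; a cross-domain approach would need to relate resources of different types.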
Several from-scratch exploratory search systems, with their associated data processing and interaction models, were also developed. As mentioned during the introduction, the 3 major search engines deployed their entity-recommendation solutions, confirming the potential of such approaches for mainstream services.
I will conclude by opening the reflection. There is an imperative need for the designers to maintain the coherence and the intuitiveness of the exploratory search interfaces.
In other words they need to pay attention to the alchemy they create. But how can they be sure that they are
The key is evaluation.
We relied again on the desired effects of the exploratory search engines to formulate evaluation hypotheses. In order to verify these hypotheses we also proposed several innovative approaches (READ)
Here we see, on the right, a participant commenting on the screencast of their search session. We also modeled the interactions. The result can be seen on the left.