These slides present a novel collaborative document ranking model that aims at solving a complex information retrieval task involving a multi-faceted information need. For this purpose, we consider a group of users, viewed as experts, who collaborate by addressing the different query facets. We propose a two-step algorithm based on a relevance feedback process, which first scores documents towards each expert and then allocates documents to the most suitable experts using the Expectation-Maximization learning method. The performance improvement is demonstrated through experiments on the TREC Interactive benchmark.
This paper received an award at the AIRS 2013 conference.
http://link.springer.com/chapter/10.1007%2F978-3-642-45068-6_10
ftp://ftp.irit.fr/IRIT/SIG/2013_AIRS_STB.pdf
3. From Individual to Collaborative Information Retrieval (CIR)
[Figure: an individual information need addressed by an information retrieval system, vs. a shared information need addressed by a collaborative information retrieval system]
Complex or exploratory tasks motivate collaboration paradigms [Shah, 2012; Foley et al., 2009; Morris and Teevan, 2010]:
- Division of labor
- Sharing of knowledge
- Awareness
Synergic effect [Shah, 2012]
4. From Individual to Collaborative Information Retrieval (CIR)
[Figure: a multi-faceted shared information need about the Hubble telescope, with facets such as its achievements, gravitational lenses, and new cosmological theories, addressed by a collaborative information retrieval system]
5. Related Work: Multi-Faceted Search, Individual vs. Collaborative Search
• Individual-based search
  – Identifying query facets for facet-based retrieval
    • Terminological resources [Dakka et al., 2006]
    • Navigation-based classification [Dou et al., 2011]
    • Probabilistic models [Blei et al., 2003; Deerwester et al., 1990]
  – Result diversification [Carbonell and Goldstein, 1998; Wang et al., 2012]
• Collaborative-based search
  – Relevance feedback-based CIR models [Foley et al., 2009; Morris et al., 2008]
  – Role-based CIR models [Pickens et al., 2008; Shah et al., 2010]
6. Research Questions
[Figure: an information need surrounded by its query facets (Facet 4, Facet 6, Facet 8, ...)]
- How to leverage experts' domain knowledge for collaboratively ranking documents?
- How to mine query facets?
7. CIR Model for Multi-Faceted Search
Query Facet Mining → Expert-Based Document Scoring → Expert-Based Document Allocation
8. CIR Model for Multi-Faceted Search (Step 1: Query Facet Mining)
• Document diversification and facet mining (see the sketch below)
[Figure: starting from the documents retrieved for the shared information need, the top-ranked documents are diversified [Carbonell and Goldstein, 1998]; Latent Dirichlet Allocation [Blei et al., 2003] is then fitted by likelihood optimization over the diversified set, and its latent topics are taken as the query facets]
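A minimal sketch of this facet-mining step, assuming MMR-style diversification followed by a gensim LDA fit; the helper names (`mmr_diversify`, `mine_facets`), the similarity function, and the default list sizes are illustrative choices, not taken from the paper.

```python
from gensim import corpora, models

def mmr_diversify(ranked, sim, gamma=1.0, k=50):
    """Greedy MMR re-ranking [Carbonell and Goldstein, 1998].
    ranked: list of (doc_id, query_score) pairs sorted by score;
    sim(a, b): inter-document similarity in [0, 1];
    gamma: relevance/novelty trade-off (tuned to 1 in these slides)."""
    selected, pool = [], list(ranked)
    while pool and len(selected) < k:
        best = max(pool, key=lambda d: gamma * d[1] - (1 - gamma) *
                   max((sim(d[0], s[0]) for s in selected), default=0.0))
        selected.append(best)
        pool.remove(best)
    return [doc_id for doc_id, _ in selected]

def mine_facets(doc_tokens, num_topics=200):
    """Fit LDA [Blei et al., 2003] on the diversified documents;
    each latent topic is treated as one candidate query facet."""
    vocab = corpora.Dictionary(doc_tokens)
    bow = [vocab.doc2bow(tokens) for tokens in doc_tokens]
    return models.LdaModel(bow, num_topics=num_topics, id2word=vocab)
```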
9. CIR Model for Multi-Faceted Search (Step 2: Expert-Based Document Scoring)
• Estimating the relevance of each document towards each expert (see the sketch below)
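The scoring formula itself did not survive extraction; what follows is only a plausible numpy reading, assuming the score of a document for an expert interpolates (with the λ = 0.6 reported in the tuning slide) between its query relevance and the match between its facet distribution and the expert's facet expertise. The array names and the interpolation form are assumptions, not the paper's exact estimator.

```python
import numpy as np

def expert_scores(doc_query_rel, doc_topics, expertise, lam=0.6):
    """Illustrative expert-based document scoring (not the paper's formula).
    doc_query_rel: (n_docs,) relevance of each document to the shared query;
    doc_topics: (n_docs, n_facets) P(facet | doc) from the LDA step;
    expertise: (n_experts, n_facets) per-expert facet affinity, e.g.
        estimated from their relevance feedback (an assumption);
    lam: interpolation weight (lambda = 0.6 in the reported tuning)."""
    facet_match = doc_topics @ expertise.T            # (n_docs, n_experts)
    return lam * doc_query_rel[:, None] + (1 - lam) * facet_match
```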
10. CIR Model for Multi-Faceted Search (Step 3: Expert-Based Document Allocation)
• Assigning each document to the most suitable expert, considering the experts' domain expertise towards the query facets (an EM sketch follows)
• Division of labor: the intersection of the document lists simultaneously displayed to the experts is empty
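A minimal EM sketch for this allocation step, under the assumption that experts are modeled as mixture components over the mined facets; the paper's exact likelihood is not reproduced here. The final argmax assignment gives each document to exactly one expert, so the displayed lists are disjoint, which realizes the division-of-labor constraint.

```python
import numpy as np

def allocate_documents(doc_topics, n_experts, n_iter=50, seed=0):
    """Illustrative EM for expert-based document allocation.
    doc_topics: (n_docs, n_facets) facet distributions from the LDA step.
    Each expert is a mixture component with a facet profile; the E-step
    computes p(expert | doc), the M-step re-estimates the profiles."""
    rng = np.random.default_rng(seed)
    n_docs, n_facets = doc_topics.shape
    profiles = rng.dirichlet(np.ones(n_facets), size=n_experts)
    weights = np.full(n_experts, 1.0 / n_experts)
    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space for stability
        log_lik = doc_topics @ np.log(profiles.T + 1e-12)  # (docs, experts)
        log_post = log_lik + np.log(weights)
        log_post -= log_post.max(axis=1, keepdims=True)
        resp = np.exp(log_post)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update mixture weights and facet profiles
        weights = resp.mean(axis=0)
        profiles = resp.T @ doc_topics
        profiles /= profiles.sum(axis=1, keepdims=True)
    # Hard assignment: one expert per document, so lists never overlap
    return resp.argmax(axis=1)
```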
11. Experimental Evaluation: Dataset and Collaboration Simulation
• Dataset
  – TREC Interactive 6-7-8
  – 210,158 articles
  – 277 collaborative search sessions built from 20 TREC topics
• Collaboration-simulation-based framework [Foley et al., 2009], sketched below
[Figure: for each TREC query Q, a 2-means classification splits users into domain experts and domain novices; all possible combinations of m > 1 domain experts form the simulated groups]
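A sketch of how the expert groups could be simulated, assuming the 2-means split runs on a per-user expertise signal (e.g. the amount of relevant feedback each user contributed for the query, which is an assumption) and that every combination of m > 1 experts yields one simulated session.

```python
from itertools import combinations

from sklearn.cluster import KMeans

def simulate_groups(user_ids, expertise_signal, min_size=2, max_size=None):
    """Illustrative collaboration simulation [Foley et al., 2009].
    expertise_signal: (n_users, 1) array of per-user expertise values;
    2-means separates domain experts from novices, then every
    combination of m > 1 experts forms one simulated group."""
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(expertise_signal)
    expert_label = km.cluster_centers_.argmax()       # higher-mean cluster
    experts = [u for u, lab in zip(user_ids, km.labels_) if lab == expert_label]
    max_size = max_size or len(experts)
    for m in range(min_size, max_size + 1):
        yield from combinations(experts, m)
```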
12. Experimental Evaluation: Collaboration Simulation
For each TREC query Q and each expert group: building the timeline of relevance feedback
15. Experimental Evaluation: Baselines
[Pipeline: Query Facet Mining → Expert-Based Document Scoring → Expert-Based Document Allocation → Division of Labor]
• W/oEMDoL: the individual version of our model, which integrates only the expert-based document scoring.
• W/oDoL: our collaborative model without the division-of-labor principle.
• W/oEM: our collaborative model without the expert-based document allocation.
16. Experimental Evaluation: Metrics
• Relevance judgments: relevance feedback provided by TREC participants, including an agreement level
• Measures [Shah and Gonzalez-Ibanez, 2011], sketched in code below
  – Precision: the more relevant documents the displayed lists include, the higher the precision.
  – Coverage: the more diversified the displayed lists, the higher the coverage.
  – Relevant coverage: the more diversified and relevant documents the displayed lists include, the higher the relevant coverage.
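A small sketch of how these three measures could be computed over one session; the exact normalizations in [Shah and Gonzalez-Ibanez, 2011] may differ from the assumptions made here.

```python
def session_metrics(displayed_lists, relevant):
    """Illustrative session-level measures.
    displayed_lists: one list of displayed doc ids per collaborator;
    relevant: set of doc ids judged relevant by the TREC assessors."""
    shown = [d for lst in displayed_lists for d in lst]
    distinct = set(shown)
    precision = sum(d in relevant for d in shown) / max(len(shown), 1)
    coverage = len(distinct)                      # distinct documents reached
    relevant_coverage = len(distinct & relevant)  # distinct relevant documents
    return precision, coverage, relevant_coverage
```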
17. Experimental Evaluation: Results
• 2-fold cross-validation (see the split sketch below)
  – 2 random subsets of collaborative search sessions, split on the basis of TREC queries
• Parameter tuning
  – Diversification: γ = 1
  – LDA modeling: likelihood maximal at 200 topics
  – Number of facets considered for topic-based modeling: 5
  – Expert-based document scoring: λ = 0.6
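A sketch of the query-based 2-fold split, assuming each session is represented as a dictionary carrying a `query_id` key (an illustrative representation): sessions of the same TREC query always land in the same fold.

```python
import random

def two_fold_by_query(sessions, seed=0):
    """Split sessions into two folds, grouping by TREC query so that
    all sessions of a given query fall in the same fold."""
    queries = sorted({s["query_id"] for s in sessions})
    random.Random(seed).shuffle(queries)
    half = set(queries[: len(queries) // 2])
    fold_a = [s for s in sessions if s["query_id"] in half]
    fold_b = [s for s in sessions if s["query_id"] not in half]
    return fold_a, fold_b
```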
19. Experimental Evaluation: Results
• Complementary analysis
  – Retrieval effectiveness is generally stable even as the group size increases
  – The higher the agreement level, the fewer documents are assessed as relevant, which favors search failure
20. Conclusion and Future Work
• A 2-step collaborative ranking model for satisfying a multi-faceted information need with a group of experts:
  – Expert-based document scoring
  – Expert-based document allocation by means of the Expectation-Maximization learning method
• Evaluation through a collaboration-simulation-based framework showing effective results
• Future work:
  – Design of other formal methods to emphasize the division of labor
  – Modeling of the user profile through user behavior in addition to relevance feedback