In this paper, we argue why and how the integration of recommender systems for research can enhance the functionality and user experience in repositories. We present the latest technical innovations in the CORE Recommender, which provides research article recommendations across the global network of repositories and journals. The CORE Recommender has been recently redeveloped and released into production in the CORE system and has also been deployed in several third-party repositories. We explain the design choices of this unique system and the evaluation processes we have in place to continue raising the quality of the provided recommendations. By drawing on our experience, we discuss the main challenges in offering a state-of-the-art recommender solution for repositories. We highlight two of the key limitations of the current repository infrastructure with respect to developing research recommender systems: 1) the lack of a standardised protocol and capabilities for exposing anonymised user-interaction logs, which represent critically important input data for recommender systems based on collaborative filtering and 2) the lack of a voluntary global sign-on capability in repositories, which would enable the creation of personalised recommendation and notification solutions based on past user interactions.
Towards effective research recommender systems for repositories
1. Towards effective research recommender systems
for repositories
Petr Knoth, Lucas Anastasiou, Aristotelis Charalampous,
Matteo Cancellieri, Samuel Pearce and Nancy Pontika
CORE
Knowledge Media Institute, The Open University
United Kingdom
2. In this talk
• Why recommender systems for repositories
• Building recommender systems for research
• The CORE recommender system
• CPA-based recommender systems
4. Discoverability of content in repositories
• Criticism of repositories (Salo, 2008; Konkiel, 2012; Acharya,
2015)
“[The institutional repository] is like a roach motel. Data goes in,
but it doesn’t come out.” (Salo, 2008)
5. Exploratory research activities
• Focus on SEO, but …
• Researchers often involved in exploratory information seeking
activities (Marchionini, 2008):
6. Accessibility
Accessibility can be defined as the likelihood of
expressing a query leading to the retrieval of the
resource and the position of the resource in the search
results list.
[Azzopardi and Vinay, 2008]
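One way to make this definition concrete is a cumulative count in the spirit of the retrievability measures discussed in the information retrieval literature: a resource earns one point for every query that returns it within the top c results. The sketch below is illustrative only. The query set, the rankings, the cutoff and the uniform query weights are all assumptions made for the example, not CORE data.

```python
from collections import defaultdict

def retrievability(rankings, cutoff=10):
    """Count, per document, the queries that rank it within the top `cutoff`.

    `rankings` maps a query string to its ranked list of document ids
    (hypothetical input; any search engine could produce it).
    Every query is weighted equally, which is a simplifying assumption.
    """
    scores = defaultdict(int)
    for ranked_docs in rankings.values():
        for doc_id in ranked_docs[:cutoff]:
            scores[doc_id] += 1
    return dict(scores)

# Toy example: three queries over a four-document collection.
rankings = {
    "open access": ["d1", "d2", "d3"],
    "recommender systems": ["d2", "d1", "d4"],
    "text mining": ["d2", "d3", "d1"],
}
scores = retrievability(rankings, cutoff=2)
```

In this toy collection, d4 never reaches the top two results for any query, so it receives no score at all. A resource in that position is hard to find through search alone, which is the gap that incoming recommendation links can help close.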
7. Why is it needed?
By integrating recommender systems into repositories, we can increase
the number of incoming links to relevant resources, thereby
increasing their accessibility.
8. What use case do research recommender systems address?
“As a user, I want to receive recommendations about
content (papers, datasets, software, people to follow,
grant opportunities, methods, conferences, etc.) that is of
interest to me, so I can continuously increase knowledge
in my field.”
[Next Generation Repositories WG, 2017]
9. What effect can it have?
• Increased accessibility means more engagement and a higher
Click-Through Rate (CTR)
• People access resources on CORE twice as often via its
recommender system as via search.
10. Recommender system and social networks
• Recsys proven to drive
engagement and growth
of networks
• Enabler of successful
solutions.
14. Aspects to optimise a recommender system for
• Type of recs:
• Novelty/recency
• Relevance
• Familiarity
• Serendipity
• Audience: post-doc, lecturer, professor, etc.
• Use case: reviewing literature, staying in touch,
learning about new field
More information at: http://dl.acm.org/citation.cfm?id=3057152
16. Recommendations as a service
CORE provides a non-personalised CBF-based
recommendation system for articles from across the
global network of repositories.
• 8 million full texts
• 77 million metadata records
• 2683 repositories
17. Recs delivered using:
• a plugin (EPrints, DSpace + general repository)
• over the CORE API
Recommendable items in the CORE recommender
18. How does the CORE recommender work?
• Currently an article-article recommender system
• Preprocessing prior to recsys: enrichment with e.g.
identifiers, document type and citation data
• Similarity measure/ranking function
• Post-filtering using record quality
• Feedback (crowdsourcing a black list)
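The steps on this slide can be sketched as a small pipeline: rank candidates with a similarity function, drop low-quality records, and drop pairs that users have reported. This is a minimal sketch under stated assumptions, not the CORE implementation. The function name `recommend`, the `quality` field, the `min_quality` threshold and the shape of the black list are all hypothetical.

```python
def recommend(source_id, candidates, similarity, blacklist,
              min_quality=0.5, top_n=5):
    """Rank candidates by similarity to the source, then post-filter.

    candidates: dict of article id -> record with a 'quality' score in [0, 1]
    similarity: callable (source_id, candidate_id) -> float
    blacklist:  set of (source_id, candidate_id) pairs reported by users
    """
    # Step 1: rank every other article by the similarity function.
    ranked = sorted(
        (cid for cid in candidates if cid != source_id),
        key=lambda cid: similarity(source_id, cid),
        reverse=True,
    )
    # Step 2: post-filter on record quality and on the crowdsourced black list.
    kept = [
        cid for cid in ranked
        if candidates[cid]["quality"] >= min_quality
        and (source_id, cid) not in blacklist
    ]
    return kept[:top_n]

# Toy data: "c" is a low-quality record, ("a", "d") was reported by a user.
records = {
    "a": {"quality": 0.9},
    "b": {"quality": 0.8},
    "c": {"quality": 0.2},
    "d": {"quality": 0.7},
    "e": {"quality": 0.9},
}
toy_similarity = {("a", "b"): 0.9, ("a", "c"): 0.8, ("a", "d"): 0.6, ("a", "e"): 0.4}
recs = recommend(
    "a",
    records,
    similarity=lambda s, c: toy_similarity.get((s, c), 0.0),
    blacklist={("a", "d")},
    top_n=3,
)
```

Filtering after ranking, rather than before, keeps the similarity function independent of the quality and feedback signals, so each can be tuned separately.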
19. What features is the similarity calculation based on?
• Features:
• Textual: title, authors, abstract, full text!
• Recency: publication year
• Popularity: citations, readership
• Document type (thesis/article/slides)
• Others: subject field, Citation Proximity
Analysis/Co-citations (in progress)
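For the textual features, a common content-based approach is to compare TF-IDF vectors with cosine similarity. The sketch below shows that general technique in plain Python. It is an assumption for illustration, not a description of the similarity function CORE uses, and it treats title, abstract and full text as one concatenated string per article.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase, split on whitespace and keep alphabetic tokens only."""
    return [t for t in text.lower().split() if t.isalpha()]

def tfidf_vectors(docs):
    """docs: dict id -> text. Returns dict id -> {term: tf-idf weight}."""
    tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for toks in tokens.values() for term in set(toks))
    n_docs = len(docs)
    vectors = {}
    for doc_id, toks in tokens.items():
        tf = Counter(toks)
        # Smoothed idf so that terms present in every document keep a weight.
        vectors[doc_id] = {
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        }
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Toy corpus (made-up texts standing in for title + abstract + full text).
docs = {
    "p1": "recommender systems for research repositories",
    "p2": "research recommender systems in open repositories",
    "p3": "protein folding with deep networks",
}
vecs = tfidf_vectors(docs)
sim12 = cosine(vecs["p1"], vecs["p2"])  # overlapping vocabulary
sim13 = cosine(vecs["p1"], vecs["p3"])  # no shared terms
```

Non-textual features such as recency, popularity and document type do not fit naturally into a term vector, which is why they are usually applied as separate boosts on top of the text score.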
20. Combining features
• Evaluating different ranking functions (P, R, MAP, etc.):
• Weights for boosting
• Scaling function (e.g. exponential decay for
recency)
• Offline ground truths:
• MAG citation assumption
• MAG co-citation assumption
• Learning to rank (not done yet)
• Online A/B testing (not done yet)
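The ideas on this slide can be sketched as a weighted ranking function plus an offline evaluation against a ground truth. Everything specific here is an assumption for illustration: the weight names, the five-year half-life, the log scaling of citation counts, and the reading of the citation assumption as "articles cited by the source count as relevant".

```python
import math

def recency_boost(pub_year, current_year, half_life=5.0):
    """Exponential decay: the boost halves every `half_life` years."""
    age = max(0, current_year - pub_year)
    return 0.5 ** (age / half_life)

def combined_score(text_sim, pub_year, citations, current_year, weights):
    """Weighted sum of text similarity, recency and popularity boosts."""
    return (
        weights["text"] * text_sim
        + weights["recency"] * recency_boost(pub_year, current_year)
        + weights["popularity"] * math.log1p(citations)
    )

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k recommendations that are relevant."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k

def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant item occurs."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """runs: list of (ranked list, relevant set) pairs, one per source article."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Hypothetical weights; in practice they would be tuned against the ground truth.
weights = {"text": 1.0, "recency": 0.3, "popularity": 0.1}
score = combined_score(0.8, pub_year=2015, citations=10,
                       current_year=2017, weights=weights)

# Toy ground truth: the source article cites "x" and "z".
ranked = ["x", "y", "z", "w"]
relevant = {"x", "z"}
p_at_2 = precision_at_k(ranked, relevant, 2)
map_score = mean_average_precision([(ranked, relevant)])
```

Comparing MAP across several weight settings on the same ground truth is one simple way to choose between ranking functions before any online testing is available.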
21. Distinctive features of the CORE recommender
• Use of full text rather than just abstracts
• Ensure open access availability of recommended
articles (recsys for all)
• Free service
• Integration with repositories and availability over API
More information: https://arxiv.org/abs/1705.00578
22. Discussion & Challenges
• Effective content harvesting and TDM from
repositories
• We are losing valuable interaction data due to our
distributed infrastructure => voluntary global sign-on:
• Personalisation
• Collaborative filtering
• Citation data – movement towards having them
public
• Online vs offline testing (Beel & Langer, 2015)
24. Can we do better than co-citations?
• Build on the idea introduced by Beel et al. [1], putting
the concept of Citation Proximity Analysis (CPA) into
practice.
• Extending the co-citation assumption: “the more
often two articles are co-cited in a document, the
more likely they are related” taking proximity into
account
• Introducing and evaluating new proximity functions
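The proximity idea can be sketched as follows: every pair of citations inside a citing document contributes to the relatedness of the two cited articles, and the contribution shrinks as the distance between the citations grows. The inverse-distance function below is a hypothetical choice made for illustration. It is not the definition of any proximity function evaluated in this work, and measuring distance in sentence offsets is likewise an assumption.

```python
from collections import defaultdict
from itertools import combinations

def inverse_distance(distance):
    """Hypothetical proximity function: closer citations weigh more."""
    return 1.0 / (1.0 + distance)

def proximity_cocitation(citing_docs, proximity=inverse_distance):
    """Accumulate proximity-weighted co-citation strength per article pair.

    citing_docs: list of documents, each a list of (cited_id, position),
                 where position is an offset in the citing text
                 (for example a sentence index).
    """
    strength = defaultdict(float)
    for citations in citing_docs:
        for (id_a, pos_a), (id_b, pos_b) in combinations(citations, 2):
            if id_a == id_b:
                continue  # repeated mentions of one article are not a pair
            pair = tuple(sorted((id_a, id_b)))
            strength[pair] += proximity(abs(pos_a - pos_b))
    return dict(strength)

# Toy input: A and B are cited side by side twice, C only far away from both.
citing_docs = [
    [("A", 0), ("B", 0), ("C", 9)],
    [("A", 3), ("B", 4)],
]
strength = proximity_cocitation(citing_docs)
```

Plain co-citation counting would score the pairs here as 2, 1 and 1. Weighting by proximity widens the gap between A and B, which are cited together, and the pairs involving C, which is only cited elsewhere in the text.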
27. Initial evaluation
• Corpus of 350k papers from CORE
• 6 topics, 5 recs each, 4 systems, 10 annotators (1,200
judgements), Fleiss’s
• Improvement of 25% over baseline in P@5 for ProxSum
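With several annotators judging the same items, agreement is commonly summarised with Fleiss's kappa, which is presumably the statistic the slide refers to. The sketch below implements the standard formula and checks the judgement count from the setup above. The rating tables are made up to exercise the function and do not reproduce the study's data or its kappa value.

```python
def fleiss_kappa(ratings):
    """Fleiss's kappa for a table of category counts.

    ratings: one row per judged item; each row holds the number of
    annotators who chose each category. Every row must sum to the same
    number of annotators.
    """
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    n_categories = len(ratings[0])
    # Share of all judgements that fell into each category.
    p_cat = [
        sum(row[j] for row in ratings) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement per item, averaged over items.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in ratings
    ]
    p_observed = sum(p_items) / n_items
    # Agreement expected by chance.
    p_expected = sum(p * p for p in p_cat)
    return (p_observed - p_expected) / (1 - p_expected)

# 6 topics x 5 recommendations x 4 systems, each judged by 10 annotators.
n_judgements = 6 * 5 * 4 * 10

# Made-up tables with two categories (relevant, not relevant).
perfect = [[10, 0], [0, 10], [10, 0], [0, 10]]
split = [[5, 5], [5, 5], [5, 5], [5, 5]]
kappa_perfect = fleiss_kappa(perfect)
kappa_split = fleiss_kappa(split)
```

Values near 1 indicate that annotators agree far more often than chance would predict. Values near or below 0 indicate that the relevance judgements are too noisy to support a comparison between systems.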
28. Conclusions
• Research recommenders have value, but potential
not fully realised yet
• Data availability and interoperability needed:
• Recommendable content/entities
• User interaction data and profiles
• Recommendations as a service: CORE recommender
• Repositories should allow CF and personalisation by:
• Exposing interaction data
• Voluntary global sign-on
• Both advocated by the COAR Next Generation
Repositories WG
Editor's notes
Each year approximately 1.5 million research papers are published (and only 4% of these are available via an open access journal) (Björk and Lauri, 2009).
Key phrases:
- Aggregating open access research papers from all over the world
- Millions of research papers ready to text-mine
- One point of access to public research knowledge for humans and machines
An essential glue to link related content from a global distributed network of digital libraries.
Achieving harmonised access to full text research literature (fresh access to distributed resources, content & entity deduplication and disambiguation, standardisation)
Extracting knowledge from the full text of research papers in collaboration with other research groups: literature-based discovery, research analytics (trend detection on full text), finding cutting-edge research (developing new research metrics based on full text, beyond citation counting)
Increasing the visibility of repositories
voluntary global sign-on and functionality for openly releasing anonymised user-interaction data, to enable the creation of effective research recommender systems in the future.
The main idea behind the CORE recommender is to offer recommendations as a service.