Towards effective research recommender systems
for repositories
Petr Knoth, Lucas Anastasiou, Aristotelis Charalampous,
Matteo Cancellieri, Samuel Pearce and Nancy Pontika
CORE
Knowledge Media Institute, The Open University
United Kingdom
In this talk
• Why recommender systems for repositories
• Building recommender systems for research
• The CORE recommender system
• CPA-based recommender systems
Why recommender systems for
repositories
Discoverability of content in repositories
• Criticism of repositories (Salo, 2008; Konkiel, 2012; Acharya,
2015)
“[The institutional repository] is like a roach motel. Data goes in,
but it doesn’t come out.” (Salo, 2008)
Exploratory research activities
• Focus on SEO, but …
• Researchers often involved in exploratory information seeking
activities (Marchionini, 2008):
Accessibility
Accessibility can be defined as the likelihood of
expressing a query leading to the retrieval of the
resource and the position of the resource in the search
results list.
[Azzopardi and Vinay, 2008]
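The definition above can be sketched as a small calculation. This is a minimal illustration and not the exact formulation from Azzopardi and Vinay: the 1/rank discount, the rank cutoff and the function name `accessibility` are assumptions made for the example.

```python
def accessibility(doc_id, queries, rank_cutoff=10):
    """Sum, over queries, of query likelihood times a rank-based discount.

    queries: list of (probability, ranked list of document ids).
    The 1/rank discount and the cutoff are illustrative assumptions.
    """
    total = 0.0
    for probability, ranking in queries:
        top = ranking[:rank_cutoff]
        if doc_id in top:
            rank = top.index(doc_id) + 1  # 1-based position in the results
            total += probability / rank
    return total


# Two equally likely queries; "a" is ranked first for one, second for the other.
queries = [(0.5, ["a", "b"]), (0.5, ["b", "a"])]
score_a = accessibility("a", queries)
```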
Why is it needed?
By integrating recommenders into repositories, we can increase
the number of incoming links to relevant resources, thereby
increasing their accessibility.
What use case do research recommender systems address?
“As a user, I want to receive recommendations about
content (papers, datasets, software, people to follow,
grant opportunities, methods, conferences, etc.) that is of
interest to me, so I can continuously increase knowledge
in my field.”
[Next Generation Repositories WG, 2017]
What effect can it have?
• Increased accessibility means more engagement and a higher
click-through rate (CTR)
• People access resources on CORE via its recommender system
twice as often as via search.
Recommender system and social networks
• Recsys proven to drive
engagement and growth
of networks
• Enabler of successful
solutions.
Building recsys for research
Common recsys approaches
• Collaborative filtering (CF) vs content-based filtering (CBF)
• Personalised vs non-personalised
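The difference between the two families can be illustrated with a minimal sketch: content-based filtering compares items by their own features, while collaborative filtering compares them by who interacted with them. The toy data, the whitespace tokenisation and the helper names below are assumptions for illustration only.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * b.get(key, 0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def content_vector(text):
    """CBF view: an item is described by the words it contains."""
    return Counter(text.lower().split())


def interaction_vector(item, interactions):
    """CF view: an item is described by the users who interacted with it."""
    return {user: 1 for user, items in interactions.items() if item in items}


# Content-based: compare two (hypothetical) titles.
cbf_sim = cosine(content_vector("open access repositories"),
                 content_vector("open access journals"))

# Collaborative: compare two items via (hypothetical) reading histories.
history = {"u1": {"p1", "p2"}, "u2": {"p1", "p2"}, "u3": {"p1"}}
cf_sim = cosine(interaction_vector("p1", history),
                interaction_vector("p2", history))
```

Note that the collaborative variant needs interaction data, whereas the content-based variant works from the documents alone.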
Aspects to optimise a recommender system for
• Type of recs:
• Novelty/recency
• Relevance
• Familiarity
• Serendipity
• Audience: post-doc, lecturer, professor, etc.
• Use case: reviewing literature, staying in touch,
learning about new field
More information at: http://dl.acm.org/citation.cfm?id=3057152
The CORE recommender system
Recommendations as a service
CORE provides a non-personalised CBF-based
recommendation system for articles from across the
global network of repositories.
• 8 million full texts
• 77 million metadata records
• 2683 repositories
Recs delivered using:
• a plugin (EPrints, DSpace + general repository)
• over the CORE API
Recommendable items in the CORE recommender
How does the CORE recommender work?
• Currently an article-article recommender system
• Preprocessing prior to recsys: enrichment with e.g.
identifiers, document type and citation data
• Similarity measure/ranking function
• Post-filtering using record quality
• Feedback (crowdsourcing a black list)
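The steps above can be sketched as a simple pipeline: rank candidates by similarity, drop low-quality records, and drop pairs that users have black-listed. This is a hedged illustration, not CORE's implementation; the function name `recommend`, the quality threshold and the data shapes are all assumptions.

```python
def recommend(query_id, scores, quality, blacklist, min_quality=0.5, k=5):
    """Return up to k recommended ids for query_id.

    scores:    dict candidate id -> similarity to the query article
    quality:   dict candidate id -> record quality in [0, 1] (hypothetical)
    blacklist: set of (query id, candidate id) pairs reported by users
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    results = []
    for doc_id, _score in ranked:
        if doc_id == query_id:
            continue  # never recommend the article itself
        if quality.get(doc_id, 0.0) < min_quality:
            continue  # post-filtering using record quality
        if (query_id, doc_id) in blacklist:
            continue  # crowdsourced feedback
        results.append(doc_id)
        if len(results) == k:
            break
    return results


recs = recommend(
    "q",
    scores={"a": 0.9, "b": 0.8, "c": 0.7, "q": 1.0},
    quality={"a": 0.9, "b": 0.2, "c": 0.8},
    blacklist={("q", "a")},
)
```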
What features is the similarity calculation based on?
• Features:
• Textual: title, authors, abstract, full text!
• Recency: publication year
• Popularity: citations, readership
• Document type (thesis/article/slides)
• Others: subject field, Citation Proximity
Analysis/Co-citations (in progress)
Combining features
• Evaluating different ranking functions (P, R, MAP, etc.):
• Weights for boosting
• Scaling function (e.g. exponential decay for
recency)
• Offline ground truths:
• MAG citation assumption
• MAG co-citation assumption
• Learning to rank (not done yet)
• Online A/B testing (not done yet)
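One way to picture boosting weights and a scaling function is the sketch below. It is an assumption-laden illustration, not the ranking function used by CORE: the weights, the five-year half-life, the logarithmic damping of citation counts and the function names are all hypothetical.

```python
import math


def recency_decay(pub_year, current_year, half_life=5.0):
    """Exponential decay: the value halves every `half_life` years."""
    age = max(0, current_year - pub_year)
    return math.exp(-math.log(2) * age / half_life)


def combined_score(text_sim, pub_year, citations, current_year,
                   weights=(1.0, 0.5, 0.1)):
    """Weighted sum of textual similarity, recency and popularity."""
    w_text, w_recency, w_popularity = weights
    popularity = math.log1p(citations)  # damp very large citation counts
    return (w_text * text_sim
            + w_recency * recency_decay(pub_year, current_year)
            + w_popularity * popularity)


fresh = combined_score(text_sim=1.0, pub_year=2017, citations=0,
                       current_year=2017)
older = combined_score(text_sim=1.0, pub_year=2012, citations=0,
                       current_year=2017)
```

Different weight settings would then be compared against an offline ground truth, such as the citation-based assumptions listed above.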
Distinctive features of the CORE recommender
• Use of full text rather than just abstracts
• Ensure open access availability of recommended
articles (recsys for all)
• Free service
• Integration with repositories and availability over API
More information: https://arxiv.org/abs/1705.00578
Discussion & Challenges
• Effective content harvesting and TDM from
repositories
• We are losing valuable interaction data due to our
distributed infrastructure => voluntary global sign-on:
• Personalisation
• Collaborative filtering
• Citation data – movement towards having them
public
• Online vs offline testing (Beel & Langer, 2015)
CPA-based recommender systems
Can we do better than co-citations?
• Build on the idea introduced by Beel et al. [1], putting
the concept of Citation Proximity Analysis (CPA) into
practice.
• Extending the co-citation assumption: “the more
often two articles are co-cited in a document, the
more likely they are related” taking proximity into
account
• Introducing and evaluating new proximity functions
Method
Proximity functions
1. Co-citations baseline (no proximity information)
2. ProxMin
3. ProxSum
4. ProxMean
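The slides name the proximity functions without defining them, so the sketch below is only one plausible reading, stated as an assumption: each co-citing document contributes a weight of 1/(1 + distance) per pair of citation positions, and the three variants differ in how those weights are aggregated within a document. The data layout and all function names are hypothetical.

```python
from itertools import product


def cocitation(docs, a, b):
    """Baseline: number of documents that cite both a and b."""
    return sum(1 for citations in docs if a in citations and b in citations)


def proximity_score(docs, a, b, aggregate):
    """Sum, over co-citing documents, of an aggregate of pair weights."""
    score = 0.0
    for citations in docs:
        if a in citations and b in citations:
            weights = [1.0 / (1 + abs(x - y))
                       for x, y in product(citations[a], citations[b])]
            score += aggregate(weights)
    return score


def prox_min(docs, a, b):
    # Minimum distance corresponds to the largest weight.
    return proximity_score(docs, a, b, max)


def prox_sum(docs, a, b):
    return proximity_score(docs, a, b, sum)


def prox_mean(docs, a, b):
    return proximity_score(docs, a, b, lambda w: sum(w) / len(w))


# Each document maps a cited article to the positions where it is cited.
docs = [{"A": [10, 200], "B": [12]}, {"A": [5], "C": [6]}]
```

Under these assumptions `cocitation(docs, "A", "B")` counts one co-citing document, while the proximity variants additionally reward the pair being cited two positions apart.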
Initial evaluation
• Corpus of 350k papers from CORE
• 6 topics, 5 recs each, 4 systems, 10 annotators (1,200
judgements), Fleiss’s
• ProxSum improves P@5 by 25% over the baseline
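For reference, the metrics mentioned in the talk (P@k and MAP) and the size of the judgement pool can be sketched as follows. The ranked lists and relevance sets are toy data invented for the example, not results from the evaluation.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k recommendations judged relevant."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant item occurs."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """MAP over a list of (ranked list, relevant set) pairs."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Toy example: 5 recommendations, 2 of them judged relevant.
p_at_5 = precision_at_k(["a", "b", "c", "d", "e"], {"a", "c"}, k=5)

# Judgement pool from the slide: topics x recs x systems x annotators.
judgements = 6 * 5 * 4 * 10
```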
Conclusions
• Research recommenders have value, but potential
not fully realised yet
• Data availability and interoperability needed:
• Recommendable content/entities
• User interaction data and profiles
• Recommendations as a service: CORE recommender
• Repositories should allow CF and personalisation by:
• Exposing interaction data
• Voluntary global sign-on
• Both advocated by the COAR Next Generation
Repositories WG

Towards effective research recommender systems for repositories

  • 1. Towards effective research recommender systems for repositories Petr Knoth, Lucas Anastasiou, Aristotelis Charalampous, Matteo Cancellieri, Samuel Pearce and Nancy Pontika CORE Knowledge Media institute, The Open University United Kingdom
  • 2. In this talk • Why recommender systems for repositories • Building recommender systems for research • The CORE recommender system • CPA-based recommender systems
  • 3. Why recommender systems for repositories
  • 4. Discoverability of content in repositories • Criticism of repositories (Salo, 2008; Konkiel, 2012; Acharya, 2015) “[The institutional repository] is like a roach motel. Data goes in, but it doesn’t come out.” (Salo, 2008)
  • 5. Exploratory research activities • Focus on SEO, but … • Researchers often involved in exploratory information seeking activities (Marchionini, 2008):
  • 6. Accessibility Accessibility can be defined as the likelihood of expressing a query leading to the retrieval of the resource and the position of the resource in the search results list. [Azzopardi and Vinay, 2008]
  • 7. Why is it needed? By integrating a recommender into repositories, we can increase the number of incoming links to relevant resources, thereby increasing their accessibility.
  • 8. What use case do research recommender systems address? “As a user, I want to receive recommendations about content (papers, datasets, software, people to follow, grant opportunities, methods, conferences, etc.) that is of interest to me, so I can continuously increase knowledge in my field.” [ Next Generation Repositories WG, 2017]
  • 9. What effect can it have? • Increased accessibility means more engagement and a higher click-through rate (CTR) • People access resources on CORE twice as often via its recommender system as via search.
  • 10. Recommender system and social networks • Recsys proven to drive engagement and growth of networks • Enabler of successful solutions.
  • 12. Common recsys approaches: collaborative filtering (CF) vs content-based filtering (CBF)
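
To make the contrast concrete: content-based filtering ranks items by how similar their content is, whereas collaborative filtering needs user interaction data. The sketch below illustrates only the content-based side. It assumes a bag-of-words TF-IDF representation and cosine similarity, neither of which is specified in the slides, and the function names are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight) per document."""
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        tf = Counter(tokens)
        # Smoothed IDF, so terms that occur in every document keep a non-zero weight.
        vectors.append(
            {t: count * (math.log((1 + n_docs) / (1 + df[t])) + 1.0)
             for t, count in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(query_index, docs, k=5):
    """Return up to k (index, score) pairs for the documents most similar to the query."""
    vectors = tfidf_vectors(docs)
    scored = [
        (i, cosine(vectors[query_index], vec))
        for i, vec in enumerate(vectors)
        if i != query_index
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

A collaborative approach would replace the term vectors with vectors of user interactions (views, downloads), which is why it depends on interaction data being available.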
  • 14. Aspects to optimise a recommender system for • Type of recs: • Novelty/recency • Relevance • Familiarity • Serendipity • Audience: post-doc, lecturer, professor, etc. • Use case: reviewing literature, staying in touch, learning about new field More information at: http://dl.acm.org/citation.cfm?id=3057152
  • 16. Recommendations as a service CORE provides a non-personalised CBF-based recommendation system for articles from across the global network of repositories. • 8 million full texts • 77 million metadata records • 2683 repositories
  • 17. Recs delivered using: • a plugin (EPrints, DSpace + general repository) • over the CORE API Recommendable items in the CORE recommender
  • 18. How does the CORE recommender work? • Currently an article-article recommender system • Preprocessing prior to recsys: enrichment with e.g. identifiers, document type and citation data • Similarity measure/ranking function • Post-filtering using record quality • Feedback (crowdsourcing a blacklist)
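
The stages listed on this slide can be sketched as a small pipeline: score candidates by similarity, drop blacklisted and low-quality records, and return the top results. Everything concrete here is an assumption made for illustration: the `Record` fields, the Jaccard similarity and the quality rule are hypothetical and are not taken from the CORE implementation.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical minimal metadata record, not the CORE schema.
    record_id: str
    title: str
    abstract: str = ""
    has_full_text: bool = False


def tokens(record):
    """Lower-cased set of words from title and abstract."""
    return set(f"{record.title} {record.abstract}".lower().split())


def jaccard(a, b):
    """Stand-in similarity measure: token overlap between two records."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def passes_quality(record):
    """Assumed post-filter: require a non-empty title and available full text."""
    return bool(record.title.strip()) and record.has_full_text


def recommend(source, candidates, blacklist, k=5):
    """Rank candidates for one source article and return the top-k record ids."""
    scored = []
    for cand in candidates:
        if cand.record_id == source.record_id:
            continue  # never recommend the article to itself
        if cand.record_id in blacklist:
            continue  # user feedback: crowdsourced blacklist
        if not passes_quality(cand):
            continue  # post-filtering on record quality
        scored.append((jaccard(source, cand), cand.record_id))
    # Highest score first; ties broken by id so the output is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [record_id for _, record_id in scored[:k]]
```

Filtering before scoring, as done here, avoids computing similarities for records that could never be shown; a production system might instead filter after retrieval from a search index.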
  • 19. What features is the similarity calculation based on? • Features: • Textual: title, authors, abstract, full text! • Recency: publication year • Popularity: citations, readership • Document type (thesis/article/slides) • Others: subject field, Citation Proximity Analysis/Co-citations (in progress)
  • 20. Combining features • Evaluating different ranking functions (P, R, MAP, etc.): • Weights for boosting • Scaling function (e.g. exponential decay for recency) • Offline ground truths: • MAG citation assumption • MAG co-citation assumption • Learning to rank (not yet done) • Online A/B testing (not yet done)
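
As an illustration of boosting weights and a scaling function, the sketch below combines a text-similarity score with an exponentially decaying recency score and a saturating popularity score. The weights, the five-year half-life and the citation saturation point are placeholder assumptions, not values used by CORE.

```python
import math


def recency_score(publication_year, current_year, half_life_years=5.0):
    """Exponential decay: the score halves every `half_life_years`."""
    age = max(0, current_year - publication_year)
    return 0.5 ** (age / half_life_years)


def popularity_score(citations, saturation=100):
    """Log-scaled citation count, capped at 1.0 once `saturation` is reached."""
    return min(1.0, math.log1p(citations) / math.log1p(saturation))


def combined_score(text_similarity, publication_year, citations,
                   current_year, weights=None):
    """Weighted sum of the per-feature scores (all assumed to lie in [0, 1])."""
    # Placeholder boosting weights; in practice these would be tuned
    # against an offline ground truth or by online testing.
    w = weights or {"text": 0.7, "recency": 0.2, "popularity": 0.1}
    return (
        w["text"] * text_similarity
        + w["recency"] * recency_score(publication_year, current_year)
        + w["popularity"] * popularity_score(citations)
    )
```

Keeping every feature in the same [0, 1] range makes the weights directly comparable, which simplifies evaluating different weightings against metrics such as precision or MAP.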
  • 21. Distinctive features of the CORE recommender • Use of full text rather than just abstracts • Ensure open access availability of recommended articles (recsys for all) • Free service • Integration with repositories and availability over API More information: https://arxiv.org/abs/1705.00578
  • 22. Discussion & Challenges • Effective content harvesting and TDM from repositories • We are losing valuable interaction data due to our distributed infrastructure => voluntary global sign-on: • Personalisation • Collaborative filtering • Citation data – movement towards having them public • Online vs offline testing (Beel & Langer, 2015)
  • 24. Can we do better than co-citations? • Build on the idea introduced by Beel et al. [1], putting the concept of Citation Proximity Analysis (CPA) into practice. • Extending the co-citation assumption: “the more often two articles are co-cited in a document, the more likely they are related” taking proximity into account • Introducing and evaluating new proximity functions
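
For reference, the plain co-citation baseline that CPA extends can be sketched as a pair count over reference lists. The input format (one list of cited article identifiers per citing document) is an assumption made for illustration.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count, for each pair of articles, how many documents cite both."""
    counts = Counter()
    for refs in reference_lists:
        # De-duplicate and sort so each unordered pair has one canonical key.
        for a, b in combinations(sorted(set(refs)), 2):
            counts[(a, b)] += 1
    return counts
```

This baseline ignores where in the citing document the two citations occur, which is exactly the information that proximity-based variants add.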
  • 26. Proximity functions 1. Co-citations baseline (no proximity information) 2. ProxMin 3. ProxSum 4. ProxMean
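
The slides name these functions without defining them, so the following is only one plausible reading, not the authors' definitions. It assumes that each cited article has a list of in-text mention positions (for example character offsets) within one citing document, that the distance between two mentions is turned into a proximity of 1 / (1 + distance), and that the three variants differ in how they aggregate over all mention pairs.

```python
from itertools import product


def pair_distances(positions_a, positions_b):
    """Absolute distances between every mention of A and every mention of B."""
    return [abs(a - b) for a, b in product(positions_a, positions_b)]


def proximity(distance):
    """Assumed transformation: closer mentions give a value nearer to 1."""
    return 1.0 / (1.0 + distance)


def prox_min(positions_a, positions_b):
    """Hypothetical ProxMin: proximity of the closest pair of mentions."""
    distances = pair_distances(positions_a, positions_b)
    return proximity(min(distances)) if distances else 0.0


def prox_sum(positions_a, positions_b):
    """Hypothetical ProxSum: sum of proximities over all mention pairs."""
    return sum(proximity(d) for d in pair_distances(positions_a, positions_b))


def prox_mean(positions_a, positions_b):
    """Hypothetical ProxMean: average proximity over all mention pairs."""
    distances = pair_distances(positions_a, positions_b)
    if not distances:
        return 0.0
    return sum(proximity(d) for d in distances) / len(distances)
```

Under this reading, a pair's overall relatedness score would be obtained by summing the chosen function over all citing documents, with the baseline corresponding to adding 1 per document regardless of position.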
  • 27. Initial evaluation • Corpus of 350k papers from CORE • 6 topics, 5 recs each, 4 systems, 10 annotators (1,200 judgements), Fleiss’s • ProxSum improves P@5 by 25% over the baseline
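
The figures on this slide can be checked with a few lines: the judgement count is the product of topics, recommendations, systems and annotators, and P@5 is the fraction of relevant items among the top five. The helper names below are hypothetical, and the relevance lists in the usage note are made-up toy values, not results from the study.

```python
def judgement_count(topics, recs_per_topic, systems, annotators):
    """Total number of relevance judgements collected."""
    return topics * recs_per_topic * systems * annotators


def precision_at_k(relevance, k=5):
    """Fraction of the top-k recommendations judged relevant.

    `relevance` is a ranked list of booleans (or 0/1), best-ranked first.
    """
    return sum(1 for judged in relevance[:k] if judged) / k


def relative_improvement(system_score, baseline_score):
    """Relative gain of a system over the baseline, e.g. 0.25 means +25%."""
    return (system_score - baseline_score) / baseline_score
```

For example, `judgement_count(6, 5, 4, 10)` gives the 1,200 judgements reported on the slide. With toy relevance lists `[1, 0, 1, 0, 0]` for a baseline and `[1, 1, 0, 1, 0]` for another system, `precision_at_k` yields 0.4 and 0.6 respectively.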
  • 28. Conclusions • Research recommenders have value, but potential not fully realised yet • Data availability and interoperability needed: • Recommendable content/entities • User interaction data and profiles • Recommendations as a service: CORE recommender • Repositories should allow CF and personalisation by: • Exposing interaction data • Voluntary global sign-on • Both advocated by COAR Next generation Repositories WG

Editor's notes

  1. Each year approximately 1.5 million research papers are being published (and only 4% of these are available via an open access journal) (Björk and Lauri, 2009). Key phrases: aggregating open access research papers from all over the world; millions of research papers ready to text-mine; one point of access to public research knowledge for humans and machines.
  3. An essential glue to link related content from a global distributed network of digital libraries.
  4. Achieving harmonized access to full text research literature (fresh access to distributed resources, content & entity deduplication and disambiguation, standardisation). Extracting knowledge from the full text of research papers in collaboration with other research groups: literature-based discovery, research analytics (trend detection on full text), finding cutting-edge research (developing new research metrics based on full text, beyond citation counting). Increasing the visibility of repositories.
  7. voluntary global sign-on and functionality for openly releasing anonymised user-interaction data, to enable the creation of effective research recommender systems in the future.
  10. The main idea behind the CORE recommender is recommendations as a service.
  11. CORE recommender (add screenshots)