COUNTER Standards for Open Access: the Value of Measuring/ the Measuring of Value
Presentation given by Joseph Greene, Research Repository Librarian at University College Dublin Library, at the LIBER 2017 Conference in Patras, Greece, on July 6, 2017.
1. UCD Library
University College Dublin,
Belfield, Dublin 4, Ireland
Leabharlann UCD
An Coláiste Ollscoile, Baile Átha Cliath,
Belfield, Baile Átha Cliath 4, Éire
COUNTER standards for Open Access: The value of measuring/the measuring of value
LIBER 2017
Patras, 6 July
Joseph Greene
Research Repository Librarian
University College Dublin
joseph.greene@ucd.ie
http://researchrepository.ucd.ie
3. Call: define success
“…[we] too often conflate several rather different objectives for transforming scholarly communications...”
https://scholarlykitchen.sspnet.org/2017/05/23/open-access-scholarly-communication-defining-success/
5. Measure distribution
Tipping point: in 2014, more than 50% of recent papers (2011-2013) were found to be Open Access
Archambault, E. et al. (2014). Proportion of Open Access Papers Published in Peer-Reviewed Journals at the European and World Levels: 1996–2013 (41p.). Produced for the European Commission DG Research & Innovation.
7. OA Citation advantage
• At least 40 separate studies show that Open Access increases citations [1,2]
• Wide variations between disciplines
• 35% increase in mathematics [2]
• 500% increase in citations in physics/astronomy [2]
• Most recent study: 3.3 million papers [3]
• Average: OA = 50% more citations
• (Green is overall the better strategy)
[1] Wagner, B. (2010) ‘Open Access Citation Advantage: An Annotated Bibliography’. DOI: 10.5062/F4Q81B0W
[2] Swan, A. (2010) ‘The Open Access citation advantage: Studies and results to date’. https://eprints.soton.ac.uk/268516/
[3] Archambault, E. (2016) ‘Research impact of paywalled versus open access papers’. www.1science.com/oanumbr.html
10. BOAI15
'Means should exist that will permit having some idea of the value and quality of each document, for example, a number of metrics having to do with views, downloads, comments, corrections'
Guédon, Jean-Claude (2017-02). Open Access: Toward the Internet of the Mind. http://www.budapestopenaccessinitiative.org/open-access-toward-the-internet-of-the-mind
11. European Commission
'Usage metrics are highly relevant for open science'
Recommends 'making better use of existing metrics for open science', including usage metrics
Directorate-General for Research and Innovation (2017-03). Next-generation metrics: Responsible metrics and evaluation for open science. DOI: 10.2777/337729
12. Coalition for Networked Information
'Researchers and librarians at several universities are working to make analytics on use of items in IRs more reliable'
But 'statistics generated by the systems are poor and do not demonstrate impact'
CNI Executive Roundtable (2017-04). Rethinking Institutional Repository Strategies. https://www.cni.org/topics/publishing/rethinking-institutional-repository-strategies
16. Usage data are not perfect
• Up to 85% of OA repository downloads come from non-human agents [1]
• At least 40% of OA journal downloads are not human [2]
• Even with robot detection, there is room for improvement [3]
• DSpace stats: 62% human
• EPrints stats: 55% human
• U. Minho DSpace stats: 59-73% human
[1] Greene, J. (2016) 'Web robot detection in scholarly Open Access institutional repositories'. Library Hi Tech, 34(3): 500-520
[2] Huntington, P., Nicholas, D., & Jamali, H. R. (2008). Web robot detection in the scholarly information environment. Journal of Information Science, 34(5), 726-741
[3] Greene, J. (2016) 'How Accurate are IR Usage Statistics?'. Open Repositories (OR2016), Dublin, 13-16 June 2016
18. Problems
• Many ways to do robot detection
• (At least 23 in the literature, not to mention combinations)
• Nothing resembling a standard available
• Cross-platform comparison and aggregation impossible
19. Addressing the problem
• COUNTER Robots Working Group
• Joseph Greene, UCD, RIAN (chair)
• Lorraine Estelle, Project COUNTER
• Paul Needham, IRUS-UK/COUNTER
• Representatives from EBSCO, Elsevier, Wiley, ScholarlyIQ, DSpace, EPrints, DigitalCommons, OpenAIRE, Base Bielefeld and Open Journal Systems
“…to devise ‘adaptive filtering systems’ that will allow publishers/repositories/services to follow a common set of rules to dynamically identify and filter out unusual usage and robot activity”
20. Usage data sources
• Bielefeld/OJS (x3): .csv, 233,000 lines
• IRUS-UK (97 IRs): .csv, 1.9 million lines
• Wiley: .txt, several million lines
• Combined into a PostgreSQL database of several million rows (a loading sketch follows below)
• Period: 3-9 October 2016
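The slides do not show how the three exports were merged; the following is a minimal sketch of one way such heterogeneous log files could be loaded into a single PostgreSQL table. The table name (downloads), the column names, and the file/field names are illustrative assumptions, not the schema actually used by the working group.

import csv
import psycopg2

# Hypothetical combined table; the real schema is not shown on the slide.
conn = psycopg2.connect("dbname=usage")
cur = conn.cursor()
cur.execute("""
    CREATE TABLE IF NOT EXISTS downloads (
        source      text,        -- e.g. 'Bielefeld/OJS', 'IRUS-UK', 'Wiley'
        ip_address  inet,        -- requesting client
        user_agent  text,        -- HTTP User-Agent string
        item_url    text,        -- item that was downloaded
        referrer    text,        -- HTTP Referer, if logged
        event_time  timestamp    -- date/time of the request
    )
""")

# Load one of the .csv exports (the column names here are assumed).
with open('irus_uk_export.csv', newline='') as f:
    for row in csv.DictReader(f):
        cur.execute(
            "INSERT INTO downloads VALUES (%s, %s, %s, %s, %s, %s)",
            ('IRUS-UK', row['ip'], row['agent'], row['item'],
             row['referrer'], row['timestamp']))
conn.commit()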
21. Robot detection
• Simple random sample taken: 202-204 downloads from each dataset (95% confidence level)
• 12 syntactic variables from SQL queries or added manually
• E.g. IP address, agent, IP owner
• 12-13 behavioural variables added using SQL queries or API calls (see the query sketch after this list)
• E.g. number of downloads by user, number of items downloaded, dates/times seen
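Only a few of the behavioural variables are named on the slide; below is a sketch of the kind of SQL query that could derive some of them per IP address/user agent pair, reusing the hypothetical downloads table and connection from the earlier sketch.

# Per-"user" (here: IP address + user agent) behavioural features.
FEATURES_SQL = """
    SELECT ip_address,
           user_agent,
           COUNT(*)                 AS n_downloads,   -- number of downloads by user
           COUNT(DISTINCT item_url) AS n_items,       -- number of items downloaded
           MIN(event_time)          AS first_seen,    -- dates/times seen
           MAX(event_time)          AS last_seen
    FROM   downloads
    GROUP  BY ip_address, user_agent
"""

cur.execute(FEATURES_SQL)
for ip, agent, n_downloads, n_items, first_seen, last_seen in cur.fetchall():
    print(ip, agent, n_downloads, n_items, first_seen, last_seen)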
24. Testing filters
• Test existing COUNTER robots list
• Test existing COUNTER double-click filter
• Rate of requests
• Volume of requests
• User agents per IP address
• Requests where requested item = referring URL
(Two of these candidate filters are sketched below.)
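As an illustration only, the sketch below implements two of the candidate filters: the COUNTER double-click rule (repeat requests for the same item by the same user within 30 seconds count once) and a simple rate-of-requests check. The rate threshold and window are assumptions; the cut-offs actually tested by the working group are not given on the slide.

from datetime import timedelta

DOUBLE_CLICK_WINDOW = timedelta(seconds=30)   # COUNTER double-click rule
RATE_LIMIT = 100                              # assumed cut-off, not from the slide
RATE_WINDOW = timedelta(hours=1)              # assumed window, not from the slide

def remove_double_clicks(events):
    """Collapse repeat requests for the same item by the same user
    (IP + agent) within the double-click window; events sorted by time."""
    kept, last_seen = [], {}
    for ev in events:                         # ev: dict with ip, agent, item, time
        key = (ev['ip'], ev['agent'], ev['item'])
        prev = last_seen.get(key)
        if prev is None or ev['time'] - prev > DOUBLE_CLICK_WINDOW:
            kept.append(ev)
        last_seen[key] = ev['time']
    return kept

def exceeds_rate(user_events, window=RATE_WINDOW, limit=RATE_LIMIT):
    """Flag a user whose request count within any sliding window exceeds the limit."""
    times = sorted(ev['time'] for ev in user_events)
    start = 0
    for end, t in enumerate(times):
        while t - times[start] > window:
            start += 1
        if end - start + 1 > limit:
            return True
    return False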
25. Testing filters
• Simulate a set of filters on the datasets
• Assign true/false positives, true/false negatives compared with manual determination
• Calculate:
• Recall, precision (excluded stats)
• Inverse recall, inverse precision (reported stats)
• Find best combination of filters, balancing practicality and accuracy
(The metric calculations are sketched below.)
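The slide does not spell out the formulas; the sketch below uses the standard definitions, under the assumption that a "positive" is a download the filter excludes as robot traffic, so recall and precision describe the excluded statistics and the inverse measures describe the statistics that would be reported.

def filter_metrics(tp, fp, tn, fn):
    """Evaluate a robot filter against the manual human/robot labels.

    tp: robot downloads the filter excluded    fp: human downloads wrongly excluded
    tn: human downloads the filter kept        fn: robot downloads wrongly kept
    """
    recall            = tp / (tp + fn)   # share of robot downloads correctly excluded
    precision         = tp / (tp + fp)   # share of excluded downloads that were robots
    inverse_recall    = tn / (tn + fp)   # share of human downloads correctly kept
    inverse_precision = tn / (tn + fn)   # share of kept (reported) downloads that were human
    return recall, precision, inverse_recall, inverse_precision

# Example call with made-up counts:
# filter_metrics(tp=120, fp=5, tn=70, fn=9)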