OpenAIRE usage statistics session slides at #DI4R2017. 1 Dec. 2017 - by Dimitris Pierrakos, Athena Research Center and Jochen Schirrwagen, Bielefeld University.
Everything Counts in Large Amounts: Measuring the Impact of Usage Activity in Open Access Scholarly Environments - #DI4R2017 session
1. @openaire_eu
Everything Counts in Large
Amounts:
Measuring the Impact of Usage Activity in Open
Access Scholarly Environments
DI4R December 2017, Brussels
Dimitris Pierrakos, Athena Research Center
Jochen Schirrwagen, Bielefeld University
2. ● OpenAIRE infrastructure and Usage Statistics Service.
● Usage Data Collection strategies.
● Using Piwik for tracking and analytics.
● Applying COUNTER rules.
● Metrics in the Repository Manager Dashboard.
● Relation to Open Metrics and Next Generation Metrics.
Overview
3. • A pan-European Research Information platform to
monitor OA research outcomes from EC and other
national funders.
• Research analytics tools to promote new scientific
metrics & support evidence-based decision-making.
• Implementation of an OpenAIRE usage statistics
service for usage data collected from data providers.
OpenAIRE 2020
4. ● Task in OpenAIRE2020 covers:
○ aligning policies and standards for gathering and sharing of usage data
-> guidelines
○ considering legal aspects (data protection / data privacy)
○ relating usage statistics to other kinds of metrics
○ collecting and processing of usage data and producing consolidated,
standards-based usage statistics
● Task team: Athena Research Center, University of
Bielefeld, University of Minho, Jisc IRUS-UK,
Couperin + NOADs
Usage Statistics in OpenAIRE
5. ● OpenAIRE collects ~21 million documents from
980 compatible data providers
● currently 32 active data providers participating in
Usage statistics + IRUS-UK
● Usage statistics are published under CC0.
○ in the OpenAIRE dashboard, portal and API.
Usage Statistics in the OpenAIRE Infrastructure
6. ● Tracking of views and downloads / collecting
COUNTER reports
○ Push or Pull collection workflows.
● Anonymisation of IP-addresses.
● Metadata de-duplication enables accumulation of
views and downloads for same documents
● COUNTER Code of Practice compatibility.
○ standards-based usage statistics.
○ enables comparability with statistics from other data sources.
Usage Statistics Service Features
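The accumulation step can be illustrated with a small sketch: once de-duplication has grouped records from different repositories that describe the same document, views and downloads are summed per group. The function name, the event layout and the `representative_of` mapping are hypothetical stand-ins for the output of OpenAIRE's de-duplication, which the slides do not detail.

```python
from collections import defaultdict


def accumulate_usage(events, representative_of):
    """Sum views and downloads per de-duplicated record.

    events: iterable of (record_id, views, downloads) tuples.
    representative_of: dict mapping a duplicate record ID to the ID of
        the merged record that represents it (hypothetical structure).
    """
    totals = defaultdict(lambda: {"views": 0, "downloads": 0})
    for record_id, views, downloads in events:
        # Records without a known duplicate group count under their own ID.
        merged_id = representative_of.get(record_id, record_id)
        totals[merged_id]["views"] += views
        totals[merged_id]["downloads"] += downloads
    return dict(totals)


if __name__ == "__main__":
    events = [("repoA:1", 10, 2), ("repoB:9", 5, 1), ("repoC:3", 7, 0)]
    groups = {"repoA:1": "dedup:x", "repoB:9": "dedup:x"}
    print(accumulate_usage(events, groups))
```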
7. • World's leading open-source analytics platform.
• Valuable insights into website traffic and visitor activity.
• Piwik collects and stores PII (personally identifiable
information).
• Keeps full data ownership and can control who has access.
• Robot filtering plugin.
• Compliant with EU regulations.
• Recommended by privacy organizations such as ULD
(Germany) and CNIL (France).
Piwik Analytics platform
8. Feature | Piwik | Google Analytics
Number of hits per month | Unlimited | 10 million
Number of user accounts per login | Unlimited | 10
Data storage time | Unlimited | 25 months
Number of properties (websites, apps etc.) tracked per account | Unlimited | 50
Custom variables | 5 | 5
Data export | Unlimited | 5000 rows
Real-time analytics | Piwik offers real-time web analytics in all of its reports. | GA monitors user activity right after it happens, although the period of delay is not explicitly stated.
Piwik Facts
10. • An institutional repository is registered in Piwik.
• Server-side tracking: plugins (DSpace) or patches
(EPrints) using Piwik’s HTTP API.
• Usage Activity is tracked and logged at Piwik
platform in real time.
• Information is transferred offline, using Piwik’s API,
to OpenAIRE’s DBs for statistical analysis.
• Statistics are deployed via OpenAIRE’s Portal or
SUSHI-Lite API.
Tier-1: Push Usage Statistics Tracking Workflow
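A minimal sketch of the push step, assuming a repository plugin reports each view or download with one request to Piwik's HTTP Tracking API. The endpoint `piwik.php` and the parameters `idsite`, `rec`, `url`, `urlref`, `download` and `cvar` are part of that API. The host name, the function name and the choice to carry the OAI-PMH identifier in a custom variable named `oaipmhID` are assumptions for illustration, not details confirmed by the slides.

```python
import json
from urllib.parse import urlencode


def build_tracking_url(piwik_base, site_id, item_url, oai_identifier,
                       is_download=False, referrer=None):
    """Build one Piwik Tracking API request for a view or a download."""
    params = {
        "idsite": site_id,  # ID under which the repository is registered
        "rec": 1,           # required by the Tracking API to record the hit
        "url": item_url,    # URL of the requested item
        # Assumption: the OAI-PMH identifier travels as a page-scoped
        # custom variable; the variable name is hypothetical.
        "cvar": json.dumps({"1": ["oaipmhID", oai_identifier]}),
    }
    if is_download:
        params["download"] = item_url  # marks the request as a download
    if referrer:
        params["urlref"] = referrer    # URL that linked to the item
    return f"{piwik_base.rstrip('/')}/piwik.php?{urlencode(params)}"


if __name__ == "__main__":
    print(build_tracking_url(
        "https://piwik.example.org",              # hypothetical Piwik host
        42,
        "https://repo.example.org/handle/1/2",
        "oai:repo.example.org:1/2",
        is_download=True,
    ))
```

The sketch only builds the URL; a plugin would send it server-side so that the visitor's browser never contacts the analytics platform directly.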
11. Parameter | Description
idSite | the ID of the repository
idVisit | a visitor/session ID (an 8-byte binary string)
visitIP | the IP address of the visitor (optionally anonymized)
action | the action performed (view, download, outlink, etc.)
url | the URL of the requested item
timestamp | the date and time of the request
OAI-PMH Identifier | the Open Archives Initiative identifier of the item being viewed/downloaded
agent | the web browser and operating system of the visitor
referrer | the URL that linked to the requested item
Tier-1: Piwik Tracking Parameters
12. ● Usage events can be considered privacy-sensitive
information (user agent, IP address, ...)
● Usage statistics services must comply with
data protection laws and regulations for both
usage data providers and service providers
○ but the legal situation differs between countries
○ OpenAIRE must comply with the EU General Data Protection
Regulation
● Tracking plugins issued by OpenAIRE anonymize
usage data already on the client side
Data Protection Aspects
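One common way to anonymize an IP address before it leaves the repository is to zero its trailing bits. The sketch below illustrates that idea with Python's standard `ipaddress` module. The function name and the number of bits kept (16 for IPv4, 48 for IPv6) are assumptions; the slides do not state how the OpenAIRE plugins mask addresses.

```python
import ipaddress


def anonymize_ip(address, ipv4_keep_bits=16, ipv6_keep_bits=48):
    """Zero all but the leading bits of an IPv4 or IPv6 address."""
    ip = ipaddress.ip_address(address)
    # Assumption: how many leading bits to keep is a policy decision.
    keep = ipv4_keep_bits if ip.version == 4 else ipv6_keep_bits
    # strict=False lets ip_network() accept an address with host bits set
    # and returns the enclosing network.
    network = ipaddress.ip_network(f"{ip}/{keep}", strict=False)
    return str(network.network_address)


if __name__ == "__main__":
    print(anonymize_ip("192.0.2.123"))                           # 192.0.0.0
    print(anonymize_ip("2001:db8:85a3:8d3:1319:8a2e:370:7348"))  # 2001:db8:85a3::
```

Masking before transmission means the analytics platform only ever stores the truncated address.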
15. • Applying data processing rules according to
COUNTER Code of Practice:
• i.e. counting requests depending on session duration, tracing double-clicks
• Bot filtering
• Piwik Bot Plugin
• COUNTER Robots Working Group
• Linking of usage events with metadata records in
OpenAIRE
• Accumulation of views and downloads for de-duplicated
records
Cleaning and Consolidation
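The double-click rule can be sketched as a filter that collapses repeated requests by the same visitor for the same item within a short time window into one counted request. The function name, the event layout and the 10-second default are assumptions for illustration; the applicable COUNTER Code of Practice release defines the actual windows and which request of a pair is counted.

```python
from datetime import datetime, timedelta


def filter_double_clicks(events, window_seconds=10):
    """Drop requests that repeat an earlier one within the window.

    events: iterable of (visitor_id, item_id, timestamp) tuples.
    Returns the kept events in chronological order. Simplification: the
    first request of a burst is kept, and a chain of clicks that each
    fall within the window of the previous click counts once.
    """
    window = timedelta(seconds=window_seconds)
    last_seen = {}  # (visitor_id, item_id) -> timestamp of latest request
    kept = []
    for visitor_id, item_id, ts in sorted(events, key=lambda e: e[2]):
        key = (visitor_id, item_id)
        previous = last_seen.get(key)
        last_seen[key] = ts
        if previous is not None and ts - previous <= window:
            continue  # treated as a double-click, not counted again
        kept.append((visitor_id, item_id, ts))
    return kept


if __name__ == "__main__":
    t0 = datetime(2017, 12, 1, 10, 0, 0)
    sample = [
        ("v1", "item-a", t0),
        ("v1", "item-a", t0 + timedelta(seconds=5)),   # double-click
        ("v1", "item-a", t0 + timedelta(seconds=20)),  # separate request
    ]
    print(len(filter_double_clicks(sample)))  # 2
```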
17. • Gathering of consolidated statistics reports from
aggregation services, such as IRUS-UK, using protocols
such as SUSHI-Lite.
• Statistics are stored in OpenAIRE’s DB for statistical
analysis.
• Statistics are deployed via OpenAIRE’s Portal or
SUSHI-Lite API.
Tier-2: Collecting (Pull) Consolidated Usage Statistics
Reports
26. ● Available as beta with the help of IRUS-UK
○ http://beta.services.openaire.eu/usagestats/sushilite/
● Supports COUNTER R4 compatible reports:
○ Article Reports (AR) and Book Reports (BR) using identifiers like
openaire, doi, oai-record-id
○ Item Reports (IR)
○ Repository Reports (RR) using identifiers issued by OpenAIRE or
OpenDOAR
○ Journal Reports (JR) using identifiers like ISSN
SUSHI-Lite Interface
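A hedged sketch of how a client might compose a report request against the beta endpoint above. Only the base URL comes from the slide. The `GetReport` path segment, the query parameter names, the report code and the `anonymous` requestor are assumptions modelled on typical SUSHI-Lite usage and should be checked against the service documentation.

```python
from urllib.parse import urlencode

# Beta endpoint quoted on the slide.
SUSHI_LITE_BASE = "http://beta.services.openaire.eu/usagestats/sushilite/"


def build_report_url(report, begin_date, end_date, item_identifier=None,
                     repository_identifier=None, requestor_id="anonymous"):
    """Compose a report request URL (no network call is made)."""
    # Assumption: all parameter names below are illustrative.
    params = {
        "Report": report,       # hypothetical report code, e.g. "AR1"
        "Release": 4,           # COUNTER R4, as stated on the slide
        "RequestorID": requestor_id,
        "BeginDate": begin_date,
        "EndDate": end_date,
    }
    if repository_identifier:
        params["RepositoryIdentifier"] = repository_identifier
    if item_identifier:
        params["ItemIdentifier"] = item_identifier
    # Assumption: "GetReport/" is a hypothetical path segment.
    return f"{SUSHI_LITE_BASE}GetReport/?{urlencode(params)}"


if __name__ == "__main__":
    print(build_report_url("AR1", "2017-01-01", "2017-11-30",
                           item_identifier="doi:10.1234/example"))
```

Identifier prefixes such as `doi:` or `oai:` mirror the identifier types listed on the slide; the exact syntax the service expects is not given there.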
28. • Quantitative indicators for research
• Governance
• Management
• Assessment
• Dimensions
• Robust metrics in terms of accuracy and scope;
• Humble metrics recognizing that quantitative evaluation should support qualitative,
expert assessment;
• Open and Transparent metrics;
• Diverse metrics by field in order to support the plurality of research and researcher career
paths across the system;
• Reflexive metrics for recognising, anticipating and updating the systemic and potential
effects of indicators.
OpenAIRE: A Usage Statistics Hub for
Responsible Metrics
29. • Standardization: following COUNTER Code of
Practice
• by update to COUNTER R5
• by contribution to COUNTER Robots Working Group
• Put usage statistics into context with conventional
and alternative metrics and (open) peer review
Considering the HLEG Altmetrics
Recommendations
30. ● Develop Piwik plugins for other repository platforms
(e.g. Fedora, Samvera)
● Promote the service to content provider managers
● Support national usage statistics initiatives to
become a node in OpenAIRE Usage Statistics
● Contribute to the Open Metrics concept and vision
● Activities in OpenAIRE-Advance starting in 2018:
○ support LA Referencia to set up a regional usage statistics network and
interlink
○ working towards Open Metrics
Next Steps
31. ● Standardize usage statistics to enable assessment of research impact
○ Standardize usage statistic metrics across OpenAIRE and EOSC-hub
○ Collaborate with RDA (e.g. Make Data Count BoF working group)
○ Promote common guidelines to and across communities
○ Take EC rules and GDPR regulations into account
● Enable the collection/aggregation of usage stats from content providers
○ Adopt OpenAIRE and EOSC-hub services for collecting user statistics, services in scope:
■ EGI: Accounting System, AppDB
■ EUDAT: DPMT, B2SHARE, B2FIND, B2SAFE
○ Adopt OpenAIRE Usage Statistics Services to collect user stats for all products of
science
■ e.g. literature, datasets, software, research objects
■ Integrating with EOSC-hub services for usage statistics/metrics
Collaboration with EOSC-hub
what this talk is about
usage statistics are published under CC0.
how many data providers participate today? 2700 repositories and eJournals, ~980 compatible data providers, > 21 million items
diagram about the components and dataflow
usage data collection strategies
Column headers
37 registered / 32 active (repositories + OpenAIRE)
http://beta.repomanager.openaire.eu/#landing
workflow to enable metrics
tracking plugins for repositories
metrics visualization
http://beta.services.openaire.eu/usagestats/sushilite/
reports cover the data sources tracked by Piwik + data sources collected from statistics nodes (e.g. IRUS-UK)
for fulltext downloads (ft_total) and metadata views (abstract)
collection of questions
next steps (towards OpenAIRE Advance)