The RJ Broker is a middleware tool that automates the delivery of research output from publishers and subject repositories to institutional repositories. It accepts metadata and full text deposits, identifies target repositories, and handles deposition through protocols like SWORD. Trials were conducted with Nature Publishing Group and Europe PMC to demonstrate functionality and scalability. The future involves transitioning the RJ Broker to a sustainable service that recruits more data suppliers and repositories and adds additional features to support the open access goals of funders.
2. Overview
• Need for a broker
• Development of the RJ Broker
• Publisher & Subject Repository Trials
• Future
• Conclusion
RSP Webinar - 29 May 2013
4. Focus
Support Open Access & Funder mandates:
• To increase the number of deposits to UK repositories
• To minimise effort by depositors and IR managers
[Diagram. Labels: Author/PI, Author 2 / Representative, Funders, Publishers, Publisher System, Subject repo, Institutional Repo, Broker, IR (×3)]
6. Previous & Current Projects
• EDINA has led several projects in the repositories area since 2006:
– Prospero, The Depot, OpenDepot.org, EM-Loader, OA-RJ, UK RepositoryNet+ (RepNet), ORI and RJ Broker
– http://edina.ac.uk/projects/
• RJ Broker
– April 2012 to March 2013
– Extension to July 2013
– Component of the RepNet infrastructure
7. Abstract Model: A Delivery Service
• RJ Broker is a delivery service for research output
– Deposit: parcel or letter
– Metadata: address
– Notification: postcard
• Vision:
– http://oarepojunction.wordpress.com/2013/01/10/rj-broker-a-research-output-delivery-service/
8. RJ Broker
• RJ Broker is an independent middleware tool
– Accepts deposits of research articles
• NLM DTD, bespoke format, SWORD, Eprints
– Processes the deposits into a common format
• RJ Broker code
– Identifies target repositories from metadata
• Organisation and Repository Identification (ORI)
• http://ori.edina.ac.uk/
– Handles deposition to registered repositories
• SWORD, plugins (Eprints, DSpace, Fedora, …)
– Provides a tracking ID to the content supplier
• URIs
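The steps listed on this slide can be pictured as a small pipeline. The following is only an illustrative sketch, not the RJ Broker code: the `Record` and `Broker` classes, the dictionary standing in for an ORI lookup, and the `broker.example.org` tracking URI are all hypothetical, and the delivery step simply logs what a real SWORD deposit would send.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Record:
    """Common internal format (hypothetical fields, for illustration only)."""
    title: str
    affiliations: list[str]
    embargoed: bool = False


@dataclass
class Broker:
    # Stand-in for an ORI lookup: organisation name -> repository endpoints.
    org_to_repos: dict[str, list[str]]
    delivered: list[tuple[str, str]] = field(default_factory=list)
    _ids: count = field(default_factory=lambda: count(1))

    def normalise(self, raw: dict) -> Record:
        """Map a supplier-specific dict onto the common Record format."""
        return Record(
            title=raw["title"].strip(),
            affiliations=[a.strip() for a in raw.get("affiliations", [])],
            embargoed=bool(raw.get("embargoed", False)),
        )

    def identify_targets(self, record: Record) -> list[str]:
        """Collect repository endpoints for every recognised affiliation."""
        targets: list[str] = []
        for org in record.affiliations:
            for repo in self.org_to_repos.get(org, []):
                if repo not in targets:
                    targets.append(repo)
        return targets

    def deposit(self, raw: dict) -> str:
        """Run the pipeline and return a tracking URI for the supplier."""
        record = self.normalise(raw)
        # Hypothetical URI scheme; the slide only says tracking IDs are URIs.
        tracking_uri = f"https://broker.example.org/track/{next(self._ids)}"
        for repo in self.identify_targets(record):
            # A real broker would make a SWORD deposit here; we only log it.
            self.delivered.append((repo, record.title))
        return tracking_uri
```

A supplier-side call would look like `Broker({...}).deposit({"title": "...", "affiliations": ["..."]})`. Records whose affiliations match no known organisation still get a tracking URI but are delivered nowhere.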
9. RJ Broker
• RJ Broker also
– Allows browsing, search and download
• GUI & APIs (Eprints)
– Notifies other repositories with relevant content
• Monthly email
• “View”
• Useful for non-SWORD systems (CRIS), individuals
11. Publisher & Subject Repository Trials with the Pilot RJ Broker
12. Pilot RJ Broker
• Demonstrate the functionality
• Real data
• Test the scalability
• Publisher: Nature Publishing Group (NPG)
• Subject Repository: Europe PubMed Central
13. Publisher: NPG
• Record includes
– Metadata: rich, embargo, funder, multiple authors, ORCID in the future…
– Content: multi-part publication, i.e. full text (some content may be embargoed)
• Development work:
– Agree format for the record (NLM DTD based)
– EDINA developed an importer for the data
– Transfer using SWORD 1.3
– NPG added a new stream to their publication workflow that sends data to the RJ Broker
14. NPG
• Legal agreements to respect embargo periods
– Between NPG & EDINA
– Between EDINA & IRs
• MIT signed the IR agreement
– Working on data importer for DSpace
• Worth considering, in order to receive:
– Quality: full text publication & rich metadata
– Timely: straight from the publisher during the publication process, even if embargoed
• Template agreement on request
15. NPG
• Set up took several months
– Time difference
– Relies on voluntary participation
– Requires a small amount of development work
– Legal framework
• Successful data transfer trial between NPG & RJ Broker in February 2013
• NPG ready to start continuous data feed
– A couple of journals first, increasing with take-up
• Transfer to test IRs
16. Subject Repository: Europe PMC
• Use case supported by Jisc, RepNet & Wellcome Trust
– UK focus
– Supports funder mandates
• Record includes:
– Metadata only: funders, grant numbers, first author only, DOI to full text…
– Fully Open Access
– New publication or Update to existing publication
17. Europe PMC
• Development work:
– Agree format for the record (bespoke)
– EDINA developed an importer for the data
– Transfer using SWORD 1.3
– MIMAS/EBI get a regular data feed from PMC
– MIMAS/EBI push data from their regular feed to the RJ Broker
• Set up took a few weeks
• Successful data transfer trial between Europe PMC & RJ Broker in February 2013
• Ready to start continuous data feed
– Average 160,000 records per month
• Transfer to test IRs
18. Europe PMC Trial in Numbers
• ~67,000 records in the trial dataset (~12 days, based on an average of 160,000 per month)
• ~60,000 sent to RJ Broker (7,000 with no affiliation)
• ~58,500 successfully received by RJ Broker (1,500 errors: bad format)
• ~22,500 for which RJ Broker identifies an organisation (36,000 with no identifiable organisation)
• ~14,500 have repositories (8,000 with no repositories)
• 1,665 in the UK (13,000 worldwide, not UK)
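The arithmetic behind these trial figures can be checked with a few lines of Python. This is a sketch using the rounded counts from the slide; the function names and the 30-day month used for the "~12 days" estimate are assumptions made here for illustration, not part of the original trial analysis.

```python
# Approximate counts from the trial slide, in pipeline order.
funnel = [
    ("records in trial dataset", 67_000),
    ("sent to RJ Broker", 60_000),
    ("successfully received", 58_500),
    ("organisation identified", 22_500),
    ("have repositories", 14_500),
    ("in the UK", 1_665),
]


def stage_report(stages):
    """Return (label, count, dropped_since_previous, percent_of_first) rows."""
    first = stages[0][1]
    rows = []
    previous = first
    for label, n in stages:
        rows.append((label, n, previous - n, round(100 * n / first, 1)))
        previous = n
    return rows


def days_covered(records, monthly_average=160_000, days_per_month=30):
    """Estimate how many days of feed a dataset represents (30-day month assumed)."""
    return records * days_per_month / monthly_average


for label, n, dropped, share in stage_report(funnel):
    print(f"{label:28} {n:>7,}  dropped {dropped:>6,}  {share:>5}% of input")
print(f"dataset covers about {days_covered(67_000):.1f} days of feed")
```

The largest single drop is at organisation identification, where 36,000 of the 58,500 received records have no identifiable organisation.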
20. Europe PMC Trial in Numbers
[Chart: Number of associated repositories for records with one organisation identified]
21. Europe PMC Trial in Numbers
Country Code   Country          Number of records
us             USA              5934
gb             United Kingdom   1665
ca             Canada           1099
jp             Japan            722
au             Australia        655
se             Sweden           313
es             Spain            304
nl             Netherlands      299
de             Germany          239
tw             Taiwan           181
fr             France           180
br             Brazil           179
it             Italy            176
be             Belgium          174
th             Thailand         168
za             South Africa     160
sd             Sudan            155
55 other countries with less than 1% of records each: 1836
22. Europe PMC Trial in Numbers: Top UK Institutions
Destination                                                   Number of records
University of Oxford                                          170
University of Cambridge                                       139
University College London                                     119
Imperial College                                              103
University of Edinburgh                                       88
University of Manchester                                      63
University of Bristol                                         61
University of Nottingham, University of Newcastle Upon Tyne   56
Liverpool                                                     55
University of Glasgow                                         52
78 UK Institutions in total
24. RJ Broker Trial Installation
• GUI preview access
• OA records from NPG & Europe PMC are available for browsing & downloading
– Check what we have for your institution!
– http://devel.edina.ac.uk:1203/
– !!! It is only a trial & development installation
– !!! Not a service yet
25. RJ Broker Trial
Demonstrate features:
• Importing records from different suppliers
• Storing & Processing (~2s per record)
• Repository Identification
• Delivery
• Browsing & Download
More end-to-end use cases
• External IRs
• Different IR platforms (Eprints, DSpace, Fedora…)
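To put the "~2s per record" figure in context, it can be combined with the Europe PMC average of 160,000 records per month quoted earlier. The sketch below assumes records are processed one after another and that 2 seconds holds on average; neither assumption is stated in the slides, and the function name is invented for this example.

```python
def processing_days(records_per_month, seconds_per_record=2.0):
    """Days of sequential processing needed for one month of feed."""
    seconds_per_day = 24 * 60 * 60
    return records_per_month * seconds_per_record / seconds_per_day


days = processing_days(160_000)
print(f"{days:.1f} days of processing per month of Europe PMC feed")
```

Under those assumptions a month of feed takes well under a week of processing time, which leaves headroom for additional suppliers.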
27. Immediate Future
• Project extension (31 July 2013)
• Prepare transition to service
– Service installation
– Add functionality
• Email notification to all (non-registered) IRs
• Improve support for different repository platforms
• Bulk transfer of data backlog
• Support RIOXX metadata export
– Early adopters
• IRs
• Data suppliers to establish data feeds
– Start building data store
• Content kept for 1 year to start with
28. Future (after July 2013)
• Transition to Service
– Within RepNet Infrastructure
– SLD
– Roadmap for adding further functionality
• Open for recruitment
29. Proposed Service to IRs
• All IRs:
– Browse and download OA content through the RJ Broker's public APIs and GUI
– Ready to accept registration of new IRs
– Info 'pack' on how to register for the RJ Broker delivery service
– Monthly email notification to IRs listing citations for the OA publications that were supplied to the RJ Broker in the last month and are relevant to the institution. The email includes instructions on how to access the APIs and GUI for browsing and downloading the content, and an invitation to register for the RJ Broker delivery service.
30. Proposed Service to IRs
• Registration process:
– IR provides SWORD endpoint credentials to RJ Broker
– IR is configured to accept RJ Broker data
– Opting in to receive embargoed content requires signing a legal agreement
• Registered IRs:
– Direct delivery of new OA content from all suppliers to the IR SWORD endpoint
– Direct delivery of embargoed content from all suppliers to the IR SWORD endpoint, subject to legal agreement
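The registration rules above amount to a small decision table. The sketch below is one possible reading of it, not the service's actual logic: the `Repository` class and the action names are hypothetical, and withholding embargoed content from non-registered IRs is an assumption, since the slides only describe email notification for OA publications.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Repository:
    name: str
    sword_endpoint: Optional[str] = None  # None means the IR is not registered
    embargo_agreement: bool = False       # True once the legal agreement is signed


def delivery_action(repo: Repository, embargoed: bool) -> str:
    """Decide what happens to one record for one institutional repository."""
    registered = repo.sword_endpoint is not None
    if embargoed:
        # Embargoed content is delivered only to registered IRs that opted in.
        if registered and repo.embargo_agreement:
            return "deliver"
        return "withhold"  # assumption: nothing is sent or announced
    if registered:
        return "deliver"   # direct delivery of OA content to the SWORD endpoint
    return "notify"        # OA content is listed in the monthly email instead
```

For example, `delivery_action(Repository("Example IR"), embargoed=False)` returns `"notify"`, while the same record for a registered IR returns `"deliver"`.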
31. Proposed Service to Data Suppliers
• Ready to accept new data suppliers
• Info 'pack' on how to become an RJ Broker data supplier
• Enable regular data feed into RJ Broker
• Forward supplied content to registered IRs (when identification of target IRs is possible)
• Email notification of supplied content to relevant IRs
• Expose supplied OA content through APIs and GUI for browsing and downloading
• Allow tracking and full access to own content
33. Conclusion
• Effective solution to content dissemination
• Benefits all
– Increase OA & transparency
– Help with reporting
– Support promotion
– Save time & effort (money)
• Appeal of service will grow
• Small amount of development work needed locally, but it is worth it!
34. Thanks
• You
• RSP
• EDINA Team
– Ian Stuart, Cesare Bellini
– Muriel Mewissen, Christine Rees
– Peter Burnhill, Theo Andrews
• NPG, MIT, Europe PMC, MIMAS, EBI, Wellcome Trust
• UK RepositoryNet+
• Jisc