FundRef on the AAP/PSP panel: CHORUS: A Collaborative Approach to Public Access

CHORUS
A Collaborative Approach to Public Access
AAP/PSP
6 February 2014
Carol Anne Meyer
CrossRef

@meyercarol

ORCID: 0000-0003-2443-2804

Today I’m going to be introducing and explaining the new FundRef initiative from CrossRef.

First just a few words about CrossRef for anyone who isn’t a member or might not be familiar with us as an organization. CrossRef is a not-for-profit
membership organization of international scholarly publishers.

A not-for-profit trade association of
global scholarly publishers

CrossRef has 1900 members,
representing 4627 publishers

Why is membership growing steadily? Because publishers think that having DOIs will increase their visibility in the scholarly community.

We have 1900 voting members, representing 4600 publishers.

Members come from 100 countries

Services

• Reference linking
• Cited-by linking
• Plagiarism screening (powered by iThenticate)
• Update identification
• Metadata feeds to third parties
• Funding identification

Our community includes
Affiliates and Libraries

Now we have 90 affiliates and 2045 libraries

We have two offices: 
Lynnfield, MA and Oxford, UK

We have 24 employees

Departments
•Technical
• Finance & Operations
• Marketing & Business Development
• Strategic Initiatives
• Product Management

The Long Tail of Members

[Bar chart: # of members per revenue tier ($ million); tiers >500, 201-500, 101-200, 51-100, 26-50, 10-25, 5-10, 1-5, <1; bar labels 2, 6, 4, 10, 12, 43, 64, 151, 1612]

At $50K, 0.32% or 6 members account for 20%
At $275, 85% or 1600 members account for 30%

Note that this is a logarithmic scale

Mission
To be a trusted collaborative
organization with broad community
connections; authoritative and
innovative in support of a
persistent, sustainable
infrastructure for scholarly
communication.

6-Word Mission

Improving scholarly
communication through
community collaboration

Intel inside? The engine of scholarly communication?

Distribution of CrossRef DOIs by Content Type, September 2013

[Pie chart: Journals, Books, Conference Proceedings, Components; slice labels 81%, 11%, 6%, 2%]

Traffic to publishers’ sites
# of CrossRef DOI resolutions or “clicks” each year

[Chart: annual resolutions (000), 2002 to 2012; vertical axis from 0 to 800000]

In other words, traffic generated to publishers by CrossRef DOIs.

Basic CrossRef Metadata

Basic journal citation metadata:

• author(s)
• journal title
• article title
• volume
• issue
• publication date
• ISSN
• page numbers
• article IDs
• internal identifiers
• URL
• DOI

Additional metadata

• ORCID
• CrossMark
   Updates (related CrossRef DOIs)
   Publication record information
• Text and Data Mining Data
• NISO Open Access Identifier

ORCID is the Open Researcher and Contributor ID.
CrossMark is our update and version identification service.
Text and Data Mining is the artist previously known as Prospect.

Just a word on this NISO recommended practice: the comment period recently ended, and it is fairly simple. It recommends two tags, “free_to_read” and “license_ref”, and they can be further modified by effective dates to accommodate embargo periods.

CrossRef participates in the working group and is committed to implementing and supporting the resulting metadata.
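
As a rough illustration of how an effective date could accommodate an embargo, here is a minimal Python sketch. The field name `start_date` and the helper `is_free_to_read` are hypothetical and chosen only for this example; the actual attribute names are defined by the NISO recommended practice, not by this sketch.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class FreeToRead:
    """A free_to_read flag with an optional effective date (hypothetical field name)."""
    start_date: Optional[date] = None  # None means free to read immediately


def is_free_to_read(tag: Optional[FreeToRead], on: date) -> bool:
    """Return True if the content is free to read on the given day."""
    if tag is None:
        # No free_to_read tag was supplied for this item.
        return False
    if tag.start_date is None:
        return True
    # The embargo ends on the effective date.
    return on >= tag.start_date


# Made-up example: an embargo that ends on 1 February 2015.
embargoed = FreeToRead(start_date=date(2015, 2, 1))
print(is_free_to_read(embargoed, date(2014, 6, 1)))
print(is_free_to_read(embargoed, date(2015, 2, 1)))
```

A “license_ref” tag could be modelled the same way, with the license URI as the value and its own effective date.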

Additional metadata

• Funder Name
• Funder Identifier
• Award Number

A standard way of reporting funding
sources for published scholarly research
Launched May 2013

FundRef launched in May 2013 to a great response. FundRef’s purpose is pretty simple - it has been developed to provide a standard way of reporting funding sources for published scholarly research. I’m
going to start by covering why this is important.

For further reading

http://fundref.crossref.org/docs/funder_kpi_metadata_best_practice.html

Let’s take a look at some sample articles. Many journals and other publications include the authors’ acknowledgement of funding sources, but where and how this information is displayed varies widely. In this
article it’s at the end just before the reference section under Acknowledgements, and it tells you the source of the funding and the grant numbers.

...in this PDF article it’s just below the abstract in a section labelled “Funding”. It names the organisation that funded the research, but doesn’t include a grant number.

...and in this one it’s at the end again in the acknowledgements section, and does include an award number. As you can see the location of funding information varies from publication to publication - sometimes in the metadata, sometimes with the references or only in the full text behind a paywall.

And it’s not just the placement of this information on the page, it’s also how it is formatted and displayed.

There are a couple of issues with the formatting of funding information in publications. One is that it’s mostly very hard to retrieve in any technical way - if you want to extract this information as part of the article’s metadata, or to search on it, you will struggle because many publishers don’t mark this information up in their XML.

Here’s an article with funding information at the top with the other metadata. It has its own section and heading, so it stands out to the reader browsing the page, but if you look at the XML all of the information is grouped together as free text in a paragraph tag. This is not helpful to a machine or search engine that might be looking for this information.

<fn fn-type="financial-disclosure">

<p>This work was supported in part by NIH
grant R01 GM094800B to G.J.J., a gift to Caltech from
the Gordon and Betty Moore Foundation, and a stipend
from the Bayerische Forschungsstiftung to M.P. The
funders had no role in study design, data collection and
analysis, decision to publish, or preparation of the
manuscript.</p>

</fn>

</fn-group>

</back>

</article>

<body>
...
<sec>
<title>Funding</title>
<p>This work was supported by the
<grant-sponsor xlink:href="http://www.grf.org" id="GS1">Generic
Research Foundation</grant-sponsor>,
the <grant-sponsor xlink:href="http://www.energy.gov"
id="GS2">Department of Energy</grant-sponsor> Office of Science grant
number <grant-num rid="GS2">DE-FG0204ER63803</grant-num>, and the
<grant-sponsor xlink:href="http://www.nih.gov" id="GS3">National
Institutes of Health</grant-sponsor>.
</p>
</sec>
</body>

And even when publishers do tag up funding information in their XML, as in this example, there are still problems. The tags are likely to vary from publisher to publisher - this publisher uses “Grant Sponsor” and “Grant Num”. Another might use “Funding Source” and “Award Number”. And not all publishers are making this information mandatory on submission, so there will be gaps where authors leave out grant numbers.
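
To show what tagged funding information makes possible, here is a minimal Python sketch that pulls sponsors and grant numbers out of the example above using only the standard library. The function name `extract_funding` is hypothetical, and an `xmlns:xlink` declaration has been added to the fragment so that it parses on its own. It only works for this one publisher's tag names, which is exactly the problem.

```python
import xml.etree.ElementTree as ET

# The tagged example from the slide, with a namespace declaration added
# so that the fragment is well-formed by itself.
SAMPLE = """<body xmlns:xlink="http://www.w3.org/1999/xlink">
<sec>
<title>Funding</title>
<p>This work was supported by the
<grant-sponsor xlink:href="http://www.grf.org" id="GS1">Generic Research Foundation</grant-sponsor>,
the <grant-sponsor xlink:href="http://www.energy.gov" id="GS2">Department of Energy</grant-sponsor>
Office of Science grant number <grant-num rid="GS2">DE-FG0204ER63803</grant-num>, and the
<grant-sponsor xlink:href="http://www.nih.gov" id="GS3">National Institutes of Health</grant-sponsor>.
</p>
</sec>
</body>"""


def extract_funding(xml_text):
    """Return a list of (sponsor name, [grant numbers]) in document order."""
    root = ET.fromstring(xml_text)
    # Map each sponsor id to its name, collapsing any internal whitespace.
    sponsors = {
        el.get("id"): " ".join((el.text or "").split())
        for el in root.iter("grant-sponsor")
    }
    # Group grant numbers by the sponsor id they point at via "rid".
    awards = {}
    for el in root.iter("grant-num"):
        awards.setdefault(el.get("rid"), []).append((el.text or "").strip())
    return [(name, awards.get(sid, [])) for sid, name in sponsors.items()]


for name, grants in extract_funding(SAMPLE):
    print(name, grants)
```

A publisher that tags the same facts as “Funding Source” and “Award Number” would need a different extractor, so each consumer of the data ends up writing one parser per publisher.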

On top of this there’s the lack of standardization in naming of the funding bodies themselves...

Why does this matter?

So why is this important? Without any central database or standard way to store or search this data, all of the stakeholders struggle to get the information they need to fully analyze the outputs of funded research, and this impacts funding bodies, publishers and institutions. And any kind of large-scale analysis is extremely hard without a means to get hold of the data in a machine-readable format.

Funding bodies cannot easily track the published output of funding
Publishers cannot easily report which articles result from research supported by specific funders or grants
Institutions cannot easily link funding received to published output
Lack of standard metadata for funding sources makes it difficult to analyze or mine the data


National Institutes of Health
NIH? N.I.H.? National Institute of Health?
Abbreviations, misspellings, translations...

When funding information is entered as free-form text by the author you are going to have inconsistencies - people will use abbreviations or alternative names or will misspell things. There’s no guarantee that
you’ll be able to match up or de-duplicate the funding bodies and so a search for NIH might not return any publications that had research supported by National Institutes of Health, and so on… Also, NIH itself
is ambiguous. Which nations are we talking about?
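
The matching problem can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the alias table `ALIASES` and the function `standard_name` are hypothetical, and the fuzzy step uses the standard library's `difflib` rather than anything FundRef itself is known to use.

```python
import difflib
import re

# Hypothetical alias table: normalized variant -> standard funder name.
ALIASES = {
    "nih": "National Institutes of Health",
    "nationalinstitutesofhealth": "National Institutes of Health",
}


def normalize(name):
    """Lower-case the name and drop punctuation and spaces."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def standard_name(raw, cutoff=0.9):
    """Map free text to a standard name, or None if nothing is close enough."""
    key = normalize(raw)
    if key in ALIASES:
        return ALIASES[key]
    # Tolerate small misspellings such as a missing plural "s".
    close = difflib.get_close_matches(key, list(ALIASES), n=1, cutoff=cutoff)
    return ALIASES[close[0]] if close else None


for variant in ["NIH", "N.I.H.", "National Institute of Health", "Generic Research Foundation"]:
    print(variant, "->", standard_name(variant))
```

Even this toy version needs a hand-built alias table, and it does nothing for translations or for acronyms shared by funders in different countries, which is why a shared, curated list of names is the more robust fix.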

The public

Funders
Established award systems and research management processes
Relationship with researchers funded by agencies

Publishers
Established publishing and peer-review systems
Relationship with authors submitting manuscripts

Institutions
Relationship with researchers

So FundRef is a collaborative solution to this problem, devised by both publishers and funders. It can benefit publishers, funders, research institutions, researchers, and the public. All parties have an interest in the outcomes of FundRef. Funders have well-established processes for recording the distribution of funds and monitoring the research process, and publishers have theirs for ingesting, processing and publishing the outcomes of the research. The piece that has been missing is the one that links these two sets of processes, and that is where FundRef comes in, recording this link and making it more visible.

That would truly be the holy grail: interoperability between systems at research institutions, publishers, and funding agencies, and transparency to the public.

We are seeing a lot of interest in having this central, standardized store for funding information - it will be a huge benefit to all of those involved in the funding of research and the publication of research outcomes.

There are no fees for FundRef services.

FundRef Pilot

Our year-long FundRef pilot ran until March 2013 and involved these organisations - the publishers on the left, and the funding bodies on the right. On successful completion of the pilot project the CrossRef board approved the FundRef service to go into production, which we did with our launch on May 28th.

The FundRef Registry is a
taxonomy of 5500 funder names.

One of the key things that came out of the pilot and is central to the project is an agreed taxonomy of funding bodies. The FundRef Registry has been created from a list donated to the project by Elsevier, and
currently consists of around 5500 international funder names, up 18% since our May 2013 launch. The list data is and will be freely available under a CC0 license waiver. The Registry is updated monthly, and
new organizations suggested by publishers or funding bodies themselves are added after curation. This is the list that publishers should use to collect information from authors on submission.

5500 funder names and ID numbers from curated Elsevier SciVal registry, donated to FundRef

Hosted by CrossRef, available under CC0

Updated and extended monthly

Publishers use this list to ensure consistency

www.crossref.org/fundref/fundref_registry.html


To put this into context and explain in more detail how the process works: CrossRef hosts the funder registry, which provides standard funder names to publisher submission systems. Publishers ask authors, at submission, to provide the name or names of the funding bodies and accompanying grant numbers. This funding information goes into publishers’ production systems, where it is stored as tagged XML and submitted to CrossRef with all of the other deposited metadata for each piece of content.

Once the funding information is in the CrossRef database it becomes searchable, either through our search interfaces or via one of our APIs, and publishers, funders, and other interested parties can query on a funding organisation or grant number to discover the resultant publications, or can look up a piece of content using other metadata and find out the funding sources.

Publishers will be able to display this funding information in a structured way. For those publishers who are participating in CrossMark, the funding data will automatically appear in the Record tab of the CrossMark dialogue box. We strongly encourage publishers submitting FundRef information to also participate in CrossMark, as this further standardises the location of the information for readers, but of course it can also be displayed on the publisher’s site in metadata and full text.

[Diagram: FundRef Registry -> Publisher Submission System (Funder, Grant Number) -> Production Systems -> CrossRef Database & Query APIs -> Funders, Researchers, Institutions, Publishers, SHARE]

DOI
Funding Source
Award Number

But the key piece is that the funding information is now centrally stored in the CrossRef database and can be queried. These three pieces of information - the DOI, the funding source or sources and award numbers - are tied together in the metadata, making each of them discoverable via any of the other.

Taking this a step further, once this information is in the CrossRef database and ORCIDs are also being deposited, you have a scenario in which you can look up a researcher, find their publications, and see how their research was funded, or look up a grant number, see its associated DOIs and which researchers contributed to those publications. I’ll come back to querying the data later...
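
The idea that each piece is discoverable via any of the others can be sketched as a small in-memory index. Everything here is made up for illustration: the record layout, the DOIs, award numbers and ORCIDs, and the helper names are hypothetical and say nothing about how the CrossRef database is actually organised.

```python
from collections import defaultdict
from typing import NamedTuple


class FundingRecord(NamedTuple):
    doi: str
    funder: str
    award: str
    orcid: str


# Made-up records for illustration only.
RECORDS = [
    FundingRecord("10.5555/example.1", "National Science Foundation", "AWARD-001", "0000-0000-0000-0001"),
    FundingRecord("10.5555/example.1", "National Institutes of Health", "AWARD-002", "0000-0000-0000-0001"),
    FundingRecord("10.5555/example.2", "National Institutes of Health", "AWARD-002", "0000-0000-0000-0002"),
]


def build_index(records, key):
    """Group records by one of their fields."""
    index = defaultdict(list)
    for rec in records:
        index[getattr(rec, key)].append(rec)
    return index


by_award = build_index(RECORDS, "award")
by_orcid = build_index(RECORDS, "orcid")


def dois_for_award(award):
    """Look up a grant number and see its associated DOIs."""
    return sorted({r.doi for r in by_award.get(award, [])})


def orcids_for_award(award):
    """See which researchers contributed to publications under a grant."""
    return sorted({r.orcid for r in by_award.get(award, [])})


def funders_for_orcid(orcid):
    """Look up a researcher and see how their research was funded."""
    return sorted({r.funder for r in by_orcid.get(orcid, [])})


print(dois_for_award("AWARD-002"))
print(funders_for_orcid("0000-0000-0000-0001"))
```

The point of the sketch is that once the three fields travel together in one record, each new kind of lookup is just another index over the same data.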

[Note: in the future, if funders decide to assign CrossRef DOIs to grants, we could relate ORCIDs directly to awards, in addition to going through a published document.]

Submission Workflow
1. Collect funding data from authors on submission using
FundRef Registry taxonomy

The first thing that publishers need to do is collect the funding data from authors when they submit their paper. Submission of grant numbers should be encouraged but isn’t mandatory, and of course the author will need to be able to submit multiple grant numbers and multiple funders. There will need to be an option for “no funding source” and also the opportunity for authors to select “other” and input the name of the organization if their source isn’t found in the Registry. If they do this, the name they input will be stored in the CrossRef metadata and will be added to a list to be verified and added to the Registry.
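
The requirements above can be sketched as a small validation routine for a submission form. This is a minimal sketch, not a description of any real submission system: the class names, the `check_submission` helper and the two-entry stand-in for the Registry are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

# Tiny stand-in for the Registry (hypothetical subset).
REGISTRY_NAMES = {"National Science Foundation", "National Institutes of Health"}


@dataclass
class FunderEntry:
    name: str
    grant_numbers: List[str] = field(default_factory=list)  # encouraged, not mandatory


@dataclass
class FundingSubmission:
    no_funding: bool = False  # the "no funding source" option
    funders: List[FunderEntry] = field(default_factory=list)  # multiple funders allowed


def check_submission(sub):
    """Return (errors, names_for_review) for one author submission."""
    errors, review = [], []
    if sub.no_funding and sub.funders:
        errors.append("'No funding source' cannot be combined with funder entries.")
    if not sub.no_funding and not sub.funders:
        errors.append("Select at least one funder or 'no funding source'.")
    for entry in sub.funders:
        if not entry.name.strip():
            errors.append("Funder name is empty.")
        elif entry.name not in REGISTRY_NAMES:
            # An "other" name: keep it, and queue it for review.
            review.append(entry.name)
    return errors, review


example = FundingSubmission(funders=[
    FunderEntry("National Science Foundation", ["AWARD-001", "AWARD-002"]),
    FunderEntry("Small Family Trust"),
])
print(check_submission(example))
```

Note that a missing grant number never produces an error here, while a name outside the Registry is accepted but flagged, mirroring the “other” option.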

eJournalPress announced their FundRef integration in November.

Workflow Issues
1. Collect funding data from authors on submission using
FundRef Registry taxonomy

http://www.crossref.org/fundref
The funder name that the author submits should come from the FundRef Registry and should be the standardised version of that name. In our FundRef Search we use an auto-complete function as the user
types.
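
For a sense of what an auto-complete over standardised names involves, here is a minimal prefix-matching sketch in Python. It is an assumption-laden illustration: the four names are a hypothetical sample, `suggest` is a made-up helper, and the real FundRef Search may match names quite differently.

```python
import bisect

# Hypothetical sample of Registry names, sorted case-insensitively.
REGISTRY_NAMES = sorted(
    [
        "Wellcome Trust",
        "National Science Foundation",
        "Natural Environment Research Council",
        "National Institutes of Health",
    ],
    key=str.lower,
)
_KEYS = [name.lower() for name in REGISTRY_NAMES]


def suggest(prefix, limit=5):
    """Return up to `limit` standard names that start with what the user typed."""
    typed = prefix.lower()
    # Binary search for the first name that is >= the typed prefix.
    start = bisect.bisect_left(_KEYS, typed)
    matches = []
    for name, key in zip(REGISTRY_NAMES[start:], _KEYS[start:]):
        if not key.startswith(typed):
            break  # names are sorted, so no later name can match
        matches.append(name)
        if len(matches) == limit:
            break
    return matches


print(suggest("nation"))
print(suggest("nat"))
```

Because the author picks from the suggestions instead of typing free text, the name that reaches the publisher is already the standardised one.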

Implementation
Widget - http://labs.crossref.org

We’ve also built a widget that you can drop into your submission pages to collect this information. The widget always references the most up-to-date version of the registry, so you won’t need to worry about downloading the file unless you want to parse backfile information for deposit. The widget is available on the CrossRef Labs page, where you can also download the code if you’re interested in making use of it.

Implementation
2. Pass funding data from submission system to production
systems
Publisher
Submission System
Funder
Grant Number

Production

Systems

So once you’ve integrated the Registry to allow you to collect funding body names from authors, you will then need to make sure that your production systems can ingest this additional data from your submission systems, ready to be deposited with CrossRef. This may require some changes to ensure that you can load the additional metadata.

Implementation
[Diagram: Publisher Submission System (Funder, Grant Number) -> Editorial Check -> Production Systems]

Of course that’s how the process was designed.

Back in the real world, it turns out that about 20% of the funder-article relationships that have been deposited are not in the registry.

The second issue, as I think Mark will mention, is that funding agencies are eager to see FundRef data populated and associated with publication records.
So by definition, capturing the data at submission means that there is a pipeline delay for FundRef data being associated with published literature as these
submissions make their way through the peer review process.

http://labs.crossref.org

If you do have backfile content with funding information that you have already extracted, we’ve put together a tool to help you match the funding names in
your content with the FundRef Registry. It uses Google Reconcile and is available on CrossRef Labs at this URL - there’s a really handy tutorial video that
will talk you through how to use the tools to add FundRef IDs to your metadata.

Workflow
3. Deposit FundRef data with CrossRef
CrossMark participants should
deposit FundRef data within
CrossMark deposits

CrossMark participation
recommended for standard
display of funding information

And that’s step three - deposit the funding information with CrossRef. We are strongly encouraging our members to also join CrossMark and submit the funding data as part of their CrossMark deposits.

For CrossMark participants the funding data will automatically appear in the record tab of the CrossMark dialogue box, giving the advantage of standardisation across publisher websites for the reader, and automatically highlighting the publisher’s participation in FundRef.

Workflow
3. Deposit funding data with CrossRef

If you do deposit within CrossMark, this is an example of what it will look like. You can see here that we’ve got a very simple CrossMark deposit - the basic required information, and the FundRef data for one grant from one funding organisation, the National Science Foundation. The funder name and funder identifier are taken from the FundRef Registry, and you’ll notice that these funder identifiers are DOIs, for uniqueness and persistence. The funder name and funder identifier are required; the award number is optional.

If an author submits a funder name that is not present in the Registry you *can* deposit it with CrossRef without an associated ID. Please don’t try to send us your own internal IDs because they will be rejected. Deposits with funder_names that aren’t in the Registry will be flagged to us and will be reviewed manually before being added to expand the registry. But I would stress that wherever a funder name does appear in the Registry it must be matched and deposited with its FundRef Funder ID. If you don’t submit the funder ID numbers your content will not appear in FundRef Search.
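
Since the slide image is not reproduced here, the following Python sketch builds a funding fragment that follows the rules just described: a funder name is always required, the funder identifier accompanies it whenever the name is in the Registry, and award numbers are optional. The element names (`program`, `assertion`), the `name` values and the namespace are assumptions based on the FundRef deposit schema as generally documented, and should be checked against the current schema before any real deposit. The helper `funding_program`, the award number and the second funder are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace assumed for illustration; verify against the deposit schema.
FR_NS = "http://www.crossref.org/fundref.xsd"
ET.register_namespace("fr", FR_NS)


def _assertion(parent, name, text=None):
    """Append one <fr:assertion name="..."> element to parent."""
    el = ET.SubElement(parent, "{%s}assertion" % FR_NS, {"name": name})
    el.text = text
    return el


def funding_program(funders):
    """funders: list of dicts with 'name', optional 'funder_id' and 'awards'."""
    program = ET.Element("{%s}program" % FR_NS, {"name": "fundref"})
    for funder in funders:
        if not funder.get("name"):
            raise ValueError("funder name is required")
        group = _assertion(program, "fundgroup")
        name_el = _assertion(group, "funder_name", funder["name"])
        if funder.get("funder_id"):
            # Left out only when the name is not in the Registry.
            _assertion(name_el, "funder_identifier", funder["funder_id"])
        for award in funder.get("awards", []):
            _assertion(group, "award_number", award)
    return program


example = funding_program([
    {
        "name": "National Science Foundation",
        "funder_id": "http://dx.doi.org/10.13039/100000001",  # illustrative value
        "awards": ["EXAMPLE-0001"],  # made-up award number
    },
    {"name": "Small Family Trust"},  # hypothetical funder not in the Registry
])
print(ET.tostring(example, encoding="unicode"))
```

The second funder shows the name-only case: it can still be deposited, but without an identifier it would not surface in FundRef Search.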

I should mention at this point that if you are holding off joining CrossMark because you’re working out what additional metadata you will deposit and how
to get hold of that metadata, it’s perfectly acceptable to join CrossMark without the additional metadata in order to get FundRef information showing for
your content.
In this CrossMark example the publisher has supplied publication dates. You don’t have to have any of this extra metadata ready - you can always deposit
any additional CrossMark data at a later date, but take advantage of CrossMark to ensure that funding information is prominently displayed,

Look up funding data

http://search.crossref.org/fundref
Then, when this data is in the CrossRef database, institutions, publishers, funders, and other interested parties can search on it, either through our FundRef Search interface or using one of our query APIs. FundRef Search is an interface specifically for looking up funding bodies and seeing papers that have resulted from their grants. If you want to look up award numbers or papers you will need to use CrossRef Metadata Search, which I will come to in a moment.

Search for funder

FundRef Search directs the user to search using one of the funding body names in the registry. It handles acronyms so NIH will bring up the National
Institutes of Health as in this example. You’ll see that countries are listed - which is important because more than one country has a “National Science
Foundation”...

Results

Here I’ve used FundRef Search to look up the US NIH. You can see that it has returned a list of articles that have the NIH listed as a funder. In the first result we have the NIH listed as the funder but with no grant number. The second and third results show one grant number and then several. And the fourth article has NIH listed as one of three funding bodies, each with their own related award numbers.

Looking to the left of the screen you can see the hierarchy of funding bodies taken from the Registry. The NIH falls under the US Dept of Health and Human Services, and below are all of the subsidiary funding bodies of the NIH itself. The default results are research funded by the organisation you searched on - but you have the option to include all subsidiary organisations too by checking the box at the top of the list. Then you will see results that list NIH and all of its subsidiary organisations.

We’ve just added this hierarchical browsing, so it’s a little bit of a work in progress and we’re missing the hierarchies for a few organisations, but it should give some really useful options for viewing a wider or narrower group of related funding bodies.
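
The behaviour of the “include subsidiaries” checkbox can be sketched as a walk over a parent/child table. This is a minimal sketch under stated assumptions: the slice of hierarchy, the article list and the function names are hypothetical, and the real search service may be implemented quite differently.

```python
# Hypothetical slice of the funder hierarchy: child -> parent.
PARENT = {
    "National Institutes of Health": "U.S. Department of Health and Human Services",
    "National Cancer Institute": "National Institutes of Health",
    "National Eye Institute": "National Institutes of Health",
}

# Acronyms resolve to a standard name before searching.
ACRONYMS = {"NIH": "National Institutes of Health"}

# Made-up (DOI, funder) pairs standing in for deposited funding data.
ARTICLES = [
    ("10.5555/a1", "National Institutes of Health"),
    ("10.5555/a2", "National Cancer Institute"),
    ("10.5555/a3", "National Science Foundation"),
]


def descendants(funder):
    """Collect every funding body below the given one in the hierarchy."""
    found = set()
    stack = [funder]
    while stack:
        current = stack.pop()
        for child, parent in PARENT.items():
            if parent == current and child not in found:
                found.add(child)
                stack.append(child)
    return found


def search(term, include_subsidiaries=False):
    """Return DOIs funded by the named body, optionally with its subsidiaries."""
    funder = ACRONYMS.get(term.upper(), term)
    wanted = {funder}
    if include_subsidiaries:
        wanted |= descendants(funder)
    return [doi for doi, name in ARTICLES if name in wanted]


print(search("NIH"))
print(search("NIH", include_subsidiaries=True))
```

Searching on the parent department with subsidiaries included returns the NIH and institute-level articles as well, which is the wider view described above.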

Search by other metadata

As I said the FundRef Search interface lets you look up on funder names only - this is to allow us to pre-populate the standard names in the FundRef
Registry as search terms. If you have other metadata and want to search on something else, you should use CrossRef Metadata Search, which as you might
expect searches across all of the metadata in the CrossRef database.

Here I’ve entered a grant number and it has returned the associated journal article.

Or you can enter a CrossRef DOI and get the corresponding article metadata, including the funding information where it’s available.

Or you can enter an ORCID and return that author’s papers with the funding information included. While I’m explaining all of this I must include the obvious caveat: FundRef has been running for a little over 6 months now, and while the data is growing we still have relatively little funding information in our database - around 50,000 DOIs have funding data at present. So if you search on a funding body and don’t see any results, please don’t be alarmed: as more publishers deposit funding metadata you will start to see these results appear. The same is true of ORCIDs - a relatively small number of publishers are depositing ORCIDs at this time, but again this will grow in the course of the year.
So we really need our member publishers to join and start depositing as soon as possible in order to make this a useful resource. I hope that these examples give you an idea of the huge potential for discovery that FundRef is going to offer.

So, how are we doing?
• 52,000+ unique documents with FundRef records
• 71,000+ funder-document relationships
• 80% of the funder names from these relationships are in the FundRef Registry

This is up from 28K in October.

90%, or 47K, of these are unique documents with at least one FundRef name from the registry.

Because some documents have more than one funder, there are actually 71K “relationships”.
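
The figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below uses only the rounded numbers quoted above, so the results are approximate; the variable names are just labels for this example.

```python
documents = 52_000       # unique documents with FundRef records
relationships = 71_000   # funder-document relationships

# 90% of documents carry at least one funder name from the Registry.
docs_with_registry_name = round(0.90 * documents)

# 80% of relationships use a funder name that is in the Registry.
relationships_in_registry = round(0.80 * relationships)
relationships_not_in_registry = relationships - relationships_in_registry

# Average number of funders acknowledged per document.
funders_per_document = relationships / documents

print(docs_with_registry_name)
print(relationships_in_registry, relationships_not_in_registry)
print(round(funders_per_document, 2))
```

The 90% and 80% figures are consistent with each other because they count different things: a document with one Registry funder and one unlisted funder counts toward the 90% of documents but contributes one relationship to each side of the 80/20 split.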

These deposits come from
9 publishers of 30 signed up
American Chemical Society
American Diabetes Association
American Institute of Physics
American Psychiatric Publishing
American Psychological Association
American Physical Society
American Society of Neuroradiology
Association for Computing Machinery
BioMed Central
Bioscientifica
Copernicus GmbH
eLife Sciences Publications
Elsevier
FapUNIFESP (SciELO)
Hindawi Publishing Corporation
Institute of Electrical & Electronics Engineers
International Union of Crystallography
Internet Medical Publishing
IOP Publishing
Journal of Rehabilitation Research & Development
Just Medical Media, Ltd.
Kowsar Medical Institute
Landes Bioscience
National Library of Serbia
Optical Society of America
Oxford University Press
Royal Society of Chemistry
ScienceOpen
Taylor & Francis
The Royal Society
Wiley-Blackwell

http://www.crossref.org/fundref/fundref_agreement.html

These are the CrossRef members who have signed up to FundRef so far (30) - those in bold have also started depositing metadata. Please do ensure that
you sign the FundRef agreement before you start depositing so that we can list you as an official participant.

Publishers: sign up now!
FundRef Terms & Conditions:
www.crossref.org/fundref
No fees for FundRef deposits

FundRef is now open for any CrossRef member to join - and we would really like you to! We are seeing a lot of interest in having this central, standardized store for funding information - it will be a huge benefit to all of those involved in the funding of research and the publication of research outcomes. We really want to encourage publishers to sign up sooner rather than later so that we can build up this database of funding information over the coming months.

There are no fees for FundRef deposits - we simply ask that member publishers agree to the Terms and Conditions, which as I’ve said are available as a simple click-through agreement on the CrossRef website. Please do make sure you complete the terms and conditions before you start to deposit.

Any individual or organization interested in querying the FundRef data can use FundRef Search, which is freely available to anyone. Or, organizations can sign up for one of our query or metadata affiliate accounts and make use of the CrossRef APIs and web interfaces to access that data.

But, what does all this have to do with
CHORUS?
• CrossRef (with FundRef) provides the social and technology standards and practices that make CHORUS possible.
• CrossRef DOIs direct interested parties to the correct documents.
• CrossRef’s existing metadata database will hold data about ORCID, FundRef, Open Access Indicator, and Text and Data Mining.
• CrossRef’s Application Programming Interfaces (APIs) and search interfaces will serve these new types of data.

Full Disclosure:
CrossRef plays the field
• CrossRef staff participate on the Technical Working Groups of CHORUS and SHARE.
• CrossRef has also expressed an openness to making its infrastructure available for other public access initiatives.
• CrossRef does not do custom development that is specific to one project and not generalizable to the industry.

Thank you!

www.crossref.org/fundref
cmeyer@crossref.org
