Mendeley Data: Enhancing Data Discovery, Sharing and Reuse
1. February 11, 2020
Anita de Waard, Alberto Zigoni, Elsevier
2. Meet our hero: a cancer biologist
• Specialized in liver cancer
• Runs a lab of 6 people
• Focusing on understanding
the molecular mechanics of
liver cancer to identify
biomarkers for diagnosis and
prognosis
• Mostly pre-clinical research,
mouse models and diseased
tissue samples
• Research has been
supported by various NIH
grants
3. The research (data) journey: private and public
Private
• Literature (and data) search to assess
state of the art and identify gaps
Discover
• Run experiments and collect data
• Collaborate
• Collaborate on project data with team
members and external researchers
Collaborate
• Report on data that is produced for a
specific grant
• Share data in domain specific and
generalist / institutional repository
Publish
Researcher
• Data published in
institutional data repository
Curate
• Data and related
publications / projects /
awards in public portal
Showcase
• Datasets published
everywhere
• Compliance with funders’
policies
Track
Institution
Public
4. Discover: Mendeley Data Search: find related work
Sequencing Data (NIH repository)
Western Blots (Mendeley Data)
Deep indexing
Acronym expansion
HCC = Hepatocellular
Carcinoma
20.5m datasets from
1,700+ sources
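The acronym expansion idea can be illustrated with a minimal sketch. This is not how Mendeley Data Search is implemented; the `ACRONYMS` table and the `expand_query` helper are hypothetical names introduced only to show a query being widened so that "HCC" also matches "Hepatocellular Carcinoma".

```python
import re

# Hypothetical acronym table; a real index would draw on a curated taxonomy.
ACRONYMS = {
    "HCC": "Hepatocellular Carcinoma",
}


def expand_query(query: str) -> list[str]:
    """Return the original query plus variants with known acronyms spelled out."""
    variants = [query]
    for acronym, expansion in ACRONYMS.items():
        # Match the acronym only as a whole word.
        pattern = rf"\b{re.escape(acronym)}\b"
        if re.search(pattern, query):
            variants.append(re.sub(pattern, expansion, query))
    return variants


print(expand_query("HCC western blot"))
```

A search backend could then run every variant and merge the results, so that datasets labelled with either form are found.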
5. Collaborate: store and share experiment/project data
Project
members can
share data
sources inside
project
Multiple data sources
(incl. institutional
servers) available
Datasets can be
viewed & edited by
all project members
(role based access)
Large files copied server-to-
server asynchronously.
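As a rough illustration of asynchronous copying, the sketch below starts several file copies concurrently and waits for all of them. It is an assumption-laden stand-in: local temporary directories play the role of the two servers, and `copy_file`, `copy_all` and `demo` are hypothetical names, not part of any Mendeley Data API.

```python
import asyncio
import shutil
import tempfile
from pathlib import Path


async def copy_file(src: Path, dst: Path) -> Path:
    # Run the blocking copy in a worker thread so the event loop stays free.
    await asyncio.to_thread(shutil.copyfile, src, dst)
    return dst


async def copy_all(sources: list[Path], target_dir: Path) -> list[Path]:
    # Start every copy at once and wait until all have finished.
    tasks = [copy_file(src, target_dir / src.name) for src in sources]
    return await asyncio.gather(*tasks)


def demo() -> list[str]:
    with tempfile.TemporaryDirectory() as tmp:
        src_dir = Path(tmp) / "source_server"   # stand-in for the source server
        dst_dir = Path(tmp) / "target_server"   # stand-in for the target server
        src_dir.mkdir()
        dst_dir.mkdir()
        sources = []
        for name in ("blot_01.tif", "blot_02.tif"):
            path = src_dir / name
            path.write_bytes(b"placeholder image bytes")
            sources.append(path)
        copied = asyncio.run(copy_all(sources, dst_dir))
        return sorted(p.name for p in copied)


if __name__ == "__main__":
    print(demo())
```

The point of the pattern is that the user does not wait on any single large transfer; the copies proceed in the background and are collected when done.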
7. Publish: add metadata and license to data
Type-ahead from taxonomy
Custom metadata
templates configured
Broad choice of data and
software licenses
Semantic links to
articles, datasets,
software
Reserved DOI prior to
publication, can be used
in manuscript to cite data
Multiple sharing options
100 GB per dataset
(configurable)
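The publishing checks on this slide can be sketched as a small data structure. Everything here is illustrative: `DatasetDraft`, its fields, and the placeholder DOI are assumptions made for the example, not the Mendeley Data schema. Only the 100 GB default comes from the slide.

```python
from dataclasses import dataclass, field

# 100 GB per dataset is the default quoted on the slide; it is configurable.
DEFAULT_MAX_BYTES = 100 * 10**9


@dataclass
class DatasetDraft:
    title: str
    license: str        # an open data or software license chosen by the author
    reserved_doi: str   # reserved before publication, citable in the manuscript
    links: list[str] = field(default_factory=list)  # related articles, datasets, software
    total_bytes: int = 0

    def problems(self, max_bytes: int = DEFAULT_MAX_BYTES) -> list[str]:
        """List the reasons this draft is not ready to publish."""
        found = []
        if not self.title.strip():
            found.append("missing title")
        if not self.license.strip():
            found.append("missing license")
        if self.total_bytes > max_bytes:
            found.append("dataset exceeds size limit")
        return found


draft = DatasetDraft(
    title="Western blots, liver tissue samples",
    license="CC BY 4.0",
    reserved_doi="10.99999/placeholder.1",  # hypothetical DOI, for illustration only
    links=["https://doi.org/10.99999/placeholder.article"],
)
print(draft.problems())
```

Because the DOI is reserved at draft time, the manuscript can cite the dataset before the dataset itself goes public.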
11. Track: monitor institutional datasets published anywhere
Enriched dataset metadata include
institutional IDs (Scopus, SciVal,
Mendeley) to facilitate tracking
1,700+ data sources
indexed
APIs for integration
12. Mendeley Data is a full research data management suite:
Data
Repository
Publish
MD
Data
Search
Discover
dB Institutional Product
Data
Manager
Curate
Data
Monitor
Report
Track
NIH
Collaborate
Free
Open,
APIs
Secure
Tailored
Paid
• Data is always
owned by the
user
• 16 possible open
licenses
• Archiving with
DANS
• Open APIs to
many platforms
Data
Repository
Publish
Showcase MD
15. Anita de Waard, a.dewaard@elsevier.com
Alberto Zigoni, a.zigoni@elsevier.com
Eric Livingston, e.livingston@elsevier.com
Thank you!
https://data.mendeley.com/
17. Elsevier in numbers:
1 billion
Over 16m people a month use ScienceDirect,
our flagship platform for academic research,
and download over 1 billion articles a year
73m+
Scopus, the leading abstract and citation
database of research literature, contains
over 73m records across 24,000 journals,
sourced from more than 5,000 publishers.
25,000
Our products are used at more than
25,000 academic and government
institutes globally
7,500
Elsevier has 7,500 employees and
serves customers in over 180
countries.
430,000
Elsevier publishes 430,000 peer-
reviewed articles annually
~11m
Mendeley is a scientific social networking tool
that enables over 11 million users worldwide
to organize, write, collaborate and promote
their research.
490+
ClinicalKey provides over 490 clinical
overviews that give quick clinical answers
and summaries; over 4m images and 51,000
medical and surgical videos in a single, fully
integrated site.
> 6m
ScienceDirect contains over 6
million articles, 2,500 journals, 900
full open access journals, 39,000
books and 330,000 topic pages
19. Supporting research data sharing through our publishing workflow:
• Following TOP guidelines for data deposition
• Data deposition (and citation) fully integrated in article submission process
• Following Force11 Data citation principles
https://cos.io/our-services/top-guidelines/
https://www.elsevier.com/connect/elsevier-supports-top-guidelines-in-ongoing-efforts-to-ensure-research-quality-and-transparency
21. Scholix: Multi-stakeholder Linked Data Effort
Publishers
Data Centers
Repositories
Past: disconnected sources with
heterogeneous practices
Publishers
Data Centers
Repositories
Current: standard set of guidelines for exposing
and consuming links, supported by hubs
http://www.scholix.org/
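To make the idea of an exposed article-dataset link concrete, here is a minimal sketch of a link record. The field names are modeled loosely on the Scholix metadata schema and should be checked against the current schema version before use; `make_link`, the provider name, and both DOIs are hypothetical placeholders.

```python
import json
from datetime import date


def make_link(article_doi: str, dataset_doi: str, provider: str, published: date) -> dict:
    """Build one article-to-dataset link record (Scholix-style, simplified)."""
    return {
        "LinkPublicationDate": published.isoformat(),
        "LinkProvider": [{"Name": provider}],
        "RelationshipType": {"Name": "References"},
        "Source": {
            "Identifier": {"ID": article_doi, "IDScheme": "doi"},
            "Type": {"Name": "literature"},
        },
        "Target": {
            "Identifier": {"ID": dataset_doi, "IDScheme": "doi"},
            "Type": {"Name": "dataset"},
        },
    }


link = make_link(
    article_doi="10.99999/placeholder.article",   # hypothetical DOI
    dataset_doi="10.99999/placeholder.dataset",   # hypothetical DOI
    provider="Example Repository",                # hypothetical provider name
    published=date(2020, 2, 11),
)
print(json.dumps(link, indent=2))
```

When publishers, data centers and repositories all emit records of the same shape, a hub can aggregate them without source-specific parsing.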
23. NHLBI Stage Project
https://www.nhlbidatastage.org/about/
Seven Bridges
Fair4CURES Platform
Storage
Task 1 Workflow
Input
Task 2
Research Object Profiler
Add annotation and
relationships (metadata)
to collection to describe a
research object:
- URI
- Length
- Filename
- Checksums
etc.
Mendeley Data
Research Object Serializer (a
manifest itemizing file names)
Serialise Research Object in
standard format based on BagIt
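The profiling and serialization steps can be sketched with the Python standard library alone. This is not the Research Object Composer or the Mendeley Data API; `profile`, `write_bag` and `demo` are hypothetical helpers that record filename, length and checksum for each file and then write a minimal BagIt-style layout (a `bagit.txt` declaration, a `data/` payload directory and a `manifest-sha256.txt`).

```python
import hashlib
import tempfile
from pathlib import Path


def profile(path: Path) -> dict:
    """Collect basic metadata for one file: filename, length, checksum."""
    data = path.read_bytes()
    return {
        "filename": path.name,
        "length": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def write_bag(files: list[Path], bag_dir: Path) -> Path:
    """Copy files into bag_dir/data and write the manifest itemizing them."""
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True)
    lines = []
    for src in files:
        info = profile(src)
        (data_dir / src.name).write_bytes(src.read_bytes())
        lines.append(f"{info['sha256']}  data/{info['filename']}")
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    manifest = bag_dir / "manifest-sha256.txt"
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return manifest


def demo() -> str:
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "reads.txt"
        src.write_bytes(b"abc")
        manifest = write_bag([src], Path(tmp) / "bag")
        return manifest.read_text(encoding="utf-8")


if __name__ == "__main__":
    print(demo())
```

A consumer can recompute each checksum from the payload and compare it with the manifest to verify that the research object arrived intact.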
Mix of digital
objects
Research Object Composer
File source
TOPMed
RO
Open API
RO
Future: need to address how to
deserialize the BagIt bag so that the user
has no additional repacking work
Storing ROs
Creating the RO
Zenodo
DataVerse
https://github.com/ResearchObject/research-object-composer/blob/master/introduction.ipynb
25. Mendeley Data integrates through open APIs
+ 35 repositories
(BePress planned)
• Mendeley Data Repository
datasets are automatically
synced with the Pure
curation workflow
• Projects, grants,
equipment, showcase
on portal (planned)
• Mendeley Data Search results
are visible on Scopus
• Notify new articles to Monitor
for data sharing compliance
• Datasets appear as records
on Scopus (planned)
• Mendeley Data usage is
accessible through Plum API
and widget
• PlumX metrics (citations,
usage, social mentions) are
captured and shown on
Mendeley Data Repository
Publish datasets
alongside an article
on Mendeley Data
within the SSRN
publication flow
Publish or link datasets
alongside an article on
Mendeley Data within the
ScienceDirect publication flow
Researcher and
Institutional
Dataset metrics
• User identity & login
• Library (planned)
• Notes (planned)
• Projects (planned)
Mendeley Data
Existing integration
Planned integration
• Mendeley Data indexed
by OpenAIRE index
• OpenAIRE Zenodo
repository indexed by
Mendeley Data Search
Long-term
preservation of
published datasets
Links between articles and datasets:
• Contributed by Mendeley
Data to Scholix
• Indexed by Mendeley Data
Search and Data Monitor
• Consumed by Scopus and
ScienceDirect
Integrate with machine-readable
data management plans
• For more than 35 repositories the
metadata as well as the underlying
datasets are indexed by Mendeley
Data Search
• First repositories are actively
integrating with the free and open
‘push API’ of Mendeley Data
Search
• Mint DOIs for Mendeley Data
Repository
• DataCite indexed by
Mendeley Data Search