The document discusses open science and the role of identifiers like DOIs. It describes how research data sharing has become core to open science due to the Internet and digital archives. Researchers now publish their data in addition to papers. Well-managed metadata standards and identifier systems help integrate data across its life cycle from creation to archiving. The DOI system provides persistent links for digital objects and is increasingly used for research data through registration agencies like DataCite.
1. Open Science and Identifiers
Hideaki Takeda
National Institute of Informatics
takeda@nii.ac.jp
ORCID: 0000-0002-2909-7163
Keynote talk, JATS-CON Asia in Tokyo, JST Tokyo headquarters, Tokyo, Japan, October 19, 2015
7. So Science is becoming Open Science
• Open science can be discussed from
philosophical, political, methodological, or any
other point of view.
• “Open Science NOW” is driven and realized
by the Internet as Architecture
• So data sharing is the core of Open Science
14. Data Life Cycle
• Data is created, shared, published, and archived
• But just “published” is not enough; it should be
“openly published” (open data)
[Figure: data life cycle Create → Share → Publish → Archive, over research phases In Progress and Results]
15. Open Data
• “A piece of data or content is open if anyone is free
to use, reuse, and redistribute it — subject only, at
most, to the requirement to attribute and/or share-
alike.” http://opendefinition.org/
• Open data is data published under an open
license
– An open license ensures the above condition
16. Data Life Cycle
• Different tools for different stages of life cycle
– Data sharing: generating, federating, …
– Data publishing: searching, harvesting, …
– Data archiving: migration, …
• The architecture CAN be shared
[Figure: data life cycle Create → Share → Publish → Preserve, over research phases In Progress and Results, with stakeholders Researcher/Research Group and Research Institute]
18. Four reasons for openness of research data
• Demands from Society
– Knowledge sharing among society
– Accountability of public money
• Demands in Science
– Future development of Science itself
• “Standing on the shoulders of giants”
(nanos gigantum humeris insidentes)
– Reproducibility
24. Architecture of data sharing
[Figure: layers of the architecture: Identifier, Metadata Schema, Metadata, Format, Data, Repository]
• Systematic integration across the layers
• Interoperability on each layer
25. Architecture of data sharing
• Metadata: description language, collection and sharing, conversion
• Schema: description language, collection and sharing, conversion
• System development, community
• Management: organization, systems, ID federation
• Coordination and Competition
[Figure: layers of the architecture (Identifier, Metadata Schema, Metadata, Format, Data, Repository) annotated with examples: DOI, ORCID, FundRef; DataCite, CrossRef, JaLC; Dublin Core, DCAT; CKAN; Linked Data; DSpace, Fedora, WEKO. Aspects shown: Organization, Schema, System, Technology]
26. Research Activities and Related Entities
[Figure: research activities (survey, acquiring data, publishing data, article writing) and related entities: data, digital articles, digital objects, topics, projects, funding agencies (supporting), research institutions (affiliation), academic societies]
27. Research Activities and Related Entities
[Figure: research activities and related entities (as on the previous slide), with an ID attached to each entity]
28. Research Activities and Related Entities
[Figure: research activities (survey, article writing, acquiring data, publishing data), funding agencies, projects, and the entities data, digital articles, research institutions, academic societies, and topics, each labeled with an ID]
29. Identifiers for research
• A research activity is represented with a
structure of identifiers
– Planned and submitted
– Organized and executed
– Concluded and evaluated
[Figure: network of interlinked IDs]
30. Identifiers for research
[Figure: network of interlinked IDs]
• ID for
– Article
– Data
– Researcher
– Institutions, affiliation
– Funding agency, funded project
– Academic society
– Topic
– …
31. Nature of IDs for research
[Figure: IDs for research plotted on two axes, Local vs. Global and Authorized vs. Open: DOI, ORCID, Institution Member ID, URI, ResearchGate/Academia.edu/…, Grant ID, Kaken Grant ID, Kaken Researcher ID, PubMed ID, ResearchMap, Facebook]
32. Nature of IDs for Science
• Balance in some features
– Global vs. Local
• Global: Unified service
• Local: Specialized service
– Authorized vs. Open
• Authorized: Trusted, restricted
• Open: no restrictions
– Charged vs. Free
• Multiple IDs can co-exist in a single category
• How to manage multiple IDs
– Integration/mapping/associating/discovering
– Control/Manage/Authorize
– Private/Share/Open
34. DOI in Architecture of Data Sharing
[Figure: the data sharing layers (Identifier, Metadata Schema, Metadata, Format, Data, Repository) with DOI as the identifier; JaLC and DataCite with the JaLC Metadata Schema and the DataCite Metadata Schema; members (data providers) with domain-specific metadata schemata]
35. DOI (Digital Object Identifier)
• Service to translate DOI names to URIs
containing digital objects
• Service managed by International DOI
Foundation (IDF)
• Initially started by STM publishers to share
identifiers for digital publications
• Distributed management
– Delegation of registration tasks to Registration
Agencies (RAs)
36. DOI (Digital Object Identifier)
• Service to translate DOI names to URIs
containing digital objects
DOI: doi: 10.1007/978-3-642-21616-9_30
→ URL: http://www.springerlink.com/content/xkj2386758245u85/
DOI as URL: http://doi.org/10.1007/978-3-642-21616-9_30
→ URL: http://www.springerlink.com/content/xkj2386758245u85/
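The step from a DOI name to its "DOI as URL" form can be sketched in a few lines of Python. This is only an illustrative sketch and not part of the talk: the function name `doi_to_url` and the default resolver base `https://doi.org/` are assumptions (the slide writes `http://doi.org/`). The redirect from the resolver to the publisher's page happens on the resolver side and is not reproduced here.

```python
from urllib.parse import quote


def doi_to_url(doi_name: str, resolver: str = "https://doi.org/") -> str:
    """Turn a DOI name such as 'doi: 10.1007/...' into a resolver URL."""
    name = doi_name.strip()
    # Drop an optional 'doi:' label, as written on the slide.
    if name.lower().startswith("doi:"):
        name = name[4:].strip()
    # Percent-encode characters that are unsafe in a URL path, keeping '/'.
    return resolver + quote(name, safe="/")


print(doi_to_url("doi: 10.1007/978-3-642-21616-9_30"))
# https://doi.org/10.1007/978-3-642-21616-9_30
```

Passing `resolver="http://doi.org/"` gives the exact form shown on the slide.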
37. Management Structure of DOI
• Three layers: International DOI Foundation (IDF), Registration
Agency (RA), members
• RAs contribute to IDF by registration to Registry DBs,
management of Registry DBs, and membership fees
• RAs offer services for DOI registration to their members
• Members can register DOIs for their digital objects through
RAs
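[Figure: three-layer hierarchy. IDF at the top; RAs below it (CrossRef, DataCite, JaLC); members at the bottom: publishers under CrossRef; university, library, and research institute under DataCite; publisher, university, and academic society under JaLC]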
38. Roles of DOI
• Provide resolvable, persistent, interoperable
links
– Resolvable: standard syntax + mapping by the Handle
System
– Persistent
• Technically: management of registry DBs
• Socially: organizational operations and duties for
members
– Interoperable: shared data model
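As a small illustration of the "standard syntax" point: a DOI name consists of a prefix and a suffix separated by the first "/", and the prefix always begins with "10.". The sketch below only checks and splits that syntax. The helper name `split_doi` is hypothetical, and the mapping performed by the Handle System is not modeled.

```python
from typing import Tuple


def split_doi(doi_name: str) -> Tuple[str, str]:
    """Split a DOI name into (prefix, suffix) at the first '/'."""
    prefix, sep, suffix = doi_name.strip().partition("/")
    # Every DOI prefix starts with '10.' followed by a registrant code.
    if not sep or not suffix or not prefix.startswith("10.") or len(prefix) <= 3:
        raise ValueError(f"not a DOI name: {doi_name!r}")
    return prefix, suffix


prefix, suffix = split_doi("10.1007/978-3-642-21616-9_30")
print(prefix)  # 10.1007
print(suffix)  # 978-3-642-21616-9_30
```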
39. Registration Agencies (RAs)
• Airiti, Inc.
• CrossRef
• China National Knowledge Infrastructure (CNKI)
• DataCite
• EIDR (Entertainment Identifier Registry)
• ISTIC (The Institute of Scientific and Technical Information of China)
• JaLC (Japan Link Center)
• mEDRA (Multilingual European DOI Registration Agency)
• OP (Publications Office of the European Union)
40. CrossRef
• Ensure accessibility and citation of articles and
books in STM publications
• Started in 1999
• Largest and oldest RA of IDF
– Most DOIs are registered via CrossRef
– Members in over 70 countries, mostly publishers
• Functions
– DOI Registration
– Metadata Management
• Bibliographic metadata
• Citation
– Services with metadata
• Search for bibliographic metadata and citation
• Reverse look up
41. DataCite
• IDF RA for research data
• A not-for-profit organization since 1 December 2009
42. Japan Link Center (JaLC)
• Founded in March 2012
• Aims to register DOIs for academic content produced
in Japan or in Japanese, so that the information circulates in Japan and overseas
• Controlled by four national organizations:
Japan Science and Technology Agency (JST)
National Institute for Materials Science (NIMS)
National Institute of Informatics (NII)
National Diet Library (NDL)
• Operated by JST
• Membership system
(Academic societies, Publishers, University libraries, etc)
• External coordination
JaLC is a member of CrossRef and DataCite (Mar. 2014)
• Over 1,300,000 DOIs registered
45. Experiment Project to register DOIs for Research Data
• Goal
− Establish operation flows to register DOIs for
research data and achieve stable operation
• Objectives
− Set policies in registering DOIs for research data
− Establish operation flows to register DOIs for
research data with the next version of the JaLC system,
and verify them by performing registration tests
• Period: October 2014 – October 2015
47. Members of the project
• National Bioscience Database Center (NBDC), Japan Science and Technology
Agency (JST)
• National Institute of Polar Research (NIPR)
• National Institute of Informatics (NII)
• DIAS-P Project (National Institute of Informatics (NII))
– Japan Agency for Marine-Earth Science and Technology (JAMSTEC)
– University of Tokyo
– Kyoto University
– National Institute for Environmental Studies (NIES)
• National Institute of Advanced Industrial Science and Technology (AIST)
• National Institute of Information and Communications Technology (NICT)
– Kyoto University
– National Institute of Informatics (NII)
– InfoProto Co.,Ltd.
– Japan Aerospace Exploration Agency (JAXA)
– National Institute of Polar Research (NIPR)
• Chiba University Library
• National Institute for Materials Science (NIMS)
• Neuroinformatics Japan Center, Brain Science Institute (BSI), RIKEN
48. Issues in Data DOI
• Flow of operations
• Persistent access
• Granularity of data in registration
• Dynamics of data
• Landing page
• Quantity of data
• Applications
49. Issues in Data DOI
• Flow of operations: Who, When, How
− Who registers data?: Researcher/Project
manager/Librarian
− When is data registered?
− How is metadata provided for data?
• Persistent access
− What persistency can we expect for data?
− Can time-limited projects participate? Who will ensure the
persistency of the data?
− Example policies:
− The representative institute takes over all of the data
− DOIs are registered only for data managed by real organizations
among the members of the project
51. Life cycle of data and stakeholders (in case of data)
[Figure: life cycles of the ID (register), the metadata (create, register, modify), and the data (create, save, publish, modify, remove), with the stakeholders involved: Researcher, Project, Library, Research Institution, JaLC. Metadata kinds shown: JaLC metadata, domain metadata]
52. Issues in Data DOI (cont’d)
• Granularity of data in registration
– Some aspects for granularity of data
• Good for citation
• Granularity of data itself
– Observation data/Experiment data/Simulation data
• Easy for access
• Easy for management
• Quantity of data
53. Issues in Data DOI (cont’d)
• Dynamics of data
− Adding data after registration of DOI
− Some options:
− Different DOIs
− Add relationship metadata to denote the relation to the original
DOIs
− Use the original DOI
− Versioning: add the link to the new data while keeping the link to the
original data
− History of changes in the single DOI
− No description (e.g., data still under observation)
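The "different DOIs plus relationship metadata" option can be sketched as follows. This is a hedged illustration rather than the JaLC workflow: `DataRecord` and `register_new_version` are hypothetical names, the DOIs are made-up placeholders, and the relation labels `IsNewVersionOf` / `IsPreviousVersionOf` are borrowed from the relation types of the DataCite Metadata Schema as one possible vocabulary.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class DataRecord:
    doi: str
    title: str
    # (relation type, related DOI) pairs; a simplified stand-in for
    # relationship metadata.
    relations: List[Tuple[str, str]] = field(default_factory=list)


def register_new_version(original: DataRecord, new_doi: str) -> DataRecord:
    """Give the updated data its own DOI and link both records to each other."""
    updated = DataRecord(doi=new_doi, title=original.title)
    updated.relations.append(("IsNewVersionOf", original.doi))
    original.relations.append(("IsPreviousVersionOf", updated.doi))
    return updated


# Hypothetical placeholder DOIs, for illustration only.
v1 = DataRecord(doi="10.1234/example-data.v1", title="Example observation data")
v2 = register_new_version(v1, "10.1234/example-data.v2")
print(v2.relations)  # [('IsNewVersionOf', '10.1234/example-data.v1')]
```

Under the "use the original DOI" option, no second record would be created; the change history would instead be kept inside the single record.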
54. Issues in Data DOI (cont’d)
• Landing page
− Metadata description
− For open/closed data
• Quantity of data
− Registering DOI for a large amount of data
• Applications
− Citing DOIs for research data
− Developing other applications
55. Recommendations for Data DOIs
• Recognize the variety in the nature of data
• Minimal Commitment
– Persistency, Interoperability, Usability,
Manageability
• Design your own DOI registration policy
57. ORCID
(Open Researcher and Contributor Identifier)
• ID that uniquely identifies researchers and
contributors to research
• Managed by ORCID, Inc. (NPO), since 2011
– Members: STM publishers, universities, funding
agencies
• Service started in October, 2012
• How to use ORCID
– When submitting manuscripts
– Author information in articles
– Faculty Management
– …
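One practical detail when handling ORCID iDs, for example in a manuscript submission form: the last of the 16 characters is a check character computed with ISO 7064 MOD 11-2, so mistyped iDs can be caught before any lookup. The sketch below is an illustration with hypothetical function names; it checks the format only and does not confirm that an iD is actually registered.

```python
def orcid_check_character(base_digits: str) -> str:
    """Compute the check character for the first 15 digits (ISO 7064 MOD 11-2)."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the length, the digits, and the final check character."""
    compact = orcid.replace("-", "").upper()
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_character(compact[:15]) == compact[15]


# The speaker's iD from the title slide.
print(is_valid_orcid("0000-0002-2909-7163"))  # True
```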
62. Linked Data
• Network of metadata
• Sharing metadata
among RAs
– CrossRef
– DataCite
– (JaLC)
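[Figure: example linked data graph. A Work node (URI) has Label "真夏の太陽" (Midsummer Sun), Date 1989, Image, Description, and Creator. The Creator node (URI) has Name "Isamu Noguchi", E-address isamu@noguchi.jp, Image, and a link to wikipedia. The Work Is_located_in a Museum/Place node (URI) with Title "Yokohama Museum", Address "3-4-1, Minato Mirai, Nishi-ku, Yokohama", Phone 045-221-0300, Category, and Image. Description text, translated: "View 「無言のうちに歩いている」 (Walking in Silence) through 「真夜中の太陽」 (Midnight Sun), which somehow makes you want to peek inside when you come close. You can encounter the works in a way that differs from the usual."]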
63. Summary
• Open Science backed by data-sharing
• Data-sharing architecture
– Interoperability should be guaranteed
– Layers
• ID/Metadata Schema/Metadata/Data
format/Data/Repository
– Cooperation and Competition
• DOI is a promising ID for data, but its use
differs from that for literature
– DOI registration policy is needed
Editor's notes
You may think the Internet is just a tool.
No, it is more than a tool.