Citation and Research Objects:
Toward Active Research Objects
Research Objects 2019,
eScience 2019, 24 September 2019
Daniel S. Katz
(d.katz@ieee.org, http://danielskatz.org, @danielskatz)
Assistant Director for Scientific
Software & Applications, NCSA
Research Associate Professor,
CS, ECE, iSchool
https://doi.org/10.5281/zenodo.3338176
Definitions
• Simple research object
• Small unit of citable work
• E.g., paper, dataset, version of software, etc.
• Complex research object
• Collection of multiple simple research objects
• E.g., Research Objects as thought of in this workshop
Citing simple research objects
• Recent progress in creating principles for citing simple research
objects, such as data [1] and software [2]
• Different principles because these are fundamentally different objects [3]
• More recently, community efforts to implement these citation
principles
• Data: FORCE11 Data Citation Implementation Group
• Software: FORCE11 Software Citation Implementation Working Group
• Data (in the context of the FAIR data principles): Enabling FAIR Data [4]
• Note: There is no widely accepted equivalent of the FAIR data
principles for software or for other research objects, though some
researchers are working in this area
[1] Joint declaration of data citation principles https://doi.org/10.25490/a97f-egyk
[2] Software citation principles https://doi.org/10.7717/peerj-cs.86
[3] Software vs. data in the context of citation https://doi.org/10.7287/peerj.preprints.2630v1
[4] The FAIR guiding principles for scientific data management and stewardship
https://doi.org/10.1038/sdata.2016.18
How to cite simple research objects
• Follow the example of the long-established method for citing papers:
1. Deposit item (data, software) and associated metadata in an archival
repository
• Possible peer-review or repository checks
2. The repository (aka publisher) stores/archives the item and metadata;
provides an identifier that can be used to retrieve them
3. Identifier and metadata are used to cite the object
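The three steps above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `ToyRepository`, `Metadata`, `format_citation`, the `example` identifier prefix, and the citation layout are hypothetical and do not correspond to any real repository API or citation style.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Metadata:
    authors: List[str]
    title: str
    year: int
    publisher: str  # the archival repository acting as publisher


class ToyRepository:
    """Hypothetical in-memory stand-in for an archival repository."""

    def __init__(self, name: str, prefix: str) -> None:
        self.name = name
        self.prefix = prefix  # made-up identifier prefix, not a real DOI prefix
        self._items: Dict[str, Tuple[bytes, Metadata]] = {}
        self._counter = itertools.count(1)

    def deposit(self, item: bytes, authors: List[str], title: str, year: int) -> str:
        """Steps 1 and 2: store the item with its metadata, return an identifier."""
        identifier = f"{self.prefix}/{next(self._counter)}"
        self._items[identifier] = (item, Metadata(list(authors), title, year, self.name))
        return identifier

    def retrieve(self, identifier: str) -> Tuple[bytes, Metadata]:
        """The identifier retrieves both the item and its metadata."""
        return self._items[identifier]


def format_citation(identifier: str, meta: Metadata) -> str:
    """Step 3: build a citation from the identifier and the metadata."""
    authors = ", ".join(meta.authors)
    return f"{authors} ({meta.year}). {meta.title}. {meta.publisher}. {identifier}"


repo = ToyRepository("Example Archive", "example")
pid = repo.deposit(b"x,y\n1,2\n", ["A. Author"], "Sample dataset", 2019)
_, meta = repo.retrieve(pid)
print(format_citation(pid, meta))
```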
Citing complex research objects?
• Complex research objects are objects that contain other objects,
e.g., “Research Objects”
• What could be cited?
• Entire complex object (as a single entity)
• Some of the contained objects (which may already have identifiers)
• Both
• How to cite?
• Two proposals follow
Basic citation of complex research objects
• Proposal 1: Treat complex research object as a container and a set of contents & cite both
complex research object and all the contained objects that were used
• FORCE11 Software Citation Implementation Working Group recently defined some
challenges [5]
• One is how to cite complex software objects, namely frameworks that include components
• A framework can have lots of components
• Only some components are used in a particular research project
• So a set of citations for that project should cite the framework and the components that were used
• Citation of Research Objects (ROs) [6] is similar
• RO itself should be cited, plus objects in RO that are used, not those that are not
• Citing objects in the RO can then be handled similarly to how those objects outside an RO would be
cited, whether they are data, software, or something else
• Note: this relies on separability of objects; not the case for some complex research objects, e.g.,
Jupyter Notebooks, where all the software, data, and text are bundled in such a way that they cannot
be separated and individually cited
[5] Software citation implementation challenges http://arxiv.org/abs/1905.08674
[6] Why linked data is not enough for scientists https://doi.org/10.1016/j.future.2011.08.004
How to cite complex research objects
• The necessary steps are thus:
1. Tracking what parts of the RO were used (both the RO itself and the objects within it)
2. Finding identifiers & other metadata for the RO and its objects that were used
3. Building correctly formatted citations for the RO and its objects that were used
• Step 1 is the greatest challenge
• With current Research Objects, this must be done outside the RO, either manually or by
tools that use the RO (e.g., an electronic notebook system)
• For Steps 2 and 3
• Cite the RO as a data object; follow data citation principles
• Cite software, data, and documentation objects in an RO as you would for any software
or data objects or papers
• Contents may have identifiers already based on their existence outside the RO, or they
can be given identifiers when the RO is given an identifier, with suitable relationship
metadata between the RO and the content
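As a rough sketch of Steps 2 and 3 with a current (passive) Research Object, assume that Step 1 has already been done outside the RO and has produced a set of names of the objects that were used. The function `build_citations`, the dictionary layout, and all identifiers and citation strings below are hypothetical placeholders, not part of any Research Object specification.

```python
from typing import Dict, Iterable, List

Entry = Dict[str, str]  # assumed keys: "identifier" and "citation"


def build_citations(ro: Entry, contents: Dict[str, Entry], used: Iterable[str]) -> List[str]:
    """Return citations for the RO itself plus only the contained objects that were used."""
    used_names = set(used)
    unknown = used_names - set(contents)
    if unknown:
        # Step 2 fails if a used object has no identifier or metadata on record
        raise KeyError(f"no metadata for: {sorted(unknown)}")

    def fmt(entry: Entry) -> str:
        return f"{entry['citation']} {entry['identifier']}"

    citations = [fmt(ro)]  # the RO is always cited
    citations += [fmt(contents[name]) for name in contents if name in used_names]
    return citations


# Placeholder metadata for an RO and three contained objects
ro = {"identifier": "example-id/ro", "citation": "Example RO (2019)."}
contents = {
    "dataset": {"identifier": "example-id/data", "citation": "Example dataset (2018)."},
    "solver": {"identifier": "example-id/code", "citation": "Example solver v1.0 (2019)."},
    "notes": {"identifier": "example-id/doc", "citation": "Example notes (2017)."},
}

# The usage record is kept manually or by an external tool (Step 1)
for line in build_citations(ro, contents, used={"dataset", "solver"}):
    print(line)
```

Unused objects ("notes" here) are deliberately left out, matching the proposal that only the RO and the objects actually used are cited.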
Active research objects and citation
• Move beyond current Research Objects to automatically track usage of objects inside ROs
• As stated on http://www.researchobject.org: “Enriching these resources and collections with any and all additional
information required to make research reusable, and reproducible!”
• Proposal 2: Active Research Objects (AROs), which add internal data and methods to the RO
• Basic ARO methods: put() and get() to place and access the object within the ARO
• put() requires data beyond the object being placed
• Data currently required by many ROs, including description, checksum, etc.
• External identifier (DOI) and a citation
• Perhaps also internal identifier (e.g., IDO [7])
• get() tracks when an object is accessed
• ARO data includes: flags for each internal object
• Initially set to false when object is put
• Set to true when the object is accessed via get()
• Next ARO method: validate(), to provide fixity
• Final ARO method: citation(), similar to the citation method in R [8], except can be used to obtain
citation for whole RO, citations for RO and internal objects that have been used, or citation for one
specific internal object
[7] Identifiers for Digital Objects: the Case of Software Source Code Preservation https://hal.archives-ouvertes.fr/hal-01865790
[8] Citing R https://cran.r-project.org/doc/FAQ/R-FAQ.html#Citing-R
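One possible reading of the ARO proposal, as a minimal sketch rather than a definitive design: the slides name only put(), get(), validate(), and citation(), so the class name `ActiveResearchObject`, the parameter names, the use of SHA-256 for the checksum, and the choice to compute the checksum inside put() are all assumptions made here for illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class _Entry:
    payload: bytes
    description: str
    checksum: str       # SHA-256 hex digest recorded at put() time (assumed algorithm)
    identifier: str     # external identifier, e.g. a DOI
    citation: str
    used: bool = False  # flag: False when put, True once accessed via get()


class ActiveResearchObject:
    """Hypothetical ARO: a container that tracks which contents were accessed."""

    def __init__(self, identifier: str, citation: str) -> None:
        self.identifier = identifier
        self._citation = citation
        self._entries: Dict[str, _Entry] = {}

    def put(self, name: str, payload: bytes, description: str,
            identifier: str, citation: str) -> None:
        """Place an object along with the extra data that put() requires."""
        digest = hashlib.sha256(payload).hexdigest()
        self._entries[name] = _Entry(payload, description, digest, identifier, citation)

    def get(self, name: str) -> bytes:
        """Return the object and record that it was accessed."""
        entry = self._entries[name]
        entry.used = True
        return entry.payload

    def validate(self) -> bool:
        """Fixity check: every payload still matches its recorded checksum."""
        return all(hashlib.sha256(e.payload).hexdigest() == e.checksum
                   for e in self._entries.values())

    def citation(self, name: Optional[str] = None, used_only: bool = False) -> List[str]:
        """Cite one internal object, the whole RO, or the RO plus the used objects."""
        if name is not None:
            return [self._entries[name].citation]
        citations = [self._citation]
        if used_only:
            citations += [e.citation for e in self._entries.values() if e.used]
        return citations


aro = ActiveResearchObject("example-id/ro", "Example RO citation")
aro.put("data", b"1,2,3", "small table", "example-id/data", "Example data citation")
aro.put("code", b"print(1)", "script", "example-id/code", "Example code citation")
aro.get("data")  # only the data object is used
print(aro.citation(used_only=True))
```

Because the usage flags live inside the object, the tracking step no longer has to be done manually or by an external tool.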
Acknowledgements
• Prior support from NIH Data Commons Pilot
Program Consortium (DCPPC) via Harvard as
part of Team Sodium
• Thanks!
• Questions?