SlideShare une entreprise Scribd logo
1  sur  19
The Dryad Digital Repository:Published evolutionary data as part of the greater data ecosystem Todd Vision (UNC-CH/NESCent) UNC-CH <MRC> Sarah Carrier Elena Feinstein Jane Greenberg Hollie White NESCent Kevin Clarke Hilmar Lapp Heather Piwowar Peggy Schaeffer Ryan Scherle Kristin Antelman (NCSU) Bill Michener (UNM/DataONE) Bill Piel (Yale) Funding: NSF, IMLS
http://datadryad.org Functional goals To publish the data reported in the biological literature. To promote the reuse of the data. Organizational goals Shared governance by a consortium of journals. Responsible long-term stewardship.
Henry Oldenburg
Use and reuse of archived data in evolutionary biology n=27 articles from 5 journals
Sharing data on request is not effective Wicherts et al (2006) requested from from 141 articles in the field of psychology. “6 months later, after … 400 emails, [sending] detailed descriptions of our study aims, approvals of our ethical committee, signed assurances not to share data with others, and even our full resumes…” only 27% of authors complied  In a survey among geneticists by Campbell et al. (2002) the most frequent reason for withholding data was the effort required to share it (80%). 28% were unable confirm others published research because of data withholding.
Archiving at publication is effective The point in time when authors are most prepared to archive their data. No opportunity for loss, corruption, etc., of data files  Publication can be both carrot and stick. The “GenBank model” is uniquely successful.
Further incentives to authors Increases impact of one’s own work A quid pro quo for access to others’ data Guarantees data preservation Ad hoc data sharing is a burden
Evoldir survey March 2008n=414 “Do you think the data underlying published scientific results should be made publicly accessible?”  Yes: 395 (95.4%) No: 19 (4.6%)  “If yes, do you think journals should require data sharing of their authors, or should it be voluntary?” Required: 220 (55.6%) Voluntary: 176 (44.4%)
Joint Data Archiving Policy  Data are important products of the scientific enterprise, and they should be preserved and usable for decades in the future.  [This journal] requires, as a condition for publication, that data supporting the results in thearticle should be deposited in an appropriate public archive. Authors may elect to …  embargo access to the data for a period up to a year after publication.  Exceptions may be granted at the discretion of the editor, especially for sensitive information such as human subject data or the location of endangered species. Whitlock, M. C., M. A. McPeek, M. D. Rausher, L. Rieseberg, and A. J. Moore. 2010. Data Archiving. American Naturalist. 175(2):145-146. 	DOI:10.1086/650340
That’s all well and good, but where’s this “appropriate public archive”?
Potential archiving solutions Author-managed websites Avoids some of the hazards of informal sharing, but is fragile. Specialized databases (e.g. GenBank, TreeBase) Will cover some datatypes well, some not at all;  High quality data, but with greater submission burden;  May have issues with sustainability. Supplementary materials online Publisher provides basic infrastructure, but with low level of service. Shared public archive (e.g. Dryad) Permanent identifiers (DOIs) and trackable data citations;  Explicit terms (CCZero) for reuse;  No paywall to access;  Searchable across publishers & repositories;  Metadata enhanced for discoverability;  Support for standard APIs;  Commitment to preservation perpetuity, incl. migration of formats;  Files updatable;  Support for embargoes, etc.
Dryad is a digital librarynot a traditional bioinformatic database
Repository priorities Integration Sharing Discovery Preservation
Low-burden data submission
DataONE
Lessons from Dryad (so far) The importance of journals in data publication. The value of a shared public repository to promotion of data reuse. The delicate balance of benefit and burden to data authors. The need to break down data silos. Achieving long-term data preservation by achieving long-term organizational sustainability.
To learn more: http://blog.datadryad.org http://datadryad.org/wiki dryad-users@nescent.org Follow us on Facebook & Twitter

Contenu connexe

Tendances

Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...
Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...
Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...GigaScience, BGI Hong Kong
 
Making it Easier, Possibly Even Pleasant, to Author Rich Experimental Metadata
Making it Easier, Possibly Even Pleasant, to Author Rich Experimental MetadataMaking it Easier, Possibly Even Pleasant, to Author Rich Experimental Metadata
Making it Easier, Possibly Even Pleasant, to Author Rich Experimental MetadataMichel Dumontier
 
The DataTags System: Sharing Sensitive Data with Confidence
The DataTags System: Sharing Sensitive Data with ConfidenceThe DataTags System: Sharing Sensitive Data with Confidence
The DataTags System: Sharing Sensitive Data with ConfidenceMerce Crosas
 
dkNET Poster Experimental Biology 2019
dkNET Poster Experimental Biology 2019dkNET Poster Experimental Biology 2019
dkNET Poster Experimental Biology 2019dkNET
 
Laurie Goodman: Overcoming Hurdles to Data Publication
Laurie Goodman: Overcoming Hurdles to Data PublicationLaurie Goodman: Overcoming Hurdles to Data Publication
Laurie Goodman: Overcoming Hurdles to Data PublicationGigaScience, BGI Hong Kong
 
Scott Edmunds: Revolutionizing Data Dissemination: GigaScience
Scott Edmunds: Revolutionizing Data Dissemination: GigaScienceScott Edmunds: Revolutionizing Data Dissemination: GigaScience
Scott Edmunds: Revolutionizing Data Dissemination: GigaScienceGigaScience, BGI Hong Kong
 
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...Merce Crosas
 
Laurie Goodman at #aibsdata: Beyond Data Release Mandates - Helping Authors M...
Laurie Goodman at #aibsdata: Beyond Data Release Mandates - Helping Authors M...Laurie Goodman at #aibsdata: Beyond Data Release Mandates - Helping Authors M...
Laurie Goodman at #aibsdata: Beyond Data Release Mandates - Helping Authors M...GigaScience, BGI Hong Kong
 
Why study Data Sharing? (+ why share your data)
Why study Data Sharing?  (+ why share your data)Why study Data Sharing?  (+ why share your data)
Why study Data Sharing? (+ why share your data)Heather Piwowar
 
dkNET Poster ENDO 2019
dkNET Poster ENDO 2019dkNET Poster ENDO 2019
dkNET Poster ENDO 2019dkNET
 
Building an NIH Data Catalog: Bit by Bit
Building an NIH Data Catalog: Bit by BitBuilding an NIH Data Catalog: Bit by Bit
Building an NIH Data Catalog: Bit by Bitreadkev
 
2016 ACS Semantic Approaches for Biochemical Knowledge Discovery
2016 ACS Semantic Approaches for Biochemical Knowledge Discovery2016 ACS Semantic Approaches for Biochemical Knowledge Discovery
2016 ACS Semantic Approaches for Biochemical Knowledge DiscoveryMichel Dumontier
 
W3C HCLS Dataset Description Guidelines
W3C HCLS Dataset Description GuidelinesW3C HCLS Dataset Description Guidelines
W3C HCLS Dataset Description GuidelinesMichel Dumontier
 
GigaScience: a new resource for the big-data community.
GigaScience: a new resource for the big-data community.GigaScience: a new resource for the big-data community.
GigaScience: a new resource for the big-data community.GigaScience, BGI Hong Kong
 

Tendances (19)

Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...
Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...
Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...
 
Making it Easier, Possibly Even Pleasant, to Author Rich Experimental Metadata
Making it Easier, Possibly Even Pleasant, to Author Rich Experimental MetadataMaking it Easier, Possibly Even Pleasant, to Author Rich Experimental Metadata
Making it Easier, Possibly Even Pleasant, to Author Rich Experimental Metadata
 
The DataTags System: Sharing Sensitive Data with Confidence
The DataTags System: Sharing Sensitive Data with ConfidenceThe DataTags System: Sharing Sensitive Data with Confidence
The DataTags System: Sharing Sensitive Data with Confidence
 
dkNET Poster Experimental Biology 2019
dkNET Poster Experimental Biology 2019dkNET Poster Experimental Biology 2019
dkNET Poster Experimental Biology 2019
 
Laurie Goodman: Overcoming Hurdles to Data Publication
Laurie Goodman: Overcoming Hurdles to Data PublicationLaurie Goodman: Overcoming Hurdles to Data Publication
Laurie Goodman: Overcoming Hurdles to Data Publication
 
Scott Edmunds: Revolutionizing Data Dissemination: GigaScience
Scott Edmunds: Revolutionizing Data Dissemination: GigaScienceScott Edmunds: Revolutionizing Data Dissemination: GigaScience
Scott Edmunds: Revolutionizing Data Dissemination: GigaScience
 
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...
Addressing the New Challenges in Data Sharing: Large-Scale Data and Sensitive...
 
Embracing Semantic Technology for Better Metadata Authoring in Biomedicine (S...
Embracing Semantic Technology for Better Metadata Authoring in Biomedicine (S...Embracing Semantic Technology for Better Metadata Authoring in Biomedicine (S...
Embracing Semantic Technology for Better Metadata Authoring in Biomedicine (S...
 
FAIR data and the Etsin service
FAIR data and the Etsin serviceFAIR data and the Etsin service
FAIR data and the Etsin service
 
Metadata in the BioSample Online Repository are Impaired by Numerous Anomalie...
Metadata in the BioSample Online Repository are Impaired by Numerous Anomalie...Metadata in the BioSample Online Repository are Impaired by Numerous Anomalie...
Metadata in the BioSample Online Repository are Impaired by Numerous Anomalie...
 
Laurie Goodman at #aibsdata: Beyond Data Release Mandates - Helping Authors M...
Laurie Goodman at #aibsdata: Beyond Data Release Mandates - Helping Authors M...Laurie Goodman at #aibsdata: Beyond Data Release Mandates - Helping Authors M...
Laurie Goodman at #aibsdata: Beyond Data Release Mandates - Helping Authors M...
 
Why study Data Sharing? (+ why share your data)
Why study Data Sharing?  (+ why share your data)Why study Data Sharing?  (+ why share your data)
Why study Data Sharing? (+ why share your data)
 
dkNET Poster ENDO 2019
dkNET Poster ENDO 2019dkNET Poster ENDO 2019
dkNET Poster ENDO 2019
 
Building an NIH Data Catalog: Bit by Bit
Building an NIH Data Catalog: Bit by BitBuilding an NIH Data Catalog: Bit by Bit
Building an NIH Data Catalog: Bit by Bit
 
2016 ACS Semantic Approaches for Biochemical Knowledge Discovery
2016 ACS Semantic Approaches for Biochemical Knowledge Discovery2016 ACS Semantic Approaches for Biochemical Knowledge Discovery
2016 ACS Semantic Approaches for Biochemical Knowledge Discovery
 
STI Summit 2011 - LS4 LS Khaos
STI Summit 2011 - LS4 LS KhaosSTI Summit 2011 - LS4 LS Khaos
STI Summit 2011 - LS4 LS Khaos
 
The CEDAR Workbench: An Ontology-Assisted Environment for Authoring Metadata ...
The CEDAR Workbench: An Ontology-Assisted Environment for Authoring Metadata ...The CEDAR Workbench: An Ontology-Assisted Environment for Authoring Metadata ...
The CEDAR Workbench: An Ontology-Assisted Environment for Authoring Metadata ...
 
W3C HCLS Dataset Description Guidelines
W3C HCLS Dataset Description GuidelinesW3C HCLS Dataset Description Guidelines
W3C HCLS Dataset Description Guidelines
 
GigaScience: a new resource for the big-data community.
GigaScience: a new resource for the big-data community.GigaScience: a new resource for the big-data community.
GigaScience: a new resource for the big-data community.
 

En vedette

Data as research output, data as part of the scholarly record
Data as research output, data as part of the scholarly recordData as research output, data as part of the scholarly record
Data as research output, data as part of the scholarly recordTodd Vision
 
Uso de Dspace en la Universidad de Los Andes, Venezuela
Uso de Dspace en la Universidad de Los Andes, VenezuelaUso de Dspace en la Universidad de Los Andes, Venezuela
Uso de Dspace en la Universidad de Los Andes, VenezuelaRodrigo Torrens
 
Panorama Actual del Acceso Abierto en Latinoamerica
Panorama Actual del Acceso Abierto en LatinoamericaPanorama Actual del Acceso Abierto en Latinoamerica
Panorama Actual del Acceso Abierto en LatinoamericaRodrigo Torrens
 
Repositorios de Datos para comunidades científicas. Caso Comunidad LAGO
Repositorios de Datos para comunidades científicas. Caso Comunidad LAGORepositorios de Datos para comunidades científicas. Caso Comunidad LAGO
Repositorios de Datos para comunidades científicas. Caso Comunidad LAGORodrigo Torrens
 
Dspace: Herramienta de apoyo para la implementación de un Repositorio Institu...
Dspace: Herramienta de apoyo para la implementación de un Repositorio Institu...Dspace: Herramienta de apoyo para la implementación de un Repositorio Institu...
Dspace: Herramienta de apoyo para la implementación de un Repositorio Institu...AMBACienSalud
 

En vedette (9)

Data as research output, data as part of the scholarly record
Data as research output, data as part of the scholarly recordData as research output, data as part of the scholarly record
Data as research output, data as part of the scholarly record
 
Repositorio Institucional para el manejo de Investigaciones de la UNAN-Manag...
 Repositorio Institucional para el manejo de Investigaciones de la UNAN-Manag... Repositorio Institucional para el manejo de Investigaciones de la UNAN-Manag...
Repositorio Institucional para el manejo de Investigaciones de la UNAN-Manag...
 
Uso de Dspace en la Universidad de Los Andes, Venezuela
Uso de Dspace en la Universidad de Los Andes, VenezuelaUso de Dspace en la Universidad de Los Andes, Venezuela
Uso de Dspace en la Universidad de Los Andes, Venezuela
 
Virtualizacion nueva estrategia utilizacion evaluacion dspace [modo de compat...
Virtualizacion nueva estrategia utilizacion evaluacion dspace [modo de compat...Virtualizacion nueva estrategia utilizacion evaluacion dspace [modo de compat...
Virtualizacion nueva estrategia utilizacion evaluacion dspace [modo de compat...
 
Panorama Actual del Acceso Abierto en Latinoamerica
Panorama Actual del Acceso Abierto en LatinoamericaPanorama Actual del Acceso Abierto en Latinoamerica
Panorama Actual del Acceso Abierto en Latinoamerica
 
Repositorios de Datos para comunidades científicas. Caso Comunidad LAGO
Repositorios de Datos para comunidades científicas. Caso Comunidad LAGORepositorios de Datos para comunidades científicas. Caso Comunidad LAGO
Repositorios de Datos para comunidades científicas. Caso Comunidad LAGO
 
Dspace: Herramienta de apoyo para la implementación de un Repositorio Institu...
Dspace: Herramienta de apoyo para la implementación de un Repositorio Institu...Dspace: Herramienta de apoyo para la implementación de un Repositorio Institu...
Dspace: Herramienta de apoyo para la implementación de un Repositorio Institu...
 
Control de integridad y calidad en repositorios DSpace
Control de integridad y calidad en repositorios DSpaceControl de integridad y calidad en repositorios DSpace
Control de integridad y calidad en repositorios DSpace
 
Körpersprache
KörperspracheKörpersprache
Körpersprache
 

Similaire à The Dryad Digital Repository: Published evolutionary data as part of the greater data ecosystem

The Dryad Digital Repository: Published data as part of the greater data ecos...
The Dryad Digital Repository: Published data as part of the greater data ecos...The Dryad Digital Repository: Published data as part of the greater data ecos...
The Dryad Digital Repository: Published data as part of the greater data ecos...Hilmar Lapp
 
Public Data Archiving in Ecology and Evolution: How well are we doing?
Public Data Archiving in Ecology and Evolution: How well are we doing?Public Data Archiving in Ecology and Evolution: How well are we doing?
Public Data Archiving in Ecology and Evolution: How well are we doing?Sandra Binning
 
The blessing and the curse: handshaking between general and specialist data r...
The blessing and the curse: handshaking between general and specialist data r...The blessing and the curse: handshaking between general and specialist data r...
The blessing and the curse: handshaking between general and specialist data r...Hilmar Lapp
 
Managing, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital EnvironmentManaging, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital Environmentphilipdurbin
 
Open Access Week - Oxford, 20-24 Oct 2014
Open Access Week - Oxford, 20-24 Oct 2014Open Access Week - Oxford, 20-24 Oct 2014
Open Access Week - Oxford, 20-24 Oct 2014Susanna-Assunta Sansone
 
NIH Data Sharing Plan Workshop - Handout
NIH Data Sharing Plan Workshop - HandoutNIH Data Sharing Plan Workshop - Handout
NIH Data Sharing Plan Workshop - HandoutIUPUI
 
Publishing perspectives on data management & future directions
Publishing perspectives on data management & future directionsPublishing perspectives on data management & future directions
Publishing perspectives on data management & future directionsARDC
 
FAIR for the future: embracing all things data
FAIR for the future: embracing all things dataFAIR for the future: embracing all things data
FAIR for the future: embracing all things dataARDC
 
BioMed Central's open data initiatives
BioMed Central's open data initiativesBioMed Central's open data initiatives
BioMed Central's open data initiativesiainh_z
 
Small Science: First Impressions of Curation Needs. Presentation at Digital L...
Small Science: First Impressions of Curation Needs. Presentation at Digital L...Small Science: First Impressions of Curation Needs. Presentation at Digital L...
Small Science: First Impressions of Curation Needs. Presentation at Digital L...Sarah Shreeves
 
A Big Picture in Research Data Management
A Big Picture in Research Data ManagementA Big Picture in Research Data Management
A Big Picture in Research Data ManagementCarole Goble
 
Human Genome and Big Data Challenges
Human Genome and Big Data ChallengesHuman Genome and Big Data Challenges
Human Genome and Big Data ChallengesPhilip Bourne
 
Linking Data to Publications through Citation and Virtual Archives
Linking Data to Publications through Citation and Virtual ArchivesLinking Data to Publications through Citation and Virtual Archives
Linking Data to Publications through Citation and Virtual ArchivesMicah Altman
 
Curation of Research Data
Curation of Research DataCuration of Research Data
Curation of Research DataMichael Day
 
Research Data Management and Sharing for the Social Sciences and Humanities
Research Data Management and Sharing for the Social Sciences and HumanitiesResearch Data Management and Sharing for the Social Sciences and Humanities
Research Data Management and Sharing for the Social Sciences and HumanitiesRebekah Cummings
 
Metadata for Data Rescue and Data at Risk
Metadata for Data Rescue and Data at RiskMetadata for Data Rescue and Data at Risk
Metadata for Data Rescue and Data at RiskNico Carver
 
Laurie Goodman at NDIC: Big Data Publishing, Handling & Reuse
Laurie Goodman at NDIC: Big Data Publishing, Handling & ReuseLaurie Goodman at NDIC: Big Data Publishing, Handling & Reuse
Laurie Goodman at NDIC: Big Data Publishing, Handling & ReuseGigaScience, BGI Hong Kong
 

Similaire à The Dryad Digital Repository: Published evolutionary data as part of the greater data ecosystem (20)

The Dryad Digital Repository: Published data as part of the greater data ecos...
The Dryad Digital Repository: Published data as part of the greater data ecos...The Dryad Digital Repository: Published data as part of the greater data ecos...
The Dryad Digital Repository: Published data as part of the greater data ecos...
 
Research data life cycle
Research data life cycleResearch data life cycle
Research data life cycle
 
Public Data Archiving in Ecology and Evolution: How well are we doing?
Public Data Archiving in Ecology and Evolution: How well are we doing?Public Data Archiving in Ecology and Evolution: How well are we doing?
Public Data Archiving in Ecology and Evolution: How well are we doing?
 
The blessing and the curse: handshaking between general and specialist data r...
The blessing and the curse: handshaking between general and specialist data r...The blessing and the curse: handshaking between general and specialist data r...
The blessing and the curse: handshaking between general and specialist data r...
 
Managing, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital EnvironmentManaging, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital Environment
 
Introduction to Data Management and Sharing
Introduction to Data Management and SharingIntroduction to Data Management and Sharing
Introduction to Data Management and Sharing
 
Open Access Week - Oxford, 20-24 Oct 2014
Open Access Week - Oxford, 20-24 Oct 2014Open Access Week - Oxford, 20-24 Oct 2014
Open Access Week - Oxford, 20-24 Oct 2014
 
NIH Data Sharing Plan Workshop - Handout
NIH Data Sharing Plan Workshop - HandoutNIH Data Sharing Plan Workshop - Handout
NIH Data Sharing Plan Workshop - Handout
 
Publishing perspectives on data management & future directions
Publishing perspectives on data management & future directionsPublishing perspectives on data management & future directions
Publishing perspectives on data management & future directions
 
FAIR for the future: embracing all things data
FAIR for the future: embracing all things dataFAIR for the future: embracing all things data
FAIR for the future: embracing all things data
 
BioMed Central's open data initiatives
BioMed Central's open data initiativesBioMed Central's open data initiatives
BioMed Central's open data initiatives
 
Small Science: First Impressions of Curation Needs. Presentation at Digital L...
Small Science: First Impressions of Curation Needs. Presentation at Digital L...Small Science: First Impressions of Curation Needs. Presentation at Digital L...
Small Science: First Impressions of Curation Needs. Presentation at Digital L...
 
Open Science Governance and Regulation/Simon Hodson
Open Science Governance and Regulation/Simon HodsonOpen Science Governance and Regulation/Simon Hodson
Open Science Governance and Regulation/Simon Hodson
 
A Big Picture in Research Data Management
A Big Picture in Research Data ManagementA Big Picture in Research Data Management
A Big Picture in Research Data Management
 
Human Genome and Big Data Challenges
Human Genome and Big Data ChallengesHuman Genome and Big Data Challenges
Human Genome and Big Data Challenges
 
Linking Data to Publications through Citation and Virtual Archives
Linking Data to Publications through Citation and Virtual ArchivesLinking Data to Publications through Citation and Virtual Archives
Linking Data to Publications through Citation and Virtual Archives
 
Curation of Research Data
Curation of Research DataCuration of Research Data
Curation of Research Data
 
Research Data Management and Sharing for the Social Sciences and Humanities
Research Data Management and Sharing for the Social Sciences and HumanitiesResearch Data Management and Sharing for the Social Sciences and Humanities
Research Data Management and Sharing for the Social Sciences and Humanities
 
Metadata for Data Rescue and Data at Risk
Metadata for Data Rescue and Data at RiskMetadata for Data Rescue and Data at Risk
Metadata for Data Rescue and Data at Risk
 
Laurie Goodman at NDIC: Big Data Publishing, Handling & Reuse
Laurie Goodman at NDIC: Big Data Publishing, Handling & ReuseLaurie Goodman at NDIC: Big Data Publishing, Handling & Reuse
Laurie Goodman at NDIC: Big Data Publishing, Handling & Reuse
 

Dernier

Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Manik S Magar
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????blackmambaettijean
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 

Dernier (20)

Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 

The Dryad Digital Repository: Published evolutionary data as part of the greater data ecosystem

  • 1. The Dryad Digital Repository:Published evolutionary data as part of the greater data ecosystem Todd Vision (UNC-CH/NESCent) UNC-CH <MRC> Sarah Carrier Elena Feinstein Jane Greenberg Hollie White NESCent Kevin Clarke Hilmar Lapp Heather Piwowar Peggy Schaeffer Ryan Scherle Kristin Antelman (NCSU) Bill Michener (UNM/DataONE) Bill Piel (Yale) Funding: NSF, IMLS
  • 2. http://datadryad.org Functional goals To publish the data reported in the biological literature. To promote the reuse of the data. Organizational goals Shared governance by a consortium of journals. Responsible long-term stewardship.
  • 4. Use and reuse of archived data in evolutionary biology n=27 articles from 5 journals
  • 5. Sharing data on request is not effective Wicherts et al (2006) requested from from 141 articles in the field of psychology. “6 months later, after … 400 emails, [sending] detailed descriptions of our study aims, approvals of our ethical committee, signed assurances not to share data with others, and even our full resumes…” only 27% of authors complied In a survey among geneticists by Campbell et al. (2002) the most frequent reason for withholding data was the effort required to share it (80%). 28% were unable confirm others published research because of data withholding.
  • 6. Archiving at publication is effective The point in time when authors are most prepared to archive their data. No opportunity for loss, corruption, etc., of data files Publication can be both carrot and stick. The “GenBank model” is uniquely successful.
  • 7. Further incentives to authors Increases impact of one’s own work A quid pro quo for access to others’ data Guarantees data preservation Ad hoc data sharing is a burden
  • 8.
  • 9. Evoldir survey March 2008n=414 “Do you think the data underlying published scientific results should be made publicly accessible?” Yes: 395 (95.4%) No: 19 (4.6%) “If yes, do you think journals should require data sharing of their authors, or should it be voluntary?” Required: 220 (55.6%) Voluntary: 176 (44.4%)
  • 10. Joint Data Archiving Policy Data are important products of the scientific enterprise, and they should be preserved and usable for decades in the future. [This journal] requires, as a condition for publication, that data supporting the results in thearticle should be deposited in an appropriate public archive. Authors may elect to … embargo access to the data for a period up to a year after publication. Exceptions may be granted at the discretion of the editor, especially for sensitive information such as human subject data or the location of endangered species. Whitlock, M. C., M. A. McPeek, M. D. Rausher, L. Rieseberg, and A. J. Moore. 2010. Data Archiving. American Naturalist. 175(2):145-146. DOI:10.1086/650340
  • 11. That’s all well and good, but where’s this “appropriate public archive”?
  • 12. Potential archiving solutions Author-managed websites Avoids some of the hazards of informal sharing, but is fragile. Specialized databases (e.g. GenBank, TreeBase) Will cover some datatypes well, some not at all; High quality data, but with greater submission burden; May have issues with sustainability. Supplementary materials online Publisher provides basic infrastructure, but with low level of service. Shared public archive (e.g. Dryad) Permanent identifiers (DOIs) and trackable data citations; Explicit terms (CCZero) for reuse; No paywall to access; Searchable across publishers & repositories; Metadata enhanced for discoverability; Support for standard APIs; Commitment to preservation perpetuity, incl. migration of formats; Files updatable; Support for embargoes, etc.
  • 13. Dryad is a digital librarynot a traditional bioinformatic database
  • 14. Repository priorities Integration Sharing Discovery Preservation
  • 17.
  • 18. Lessons from Dryad (so far) The importance of journals in data publication. The value of a shared public repository to promotion of data reuse. The delicate balance of benefit and burden to data authors. The need to break down data silos. Achieving long-term data preservation by achieving long-term organizational sustainability.
  • 19. To learn more: http://blog.datadryad.org http://datadryad.org/wiki dryad-users@nescent.org Follow us on Facebook & Twitter