A Journal’s Perspective on Data
                 Standards and Biocuration
                                                     Alexandra Basford, PhD



www.gigasciencejournal.com
Overview
  Introduction
  Data Publishing
  Our DOI Adventures
  The Curation Challenges of a Journal/Database
    - Reproducibility/Reuse
    - Utility/Usability
    - Standards/Searchability/Sharing
How do we deal with “big data”?
What is GigaScience?
www.gigasciencejournal.com
GigaScience is a new open-access, open-data journal for the publication of all types of
biological studies that use or create large-scale data sets

The scope spans the biomedical and life sciences,
including:
      - “Omics”          - Ecology
      - Imaging          - Medicine
      - Neuroscience     - Systems biology

                       … “big and sharable”
       Published by
                         in partnership with
Editorial Board – International

Stephan Beck, UK               Stephen O'Brien, USA
Alvis Brazma, UK               Hanchuan Peng, USA
Ann-Shyn Chiang, Taiwan        Russell Poldrack, USA
Richard Durbin, UK             Ming Qi, China/USA
Paul Flicek, UK                Susanna-Assunta Sansone, UK
Robert Hanner, Canada          Michael Schatz, USA
Yoshihide Hayashizaki, Japan   David Schwartz, USA
Henning Hermjakob, UK          Fritz Sommer, USA
Wolfgang Huber, Germany        Lincoln Stein, Canada
Gary King, USA                 Sumio Sugano, Japan
Tin-Lap Lee, Hong Kong         Thomas Wachtler, Germany
Donald Moerman, Canada         Jun Wang, China
Karen Nelson, USA              Alistair Young, New Zealand
Francis Ouellette, Canada      Zang Yufeng, China
Lennart Hammarström, Sweden    Marie Zins, France
Paul Horton, Japan
Editorial Board – Multidisciplinary

Stephan Beck, Epigenomics              Stephen O'Brien, Genomics
Alvis Brazma, Transcriptomics          Hanchuan Peng, Imaging/Neuro
Ann-Shyn Chiang, Neuroscience          Russell Poldrack, Neuroscience
Richard Durbin, Genetics/Genomics      Ming Qi, Genetics
Paul Flicek, Genomics                  Susanna-Assunta Sansone, Standards
Robert Hanner, DNA Barcoding/Ecology   Michael Schatz, Cloud Computing
Yoshihide Hayashizaki, Genomics        David Schwartz, Optical Mapping
Henning Hermjakob, Proteomics          Fritz Sommer, Neuroscience
Wolfgang Huber, Functional Genomics    Lincoln Stein, Cloud Computing
Gary King, Medicine                    Sumio Sugano, Genomics
Tin-Lap Lee, Genomics                  Thomas Wachtler, Neuroscience
Donald Moerman, Functional Genomics    Jun Wang, Genomics
Karen Nelson, Metagenomics             Alistair Young, Medical Imaging
Francis Ouellette, Genomics            Zang Yufeng, Neuroscience
Lennart Hammarström, Immuno/Genetics   Marie Zins, Medicine
Paul Horton, Genetics/Tools
Now accepting submissions
What is GigaDB?
www.GigaDB.org
An Unusual Format
• GigaScience combines standard manuscript
  publication with an ever-expanding database
• Evolving data repository
   – Integrating tools for public access, viewing, and analysis of
     the stored data
   – Improvements driven by community input
• All datasets are assigned digital object
  identifiers (DOIs) to make them easy to access, track,
  and cite
Data Sharing Hurdles
• Technical
   – data volumes too large
   – data too heterogeneous
   – no home for many data types
• Economic
   – too expensive
   – no long-term funding
• Cultural
   – inertia
   – no incentives to share
   – unaware of how
   – too time-consuming
Changing Trends

Cultural shift towards data sharing.

   Growing/widening user base.

      The long tail of new “big-data” producers?

            Curation, curation, curation
Use of Data = Importance (subjective?) + Usability (easier to assess)
Challenges for a Journal/Database
  Reproducibility/Reuse
  Utility/Usability
  Standards/Searchability/Sharing
  Data publishing/DOI
Why DOI®s?
• Guarantee of permanency
• Clear method for data tracking and data citation,
  allowing:
   – Increased searchability (and hopefully use) of data
   – Credit for data production, making it clear who produced
     the data and when
   – Credit to original authors for their data’s use
   – The ability to track and receive feedback on data usage
   – A data citation metric potentially rivaling and
     complementary to the impact factor
   – The potential to make the data available and receive credit
     for it earlier, then later publish papers on the dataset
Largest Sequencing Capacity in the World

Sequencers
  137   Illumina/HiSeq 2000
  27    LifeTech/SOLiD 4
  16    AB/3730xl + 110 MegaBACEs
  2     Illumina iScan

Data Production
  5.6 Tb / day
  > 1500X of human genome / day

Multiple Supercomputing Centers
  157 TFlops
  20 TB    Memory
  12.6 PB  Storage
BGI – “Sequence it.”
Early BGI DOI®s
Datasets

Invertebrates
  Ant
  - Florida carpenter ant
  - Jerdon’s jumping ant
  - Leaf-cutter ant
  Roundworm
  Silkworm

Human
  Asian individual (YH)
  - DNA Methylome
  - Genome Assembly
  - Transcriptome
  Ancient DNA (coming soon)
  - Saqqaq Eskimo
  - Aboriginal Australian

Vertebrates
  Giant panda
  Macaque
  - Chinese rhesus
  - Crab-eating
  Naked mole rat
  Penguin
  - Emperor penguin
  - Adelie penguin
  Pigeon, domestic
  Polar bear
  Sheep
  Tibetan antelope

Microbe
  E. coli O104:H4 TY-2482

Cell Line
  Chinese Hamster Ovary

Plants
  Chinese cabbage
  Cucumber
  Foxtail millet
  Pigeonpea
  Potato
  Sorghum
The Success of E. coli
Our First DOI®


To maximize its utility to the research community and aid those fighting the current
epidemic, genomic data is released here into the public domain under a CC0
license. Until the publication of research papers on the assembly and whole-genome
analysis of this isolate we would ask you to cite this dataset as:

Li, D; Xi, F; Zhao, M; Liang, Y; Chen, W; Cao, S; Xu, R; Wang, G; Wang, J; Zhang,
Z; Li, Y; Cui, Y; Chang, C; Cui, C; Luo, Y; Qin, J; Li, S; Li, J; Peng, Y; Pu, F; Sun,
Y; Chen, Y; Zong, Y; Ma, X; Yang, X; Cen, Z; Zhao, X; Chen, F; Yin, X; Song, Y;
Rohde, H; Li, Y; Wang, J; Wang, J and the Escherichia coli O104:H4 TY-2482
isolate genome sequencing consortium (2011)
Genomic data from Escherichia coli O104:H4 isolate TY-2482. BGI Shenzhen.
doi:10.5524/100001
http://dx.doi.org/10.5524/100001

To the extent possible under law, BGI Shenzhen has waived all copyright and related or neighboring
rights to Genomic Data from the 2011 E. coli outbreak. This work is published from: China.
N Engl J Med 2011; 365:718-724.
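
The dataset citation above gives the DOI in two forms, a bare "doi:" label and a resolver URL. As a minimal sketch of how the two relate, the Python below normalizes either form to the bare DOI and rebuilds the resolver URL. The function names are hypothetical and are not part of GigaDB or any DOI library; the resolver prefix is the one shown on the slide.

```python
from urllib.parse import quote

# Resolver prefix exactly as it appears in the citation on the slide.
DOI_RESOLVER = "http://dx.doi.org/"

# Prefixes that may precede a bare DOI; the https form is an assumed extra.
KNOWN_PREFIXES = ("doi:", "http://dx.doi.org/", "https://doi.org/")


def normalize_doi(raw: str) -> str:
    """Strip whitespace and any leading 'doi:' label or resolver prefix."""
    doi = raw.strip()
    for prefix in KNOWN_PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi


def doi_to_url(raw: str) -> str:
    """Build a resolver URL, percent-encoding everything except '/'."""
    return DOI_RESOLVER + quote(normalize_doi(raw), safe="/")


print(doi_to_url("doi:10.5524/100001"))
```

Both `doi_to_url("doi:10.5524/100001")` and `doi_to_url("http://dx.doi.org/10.5524/100001")` yield the same resolver URL, which is the property that lets a data citation be tracked regardless of how it was written.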
The Macaque Story
Analysis paper published
Data DOIs appear in the paper
Sorghum as the New Gold Standard
• Data also submitted to NCBI (including SV data
  to dbVar)
• Submission to public databases complemented
  by its citable form in GigaDB:
     - Assemblies of three strains   - Raw data
     - SNPs                          - InDels
     - CNVs                          - SV
In the paper…
In the references…
Is the DOI.
Progress!
  July:      We begin issuing data DOIs
  August:    Journals accept articles with data that have data DOIs
  October:   Data DOIs listed in journal articles
  November:  Data DOIs are properly cited in the reference section of journal articles
(It’s been a busy year.)
Challenges for a Journal/Database
  Reproducibility/Reuse
  Utility/Usability
  Standards/Searchability/Sharing
  ✔ Data publishing/DOI
Reproducibility/Reuse
             • BGI Cloud Computing resources for
               handling and analyzing large-scale data.
             • Integrated tools to promote more
               widespread access, viewing, and analysis
               of data.
             • Encourage and aid use of workflow
               systems for methods (e.g. submission of
               Galaxy XML files).
Utility/Usability = ease of access
          • Special series/hub for cloud-based tools
             - Technical notes: test tools in the BGI-Cloud.
             - Tools + test data (BGI or user) in one place.
             - Aids reproducibility.
             - Aids reviewers (free)
             - Aids authors: visibility (PubMed, etc.),
                              hosting (included/free offers)
             - Contact us: editorial@gigasciencejournal.com
                                                                   Oledoe flickr cc
Utility/Usability = tools




                            Tin-Lap Lee, CUHK
Standards/Searchability/Sharing
             • ISA-Tab compatibility to aid and promote
               best practice in metadata reporting.
             • All supporting data must be publicly
               available.
             • Ask for MIBBI compliance and use of
               reporting checklists.
             • Part of the BioSharing network and the
               International Neuroinformatics
               Coordinating Facility.
Big Data
• Initiated 505 plant and animal genome projects
• Completed fine or draft genome maps for over 100 species
• Finished the sequencing of about 200 species
ldl.genomics.cn
Editor-in-Chief: Laurie Goodman, PhD
 Editor: Scott Edmunds, PhD
 Assistant Editor: Alexandra Basford, PhD

 Contact: editorial@gigasciencejournal.com
Follow GigaScience on Twitter @GigaScience

  www.gigasciencejournal.com
  www.GigaDB.org

Alexandra Basford, InCoB 2011: A Journal’s Perspective on Data Standards and Biocuration

Scott Edmunds: Data Dissemination: Difficulties, Data Citation, DOI's (and Gi...
 
Scott Edmunds: Data Dissemination: Difficulties, Data Citation, DOI's (and Gi...
Scott Edmunds: Data Dissemination: Difficulties, Data Citation, DOI's (and Gi...Scott Edmunds: Data Dissemination: Difficulties, Data Citation, DOI's (and Gi...
Scott Edmunds: Data Dissemination: Difficulties, Data Citation, DOI's (and Gi...
 
Research Objects for FAIRer Science
Research Objects for FAIRer Science Research Objects for FAIRer Science
Research Objects for FAIRer Science
 
Some Ideas on Making Research Data: "It's the Metadata, stupid!"
Some Ideas on Making Research Data: "It's the Metadata, stupid!"Some Ideas on Making Research Data: "It's the Metadata, stupid!"
Some Ideas on Making Research Data: "It's the Metadata, stupid!"
 
2021-01-27--biodiversity-informatics-gbif-(52slides)
2021-01-27--biodiversity-informatics-gbif-(52slides)2021-01-27--biodiversity-informatics-gbif-(52slides)
2021-01-27--biodiversity-informatics-gbif-(52slides)
 
A Few Simple Things Authors Can Do to Make Their Data More Discoverable and R...
A Few Simple Things Authors Can Do to Make Their Data More Discoverable and R...A Few Simple Things Authors Can Do to Make Their Data More Discoverable and R...
A Few Simple Things Authors Can Do to Make Their Data More Discoverable and R...
 
Data Publishing in Archaeozoology
Data Publishing in ArchaeozoologyData Publishing in Archaeozoology
Data Publishing in Archaeozoology
 

Alexandra Basford, InCoB 2011: A Journal’s Perspective on Data Standards and Biocuration

  • 1. A Journal’s Perspective on Data Standards and Biocuration. Alexandra Basford, PhD. www.gigasciencejournal.com
  • 2. Overview: / Introduction · Data Publishing · Our DOI Adventures · The Curation Challenges of a Journal/Database (Reproducibility/Reuse; Utility/Usability; Standards/Searchability/Sharing)
  • 3. Overview: / Introduction · How do we deal with “big data”? · Data Publishing · Our DOI Adventures · The Curation Challenges of a Journal/Database (Reproducibility/Reuse; Utility/Usability; Standards/Searchability/Sharing)
  • 4. vs. ?
  • 6. www.gigasciencejournal.com
  • 7. GigaScience is a new open-access, open-data journal for the publication of all types of biological studies that use or create large-scale data sets. The scope spans the biomedical and life sciences, including: - “Omics” - Ecology - Imaging - Medicine - Neuroscience - Systems biology … “big and sharable” Published by in partnership with
  • 8. Editorial Board – International: Stephan Beck, UK; Stephen O'Brien, USA; Alvis Brazma, UK; Hanchuan Peng, USA; Ann-Shyn Chiang, Taiwan; Russell Poldrack, USA; Richard Durbin, UK; Ming Qi, China/USA; Paul Flicek, UK; Susanna-Assunta Sansone, UK; Robert Hanner, Canada; Michael Schatz, USA; Yoshihide Hayashizaki, Japan; David Schwartz, USA; Henning Hermjakob, UK; Fritz Sommer, USA; Wolfgang Huber, Germany; Lincoln Stein, Canada; Gary King, USA; Sumio Sugano, Japan; Tin-Lap Lee, Hong Kong; Thomas Wachtler, Germany; Donald Moerman, Canada; Jun Wang, China; Karen Nelson, USA; Alistair Young, New Zealand; Francis Ouellette, Canada; Zang Yufeng, China; Lennart Hammarström, Sweden; Marie Zins, France; Paul Horton, Japan
  • 9. Editorial Board – Multidisciplinary: Stephan Beck, Epigenomics; Stephen O'Brien, Genomics; Alvis Brazma, Transcriptomics; Hanchuan Peng, Imaging/Neuro; Ann-Shyn Chiang, Neuroscience; Russell Poldrack, Neuroscience; Richard Durbin, Genetics/Genomics; Ming Qi, Genetics; Paul Flicek, Genomics; Susanna-Assunta Sansone, Standards; Robert Hanner, DNA Barcoding/Ecology; Michael Schatz, Cloud Computing; Yoshihide Hayashizaki, Genomics; David Schwartz, Optical Mapping; Henning Hermjakob, Proteomics; Fritz Sommer, Neuroscience; Wolfgang Huber, Functional Genomics; Lincoln Stein, Cloud Computing; Gary King, Medicine; Sumio Sugano, Genomics; Tin-Lap Lee, Genomics; Thomas Wachtler, Neuroscience; Donald Moerman, Functional Genomics; Jun Wang, Genomics; Karen Nelson, Metagenomics; Alistair Young, Medical Imaging; Francis Ouellette, Genomics; Zang Yufeng, Neuroscience; Lennart Hammarström, Immuno/Genetics; Marie Zins, Medicine; Paul Horton, Genetics/Tools
  • 11. What is ?
  • 12. www.GigaDB.org
  • 14. An Unusual Format • GigaScience combines standard manuscript publication with an ever-expanding database • Evolving data repository – Integrating tools for public access, viewing, and analysis of the stored data – Improvements driven by community input • All datasets are assigned data digital object identifiers (DOIs) to make them easy to access, track, and cite
  • 15. Data Sharing Hurdles • Technical – volumes too large – too heterogeneous – no home for many data types • Economic – too expensive – no long-term funding • Cultural – inertia – no incentives to share – unaware of how? – too time-consuming
  • 16. Changing Trends: Cultural shift towards data sharing. Growing/widening user base. The long tail of new “big-data” producers? Curation, curation, curation?
  • 17. Use of Data = Importance (subjective?) + Usability (easier to assess)
  • 18. Challenges for a Journal/Database: Reproducibility/Reuse · Utility/Usability · Standards/Searchability/Sharing · Data publishing/DOI DOI®
  • 19. Why DOI®s? • Guarantee of permanency • Clear method for data tracking and data citation, allowing: – Increased searchability (and hopefully use) of data – Credit for data production, making it clear who produced the data and when – Credit to original authors for their data’s use – The ability to track and receive feedback on data usage – A data citation metric potentially rivaling and complementing the impact factor – The potential to make the data available and receive credit for it earlier, then later publish papers on the dataset
  • 20. Largest Sequencing Capacity in the World • Sequencers: 137 Illumina/HiSeq 2000; 27 LifeTech/SOLiD 4; 16 AB/3730xl + 110 MegaBACEs; 2 Illumina iScan • Data Production: 5.6 Tb / day; > 1500X of human genome / day • Multiple Supercomputing Centers: 157 TB Flops; 20 TB Memory; 12.6 PB Storage
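As a rough sanity check on the throughput figures on this slide, the daily output can be converted to fold coverage of a human genome. This is only a sketch under stated assumptions: "Tb" is read as terabases, and the genome size of about 3.1 billion bases is an assumed round figure that does not appear on the slide. The function name is hypothetical.

```python
# Rough check of the "> 1500X of human genome / day" figure.
# Assumptions (not from the slide): "Tb" means terabases, and a
# haploid human genome is taken as ~3.1e9 bases.
TERABASE = 10**12
ASSUMED_HUMAN_GENOME_BASES = 3.1e9


def daily_fold_coverage(terabases_per_day: float,
                        genome_bases: float = ASSUMED_HUMAN_GENOME_BASES) -> float:
    """Return how many times over one genome could be sequenced per day."""
    return terabases_per_day * TERABASE / genome_bases


coverage = daily_fold_coverage(5.6)
print(f"about {coverage:.0f}X human genome coverage per day")
```

Under these assumptions the result lands above 1500X, which is consistent with the slide's claim.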
  • 23. Datasets • Vertebrates: Giant panda; Macaque (Chinese rhesus, Crab-eating); Naked mole rat; Penguin (Emperor penguin, Adelie penguin); Pigeon, domestic; Polar bear; Sheep; Tibetan antelope • Invertebrates: Ant (Florida carpenter ant, Jerdon’s jumping ant, Leaf-cutter ant); Roundworm; Silkworm • Plants: Chinese cabbage; Cucumber; Foxtail millet; Pigeonpea; Potato; Sorghum • Human: Asian individual (YH) (DNA Methylome, Genome Assembly, Transcriptome); Ancient DNA (coming soon) (Saqqaq Eskimo, Aboriginal Australian) • Microbe: E. coli O104:H4 TY-2482 • Cell Line: Chinese Hamster Ovary
  • 24. The Success of E. coli
  • 25. Our First DOI® To maximize its utility to the research community and aid those fighting the current epidemic, genomic data is released here into the public domain under a CC0 license. Until the publication of research papers on the assembly and whole-genome analysis of this isolate we would ask you to cite this dataset as: Li, D; Xi, F; Zhao, M; Liang, Y; Chen, W; Cao, S; Xu, R; Wang, G; Wang, J; Zhang, Z; Li, Y; Cui, Y; Chang, C; Cui, C; Luo, Y; Qin, J; Li, S; Li, J; Peng, Y; Pu, F; Sun, Y; Chen, Y; Zong, Y; Ma, X; Yang, X; Cen, Z; Zhao, X; Chen, F; Yin, X; Song, Y; Rohde, H; Li, Y; Wang, J; Wang, J and the Escherichia coli O104:H4 TY-2482 isolate genome sequencing consortium (2011) Genomic data from Escherichia coli O104:H4 isolate TY-2482. BGI Shenzhen. doi:10.5524/100001 http://dx.doi.org/10.5524/100001 To the extent possible under law, BGI Shenzhen has waived all copyright and related or neighboring rights to Genomic Data from the 2011 E. coli outbreak. This work is published from: China.
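The citation above gives the same identifier twice, once as `doi:10.5524/100001` and once as a resolver link. A minimal sketch of that mapping follows. The function name `doi_to_url` is hypothetical, the resolver prefix is the one shown on the slide, and the validation is a deliberately loose assumption (a `10.` prefix, a slash, and a non-empty suffix), not a full DOI syntax check.

```python
def doi_to_url(doi: str, resolver: str = "http://dx.doi.org/") -> str:
    """Turn 'doi:10.5524/100001' or '10.5524/100001' into a resolver URL.

    Hypothetical helper for illustration; validation is intentionally loose.
    """
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    prefix, sep, suffix = doi.partition("/")
    if not (prefix.startswith("10.") and sep and suffix):
        raise ValueError(f"not a recognisable DOI: {doi!r}")
    return resolver + doi


url = doi_to_url("doi:10.5524/100001")
print(url)
```

For the dataset DOI on this slide, the helper reproduces the link printed in the citation.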
  • 26.
  • 27.
  • 28. N Engl J Med 2011; 365:718-724.
  • 31. Data DOIs appear in the paper
  • 32. Sorghum as the New Gold Standard
  • 33.
  • 34. • Data also submitted to NCBI (including SV data to dbVar) • Submission to public databases complemented by its citable form in GigaDB: - Assemblies of three strains - Raw data - SNPs - InDels - CNVs - SV
  • 38. Progress! • July: We begin issuing data DOIs • August: Journals accept articles with data that have data DOIs • October: Data DOIs listed in journal articles • November: Data DOIs are properly cited in the reference section of journal articles (It’s been a busy year.)
  • 39. Challenges for a Journal/Database Reproducibility/Reuse Utility/Usability Standards/Searchability/Sharing Data publishing/DOI DOI®
  • 40. Challenges for / Reproducibility/Reuse Utility/Usability Standards/Searchability/Sharing ✔Data publishing/DOI DOI®
  • 41. Reproducibility/Reuse • BGI Cloud Computing resources for handling and analyzing large-scale data. • Integrated tools to promote more widespread access, viewing, and analysis of data. • Encourage and aid use of workflow systems for methods (e.g. submission of Galaxy XML files).
  • 42. Utility/Usability = ease of access • Special series/hub for cloud-based tools - Technical notes: test tools in the BGI-Cloud. - Tools + test data (BGI or user) in one place. - Aids reproducibility. - Aids reviewers (free). - Aids authors: visibility (PubMed, etc.), hosting (included/free offers) – contact us: editorial@gigasciencejournal.com Oledoe flickr cc
  • 43. Utility/Usability = tools Tin-Lap Lee, CUHK
  • 44. Standards/Searchability/Sharing • ISA-Tab compatibility to aid and promote best practice in metadata reporting. • All supporting data must be publicly available. • Ask for MIBBI compliance and use of reporting checklists. • Part of the Biosharing network and the International Neuroinformatics Coordinating Facility.
  • 45. Big Data • Initiated 505 plant and animal genome projects • Completed fine or draft genome maps for over 100 species • Finished the sequencing of about 200 species ldl.genomics.cn
  • 46. Editor-in-Chief: Laurie Goodman, PhD Editor: Scott Edmunds, PhD Assistant Editor: Alexandra Basford, PhD Contact: editorial@gigasciencejournal.com Follow GigaScience on Twitter @GigaScience www.gigasciencejournal.com www.gigaDB.org
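
The citation on slide 25 gives the dataset both as a DOI (doi:10.5524/100001) and as a resolver link (http://dx.doi.org/10.5524/100001). A minimal sketch of how one maps to the other, assuming a DOI is a "10."-style prefix and a suffix separated by a slash. The function names `split_doi` and `doi_to_url` are hypothetical and are not part of GigaDB or any DOI library.

```python
from typing import Tuple
from urllib.parse import quote

# Resolver base exactly as written on slide 25.
RESOLVER = "http://dx.doi.org/"


def split_doi(doi: str) -> Tuple[str, str]:
    """Split a DOI into (prefix, suffix), tolerating a leading 'doi:' label."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    prefix, sep, suffix = doi.partition("/")
    # Assumption: a usable DOI has a prefix starting with '10.' and a non-empty suffix.
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a DOI: {doi!r}")
    return prefix, suffix


def doi_to_url(doi: str) -> str:
    """Build the resolver link for a DOI, percent-encoding unsafe suffix characters."""
    prefix, suffix = split_doi(doi)
    return RESOLVER + prefix + "/" + quote(suffix, safe="/")


print(doi_to_url("doi:10.5524/100001"))
```

Because the link is derived from the DOI alone, the dataset can move to a different host without the citation changing, which is the "guarantee of permanency" point on slide 19.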

Editor's notes

  1. Integrated tools to promote more widespread access, viewing, and analysis of the stored data. BGI Cloud Computing resources for handling and analyzing large-scale data. All Data given a DOI to allow ease of finding and citing datasets, as well as for citation tracking.
  2. Our facilities feature Sanger and next-generation sequencing technologies, providing the highest throughput sequencing capacity in the world. Powered by 137 Illumina HiSeq 2000 instruments and 27 Applied Biosystems SOLiD™ 4 Systems, we provide high-quality sequencing results with industry-leading turnaround time. As of December 2010, our sequencing capacity is 5 Tb raw data per day, supported by several supercomputing centers with a total peak performance up to 102 Tflops, 20 TB of memory, and 10 PB storage. We provide stable and efficient resources to store and analyze massive amounts of data generated by next-generation sequencing.
  3. Not shown: 1,000 Mendelian Disorders Project, Autism Sequencing Project, Netherlands sequencing…
  4. Assemblies and raw data are still going to NCBI.
  5. Raw data has been submitted to the SRA, the assembly submitted to GenBank (no number), SV data to dbVar (it’s the first plant data they’ve received). GigaDB complements the traditional public databases by having all these “extra” data types: it’s all in one place, and it’s citable.
  6. Integrated tools to promote more widespread access, viewing, and analysis of the stored data. BGI Cloud Computing resources for handling and analyzing large-scale data. All Data given a DOI to allow ease of finding and citing datasets, as well as for citation tracking.
  7. Integrated tools to promote more widespread access, viewing, and analysis of the stored data. BGI Cloud Computing resources for handling and analyzing large-scale data. All Data given a DOI to allow ease of finding and citing datasets, as well as for citation tracking.
  8. Have all of the metadata fields, working on integrating the tools.
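
Slide 20 equates 5.6 Tb / day with "> 1500X of human genome / day". A rough check of that arithmetic, assuming "Tb" means terabases and taking the haploid human genome as about 3.2 billion bases. The genome size is an assumption that does not appear in the slides, and `fold_coverage_per_day` is a hypothetical helper name.

```python
# Assumption (not from the slides): approximate haploid human genome size in bases.
HUMAN_GENOME_BASES = 3.2e9


def fold_coverage_per_day(bases_per_day: float,
                          genome_bases: float = HUMAN_GENOME_BASES) -> float:
    """Return how many times over the genome could be sequenced per day."""
    return bases_per_day / genome_bases


slide_rate = 5.6e12   # 5.6 Tb / day, from slide 20
notes_rate = 5.0e12   # 5 Tb / day as of December 2010, from editor's note 2

print(fold_coverage_per_day(slide_rate))
print(fold_coverage_per_day(notes_rate))
```

Under this assumption both the slide figure and the slightly older figure in the notes come out above 1500-fold, consistent with the claim on the slide.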