Ways and Needs to Promote 
Rapid Data Sharing 
Laurie Goodman, PhD 
Editor-in-Chief GigaScience 
ORCID ID: 0000-0001-9724-5976
Scientific Communication 
Via Publication 
• Scholarly articles are merely advertisement of scholarship. 
The actual scholarly artefacts, i.e. the data and 
computational methods, which support the 
scholarship, remain largely inaccessible --- Jon B. 
Buckheit and David L. Donoho, WaveLab and reproducible 
research, 1995 
• Core scientific statements or assertions are intertwined and 
hidden in the conventional scholarly narratives 
• Lack of transparency, lack of credit for anything other than 
“regular” dead tree publication
A Tale of Two Bacteria 
1. On May 2, 2011, German doctors reported the first case of an 
E. coli infection that was accompanied by hemolytic-uremic 
syndrome 
2. On May 21, 2011, the first death from this bacterium occurred 
(denoted E. coli O104:H4) 
3. On June 3, 2011, BGI completed a draft sequence of E. coli 
O104:H4 from a sample provided by doctors at the University 
Medical Centre Hamburg-Eppendorf 
4. At this point, the leaders at BGI discussed 
whether to release the sequence data immediately, and what 
the potential repercussions of doing so would be 
The question arose: 
If the data were released now, would it affect 
their ability to publish later?
A Tale of Two Bacteria 
• In one world, the researchers (who were concerned about their 
ability to publish, as this is the way to obtain recognition and 
grants, which are essential for their work) waited. 
The first publication appeared on July 29th 
• In another world, the researchers (who decided public health 
was more important than obtaining a publication) released the 
data immediately. 
The first publication appeared on July 29th, but was not 
from the group that released the data (though information on 
that data was included).
Whether the concern about the ability to publish 
if data are released early is real or imagined, 
researchers act on that concern
These data were put on an FTP 
server under a CC0 waiver and also 
given a DOI to make access 
‘permanent’ 
To maximize its utility to the research community and aid those fighting 
the current epidemic, genomic data is released here into the public domain 
under a CC0 license. Until the publication of research papers on the 
assembly and whole-genome analysis of this isolate we would ask you to 
cite this dataset as: 
Li, D; Xi, F; Zhao, M; Liang, Y; Chen, W; Cao, S; Xu, R; Wang, G; Wang, 
J; Zhang, Z; Li, Y; Cui, Y; Chang, C; Cui, C; Luo, Y; Qin, J; Li, S; Li, J; 
Peng, Y; Pu, F; Sun, Y; Chen, Y; Zong, Y; Ma, X; Yang, X; Cen, Z; Zhao, X; 
Chen, F; Yin, X; Song, Y; Rohde, H; Li, Y; Wang, J; Wang, J and the 
Escherichia coli O104:H4 TY-2482 isolate genome sequencing consortium 
(2011) 
Genomic data from Escherichia coli O104:H4 isolate TY-2482. BGI 
Shenzhen. doi:10.5524/100001 
http://dx.doi.org/10.5524/100001 
To the extent possible under law, BGI Shenzhen has waived all copyright and related or neighboring rights to 
Genomic Data from the 2011 E. coli outbreak. This work is published from: China.
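The citation requested above follows a creators, year, title, publisher, identifier pattern. A minimal Python sketch of assembling such a string is given here. The function names `format_data_citation` and `doi_to_url` are hypothetical, the author list is truncated to the first three names for brevity, and the `https://doi.org/` resolver prefix is assumed in place of the `http://dx.doi.org/` form shown on the slide.

```python
def format_data_citation(creators, year, title, publisher, doi):
    """Build a dataset citation: Creators (Year) Title. Publisher. doi:DOI"""
    authors = "; ".join(creators)
    return f"{authors} ({year}) {title}. {publisher}. doi:{doi}"


def doi_to_url(doi):
    """Turn a bare DOI into a resolvable link (resolver prefix is an assumption)."""
    return "https://doi.org/" + doi


# Values taken from the slide; only the first three authors are listed here.
example = format_data_citation(
    creators=["Li, D", "Xi, F", "Zhao, M"],
    year=2011,
    title="Genomic data from Escherichia coli O104:H4 isolate TY-2482",
    publisher="BGI Shenzhen",
    doi="10.5524/100001",
)
print(example)
print(doi_to_url("10.5524/100001"))
```

Keeping the DOI as a separate field, rather than storing only a landing-page URL, means the link can be rebuilt if the hosting platform changes.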
Downstream consequences: 
1. Citations (~180) 2. Therapeutics (primers, antimicrobials) 3. Platform Comparisons 
4. Example for faster & more open science 
“Last summer, biologist Andrew Kasarskis was eager to help decipher the genetic origin of the Escherichia coli 
strain that infected roughly 4,000 people in Germany between May and July. But he knew that it might take days 
for the lawyers at his company — Pacific Biosciences — to parse the agreements governing how his team could 
use data collected on the strain. Luckily, one team had released its data under a Creative Commons licence that 
allowed free use of the data, allowing Kasarskis and his colleagues to join the international research effort and 
publish their work without wasting time on legal wrangling.”
1.3 The power of intelligently open data 
The benefits of intelligently open data were powerfully 
illustrated by events following an outbreak of a severe gastro-intestinal 
infection in Hamburg in Germany in May 2011. This 
spread through several European countries and the US, 
affecting about 4000 people and resulting in over 50 deaths. All 
tested positive for an unusual and little-known Shiga-toxin– 
producing E. coli bacterium. The strain was initially analysed by 
scientists at BGI-Shenzhen in China, working together with 
those in Hamburg, and three days later a draft genome was 
released under an open data licence. This generated interest 
from bioinformaticians on four continents. 24 hours after the 
release of the genome it had been assembled. Within a week 
two dozen reports had been filed on an open-source site 
dedicated to the analysis of the strain. These analyses 
provided crucial information about the strain’s virulence and 
resistance genes – how it spreads and which antibiotics are 
effective against it. They produced results in time to help 
contain the outbreak. By July 2011, scientists published papers 
based on this work. By opening up their early sequencing 
results to international collaboration, researchers in Hamburg 
produced results that were quickly tested by a wide range of 
experts, used to produce new knowledge and ultimately to 
control a public health emergency.
All that aside 
Can we all agree that releasing the E.coli data 
ahead of publication was ‘good’ 
At least from a public health perspective 
Here are the numbers for the E.coli 2011 Outbreak 
In total, ~4000 people were infected and 53 died
From a Public Health perspective…Deaths 
Worldwide* 
Infectious Disease 
Measles: 122,000 per year 
Hepatitis C-related liver disease: 350,000-500,000 per year 
Malaria: 627,000 per year 
HIV/AIDS: 1.4-1.7 million per year 
Non-communicable, with genetic predisposition 
Prostate cancer: 307,000 per year 
Breast cancer: 522,000 per year 
Suicide: 800,000 per year 
Diabetes: 1.5 million per year 
Cancer: 8.2 million per year 
Cardiovascular Disease: 17.5 million per year 
Non-genetic/Non-infectious 
Pesticide Poisoning: 250,000 per year 
Malnutrition: 2.8 million children (under 5) per year 
*World Health Organization Fact Sheets http://www.who.int/en/
Sharing Data is Essential for Many 
Reasons
Sharing aids fields… 
Rice vs. wheat: consequences of publicly available genome data 
[Bar chart comparing rice and wheat; vertical axis from 0 to 700] 
Every 10 datasets collected contribute to at least 4 papers in the 
following 3 years. 
Piwowar, HA, Vision, TJ, & Whitlock, MC (2011). Data archiving is a good investment Nature, 473 
(7347), 285-285 DOI: 10.1038/473285a
Sharing aids authors… 
Sharing Detailed Research 
Data Is Associated with 
Increased Citation Rate. 
Piwowar HA, Day RS, Fridsma DB (2007) 
PLoS ONE 2(3): e308. 
doi:10.1371/journal.pone.0000308
Lack of Sharing Impacts Reproducibility 
Out of 18 microarray papers, results 
from 10 could not be reproduced 
1. Ioannidis et al., (2009). Repeatability of published microarray gene expression analyses. Nature Genetics 41: 14 
2. Ioannidis JPA (2005) Why Most Published Research Findings Are False. PLoS Med 2(8)
Sharing can reduce retractions 
>15X increase in last decade 
Strong correlation of “retraction index” with 
higher impact factor 
At the current rate of increase, by 2045 
as many papers will be retracted as 
are published! 
1. Science publishing: The trouble with retractions http://www.nature.com/news/2011/111005/full/478026a.html 
2. Retracted Science and the Retraction Index http://iai.asm.org/content/79/10/3855.abstract?
Data Sharing Hurdles 
? 
If only it were easy… 
There are numerous reasons why researchers 
do not share data: 
The majority of which are good reasons
Wiley Researcher Data Insights Survey 
Our objective was to establish a baseline view of data sharing 
practices, attitudes, and motivations globally, with participation 
from researchers in every scholarly field. 
In March 2014, more than 90,000 researchers around the world 
were invited to participate in Wiley’s Researcher Data Insights 
Survey. Participants were researchers who had published at least 
one journal article in the past year with any publisher. 
We received an overwhelming 2,886 responses from around the 
world. 
Slide from Catherine Giffi, Director, Strategic Market Analysis, Global Research, Wiley
Wiley Researcher Data Insights Survey 
Key Findings 
• Most researchers are sharing their data. 
• Those not sharing have a variety of reasons. 
• Data that’s being shared typically is <10 GB. 
• The most common type of data that is being 
shared is flat, tabular data (.csv, .txt, .xl) 
• Data is usually saved on hard drives. 
Slide from Catherine Giffi, Director, Strategic Market Analysis, Global Research, Wiley
Wiley Researcher Data Insights Survey 
Why Researchers Do Not Share 
• Intellectual property or confidentiality issues (59%) 
• Concerned research might be “scooped” (39%) 
• Concerns about misinterpretation or misuse (32%) 
• Concerns about attribution/citation credit (31%) 
• Ethical concerns (24%) 
• Insufficient time/resources (19%) 
• Funder/institution does not require sharing (13%) 
• Lack of funding (13%) 
• Not sure where to share (5%) 
• Not sure how to share (3%) 
Slide from Catherine Giffi, Director, Strategic Market Analysis, Global Research, Wiley 
See also: 
http://exchanges.wiley.com/blog/2014/11/03/how-and-why-researchers-share-data-and-why-they-dont/ 
http://scholarlykitchen.sspnet.org/2014/11/11/to-share-or-not-to-share-that-is-the-research-data-question/
How Can Publishers Promote Data Sharing 
Researchers are never so captive as when they are publishing 
But we need to help, not just harass. 
Carrots and Sticks 
And why us? 
– Create Journal Data Release Policies 
– Check Data Release Policy is followed 
– Find Ways to Aid Researchers in Releasing Data 
– Consider ways to support/protect researchers 
who do share ahead of publications 
– Promote Data Citation
Incentives/credit 
Credit where credit is overdue: 
“One option would be to provide researchers who release data to 
public repositories with a means of accreditation.” 
“An ability to search the literature for all online papers that used a 
particular data set would enable appropriate attribution for those 
who share.” 
Nature Biotechnology 27, 579 (2009) 
Prepublication data sharing 
(Toronto International Data Release Workshop) 
“Data producers benefit from creating 
a citable reference, as it can 
later be used to reflect impact of the data sets.” 
Nature 461, 168-170 (2009)
Genomics Data Sharing Policies… 
Bermuda Accords 1996/1997/1998: 
1. Automatic release of sequence assemblies within 24 hours. 
2. Immediate publication of finished annotated sequences. 
3. Aim to make the entire sequence freely available in the public domain for 
both research and development in order to maximise benefits to society. 
Fort Lauderdale Agreement, 2003: 
1. Sequence traces from whole genome shotgun projects are to be 
deposited in a trace archive within one week of production. 
2. Whole genome assemblies are to be deposited in a public nucleotide 
sequence database as soon as possible after the assembled sequence 
has met a set of quality evaluation criteria. 
Toronto International data release workshop, 2009: 
The goal was to reaffirm and refine, where needed, the policies related to 
the early release of genomic data, and to extend, if possible, similar data 
release policies to other types of large biological datasets – whether from 
proteomics, biobanking or metabolite research.
Sharing Data from Large-scale Biological Research Projects: A System of 
Tripartite Responsibility (From the Fort Lauderdale Meeting 2003) 
http://www.genome.gov/pages/research/wellcomereport0303.pdf
Citing Data Isn’t New 
The Physical Sciences have been doing this for a while 
DataCite and DOIs 
DataCite aims to: 
“increase acceptance of research data as 
legitimate, citable contributions to the 
scholarly record”. 
“data generated in the course of research 
are just as valuable to the ongoing 
academic discourse as papers and 
monographs”.
How We Envision Research Publication 
(Communicating Science) 
Open-access journal Data Publishing Platform 
Data Sets in 
GigaDB 
Analyses in 
GigaGalaxy 
Paper in 
GigaScience 
Data Analysis Platform
Other journals are now doing something similar 
This is most commonly done in the form of a Data Paper 
rather than a release of data that is citable in itself. 
• A Data Paper is effectively a description of the data 
• Other journals that do Data Publishing as a formal 
paper type 
• F1000 Research (launched in 2012) 
• Has Data papers as one of several types of papers 
• Scientific Data (launched in 2014) 
• Solely publishes Data Descriptors 
• There are more…
Making the Data Itself Citable 
We provide a linked database 
The data are then directly linked to the paper, but can also be cited 
separately through a Data DOI 
We can do this because we have a collaboration between BMC 
(which handles the standard paper publication) and BGI (which has 
enormous data storage capacity). 
However: there are many community-available databases, so in 
principle any journal can do this by taking advantage of such 
available resources. 
These include the usual suspects: EBI, NCBI, DDBJ etc. 
Databases that take all data types and provide Data DOIs: Dryad, 
FigShare, etc. 
There are also numerous smaller community databases specific to 
different fields or data types.
For data citation to work, needs: 
• Acceptance by journals. 
• Data+Citation: inclusion in the references. 
• Tracking by citation indexes. 
• Usage of the metrics by the community…
In Principle…
Back to E.coli O104:H4 
• As noted: articles on these early-released and 
citable data were published 
• Also, the early releasers were not the first to 
publish 
• Nor were the data cited
This open-source 
analysis work 
was published on 
August 25th
The journal did 
not approve of 
inclusion of the 
data citation. 
Nor was there any 
indication of 
where the 
genome 
information 
could be found
This report was the first to 
be published, and it 
included and used 
information from the 
crowd-sourced release as 
well as the other early 
release. 
Nowhere in the paper is 
there any indication of 
where to obtain these data 
Nor is there an indication 
of where to obtain the 
sequence data they 
generated
This group made 
their O104:H4 
sequence available 
in the NCBI database 
at the time of 
completion, prior 
to publication. 
However, no link to 
the Accession 
Number is easily 
found in the paper.
This report DID include a reference for the data 
(even though they did not use it in their analysis)
For data citation to work, needs: 
• Acceptance by journals. 
• Data+Citation: inclusion in the references. 
• Tracking by citation indexes. 
• Usage of the metrics by the community…
In Practice…
• Data submitted to NCBI databases: 
- Raw data SRA:SRA046843 
- Assemblies of 3 strains Genbank:AHAO00000000-AHAQ00000000 
- SNPs dbSNP:1056306 
- CNVs, InDels, SV dbVAR:nstd63 
• Submission to public databases complemented by 
its citable form in GigaDB (doi:10.5524/100012).
In the references…
Is the DOI…
In Practice… 
http://blogs.biomedcentral.com/gigablog/2014/05/14/the-latest-weapon-in-publishing-data-the-polar-bear/
The polar bear data were released prepublication in 2011 
They were used and cited in the following studies before the main paper on 
the sequencing was published 
Hailer, F et al., Nuclear genomic sequences reveal that polar bears are an old 
and distinct bear lineage. Science. 2012 Apr 20;336(6079):344-7. 
doi:10.1126/science.1216424. 
Cahill, JA et al., Genomic evidence for island population conversion resolves 
conflicting theories of polar bear evolution. PLoS Genet. 2013;9(3):e1003345. 
doi:10.1371/journal.pgen.1003345. 
Morgan, CC et al., Heterogeneous models place the root of the placental 
mammal phylogeny. Mol Biol Evol. 2013 Sep;30(9):2145-56. 
doi:10.1093/molbev/mst117. 
Cronin, MA et al., Molecular Phylogeny and SNP Variation of Polar Bears 
(Ursus maritimus), Brown Bears (U. arctos), and Black Bears (U. americanus) 
Derived from Genome Sequences. J Hered. 2014; 105(3):312-23. 
doi:10.1093/jhered/est133. 
Bidon, T et al., Brown and Polar Bear Y Chromosomes Reveal Extensive 
Male-Biased Gene Flow within Brother Lineages. Mol Biol Evol. 2014 Apr 4. 
doi:10.1093/molbev/msu109
Cell Press Journals
However, this didn’t include the citation…
One step forward — two steps back
Removing data citations from the 
references 
One journal informed the authors that non-reviewed material could 
not be cited in the references of the paper 
Another journal stripped the data citation from the references, and 
went an extra step and changed the citation in the Data Availability 
section to the URL that the DOI resolved to at that time 
We happened to know about this one, and were able to create a redirect to the 
DOI’d page when the URL broke after we moved our database platform 
Note: Much of this was due to a standard operating procedure in the 
production department 
Lesson: If you decide to include Data Citations, tell your entire team
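One way a production team might guard against this is a simple check that every expected data DOI still appears in the typeset reference list. The sketch below is only an illustration: the function names, the sample reference entries, and the `10.1234/example.5678` DOI are hypothetical, and the regular expression is a heuristic rather than an official DOI grammar. The data DOI `10.5524/100012` is the one quoted on the earlier slide.

```python
import re

# Heuristic DOI pattern (an assumption, not an official grammar).
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[^\s\"<>]+")


def find_dois(text):
    """Return DOIs found in text, lowercased, with trailing punctuation removed."""
    return [m.rstrip(".,;)").lower() for m in DOI_PATTERN.findall(text)]


def missing_data_citations(expected_dois, reference_list):
    """Return the expected data DOIs that appear in no reference entry."""
    cited = set()
    for entry in reference_list:
        cited.update(find_dois(entry))
    return [d for d in expected_dois if d.lower() not in cited]


# Hypothetical reference list as submitted by the authors.
submitted = [
    "Author A et al. Some article. Some Journal. 2014. doi:10.1234/example.5678.",
    "Author B et al. Some dataset. GigaScience Database. http://dx.doi.org/10.5524/100012",
]
# Hypothetical typeset version in which the data citation was stripped.
typeset = submitted[:1]

print(missing_data_citations(["10.5524/100012"], submitted))
print(missing_data_citations(["10.5524/100012"], typeset))
```

The first call reports nothing missing; the second flags the stripped data DOI, which is the situation described on this slide.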
For data citation to work, needs: 
• Acceptance by journals. 
• Data+Citation: inclusion in the references. 
• Tracking by citation indexes. 
• Usage of the metrics by the community… 
This is a work in progress…
Data Citation Really is a Major Incentive 
On Wednesday this week, we released the genome sequences 
from 3000 rice strains (13.4 TB of data) 
• These data were also deposited in NIH SRA repository 
• So why did we do it too? 
1. It is linked directly to the Data Paper that provides 
details of data production, quality, and basic analysis 
2. Authors were hesitant to release these data (a HUGE 
community resource) prior to the analysis paper 
publication (which, for 3000 strains… would take 
years…). The opportunity to have these data citable 
(and trackable) encouraged the authors and led to 
their releasing these data and doing so in 
collaboration with GigaScience’s Biocurator 
The 3,000 Rice Genomes Project. (2014) GigaScience 3:7 http://dx.doi.org/10.1186/2047-217X-3-7; 
The 3000 Rice Genomes Project (2014) GigaScience Database. http://dx.doi.org/10.5524/200001
No: your data is not too large to share 
Rice 3K project: 3,000 rice genomes, 13.4TB public data 
IRRI GALAXY
Beyond Data Citation 
Reviewing Data 
Data Release policies include the need to 
help authors 
Data availability without metadata is 
practically useless
Beyond Data Citation 
Reviewing Data 
It’s too hard- we can’t ask our reviewers 
to do that! 
Use Data Reviewers
Example in Neuroscience 
1. Neuroscience Data 
are not typically 
shared 
2. For most papers: Data 
AND Tools are not 
typically made 
available to the 
reviewers 
3. Journal Editors think 
Reviewers will not 
want to review data 
GigaScience 2014, 3:3 doi:10.1186/2047-217X-3-3
Example in Neuroscience 
• Neuroscience Data are not typically shared 
• Author Dr. Stephen Eglen said: “One way of encouraging neuroscientists to 
share their data is to provide some form of academic credit.” 
• We hosted with a DOI: 366 recordings from 12 electrophysiology datasets 
• GigaDB is included in Thomson Reuters Data Citation Index 
• Data AND Tools are not typically made available to the reviewers 
• We made manuscript, data and tools all available to the reviewers. 
• We make sure to include reviewers who are able to properly assess the data 
itself and rerun the tools 
• To reduce the burden, we sometimes select a reviewer who ONLY looks at the 
data. 
• Journal Editors think Reviewers will not want to review data 
• What Reviewer Dr. Thomas Wachtler said: “The paper by Eglen and 
colleagues is a shining example of openness in that it enables replicating the 
results almost as easily as by pressing a button.” 
• What Reviewer Dr. Christophe Pouzat said: “In addition to making the 
presented research trustworthy, the reproducible research paradigm 
definitely makes the reviewers job more fun!”
Beyond Data Citation 
Data Release policies include the need to 
help authors 
Collaborations 
With data repositories 
With other journals
Consider Cross Journal Support 
Competition is good… 
….but sometimes we should collaborate 
for the community good 
• PLOS's recent data deposition policies have led to 
community concerns about feasibility. 
• We support (and applaud) this… we have an even stricter 
data deposition policy 
• But PLOS ONE received a submission that was a 
comparative study of earthworm morphology and 
anatomy using a 3D non-invasive imaging technique 
called micro-computed tomography (or microCT)… and 
there was no good place to put the data 
• These data are extremely complex: videos and multiple files, with 
several folders of ~10 GB
Consider Cross Journal Support 
• GigaScience and PLOS ONE collaborated. They published 
the main article; we published a Data Note describing the 
data itself and hosted all the data on GigaDB under 
separate citation. 
• With our Aspera connection, reviewers could download 
even the 10 GB folders in about half an hour 
• Reviewer Dr. Sarah Faulwetter noted the usefulness of 
having these data available, saying: “Instead of having to 
go through the lengthy process of obtaining the physical 
specimen from a museum, I can now download a fairly 
accurate representation from the web.” 
Lenihan et al (2014). GigaScience, 3:6 http://dx.doi.org/10.1186/2047-217X-3-6; Lenihan, et al (2014): GigaScience Database. 
http://dx.doi.org/10.5524/100092; Fernández et al (2014) PLOS ONE 9 (5) e96617 http://dx.doi.org/10.1371/journal.pone.0096617
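As a rough sanity check on the download time, the average throughput a reviewer needs can be worked out from the folder size and the time taken. This is a back-of-the-envelope sketch, assuming decimal units (1 GB = 10^9 bytes) and a folder of about 10 GB, the size given for these data; the function name `required_mbps` is hypothetical.

```python
def required_mbps(size_gb, minutes):
    """Average throughput in megabits per second to move size_gb in the given minutes."""
    bits = size_gb * 1e9 * 8           # decimal gigabytes to bits
    return bits / (minutes * 60) / 1e6  # bits per second to megabits per second


# A folder of about 10 GB in half an hour.
print(round(required_mbps(10, 30), 1))
# For comparison, 10 TB (10,000 GB) in the same half hour.
print(round(required_mbps(10_000, 30), 1))
```

Roughly 44 Mbit/s is needed for a 10 GB folder, which is within reach of an ordinary institutional connection; a 10 TB transfer in the same time would need about a thousand times that.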
Beyond Data Citation 
Data availability without metadata is 
practically useless 
Engage/Employ/Interact with Curators
Challenges for the future… 
1. Lack of interoperability/sufficient metadata 
2. Long tail of curation (“Democratization” of “big-data”) 
?
Think about what you do… and what you can do… 
• Promote, rather than inhibit, prepublication data sharing 
• Promote Data Citation in the reference section 
– incentivizes data release 
– Makes it easier for readers to find 
• Promote Data Sharing upon publication 
– Consider your data release policies 
• Form collaborations with repositories to aid authors in depositing 
their work 
– Identify community organizations with metadata standards 
• Make data available for reviewers (author website, community 
repositories, Dryad and similar, your publisher?) 
– at least do a sanity check 
– Use “data reviewers” 
No, this isn’t easy, but do what you can now 
And work toward the rest 
Evolve
It’s Time to Move Beyond 
Dead Trees 
1665 1812 1869
Thanks to: 
Scott Edmunds, Executive Editor 
Nicole Nogoy, Commissioning Editor 
Peter Li, Lead Data Manager 
Chris Hunter, Lead BioCurator 
Rob Davidson, Data Scientist 
Xiao (Jesse) Si Zhe, Database Developer 
Amye Kenall, Journal Development Manager 
Contact us: 
editorial@gigasciencejournal.com 
database@gigasciencejournal.com 
Follow us: 
@GigaScience 
facebook.com/GigaScience 
blogs.openaccesscentral.com/blogs/gigablog 
www.gigasciencejournal.com 
www.gigadb.org

Contenu connexe

Tendances

Scott Edmunds: GigaScience Datacite meeting Rapid Fire Talk
Scott Edmunds: GigaScience Datacite meeting Rapid Fire TalkScott Edmunds: GigaScience Datacite meeting Rapid Fire Talk
Scott Edmunds: GigaScience Datacite meeting Rapid Fire TalkGigaScience, BGI Hong Kong
 
Scott Edmunds, HKU Open Access Week: Experiences from the front-line of Open ...
Scott Edmunds, HKU Open Access Week: Experiences from the front-line of Open ...Scott Edmunds, HKU Open Access Week: Experiences from the front-line of Open ...
Scott Edmunds, HKU Open Access Week: Experiences from the front-line of Open ...GigaScience, BGI Hong Kong
 
Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...
Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...
Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...GigaScience, BGI Hong Kong
 
Roche_open_science_NIOO_KNAW_workshop_NL
Roche_open_science_NIOO_KNAW_workshop_NLRoche_open_science_NIOO_KNAW_workshop_NL
Roche_open_science_NIOO_KNAW_workshop_NLDominique Roche
 
E Research Chapter 1
E Research Chapter 1E Research Chapter 1
E Research Chapter 1guest2426e1d
 
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...Jonathan Tedds
 
Evolution of e-Research
Evolution of e-ResearchEvolution of e-Research
Evolution of e-ResearchDavid De Roure
 
Reproducibility, argument and data in translational medicine
Reproducibility, argument and data in translational medicineReproducibility, argument and data in translational medicine
Reproducibility, argument and data in translational medicineTim Clark
 
Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in ...
Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in ...Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in ...
Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in ...Scott Edmunds
 
Data for AI models, the past, the present, the future
Data for AI models, the past, the present, the futureData for AI models, the past, the present, the future
Data for AI models, the past, the present, the futurePistoia Alliance
 
BGI training lecture: Scott Edmunds - Science 2.0, why new developments on th...
BGI training lecture: Scott Edmunds - Science 2.0, why new developments on th...BGI training lecture: Scott Edmunds - Science 2.0, why new developments on th...
BGI training lecture: Scott Edmunds - Science 2.0, why new developments on th...Scott Edmunds
 
Thesis Proposal, as presented for dissertation proposal defense
Thesis Proposal, as presented for dissertation proposal defenseThesis Proposal, as presented for dissertation proposal defense
Thesis Proposal, as presented for dissertation proposal defenseHeather Piwowar
 
Laurie Goodman at #SSPBoston: Article+Data+Tools Reproducibility, Reuse, & Ra...
Laurie Goodman at #SSPBoston: Article+Data+ToolsReproducibility, Reuse, & Ra...Laurie Goodman at #SSPBoston: Article+Data+ToolsReproducibility, Reuse, & Ra...
Laurie Goodman at #SSPBoston: Article+Data+Tools Reproducibility, Reuse, & Ra...GigaScience, BGI Hong Kong
 
Nicole Nogoy: GigaScience...how licensing can change the way we do research
Nicole Nogoy: GigaScience...how licensing can change the way we do researchNicole Nogoy: GigaScience...how licensing can change the way we do research
Nicole Nogoy: GigaScience...how licensing can change the way we do researchGigaScience, BGI Hong Kong
 

Tendances (20)

Scott Edmunds: GigaScience Datacite meeting Rapid Fire Talk
Scott Edmunds: GigaScience Datacite meeting Rapid Fire TalkScott Edmunds: GigaScience Datacite meeting Rapid Fire Talk
Scott Edmunds: GigaScience Datacite meeting Rapid Fire Talk
 
Scott Edmunds, HKU Open Access Week: Experiences from the front-line of Open ...
Scott Edmunds, HKU Open Access Week: Experiences from the front-line of Open ...Scott Edmunds, HKU Open Access Week: Experiences from the front-line of Open ...
Scott Edmunds, HKU Open Access Week: Experiences from the front-line of Open ...
 
Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...
Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...
Nicole Nogoy's talk at eResearchNZ 2014: Improving data sharing, integration ...
 
Roche_open_science_NIOO_KNAW_workshop_NL
Roche_open_science_NIOO_KNAW_workshop_NLRoche_open_science_NIOO_KNAW_workshop_NL
Roche_open_science_NIOO_KNAW_workshop_NL
 
Nicole Nogoy at the Auckland BMC RoadShow
Nicole Nogoy at the Auckland BMC RoadShowNicole Nogoy at the Auckland BMC RoadShow
Nicole Nogoy at the Auckland BMC RoadShow
 
E Research Chapter 1
E Research Chapter 1E Research Chapter 1
E Research Chapter 1
 
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
 
Evolution of e-Research
Evolution of e-ResearchEvolution of e-Research
Evolution of e-Research
 
Reproducibility, argument and data in translational medicine
Reproducibility, argument and data in translational medicineReproducibility, argument and data in translational medicine
Reproducibility, argument and data in translational medicine
 
Scott Edmunds talk at ODHK.meet.26: Open Science Data = Open Data (a rant in ...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 

Rapid Data Sharing Promotes Public Health

  • 1. Ways and Needs to Promote Rapid Data Sharing Laurie Goodman, PhD Editor-in-Chief GigaScience ORCID ID: 0000-0001-9724-5976
  • 2. Scientific Communication Via Publication • Scholarly articles are merely advertisement of scholarship. The actual scholarly artefacts, i.e. the data and computational methods, which support the scholarship, remain largely inaccessible --- Jon B. Buckheit and David L. Donoho, WaveLab and reproducible research, 1995 • Core scientific statements or assertions are intertwined and hidden in the conventional scholarly narratives • Lack of transparency, lack of credit for anything other than “regular” dead tree publication
  • 3. A Tale of Two Bacteria 1. On May 2, 2011, German doctors reported the first case of an E. coli infection that was accompanied by hemolytic-uremic syndrome. 2. On May 21, 2011, the first death occurred from this bacterium (denoted E. coli O104:H4). 3. On June 3, 2011, BGI completed a draft sequence of E. coli O104:H4 from a sample provided by doctors at the University Medical Centre Hamburg-Eppendorf. 4. At this point, the leaders at BGI held a discussion about whether to release the sequence data immediately: what were the potential repercussions of doing so? The question arose: if the data were released now, would it affect their ability to publish later?
  • 4. A Tale of Two Bacteria • In one world, the researchers (who were concerned about their ability to publish, as this is the way to obtain recognition and grants, which are essential for them to work) waited. The first publication appeared on July 29th. • In another world, the researchers (who decided public health was more important than obtaining a publication) released the data immediately. The first publication appeared on July 29th, but was not from the group who released the data (though information on that data was included).
  • 5. Whether the concern about the ability to publish if data are released early is real or imagined, researchers act on that concern.
  • 7. These data were put on an FTP server under a CC0 waiver and also given a DOI to make access ‘permanent’. To maximize its utility to the research community and aid those fighting the current epidemic, genomic data is released here into the public domain under a CC0 license. Until the publication of research papers on the assembly and whole-genome analysis of this isolate we would ask you to cite this dataset as: Li, D; Xi, F; Zhao, M; Liang, Y; Chen, W; Cao, S; Xu, R; Wang, G; Wang, J; Zhang, Z; Li, Y; Cui, Y; Chang, C; Cui, C; Luo, Y; Qin, J; Li, S; Li, J; Peng, Y; Pu, F; Sun, Y; Chen, Y; Zong, Y; Ma, X; Yang, X; Cen, Z; Zhao, X; Chen, F; Yin, X; Song, Y; Rohde, H; Li, Y; Wang, J; Wang, J and the Escherichia coli O104:H4 TY-2482 isolate genome sequencing consortium (2011) Genomic data from Escherichia coli O104:H4 isolate TY-2482. BGI Shenzhen. doi:10.5524/100001 http://dx.doi.org/10.5524/100001 To the extent possible under law, BGI Shenzhen has waived all copyright and related or neighboring rights to Genomic Data from the 2011 E. coli outbreak. This work is published from: China.
  • 10. Downstream consequences: 1. Citations (~180) 2. Therapeutics (primers, antimicrobials) 3. Platform Comparisons 4. Example for faster & more open science “Last summer, biologist Andrew Kasarskis was eager to help decipher the genetic origin of the Escherichia coli strain that infected roughly 4,000 people in Germany between May and July. But he knew it might take days for the lawyers at his company, Pacific Biosciences, to parse the agreements governing how his team could use data collected on the strain. Luckily, one team had released its data under a Creative Commons licence that allowed free use of the data, allowing Kasarskis and his colleagues to join the international research effort and publish their work without wasting time on legal wrangling.”
  • 11. 1.3 The power of intelligently open data. The benefits of intelligently open data were powerfully illustrated by events following an outbreak of a severe gastro-intestinal infection in Hamburg in Germany in May 2011. This spread through several European countries and the US, affecting about 4000 people and resulting in over 50 deaths. All tested positive for an unusual and little-known Shiga-toxin–producing E. coli bacterium. The strain was initially analysed by scientists at BGI-Shenzhen in China, working together with those in Hamburg, and three days later a draft genome was released under an open data licence. This generated interest from bioinformaticians on four continents. 24 hours after the release of the genome it had been assembled. Within a week two dozen reports had been filed on an open-source site dedicated to the analysis of the strain. These analyses provided crucial information about the strain’s virulence and resistance genes – how it spreads and which antibiotics are effective against it. They produced results in time to help contain the outbreak. By July 2011, scientists published papers based on this work. By opening up their early sequencing results to international collaboration, researchers in Hamburg produced results that were quickly tested by a wide range of experts, used to produce new knowledge and ultimately to control a public health emergency.
  • 12. All that aside: can we all agree that releasing the E. coli data ahead of publication was ‘good’, at least from a public health perspective? Here are the numbers for the 2011 E. coli outbreak: in total, ~4000 people were infected and 53 died.
  • 13. From a Public Health perspective… Deaths Worldwide* • Infectious disease: Measles: 122,000 per year; Hepatitis C-related liver disease: 350,000-500,000 per year; Malaria: 627,000 per year; HIV/AIDS: 1.4-1.7 million per year • Non-communicable, with genetic predisposition: Prostate cancer: 307,000 per year; Breast cancer: 522,000 per year; Suicide: 800,000 per year; Diabetes: 1.5 million per year; Cancer: 8.2 million per year; Cardiovascular disease: 17.5 million per year • Non-genetic/non-infectious: Pesticide poisoning: 250,000 per year; Malnutrition: 2.8 million children (under 5) per year *World Health Organization Fact Sheets http://www.who.int/en/
  • 14. Sharing Data is Essential for Many Reasons
  • 15. Sharing aids fields… Rice v Wheat: consequences of publicly available genome data [Bar chart: rice vs. wheat, y-axis 0 to 700] Every 10 datasets collected contributes to at least 4 papers in the following 3 years. Piwowar, HA, Vision, TJ, & Whitlock, MC (2011). Data archiving is a good investment. Nature, 473 (7347), 285-285 DOI: 10.1038/473285a
  • 16. Sharing aids authors… Sharing Detailed Research Data Is Associated with Increased Citation Rate. Piwowar HA, Day RS, Fridsma DB (2007) PLoS ONE 2(3): e308. doi:10.1371/journal.pone.0000308
  • 17. Lack of Sharing Impacts Reproducibility Out of 18 microarray papers, results from 10 could not be reproduced 1. Ioannidis et al., (2009). Repeatability of published microarray gene expression analyses. Nature Genetics 41: 14 2. Ioannidis JPA (2005) Why Most Published Research Findings Are False. PLoS Med 2(8)
  • 18. Sharing can reduce retractions • >15X increase in the last decade • Strong correlation of “retraction index” with higher impact factor • At the current rate of increase, by 2045 as many papers will be retracted as published! 1. Science publishing: The trouble with retractions http://www.nature.com/news/2011/111005/full/478026a.html 2. Retracted Science and the Retraction Index http://iai.asm.org/content/79/10/3855.abstract?
  • 19. Data Sharing Hurdles? If only it were easy… There are numerous reasons why researchers do not share data, the majority of which are good reasons.
  • 20. Wiley Researcher Data Insights Survey Our objective was to establish a baseline view of data sharing practices, attitudes, and motivations globally, with participation from researchers in every scholarly field. In March 2014, more than 90,000 researchers around the world were invited to participate in Wiley’s Researcher Data Insights Survey. Participants were researchers who had published at least one journal article in the past year with any publisher. We received an overwhelming 2,886 responses from around the world. Slide from Catherine Giffi, Director, Strategic Market Analysis, Global Research, Wiley
  • 21. Wiley Researcher Data Insights Survey Key Findings • Most researchers are sharing their data. • Those not sharing have a variety of reasons. • Data that’s being shared typically is <10 GB. • The most common type of data that is being shared is flat, tabular data (.csv, .txt, .xl) • Data is usually saved on hard drives. Slide from Catherine Giffi, Director, Strategic Market Analysis, Global Research, Wiley
  • 22. Wiley Researcher Data Insights Survey Why Researchers Do Not Share • Intellectual property or confidentiality issues (59%) • Concerned research might be “scooped” (39%) • Concerns about misinterpretation or misuse (32%) • Concerns about attribution/citation credit (31%) • Ethical concerns (24%) • Insufficient time/resources (19%) • Funder/institution does not require sharing (13%) • Lack of funding (13%) • Not sure where to share (5%) • Not sure how to share (3%) Slide from Catherine Giffi, Director, Strategic Market Analysis, Global Research, Wiley See also: http://exchanges.wiley.com/blog/2014/11/03/how-and-why-researchers-share-data-and-why-they-dont/ http://scholarlykitchen.sspnet.org/2014/11/11/to-share-or-not-to-share-that-is-the-research-data-question/
  • 23. How Can Publishers Promote Data Sharing? Researchers are never so captive as when they are publishing. But we need to help, not just harass. Carrots and Sticks. And why us? – Create Journal Data Release Policies – Check Data Release Policy is followed – Find Ways to Aid Researchers in Releasing Data – Consider ways to support/protect researchers who do share ahead of publications – Promote Data Citation
  • 25. Incentives/credit • Credit where credit is overdue: “One option would be to provide researchers who release data to public repositories with a means of accreditation.” “An ability to search the literature for all online papers that used a particular data set would enable appropriate attribution for those who share.” Nature Biotechnology 27, 579 (2009) • Prepublication data sharing (Toronto International Data Release Workshop): “Data producers benefit from creating a citable reference, as it can later be used to reflect impact of the data sets.” Nature 461, 168-170 (2009)
  • 26. Genomics Data Sharing Policies… Bermuda Accords 1996/1997/1998: 1. Automatic release of sequence assemblies within 24 hours. 2. Immediate publication of finished annotated sequences. 3. Aim to make the entire sequence freely available in the public domain for both research and development in order to maximise benefits to society. Fort Lauderdale Agreement, 2003: 1. Sequence traces from whole genome shotgun projects are to be deposited in a trace archive within one week of production. 2. Whole genome assemblies are to be deposited in a public nucleotide sequence database as soon as possible after the assembled sequence has met a set of quality evaluation criteria. Toronto International data release workshop, 2009: The goal was to reaffirm and refine, where needed, the policies related to the early release of genomic data, and to extend, if possible, similar data release policies to other types of large biological datasets – whether from proteomics, biobanking or metabolite research.
  • 27. Sharing Data from Large-scale Biological Research Projects: A System of Tripartite Responsibility (From the Fort Lauderdale Meeting 2003) http://www.genome.gov/pages/research/wellcomereport0303.pdf
  • 28. Citing Data Isn’t New The Physical Sciences have been doing this for a while DataCite and DOIs “increase acceptance of research data as legitimate, citable contributions to the scholarly record”. Aims to: “data generated in the course of research are just as valuable to the ongoing academic discourse as papers and monographs”.
  • 29. How We Envision Research Publication (Communicating Science) Open-access journal Data Publishing Platform Data Sets in GigaDB Analyses in GigaGalaxy Paper in GigaScience Data Analysis Platform
  • 30. Other Journals are now doing similar. This is most commonly done in the form of a Data Paper rather than a release of data that is citable in itself. • A Data Paper is effectively a Description of the Data • Other journals that do Data Publishing as a formal paper type • F1000 Research (launched in 2012) • Has Data papers as one of several types of papers • Scientific Data (launched in 2014) • Solely publishes Data Descriptors • There are more…
  • 31. Making the Data Itself Citable We provide a linked database The data are then directly linked to the paper- but can also be cited separately through a Data DOI We can do this because we have a collaboration between BMC (who handles the standard paper publication) and BGI (which has enormous data storage capacity.) However: There are many community available databases- so in principle- any journal can do this by taking advantage of such available resources. These include the usual suspects: EBI, NCBI, DDBJ etc. Databases that take all data types and provide Data DOIs: Dryad, FigShare, etc. There are also numerous smaller community databases specific to different fields or data types.
  • 32. For data citation to work, needs: • Acceptance by journals. • Data+Citation: inclusion in the references. • Tracking by citation indexes. • Usage of the metrics by the community…
  • 33. For data citation to work, needs: • Acceptance by journals. • Data+Citation: inclusion in the references. • Tracking by citation indexes. • Usage of the metrics by the community…
  • 35. Back to E.coli O104:H4 • As noted: articles on these early released and citable data were published • Also- the early releasers were not the first to publish • Nor was the data cited
  • 36. This open-source analysis work was published on August 25th
  • 37. The journal did not approve of inclusion of the data citation. Nor was there any indication of where the genome information could be found.
  • 39. This report was the first to be published, and it included and used information from the crowd-source release as well as the other early release. Nowhere in the paper is there any indication of where to obtain this data. Nor is there an indication of where to obtain the sequence data they generated.
  • 40. This group made their O104:H4 sequence available in the NCBI database at the time of completion, prior to publication. However, no link to the Accession Number is easily found in the paper.
  • 41. This report DID include a reference for the data (even though they did not use it in their analysis)
  • 43. For data citation to work, needs: • Acceptance by journals. • Data+Citation: inclusion in the references. • Tracking by citation indexes. • Usage of the metrics by the community…
  • 45. • Data submitted to NCBI databases: - Raw data: SRA:SRA046843 - Assemblies of 3 strains: GenBank:AHAO00000000-AHAQ00000000 - SNPs: dbSNP:1056306 - CNVs, InDels, and SVs: dbVar:nstd63 • Submission to public databases complemented by its citable form in GigaDB (doi:10.5524/100012).
  • 52. The polar bear DATA was released, prepublication, in 2011. They were used and cited in the following studies before the main paper on the sequencing was published: Hailer, F et al., Nuclear genomic sequences reveal that polar bears are an old and distinct bear lineage. Science. 2012 Apr 20;336(6079):344-7. doi:10.1126/science.1216424. Cahill, JA et al., Genomic evidence for island population conversion resolves conflicting theories of polar bear evolution. PLoS Genet. 2013;9(3):e1003345. doi:10.1371/journal.pgen.1003345. Morgan, CC et al., Heterogeneous models place the root of the placental mammal phylogeny. Mol Biol Evol. 2013 Sep;30(9):2145-56. doi:10.1093/molbev/mst117. Cronin, MA et al., Molecular Phylogeny and SNP Variation of Polar Bears (Ursus maritimus), Brown Bears (U. arctos), and Black Bears (U. americanus) Derived from Genome Sequences. J Hered. 2014; 105(3):312-23. doi:10.1093/jhered/est133. Bidon, T et al., Brown and Polar Bear Y Chromosomes Reveal Extensive Male-Biased Gene Flow within Brother Lineages. Mol Biol Evol. 2014 Apr 4. doi:10.1093/molbev/msu109
  • 54. However, this didn’t include the citation…
  • 55. One step forward — two steps back
  • 56. Removing data citations from the references • One journal informed the authors that non-reviewed material could not be cited in the references of the paper. • Another journal stripped the data citation from the references, then went an extra step and changed the citation in the Data Availability section to the URL that the DOI resolved to at that time. We happened to know about this one, and were able to create a forward to the DOI’d page when the URL broke after we moved our database platform. Note: Much of this was due to a standard operating procedure in the production department. Lesson: If you decide to include Data Citations, tell your entire team.
  • 57. For data citation to work, needs: • Acceptance by journals. • Data+Citation: inclusion in the references. • Tracking by citation indexes. • Usage of the metrics by the community…
  • 59. For data citation to work, needs: • Acceptance by journals. • Data+Citation: inclusion in the references. • Tracking by citation indexes. • Usage of the metrics by the community… This is a work in progress…
  • 60. Data Citation Really is a Major Incentive. On Wednesday this week, we released the genome sequences of 3,000 rice strains (13.4 TB of data) • These data were also deposited in the NIH SRA repository • So why did we do it too? 1. It is linked directly to the Data Paper that provides details of data production, quality, and basic analysis 2. Authors were hesitant to release these data (a HUGE community resource) prior to the analysis paper publication (which, for 3,000 strains… would take years…). The opportunity to have these data citable (and trackable) encouraged the authors and led to their releasing these data, and doing so in collaboration with GigaScience’s Biocurator. The 3,000 Rice Genomes Project. (2014) GigaScience 3:7 http://dx.doi.org/10.1186/2047-217X-3-7; The 3000 Rice Genomes Project (2014) GigaScience Database. http://dx.doi.org/10.5524/200001
  • 61. No: your data is not too large to share Rice 3K project: 3,000 rice genomes, 13.4TB public data IRRI GALAXY
  • 62. Beyond Data Citation Reviewing Data Data Release policies include the need to help authors Data availability without metadata is practically useless
  • 63. Beyond Data Citation Reviewing Data It’s too hard- we can’t ask our reviewers to do that! Use Data Reviewers
  • 64. Example in Neuroscience 1. Neuroscience Data are not typically shared 2. For most papers: Data AND Tools are not typically made available to the reviewers 3. Journal Editors think Reviewers will not want to review data GigaScience 2014, 3:3 doi:10.1186/2047-217X-3-3
  • 65. Example in Neuroscience • Neuroscience Data are not typically shared • Author Dr. Stephen Eglen said: “One way of encouraging neuroscientists to share their data is to provide some form of academic credit.” • We hosted with a DOI: 366 recordings from 12 electrophysiology datasets • GigaDB is included in Thomson Reuters Data Citation Index • Data AND Tools are not typically made available to the reviewers • We made manuscript, data and tools all available to the reviewers. • We make sure to include reviewers who are able to properly assess the data itself and rerun the tools • To reduce burdens, we sometimes select a reviewer who ONLY looks at the data. • Journal Editors think Reviewers will not want to review data • What Reviewer Dr. Thomas Wachtler said: “The paper by Eglen and colleagues is a shining example of openness in that it enables replicating the results almost as easily as by pressing a button.” • What Reviewer Dr. Christophe Pouzat said: “In addition to making the presented research trustworthy, the reproducible research paradigm definitely makes the reviewer’s job more fun!”
  • 66. Beyond Data Citation Data Release policies include the need to help authors Collaborations With data repositories With other journals
  • 67. Consider Cross Journal Support. Competition is good… but sometimes we should collaborate for the community good • PLOS’s recent data deposition policies have led to community concerns about feasibility. • We support (and applaud) this… we have an even stricter data deposition policy • But PLOS ONE received a submission that was a comparative study of earthworm morphology and anatomy using a 3D non-invasive imaging technique called micro-computed tomography (or microCT)… and there is no good place to put this • These data are extremely complex: videos and multiple files, with several folders of ~10 GB
  • 68. Consider Cross Journal Support • GigaScience and PLOS ONE collaborated. They published the main article; we published a Data Note describing the data itself and hosted all the data on GigaDB under separate citation. • With our Aspera Connection, reviewers could download even the 10 GB folders in ~1/2 hour • Reviewer Dr. Sarah Faulwetter noted the usefulness of having these data available, saying: “Instead of having to go through the lengthy process of obtaining the physical specimen from a museum, I can now download a fairly accurate representation from the web.” Lenihan et al (2014). GigaScience, 3:6 http://dx.doi.org/10.1186/2047-217X-3-6; Lenihan, et al (2014): GigaScience Database. http://dx.doi.org/10.5524/100092; Fernández et al (2014) PLOS ONE 9 (5) e96617 http://dx.doi.org/10.1371/journal.pone.0096617
  • 69. Beyond Data Citation Data availability without metadata is practically useless Engage/Employ/Interact with Curators
  • 70. Challenges for the future… 1. Lack of interoperability/sufficient metadata 2. Long tail of curation (“Democratization” of “big-data”)
  • 71. Think about what you do… and what you can do… • Promote, rather than inhibit, prepublication data sharing • Promote Data Citation in the reference section – Incentivizes data release – Makes it easier for readers to find • Promote Data Sharing upon publication – Consider your data release policies • Form collaborations with repositories to aid authors in depositing their work – Identify community organizations with metadata standards • Make data available for reviewers (author website, community repositories, Dryad and similar, your publisher?) – At least do a sanity check – Use “data reviewers” No, this isn’t easy, but do what you can now. And work toward the rest. Evolve.
  • 72. It’s Time to Move Beyond Dead Trees 1665 1812 1869
  • 73. Thanks to: Scott Edmunds, Executive Editor Nicole Nogoy, Commissioning Editor Peter Li, Lead Data Manager Chris Hunter, Lead BioCurator Rob Davidson, Data Scientist Xiao (Jesse) Si Zhe, Database Developer Amye Kenall, Journal Development Manager Contact us: editorial@gigasciencejournal.com database@gigasciencejournal.com Follow us: @GigaScience facebook.com/GigaScience blogs.openaccesscentral.com/blogs/gigablog www.gigasciencejournal.com www.gigadb.org

Editor's notes

  1. Thank you very much to the Meeting Organizers for Inviting me to Speak.
  2. Happily, we live in the second world, though the question even gave them pause.
  3. Isn’t hyperbole fun?
  4. And a paper by the group was published in a high impact journal even though the data were released early in a citable format
  5. The data were released on an FTP server, and were given a data DOI should the data need to be cited in a more permanent fashion.
  6. Raw data has been submitted to the SRA, the assembly submitted to GenBank (no number), SV data to dbVar (it’s the first plant data they’ve received). Complements the traditional public databases by having all these “extra” data types, it’s all in one place, and it’s citable.
  10. (A) Cumulative base pairs in INSDC over time, excluding the Trace Archive (raw data from capillary sequencing platforms). (B) Base pairs in INSDC over time since 1980, broken down into selected data components. Cumulative data volume in base pairs broken down into assembled sequence (whole genome shotgun methods and others) and raw next-generation-sequence data.