Looking for Data:
Finding New Science
Anita de Waard
VP Research Data Collaborations
a.dewaard@elsevier.com
http://researchdata.elsevier.com/
Why should science publishers care
about Research Data?
Funding bodies:
• Demonstrate impact
• Guarantee permanence, discoverability
• Avoid fraud
• Avoid double funding
• Serve general public
Research Management/Library:
• Generate, track outputs
• Comply with mandates
• Ensure availability
Phil Bourne, (then) Associate Vice Chancellor, UCSD, 4/13:
“We need to think about the university as a digital enterprise.”
Mike Huerta, Ass. Director NLM:
“Today, the major public product of science are concepts, written
down in papers. But tomorrow, data will be the main product of
science…. We will require scientists to track and share their data at
least as well, if not better, than they are sharing their ideas today.”
Researchers:
• Derive credit
• Comply with mandates
• Discover and use
• Cite/acknowledge
Nathan Urban, PI Urban Lab, CMU, 3/13:
“If we can share our data, we can write a paper that will knock
everybody’s socks off!”
Barbara Ransom, NSF Program Director Earth Sciences:
“We’re not going to spend any more money for you to go out and get
more data! We want you first to show us how you’re going to use all
the data we paid y’all to collect in the past!”
Research data management today:
Using antibodies
and squishy bits
Grad Students experiment
and enter details into their
lab notebook.
The PI then tries to make
sense of their slides,
and writes a paper.
End of story.
[Figure: two separate workflows, each running Prepare, Observe, Analyze, Ponder, Communicate]
Most of biology is quite insular
But it is also VERY complicated:
http://en.wikipedia.org/wiki/File:Duck_of_Vaucanson.jpg
• Interspecies variability: A specimen is not a species
• Gene expression variability: Knowing genes is not
knowing how they are expressed
• Microbiome: An animal is an ecosystem
• Systems biology: A whole is more than the sum of its parts
• Male researchers stress out rodents!
Reductionist science
does not work
for living systems!
Statistics to the rescue!
What if the research data was connected?
[Figure: two workflows (Prepare, Analyze, Communicate) linked through shared Observations; the final version adds a Think step]
• Across labs and experiments: track reagents and how they are used
• Compare the outcome of interactions with these entities
• Build a ‘virtual reagent spectrogram’ by comparing how different entities
interacted in different experiments
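One way to picture tracking reagents across labs and experiments: if each lab recorded which reagent met which entity and what happened, those records could be grouped per reagent and compared side by side. The sketch below is illustrative only. The `Observation` fields, the lab and experiment names, and the reagent identifiers are all hypothetical and do not come from any system described in these slides.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One recorded interaction; every field name here is an assumption."""
    lab: str         # hypothetical lab label
    experiment: str  # hypothetical experiment id
    reagent: str     # reagent identifier shared across labs
    target: str      # entity the reagent was applied to
    outcome: str     # what the lab recorded


def outcomes_by_reagent(observations):
    """Group outcomes per reagent and target so labs can be compared."""
    table = defaultdict(lambda: defaultdict(list))
    for obs in observations:
        table[obs.reagent][obs.target].append((obs.lab, obs.outcome))
    # Return plain dicts so the result prints and compares predictably.
    return {reagent: dict(targets) for reagent, targets in table.items()}


# Made-up records from two hypothetical labs using the same reagent.
observations = [
    Observation("lab_a", "exp_1", "antibody_x", "tissue_1", "binds"),
    Observation("lab_b", "exp_7", "antibody_x", "tissue_1", "no binding"),
    Observation("lab_b", "exp_8", "antibody_x", "tissue_2", "binds"),
]

profile = outcomes_by_reagent(observations)
for target, results in profile["antibody_x"].items():
    print(target, results)
```

Laying the outcomes out per reagent and per target is a rough, toy-scale stand-in for the 'virtual reagent spectrogram' idea: disagreements between labs on the same reagent and target become visible at a glance.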
Maslow Hierarchy of Research Data Needs:
Useful
Trusted
Reproducible
Discoverable
Comprehensible
Archived
Accessible
Preserved in digital format
1: Urban Legend
How can we make a standard
neuroscience wet lab store and
share their data?
• Incorporate structured workflows into
the daily practice of a typical
electrophysiology lab (the Urban Lab at
CMU)
– What does it take?
– Where are points of conflict?
• 1-year pilot, funded by Elsevier RDS:
– CMU: Shreejoy Tripathy, manage/user test
– Elsevier: development, UI, project management
• Next steps: NIH grant to scale up to 4 labs
de Waard, A., Burton, S. et al., 2013
Urban Legend Components
Data Entry App:
Data dashboard
(e.g. SDB140225c4_onbeam_CC)
2: Moonrocks
How can we scale up data curation?
Pilot project with IEDA:
• Build a database for lunar geochemistry
• Leapfrog & improve curation time
• Write joint report on processes, costs
and challenges
• 1-year pilot, funded by Elsevier
• Next step: NSF grant on schemas >
spreadsheets
Moonrocks Data Import:
Moonrocks: pushing data curation to the researcher
3: How do we improve how data (and
software) are published?
• E.g. with the Virtual Microscope
• Or Interactive Plots
• Or Executable Papers
Let’s support the needs of research data!
Experimental Metadata:
Workflows, Samples, Settings, Reagents, Organisms, etc.
Record Metadata: DOI, Date, Author, Institute, etc.
Processed Data:
Mathematically/computationally processed
data: correlations, plots, etc.
Raw Data: Direct outputs from equipment:
images, traces, spectra, etc.
Methods and Equipment: Reagents,
settings, manufacturer’s details, etc.
Validation: Approval, Reproduction, Selection,
Quality Stamp
More curation
More usable
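The layers named on this slide (record metadata, experimental metadata, methods and equipment, raw data, processed data, validation) can be read as the fields of a single data record. The following is a minimal sketch under that reading, not a schema used by any of the projects above. The class name, field names, helper function, and example values (including the placeholder DOI) are assumptions made for illustration.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ResearchDataRecord:
    """Hypothetical container with one field per layer named on the slide."""
    record_metadata: dict = field(default_factory=dict)        # DOI, date, author, institute
    experimental_metadata: dict = field(default_factory=dict)  # workflows, samples, settings
    methods_and_equipment: dict = field(default_factory=dict)  # reagents, manufacturer details
    raw_data: list = field(default_factory=list)               # images, traces, spectra
    processed_data: list = field(default_factory=list)         # correlations, plots
    validation: dict = field(default_factory=dict)             # approval, reproduction


def missing_layers(record):
    """List the layers that are still empty, in declaration order."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Placeholder values only; the DOI and file name are not real.
record = ResearchDataRecord(
    record_metadata={"doi": "10.0000/placeholder", "author": "A. Researcher"},
    raw_data=["trace_001.dat"],
)

print(missing_layers(record))
```

A check like `missing_layers` shows where more curation is still needed before a record moves up the hierarchy towards being reproducible, trusted, and useful.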
Anita de Waard
a.dewaard@elsevier.com
Collaborations and discussions gratefully acknowledged:
• CMU: Nathan Urban, Shreejoy Tripathy, Shawn Burton, Ed Hovy
• UCSD: Brian Schottlaender, David Minor, Declan Fleming, Ilya Zaslavsky
• NIF: Maryann Martone, Anita Bandrowski
• OHSU: Melissa Haendel, Nicole Vasilevsky
• Columbia/IEDA: Kerstin Lehnert, Leslie Hsu
• MIT: Micah Altman
Thank you!
http://researchdata.elsevier.com/
