Methodologies for Long-Tail Data Sharing: What Have We Learned?
Maryann E. Martone, Ph. D.
University of California, San Diego
and
Hypothesis
Jeffrey S. Grethe, Ph. D.
University of California, San Diego
[Figure legend: Database, Software Application, Data Analysis Service, Topical Portal, Core Facility, Ontology, Software Resource; axis: Years]
NIF is an initiative of the NIH Blueprint consortium of institutes
– NIF has been tracking and cataloging the biomedical resource landscape since 2008
The current “Addictome”
NIF searches across:
• Resource Registry (13,000+)
• > 200 deeply integrated data sources (>800 million records)
• literature
Query: Addiction
The digital world runs on globally unique and persistent identifiers (PIDs); PIDs serve as a “key” for identifying the same entity across different contexts
e-Science Ecosystem
[Diagram labels: PID, ORCID, RRID, DOI, Data, People, Research resources, Concepts, Ontology, Protocols, Metadata standards, Minimal Information Models, CDE, Translation, Non-digital, Aggregator, Repositories and Registries (e.g. NIF, Monarch, NIH Data Discovery Index)]
e-Science goal: Make data Findable, Accessible, Interoperable, Re-usable (FAIR) for both human and machine
Resource Identification Initiative: Supplying unique
identifiers for key research resources
“The following antibodies were used for immunoblotting: -actin mAb (1:10,000 dilution, Sigma-Aldrich)…”
VS
“The following antibodies were used for immunoblotting: -actin mAb (1:10,000 dilution, Sigma-Aldrich, RRID:AB_262137)…”
https://scicrunch.org/resolver/RRID:AB_262137
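A minimal sketch of how a machine could pick up the identifier in the second citation and turn it into the resolver link shown above. The regular expression is an assumption about what an RRID looks like in running text (a letter prefix, an underscore, then an accession), not an official grammar; the function names are hypothetical.

```python
import re

# Assumed shape of an RRID in prose: "RRID:" + letter prefix + "_" + accession.
RRID_PATTERN = re.compile(r"RRID:\s?([A-Za-z]+_[A-Za-z0-9_:-]+)")

# Resolver base taken from the URL on the slide.
RESOLVER_BASE = "https://scicrunch.org/resolver/"


def find_rrids(text):
    """Return the RRIDs cited in a passage, in order of appearance."""
    return ["RRID:" + accession for accession in RRID_PATTERN.findall(text)]


def resolver_url(rrid):
    """Build the resolver link for an RRID such as 'RRID:AB_262137'."""
    return RESOLVER_BASE + rrid


methods = ("The following antibodies were used for immunoblotting: "
           "actin mAb (1:10,000 dilution, Sigma-Aldrich, RRID:AB_262137)")

for rrid in find_rrids(methods):
    print(rrid, resolver_url(rrid))
```

The first citation, which lacks the RRID, yields an empty list: nothing in it lets software identify the antibody.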
Minimal Information Standards
http://precedings.nature.com/documents/1720/version/1
http://precedings.nature.com/documents/1720/version/1/files/npre20081720-1.pdf
A set of guidelines for reporting data that
ensures the data can be easily verified,
analysed and clearly interpreted by the
wider scientific community. The
recommendations also provide a foundation
for structured databases, public repositories
and development of data analysis tools.
https://en.wikipedia.org/wiki/Minimum_Information_Standards
MINI: Minimum Information about a Neuroscience Investigation
[Diagram labels: MIM; CDE 1, CDE 2, …, CDE N; Value Set]
Common Data Elements
https://cde.nlm.nih.gov/home
http://www.nlm.nih.gov/cde/
A data element that is common
to multiple datasets and is used
to improve data quality and
promote data sharing. CDEs
usually describe the following
data element properties: Name,
Definition, Instructions,
Provenance, Value Set.
Value Sets
The set of possible values or
responses. A Value Set often
includes concepts from established
Vocabularies, Ontologies or Data
Standards. A value set may also
include a range of permissible values
and indicate the required units. For a
survey question, the value set may
be a list of possible responses.
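One way to picture a CDE and its value set in code, as a sketch only. The five properties come from the definition above; the class names, the `accepts` method, and the two example elements (`sex`, `age`) are hypothetical and not taken from any NIH CDE repository.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class ValueSet:
    """Possible values: an enumerated list and/or a numeric range with units."""
    permissible: FrozenSet[str] = frozenset()
    value_range: Optional[Tuple[float, float]] = None
    units: Optional[str] = None

    def accepts(self, value):
        # Strings are checked against the enumerated responses.
        if isinstance(value, str):
            return value in self.permissible
        # Numbers are checked against the permissible range, if one is given.
        if self.value_range is not None and isinstance(value, (int, float)):
            low, high = self.value_range
            return low <= value <= high
        return False


@dataclass(frozen=True)
class CommonDataElement:
    name: str
    definition: str
    instructions: str
    provenance: str
    value_set: ValueSet


# Hypothetical elements, for illustration only.
sex = CommonDataElement(
    name="sex",
    definition="Sex of the subject",
    instructions="Select one response",
    provenance="example lab form",
    value_set=ValueSet(permissible=frozenset({"male", "female", "unknown"})),
)
age = CommonDataElement(
    name="age",
    definition="Age of the subject at enrollment",
    instructions="Enter a whole number",
    provenance="example lab form",
    value_set=ValueSet(value_range=(0, 120), units="years"),
)

print(sex.value_set.accepts("female"), sex.value_set.accepts("F"))
print(age.value_set.accepts(42), age.value_set.accepts(-1))
```

Two datasets that share the same element and value set can be pooled without guessing whether "F" and "female" mean the same thing.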
http://neurolex.org/wiki/Category:Hippocampus_CA1_pyramidal_cell
Neuroscience Information Framework
“a tool for analyzing and structuring information”
“a reduction in uncertainty”
• Ontologies are the major way that NIF searches for and organizes information
• Aggregate of community ontologies, e.g., Gene Ontology, ChEBI, Protein Ontology
• Still significant gaps for behavioral and physiological concepts and techniques
• Available as services through NIF so they can be built into applications
NIFSTD
[Diagram labels: Organism, Molecule, Macromolecule, Gene, Molecule Descriptors, Cell, Resource, Instrument, Dysfunction, Quality, Anatomical Structure, NS Function, Subcellular structure, Investigation, Protocols, Reagent, Techniques]
Concept-based query
Remove synonyms
Ontologies and their relationships let us probe the data space for related concepts
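A toy illustration of concept-based query expansion, assuming the ontology is reduced to two lookup tables. The terms, the tables, and the `include_synonyms` switch (standing in for the "Remove synonyms" option) are hypothetical; this is not how NIF or NIFSTD implement search.

```python
from collections import deque

# Illustrative relationships only; not drawn from NIFSTD.
NARROWER = {
    "neuron": ["pyramidal cell"],
    "pyramidal cell": ["hippocampus CA1 pyramidal cell"],
}
SYNONYMS = {
    "neuron": ["nerve cell"],
    "pyramidal cell": ["pyramidal neuron"],
}


def expand_query(concept, include_synonyms=True):
    """Expand a concept into search terms: itself, narrower concepts, synonyms."""
    terms = []
    seen = set()
    queue = deque([concept])
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        terms.append(current)
        if include_synonyms:
            for synonym in SYNONYMS.get(current, []):
                if synonym not in terms:
                    terms.append(synonym)
        # Follow the ontology downward to related, more specific concepts.
        queue.extend(NARROWER.get(current, []))
    return terms


print(expand_query("neuron"))
print(expand_query("neuron", include_synonyms=False))
```

A plain keyword search for "neuron" would miss records that only mention a CA1 pyramidal cell; the expanded term list reaches them.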
What have we learned?
• The landscape is vibrant, dynamic and growing, but also littered with abandoned and unrealized projects
• Data belongs in a data repository, not on your lab server
• People are important in this endeavor: leaders, curators, community engagement specialists
• Data and ontology resources become interesting when they are comprehensive: populate!
• Assume that you will be resource limited and plan accordingly: time, money, personnel
• Cost-benefit analysis: what to do now vs. later
• Technology will improve
• Don’t start from square one: resources exist to help; help support them
Extra Slides
Dimensions of FAIR data sharing
• Discoverability
– Data can be found
– Data set has an identifier and links are stable
• Accessibility
– Data can be accessed programmatically
– Access rights are clear
• Assessability
– Provenance is known
– Reliability can be determined
• Understandability
– The data can be understood
• Usability
– The data are actionable
– Data are not in a proprietary format
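The dimensions above can be turned into a rough self-check on a dataset's metadata record. This is a sketch under stated assumptions: the field names (`identifier`, `access_url`, `license`, `provenance`, `description`, `file_format`) and the list of open formats are hypothetical, and a present field is only a proxy for the dimension it stands for.

```python
# Assumed set of non-proprietary formats; extend as appropriate.
OPEN_FORMATS = {"csv", "tsv", "json", "xml", "txt"}


def assess_record(record):
    """Map each sharing dimension to True/False for one metadata record."""
    def present(key):
        return bool(str(record.get(key) or "").strip())

    return {
        "Discoverability": present("identifier"),
        "Accessibility": present("access_url") and present("license"),
        "Assessability": present("provenance"),
        "Understandability": present("description"),
        "Usability": str(record.get("file_format") or "").lower() in OPEN_FORMATS,
    }


# Hypothetical record, for illustration only.
example = {
    "identifier": "doi:10.0000/example",
    "access_url": "https://example.org/data.csv",
    "license": "CC-BY-4.0",
    "provenance": "Recorded in the example lab, exported from acquisition software",
    "description": "Spike times per trial, one row per spike",
    "file_format": "csv",
}

for dimension, ok in assess_record(example).items():
    print(dimension, "ok" if ok else "missing")
```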
Goodman, A. et al. Ten simple rules for the care and feeding of scientific data. PLoS Comput Biol 10, e1003542, doi:10.1371/journal.pcbi.1003542 (2014)
Science as an open enterprise, Royal Society: https://royalsociety.org/policy/projects/science-public-enterprise/Report/
FORCE11: Future of Research Communications and e-Scholarship
• Resource Identification Initiative: https://www.force11.org/group/resource-identification-initiative
• FAIR Data Guiding principles: https://www.force11.org/group/fairgroup/fairprinciples
• Data Citation Principles: https://www.force11.org/group/joint-declaration-data-citation-principles-final
• On creating machine-readable data citations: https://peerj.com/articles/cs-1/
• 10 Simple rules for design, provision, and reuse of persistent identifiers for life science data: https://zenodo.org/record/18003#.VeOxxLQjvyA
FORCE11.org: Grass roots organization dedicated to transforming scholarship through
Mapping the data landscape: Anatomical framework
~800 million records across ~200 databases or views
[Figure: number of data sources by brain region (Forebrain, Midbrain, Hindbrain); legend bins: 0, 1-10, 11-100, >101]

Exploration Method’s in Archaeological Studies & Research
 

Martone grethe

  • 1. Methodologies for Long-Tail Data Sharing: What Have We Learned? Maryann E. Martone, Ph. D. University of California, San Diego and Hypothesis Jeffrey S. Grethe, Ph. D. University of California, San Diego
  • 2. NIF is an initiative of the NIH Blueprint consortium of institutes. NIF has been tracking and cataloging the biomedical resource landscape since 2008. [Figure: resource types (Database, Software Application, Data Analysis Service, Topical Portal, Core Facility, Ontology, Software Resource) by year]
  • 3. The current “Addictome”. Query: Addiction. NIF searches across: • Resource Registry (13,000+) • >200 deeply integrated data sources (>800 million records) • literature
  • 4. Digital world runs on globally unique and persistent identifiers; PIDs serve as a “key” for identifying the same entity across different contexts. eScience goal: Make data Findable, Accessible, Interoperable, Re-usable (FAIR) for both human and machine. [Diagram: e-Science Ecosystem. Labels: PID, ORCID, RRID, DOI, Data, People, Research resources, Ontology, Concepts, Protocols, Metadata standards, Minimal Information Models, CDE, Translation, Non-digital, Aggregator, Repositories and Registries (e.g. NIF, Monarch, NIH Data Discovery Index)]
  • 5. Resource Identification Initiative: Supplying unique identifiers for key research resources. “The following antibodies were used for immunoblotting: -actin mAb (1:10,000 dilution, Sigma-Aldrich)…” VS “The following antibodies were used for immunoblotting: -actin mAb (1:10,000 dilution, Sigma-Aldrich, RRID:AB_262137)…” https://scicrunch.org/resolver/RRID:AB_262137
  • 6. Minimal Information Standards: A set of guidelines for reporting data that ensures the data can be easily verified, analysed and clearly interpreted by the wider scientific community. The recommendations also provide a foundation for structured databases, public repositories and development of data analysis tools. https://en.wikipedia.org/wiki/Minimum_Information_Standards MINI: Minimum Information about a Neuroscience Investigation http://precedings.nature.com/documents/1720/version/1 http://precedings.nature.com/documents/1720/version/1/files/npre20081720-1.pdf [Diagram labels: MIM, CDE 1, CDE 2, … CDE N, Value Set]
  • 7. Common Data Elements https://cde.nlm.nih.gov/home http://www.nlm.nih.gov/cde/ A data element that is common to multiple datasets and is used to improve data quality and promote data sharing. CDEs usually describe the following data element properties: Name, Definition, Instructions, Provenance, Value Set.
  • 8. Value Sets The set of possible values or responses. A Value Set often includes concepts from established Vocabularies, Ontologies or Data Standards. A value set may also include a range of permissible values and indicate the required units. For a survey question, the value set may be a list of possible responses. http://neurolex.org/wiki/Category:Hippocampus_CA1_pyramidal_cell
  • 9. Neuroscience Information Framework “a tool for analyzing and structuring information” “a reduction in uncertainty” • Ontologies are the major way that NIF searches for and organizes information • Aggregate of community ontologies, e.g., Gene Ontology, ChEBI, Protein Ontology • Still significant gaps for behavioral and physiological concepts and techniques • Available as services through NIF so they can be built into applications [Diagram labels, NIFSTD: Organism Molecule Macromolecule Gene Molecule Descriptors Cell Resource Instrument Dysfunction Quality Anatomical Structure NS Function Subcellular structure Investigation Protocols Reagent Techniques]
  • 10. Concept-based query: Ontologies and their relationships let us probe the data space for related concepts. [Interface label: Remove synonyms]
  • 11. What have we learned? • The landscape is vibrant, dynamic and growing, but also littered with abandoned and unrealized projects • Data belongs in a data repository, not on your lab server • People are important in this endeavor: leaders, curators, community engagement specialists • Data and ontology resources become interesting when they are comprehensive: populate!!! • Assume that you will be resource limited and plan accordingly: time, money, personnel • Cost-benefit analysis: what to do now vs. later • Technology will improve • Don’t start from square one: resources exist to help; help support them
  • 13. Dimensions of FAIR data sharing • Discoverability – Data can be found – Data set has an identifier and links are stable • Accessibility – Data can be accessed programmatically – Access rights are clear • Assessability – Provenance is known – Reliability can be determined • Understandability – The data can be understood • Usability – The data are actionable – Data are not in a proprietary format. Sources: Goodman, A. et al. Ten simple rules for the care and feeding of scientific data. PLoS Comput Biol 10, e1003542, doi:10.1371/journal.pcbi.1003542 (2014); Science as an open enterprise, Royal Society: https://royalsociety.org/policy/projects/science-public-enterprise/Report/
  • 14. FORCE11: Future of Research Communications and e-Scholarship • Resource Identification Initiative: https://www.force11.org/group/resource-identification-initiative • FAIR Data Guiding principles: https://www.force11.org/group/fairgroup/fairprinciples • Data Citation Principles: https://www.force11.org/group/joint-declaration-data-citation-principles-final • On creating machine-readable data citations: https://peerj.com/articles/cs-1/ • 10 Simple rules for design, provision, and reuse of persistent identifiers for life science data: https://zenodo.org/record/18003#.VeOxxLQjvyA FORCE11.org: Grass roots organization dedicated to transforming scholarship through
  • 15. Mapping the data landscape: Anatomical framework. ~800 million records across ~200 databases or views. [Figure: number of data sources (0, 1-10, 11-100, >101) by brain region: Forebrain, Midbrain, Hindbrain]
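Slide 5 contrasts a methods sentence with and without an RRID and gives a resolver URL. A minimal Python sketch of how such identifiers might be pulled out of text and turned into resolver links follows. The resolver base URL is the one on the slide; the regular expression is a simplifying assumption modeled on the slide's single example (RRID:AB_262137) and is not an official RRID grammar, and the function names are hypothetical.

```python
import re

# Base URL as shown on slide 5.
RESOLVER_BASE = "https://scicrunch.org/resolver/"

# Assumption: an RRID looks like "RRID:" + letters + "_" + letters/digits,
# as in the slide's example. Real RRIDs may take other shapes.
RRID_PATTERN = re.compile(r"RRID:\s*([A-Za-z]+_[A-Za-z0-9]+)")


def find_rrids(text):
    """Return the RRIDs found in text, in order of appearance, without repeats."""
    found = []
    for match in RRID_PATTERN.finditer(text):
        rrid = "RRID:" + match.group(1)
        if rrid not in found:
            found.append(rrid)
    return found


def resolver_url(rrid):
    """Build a resolver link by appending the RRID to the base URL."""
    return RESOLVER_BASE + rrid


# Methods sentence quoted from slide 5.
methods = ("The following antibodies were used for immunoblotting: "
           "-actin mAb (1:10,000 dilution, Sigma-Aldrich, RRID:AB_262137)")

for rrid in find_rrids(methods):
    print(resolver_url(rrid))
```

The first version of the sentence on the slide, without an RRID, yields no matches, which is the point of the comparison: only the tagged sentence can be linked to a resource record by machine.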
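Slides 7 and 8 describe a Common Data Element by its properties (Name, Definition, Instructions, Provenance, Value Set) and a value set as either a list of possible responses or a range of permissible values with required units. One way to picture that structure is the Python sketch below. The class and field layout is an assumption made for illustration, not the NLM CDE schema, and the two example elements (handedness, age at enrollment) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ValueSet:
    """Either an enumerated list of responses or a numeric range with units."""
    permitted: frozenset = frozenset()
    minimum: Optional[float] = None
    maximum: Optional[float] = None
    units: Optional[str] = None

    def accepts(self, value):
        # Enumerated value sets take precedence over numeric ranges.
        if self.permitted:
            return value in self.permitted
        # Otherwise the value must be a number inside the range (bools excluded).
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return False
        if self.minimum is not None and value < self.minimum:
            return False
        if self.maximum is not None and value > self.maximum:
            return False
        return True


@dataclass(frozen=True)
class CommonDataElement:
    """The five properties listed on slide 7."""
    name: str
    definition: str
    instructions: str
    provenance: str
    value_set: ValueSet

    def validate(self, value):
        return self.value_set.accepts(value)


# Hypothetical elements, for illustration only.
handedness = CommonDataElement(
    name="handedness",
    definition="Preferred hand of the participant",
    instructions="Select one response",
    provenance="example only",
    value_set=ValueSet(permitted=frozenset({"left", "right", "ambidextrous"})),
)
age = CommonDataElement(
    name="age_at_enrollment",
    definition="Age of the participant at enrollment",
    instructions="Enter a whole number",
    provenance="example only",
    value_set=ValueSet(minimum=0, maximum=120, units="years"),
)

print(handedness.validate("left"), age.validate(150))
```

Sharing the element definition rather than only the column name is what lets two datasets agree on what a value means and reject values that fall outside the set.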
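Slide 10 says that ontologies and their relationships let a query reach related concepts. A rough Python sketch of that idea, under stated assumptions, is shown below: the tiny ontology is a hand-made toy rather than NIFSTD, the only relationships modeled are synonyms and child concepts, and `expand_query` is a hypothetical name, not a NIF service.

```python
# Toy ontology for illustration only; not drawn from NIFSTD.
TOY_ONTOLOGY = {
    "neuron": {
        "synonyms": ["nerve cell"],
        "children": ["pyramidal cell"],
    },
    "pyramidal cell": {
        "synonyms": ["pyramidal neuron"],
        "children": ["hippocampus CA1 pyramidal cell"],
    },
    "hippocampus CA1 pyramidal cell": {"synonyms": [], "children": []},
}


def expand_query(term, ontology, include_synonyms=True):
    """Expand a search term to its synonyms and all descendant concepts."""
    expanded = []
    stack = [term]
    while stack:
        concept = stack.pop()
        if concept in expanded:
            continue
        expanded.append(concept)
        entry = ontology.get(concept, {})
        if include_synonyms:
            for synonym in entry.get("synonyms", []):
                if synonym not in expanded:
                    expanded.append(synonym)
        # Reverse so that children are visited in their listed order.
        stack.extend(reversed(entry.get("children", [])))
    return expanded


print(expand_query("neuron", TOY_ONTOLOGY))
print(expand_query("neuron", TOY_ONTOLOGY, include_synonyms=False))
```

A search for the broad concept then also matches records annotated only with a narrower term, and switching synonyms off narrows the expansion to the concept hierarchy alone.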

Editor's notes

  1. Figure X: Resource types and year added to the registry. Research resources are each tagged with one or more resource types; the most common are represented in this graph (for all data, see http://neurolex.org/wiki/Resource_Type_Hierarchy). The year that a resource was added to the registry is denoted by color; note that 2009 and earlier data are lumped into 2010.