Towards Reusable Experiments:
Making Metadata While You
Measure
Shreejoy Tripathy
PhD student, Carnegie Mellon
Email: stripat3@gmail.com
Twitter: @neuronJoy
Lots of great tools for data sharing…
Barriers to data sharing
• Social
– “What’s in it for me? How will I get credit?”
– “It’s my data, not yours”
– “The benefit to me isn’t worth the time I put into it”
– “What if I get scooped?”
• Methodological
– “How do I share data? What do I share?”
– “Going back and annotating my files to share is super time-consuming”
– Specifying file formats, data standards
– Building FTP servers and nice user interfaces
Project idea
• How can we make a standard neuroscience wet lab more data-sharing savvy?
• Incorporate structured workflows into the daily practice of a typical electrophysiology lab (the Urban Lab at CMU)
– What does it take?
– Where are points of conflict?
Key insights/motivations
1. Effective data sharing includes raw data files + experimental metadata (typically stored in a lab notebook)
SDB_MC_12_voltages.mat
Key insights/motivations
1. Share raw data files + experimental metadata
2. You know the most about an experiment when you’re performing it
3. Improved data practices should make labs more productive
Project schematic
Metadata app
• Electronic lab notebook models the sequential slice-electrophysiology workflow
– Replaces pen-and-paper lab notebook
Metadata data entry
• Electronic lab notebook allows structured data entry
Animal Strain
Metadata data entry
• Electronic lab notebook allows structured data entry (e.g., dropdown menus)
– Allows incorporation of semantic ontologies
• Important to strike a balance between structure and flexibility
MGI:3719486
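As a rough illustration of structured entry with room for free text, the sketch below models one notebook field as a dropdown backed by an ontology identifier, plus an unconstrained notes field. The class name, field names, and strain label are hypothetical; only the identifier MGI:3719486 comes from the slide, and the sketch makes no assumption about which strain it denotes.

```python
from dataclasses import dataclass, field

# Hypothetical dropdown choices: display label -> ontology identifier.
# Only the identifier comes from the slide; the label is a placeholder.
STRAIN_CHOICES = {
    "strain A (placeholder label)": "MGI:3719486",
}


@dataclass
class AnimalEntry:
    """One structured notebook entry (illustrative field names)."""

    strain_label: str          # picked from a dropdown
    age_days: int              # structured numeric field
    notes: str = ""            # free text keeps the entry flexible
    strain_id: str = field(init=False, default="")

    def __post_init__(self) -> None:
        # Reject values that are not in the controlled vocabulary.
        if self.strain_label not in STRAIN_CHOICES:
            raise ValueError(f"unknown strain: {self.strain_label!r}")
        self.strain_id = STRAIN_CHOICES[self.strain_label]


entry = AnimalEntry("strain A (placeholder label)", age_days=21,
                    notes="slice looked healthy")
print(entry.strain_id)
```

Storing the identifier alongside the human-readable label is one way to keep entries machine-queryable, while the notes field leaves space for whatever the dropdowns do not cover.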
Metadata data entry
MGI:3719486
• Electronic lab notebook facilitates entry of new content, such as registration of recorded neurons to a brain atlas
Data integration
• Metadata app and electrophysiology data acquisition are synced via a server
– Each trace of experimental data is annotated with metadata
• Currently IGOR Pro-specific; support for pClamp and other acquisition packages to be added later as needed
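One way to picture the syncing step, as a minimal sketch: the notebook app pushes its current state to a shared session object, and each acquired trace is stored with a copy of that state. `MetadataSession` and `annotate_trace` are hypothetical names, the in-memory object stands in for the server, and the sample values are made up; none of this reflects the actual IGOR Pro integration.

```python
import copy


class MetadataSession:
    """In-memory stand-in for the server holding the notebook's current state."""

    def __init__(self) -> None:
        self._current = {}

    def update(self, **fields) -> None:
        # Called whenever the experimenter fills in or changes a field.
        self._current.update(fields)

    def snapshot(self) -> dict:
        # Copy so later edits do not alter already-annotated traces.
        return copy.deepcopy(self._current)


def annotate_trace(trace_id: int, samples: list, session: MetadataSession) -> dict:
    """Bundle one acquired trace with the metadata in effect when it was recorded."""
    return {"trace_id": trace_id, "samples": samples, "metadata": session.snapshot()}


session = MetadataSession()
session.update(strain_id="MGI:3719486", animal_age_days=21)
first = annotate_trace(1, [-65.0, -64.8, -64.9], session)

session.update(neuron_type="mitral cell")  # hypothetical later entry
second = annotate_trace(2, [-70.1, -69.9], session)
```

Taking a snapshot per trace means that the first trace keeps the metadata that was known when it was recorded, even after the notebook is updated.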
Data dashboard (web-based)
Data dashboard (future steps)
• Use collected metadata to sort experiments
– E.g., by mouse strain, neuron type, animal age
• Enable in-browser analyses
– Track provenance of analyzed data back to raw data
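A minimal sketch of what the dashboard's sorting and provenance tracking could look like, assuming each experiment is a record with a `metadata` dictionary. The function names, record layout, and all values are hypothetical; the SHA-256 digest is just one possible way to tie an analysis result back to the exact raw data it came from.

```python
import hashlib
import statistics


def select_experiments(experiments, sort_by=None, **criteria):
    """Filter records on exact metadata matches, optionally sorting by one field."""
    hits = [e for e in experiments
            if all(e["metadata"].get(k) == v for k, v in criteria.items())]
    if sort_by is not None:
        hits.sort(key=lambda e: e["metadata"][sort_by])
    return hits


def analyze_with_provenance(experiment, name, func):
    """Run an analysis and record which raw data produced the result."""
    raw = experiment["raw"]
    digest = hashlib.sha256(repr(raw).encode("utf-8")).hexdigest()
    return {
        "analysis": name,
        "result": func(raw),
        "provenance": {"source_file": experiment["file"], "raw_sha256": digest},
    }


# Illustrative records: all values and the example_* file names are made up.
experiments = [
    {"file": "SDB_MC_12_voltages.mat", "raw": [-65.0, -64.0, -66.0],
     "metadata": {"neuron_type": "type A", "animal_age_days": 24}},
    {"file": "example_02.mat", "raw": [-70.0, -71.0],
     "metadata": {"neuron_type": "type A", "animal_age_days": 18}},
    {"file": "example_03.mat", "raw": [-60.0],
     "metadata": {"neuron_type": "type B", "animal_age_days": 30}},
]

type_a = select_experiments(experiments, sort_by="animal_age_days",
                            neuron_type="type A")
summary = analyze_with_provenance(type_a[0], "mean voltage", statistics.mean)
```

Because every result carries the source file name and a digest of the raw data, an analyzed value can be traced back to the recording it was computed from.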
Next steps
• Use built tools
– Populate data server with many experiments
• Is using the e-notebook too burdensome?
– If yes, continue to iterate
– What can we ask now that we couldn’t before?
• It is much easier to ask exploratory questions, like
– How is the cell type that Shawn records different from the one that Matt records?
• Exposing data to neuroscience databases
– NIF, INCF Dataspace, neuroelectro.org
• How adaptable are these solutions for use in other labs?
• Who is going to pay for this?
Acknowledgements
• Carnegie Mellon
– Shreejoy Tripathy
– Nathan Urban
– Shawn Burton
– Rick Gerkin
– Santosh Chandrasekaran
– Matthew Geramita
• Elsevier Research Data Services
– Anita de Waard
– Mark Harviston
– Jez Alder
– Sarah Tyrchniewicz
– David Marques
– (funding!)
Next steps
• Roll out updated app to experimentalists
• Populate database with the contents of many experiments
• Flesh out Data dashboard functionality
• Investigate the new things that we can achieve given these tools
Effective data sharing is…
• Not just the experimental data file
– But also the experimental metadata: what was done? What does this variable mean? This is usually stored in PHYSICAL lab notebooks, understandable only by the experimenter
• Effective data sharing: someone who is not the person who collected the data can understand the experiment and data
App user testing
• “I don’t like the way the app forces me through a specific workflow, I want to enter experimental data when I see fit”
• “I’m not opposed to the idea of dropdowns, but I want more flexibility, more text fields”
• “When I use a lab notebook, I only write down the absolute minimum. Can the app’s fields be prepopulated with the results of an old experiment?”
What is effective data sharing?
• Effective data sharing: someone who is not the person who collected the data can understand the experiment and data
– i.e., datasets should be more or less self-describing
– >90% of data sharing use cases are an experimentalist sharing data with a future version of herself or with a labmate
Neuroinformatics successes don’t come from large-scale multi-lab data sharing
• NeuroSynth
• NeuroElectro?

Editor's notes

  1. Tangible benefits of data sharing: more people can collaborate on the same project, which should lead to more productivity and better science = “Nature paper”
  2. Walk through the pieces one by one; also mention that this is very much an unfinished work in progress