Science and eol.org aggregates biodiversity data


EOL aggregates scientific data from many databases about all species worldwide and summarizes it for a range of audiences: enthusiasts, learners, citizen scientists, and scientists. It uses crowd-sourcing to improve data quality and provides computable data for research through features such as collections, APIs, and challenges. Planned enhancements aim to extend EOL's capabilities for scientific research.


Science and eol.org
Yes we can! @cydparr
@eol

EOL Scope & Engagement
• All species
• Global
• Many audiences
– enthusiasts
– learners
– citizen scientists
– scientists

EOL aggregates and curates
Scientific Databases, including
BHL, GBIF, ALA, INBio, COL,
Scratchpads, LifeDesks
Scientific Journals Curate

Aggregate

Comment
Rate, Collect
eol.org

Quality control, prioritization API

Third party apps

From Moorea Biocode

Summarizing knowledge
Not managing raw data
Erosaria caputserpentis
Serpent's Head Cowrie
… summaries can be data, too.
Depth range based on 51 specimens in 2 taxa.
Water temperature and chemistry ranges
based on 40 samples.

Environmental ranges
Depth range (m): -5 - 67
Temperature range (°C): 23.011 - 28.496
Nitrate (umol/L): 0.048 - 0.923
Salinity (PPS): 33.821 - 35.837
Oxygen (ml/l): 4.349 - 4.825
Phosphate (umol/l): 0.088 - 0.228
Silicate (umol/l): 0.983 - 4.026
From GBIF
From OBIS

Erosaria caputserpentis
Serpent's Head Cowrie

Salinity envelope (n=40)

From OBIS

Citation of Biodiversity Heritage Library
BHL assigned 40,000
Digital Object Identifiers
to monograph titles

In 5 months ending
March 2012, there were
5,333 resolution
attempts (i.e. uses) of
the DOIs for 4,562
unique titles.
From Martin Kalfatovic

Using EOL collections
to get computable data
Step 1: Search on EOL for
organisms with characteristics
of interest. Add each one to an
EOL collection.
Step 2: Write a program using
EOL API methods to retrieve the
external database identifiers for
the species in that collection.
Step 3: Add to your program
code to retrieve data using
external database APIs.
Step 4: Analyze, rinse, repeat.
From Arthur Chapman
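
The four steps above can be sketched in Python. This is only an illustrative outline, not EOL's actual API: the function names, the `lookup_ids` and `fetch_record` callables, and the sample identifiers are all hypothetical stand-ins. A real program would replace the two stand-in functions with calls to the EOL API (Step 2) and to the external database's API (Step 3).

```python
from typing import Callable, Dict, Iterable


def external_ids_for_collection(
    page_ids: Iterable[int],
    lookup_ids: Callable[[int], Dict[str, str]],
    provider: str,
) -> Dict[int, str]:
    """Step 2: map each page in a collection to its identifier at one provider.

    `lookup_ids` is a hypothetical callable returning {provider: identifier}
    for one page; pages with no identifier at `provider` are skipped.
    """
    result: Dict[int, str] = {}
    for page_id in page_ids:
        ids = lookup_ids(page_id)
        if provider in ids:
            result[page_id] = ids[provider]
    return result


def fetch_records(
    external_ids: Dict[int, str],
    fetch_record: Callable[[str], dict],
) -> Dict[int, dict]:
    """Step 3: retrieve one record per page from the external database."""
    return {page_id: fetch_record(ext_id) for page_id, ext_id in external_ids.items()}


# Stand-ins for real API calls (assumptions, for illustration only).
def fake_lookup(page_id: int) -> Dict[str, str]:
    table = {
        101: {"genbank": "GB-A", "gbif": "G-1"},
        102: {"gbif": "G-2"},
        103: {"genbank": "GB-C"},
    }
    return table.get(page_id, {})


def fake_fetch(ext_id: str) -> dict:
    return {"id": ext_id, "sequence_count": len(ext_id)}


# Step 1 is done by hand on the web site; here the collection is a list of page IDs.
collection = [101, 102, 103]
ids = external_ids_for_collection(collection, fake_lookup, "genbank")
records = fetch_records(ids, fake_fetch)
# Step 4: analyze `records`, refine the collection, and repeat.
```

Passing the lookup and fetch functions in as parameters keeps the workflow independent of any one provider, so the same loop could be pointed at a different external database.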

Crowd-sourcing for computable data

Lovell and Libby Langstroth, Calphotos Foodwebs.org

We can do more
Phylogenetic tree challenge
http://eol.org/info/tree_challenge
Free trip to iEvoBio, deadline 15 April

Computable data challenge
http://eol.org/info/data_challenge
$50,000, deadline 22 May

Semantic reasoning workshop, in September

The Smithsonian’s Phenotype Repository or Gateway?

Summing up
• EOL offers powerful aggregating, summarizing,
searching, curating, and crowd-sourcing features
• EOL content is already usable for research
• Enhancements to EOL for science research are
underway

What questions do you want to answer?
What data or features do you need?

Thanks to
Our funders
John D. and Catherine T. MacArthur Foundation
Alfred P. Sloan Foundation
Smithsonian Institution
Marine Biological Laboratory
Harvard University
David Rubenstein
and other funders and donors

All our content providers and global partners

Volunteer curators and individual contributors via Flickr,
Wikimedia, and members of EOL


Editor's notes

My name is Cyndy Parr. I’ve been the director of the EOL species group here at the Smithsonian for four years. I was originally a behavioral ecologist interested in birds, but I’ve been involved in informatics research for the last fifteen years. As you may know, Encyclopedia of Life is a web site providing global access to knowledge about life on Earth:
• Global: the whole world
• Access: free, and freely re-usable
• Knowledge: synthesized, not raw
• Life on Earth: biological diversity
We have millions of pages with at least a name, and more than 4 million data objects distributed across those pages, so that nearly a million pages have some sort of content. Up to now, we’ve focused on our more general audiences, who make up most of our 55,000 registered members. But my talk today will show that we ARE being used by scientists, and that we have great potential to be used more for science.
EOL takes information from about 200 sources so far, mostly scientific databases but also Flickr and Wikipedia, and automatically sorts it onto taxon pages. Our curators can then trust or untrust it, and anybody can provide comments or ratings. About a thousand credentialed scientists have already volunteered to help with quality control. Actions and comments get fed back to the original providers, and the material on EOL is also available to other applications via an Application Programming Interface, which I’ll talk more about in a moment. We’re partnering with over two hundred scientific databases as well as public contribution sites like Flickr and Wikipedia.
• 100+ partner databases
• 700 curators / 1000s of contributors / 46,000 members
• 2.8 million pages
• 500 thousand pages with Creative Commons content
• Over 2 million data objects and >1 million pages with links to research literature
• Traffic in past year: 1.7 million unique users, 6.2 million page views
I want to emphasize that EOL deals in summarized knowledge, not raw specimen data. For example, for the serpent’s head cowrie, we have images like this from the Moorea Biocode project, but instead of serving the individual specimen data, we get the overall distribution of specimen data on a map from GBIF. We also get a summary of environmental data associated with specimens in the Ocean Biogeographic Information System (OBIS) database. Imagine if we could do a summary like this across databases.
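As a rough illustration of how a range summary like the one on this slide could be derived from per-sample records, here is a minimal Python sketch. It is not how OBIS or EOL compute their summaries; the function name, field names, and sample values are made up for the example.

```python
from typing import Dict, Iterable, Mapping, Optional, Tuple


def summarize_ranges(
    samples: Iterable[Mapping[str, Optional[float]]],
) -> Dict[str, Tuple[float, float]]:
    """Reduce per-sample measurements to a (min, max) range per variable.

    Missing measurements (None) are ignored, so each range is based only
    on the samples that actually recorded that variable.
    """
    ranges: Dict[str, Tuple[float, float]] = {}
    for sample in samples:
        for variable, value in sample.items():
            if value is None:
                continue
            low, high = ranges.get(variable, (value, value))
            ranges[variable] = (min(low, value), max(high, value))
    return ranges


# Hypothetical sample records, not real OBIS data.
samples = [
    {"depth_m": 12.0, "salinity": 34.1},
    {"depth_m": 3.5, "salinity": 35.2},
    {"depth_m": 40.0, "salinity": None},
]
summary = summarize_ranges(samples)
for variable, (low, high) in summary.items():
    print(f"{variable}: {low} - {high}")
```

The summary is much smaller than the raw records, which is the point of the slide: the ranges themselves can be served and reused as data.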
This is a graphical way of presenting the summarized data from OBIS, which Jen Hammock on my staff worked on with Edward Vanden Berghe and our team at the Marine Biological Lab. The salinity range for the species is shown here as just a small, specific slice between the global ocean minimum and maximum. Looking just at 15 content providers we already work with, it is possible that numeric data such as lifespan or average body weight are already available for more than 800,000 species.
BHL is funded as part of the Encyclopedia of Life and linked to EOL pages, and we know that scientists are using it. Martin Kalfatovic and his team recently did an analysis and found that more than 10% of the DOI-tagged monographs (4,562 of the 40,000 titles, about 11%) were used, probably in scholarly citations, in just five months (out of 100K possible).
We have a feature where users can create customized collections of pages or objects on EOL. A scientist could search for a characteristic, say, red flowers, and create a collection of those taxa. Actually, we’ve been doing this with blue coloration in the “Life is blue” collection. If you wanted to test what might be driving the evolution of coloration, you could write a program that uses EOL to get all the GenBank IDs for those species, or the identifiers from some other EOL partner that we’ve mapped to each of those taxon pages, and then use those to go to that database and pull raw data to analyze, for example genetic sequences or specimen locations. In the future we hope to make step 2 and step 3 even easier, so you might just be able to click a button and download lots of raw data for your collection from certain data sources.
You can also use EOL for crowd-sourcing. For example, Jennifer Hammock has started a collection called “Mystery associates” and asked people to try to identify the partners shown in photos that have some sort of ecological association. When they’ve been identified, like this sea star and anemone predation interaction, she moves the image to the “known associates” collection. This adds to the information we have from a number of partners on food web interactions, which would then be available to food web modelers. There are many other possible ways that the large crowds on EOL could be harnessed to generate new datasets from EOL content. And this is all possible to some degree now.
For the future, we are working on a few new angles. First, we are working to make a more phylogenetic organization available on EOL, because that will definitely help those who are doing comparative analyses and who want a true evolutionary framework. The deadline for submitting a large tree is this weekend, Monday really. The second challenge is to propose research work that uses computable data and EOL in some concrete way, perhaps by using collections to harvest computable data, as I suggested, or perhaps by using text mining. Here the deadline for the idea is next month, and then we’re providing funds to accomplish the pilot project over the next year. Finally, in September here in Washington we’re bringing in computer scientists and biologists who have an interest in broad-scale, data-intensive science using biodiversity data. We expect this to lead to other projects and enhancements of the EOL platform. All this could, in my personal opinion, lead up to EOL beginning to serve as the Smithsonian’s phenotype repository. In parallel with GenBank, we could be the initial point of entry for ecologists or other biologists seeking large-scale structured information about the observable characteristics of organisms.