Using the US EPA’s CompTox Chemistry Dashboard for structure identification and non-targeted analyses

Antony Williams (1), Andrew D. McEachran (3), Seth Newton (2),
Kristin Isaacs (2), Katherine Phillips (2), Nancy Baker (1),
Chris Grulke (1) and Jon R. Sobus (2)
1) National Center for Computational Toxicology, U.S. Environmental Protection Agency, RTP, NC
2) National Exposure Research Laboratory, U.S. Environmental Protection Agency, RTP, NC
3) Oak Ridge Institute for Science and Education (ORISE) Research Participant, Research Triangle Park, NC
March 2018
ACS Spring Meeting, New Orleans
http://www.orcid.org/0000-0002-2668-4821
The views expressed in this presentation are those of the author and do not necessarily reflect the views or policies of the U.S. EPA

The CompTox Chemistry Dashboard
• A publicly accessible website delivering access to:
– ~760,000 chemicals with related property data
– Experimental and predicted physicochemical property data
– Experimental Human and Ecological hazard data
– Integration to “biological assay data” for 1000s of chemicals
– Information regarding consumer products containing chemicals
– Links to other agency websites and public data resources
– “Literature” searches for chemicals using public resources
– “Batch searching” for thousands of chemicals
– DOWNLOADABLE Open Data for reuse and repurposing

CompTox Chemistry Dashboard
https://comptox.epa.gov/dashboard

1 of ~761,000 Chemical Pages

Access to Chemical Hazard Data

Sources of Exposure to Chemicals

Dashboard for Structure ID
• Structure Identification using the dashboard
– Formula/mass-based searching – 1 chemical at a time

Advanced Searches
Mass Based Search
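A mass-based search boils down to turning a formula (or a measured mass) into a monoisotopic mass plus a tolerance window. The sketch below is illustrative only and is not the Dashboard's own code: the function names and the 5 ppm tolerance are assumptions, and the element table covers just C, H, N and O.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotopes, rounded.
# Only the elements needed for this example are listed.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
}

def monoisotopic_mass(formula: str) -> float:
    """Sum isotope masses for a simple formula such as 'C12H17NO'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total

def ppm_window(mass: float, ppm: float) -> tuple:
    """Return the (low, high) mass range for a +/- ppm tolerance."""
    delta = mass * ppm / 1e6
    return mass - delta, mass + delta

mass = monoisotopic_mass("C12H17NO")
low, high = ppm_window(mass, 5.0)  # 5 ppm is an assumed tolerance
print(f"{mass:.4f} Da, search window {low:.4f} to {high:.4f}")
```

Any candidate whose monoisotopic mass falls inside the window would be returned by a search of this kind.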

Exact Formula Search: C12H17NO
298 Chemicals

– Distilling structures into “MS-Ready” form

Specific Data-Mappings
“MS-Ready Structures”

Diphenhydramine
15 Total MS-Ready Mappings
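As a rough illustration of the idea behind these mappings, the sketch below strips counter-ion components from a salt so that the salt and the parent compound point to the same structure. It assumes MS-Ready processing includes salt removal, works on SMILES strings by plain text splitting, and uses a tiny hypothetical counter-ion list. A real workflow would use a cheminformatics toolkit and handle many more cases.

```python
from typing import List

# Hypothetical, deliberately small set of counter-ion and solvent SMILES.
COUNTER_IONS = {"Cl", "Br", "[Cl-]", "[Br-]", "[Na+]", "[K+]", "O"}

def ms_ready_components(smiles: str) -> List[str]:
    """Split a multi-component SMILES on '.' and drop known counter-ions."""
    return [part for part in smiles.split(".") if part not in COUNTER_IONS]

# Diphenhydramine hydrochloride maps back to the diphenhydramine parent.
salt = "CN(C)CCOC(c1ccccc1)c1ccccc1.Cl"
print(ms_ready_components(salt))
```

Linking every salt, hydrate and mixture form to a shared parent structure is what lets one observed formula retrieve all related Dashboard records.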

“MS Ready” Formula Search C12H17NO
354 Chemicals

– Ranking based on metadata

Identifying Known Unknowns
by reference ranking

Data source ranking using
the Dashboard
DOI: 10.1007/s00216-016-0139-z

Additional Metadata Ranking
• US EPA CompTox Chemistry Dashboard Data Sources
• “CPDat” Consumer Product Database
• PubChem Data Source Count
• PubMed Reference Count
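Metadata ranking of this kind can be sketched as a sort over candidates that share a formula, most widely referenced first. The candidate names and counts below are made-up placeholders, and the tie-breaking order (data sources, then PubMed references) is an assumption for illustration.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Candidate:
    name: str
    data_sources: int   # number of data sources listing the chemical
    pubmed_refs: int    # number of PubMed references

def rank_candidates(candidates: List[Candidate]) -> List[Candidate]:
    """Order candidates by data-source count, then PubMed count, descending."""
    return sorted(
        candidates,
        key=lambda c: (c.data_sources, c.pubmed_refs),
        reverse=True,
    )

# Placeholder values, not real Dashboard data.
hits = [
    Candidate("candidate A", data_sources=3, pubmed_refs=0),
    Candidate("candidate B", data_sources=120, pubmed_refs=950),
    Candidate("candidate C", data_sources=3, pubmed_refs=12),
]
for c in rank_candidates(hits):
    print(c.name, c.data_sources, c.pubmed_refs)
```

The premise is that a chemical present in many sources is more likely to be the one actually observed in a sample.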

C12H17NO: 354 Chemicals

Additional data streams
in development
• US EPA CompTox Chemistry Dashboard Data Sources
• “CPDat” Consumer Product Database
• PubChem Data Source Count
• PubMed Reference Count
• Retention Time Prediction
• Predicted Environmental Media Occurrence
• Presence in Lists
[Figure: candidate scores for C7H7NO3 on a 0 to 4 scale, one entry per candidate (DTXSID5024506, DTXSID3020962, DTXSID0026961, DTXSID2022591, DTXSID9059208, DTXSID1052298, DTXSID5075365, DTXSID2062535, DTXSID0046066, DTXSID90197716). Score components: Data Sources, Retention Time, Media Occurrence, Method Compatibility.]
$SC_{TOTAL} = SC_{DS} + SC_{PM} + SC_{RT} + SC_{MO} + \cdots$
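The total score is a sum of per-stream scores (DS, PM, RT and MO presumably standing for data sources, PubMed, retention time and media occurrence). A minimal sketch follows, assuming each stream is scaled to the range 0 to 1 by dividing by the largest value among the candidates. That scaling and the input values are assumptions, not the scheme used in the presentation.

```python
from typing import Dict, List

def normalize(values: List[float]) -> List[float]:
    """Scale values to 0..1 by the maximum; all zeros if the maximum is 0."""
    top = max(values)
    return [v / top if top else 0.0 for v in values]

def total_scores(components: Dict[str, List[float]]) -> List[float]:
    """Sum the normalized component scores for each candidate."""
    normalized = [normalize(values) for values in components.values()]
    return [sum(per_candidate) for per_candidate in zip(*normalized)]

# One list entry per candidate; numbers are placeholders.
components = {
    "DS": [10.0, 5.0],   # data source counts
    "PM": [0.0, 4.0],    # PubMed reference counts
}
print(total_scores(components))
```

With four streams scaled this way, the maximum possible total is 4.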

“Chemicals Detected in Water”

– Ranking based on metadata
– Batch searching of formulae and masses
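Before a batch mass search, observed m/z values are usually converted to neutral monoisotopic masses. The sketch below does this for singly charged protonated and deprotonated ions only. The function name and the example m/z list are hypothetical, and it does not reproduce the Dashboard's input format.

```python
PROTON_MASS = 1.007276  # Da, rounded

# Mass shift (Da) to subtract from m/z to obtain the neutral mass.
ADDUCT_SHIFT = {
    "[M+H]+": PROTON_MASS,
    "[M-H]-": -PROTON_MASS,
}

def neutral_mass(mz: float, adduct: str) -> float:
    """Convert a singly charged ion m/z to a neutral monoisotopic mass."""
    return mz - ADDUCT_SHIFT[adduct]

# Hypothetical feature list from a positive-mode run.
observed = [192.138290, 256.169600]
batch = [round(neutral_mass(mz, "[M+H]+"), 4) for mz in observed]
print(batch)
```

The resulting list of neutral masses is the kind of input a batch search would take, one mass per line.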

Batch Search Integration to MetFrag
http://c-ruttkies.github.io/MetFrag/projects/metfragweb/

The Dashboard to Support MS-Analysis
MS-Ready Structures Underpin Analysis

Future Work: Combined
Substructure/Formula Searching

Future Work: Searching Against
Predicted Spectra
• CFM-ID predicted spectra generated for
700,000 chemicals
– Positive ion, Negative ion, Electron Impact
– Three energies
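Searching against predicted spectra needs some measure of how well an observed spectrum matches a predicted one. One common choice is cosine similarity over binned peaks, sketched below. This is a generic illustration and not the scoring used by CFM-ID or the Dashboard. The bin width and peak lists are assumptions.

```python
import math
from typing import Dict, List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)

def bin_peaks(peaks: List[Peak], width: float = 0.01) -> Dict[int, float]:
    """Sum intensities of peaks that fall in the same m/z bin."""
    binned: Dict[int, float] = {}
    for mz, intensity in peaks:
        key = round(mz / width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned

def cosine_similarity(a: List[Peak], b: List[Peak]) -> float:
    """Cosine of the angle between two binned spectra (0 = no overlap)."""
    a_bins, b_bins = bin_peaks(a), bin_peaks(b)
    dot = sum(v * b_bins.get(k, 0.0) for k, v in a_bins.items())
    norm_a = math.sqrt(sum(v * v for v in a_bins.values()))
    norm_b = math.sqrt(sum(v * v for v in b_bins.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Placeholder peak lists, not real spectra.
observed = [(100.0, 50.0), (150.0, 100.0)]
predicted = [(100.0, 40.0), (150.0, 100.0), (175.0, 10.0)]
print(round(cosine_similarity(observed, predicted), 3))
```

Each candidate's predicted spectrum could be scored this way against the observed spectrum, and the score added to the metadata-based ranking.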

Future Work
Incorporate the scoring scheme into search results
$SC_{TOTAL} = SC_{DS} + SC_{PM} + SC_{RT} + SC_{MO} + \cdots$

Conclusion
• The CompTox Chemistry Dashboard provides
access to data for ~760,000 chemicals
• High-quality curated data and rich metadata
facilitate mass spec analysis
• “MS-Ready” processed data enables structure
identification

Acknowledgments
• The CompTox Chemistry Dashboard team
• NERL colleagues:
– Jon Sobus, Elin Ulrich, Mark Strynar, Seth Newton (NTA Analysis)
– Katherine Phillips, Kathie Dionisio, Kristin Isaacs (Consumer Products
Database)
• Emma Schymanski – Luxembourg Centre for
Systems Biomedicine (MS-ready/NTA)

Contact
Antony Williams
US EPA Office of Research and Development
National Center for Computational Toxicology (NCCT)
Williams.Antony@epa.gov
ORCID: https://orcid.org/0000-0002-2668-4821