SlideShare une entreprise Scribd logo
1  sur  38
Télécharger pour lire hors ligne
Consensus modeling
CERAPP - Collaborative Estrogen Receptor
Activity Prediction Project
Background and Goals
• U.S. Congress mandated that the EPA screen
chemicals for their potential to be endocrine
disruptors
• Led to development of the Endocrine Disruptor
Screening Program (EDSP)
• Initial focus was on environmental estrogens, but
program expanded to include androgens and thyroid
pathway disruptors
9/30/2015 2
EDSP Chemicals
• EDSP Legislation contained in:
– FIFRA: Federal Insecticide, Fungicide, Rodenticide Act
– SDWA: Safe Drinking Water Act
• Chemicals:
– All pesticide ingredients (actives and inerts)
– Chemicals likely to be found in drinking water to which a
significant population can be exposed
• Total EDSP Chemical universe is ~10,000
• Subsequent filters brings this to about 5,000 to be
tested
9/30/2015 3
The Problem with EDSP
• EDSP Consists of Tier 1 and Tier 2 tests
• Tier 1 is a battery of 11 in vitro and in vivo assays
• Cost ~$1,000,000 per chemical
• Throughput is ~50 chemicals / year
• Total cost of Tier 1 is billions of dollars and will take
100 years at the current rate
• Need pre-tier 1 filter
• Use combination of structure modeling tools and
high-throughput screening “EDSP21”
9/30/2015 4
CERAPP Goals
• Use structure-based models to predict ER activity for
all of EDSP Universe and aid in prioritization for EDSP
Tier 1
• Because models are relatively easy to run on large
numbers of chemicals, extend to all chemicals with
likely human exposure
• Chemicals with significant evidence of ER activity can
be queued further testing
9/30/2015 5
Thinking about need for accuracy
• Goal is prioritizing chemicals for further testing
– Sensitivity more important than specificity
– Better to leave in “funny” structures than to discard
– OK predictions today are better than perfect predictions
tomorrow
• There will be errors in:
– Chemical structures
– Chemical identities
– Model predictions
– Experimental data
• Structure library can improve / expand going forward
– Will be used for other prediction projects
9/30/2015 6
• DTU/food: Technical University of Denmark/ National Food Institute
• EPA/NCCT: U.S. Environmental Protection Agency / National Center for Computational Toxicology
• FDA/NCTR/DBB: U.S. Food and Drug Administration/ National Center for Toxicological
Research/Division of Bioinformatics and Biostatistics
• FDA/NCTR/DSB: U.S. Food and Drug Administration/ National Center for Toxicological
Research/Division of Systems Biology
• Helmholtz/ISB: Helmholtz Zentrum Muenchen/Institute of Structural Biology
• ILS&EPA/NCCT: ILS Inc & EPA/NCCT
• IRCSS: Istituto di Ricerche Farmacologiche “Mario Negri”
• JRC_Ispra: Joint Research Centre of the European Commission, Ispra.
• LockheedMartin&EPA: Lockheed Martin IS&GS/ High Performance Computing
• NIH/NCATS: National Institutes of Health/ National Center for Advancing Translational Sciences
• NIH/NCI: National Institutes of Health/ National Cancer Institute
• RIFM: Research Institute for Fragrance Materials, Inc
• UMEA/Chemistry: University of UMEA/ Chemistry department
• UNC/MML: University of North Carolina/ Laboratory for Molecular Modeling
• UniBA/Pharma: University of Bari/ Department of Pharmacy
• UNIMIB/Michem: University of Milano-Bicocca/ Milano Chemometrics and QSAR Research Group
• UNISTRA/Infochim: University of Strasbourg/ ChemoInformatique
Participants:
Plan of the project
1: Structures curation
- Collect chemical structures from different sources
- Design and document a workflow for structure cleaning
- Deliver the QSAR-ready training set and prediction set
2: Experimental data preparation
- Collect and clean experimental data for the evaluation set
- Define a strategy to evaluate the models separately
3: Modeling & predictions
- Train/refine the models based on the training set
- Deliver predictions and applicability domains for evaluation
4: Model evaluation
- Analyze the training and evaluation datasets
- Evaluate the predictions of each model separately
5: Consensus strategy
- Define a score for each model based on the evaluation step
- Define a weighting scheme from the scores
6: Consensus modeling & validation
- Combine the predictions based on the weighting scheme
- Validate the consensus model using an external dataset.
• U.S. EPA-NCCT
• University of North Carolina
• Danish Technical University-DTU Food
9/30/2015 9
Chemical structures curation
(standardization)
Subgroup:
INITIAL LIST OF CHEMICALS
File parsing & first check
Check for mixtures
Check for salts & counter ions
Normalization of specific chemotypes
Manual inspection
Removal of duplicates
Treatment of tautomeric forms
CURATED DATASET
Check for inorganics
Scheme of the curation workflow
UNC, DTU, EPA Consensus
Fourches, Muratov, Tropsha. J Chem Inf Model, 2010, 29, 476 – 488
KNIME workflow
Aim of the workflow:
• Combine (not reproduce) different procedures and ideas
• Minimize the differences between the structures used for prediction by
different groups
• Produce a flexible free and open source workflow to be shared
Fourches, Muratov, Tropsha. J Chem Inf Model, 2010, 29, 476 – 488
Wedebye, Niemelä, Nikolov, Dybdahl, Danish EPA Environmental Project No. 1503, 2013
Indigo
Parsing and 1st filter
SDF Parser: 40125 initial compounds
(Webservices: Pubchem, Chemspider)
40117 parsed compounds
Unique IDs
Errors reported
Unconnected structures
1. Separate unconnected fragments
2. MW filter on biggest Cpd
(497 compounds removed)
1. 2nd biggest is removed if:
• It was the same/stereo as the biggest component
• Not containing carbons
• It was a salt/solvent from the defined list of accepted salts and solvents.
Standardization of structures
• Explicit hydrogen removed
• Dearomtization
• Removal of chirality/stereochemistry info,
isotopes and pseudo-atoms
• Aromatization + add explicit hydrogen atoms
• Standardize Nitro groups
• Other: tautomerize/mesomerize
• Neutralize (when possible)
Standardize Nitro mesomers
SMARTS query to reaction
Mesomerization/tautomerization
• Azide mesomers
• Exo-enol tautomers
• Enamine-Imine tautomers
• Ynol-ketene tautomers
• ….
Neutralize Structures
Filter inacceptable atoms
• Generate InChi, InChi Key and Canonical
Smiles.
• Remove duplicates (InChis & canonical SMI)
• Remove molecules with inacceptable atoms.
Other then:
H, C, N, O, P, S, Se, F, Cl, Br, I, Li, Na, K, B, Si
Write results
• Calculate 2D descriptors (Indigo,
CDK, RDKit)
• Generate 3D conformers
• Optimize geometry (MMFF94S)
Generated files:
• Sdf file containing the 2D structures
• Excel file containing 2D descriptors
• Sdf file containing the 3D structures
• Excel file for error messages
Chemicals for Prediction:
The Human Exposure Universe
• EDSP Universe (10K)
• Chemicals with known use (40K) (CPCat & ACToR)
• Canadian Domestic Substances List (DSL) (23K)
• EPA DSSTox – structures of EPA/FDA interest (15K)
• ToxCast and Tox21 (In vitro ER data) (8K)
~55k to ~32K unique set of structures
20
• Training set (ToxCast): 1677 Chemicals
• Prediction Set: 32464 Chemicals
Subgroup:
• U.S.EPA/NCCT: Kamel Mansouri, Jayaram
Kancherla, Ann Richard, Richard Judson
• UMEA/Chem: Aleksandra Rybacka, Patrik
Andersson
• FDA/NCTR/DBB: Huixiao Hong
• NIH/NCATS: Ruili Huang
• Helmholtz/ISB: Igor Tetko
9/30/2015 21
Experimental data for evaluation
Tasks to fulfill
• Collect the experimental data for the
evaluation step.
• Combine the different sources of literature.
• Define a strategy to evaluate the models
separately.
a) Tox21, ~8000 chemicals in 4 assays;
b) FDA EDKB database of ~8000 chemicals from the literature;
c) METI database, ~2000 chemicals;
d) ChEMBL database, ~2000 chemicals.
Experimental data for evaluation set
EPA/NCCT, UMEA/Chem, FDA/NCTR/DBB, NIH/NCATS, Helmholtz/ISB
60,000 entries for ~15,000 chemicals
Cleaning procedure
• Knime workflow for structure cleaning
• INChi code for chemical matching
• 7,600 chemicals with CERAPP IDs
• Remove: in-vivo, cytotoxicity, ambiguous, missing values,
non-defined endpoints/units
• Categorize assays: binding, reporter gene or cell
proliferation
• Normalize units
• Use of reference chemicals to categorize into 5 classes.
7547 CERAPP compounds from 44641 entries
Categorize chemicals
• Merge entries with AC50, PC50, IC50, GI50
and EC50.
• Use of 36 reference categorized chemicals
• 5 classes created:
– Strong : 0-0.09 => score =1
– Moderate: 0.09-0.18 => score = 0.25
– Weak: 0.18-20 => score = 0.5
– Very Weak: 20-800 => score = 0.75
– Inactive: 800> => score = 0
Evaluation set
Evaluation set for binary classification models
Active Inactive Total
Binding 1982 5301 7283
Agonist 350 5969 6319
Antagonist 284 6255 6539
Total 2616 17525 20141
Evaluation set for quantitative models
Inactive V. Weak Weak Moderate Strong Total
Binding 5042 685 894 72 77 6770
Agonist 5892 19 179 31 42 6163
Antagonist 6221 76 188 10 10 6505
Total 17155 780 1261 113 129 19438
Consistency of the data
Consistency between speciesConsistency alpha/beta
data N. Chem B. Acc Sn Sp
All orig. all 1659 78.57 89.29 67.86
No VW all 1424 84.57 88.16 80.99
All orig. Multi Src 1410 84.42 88.46 80.38
No VW Multi Src 1306 87.79 87.67 87.9
Consistency training set/ evaluation set
Evaluation & consensus
(consensus subgroup: most of participants)
• Classification / Qualitative:
– Binding: 22 models
– Agonists: 11 models
– Antagonists: 9 models
• Regression / Quantitative:
– Binding: 3 models
– Agonists: 3 models
– Antagonists: 2 models
Models received: Preliminary Results
18 binding models with most chemicals
predicted
Euclidean distance
Concordance of the 22 classification models
for binding
757 chemicals have >75% positive concordance
Active
Inactive
Prioritization
Most models predict most chemicals as inactive
Evaluation procedure:
• On the EPA training set (1677)
• On the full evaluation set (~7k)
• Evaluation set with multi-sources
• Remove “VeryWeak”
• Remove single source
• Remove chemicals outside the AD
Score functions & weights for consensus predictions
𝑔_𝑠𝑐𝑜𝑟𝑒 = 1
3
𝑁𝐸𝑅 𝑇𝑜𝑥𝐶𝑎𝑠𝑡 ∗ 𝑁_𝑝𝑟𝑒𝑑 𝑇𝑜𝑥𝐶𝑎𝑠𝑡
𝑁_𝑡𝑜𝑡 𝑇𝑜𝑥𝐶𝑎𝑠𝑡
+
𝑁_𝑝𝑟𝑒𝑑
𝑁_𝑡𝑜𝑡
+ 1
𝑁𝑓𝑖𝑙𝑡𝑒𝑟
𝑖=1
𝑁 𝑓𝑖𝑙𝑡𝑒𝑟
𝑁𝐸𝑅𝑖 ∗ 𝑁_𝑝𝑟𝑒𝑑𝑖
𝑁_𝑡𝑜𝑡𝑖
𝑜𝑝𝑡_𝑠𝑐𝑜𝑟𝑒 = 1
2 𝑁𝐸𝑅 𝑇𝑜𝑥𝐶𝑎𝑠𝑡 + 𝑁𝐸𝑅 𝑎𝑙𝑙_𝑓𝑖𝑙𝑡𝑒𝑟𝑠
Evaluation of binding models
Models Training set B. Ac. Training Evaluation set B. Ac. Eval Unambiguous Accu Unambig All predicted g_score opt_score
DTU_1 873 0.82 3840 0.64 2695 0.78 16063 0.43 0.80
DTU_2 737 0.79 3268 0.61 2383 0.71 13442 0.36 0.75
EPA_NCCT 1529 0.87 7283 0.57 5275 0.69 32463 0.82 0.78
FDA_NCTR_DBB 1529 0.99 7283 0.60 5991 0.68 32464 0.87 0.84
FDA_NCTR_DSB 0 0.00 534 0.53 431 0.53 2008 0.03 0.53
Helmholtz_ISB 1512 0.89 7123 0.62 5860 0.72 31629 0.83 0.80
ILS_EPA 1506 0.84 7068 0.66 5814 0.75 31318 0.82 0.79
IRCCS_CART 1529 0.80 7280 0.61 3620 0.75 32442 0.78 0.77
IRCCS_Ruleset 1383 0.91 6603 0.56 5416 0.62 28958 0.75 0.77
JRC_Ispra 1465 0.82 6900 0.58 5672 0.67 30801 0.77 0.74
LockheedMartin_EPA_1 1529 0.83 7283 0.55 1539 0.66 32464 0.75 0.75
LockheedMartin_EPA_2 1529 0.76 7283 0.54 1539 0.64 32464 0.72 0.70
NIH_NCATS 1528 0.69 7271 0.59 5981 0.65 32184 0.77 0.67
NIH_NCI_GUASAR 1529 0.99 7283 0.61 5951 0.69 32455 0.88 0.84
NIH_NCI_PASS 1465 0.86 6900 0.58 5672 0.66 30800 0.78 0.76
RIFM 1529 0.73 7283 0.58 5991 0.65 32463 0.78 0.69
UMEA 1529 0.82 7280 0.61 5989 0.70 32430 0.82 0.76
UNC_MML_1 1529 0.80 7283 0.59 5991 0.65 32464 0.80 0.73
UNC_MML_2 1529 0.49 7283 0.55 5991 0.60 32464 0.69 0.55
UNIBA 750 0.86 3259 0.62 2753 0.73 15178 0.40 0.80
UNIMIB_Michem_1 1529 0.76 7283 0.55 5991 0.59 32464 0.77 0.68
UNIMIB_Michem_2 531 0.98 2780 0.62 2241 0.71 11832 0.32 0.85
UNISTRA_InfoChim 1529 0.86 7283 0.57 4755 0.60 32464 0.80 0.73
Consensus_1 predictions
Binding Agonist Antagonist
CERAPP
ID
n_act score n_no score cons act_conc inact_c Potency cons act_c inact_c Potency cons act_c inact_c Potency
10001 1 0.05 15 0.71 0 0.06 0.94 Inactive 0 0.00 1.00 Inactive 0 0.00 1.00 Inactive
10005 3 0.11 17 0.65 0 0.15 0.85 Inactive 0 0.11 0.89 Inactive 0 0.14 0.86 Inactive
10007 4 0.18 12 0.58 0 0.25 0.75 Inactive 0 0.00 1.00 Inactive 0 0.00 1.00 Inactive
10008 0 0.00 18 0.76 0 0.00 1.00 Inactive 0 0.00 1.00 Inactive 0 0.00 1.00 Inactive
10009 1 0.04 17 0.71 0 0.06 0.94 Inactive 0 0.00 1.00 Inactive 0 0.00 1.00 Inactive
10016 21 0.76 0 0.00 1 1.00 0.00 Strong 1 1.00 0.00 Strong 1 1.00 0.00 Inactive
10017 21 0.76 0 0.00 1 1.00 0.00 Strong 1 1.00 0.00 Strong 1 1.00 0.00 Inactive
10018 16 0.61 4 0.15 1 0.80 0.20 VeryWeak 1 0.89 0.11 VeryWeak 1 0.86 0.14 Inactive
10027 19 0.72 1 0.04 1 0.95 0.05 Moderate 0 0.10 0.90 Inactive 0 0.13 0.88 Moderate
10033 4 0.17 13 0.58 0 0.24 0.76 Inactive 0 0.00 1.00 Inactive 0 0.00 1.00 Inactive
10034 21 0.75 0 0.00 1 1.00 0.00 Moderate 1 0.89 0.11 Moderate 1 0.86 0.14 Inactive
10088 11 0.42 9 0.34 1 0.55 0.45 VeryWeak 1 0.78 0.22 VeryWeak 1 0.86 0.14 Inactive
10089 1 0.04 19 0.72 0 0.05 0.95 Inactive 0 0.00 1.00 Inactive 0 0.00 1.00 Inactive
10099 2 0.09 15 0.66 0 0.12 0.88 Inactive 0 0.11 0.89 Inactive 0 0.13 0.88 Inactive
10100 6 0.24 12 0.50 0 0.33 0.67 Inactive 0 0.00 1.00 Inactive 0 0.00 1.00 Inactive
10101 3 0.12 16 0.64 0 0.16 0.84 Inactive 0 0.11 0.89 Inactive 0 0.14 0.86 Inactive
10102 12 0.43 9 0.32 1 0.57 0.43 VeryWeak 1 0.78 0.22 VeryWeak 1 0.71 0.29 Inactive
10111 3 0.12 16 0.64 0 0.16 0.84 Inactive 0 0.00 1.00 Inactive 0 0.00 1.00 Inactive
10112 22 0.75 0 0.00 1 1.00 0.00 Weak 1 1.00 0.00 Weak 1 1.00 0.00 Inactive
10113 21 0.75 0 0.00 1 1.00 0.00 Weak 1 1.00 0.00 Weak 1 1.00 0.00 Inactive
10119 12 0.46 8 0.30 1 0.60 0.40 VeryWeak 1 0.78 0.22 VeryWeak 1 0.71 0.29 Inactive
10120 11 0.39 10 0.36 1 0.52 0.48 VeryWeak 1 0.78 0.22 VeryWeak 1 0.71 0.29 Inactive
ToxCast
data
Evaluation
set
Sensitivity 0.85
0.98
0.92
0.23
0.95
0.59
Specificity
Balanced accuracy
Total binders: 2576
Agonists: 2312
Antagonists: 2779
Consensus_1 evaluation
Agonist and antagonist consensus models first, then on binding consensus:
1) If chemical i is active in classification consensus_1
 active in Potency_class consensus_2
2) If chemical i is active in regression & >= 3 positive classification models
 active in classification consensus_2
3) If chemical i is active in regression & < 3 positive classification models
 Inactive in Potency_class consensus_2
Binding consensus:
4) If chemical i is active agonist or active antagonist
 Active in classification consensus_2
 Potency_class consensus_2 = Potency_class agonist/antagonist
Rules for consensus_2
Total binders: 3961
Agonists: 2494
Antagonists: 2793
Consensus_2 evaluation
ToxCast data
Literature data
(All: 7283)
Literature data
(>6 sources: 1209)
Sensitivity 0.93 0.30 0.87
Specificity 0.97 0.91 0.94
Balanced accuracy 0.95 0.61 0.91
ToxCast data Literature data
ObservedPredicted Actives Inactives Actives Inactives
Actives 83 6 597 1385
Inactives 40 1400 463 4838
• positive concordance < 0.6 => Potency class= Very weak
• 0.6=<positive concordance<0.75 => Potency class= Weak
• 0.75=<positive concordance<0.9 => Potency class= Moderate
• positive concordance>=0.9 => Potency class= Strong
Positive concordance & Potency level
Box plot of the positive classes of the
consensus model.
Variation of the balanced accuracy with
positive concordance thresholds
New External validation set
ToxCast phIII+ Tox21 agonist assays
ObservedPredicted Actives Inactives
Actives 19 23
Inactives 17 561
ObservedPredicted Actives Inactives
Actives 13 3
Inactives 17 551
Specificity: 0.97 Sensitivity: 0.81 Balanced accuracy: 0.89
Specificity: 0.97 Sensitivity: 0.45 Balanced accuracy: 0.71
All matching chemicals: 620
Only chemicals in agreement with other literature sources: 584
• High quality training set (1677 chemicals)
• Free & open-source structure curation workflow
• Curated structures with potential exposure (32k)
• QSAR-ready dataset from the literature (~7k)
• Consensus models for binding, agonist & antagonist
• 32k list predicted for prioritization.
• EDSP dashboard: http://actor.epa.gov/edsp21/
future work
Conclusions
• Validate binding consensus with the new external set
• Clean literature data from cytotoxicity. Use it as QSAR
ready set.

Contenu connexe

Tendances

Virtual screening of chemicals for endocrine disrupting activity through CER...
Virtual screening of chemicals for endocrine disrupting activity through  CER...Virtual screening of chemicals for endocrine disrupting activity through  CER...
Virtual screening of chemicals for endocrine disrupting activity through CER...Kamel Mansouri
 
OPERA: A free and open source QSAR tool for predicting physicochemical proper...
OPERA: A free and open source QSAR tool for predicting physicochemical proper...OPERA: A free and open source QSAR tool for predicting physicochemical proper...
OPERA: A free and open source QSAR tool for predicting physicochemical proper...Kamel Mansouri
 
Consensus Models to Predict Endocrine Disruption for All Human-Exposure Chemi...
Consensus Models to Predict Endocrine Disruption for All Human-Exposure Chemi...Consensus Models to Predict Endocrine Disruption for All Human-Exposure Chemi...
Consensus Models to Predict Endocrine Disruption for All Human-Exposure Chemi...Kamel Mansouri
 
Free online access to experimental and predicted chemical properties through ...
Free online access to experimental and predicted chemical properties through ...Free online access to experimental and predicted chemical properties through ...
Free online access to experimental and predicted chemical properties through ...Kamel Mansouri
 
Data drivenapproach to medicinalchemistry
Data drivenapproach to medicinalchemistryData drivenapproach to medicinalchemistry
Data drivenapproach to medicinalchemistryAnn-Marie Roche
 

Tendances (20)

The EPA Online Prediction Physicochemical Prediction Platform to Support Envi...
The EPA Online Prediction Physicochemical Prediction Platform to Support Envi...The EPA Online Prediction Physicochemical Prediction Platform to Support Envi...
The EPA Online Prediction Physicochemical Prediction Platform to Support Envi...
 
Environmental Chemistry Compound Identification Using High Resolution Mass Sp...
Environmental Chemistry Compound Identification Using High Resolution Mass Sp...Environmental Chemistry Compound Identification Using High Resolution Mass Sp...
Environmental Chemistry Compound Identification Using High Resolution Mass Sp...
 
The influence of data curation on QSAR Modeling – examining issues of qualit...
 The influence of data curation on QSAR Modeling – examining issues of qualit... The influence of data curation on QSAR Modeling – examining issues of qualit...
The influence of data curation on QSAR Modeling – examining issues of qualit...
 
Delivering The Benefits of Chemical-Biological Integration in Computational T...
Delivering The Benefits of Chemical-Biological Integration in Computational T...Delivering The Benefits of Chemical-Biological Integration in Computational T...
Delivering The Benefits of Chemical-Biological Integration in Computational T...
 
Virtual screening of chemicals for endocrine disrupting activity through CER...
Virtual screening of chemicals for endocrine disrupting activity through  CER...Virtual screening of chemicals for endocrine disrupting activity through  CER...
Virtual screening of chemicals for endocrine disrupting activity through CER...
 
Structure Identification Using High Resolution Mass Spectrometry Data and the...
Structure Identification Using High Resolution Mass Spectrometry Data and the...Structure Identification Using High Resolution Mass Spectrometry Data and the...
Structure Identification Using High Resolution Mass Spectrometry Data and the...
 
Progress in Using Big Data in Chemical Toxicity Research at the National Cent...
Progress in Using Big Data in Chemical Toxicity Research at the National Cent...Progress in Using Big Data in Chemical Toxicity Research at the National Cent...
Progress in Using Big Data in Chemical Toxicity Research at the National Cent...
 
OPERA: A free and open source QSAR tool for predicting physicochemical proper...
OPERA: A free and open source QSAR tool for predicting physicochemical proper...OPERA: A free and open source QSAR tool for predicting physicochemical proper...
OPERA: A free and open source QSAR tool for predicting physicochemical proper...
 
The EPA iCSS Chemistry Dashboard to Support Compound Identification Using Hig...
The EPA iCSS Chemistry Dashboard to Support Compound Identification Using Hig...The EPA iCSS Chemistry Dashboard to Support Compound Identification Using Hig...
The EPA iCSS Chemistry Dashboard to Support Compound Identification Using Hig...
 
Web-based access to experimental and predicted data for environmental fate, t...
Web-based access to experimental and predicted data for environmental fate, t...Web-based access to experimental and predicted data for environmental fate, t...
Web-based access to experimental and predicted data for environmental fate, t...
 
Consensus Models to Predict Endocrine Disruption for All Human-Exposure Chemi...
Consensus Models to Predict Endocrine Disruption for All Human-Exposure Chemi...Consensus Models to Predict Endocrine Disruption for All Human-Exposure Chemi...
Consensus Models to Predict Endocrine Disruption for All Human-Exposure Chemi...
 
Free online access to experimental and predicted chemical properties through ...
Free online access to experimental and predicted chemical properties through ...Free online access to experimental and predicted chemical properties through ...
Free online access to experimental and predicted chemical properties through ...
 
US EPA CompTox Chemistry Dashboard as a source of data to fill data gaps for ...
US EPA CompTox Chemistry Dashboard as a source of data to fill data gaps for ...US EPA CompTox Chemistry Dashboard as a source of data to fill data gaps for ...
US EPA CompTox Chemistry Dashboard as a source of data to fill data gaps for ...
 
The needs for chemistry standards, database tools and data curation at the ch...
The needs for chemistry standards, database tools and data curation at the ch...The needs for chemistry standards, database tools and data curation at the ch...
The needs for chemistry standards, database tools and data curation at the ch...
 
Development of a Tool for Systematic Integration of Traditional and New Appro...
Development of a Tool for Systematic Integration of Traditional and New Appro...Development of a Tool for Systematic Integration of Traditional and New Appro...
Development of a Tool for Systematic Integration of Traditional and New Appro...
 
Accessing information for chemicals in hydraulic fracturing fluids using the ...
Accessing information for chemicals in hydraulic fracturing fluids using the ...Accessing information for chemicals in hydraulic fracturing fluids using the ...
Accessing information for chemicals in hydraulic fracturing fluids using the ...
 
Chemical identification of unknowns in high resolution mass spectrometry usin...
Chemical identification of unknowns in high resolution mass spectrometry usin...Chemical identification of unknowns in high resolution mass spectrometry usin...
Chemical identification of unknowns in high resolution mass spectrometry usin...
 
Data drivenapproach to medicinalchemistry
Data drivenapproach to medicinalchemistryData drivenapproach to medicinalchemistry
Data drivenapproach to medicinalchemistry
 
Structure identification approaches using the EPA CompTox Chemicals Dashboard...
Structure identification approaches using the EPA CompTox Chemicals Dashboard...Structure identification approaches using the EPA CompTox Chemicals Dashboard...
Structure identification approaches using the EPA CompTox Chemicals Dashboard...
 
New Approach Methods - What is That?
New Approach Methods - What is That?New Approach Methods - What is That?
New Approach Methods - What is That?
 

Similaire à CERAPP - Collaborative Estrogen Receptor Activity Prediction Project. Computational Toxicology Communities of Practice

CoMPARA: Collaborative Modeling Project for Androgen Receptor Activity
CoMPARA: Collaborative Modeling Project for Androgen Receptor ActivityCoMPARA: Collaborative Modeling Project for Androgen Receptor Activity
CoMPARA: Collaborative Modeling Project for Androgen Receptor ActivityKamel Mansouri
 
Virtual screening of chemicals for endocrine disrupting activity: Case studie...
Virtual screening of chemicals for endocrine disrupting activity: Case studie...Virtual screening of chemicals for endocrine disrupting activity: Case studie...
Virtual screening of chemicals for endocrine disrupting activity: Case studie...Kamel Mansouri
 
International Computational Collaborations to Solve Toxicology Problems
International Computational Collaborations to Solve Toxicology ProblemsInternational Computational Collaborations to Solve Toxicology Problems
International Computational Collaborations to Solve Toxicology ProblemsKamel Mansouri
 
An examination of data quality on QSAR Modeling in regards to the environment...
An examination of data quality on QSAR Modeling in regards to the environment...An examination of data quality on QSAR Modeling in regards to the environment...
An examination of data quality on QSAR Modeling in regards to the environment...Kamel Mansouri
 
Jordi Labs Agilent Extractables & Leachables (E&L) Webinar Presentation (Part...
Jordi Labs Agilent Extractables & Leachables (E&L) Webinar Presentation (Part...Jordi Labs Agilent Extractables & Leachables (E&L) Webinar Presentation (Part...
Jordi Labs Agilent Extractables & Leachables (E&L) Webinar Presentation (Part...Jordi Labs
 
Molecular modelling for in silico drug discovery
Molecular modelling for in silico drug discoveryMolecular modelling for in silico drug discovery
Molecular modelling for in silico drug discoveryLee Larcombe
 
SOT short course on computational toxicology
SOT short course on computational toxicology SOT short course on computational toxicology
SOT short course on computational toxicology Sean Ekins
 
Development and sharing of ADME/Tox and Drug Discovery Machine learning models
Development and sharing of ADME/Tox and Drug Discovery Machine learning modelsDevelopment and sharing of ADME/Tox and Drug Discovery Machine learning models
Development and sharing of ADME/Tox and Drug Discovery Machine learning modelsSean Ekins
 
Bioinformatics t9-t10-biocheminformatics v2014
Bioinformatics t9-t10-biocheminformatics v2014Bioinformatics t9-t10-biocheminformatics v2014
Bioinformatics t9-t10-biocheminformatics v2014Prof. Wim Van Criekinge
 
2015 bioinformatics bio_cheminformatics_wim_vancriekinge
2015 bioinformatics bio_cheminformatics_wim_vancriekinge2015 bioinformatics bio_cheminformatics_wim_vancriekinge
2015 bioinformatics bio_cheminformatics_wim_vancriekingeProf. Wim Van Criekinge
 
Mining 'Bigger' Datasets to Create, Validate and Share Machine Learning Models
Mining 'Bigger' Datasets to Create, Validate and Share Machine Learning ModelsMining 'Bigger' Datasets to Create, Validate and Share Machine Learning Models
Mining 'Bigger' Datasets to Create, Validate and Share Machine Learning ModelsSean Ekins
 
Bioinformatics t9-t10-bio cheminformatics-wimvancriekinge_v2013
Bioinformatics t9-t10-bio cheminformatics-wimvancriekinge_v2013Bioinformatics t9-t10-bio cheminformatics-wimvancriekinge_v2013
Bioinformatics t9-t10-bio cheminformatics-wimvancriekinge_v2013Prof. Wim Van Criekinge
 

Similaire à CERAPP - Collaborative Estrogen Receptor Activity Prediction Project. Computational Toxicology Communities of Practice (20)

CoMPARA: Collaborative Modeling Project for Androgen Receptor Activity
CoMPARA: Collaborative Modeling Project for Androgen Receptor ActivityCoMPARA: Collaborative Modeling Project for Androgen Receptor Activity
CoMPARA: Collaborative Modeling Project for Androgen Receptor Activity
 
Virtual screening of chemicals for endocrine disrupting activity: Case studie...
Virtual screening of chemicals for endocrine disrupting activity: Case studie...Virtual screening of chemicals for endocrine disrupting activity: Case studie...
Virtual screening of chemicals for endocrine disrupting activity: Case studie...
 
Accessing Environmental Chemistry Data via Data Dashboards and Applications t...
Accessing Environmental Chemistry Data via Data Dashboards and Applications t...Accessing Environmental Chemistry Data via Data Dashboards and Applications t...
Accessing Environmental Chemistry Data via Data Dashboards and Applications t...
 
International Computational Collaborations to Solve Toxicology Problems
International Computational Collaborations to Solve Toxicology ProblemsInternational Computational Collaborations to Solve Toxicology Problems
International Computational Collaborations to Solve Toxicology Problems
 
An examination of data quality on QSAR Modeling in regards to the environment...
An examination of data quality on QSAR Modeling in regards to the environment...An examination of data quality on QSAR Modeling in regards to the environment...
An examination of data quality on QSAR Modeling in regards to the environment...
 
Jordi Labs Agilent Extractables & Leachables (E&L) Webinar Presentation (Part...
Jordi Labs Agilent Extractables & Leachables (E&L) Webinar Presentation (Part...Jordi Labs Agilent Extractables & Leachables (E&L) Webinar Presentation (Part...
Jordi Labs Agilent Extractables & Leachables (E&L) Webinar Presentation (Part...
 
Molecular modelling for in silico drug discovery
Molecular modelling for in silico drug discoveryMolecular modelling for in silico drug discovery
Molecular modelling for in silico drug discovery
 
The US-EPA CompTox Chemicals Dashboard – a key player in the domain of Open S...
The US-EPA CompTox Chemicals Dashboard – a key player in the domain of Open S...The US-EPA CompTox Chemicals Dashboard – a key player in the domain of Open S...
The US-EPA CompTox Chemicals Dashboard – a key player in the domain of Open S...
 
The US-EPA CompTox Chemicals Dashboard to support Non-Targeted Analysis
The US-EPA CompTox Chemicals Dashboard to support Non-Targeted AnalysisThe US-EPA CompTox Chemicals Dashboard to support Non-Targeted Analysis
The US-EPA CompTox Chemicals Dashboard to support Non-Targeted Analysis
 
SOT short course on computational toxicology
SOT short course on computational toxicology SOT short course on computational toxicology
SOT short course on computational toxicology
 
TRIANGLE AREA MASS SPECTOMETRY MEETING: Structure Identification Approaches U...
TRIANGLE AREA MASS SPECTOMETRY MEETING: Structure Identification Approaches U...TRIANGLE AREA MASS SPECTOMETRY MEETING: Structure Identification Approaches U...
TRIANGLE AREA MASS SPECTOMETRY MEETING: Structure Identification Approaches U...
 
Development and sharing of ADME/Tox and Drug Discovery Machine learning models
Development and sharing of ADME/Tox and Drug Discovery Machine learning modelsDevelopment and sharing of ADME/Tox and Drug Discovery Machine learning models
Development and sharing of ADME/Tox and Drug Discovery Machine learning models
 
How to place your research questions or results into the context of the "Lega...
How to place your research questions or results into the context of the "Lega...How to place your research questions or results into the context of the "Lega...
How to place your research questions or results into the context of the "Lega...
 
Accessing Environmental Chemistry Data via Data Dashboards
Accessing Environmental Chemistry Data via Data Dashboards Accessing Environmental Chemistry Data via Data Dashboards
Accessing Environmental Chemistry Data via Data Dashboards
 
Bioinformatics t9-t10-biocheminformatics v2014
Bioinformatics t9-t10-biocheminformatics v2014Bioinformatics t9-t10-biocheminformatics v2014
Bioinformatics t9-t10-biocheminformatics v2014
 
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
 
2015 bioinformatics bio_cheminformatics_wim_vancriekinge
2015 bioinformatics bio_cheminformatics_wim_vancriekinge2015 bioinformatics bio_cheminformatics_wim_vancriekinge
2015 bioinformatics bio_cheminformatics_wim_vancriekinge
 
Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...
Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...
Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...
 
Mining 'Bigger' Datasets to Create, Validate and Share Machine Learning Models
Mining 'Bigger' Datasets to Create, Validate and Share Machine Learning ModelsMining 'Bigger' Datasets to Create, Validate and Share Machine Learning Models
Mining 'Bigger' Datasets to Create, Validate and Share Machine Learning Models
 
Bioinformatics t9-t10-bio cheminformatics-wimvancriekinge_v2013
Bioinformatics t9-t10-bio cheminformatics-wimvancriekinge_v2013Bioinformatics t9-t10-bio cheminformatics-wimvancriekinge_v2013
Bioinformatics t9-t10-bio cheminformatics-wimvancriekinge_v2013
 

Plus de Kamel Mansouri

Automated workflows for data curation and standardization of chemical structu...
Automated workflows for data curation and standardization of chemical structu...Automated workflows for data curation and standardization of chemical structu...
Automated workflows for data curation and standardization of chemical structu...Kamel Mansouri
 
Chemical prioritization using in silico modeling. SOT 2018 (San Antonio, USA)
Chemical prioritization using in silico modeling. SOT 2018 (San Antonio, USA)Chemical prioritization using in silico modeling. SOT 2018 (San Antonio, USA)
Chemical prioritization using in silico modeling. SOT 2018 (San Antonio, USA)Kamel Mansouri
 
Scoring and ranking of metabolic trees to computationally prioritize chemical...
Scoring and ranking of metabolic trees to computationally prioritize chemical...Scoring and ranking of metabolic trees to computationally prioritize chemical...
Scoring and ranking of metabolic trees to computationally prioritize chemical...Kamel Mansouri
 
QSAR STUDY ON READY BIODEGRADABILITY OF CHEMICALS. Presented at the 3rd Chemo...
QSAR STUDY ON READY BIODEGRADABILITY OF CHEMICALS. Presented at the 3rd Chemo...QSAR STUDY ON READY BIODEGRADABILITY OF CHEMICALS. Presented at the 3rd Chemo...
QSAR STUDY ON READY BIODEGRADABILITY OF CHEMICALS. Presented at the 3rd Chemo...Kamel Mansouri
 
In-silico study of ToxCast GPCR assays by quantitative structure-activity rel...
In-silico study of ToxCast GPCR assays by quantitative structure-activity rel...In-silico study of ToxCast GPCR assays by quantitative structure-activity rel...
In-silico study of ToxCast GPCR assays by quantitative structure-activity rel...Kamel Mansouri
 
In-silico structure activity relationship study of toxicity endpoints by QSAR...
In-silico structure activity relationship study of toxicity endpoints by QSAR...In-silico structure activity relationship study of toxicity endpoints by QSAR...
In-silico structure activity relationship study of toxicity endpoints by QSAR...Kamel Mansouri
 

Plus de Kamel Mansouri (6)

Automated workflows for data curation and standardization of chemical structu...
Automated workflows for data curation and standardization of chemical structu...Automated workflows for data curation and standardization of chemical structu...
Automated workflows for data curation and standardization of chemical structu...
 
Chemical prioritization using in silico modeling. SOT 2018 (San Antonio, USA)
Chemical prioritization using in silico modeling. SOT 2018 (San Antonio, USA)Chemical prioritization using in silico modeling. SOT 2018 (San Antonio, USA)
Chemical prioritization using in silico modeling. SOT 2018 (San Antonio, USA)
 
Scoring and ranking of metabolic trees to computationally prioritize chemical...
Scoring and ranking of metabolic trees to computationally prioritize chemical...Scoring and ranking of metabolic trees to computationally prioritize chemical...
Scoring and ranking of metabolic trees to computationally prioritize chemical...
 
QSAR STUDY ON READY BIODEGRADABILITY OF CHEMICALS. Presented at the 3rd Chemo...
QSAR STUDY ON READY BIODEGRADABILITY OF CHEMICALS. Presented at the 3rd Chemo...QSAR STUDY ON READY BIODEGRADABILITY OF CHEMICALS. Presented at the 3rd Chemo...
QSAR STUDY ON READY BIODEGRADABILITY OF CHEMICALS. Presented at the 3rd Chemo...
 
In-silico study of ToxCast GPCR assays by quantitative structure-activity rel...
In-silico study of ToxCast GPCR assays by quantitative structure-activity rel...In-silico study of ToxCast GPCR assays by quantitative structure-activity rel...
In-silico study of ToxCast GPCR assays by quantitative structure-activity rel...
 
In-silico structure activity relationship study of toxicity endpoints by QSAR...
In-silico structure activity relationship study of toxicity endpoints by QSAR...In-silico structure activity relationship study of toxicity endpoints by QSAR...
In-silico structure activity relationship study of toxicity endpoints by QSAR...
 

Dernier

Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfSumit Kumar yadav
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...jana861314
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...anilsa9823
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhousejana861314
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCEPRINCE C P
 
Cultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptxCultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptxpradhanghanshyam7136
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |aasikanpl
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxgindu3009
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...RohitNehra6
 
Boyles law module in the grade 10 science
Boyles law module in the grade 10 scienceBoyles law module in the grade 10 science
Boyles law module in the grade 10 sciencefloriejanemacaya1
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptxanandsmhk
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRDelhi Call girls
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)PraveenaKalaiselvan1
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...ssifa0344
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...Sérgio Sacani
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisDiwakar Mishra
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )aarthirajkumar25
 

Dernier (20)

Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdf
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
 
Orientation, design and principles of polyhouse
Orientation, design and principles of polyhouseOrientation, design and principles of polyhouse
Orientation, design and principles of polyhouse
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
 
Cultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptxCultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptx
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptx
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
Boyles law module in the grade 10 science
Boyles law module in the grade 10 scienceBoyles law module in the grade 10 science
Boyles law module in the grade 10 science
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
 
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 

CERAPP - Collaborative Estrogen Receptor Activity Prediction Project. Computational Toxicology Communities of Practice

  • 1. Consensus modeling CERAPP - Collaborative Estrogen Receptor Activity Prediction Project
  • 2. Background and Goals • U.S. Congress mandated that the EPA screen chemicals for their potential to be endocrine disruptors • Led to development of the Endocrine Disruptor Screening Program (EDSP) • Initial focus was on environmental estrogens, but program expanded to include androgens and thyroid pathway disruptors 9/30/2015 2
  • 3. EDSP Chemicals • EDSP Legislation contained in: – FIFRA: Federal Insecticide, Fungicide, Rodenticide Act – SDWA: Safe Drinking Water Act • Chemicals: – All pesticide ingredients (actives and inerts) – Chemicals likely to be found in drinking water to which a significant population can be exposed • Total EDSP Chemical universe is ~10,000 • Subsequent filters brings this to about 5,000 to be tested 9/30/2015 3
  • 4. The Problem with EDSP • EDSP Consists of Tier 1 and Tier 2 tests • Tier 1 is a battery of 11 in vitro and in vivo assays • Cost ~$1,000,000 per chemical • Throughput is ~50 chemicals / year • Total cost of Tier 1 is billions of dollars and will take 100 years at the current rate • Need pre-tier 1 filter • Use combination of structure modeling tools and high-throughput screening “EDSP21” 9/30/2015 4
  • 5. CERAPP Goals • Use structure-based models to predict ER activity for all of EDSP Universe and aid in prioritization for EDSP Tier 1 • Because models are relatively easy to run on large numbers of chemicals, extend to all chemicals with likely human exposure • Chemicals with significant evidence of ER activity can be queued further testing 9/30/2015 5
  • 6. Thinking about need for accuracy • Goal is prioritizing chemicals for further testing – Sensitivity more important than specificity – Better to leave in “funny” structures than to discard – OK predictions today are better than perfect predictions tomorrow • There will be errors in: – Chemical structures – Chemical identities – Model predictions – Experimental data • Structure library can improve / expand going forward – Will be used for other prediction projects 9/30/2015 6
  • 7. • DTU/food: Technical University of Denmark/ National Food Institute • EPA/NCCT: U.S. Environmental Protection Agency / National Center for Computational Toxicology • FDA/NCTR/DBB: U.S. Food and Drug Administration/ National Center for Toxicological Research/Division of Bioinformatics and Biostatistics • FDA/NCTR/DSB: U.S. Food and Drug Administration/ National Center for Toxicological Research/Division of Systems Biology • Helmholtz/ISB: Helmholtz Zentrum Muenchen/Institute of Structural Biology • ILS&EPA/NCCT: ILS Inc & EPA/NCCT • IRCSS: Istituto di Ricerche Farmacologiche “Mario Negri” • JRC_Ispra: Joint Research Centre of the European Commission, Ispra. • LockheedMartin&EPA: Lockheed Martin IS&GS/ High Performance Computing • NIH/NCATS: National Institutes of Health/ National Center for Advancing Translational Sciences • NIH/NCI: National Institutes of Health/ National Cancer Institute • RIFM: Research Institute for Fragrance Materials, Inc • UMEA/Chemistry: University of UMEA/ Chemistry department • UNC/MML: University of North Carolina/ Laboratory for Molecular Modeling • UniBA/Pharma: University of Bari/ Department of Pharmacy • UNIMIB/Michem: University of Milano-Bicocca/ Milano Chemometrics and QSAR Research Group • UNISTRA/Infochim: University of Strasbourg/ ChemoInformatique Participants:
  • 8. Plan of the project 1: Structures curation - Collect chemical structures from different sources - Design and document a workflow for structure cleaning - Deliver the QSAR-ready training set and prediction set 2: Experimental data preparation - Collect and clean experimental data for the evaluation set - Define a strategy to evaluate the models separately 3: Modeling & predictions - Train/refine the models based on the training set - Deliver predictions and applicability domains for evaluation 4: Model evaluation - Analyze the training and evaluation datasets - Evaluate the predictions of each model separately 5: Consensus strategy - Define a score for each model based on the evaluation step - Define a weighting scheme from the scores 6: Consensus modeling & validation - Combine the predictions based on the weighting scheme - Validate the consensus model using an external dataset.
  • 9. • U.S. EPA-NCCT • University of North Carolina • Danish Technical University-DTU Food 9/30/2015 9 Chemical structures curation (standardization) Subgroup:
  • 10. INITIAL LIST OF CHEMICALS File parsing & first check Check for mixtures Check for salts & counter ions Normalization of specific chemotypes Manual inspection Removal of duplicates Treatment of tautomeric forms CURATED DATASET Check for inorganics Scheme of the curation workflow UNC, DTU, EPA Consensus Fourches, Muratov, Tropsha. J Chem Inf Model, 2010, 29, 476 – 488
  • 11. KNIME workflow Aim of the workflow: • Combine (not reproduce) different procedures and ideas • Minimize the differences between the structures used for prediction by different groups • Produce a flexible free and open source workflow to be shared Fourches, Muratov, Tropsha. J Chem Inf Model, 2010, 29, 476 – 488 Wedebye, Niemelä, Nikolov, Dybdahl, Danish EPA Environmental Project No. 1503, 2013 Indigo
  • 12. Parsing and 1st filter SDF Parser: 40125 initial compounds (Webservices: Pubchem, Chemspider) 40117 parsed compounds Unique IDs Errors reported
  • 13. Unconnected structures 1. Separate unconnected fragments 2. MW filter on biggest Cpd (497 compounds removed) 1. 2nd biggest is removed if: • It was the same/stereo as the biggest component • Not containing carbons • It was a salt/solvent from the defined list of accepted salts and solvents.
  • 14. Standardization of structures • Explicit hydrogen removed • Dearomtization • Removal of chirality/stereochemistry info, isotopes and pseudo-atoms • Aromatization + add explicit hydrogen atoms • Standardize Nitro groups • Other: tautomerize/mesomerize • Neutralize (when possible)
  • 16. Mesomerization/tautomerization • Azide mesomers • Exo-enol tautomers • Enamine-Imine tautomers • Ynol-ketene tautomers • ….
  • 18. Filter inacceptable atoms • Generate InChi, InChi Key and Canonical Smiles. • Remove duplicates (InChis & canonical SMI) • Remove molecules with inacceptable atoms. Other then: H, C, N, O, P, S, Se, F, Cl, Br, I, Li, Na, K, B, Si
  • 19. Write results • Calculate 2D descriptors (Indigo, CDK, RDKit) • Generate 3D conformers • Optimize geometry (MMFF94S) Generated files: • Sdf file containing the 2D structures • Excel file containing 2D descriptors • Sdf file containing the 3D structures • Excel file for error messages
  • 20. Chemicals for Prediction: The Human Exposure Universe • EDSP Universe (10K) • Chemicals with known use (40K) (CPCat & ACToR) • Canadian Domestic Substances List (DSL) (23K) • EPA DSSTox – structures of EPA/FDA interest (15K) • ToxCast and Tox21 (In vitro ER data) (8K) ~55k to ~32K unique set of structures 20 • Training set (ToxCast): 1677 Chemicals • Prediction Set: 32464 Chemicals
  • 21. Subgroup: • U.S.EPA/NCCT: Kamel Mansouri, Jayaram Kancherla, Ann Richard, Richard Judson • UMEA/Chem: Aleksandra Rybacka, Patrik Andersson • FDA/NCTR/DBB: Huixiao Hong • NIH/NCATS: Ruili Huang • Helmholtz/ISB: Igor Tetko 9/30/2015 21 Experimental data for evaluation
  • 22. Tasks to fulfill • Collect the experimental data for the evaluation step. • Combine the different sources of literature. • Define a strategy to evaluate the models separately.
  • 23. a) Tox21, ~8000 chemicals in 4 assays; b) FDA EDKB database of ~8000 chemicals from the literature; c) METI database, ~2000 chemicals; d) ChEMBL database, ~2000 chemicals. Experimental data for evaluation set EPA/NCCT, UMEA/Chem, FDA/NCTR/DBB, NIH/NCATS, Helmholtz/ISB 60,000 entries for ~15,000 chemicals
  • 24. Cleaning procedure • Knime workflow for structure cleaning • INChi code for chemical matching • 7,600 chemicals with CERAPP IDs • Remove: in-vivo, cytotoxicity, ambiguous, missing values, non-defined endpoints/units • Categorize assays: binding, reporter gene or cell proliferation • Normalize units • Use of reference chemicals to categorize into 5 classes. 7547 CERAPP compounds from 44641 entries
  • 25. Categorize chemicals • Merge entries with AC50, PC50, IC50, GI50 and EC50. • Use of 36 reference categorized chemicals • 5 classes created: – Strong : 0-0.09 => score =1 – Moderate: 0.09-0.18 => score = 0.25 – Weak: 0.18-20 => score = 0.5 – Very Weak: 20-800 => score = 0.75 – Inactive: 800> => score = 0
  • 26. Evaluation set Evaluation set for binary classification models Active Inactive Total Binding 1982 5301 7283 Agonist 350 5969 6319 Antagonist 284 6255 6539 Total 2616 17525 20141 Evaluation set for quantitative models Inactive V. Weak Weak Moderate Strong Total Binding 5042 685 894 72 77 6770 Agonist 5892 19 179 31 42 6163 Antagonist 6221 76 188 10 10 6505 Total 17155 780 1261 113 129 19438
  • 27. Consistency of the data Consistency between speciesConsistency alpha/beta data N. Chem B. Acc Sn Sp All orig. all 1659 78.57 89.29 67.86 No VW all 1424 84.57 88.16 80.99 All orig. Multi Src 1410 84.42 88.46 80.38 No VW Multi Src 1306 87.79 87.67 87.9 Consistency training set/ evaluation set
  • 28. Evaluation & consensus (consensus subgroup: most of participants) • Classification / Qualitative: – Binding: 22 models – Agonists: 11 models – Antagonists: 9 models • Regression / Quantitative: – Binding: 3 models – Agonists: 3 models – Antagonists: 2 models Models received: Preliminary Results 18 binding models with most chemicals predicted Euclidean distance
  • 29. Concordance of the 22 classification models for binding 757 chemicals have >75% positive concordance Active Inactive Prioritization Most models predict most chemicals as inactive
  • 30. Evaluation procedure: • On the EPA training set (1677) • On the full evaluation set (~7k) • Evaluation set with multi-sources • Remove “VeryWeak” • Remove single source • Remove chemicals outside the AD Score functions & weights for consensus predictions 𝑔_𝑠𝑐𝑜𝑟𝑒 = 1 3 𝑁𝐸𝑅 𝑇𝑜𝑥𝐶𝑎𝑠𝑡 ∗ 𝑁_𝑝𝑟𝑒𝑑 𝑇𝑜𝑥𝐶𝑎𝑠𝑡 𝑁_𝑡𝑜𝑡 𝑇𝑜𝑥𝐶𝑎𝑠𝑡 + 𝑁_𝑝𝑟𝑒𝑑 𝑁_𝑡𝑜𝑡 + 1 𝑁𝑓𝑖𝑙𝑡𝑒𝑟 𝑖=1 𝑁 𝑓𝑖𝑙𝑡𝑒𝑟 𝑁𝐸𝑅𝑖 ∗ 𝑁_𝑝𝑟𝑒𝑑𝑖 𝑁_𝑡𝑜𝑡𝑖 𝑜𝑝𝑡_𝑠𝑐𝑜𝑟𝑒 = 1 2 𝑁𝐸𝑅 𝑇𝑜𝑥𝐶𝑎𝑠𝑡 + 𝑁𝐸𝑅 𝑎𝑙𝑙_𝑓𝑖𝑙𝑡𝑒𝑟𝑠
  • 31. Evaluation of binding models Models Training set B. Ac. Training Evaluation set B. Ac. Eval Unambiguous Accu Unambig All predicted g_score opt_score DTU_1 873 0.82 3840 0.64 2695 0.78 16063 0.43 0.80 DTU_2 737 0.79 3268 0.61 2383 0.71 13442 0.36 0.75 EPA_NCCT 1529 0.87 7283 0.57 5275 0.69 32463 0.82 0.78 FDA_NCTR_DBB 1529 0.99 7283 0.60 5991 0.68 32464 0.87 0.84 FDA_NCTR_DSB 0 0.00 534 0.53 431 0.53 2008 0.03 0.53 Helmholtz_ISB 1512 0.89 7123 0.62 5860 0.72 31629 0.83 0.80 ILS_EPA 1506 0.84 7068 0.66 5814 0.75 31318 0.82 0.79 IRCCS_CART 1529 0.80 7280 0.61 3620 0.75 32442 0.78 0.77 IRCCS_Ruleset 1383 0.91 6603 0.56 5416 0.62 28958 0.75 0.77 JRC_Ispra 1465 0.82 6900 0.58 5672 0.67 30801 0.77 0.74 LockheedMartin_EPA_1 1529 0.83 7283 0.55 1539 0.66 32464 0.75 0.75 LockheedMartin_EPA_2 1529 0.76 7283 0.54 1539 0.64 32464 0.72 0.70 NIH_NCATS 1528 0.69 7271 0.59 5981 0.65 32184 0.77 0.67 NIH_NCI_GUASAR 1529 0.99 7283 0.61 5951 0.69 32455 0.88 0.84 NIH_NCI_PASS 1465 0.86 6900 0.58 5672 0.66 30800 0.78 0.76 RIFM 1529 0.73 7283 0.58 5991 0.65 32463 0.78 0.69 UMEA 1529 0.82 7280 0.61 5989 0.70 32430 0.82 0.76 UNC_MML_1 1529 0.80 7283 0.59 5991 0.65 32464 0.80 0.73 UNC_MML_2 1529 0.49 7283 0.55 5991 0.60 32464 0.69 0.55 UNIBA 750 0.86 3259 0.62 2753 0.73 15178 0.40 0.80 UNIMIB_Michem_1 1529 0.76 7283 0.55 5991 0.59 32464 0.77 0.68 UNIMIB_Michem_2 531 0.98 2780 0.62 2241 0.71 11832 0.32 0.85 UNISTRA_InfoChim 1529 0.86 7283 0.57 4755 0.60 32464 0.80 0.73
  • 32. Consensus_1 predictions Binding Agonist Antagonist CERAPP ID n_act score n_no score cons act_conc inact_c Potency cons act_c inact_c Potency cons act_c inact_c Potency 10001 1 0.05 15 0.71 0 0.06 0.94 Inactive 0 0.00 1.00 Inactive 0 0.00 1.00 Inactive 10005 3 0.11 17 0.65 0 0.15 0.85 Inactive 0 0.11 0.89 Inactive 0 0.14 0.86 Inactive 10007 4 0.18 12 0.58 0 0.25 0.75 Inactive 0 0.00 1.00 Inactive 0 0.00 1.00 Inactive 10008 0 0.00 18 0.76 0 0.00 1.00 Inactive 0 0.00 1.00 Inactive 0 0.00 1.00 Inactive 10009 1 0.04 17 0.71 0 0.06 0.94 Inactive 0 0.00 1.00 Inactive 0 0.00 1.00 Inactive 10016 21 0.76 0 0.00 1 1.00 0.00 Strong 1 1.00 0.00 Strong 1 1.00 0.00 Inactive 10017 21 0.76 0 0.00 1 1.00 0.00 Strong 1 1.00 0.00 Strong 1 1.00 0.00 Inactive 10018 16 0.61 4 0.15 1 0.80 0.20 VeryWeak 1 0.89 0.11 VeryWeak 1 0.86 0.14 Inactive 10027 19 0.72 1 0.04 1 0.95 0.05 Moderate 0 0.10 0.90 Inactive 0 0.13 0.88 Moderate 10033 4 0.17 13 0.58 0 0.24 0.76 Inactive 0 0.00 1.00 Inactive 0 0.00 1.00 Inactive 10034 21 0.75 0 0.00 1 1.00 0.00 Moderate 1 0.89 0.11 Moderate 1 0.86 0.14 Inactive 10088 11 0.42 9 0.34 1 0.55 0.45 VeryWeak 1 0.78 0.22 VeryWeak 1 0.86 0.14 Inactive 10089 1 0.04 19 0.72 0 0.05 0.95 Inactive 0 0.00 1.00 Inactive 0 0.00 1.00 Inactive 10099 2 0.09 15 0.66 0 0.12 0.88 Inactive 0 0.11 0.89 Inactive 0 0.13 0.88 Inactive 10100 6 0.24 12 0.50 0 0.33 0.67 Inactive 0 0.00 1.00 Inactive 0 0.00 1.00 Inactive 10101 3 0.12 16 0.64 0 0.16 0.84 Inactive 0 0.11 0.89 Inactive 0 0.14 0.86 Inactive 10102 12 0.43 9 0.32 1 0.57 0.43 VeryWeak 1 0.78 0.22 VeryWeak 1 0.71 0.29 Inactive 10111 3 0.12 16 0.64 0 0.16 0.84 Inactive 0 0.00 1.00 Inactive 0 0.00 1.00 Inactive 10112 22 0.75 0 0.00 1 1.00 0.00 Weak 1 1.00 0.00 Weak 1 1.00 0.00 Inactive 10113 21 0.75 0 0.00 1 1.00 0.00 Weak 1 1.00 0.00 Weak 1 1.00 0.00 Inactive 10119 12 0.46 8 0.30 1 0.60 0.40 VeryWeak 1 0.78 0.22 VeryWeak 1 0.71 0.29 Inactive 10120 11 0.39 10 0.36 1 0.52 0.48 VeryWeak 1 0.78 0.22 VeryWeak 1 0.71 0.29 Inactive
  • 33. ToxCast data Evaluation set Sensitivity 0.85 0.98 0.92 0.23 0.95 0.59 Specificity Balanced accuracy Total binders: 2576 Agonists: 2312 Antagonists: 2779 Consensus_1 evaluation
  • 34. Agonist and antagonist consensus models first, then on binding consensus: 1) If chemical i is active in classification consensus_1  active in Potency_class consensus_2 2) If chemical i is active in regression & >= 3 positive classification models  active in classification consensus_2 3) If chemical i is active in regression & < 3 positive classification models  Inactive in Potency_class consensus_2 Binding consensus: 4) If chemical i is active agonist or active antagonist  Active in classification consensus_2  Potency_class consensus_2 = Potency_class agonist/antagonist Rules for consensus_2
  • 35. Total binders: 3961 Agonists: 2494 Antagonists: 2793 Consensus_2 evaluation ToxCast data Literature data (All: 7283) Literature data (>6 sources: 1209) Sensitivity 0.93 0.30 0.87 Specificity 0.97 0.91 0.94 Balanced accuracy 0.95 0.61 0.91 ToxCast data Literature data ObservedPredicted Actives Inactives Actives Inactives Actives 83 6 597 1385 Inactives 40 1400 463 4838
  • 36. • positive concordance < 0.6 => Potency class= Very weak • 0.6=<positive concordance<0.75 => Potency class= Weak • 0.75=<positive concordance<0.9 => Potency class= Moderate • positive concordance>=0.9 => Potency class= Strong Positive concordance & Potency level Box plot of the positive classes of the consensus model. Variation of the balanced accuracy with positive concordance thresholds
  • 37. New External validation set ToxCast phIII+ Tox21 agonist assays ObservedPredicted Actives Inactives Actives 19 23 Inactives 17 561 ObservedPredicted Actives Inactives Actives 13 3 Inactives 17 551 Specificity: 0.97 Sensitivity: 0.81 Balanced accuracy: 0.89 Specificity: 0.97 Sensitivity: 0.45 Balanced accuracy: 0.71 All matching chemicals: 620 Only chemicals in agreement with other literature sources: 584
  • 38. • High quality training set (1677 chemicals) • Free & open-source structure curation workflow • Curated structures with potential exposure (32k) • QSAR-ready dataset from the literature (~7k) • Consensus models for binding, agonist & antagonist • 32k list predicted for prioritization. • EDSP dashboard: http://actor.epa.gov/edsp21/ future work Conclusions • Validate binding consensus with the new external set • Clean literature data from cytotoxicity. Use it as QSAR ready set.