2. Metabolomics
The quantitative and qualitative analysis of all metabolites in samples of cells, body fluids, tissues, etc.
Julio E. Peironcely
3. Metabolomics
Workflow steps: Biological question, Experimental design, Sampling, Sample preparation, Data acquisition, Data pre-processing, Data analysis, Biological interpretation
Outputs along the workflow: Protocol, Samples, Raw data, List of peaks/biomolecules, Relevant biomolecules/connectivities & Models
Metabolites
17. Elemental Composition
Flow (as drawn): Elemental Composition, Structure Generation, Molecules, Metabolite Likeness
18. Elemental Composition
Flow (as drawn): Elemental Composition, Structure Generation, Molecules, Metabolite Likeness, Metabolites
19. Metabolite-likeness
Representation + Classification
Data sets: HMDB (8K), ZINC (21M)
Representations: Atom Counts, Physicochemical desc., MDL Public Keys, FCFP_4, ECFP_4
Classifiers: Support Vector Machines (SVM), Random Forest (RF), Naïve Bayes (NB)
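As an illustration of the simplest representation on this slide, Atom Counts, here is a minimal Python sketch. It assumes the counts can be read from a plain molecular formula string; the slides do not say how the counts were computed, and the element order in `ELEMENTS` and both function names are hypothetical choices made for this example.

```python
import re
from collections import Counter

# Hypothetical element order for the descriptor vector (not given on the slides).
ELEMENTS = ("C", "H", "N", "O", "P", "S")


def atom_counts(formula):
    """Count atoms in a simple molecular formula such as 'C6H12O6'.

    Handles element symbols followed by an optional integer; it does not
    handle parentheses, charges, or isotopes.
    """
    counts = Counter()
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return counts


def atom_count_vector(formula):
    """Return the counts as a fixed-length list ordered as in ELEMENTS."""
    counts = atom_counts(formula)
    return [counts[element] for element in ELEMENTS]


glucose_vector = atom_count_vector("C6H12O6")  # carbon, hydrogen, nitrogen, oxygen, ...
```

A fixed-length vector like this can be passed to any of the three classifiers. The fingerprint representations (MDL Public Keys, FCFP_4, ECFP_4) need a cheminformatics toolkit and are not sketched here.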
20. Metabolite-likeness
Data sets: HMDB (8K), ZINC (21M)
Steps: Standardization, Diversity Selection
Representations: Atom Counts, Physicochemical desc., MDL Public Keys, FCFP_4, ECFP_4
21. Metabolite-likeness
Data sets: HMDB (8K), ZINC (21M)
Steps: Standardization, Diversity Selection
Representations: Atom Counts, Physicochemical desc., MDL Public Keys, FCFP_4, ECFP_4
Training Set: 532 + 532
Test Set: 6.4K + 6.4K
22. Metabolite-likeness
Data sets: HMDB (8K), ZINC (21M)
Steps: Standardization, Diversity Selection
Representations: Atom Counts, Physicochemical desc., MDL Public Keys, FCFP_4, ECFP_4
Training Set: 532 + 532
Test Set: 6.4K + 6.4K
5-fold CV
Classifiers: SVM, RF, BC
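The 5-fold CV step on the balanced training set can be sketched in plain Python as follows. This is a sketch under stated assumptions, not the procedure used in the talk: it assumes "532 + 532" means 532 metabolites plus 532 non-metabolites, and it assumes the folds are stratified by class, which the slides do not state. The function name `stratified_k_fold` is hypothetical.

```python
import random


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, valid_idx) pairs with each class spread evenly over k folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        # Shuffle the indices of one class, then deal them round-robin into the folds.
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    for f in range(k):
        valid = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, valid


# Assumed labelling: 1 = metabolite, 0 = non-metabolite, mirroring the 532 + 532 training set.
labels = [1] * 532 + [0] * 532
splits = list(stratified_k_fold(labels, k=5))
```

Each of the five splits holds out roughly a fifth of each class for validation and trains on the rest, so every training molecule is used for validation exactly once.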
23. Metabolite-likeness
Data sets: HMDB (8K), ZINC (21M)
Steps: Standardization, Diversity Selection
Training Set: 532 + 532
Test Set: 6.4K + 6.4K
5-fold CV
Classifiers: SVM, RF, BC
3 classifiers × 5 descriptions
Output: Metabolite likeness
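The "3 classifiers × 5 descriptions" grid can be enumerated with a few lines of Python. The labels below are copied from the slides (the five representations appear on the earlier Metabolite-likeness slides); the function name `model_grid` is a hypothetical helper for this sketch.

```python
from itertools import product

# Labels as they appear on the slides.
CLASSIFIERS = ("SVM", "RF", "BC")
REPRESENTATIONS = (
    "Atom Counts",
    "Physicochemical desc.",
    "MDL Public Keys",
    "FCFP_4",
    "ECFP_4",
)


def model_grid():
    """Return every (classifier, representation) pair to be trained and compared."""
    return list(product(CLASSIFIERS, REPRESENTATIONS))


grid = model_grid()
print(len(grid))  # 15 combinations
```

Each pair would be trained and cross-validated on the training set and then scored on the test set, so that the combinations can be ranked against each other.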
24. Metabolite-likeness
Data sets: HMDB (8K), ZINC (21M)
Steps: Standardization, Diversity Selection
Training Set: 532 + 532
Test Set: 6.4K + 6.4K
5-fold CV
Classifiers: SVM, RF, BC
Output: Metabolite likeness
Best = RF – MDLPublicKeys: Sensitivity 99.84%, Specificity 87.52%, AUC 99.20%
Bad = BC – P_desc: Sensitivity 42.51%, Specificity 86.56%, AUC 61.57%
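For readers unfamiliar with the three reported metrics, here is a minimal Python sketch of how sensitivity, specificity, and AUC are commonly computed for a binary classifier. It assumes metabolites are the positive class, which the slides do not state. The toy scores and labels are made up for illustration and are not data from the talk.

```python
def sensitivity(tp, fn):
    """Fraction of actual positives that are predicted positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of actual negatives that are predicted negative."""
    return tn / (tn + fp)


def roc_auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is quadratic in the number
    of samples, which is fine for a small illustration.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (illustrative values only).
toy_scores = [0.9, 0.8, 0.4, 0.3]
toy_labels = [1, 0, 1, 0]
toy_auc = roc_auc(toy_scores, toy_labels)  # 3 of 4 positive/negative pairs are ordered correctly
```

Read this way, a sensitivity of 99.84% means almost every positive in the test set was recognised, and an AUC near 50% would mean the scores barely separate the two classes.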
31. Acknowledgements
Leiden University: Theo Reijmers, Thomas Hankemeier
University of Cambridge: Andreas Bender
TNO Quality of Life: Leon Coulier
HMP, University of Alberta: David Wishart, Ying (Edison) Dong