This presentation surveys tools and methods for metabolomic data analysis and visualization. It covers visualization techniques such as plots and networks for exploring patterns in data; statistical methods such as ANOVA for significance testing and clustering for pattern detection; predictive modeling; network analysis using pathways; and network mapping, which relates metabolites through biochemical transformations, structural similarity, or empirical dependencies. Common analysis tasks and featured open-source tools are also highlighted.
3. Common Data Analysis Tasks
1. Visualization (how does it look?)
• histograms, density plots, box plots, line plots, scatter plots, networks, etc.
2. Statistical Analysis (what is statistically significant?)
• summary tables, ANOVA, FDR adjustment, power analysis, etc.
3. Exploration (what are the major patterns/trends?)
• clustering, PCA, ICA, etc.
4. Predictive Modeling (what explains my hypothesis?)
• mixed effects, partial least squares (O-/PLS/-DA), etc.
5. Network Analysis (how are things related?)
• pathway enrichment
• biochemical, mass spectral, empirical, etc.
6. Network Mapping
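As an illustration of the "FDR adjustment" step listed under Statistical Analysis, here is a minimal sketch of the Benjamini-Hochberg procedure in plain Python. The slides do not say which FDR method is used, so Benjamini-Hochberg is an assumption, and the metabolite names and p-values are made up for the example.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for five metabolites (illustrative only).
raw = {"m1": 0.001, "m2": 0.008, "m3": 0.039, "m4": 0.041, "m5": 0.60}
adj = benjamini_hochberg(list(raw.values()))

# Metabolites that remain significant at a 5% false discovery rate.
significant = [name for name, q in zip(raw, adj) if q <= 0.05]
```

Note that m3 and m4 pass an unadjusted 0.05 cutoff but not the adjusted one, which is the point of applying the correction when many metabolites are tested at once.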
4. Featured Tools
Data Analysis and Visualization
• DeviumWeb: Dynamic multivariate data analysis and visualization platform
url: https://github.com/dgrapov/DeviumWeb
• imDEV: Microsoft Excel add-in for multivariate analysis
url: http://sourceforge.net/projects/imdev/
Network Analysis
• MetaMapR: Network analysis tools for metabolomics
url: https://github.com/dgrapov/MetaMapR
Network Mapping
• MetaMapR + DeviumWeb + Cytoscape
6. Network Mapping
Visualization and analysis of statistical and multivariate results within a biochemical and/or empirical context.
Workflow: Conduct Data Analysis → Generate Network → Map Results to Network
Grapov D., Fiehn O., Multivariate and network tools for analysis and visualization of metabolomic data, ASMS, June 08, 2013, Minneapolis, MN
8. Metabolic Perturbations in Tumorigenesis
Biochemical and chemical similarity network comparing changes in metabolites between tumor and control tissue
Mappings:
• direction of change
• statistical significance
• multivariate importance
• molecular biochemical domain
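The four mappings above can be sketched as a function that turns one metabolite's results into node display attributes. This is only an illustration: the visual encodings (color for direction, border width for significance, size for importance, shape for domain), the function name, the domain labels, and the example values are all assumptions and are not taken from the slides or from any of the featured tools.

```python
# Hypothetical lookup tables for the visual encodings (assumptions).
DIRECTION_COLORS = {"increase": "red", "decrease": "blue", "no change": "gray"}
DOMAIN_SHAPES = {"lipid": "ellipse", "carbohydrate": "hexagon", "amino acid": "triangle"}


def node_attributes(name, log2_fold_change, p_value, importance, domain, alpha=0.05):
    """Map one metabolite's statistical results to node display attributes.

    importance is assumed to be scaled to the range 0..1.
    """
    if log2_fold_change > 0:
        direction = "increase"
    elif log2_fold_change < 0:
        direction = "decrease"
    else:
        direction = "no change"
    return {
        "name": name,
        # Direction of change -> node color.
        "color": DIRECTION_COLORS[direction],
        # Statistical significance -> border width.
        "border_width": 3 if p_value <= alpha else 1,
        # Multivariate importance -> node size.
        "size": 20 + 40 * importance,
        # Biochemical domain -> node shape, with a fallback for other domains.
        "shape": DOMAIN_SHAPES.get(domain, "rectangle"),
    }


# Made-up results for a single metabolite.
node = node_attributes("metabolite_X", 1.3, 0.004, 0.5, "lipid")
```

A table of such dictionaries, one per metabolite, could then be written to a file and loaded as a node attribute table in a network viewer.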
11. Empirical Networks
Experiment-specific, data-driven relationships can offer novel insight into biochemical perturbations.
Figure labels: urea cycle, nucleotide synthesis, protein glycosylation
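One simple way to derive data-driven relationships is to connect metabolites whose measured levels are strongly correlated across samples. The sketch below assumes Pearson correlation and a cutoff of 0.8; the slides do not specify which dependency measure or threshold is used, and the metabolite names and values are made up.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_edges(data, threshold=0.8):
    """Return (metabolite_a, metabolite_b, r) for pairs with |r| >= threshold."""
    edges = []
    for a, b in combinations(sorted(data), 2):
        r = pearson(data[a], data[b])
        if abs(r) >= threshold:
            edges.append((a, b, r))
    return edges


# Made-up abundances for three metabolites across five samples.
data = {
    "met_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "met_B": [2.1, 3.9, 6.2, 8.0, 9.9],
    "met_C": [5.0, 1.0, 4.0, 2.0, 3.0],
}
edges = correlation_edges(data)
```

Here only met_A and met_B are connected. In practice the cutoff would normally be paired with a significance test and multiple-testing correction, since many pairs are compared.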
12. Mass Spectral Networks
Use mass spectra as a proxy for structure to help make sense of unknown compounds’ biochemical identities.
Mix and match different relationships to help narrow down structures of unknowns.
Figure label: partial derivatization products of glucose
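To make the idea of mass spectra as a proxy for structure concrete, the sketch below scores pairs of spectra with cosine similarity and keeps the pairs above a cutoff as network edges. Cosine similarity and the 0.7 cutoff are assumptions (the slides do not name a similarity measure), and the spectra are made-up m/z-to-intensity tables, not real measurements.

```python
from itertools import combinations
from math import sqrt


def cosine_similarity(spec_a, spec_b):
    """Cosine similarity between two spectra given as {m/z: intensity} dicts."""
    shared = set(spec_a) & set(spec_b)
    dot = sum(spec_a[mz] * spec_b[mz] for mz in shared)
    norm_a = sqrt(sum(v * v for v in spec_a.values()))
    norm_b = sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def spectral_edges(spectra, threshold=0.7):
    """Return (name_a, name_b, similarity) for pairs at or above the threshold."""
    edges = []
    for a, b in combinations(sorted(spectra), 2):
        score = cosine_similarity(spectra[a], spectra[b])
        if score >= threshold:
            edges.append((a, b, score))
    return edges


# Made-up spectra: one annotated compound and two unknowns.
spectra = {
    "known_1": {73: 100, 147: 40, 204: 80},
    "unknown_1": {73: 90, 147: 35, 204: 85},
    "unknown_2": {55: 100, 91: 60},
}
edges = spectral_edges(spectra)
```

In this toy case unknown_1 links to known_1, suggesting a structural relationship worth following up, while unknown_2 shares no ions with either and stays unconnected.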
14. Data visualization as a form of data analysis
Figure labels: DM = dextromethorphan, liver CYP2D6, dextrorphan; additives: high fructose corn syrup, antioxidants, flavor
17. Other Resources
• TeachingDemos: Tutorials and Demonstrations in R
url: http://sourceforge.net/projects/teachingdemos/?source=directory
url: https://github.com/dgrapov/TeachingDemos
• CDS Blog: Data Analysis Case Studies and Tutorials
url: http://imdevsoftware.wordpress.com/