1
Principles of Peak Picking and Alignment
Emma L. Schymanski
FNR ATTRACT Fellow and PI in Environmental Cheminformatics
Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg
Email: emma.schymanski@uni.lu
…and many colleagues who contributed to my science over the years!
ASMS Fall Meeting, San Francisco, California, November 29-30, 2018
Image © www.seanoakley.com/
https://tinyurl.com/asmsfall2018-peaks
How many peaks will a peak picker pick if a peak picker only picks peaks?
2
DISCLAIMER!
MS1 and MS2: two very different worlds …
(nevertheless, I will do my best!)
3
Presenting Peak Picking: Plan
o Why Peak Pick
o Terminology
• Peak Picking vs Centroid vs Profile …
o Peak Picking & Peak Pickers
• “best of” xcms and enviPick
• Peak Picking in Pictures
• Peak Picking Parameters
• Alleviating Peak Picking Parameter Panic
o Alignment ( / Profiling)
• “best of” xcms and enviMass
o Peak Picking Pointers
o Don’t just listen to me … do it!
4
Why Peak Pick (I)
Example scheme of liquid chromatography - mass spectrometry
Image © www.planetorbitrap.com/q-exactive
Sampling
Extraction (SPE)
HPLC separation
HR-MS/MS
5
Why Peak Pick (II)
This is what the output “really” looks like …
Image © www.planetorbitrap.com/q-exactive
6
Why Peak Pick (III)
Identification = turning numbers into structures
[Figure: chemical structures of example compounds]
massbank.eu
7
TERMINOLOGY!
o Peak picking can be multi-directional, i.e.
• in mass… or time…
8
Mass: Centroid vs Profile Data (enviPat)
https://www.envipat.eawag.ch/index.php and Loos et al Anal. Chem. 87(11), 5738-5744. DOI: 10.1021/acs.analchem.5b00941
10
TERMINOLOGY!
http://proteowizard.sourceforge.net/
o Peak picking can be multi-directional (mass, time)
• Peak picking in ProteoWizard MSConvert means “centroiding” masses
(turning profile-mode data into centroided data for efficient processing)
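A minimal sketch of what centroiding does to a single profile-mode mass peak, assuming an intensity-weighted mean m/z and a summed intensity. Real centroiders, including the one behind MSConvert, may use vendor algorithms or peak fitting instead; the function name and the data points here are hypothetical.

```python
def centroid(mz_values, intensities):
    """Collapse one profile peak into a single (m/z, intensity) stick."""
    total = sum(intensities)
    if total <= 0:
        raise ValueError("profile peak has no signal")
    # Intensity-weighted mean m/z of the profile points
    mz = sum(m * i for m, i in zip(mz_values, intensities)) / total
    return mz, total


# Hypothetical profile points across one mass peak, symmetric around m/z 200.000
profile_mz = [199.998, 199.999, 200.000, 200.001, 200.002]
profile_int = [10.0, 60.0, 100.0, 60.0, 10.0]

centroid_mz, centroid_int = centroid(profile_mz, profile_int)
print(f"centroid m/z = {centroid_mz:.4f}, intensity = {centroid_int:.1f}")
```

Five profile points become one stick, which is why centroided files are smaller and faster to process.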
11
Peak Picking (in time)
Source: R. Tautenhahn, C. Böttcher, S. Neumann, BMC Bioinformatics 2008, 9:504. DOI: 10.1186/1471-2105-9-504
o Peak picking along time axis (chromatographic peaks)
12
Peak Picking
Source: R. Tautenhahn, C. Böttcher, S. Neumann, BMC Bioinformatics 2008, 9:504. DOI: 10.1186/1471-2105-9-504
o Peak picking along time axis (chromatographic peaks)
13
Peak Picking
Source: Johannes Rainer; http://bioconductor.org/packages/release/bioc/vignettes/xcms/inst/doc/xcms.html
o Peak picking along time axis (chromatographic peaks)
14
Peak Picking
Source: Johannes Rainer; http://bioconductor.org/packages/release/bioc/vignettes/xcms/inst/doc/xcms.html
o Peak picking along time axis (chromatographic peaks)
Several Samples Overlaid
Red = KO
Blue = wild type
Rectangle = chromatographic peaks identified per sample
15
Peak Picking
o Several options for peak picking
• XCMS and centWave
• Tautenhahn et al 2008 DOI: 10.1186/1471-2105-9-504
• http://bioconductor.org/packages/xcms/
• MZmine 2
• Pluskal et al 2010 DOI: 10.1186/1471-2105-11-395
• http://mzmine.github.io/
• enviPick / enviMass
• Loos 2018 DOI: 10.5281/zenodo.1213098
• http://www.looscomputing.ch/eng/enviMass/overview.htm
• Plenty of other open, research and vendor options ...
16
Peak Picking
o Result is something like this (from Formulator output):
17
Peak Picking – XCMS & XCMS Online
o http://bioconductor.org/packages/xcms/
18
Peak Picking – XCMS & XCMS Online
o https://xcmsonline.scripps.edu/
19
Peak Picking – enviMass and enviPick
o http://www.looscomputing.ch/eng/enviMass/overview.htm
o R packages …
20
Peak Picking in Pictures
http://www.looscomputing.ch/eng/enviMass/topics/peakpicking.htm
Red = peaks
Grey = noise
21
Peak Picking … a somewhat simpler picture
http://www.looscomputing.ch/eng/enviMass/topics/peakpicking.htm
22
centWave – Gaussian with “Mexican Hat”
https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-9-504
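A rough illustration of the idea behind the slide title: correlating a chromatogram with a Mexican hat wavelet gives a strong response where a Gaussian-like peak sits and a response close to zero on a flat baseline. This is a single-scale sketch in plain Python with made-up data, not the centWave implementation in xcms, which works on regions of interest and several wavelet scales.

```python
import math


def mexican_hat(t, scale):
    """Unnormalised Mexican hat (Ricker) wavelet: (1 - x^2) * exp(-x^2 / 2)."""
    x = t / scale
    return (1.0 - x * x) * math.exp(-0.5 * x * x)


def wavelet_response(signal, scale):
    """Correlate the signal with the wavelet centred on every scan."""
    half = int(5 * scale)  # the wavelet is effectively zero beyond 5 scales
    response = []
    for centre in range(len(signal)):
        acc = 0.0
        for offset in range(-half, half + 1):
            idx = centre + offset
            if 0 <= idx < len(signal):
                acc += signal[idx] * mexican_hat(offset, scale)
        response.append(acc)
    return response


# Hypothetical chromatogram: Gaussian peak (apex at scan 50, sigma 4 scans)
# sitting on a flat baseline of 20 counts
signal = [20.0 + 100.0 * math.exp(-0.5 * ((i - 50) / 4.0) ** 2) for i in range(100)]

response = wavelet_response(signal, scale=4.0)
apex_scan = max(range(len(response)), key=response.__getitem__)
print("strongest wavelet response at scan", apex_scan)
```

Because the wavelet sums to roughly zero, the constant baseline largely cancels out away from the edges of the trace, so the peak stands out without a separate baseline subtraction.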
25
But … peaks are not perfect!
http://www.looscomputing.ch/eng/enviMass/topics/peakpicking.htm
o See enviMass website for explanation …
26
Critical Point: Separating Peaks from Baseline
27
Peak Picking Parameters
https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-9-504
o There are a lot of options to tweak!
• I will just run through (main) centWave parameters
• enviPick is too complicated => further reading!
28
Peak Picking Parameters: centWave
ppm: maximal tolerated m/z deviation in consecutive scans, in ppm (parts per million)
NOTE: dependent on your mass spectrometer
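To make the ppm unit concrete, here is a small sketch of the arithmetic: the deviation between two m/z values divided by the reference m/z, times one million. The function names and example masses are hypothetical.

```python
def ppm_error(observed_mz, reference_mz):
    """Relative m/z deviation in parts per million."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def within_ppm(observed_mz, reference_mz, ppm):
    """True if the two m/z values agree within the given ppm tolerance."""
    return abs(ppm_error(observed_mz, reference_mz)) <= ppm


# Hypothetical example: 0.0015 Da off at m/z 300 is a 5 ppm deviation
deviation = ppm_error(300.0015, 300.0000)
print(f"{deviation:.2f} ppm")
print(within_ppm(300.0015, 300.0000, ppm=10))  # inside a 10 ppm window
print(within_ppm(300.0060, 300.0000, ppm=10))  # 20 ppm off, outside the window
```

The same absolute error is a larger ppm value at low m/z, which is one reason the setting has to match the instrument.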
29
Peak Picking Parameters: centWave
peakwidth: chromatographic peak width, given as range (min, max) in seconds
NOTE: highly dependent on your chromatography!
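One way to get a feel for sensible values is to measure a few peaks in your own chromatograms. The sketch below estimates the width at half maximum of one extracted ion chromatogram. It is only a rough guide: the slide does not say how centWave measures width, and the half-height width is narrower than the width at the base. The function name and data are hypothetical.

```python
def width_at_half_max(times, intensities):
    """Width in seconds of the region around the apex that stays at or above half height."""
    apex = max(range(len(intensities)), key=intensities.__getitem__)
    half = intensities[apex] / 2.0
    left = apex
    while left > 0 and intensities[left - 1] >= half:
        left -= 1
    right = apex
    while right < len(intensities) - 1 and intensities[right + 1] >= half:
        right += 1
    return times[right] - times[left]


# Hypothetical chromatographic peak sampled every 2 seconds
times = [100.0, 102.0, 104.0, 106.0, 108.0, 110.0, 112.0]
intensities = [0.0, 10.0, 50.0, 100.0, 50.0, 10.0, 0.0]

width = width_at_half_max(times, intensities)
print(f"width at half maximum: {width:.1f} s")
```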
30
Peak Picking Parameters: centWave
snthresh: signal-to-noise ratio cutoff
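A minimal sketch of a signal-to-noise check, assuming the common definition (peak height minus baseline) divided by the standard deviation of the noise. How baseline and noise are estimated differs between peak pickers; here they are simply taken from a hypothetical signal-free stretch of the trace, and the cutoff of 10 is an illustrative value.

```python
import statistics


def signal_to_noise(peak_height, noise_region):
    """(height - baseline) / noise SD, with both estimated from a signal-free region."""
    baseline = statistics.mean(noise_region)
    noise_sd = statistics.stdev(noise_region)
    return (peak_height - baseline) / noise_sd


# Hypothetical intensities from a stretch of the trace without peaks
noise = [8.0, 12.0, 10.0, 9.0, 11.0]

sn = signal_to_noise(110.0, noise)
keep = sn >= 10.0  # hypothetical snthresh of 10
print(f"S/N = {sn:.1f}, keep peak: {keep}")
```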
31
Peak Picking Parameters: centWave
prefilter: prefilter=c(k, I). Prefilter step for the first phase. Mass traces are only retained if they contain at least k peaks with intensity >= I
Only one “stick”, so this trace will fail the recommended prefilter settings
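The prefilter rule can be sketched in a few lines: count the data points in a mass trace with intensity >= I and keep the trace only if there are at least k of them. The function name, the example traces and the values k = 3, I = 100 are hypothetical.

```python
def passes_prefilter(trace_intensities, k, min_intensity):
    """Keep a mass trace only if at least k points have intensity >= min_intensity."""
    strong_points = sum(1 for i in trace_intensities if i >= min_intensity)
    return strong_points >= k


# Hypothetical mass traces
single_stick = [5000.0]                                 # one intense point only
real_peak = [50.0, 120.0, 400.0, 900.0, 350.0, 90.0]    # four points >= 100

print(passes_prefilter(single_stick, k=3, min_intensity=100.0))
print(passes_prefilter(real_peak, k=3, min_intensity=100.0))
```

A single intense “stick” fails however high it is, because the rule counts points, not intensity.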
32
Too Many Peak Picking Parameters ???????
https://bioconductor.org/packages/release/bioc/vignettes/IPO/inst/doc/IPO.html
o IPO to the rescue!
o Parameter optimization for xcms-based workflows …
o Libiseller et al 2015, DOI: 10.1186/s12859-015-0562-8
IPO = Isotopologue Parameter Optimization
33
Too Many Peak Picking Parameters ???????
34
RECAP: Why Peak Pick?
Identification = turning numbers into structures
[Figure: chemical structures of example compounds]
massbank.eu
35
o Instruments change over time …
o Before we can do fancy statistics, we need to make sure
our samples are comparable!
36
Alignment
http://bioconductor.org/packages/release/bioc/vignettes/xcms/inst/doc/xcms.html#3_initial_data_inspection
o Alignment / Profiling => which peaks belong together
across large sample sets?
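To illustrate the question “which peaks belong together?”, here is a deliberately naive sketch that groups picked peaks from several samples when they agree in m/z (ppm window) and retention time (seconds window). It is not the grouping algorithm of xcms or the profiling of enviMass; the function name, tolerances and peak list are hypothetical.

```python
def group_peaks(peaks, ppm=5.0, rt_tol=10.0):
    """Greedy grouping of (sample, mz, rt) tuples against the first peak of each group."""
    groups = []
    for sample, mz, rt in sorted(peaks, key=lambda p: p[1]):
        for group in groups:
            mz_ok = abs(mz - group["mz"]) / group["mz"] * 1e6 <= ppm
            rt_ok = abs(rt - group["rt"]) <= rt_tol
            if mz_ok and rt_ok:
                group["members"].append((sample, mz, rt))
                break
        else:
            # No existing group matched: this peak starts a new one
            groups.append({"mz": mz, "rt": rt, "members": [(sample, mz, rt)]})
    return groups


# Hypothetical picked peaks from three samples
peaks = [
    ("A", 200.0000, 300.0),
    ("B", 200.0004, 304.0),
    ("C", 200.0002, 297.0),
    ("A", 350.1000, 500.0),
    ("B", 350.1005, 560.0),  # same mass, but elutes 60 s later
]

groups = group_peaks(peaks)
for g in groups:
    print(f"m/z {g['mz']:.4f}, rt {g['rt']:.0f} s: {len(g['members'])} peak(s)")
```

The last two peaks share a mass but end up in separate groups, which shows why retention time shifts between samples need to be corrected before grouping.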
37
Alignment
http://www.looscomputing.ch/eng/enviMass/topics/profiling.htm
o “Profiling” in enviMass
38
Alignment ~= Retention Time Correction
http://bioconductor.org/packages/release/bioc/vignettes/xcms/inst/doc/xcms.html#3_initial_data_inspection
o Many algorithms and methods …
o Before:
39
Alignment ~= Retention Time Correction
http://bioconductor.org/packages/release/bioc/vignettes/xcms/inst/doc/xcms.html#5_alignment
o Many algorithms and methods …
o After (Obiwarp algorithm in xcms)
40
Before Alignment
After Alignment
41
Changes over samples
http://bioconductor.org/packages/release/bioc/vignettes/xcms/inst/doc/xcms.html#5_alignment
o Difference between adjusted and raw retention times
along the retention time axis
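The quantity plotted here, adjusted minus raw retention time, can be illustrated with a much simpler correction than Obiwarp: a piecewise-linear mapping through a few anchor compounds seen in both the sample and a reference run. This is a sketch under that assumption only; the function name and anchor times are hypothetical.

```python
from bisect import bisect_right


def adjust_rt(rt, raw_anchors, ref_anchors):
    """Map a raw retention time onto the reference scale by linear interpolation."""
    if rt <= raw_anchors[0]:
        return rt + (ref_anchors[0] - raw_anchors[0])    # constant shift before first anchor
    if rt >= raw_anchors[-1]:
        return rt + (ref_anchors[-1] - raw_anchors[-1])  # constant shift after last anchor
    i = bisect_right(raw_anchors, rt) - 1
    frac = (rt - raw_anchors[i]) / (raw_anchors[i + 1] - raw_anchors[i])
    return ref_anchors[i] + frac * (ref_anchors[i + 1] - ref_anchors[i])


# Hypothetical anchors: the same three compounds in this sample and in the reference
raw_anchors = [100.0, 300.0, 500.0]
ref_anchors = [100.0, 296.0, 490.0]

for raw in (200.0, 400.0, 600.0):
    adjusted = adjust_rt(raw, raw_anchors, ref_anchors)
    print(f"raw {raw:.0f} s -> adjusted {adjusted:.0f} s (difference {adjusted - raw:+.0f} s)")
```

The difference grows along the run in this example, the kind of drift such a plot makes visible.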
42
Some advice …
o Peak pickers are designed to pick the perfect peak
• But life is never perfect and peaks are no different
o Pick the peak picker that is best for your situation
• Convenience, ease of use, designed for your data, …
• The optimal choice is usually a compromise
o Be sceptical (visualise your data, reality check it, etc.)
• But don’t go overboard in evaluating peak pickers … remember
your (real) goal …
43
Peak Picking Overlap (centWave paper)
https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-9-504
44
Verify with EIC Extraction [these are NOT picked]
https://github.com/schymane/ReSOLUTION/blob/master/R/RMB_EIC_prescreen.R
No peak at all
Nice peak, MSMS
Peak, no MSMS
Noise with MSMS (careful!)
Isobars with MSMS (careful!)*
Looking for chemicals known
to be present in the sample
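Extracting an EIC needs no peak picker at all: for each scan, sum the intensity of everything within a ppm window of the target m/z. The sketch below shows the idea on a toy scan list. It is not the linked R script; the function name, target mass and scans are hypothetical.

```python
def extract_eic(scans, target_mz, ppm=5.0):
    """scans: list of (rt, [(mz, intensity), ...]); returns [(rt, summed intensity)]."""
    tol = target_mz * ppm / 1e6  # absolute m/z window for this target
    eic = []
    for rt, peaks in scans:
        intensity = sum(i for mz, i in peaks if abs(mz - target_mz) <= tol)
        eic.append((rt, intensity))
    return eic


# Hypothetical centroided scans: (retention time in s, list of (m/z, intensity))
scans = [
    (10.0, [(250.0001, 100.0), (300.0000, 50.0)]),
    (11.0, [(250.0004, 500.0)]),
    (12.0, [(250.0100, 900.0)]),  # 40 ppm away from the target, so not counted
]

eic = extract_eic(scans, target_mz=250.0000, ppm=5.0)
for rt, intensity in eic:
    print(f"{rt:.0f} s: {intensity}")
```

Plotting such a trace for each chemical expected in the sample is a quick reality check on what the peak picker reported.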
45
Just because you find a peak …
ENTACT Project: https://www.epa.gov/sites/production/files/2018-06/documents/comptox_cop_6-28-18.pdf
o Mix 505: One candidate with this mass/formula
• DTXSID9040001, C9H8O4
o One chemical…
How many peaks?
46
…doesn’t mean it’s your compound of interest!
47
Beware of artefacts!
o Your results also depend on the acquisition data!
48
Further reading DOING! [Vendor independent]
o Don’t just take my word for it … don’t just read about it
… DO IT. There are so many ways to try it out …
complete with sample data! [Open Science!]
o http://bioconductor.org/packages/release/bioc/vignettes/xcms/inst/doc/xcms.html
o http://www.looscomputing.ch/eng/enviMass/overview.htm
o An interface that many enjoy, likely comes with example
data but requires a login …
o https://xcmsonline.scripps.edu/
49
Further reading DOING! [Vendor independent]
o http://mzmine.github.io/
o http://prime.psc.riken.jp/Metabolomics_Software/MS-DIAL/
o MS-DIAL
50
Acknowledgements
emma.schymanski@uni.lu
Further Information:
http://bioconductor.org/packages/xcms/
http://www.looscomputing.ch/eng/enviMass/overview.htm
https://xcmsonline.scripps.edu/
http://mzmine.github.io/
EU Grant 603437
The CompMS Community (proxy photo)
51
Extra Slides
52
Quality Control of Data
Slide c/o Michael Stravs
o Always visualise results … never take anything for granted
53
Homologues: Challenge Peak Pickers but are Present!
Stravs et al. (2013), J. Mass Spectrom, 48(1):89-99. DOI: 10.1002/jms.3131
[Figure: structure of SPA-9C (m + n = 6)]
www.massbank.eu ACCESSIONS (LAS, SPACs):
Literature MS/MS: LIT00034, LIT00037
Std Mix., Sample: ETS00012, ETS00018
https://github.com/MassBank/RMassBank/
Tentatively Identified Spectra:
http://goo.gl/0t7jGp
54
Be wary of instrument specific phenomena!
o R package nontarget: satellite peak removal
55
Be wary of instrument specific phenomena II
o Orbitrap-specific calibration issues (not observed in TOF)

Orientation, design and principles of polyhouse
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on Io
 

ASMS Fall 2018 Metabolomics Informatics Workshop Peak Picking

  • 1. 1 Principles of Peak Picking and Alignment Emma L. Schymanski FNR ATTRACT Fellow and PI in Environmental Cheminformatics Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg Email: emma.schymanski@uni.lu …and many colleagues who contributed to my science over the years! ASMS Fall Meeting, San Francisco, California, November 29-30, 2018 Image©www.seanoakley.com/ https://tinyurl.com/asmsfall2018-peaks How many peaks will a peak picker pick if a peak picker only picks peaks?
  • 2. 2 (nevertheless, I will do my best!) DISCLAIMER! MS1 MS2 Two very different worlds …
  • 3. 3 Presenting Peak Picking: Plan o Why Peak Pick o Terminology • Peak Picking vs Centroid vs Profile … o Peak Picking & Peak Pickers • “best of” xcms and enviPick • Peak Picking in Pictures • Peak Picking Parameters • Alleviating Peak Picking Parameter Panic o Alignment ( / Profiling) • “best of” xcms and enviMass o Peak Picking Pointers o Don’t just listen to me … do it!
  • 4. 4 Why Peak Pick (I) Example scheme of liquid chromatography - mass spectrometry Image © www.planetorbitrap.com/q-exactive Sampling Extraction (SPE) HPLC separation HR-MS/MS
  • 5. 5 Why Peak Pick (II) This is what the output “really” looks like … Image © www.planetorbitrap.com/q-exactive
  • 6. 6 Why Peak Pick (III) Identification = turning numbers into structures [Figure: chemical structure drawings] massbank.eu
  • 7. 7 TERMINOLOGY! o Peak picking can be multi-directional, i.e. • in mass… or time…
  • 8. 8 Mass: Centroid vs Profile Data (enviPat) https://www.envipat.eawag.ch/index.php and Loos et al Anal. Chem. 87(11), 5738-5744. DOI: 10.1021/acs.analchem.5b00941
  • 9. 9 Mass: Centroid vs Profile Data (enviPat) https://www.envipat.eawag.ch/index.php and Loos et al Anal. Chem. 87(11), 5738-5744. DOI: 10.1021/acs.analchem.5b00941
  • 10. 10 TERMINOLOGY! http://proteowizard.sourceforge.net/ o Peak picking can be multi-directional (mass, time) • Peak picking in ProteoWizard MSConvert is “centroiding” masses (turning profile mode data into centroided data for efficient processing)
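To make "centroiding" concrete, here is a minimal sketch that collapses each contiguous run of non-zero profile points into a single (m/z, intensity) stick. It is a deliberately naive illustration and not the algorithm MSConvert uses; the function names and the "non-zero run" rule are assumptions made for this example.

```python
from typing import List, Tuple


def _collapse(run: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Turn one run of profile points into (weighted mean m/z, summed intensity)."""
    total = sum(i for _, i in run)
    weighted_mz = sum(m * i for m, i in run) / total
    return (weighted_mz, total)


def centroid_profile(mz: List[float], intensity: List[float]) -> List[Tuple[float, float]]:
    """Naive centroiding: one centroid per contiguous run of non-zero intensities.

    Assumption for illustration only: profile peaks are separated by
    zero-intensity points. Real centroiders handle noise and overlap.
    """
    centroids = []
    run: List[Tuple[float, float]] = []
    for m, i in zip(mz, intensity):
        if i > 0:
            run.append((m, i))
            continue
        if run:  # a zero closes the current run
            centroids.append(_collapse(run))
            run = []
    if run:  # data ended while still inside a peak
        centroids.append(_collapse(run))
    return centroids
```

Each profile peak becomes one stick at its intensity-weighted mean m/z, which is why centroided files are much smaller and faster to process than profile files.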
  • 11. 11 Peak Picking (in time) Source: R. Tautenhahn, C. Böttcher, S. Neumann, BMC Bioinformatics 2008, 9:504. DOI: 10.1186/1471-2105-9-504 o Peak picking along time axis (chromatographic peaks)
  • 12. 12 Peak Picking Source: R. Tautenhahn, C. Böttcher, S. Neumann, BMC Bioinformatics 2008, 9:504. DOI: 10.1186/1471-2105-9-504 o Peak picking along time axis (chromatographic peaks)
  • 13. 13 Peak Picking Source: Johannes Rainer; http://bioconductor.org/packages/release/bioc/vignettes/xcms/inst/doc/xcms.html o Peak picking along time axis (chromatographic peaks)
  • 14. 14 Peak Picking Source: Johannes Rainer; http://bioconductor.org/packages/release/bioc/vignettes/xcms/inst/doc/xcms.html o Peak picking along time axis (chromatographic peaks) Several samples overlaid: red = KO, blue = wild type, rectangle = chromatographic peaks identified per sample
  • 15. 15 Peak Picking o Several options for peak picking • XCMS and centWave • Tautenhahn et al 2008 DOI: 10.1186/1471-2105-9-504 • http://bioconductor.org/packages/xcms/ • MZmine 2 • Pluskal et al 2010 DOI: 10.1186/1471-2105-11-395 • http://mzmine.github.io/ • enviPick / enviMass • Loos 2018 DOI: 10.5281/zenodo.1213098 • http://www.looscomputing.ch/eng/enviMass/overview.htm • Plenty of other open, research and vendor options ...
  • 16. 16 Peak Picking o Result is something like this (from Formulator output):
  • 17. 17 Peak Picking – XCMS & XCMS Online o http://bioconductor.org/packages/xcms/
  • 18. 18 Peak Picking – XCMS & XCMS Online o https://xcmsonline.scripps.edu/
  • 19. 19 Peak Picking – enviMass and enviPick o http://www.looscomputing.ch/eng/enviMass/overview.htm o R packages …
  • 20. 20 Peak Picking in Pictures http://www.looscomputing.ch/eng/enviMass/topics/peakpicking.htm Red = peaks, grey = noise
  • 21. 21 Peak Picking .. Somewhat simpler picture http://www.looscomputing.ch/eng/enviMass/topics/peakpicking.htm
  • 22. 22 centWave – Gaussian with “Mexican Hat” https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-9-504
  • 23. 23 centWave – Gaussian with “Mexican Hat” https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-9-504
  • 24. 24 centWave – Gaussian with “Mexican Hat” https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-9-504
  • 25. 25 But … peaks are not perfect! http://www.looscomputing.ch/eng/enviMass/topics/peakpicking.htm o See enviMass website for explanation …
  • 26. 26 Critical Point: Separating Peaks from Baseline
  • 27. 27 Peak Picking Parameters https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-9-504 o There are a lot of options to tweak! • I will just run through (main) centWave parameters • enviPick is too complicated => further reading!
  • 28. 28 Peak Picking Parameters: centWave ppm: maximal tolerated m/z deviation in consecutive scans, in ppm (parts per million). NOTE: dependent on your mass spectrometer
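As a worked example of what a ppm tolerance means in absolute m/z units, the sketch below converts a ppm value into an m/z window and checks whether two masses agree within it. The helper names are hypothetical and this is plain arithmetic, not code from xcms.

```python
from typing import Tuple


def ppm_window(mz: float, ppm: float) -> Tuple[float, float]:
    """Return the (low, high) m/z bounds that lie within +/- ppm of mz."""
    delta = mz * ppm / 1e6
    return (mz - delta, mz + delta)


def within_ppm(mz_ref: float, mz_obs: float, ppm: float) -> bool:
    """True if mz_obs deviates from mz_ref by no more than ppm parts per million."""
    return abs(mz_obs - mz_ref) / mz_ref * 1e6 <= ppm


# At m/z 200, a 5 ppm tolerance corresponds to +/- 0.001 in m/z.
low, high = ppm_window(200.0, 5.0)
```

Because the window scales with m/z, the same ppm setting is tighter in absolute terms at low masses than at high masses.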
  • 29. 29 Peak Picking Parameters: centWave peakwidth: chromatographic peak width, given as a range (min, max) in seconds. NOTE: highly dependent on your chromatography!
  • 30. 30 Peak Picking Parameters: centWave snthresh: signal-to-noise ratio cutoff
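A signal-to-noise cutoff can be illustrated with a simplified estimate: peak height above a baseline, divided by the spread of the noise. In this sketch the baseline and noise come from a user-supplied list of noise-only intensities, which is an assumption for illustration; centWave estimates these quantities internally and differently.

```python
import statistics
from typing import Sequence


def signal_to_noise(peak_intensities: Sequence[float],
                    noise_intensities: Sequence[float]) -> float:
    """Simplified S/N: (peak maximum - mean noise) / standard deviation of noise."""
    baseline = statistics.mean(noise_intensities)
    sd = statistics.stdev(noise_intensities)  # needs at least two noise points
    if sd == 0:
        return float("inf")  # perfectly flat noise: any signal passes
    return (max(peak_intensities) - baseline) / sd


def passes_snthresh(sn: float, snthresh: float) -> bool:
    """Keep a candidate peak only if its S/N reaches the cutoff."""
    return sn >= snthresh
```

Raising the cutoff removes more noise features but also risks discarding genuine low-intensity peaks, so it is worth checking the effect visually.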
  • 31. 31 Peak Picking Parameters: centWave prefilter: prefilter=c(k,I). Prefilter step for the first phase. Mass traces are only retained if they contain at least k peaks with intensity >= I. A trace with only one “stick” will therefore fail the recommended prefilter settings
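The prefilter rule quoted above (keep a mass trace only if it contains at least k points with intensity >= I) can be sketched in a few lines. The function name and the example values are hypothetical; this mirrors the stated rule, not the xcms source code.

```python
from typing import Sequence


def passes_prefilter(trace_intensities: Sequence[float], k: int, min_intensity: float) -> bool:
    """True if the trace has at least k points with intensity >= min_intensity."""
    strong_points = sum(1 for i in trace_intensities if i >= min_intensity)
    return strong_points >= k


# A trace with several strong points is kept ...
kept = passes_prefilter([50, 120, 300, 150, 40], k=3, min_intensity=100)
# ... but a single intense "stick" is dropped, however tall it is.
dropped = passes_prefilter([5000], k=3, min_intensity=100)
```

This shows why a single stick fails: the filter counts points above the intensity threshold, so height alone cannot compensate for too few points.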
  • 32. 32 Too Many Peak Picking Parameters ??????? https://bioconductor.org/packages/release/bioc/vignettes/IPO/inst/doc/IPO.html o IPO to the rescue! o Parameter optimization for xcms-based workflows … o Libiseller et al 2015, DOI: 10.1186/s12859-015-0562-8 IPO = Isotopologue Parameter Optimization
  • 33. 33 Too Many Peak Picking Parameters ???????
  • 34. 34 RECAP: Why Peak Pick? Identification = turning numbers into structures [Figure: chemical structure drawings] massbank.eu
  • 35. 35 o Instruments change over time … o Before we can do fancy statistics, we need to make sure our samples are comparable!
  • 38. 38 Alignment ~= Retention Time Correction http://bioconductor.org/packages/release/bioc/vignettes/xcms/inst/doc/xcms.html#3_initial_data_inspection o Many algorithms and methods … o Before:
  • 39. 39 Alignment ~= Retention Time Correction http://bioconductor.org/packages/release/bioc/vignettes/xcms/inst/doc/xcms.html#5_alignment o Many algorithms and methods … o After (Obiwarp algorithm in xcms)
  • 41. 41 Changes over samples http://bioconductor.org/packages/release/bioc/vignettes/xcms/inst/doc/xcms.html#5_alignment o Difference between adjusted and raw retention times along the retention time axis
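The quantity plotted on this slide, the difference between adjusted and raw retention times, is simple to compute once both sets of times are available. The sketch below assumes two equal-length lists of retention times in seconds for one sample; the function name is hypothetical and it does not perform the alignment itself (Obiwarp or otherwise).

```python
from typing import List, Sequence


def rt_adjustment(raw_rt: Sequence[float], adjusted_rt: Sequence[float]) -> List[float]:
    """Per-scan retention time shift (adjusted - raw), in the same units as the input."""
    if len(raw_rt) != len(adjusted_rt):
        raise ValueError("raw and adjusted retention times must have the same length")
    return [adj - raw for raw, adj in zip(raw_rt, adjusted_rt)]


# Hypothetical values for one sample: early scans shifted later, late scans earlier.
shifts = rt_adjustment([10.0, 20.0, 30.0], [10.5, 20.0, 29.0])
largest_shift = max(abs(s) for s in shifts)
```

Plotting these shifts against the raw retention time for every sample gives a quick sanity check: very large or erratic shifts suggest that the alignment needs another look.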
  • 42. 42 Some advice … o Peak pickers are designed to pick the perfect peak • But life is never perfect and peaks are no different o Pick the peak picker that is best for your situation • Convenience, ease of use, designed for your data, … • The optimal choice is usually a compromise o Be sceptical (visualise your data, reality check it, etc.) • But don’t go overboard in evaluating peak pickers … remember your (real) goal …
  • 43. 43 Peak Picking Overlap (centWave paper) https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-9-504
  • 44. 44 Verify with EIC Extraction [these are NOT picked] https://github.com/schymane/ReSOLUTION/blob/master/R/RMB_EIC_prescreen.R Panels: No peak at all; Nice peak, MSMS; Peak, no MSMS; Noise with MSMS (careful!); Isobars with MSMS (careful!)* Looking for chemicals known to be present in the sample
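Extracting an ion chromatogram (EIC) for a known mass needs no peak picker at all: for each scan, sum the intensity that falls inside an m/z window around the target. The sketch below assumes scans are given as (retention time, list of (m/z, intensity)) pairs; this data layout and the function name are illustrative assumptions, and it is not a port of the R script linked on the slide.

```python
from typing import List, Tuple

# Assumed layout for this sketch: (retention time, [(m/z, intensity), ...])
Scan = Tuple[float, List[Tuple[float, float]]]


def extract_eic(scans: List[Scan], target_mz: float, ppm: float) -> List[Tuple[float, float]]:
    """Return (retention time, summed intensity within +/- ppm of target_mz) per scan."""
    delta = target_mz * ppm / 1e6
    eic = []
    for rt, peaks in scans:
        total = sum((i for mz, i in peaks if abs(mz - target_mz) <= delta), 0.0)
        eic.append((rt, total))
    return eic


# Hypothetical scans; 180.0423 is used here purely as an example target m/z.
scans = [
    (1.0, [(180.0423, 100.0), (200.0, 50.0)]),
    (2.0, [(180.0425, 400.0)]),
    (3.0, [(250.0, 10.0)]),
]
eic = extract_eic(scans, target_mz=180.0423, ppm=5.0)
```

Plotting the resulting trace for each suspect mass makes it easy to see which of the cases above applies (no peak, nice peak, noise, isobars) before trusting any automated result.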
  • 45. 45 Just because you find a peak … ENTACT Project: https://www.epa.gov/sites/production/files/2018-06/documents/comptox_cop_6-28-18.pdf o Mix 505: One candidate with this mass/formula • DTXSID9040001, C9H8O4 o One chemical… How many peaks?
  • 46. 46 …doesn’t mean it’s your compound of interest!
  • 47. 47 Beware of artefacts! o Your results also depend on the acquisition data!
  • 48. 48 Further reading DOING! [Vendor independent] o Don’t just take my word for it … don’t just read about it … DO IT. There are so many ways to try it out … complete with sample data! [Open Science!] o http://bioconductor.org/packages/release/bioc/vignettes/xcms/inst/doc/xcms.html o http://www.looscomputing.ch/eng/enviMass/overview.htm o An interface that many enjoy, likely comes with example data but requires a login … o https://xcmsonline.scripps.edu/
  • 49. 49 Further reading DOING! [Vendor independent] o http://mzmine.github.io/ o http://prime.psc.riken.jp/Metabolomics_Software/MS-DIAL/ o MS-DIAL
  • 52. 52 Quality Control of Data Slide c/o Michael Stravs o Always visualise results … never take anything for granted
  • 53. 53 Homologues: Challenge Peak Pickers but are Present! Stravs et al. (2013), J. Mass Spectrom, 48(1):89-99. DOI: 10.1002/jms.3131 [Figure: structure of SPA-9C, m+n=6] www.massbank.eu ACCESSIONS (LAS, SPACs): Literature MS/MS LIT00034, LIT00037; Std Mix., Sample ETS00012, ETS00018 https://github.com/MassBank/RMassBank/ Tentatively Identified Spectra: http://goo.gl/0t7jGp
  • 54. 54 Be wary of instrument specific phenomena! o R package nontarget: satellite peak removal
  • 55. 55 Be wary of instrument specific phenomena II o Orbitrap-specific calibration issues (not observed in TOF)