High-throughput proteomics: from understanding data to predicting them


prof. dr. Lennart Martens, UGent - Department of Biochemistry, Faculty of Medicine and Health Sciences; VIB - Group Leader, Computational Omics and Systems Biology Group (CompOmics), Department of Medical Protein Research

In proteomics, as in any high-throughput omics field, the rate of data generation has increased dramatically, yielding very large datasets that require substantial processing to render them useful and interpretable. Key concepts here are data management, data-bound analysis algorithms, and user interface design. But we need not limit ourselves to the interpretation of experimental results. By combining data from across many (unrelated) experiments, we can gain substantial knowledge about the strengths and limitations of our technological approaches. High-throughput methods, however, rarely serve as the endpoint for research. As exquisite parallel hypothesis testers, these approaches can quickly highlight promising follow-up targets for more detailed study. Yet moving from discovery to targeted analysis requires a much more in-depth understanding of sample and methodology, which is where the insights gained from large-scale data analysis come into play. Armed with this knowledge, we can begin to predict experimental outcomes based on specific hypotheses, thus effectively creating tests or assays that can be used in focused validation experiments.




1. proteomics and cross-omics integration lennart martens lennart.martens@ugent.be Computational Omics and Systems Biology Group Department of Medical Protein Research, VIB Department of Biochemistry, Ghent University Ghent, Belgium

2. OMICS TECHNOLOGIES IN (CLINICAL) RESEARCH

3. Omics technologies are massively parallel: microarray, 2D gel, shotgun LC-MS, next-gen sequencing, interaction network, pathway, systems biology modelling

4. …and have a vast analytical range. Anderson’s analysis of identified plasma proteins across three proteomics analyses illustrates the difficulties in consistently finding low-abundance proteins using standard, explorative proteomics analyses. At the same time, it proves the tremendous ability of the instruments to span 11 orders of magnitude in a single analysis! From: Anderson, J. Physiol., 563.1:23-60 (2005), and http://powersof10.com

5. ANALYZING MS PROTEOMICS DATA

6. Tools to visualize your hard-earned data. See: Colaert et al., Journal of Proteome Research, 2011

7. Looking at protein quantification. See: Colaert et al., Proteomics 2010, and Colaert et al., Nature Methods, 2011

8. Analysing separation of plasma samples: 373 SCX separations. See: Foster et al., Proteomics 2011

9. Viewing the analysed data (peptide level). See: Foster et al., Proteomics 2011

10. A whole experiment in 100 numbers. See: Foster et al., Proteomics 2011

11. From 20 magic numbers to 2 dimensions: yeast, human, green plants, zebrafish, Drosophila. See: Foster et al., Proteomics 2011

12. PREDICTING MS PROTEOMICS DATA

13. Predicting RT for modified peptides. See: Moruz et al., submitted

14. Fragmentation variability (i). See: Barsnes et al., Proteomics, 2010

15. Fragmentation variability (ii). See: Barsnes et al., Proteomics 2011

16. Predicting fragment ion intensities (i)

17. INTEGRATING OMICS DATA

18. Clinical data – lipidomics CRC

19. Patient clustering

20. Direct pathway analysis: pathways, patients

21. ACKNOWLEDGMENTS

22. CompOmics group and collaborators: Dr. Kenny Helsens, UGent; Dr. Harald Barsnes, UiB, Bergen, NO; Dr. Michael Mueller, ICL, London, UK; Dr. Sven Degroeve, UGent; Dr. Elien Vandermarliere, UGent; Luminita Moruz, CBR/SU, SK; Niels Hulstaert, UGent; Marc Vaudel, ISAS, Dortmund, DE; Giulia Gonnelli, UGent; Thilo Muth, MPI Magdeburg, DE; Joe Foster, EMBL-EBI, Cambridge, UK; Dr. Niklaas Colaert, ex-UGent

23. Acknowledgments - Collaborators. VIB / UGent, Gent, Belgium: Prof. Dr. Joël Vandekerckhove, Dept. Head (emeritus). Stockholm University, CBR, Sweden: Prof. Dr. Lukas Käll, Group Leader. ISAS, Dortmund, Germany: Prof. Dr. Albert Sickmann, Director Bioanalytics. EMBL-EBI, Cambridge, UK: Dr. Rolf Apweiler, PANDA Group Leader; Dr. Juan Antonio Vizcaíno, PRIDE Group Coordinator. Bergen University, Bergen, Norway: Prof. Ingvar Eidhammer, BCCS; Dr. Frode Berven, PROBE Director.

24. Acknowledgments - Funding

25. Thank you! Questions?

Editor's Notes

From the HUPO PPP2 data set submitted by the Richard Smith Lab at PNNL, 373 experiments, each representing an SCX fraction, were retrieved from PRIDE. The experiments represent 12 individual samples that had undergone a combination of either IgY / MARS depletion and Cys/N-glycosylated peptide fractionation. An experiment-versus-peptide frequency matrix is generated and then weighted by tf-idf to increase the contribution of lower-abundance peptides to each experiment. The matrix then undergoes latent semantic analysis to further boost signal and identify hidden patterns. The result is transformed into a distance matrix and visualised as a heat map.
i) Approximately one third of the way through the SCX fractionation procedure, peptides appear to be bleeding across all subsequent fractions, considerably reducing the separation efficiency and hence the detection sensitivity of the system.
ii) The effect seen in (i) is confirmed here: the separation is performing quite poorly, with bleeding evident.
iii) Additionally, the region highlighted in (ii) shows unexpected similarity between 'MARS Cys' and 'MARS non-Cys' experiments; in theory, the overlap should be extremely small due to the opposite selection procedure.
iv) Slight black blurring around the diagonal indicates peptide identification similarity between adjacent fractions, potentially an early warning sign that the SCX separation performance is starting to degrade. We do see superb reproducibility between samples that have undergone the same sample preparation protocol, however.
v) Further evidence of the points made in (iv): somewhat further increased blurring, but excellent reproducibility of identifications obtained via IgY depletion.
vi) Shows reproducibility in identifications between different depletion methods; a good QC measure, but it also indicates that the depletion method alters the peptides detected, in addition to removing highly abundant proteins.
vii) Another example of the points raised in (vi), but now for a different peptide selection technology.
viii) An unexpected similarity between 'IgY Non-Cys' and 'IgY Non-Gly' sample separation.
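The matrix steps described in this note can be illustrated with a minimal Python sketch. It is not the code used for the slides: the fraction names and peptide strings are invented, cosine distance is an assumed choice of metric (the note does not name one), and the latent semantic analysis step (typically a truncated SVD of the weighted matrix) is left out to keep the example free of external dependencies.

```python
import math
from collections import Counter


def tfidf_rows(experiments):
    """Build a tf-idf weighted experiment-versus-peptide matrix.

    `experiments` maps an experiment name to the list of peptide
    identifications it contains (repeated entries count as frequency).
    """
    names = sorted(experiments)
    counts = {name: Counter(experiments[name]) for name in names}
    peptides = sorted({pep for c in counts.values() for pep in c})
    # Document frequency: in how many experiments each peptide was seen.
    doc_freq = {pep: sum(1 for c in counts.values() if pep in c) for pep in peptides}
    rows = []
    for name in names:
        total = sum(counts[name].values()) or 1  # guard against empty experiments
        rows.append([
            (counts[name][pep] / total) * math.log(len(names) / doc_freq[pep])
            for pep in peptides
        ])
    return names, rows


def cosine_distance(u, v):
    """Cosine distance; assumed metric, the note does not specify one."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0:
        return 1.0
    return max(0.0, 1.0 - dot / norm)


def distance_matrix(rows):
    """Pairwise distances between experiments, ready for a heat map."""
    n = len(rows)
    return [[0.0 if i == j else cosine_distance(rows[i], rows[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical toy input: three SCX fractions with made-up peptide labels.
example = {
    "fraction_01": ["PEPTIDEA", "PEPTIDEB", "COMMONK"],
    "fraction_02": ["PEPTIDEA", "PEPTIDEB", "COMMONK"],
    "fraction_03": ["PEPTIDEC", "PEPTIDED", "COMMONK"],
}
names, rows = tfidf_rows(example)
dist = distance_matrix(rows)
for name, row in zip(names, dist):
    print(name, [round(d, 3) for d in row])
```

In this toy input the peptide seen in every fraction receives a tf-idf weight of zero, so it does not make otherwise unrelated fractions look similar. That is the down-weighting of ubiquitous, high-abundance identifications the note describes.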
For a single experiment, all the MS2 spectra are collected and each peak list is filtered for the top 10% most intense peaks. The m/z values are then turned into a distance matrix, these matrices are combined into a single vector, and a histogram is plotted of the frequencies of m/z differences between peaks. On the left we see the region 40-200 plotted (the m/z range of amino acids); the m/z units corresponding to amino acids are shaded in grey, and these peaks clearly separate themselves from the general level of noise. This highlights that the majority of peaks really represent peptides. In the graph on the right the same region is plotted; here the amino acid bars lie well within the noise of the graph and there is an unusually large peak at 44. This more than likely represents PEG, a common contaminant in mass spectrometry, which has overshadowed the valuable peaks and hindered peptide identification.
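The procedure in this note can be sketched in a few lines of Python. This is an illustration under assumptions rather than the original implementation: the function names, the unit-width bins, and the toy spectra are invented. The spacings in the toy data are the well-known monoisotopic masses of glycine (57.02146 Da) and alanine (71.03711 Da) residues and of the PEG repeat unit (about 44.03 Da), which is why a PEG ladder piles up in the bin at 44.

```python
from collections import Counter
from itertools import combinations


def top_intense_peaks(peaks, fraction=0.10):
    """Keep the most intense `fraction` of (mz, intensity) peaks, at least one."""
    keep = max(1, int(len(peaks) * fraction))
    return sorted(peaks, key=lambda peak: peak[1], reverse=True)[:keep]


def mz_difference_histogram(spectra, fraction=0.10, low=40.0, high=200.0):
    """Count pairwise m/z differences (rounded to unit bins) over all spectra.

    Only differences inside [low, high] are counted, matching the
    40-200 window discussed in the note.
    """
    histogram = Counter()
    for peaks in spectra:
        mzs = sorted(mz for mz, _ in top_intense_peaks(peaks, fraction))
        for smaller, larger in combinations(mzs, 2):
            diff = larger - smaller
            if low <= diff <= high:
                histogram[round(diff)] += 1
    return histogram


# Hypothetical spectrum whose intense peaks are spaced by Gly and Ala residues.
peptide_like = [
    (100.0, 900.0), (157.02146, 800.0), (228.05857, 700.0),
    (120.3, 5.0), (180.7, 4.0), (210.1, 3.0),
]
# Hypothetical contaminated spectrum: a ladder with PEG-like ~44 spacing.
peg_like = [(89.06, 500.0), (133.09, 480.0), (177.11, 450.0)]

clean = mz_difference_histogram([peptide_like], fraction=0.5)
dirty = mz_difference_histogram([peg_like], fraction=1.0)
print(sorted(clean.items()))
print(sorted(dirty.items()))
```

On real data, the counts in the bins that match amino acid residue masses would be compared with the surrounding bins. A dominant bin at 44 with no residue bins standing out is the pattern the note attributes to PEG contamination.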
