Kaitlin Hart
Protein Function Prediction Using ProMOL,
PyMOL, and other Bioinformatics Tools
Rochester Institute of Technology
Introduction
● 3,454 entries in the Protein Data Bank (PDB) have unknown
function (as of 7/25/2014)
● The purpose of this project is to determine the hypothetical
function of proteins in a timely and cost-effective manner
● In silico methods allow us to rapidly narrow candidate functions
● In vitro characterization alone is expensive and time-consuming
ProMOL
4EZI ProMOL Alignment with 1ORV (a hydrolase)
RMSD: 0.36
RMSD Alpha: 0.19
RMSD Alpha & Beta: 0.21
Red: 4EZI Blue: 1ORV
4EZI BLAST Alignment with 3H2H
Query Cover: 78%
E value: 4e-20
Ident: 27%
4EZI Dali Alignment with 3H2H
Z-score: 46.0
RMSD: 1.9
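As a rough aid for reading the BLAST E value above, the sketch below converts an E value into the probability of seeing at least one chance alignment that scores this well. It assumes the usual Poisson model for chance hits (P = 1 - exp(-E)); the function name `evalue_to_pvalue` is hypothetical and not part of BLAST itself.

```python
import math


def evalue_to_pvalue(e_value: float) -> float:
    """Probability of at least one chance hit at this score.

    Assumes chance hits follow a Poisson distribution with mean E,
    so P = 1 - exp(-E). expm1 keeps precision for very small E.
    """
    if e_value < 0:
        raise ValueError("E value must be non-negative")
    return -math.expm1(-e_value)


# E value reported on the slide for the 4EZI / 3H2H alignment.
blast_e_value = 4e-20
p_value = evalue_to_pvalue(blast_e_value)
print(f"E = {blast_e_value:.1e} -> P = {p_value:.1e}")
```

For very small E values the two numbers are practically equal, which is why a tiny E value can be read informally as "very unlikely to arise by chance".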
4EZI Alignment with 1ORV (a hydrolase)
Red: 4EZI Blue: 1ORV
RMSD: 0.36
RMSD Alpha: 0.19
RMSD Alpha & Beta: 0.21
4EZI Alignment with 3H2H (an esterase)
Red: 4EZI Blue: 3H2H
RMSD: 0.24
RMSD Alpha: 0.17
RMSD Alpha & Beta: 0.21
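To make the RMSD values above easier to interpret, here is a minimal sketch of the general root-mean-square deviation definition for two sets of matched atoms. It assumes the two structures are already superposed and that atoms are paired in the same order; it is not ProMOL's own code, and the coordinates are made up for illustration. Restricting the input to alpha carbons, or to alpha and beta carbons, would presumably correspond to the "RMSD Alpha" and "RMSD Alpha & Beta" figures.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched coordinate lists.

    Each list holds (x, y, z) tuples in angstroms, paired by index.
    Superposition of the two structures is assumed to be done already.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical three-atom motif and a copy shifted by 0.2 angstroms along x.
template_atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.5, 0.0)]
query_atoms = [(x + 0.2, y, z) for (x, y, z) in template_atoms]
motif_rmsd = rmsd(template_atoms, query_atoms)
print(f"RMSD = {motif_rmsd:.2f} angstroms")
```

Lower values mean the paired atoms sit closer together after alignment, which is why the sub-angstrom values on the slide are read as close matches.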
p-Nitrophenyl Acetate Kinetics Assay
Molecular Structure of PNP Acetate
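The speaker notes for this assay report a logarithmic trendline of the form y = a ln(x) + b. The sketch below shows one way such a trendline can be computed by ordinary least squares on ln(x). The data points are synthetic, generated from the reported coefficients purely to check the routine; what x and y represent in the original plot is not stated in the source, and `fit_log_curve` is a hypothetical helper name.

```python
import math


def fit_log_curve(xs, ys):
    """Least-squares fit of y = a * ln(x) + b; returns (a, b)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired points")
    us = [math.log(x) for x in xs]  # linearize: y is linear in u = ln(x)
    n = len(us)
    mean_u = sum(us) / n
    mean_y = sum(ys) / n
    s_uu = sum((u - mean_u) ** 2 for u in us)
    s_uy = sum((u - mean_u) * (y - mean_y) for u, y in zip(us, ys))
    a = s_uy / s_uu
    b = mean_y - a * mean_u
    return a, b


# Synthetic points generated from the coefficients quoted in the notes
# (a = 0.0011, b = 0.0089); these are NOT the measured assay data.
xs = [1.0, 2.0, 5.0, 10.0, 20.0]
ys = [0.0011 * math.log(x) + 0.0089 for x in xs]
a, b = fit_log_curve(xs, ys)
print(f"a = {a:.4f}, b = {b:.4f}")
```

With real assay readings the fit would not be exact, and a goodness-of-fit measure such as R² would be reported alongside the coefficients.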
AutoDock Vina
O. Trott, A. J. Olson, AutoDock Vina: improving the speed and accuracy of docking with a
new scoring function, efficient optimization and multithreading, Journal of
Computational Chemistry 31 (2010) 455-461.
AutoDock Vina Alignment of Inositol Phosphate
and 4EZI
Conclusion and Future Plans
Conclusions:
● 4EZI is hypothesized to be a hydrolase or, more
specifically, an esterase
● 4EZI shows little activity with p-nitrophenyl acetate
● AutoDock suggests that inositol would be a good
substrate candidate
Future Plans:
● Continue work with AutoDock to find other possible
substrates
● Test substrates with kinetics assays
Acknowledgements
Thank you to all of the present and past members of the SBEVSL (Structural Biology Extensible
Visualization Scripting Language) project.
This project was funded by RIT, Dowling College, the National Institutes of Health
(2R15GM078077 & 3R15GM078077-02S1), and the National Science Foundation (NSF ATE
040208).
Dr. Paul Craig
Dr. Lea Michel
Greg Dodge
Dr. Herbert Bernstein
Dr. Jeff Mills
References
Altschul, S. F., T. L. Madden, A. A. Schäffer, J. Zhang, Z. Zhang, W. Miller, and D. J. Lipman. “Gapped BLAST and
PSI-BLAST: a New Generation of Protein Database Search Programs.” Nucleic Acids Research 25, no. 17
(September 1, 1997): 3389–3402.
Bernstein, F. C., T. F. Koetzle, G. J. Williams, E. F. Meyer Jr, M. D. Brice, J. R. Rodgers, O. Kennard, T. Shimanouchi,
and M. Tasumi. “The Protein Data Bank: a Computer-based Archival File for Macromolecular Structures.” Journal
of Molecular Biology 112, no. 3 (May 25, 1977): 535–542.
Furnham, N., G. L. Holliday, T. A. de Beer, J. O. Jacobsen, W. R. Pearson, and J. M. Thornton. “The Catalytic Site Atlas 2.0:
Cataloging Catalytic Sites and Residues Identified in Enzymes.” Nucleic Acids Research 42 (January
2014): D485–D489. doi:10.1093/nar/gkt1243.
Hanson, B., C. Westin, M. Rosa, A. Grier, M. Osipovitch, M. MacDonald, G. Dodge, P. Boli, C. Corwin, H. Kessler, T.
McKay, H. Bernstein, and P. Craig. “Estimation of Protein Function Using Template-Based Alignment of Enzyme
Active Sites.” BMC Bioinformatics 15, no. 87 (March 27, 2014). doi:10.1186/1471-2105-15-87.
Holm, L., and P. Rosenström. “Dali Server: Conservation Mapping in 3D.” Nucleic Acids Research 38, no. suppl 2
(July 1, 2010): W545–W549. doi:10.1093/nar/gkq366.
Punta, M., P. C. Coggill, R. Y. Eberhardt, J. Mistry, J. Tate, C. Boursnell, N. Pang, K. Forslund, G. Ceric, J. Clements, A.
Heger, L. Holm, E. L. L. Sonnhammer, S. R. Eddy, A. Bateman, and R. D. Finn. “The Pfam Protein Families
Database.” Nucleic Acids Research 40, no. D1 (November 29, 2011): D290–D301. doi:10.1093/nar/gk
Trott, O., and A. J. Olson. “AutoDock Vina: Improving the Speed and Accuracy of Docking with a New Scoring Function,
Efficient Optimization and Multithreading.” Journal of Computational Chemistry 31 (2010): 455–461.
Questions
Kaitlin Hart
knh2647@rit.edu
www.promol.org

Hart_Kaitlin_ASBMB_4EZI_Presentation_August2015_Final


Editor's Notes

  1. The Protein Data Bank is a repository of protein structures; all publicly released structures are published there. Almost 3,500 proteins in the PDB have no assigned function: we know the structure, but we do not know what these proteins actually do. The purpose of our project was to analyze these proteins in an attempt to find their function. We can do this in silico (using computers), which allows us to rapidly narrow candidate functions. We can also do this in vitro (testing in the wet lab), which is accurate but costly and time-consuming. We need to use both methods when characterizing proteins.
  2. We analyzed 4EZI in silico using three additional bioinformatics tools (BLAST, Dali, and Pfam), which categorize proteins based on entire sequence and/or 3D structure. After we compared the results of all four computer programs, we performed literature searches to determine the best methods for testing the function of the protein in the wet lab. In vitro characterization involved mixing the enzyme with different substrates and measuring the rate at which the enzyme breaks down each substrate.
  3. ProMOL is a computer program used to visualize structural alignment among proteins; it was developed by RIT students in 2008. ProMOL looks at specific parts of proteins of known function, called active sites. This is where substrates bind to the protein and chemical reactions are catalyzed (hand demo). We ran most of the 3,000+ proteins of unknown function through ProMOL and found a promising hit, 4EZI, which we decided to pursue.
  4. We started by running 4EZI through ProMOL. The best match was 1ORV, a hydrolase. Hydrolases are enzymes that cleave bonds by adding water molecules. In this image, 4EZI is red and 1ORV is blue. In ProMOL, this is considered an excellent match.
  5. BLAST compares sequences of proteins. When we ran the sequence of 4EZI through BLAST, the best hit was 3H2H, which is a specific type of hydrolase called an esterase. Esterases hydrolyze ester bonds, typically in molecules with carbon chains shorter than 10 carbons. Dali compares 3D structures of proteins. When we ran 4EZI through the Dali database, the best hit was again the esterase 3H2H. Query cover: the percentage of the query sequence (4EZI) that is covered by the alignment with 3H2H; 3H2H has an extra loop that might have a different function. E value (expectation value): the number of alignments this good that would be expected to occur by chance; the smaller the value (the more negative the exponent), the better, because the match would not normally be expected to happen at random. Ident: the percentage of aligned residues that are identical.
  6. The image on the left is from a previous slide; remember that 1ORV is a hydrolase. The image on the right was generated by running the esterase (a type of hydrolase) 3H2H (the best hit from BLAST and Dali) through ProMOL. As you can see, the two alignments are very similar. 4EZI and 1ORV: RMSD 0.36, RMSD Alpha 0.19, RMSD Alpha & Beta 0.21. 4EZI and 3H2H: RMSD All 0.24, RMSD Alpha 0.17, RMSD Alpha & Beta 0.21.
  7. p-Nitrophenyl acetate plot: y = 0.0011 ln(x) + 0.0089, R² = 0.924; slope: 0.02692. PNP is a chromogenic substrate.
  8. Dr. Oleg Trott
  9. Figure 7. Binding of ITP in 4EZI active site. Serine-195 is highlighted in lime green within the active site of 4EZI. The Phosphate group of the ITP interacts with the hydroxy group of the serine with a distance of 2.9 Å. ITP bound to 4EZI with an affinity of -6.9 kcal/mol.
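For a sense of scale, the docking affinity quoted in the figure caption above can be translated into an approximate dissociation constant using the standard relation ΔG = RT ln(Kd). The sketch below assumes a temperature of 298.15 K and treats the docking score as if it were a true binding free energy, which it only approximates; the function name is hypothetical.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def dissociation_constant(delta_g_kcal, temperature_k=298.15):
    """Estimate Kd (in mol/L) from a binding free energy in kcal/mol.

    Uses delta_G = R * T * ln(Kd), so Kd = exp(delta_G / (R * T)).
    The 298.15 K default is an assumption, not a value from the slides.
    """
    return math.exp(delta_g_kcal / (GAS_CONSTANT_KCAL * temperature_k))


# Affinity reported for ITP docked into the 4EZI active site.
itp_affinity = -6.9  # kcal/mol
itp_kd = dissociation_constant(itp_affinity)
print(f"Estimated Kd: {itp_kd * 1e6:.1f} micromolar")
```

Under these assumptions the estimate falls in the low-micromolar range. Docking scores are best used to rank candidate substrates against each other, so the number should be confirmed with the planned kinetics assays.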