Secondary Structure Prediction
Abida Shehezadi
Centre of Excellence in Molecular Biology
University of the Punjab, Lahore
BIOINFORMATICS
WHY PROTEIN STRUCTURE?
• Function of the biological macromolecules is intricately related
to their 3-D shape and structure
• Structural knowledge is therefore an important step to
understand the function
• Structures are better conserved than sequences
• Designing site-directed mutants to test hypotheses about
function
• Identification of active/binding site
• Modeling substrate specificity
• Protein-protein Docking simulations
WHY PROTEIN STRUCTURE?
• Protein Engineering
• Drug Designing
• Identifying structure-function relationship of
proteins
HOW TO FIND STRUCTURE?
• Experimental Procedures
– X-ray Crystallography
– NMR Spectroscopy
– Cryo-EM
• Prediction Methods
X-Ray Crystallography
• X-ray crystallography is the science of determining the arrangement of
atoms within a crystal. The crystal acts as a 3-D diffraction grating and
produces a diffraction pattern when a beam of X-rays passes through it.
The diffraction pattern encodes the distribution of electrons in the
crystal, although only intensities are measured and the phases must be
recovered separately.
• By Fourier transformation of the diffraction data (amplitudes combined
with phases), we can obtain the structure of the molecule in the crystal.
• The method also produces a 3-D picture of the density of electrons
within the crystal, from which the mean atomic positions, their chemical
bonds, their disorder and other information can be derived.
• A wide variety of materials can form crystals — such as salts, metals,
minerals, semiconductors, as well as various inorganic, organic and
biological molecules — which has made X-ray crystallography
fundamental to many scientific fields.
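The role of the Fourier transform can be illustrated with a small 1-D sketch. It is a toy, not a crystallographic program: the `density` values are made up, and the names `dft` and `inverse_dft` are illustrative helpers written here. It shows that amplitudes together with phases give back the density, while amplitudes alone do not.

```python
import cmath

def dft(values):
    """Discrete Fourier transform: density samples -> complex 'structure factors'."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * cmath.pi * h * x / n)
                for x, v in enumerate(values))
            for h in range(n)]

def inverse_dft(coeffs):
    """Inverse transform: complex coefficients -> real-space density samples."""
    n = len(coeffs)
    return [sum(c * cmath.exp(2j * cmath.pi * h * x / n)
                for h, c in enumerate(coeffs)).real / n
            for x in range(n)]

# Made-up 1-D "electron density" along one axis of a unit cell (assumption).
density = [0.0, 1.0, 4.0, 1.0, 0.0, 0.0, 2.0, 0.0]

structure_factors = dft(density)

# With amplitudes AND phases, the density is recovered.
recovered = inverse_dft(structure_factors)

# With amplitudes only (phases set to zero), the map is different.
amplitudes_only = inverse_dft([abs(f) for f in structure_factors])
```

Comparing `recovered` and `amplitudes_only` against `density` shows why phase information has to be obtained before a density map can be computed.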
X-Ray Crystallography
• Slow, resource-intensive process
• Requires pure and homogeneous protein
• Requires screening of crystallization conditions
• Protein must be able to crystallize
• Protein is observed in a crystal lattice, not in free solution
• Crystal packing may distort the structure of some proteins
NMR Spectroscopy
• NMR spectroscopy is used to obtain information about the
structure and dynamics of proteins.
• Protein nuclear magnetic resonance spectroscopy (protein NMR)
techniques are continually being used and improved in
both academia and the biotech industry.
• Structure determination by NMR spectroscopy usually consists
of several phases, each using a separate set of highly
specialized techniques:
– sample preparation,
– resonance assignment,
– restraint generation, and
– structure calculation and validation.
Principles of NMR
• Measures nuclear magnetism or changes in nuclear magnetism
in a molecule
• NMR spectroscopy measures the absorption of radio-frequency
radiation due to changes in nuclear spin orientation
• NMR only occurs when a sample is in a strong magnetic field
• Different nuclei absorb at different frequencies
NMR
• Crystal is not required
• Protein samples are in aqueous media
• Size of protein is limited (20-30 kDa)
• Protein must be soluble at high concentrations
(30 mg/ml)
Cryo-Electron Microscopy
• Cryo-electron microscopy (cryo-EM) is a form of electron
microscopy (EM) where the sample is studied at cryogenic
temperatures (generally liquid nitrogen temperatures).
• Cryo-EM is gaining popularity in structural biology.
• A version of cryo-EM is cryo-electron tomography (CET) where
a 3D reconstruction of a sample is created from tilted 2D images,
again at cryogenic temperatures (either liquid nitrogen or helium).
Cryo-Electron Microscopy
• Frozen, hydrated samples are used
• An electron beam is used to create an image
• Proteins consist mainly of light atoms (C, N, H, O).
These scatter electrons only weakly, hence image contrast
is very low
Prediction Methods
Why Attempt?
• A good guess is better than nothing!
– Enables the design of experiments
– Does not need material
– Complementary to Crystallography/NMR/Cryo-EM
– Accuracy is often reasonably high
• Crystallography/NMR/Cryo-EM do not always work!
– Many important proteins do not crystallize
– Size limitations with NMR
– Many important proteins have atoms other than C, N, H, O
Prediction of Protein Structure
• Sequence dictates structure
• Ideally, we should be able to determine structure
by using computer simulation
programs that mimic the process of protein
folding…
BUT
Prediction of Protein Structure
• Protein folding problem is not solved yet
• Folding occurs very rapidly with several
intermediate states which are unstable
• Structure determination methods fail to
capture these unstable states
What determines fold?
Anfinsen’s experiments in 1957
demonstrated that proteins can
fold spontaneously into their
native conformations under
physiological conditions. This
implies that primary structure
does indeed determine folding
or 3-D structure.
Other factors
• Physical properties of a protein that influence
stability and therefore determine its fold:
– Rigidity of backbone
– Amino acid interaction with water
– Interactions among amino acids
• Electrostatic interactions
• Hydrogen bonds and disulphide bonds
Structure Prediction Methods
• Secondary Structure Prediction
• Tertiary Structure Prediction
– Ab-initio prediction
– Fold recognition
– Homology modeling
Why predict secondary structure?
• Prediction of secondary structure is a step towards 3-D
structure prediction (Ab-initio method)
• Can be used in threading methods to identify distantly
related proteins
• Provides information about class and architecture, and
can therefore provide clues to further aspects of
structure and function
Secondary Structure Prediction
Methods
• Single Sequence based Procedure
– Statistical Methods (e.g. Chou-Fasman, GOR)
• Multiple Sequence based procedure
– Neural Network Approach (e.g. PHD)
Chou-Fasman Method
Biochemistry, 13:222-245, 1974
• The Chou-Fasman method is an empirical technique for predicting the
secondary structures of proteins, originally developed in the
1970s.
• The method is based on analyses of the relative frequencies of each
amino acid in α-helices, β-sheets, and turns in known
protein structures solved with X-ray crystallography
• Each amino acid is classified by how often it occurs in the different
secondary structures
– A, E, L, and M: α-helix formers
– P and G: helix breakers
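The frequency analysis behind these classes can be sketched as follows. The propensity of a residue for a state is taken here as its frequency in that state divided by the overall frequency of the state. The training sequences and labels are made up for illustration, and the function name `propensities` is an assumption of this sketch, not part of the published method.

```python
from collections import Counter

def propensities(sequences, structures, state):
    """Propensity of each residue type for `state`.

    propensity = (fraction of this residue found in `state`)
                 / (fraction of all residues found in `state`)
    Values above 1 mark formers, values below 1 mark breakers.
    """
    residue_total = Counter()
    residue_in_state = Counter()
    for seq, ss in zip(sequences, structures):
        for aa, s in zip(seq, ss):
            residue_total[aa] += 1
            if s == state:
                residue_in_state[aa] += 1
    overall = sum(residue_in_state.values()) / sum(residue_total.values())
    return {aa: (residue_in_state[aa] / residue_total[aa]) / overall
            for aa in residue_total}

# Toy, made-up data: H = helix, C = anything else (assumption).
toy_sequences = ["AELMAGPG", "GPAELLMG"]
toy_structures = ["HHHHHCCC", "CCHHHHHC"]

helix_p = propensities(toy_sequences, toy_structures, "H")
```

In this toy set A, E, L and M come out above 1 and P and G come out at 0; real tables are derived from many solved structures and give less extreme values.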
…continued
• Table of predictive values (propensities) created for α-helices,
β-sheets, and loops
• For each region, the structure with the greatest overall predictive
value, provided it is greater than 1, is assigned
• The method is at most about 50-60% accurate in identifying
correct secondary structures
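A simplified sketch of the "greatest value greater than 1" rule is given below. It averages propensities over a short window and is not the full Chou-Fasman procedure, which uses nucleation and extension rules. The propensity numbers in `HELIX` and `SHEET` are placeholder values chosen for illustration, not the published table.

```python
from statistics import mean

# Placeholder propensities for a few residues (illustrative assumption,
# NOT the published Chou-Fasman table). Unlisted residues default to 1.0.
HELIX = {"A": 1.4, "E": 1.5, "L": 1.2, "M": 1.4, "P": 0.6, "G": 0.6, "V": 1.0, "I": 1.1}
SHEET = {"A": 0.8, "E": 0.4, "L": 1.3, "M": 1.0, "P": 0.5, "G": 0.7, "V": 1.7, "I": 1.6}

def predict(seq, window=4):
    """Assign H (helix), E (strand) or C (coil/loop) to every residue.

    For each position, average the helix and strand propensities over a
    short window; the state with the larger average wins if it exceeds 1,
    otherwise the residue is called coil.
    """
    out = []
    for i in range(len(seq)):
        start = max(0, i - window // 2)
        chunk = seq[start:start + window]
        helix_score = mean([HELIX.get(aa, 1.0) for aa in chunk])
        sheet_score = mean([SHEET.get(aa, 1.0) for aa in chunk])
        if helix_score >= sheet_score:
            state, score = "H", helix_score
        else:
            state, score = "E", sheet_score
        out.append(state if score > 1.0 else "C")
    return "".join(out)
```

With these placeholder values, `predict("AEAEAE")` gives all helix, `predict("VIVIVI")` all strand, and `predict("GPGPGP")` all coil.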
GOR Method
• The GOR method (Garnier-Osguthorpe-Robson) is an information
theory-based method for the prediction of secondary structures in
proteins, developed in the late 1970s shortly after the Chou-Fasman
method
• Like Chou-Fasman, the GOR method is based on probability
parameters derived from empirical studies of known protein
tertiary structures solved by X-ray crystallography
• However, unlike Chou-Fasman, the GOR method takes into account
not only the tendency of individual amino acids to form particular
secondary structures, but also the conditional probability of the
amino acid to form a secondary structure given the residues
in a window of neighboring positions
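The neighbor-dependent scoring can be sketched in the spirit of GOR as a sum of information terms log[P(S | residue at offset m) / P(S)] over a window around each position. The training data, the pseudocount, and the function names `train_gor` and `predict_gor` are assumptions of this sketch; the window of 8 residues on each side follows the commonly described 17-residue window.

```python
import math
from collections import defaultdict

def train_gor(sequences, structures, half_window=8, pseudo=1.0):
    """Collect counts and return (states, info), where info(s, m, r) is
    log[ P(state s at j | residue r at j+m) / P(state s) ]."""
    states = sorted({s for ss in structures for s in ss})
    state_count = defaultdict(float)
    pair_count = defaultdict(float)      # (state, offset, residue)
    residue_count = defaultdict(float)   # (offset, residue)
    for seq, ss in zip(sequences, structures):
        for j, s in enumerate(ss):
            state_count[s] += 1
            for m in range(-half_window, half_window + 1):
                if 0 <= j + m < len(seq):
                    pair_count[(s, m, seq[j + m])] += 1
                    residue_count[(m, seq[j + m])] += 1
    total = sum(state_count.values())

    def info(s, m, r):
        # Pseudocounts keep the logarithm finite for unseen combinations.
        p_cond = (pair_count[(s, m, r)] + pseudo) / (
            residue_count[(m, r)] + pseudo * len(states))
        return math.log(p_cond / (state_count[s] / total))

    return states, info

def predict_gor(seq, states, info, half_window=8):
    """Pick, for each residue, the state with the largest summed information."""
    out = []
    for j in range(len(seq)):
        scores = {s: sum(info(s, m, seq[j + m])
                         for m in range(-half_window, half_window + 1)
                         if 0 <= j + m < len(seq))
                  for s in states}
        out.append(max(scores, key=scores.get))
    return "".join(out)

# Toy, made-up training set: H = helix, C = coil (assumption).
states, info = train_gor(["AAAAAAGGGGGG", "GGGGGGAAAAAA"],
                         ["HHHHHHCCCCCC", "CCCCCCHHHHHH"])
```

Unlike a per-residue propensity, each score here depends on which residues sit at every offset in the window, so the same amino acid can be scored differently in different contexts.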
What are neural networks?
• An artificial neural network (ANN) is a mathematical or computational
model based on biological neural networks.
• It consists of an interconnected group of artificial neurons and processes
information using a connectionist approach to computation.
• In most cases an ANN is an adaptive system that changes its structure based on
external or internal information that flows through the network during the
learning phase.
• In more practical terms neural networks are non-linear statistical data modeling
tools. They can be used to model complex relationships between inputs and
outputs or to find patterns in data.
• Parallel, distributed information processing structures which draw their ultimate
inspiration from neurons in the brain
• Main class: feed-forward network, also known as multi-layer perceptron
• Paradigm for tackling pattern classification and regression tasks
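A feed-forward network (multi-layer perceptron) can be sketched as one hidden layer with a non-linear activation followed by an output layer that turns scores into class probabilities. The weights below are arbitrary placeholder numbers, not trained values, and the three outputs are only assumed to stand for three classes such as helix, strand and coil.

```python
import math

def dense(inputs, weights, biases):
    """One fully connected layer: weighted sum of the inputs plus a bias per unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """Input -> hidden layer (tanh non-linearity) -> output probabilities."""
    hidden = [math.tanh(z) for z in dense(inputs, w_hidden, b_hidden)]
    return softmax(dense(hidden, w_out, b_out))

# Placeholder weights (assumption): 3 inputs, 2 hidden units, 3 output classes.
W_HIDDEN = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]
B_HIDDEN = [0.0, 0.1]
W_OUT = [[1.0, -1.0], [-0.5, 0.5], [0.2, 0.2]]
B_OUT = [0.0, 0.0, 0.0]

probs = forward([1.0, 0.0, 0.0], W_HIDDEN, B_HIDDEN, W_OUT, B_OUT)
```

Training consists of adjusting the weights so that the output probabilities match known labels; only the forward pass is shown here.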
…continued
• Neural network methods use training sets of solved structures to
identify common sequence motifs associated with particular
arrangements of secondary structures.
• These methods are over 70% accurate in their predictions, although
β-strands are still often underpredicted due to the lack of 3-D
structural information that would allow assessment of hydrogen
bonding patterns that can promote formation of the extended
conformation required for the presence of a complete β-sheet.
• Support vector machines have proven particularly useful for
predicting the locations of turns, which are difficult to identify with
statistical methods
• The requirement of relatively small training sets has also been cited
as an advantage to avoid over-fitting to existing structural data
Neural Network Models
• Machine learning approach
• Training sets of structures (α-helices,
non-α-helices) are provided
• Computers are trained to recognize the patterns
in known secondary structures
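One common way to turn known structures into training examples is to one-hot encode a window of residues around each position and label it helix or non-helix. The sketch below assumes that encoding; the window size, the extra padding slot, and the names `one_hot_windows` and `training_pairs` are illustrative choices.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def one_hot_windows(seq, half_window=6):
    """One feature vector per residue: a one-hot block for every window slot.

    Each slot has 21 entries: 20 amino acids plus one 'padding/unknown'
    entry used beyond the sequence ends or for non-standard letters.
    """
    features = []
    for i in range(len(seq)):
        vector = []
        for k in range(i - half_window, i + half_window + 1):
            slot = [0] * (len(AMINO_ACIDS) + 1)
            if 0 <= k < len(seq) and seq[k] in AMINO_ACIDS:
                slot[AMINO_ACIDS.index(seq[k])] = 1
            else:
                slot[-1] = 1
            vector.extend(slot)
        features.append(vector)
    return features

def training_pairs(seq, structure, half_window=6):
    """Pair each window with a label: 1 for helix (H), 0 for non-helix."""
    labels = [1 if s == "H" else 0 for s in structure]
    return list(zip(one_hot_windows(seq, half_window), labels))

# Made-up example sequence and labels (assumption).
pairs = training_pairs("AELMG", "HHHHC", half_window=1)
```

Each pair can then be fed to a classifier, so the network learns which local sequence patterns accompany helices.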
…continued
• The first successful implementation of neural networks in secondary
structure prediction is by Rost and Sander (1993): PHD
• The PHD system uses a combination of MSA and neural networks
• When a protein is input, PHD finds its homologues, determines
which residues occur at every position using an MSA, and feeds that
information into a series of NNs
• The design of the system was guided by the following observations:
– MSA is useful (regular SSs are mostly structurally conserved)
– In predicting the state of a residue, it is useful to consider a
local window around it
– Helices and sheets occur in runs (you do not see αβαβ; typically you
expect to see at least 4 α-helical residues in a row to form an α-helix)
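Two of these observations can be sketched in a few lines: per-position residue frequencies derived from an MSA, and a clean-up step that removes helices too short to be real. This is a simplified stand-in, not the PHD implementation (which uses further networks rather than a hard filter); the alignment is made up and the function names are assumptions.

```python
from collections import Counter
from itertools import groupby

def column_profile(alignment):
    """Residue frequencies at every alignment column, ignoring gaps ('-')."""
    profile = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        counts = Counter(residues)
        profile.append({aa: n / len(residues) for aa, n in counts.items()}
                       if residues else {})
    return profile

def drop_short_helices(prediction, min_len=4):
    """Relabel helix runs shorter than `min_len` residues as coil (C)."""
    out = []
    for state, run in groupby(prediction):
        length = len(list(run))
        short_helix = state == "H" and length < min_len
        out.append(("C" if short_helix else state) * length)
    return "".join(out)

# Made-up three-sequence alignment (assumption).
profile = column_profile(["AEL", "AEV", "A-L"])
cleaned = drop_short_helices("CHHCHHHHEEC")
```

The profile replaces the single residue at each window position with a frequency vector, which is how the evolutionary information from the homologues reaches the network.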
Some interesting facts
• Accuracy 55% – 85%
• Higher accuracy for α-helices than β-strands
• Accuracy is dependent on protein families
• Predictions for engineered proteins are less accurate
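Accuracy figures of this kind are usually per-residue percentages: the share of residues whose predicted state matches the observed one (often called Q3 when three states are used). A minimal sketch, with made-up prediction and reference strings and illustrative function names:

```python
def q3(predicted, observed):
    """Percentage of residues whose predicted state equals the observed state."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("strings must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)

def per_state_accuracy(predicted, observed, state):
    """Percentage of residues observed in `state` that were predicted as `state`."""
    positions = [i for i, o in enumerate(observed) if o == state]
    if not positions:
        raise ValueError("state does not occur in the observed string")
    hits = sum(predicted[i] == state for i in positions)
    return 100.0 * hits / len(positions)

# Made-up example: H = helix, E = strand, C = coil (assumption).
predicted = "HHHHCCEE"
observed = "HHHCCCEE"
overall = q3(predicted, observed)                      # 7 of 8 residues agree
helix = per_state_accuracy(predicted, observed, "H")   # all 3 helix residues found
```

Reporting the per-state values alongside the overall figure makes differences such as helices versus strands visible.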
Tertiary Structure Prediction
• Ab-initio Method
• Threading or Fold recognition Methods
• Homology Modeling
Ab-Initio Prediction
The assumption:
Native structure is at global energy minimum
• Predicting the 3D structure of a protein without any “prior
knowledge”
• Used when homology modeling and fold recognition have
failed (no homologues are evident)
• Equivalent to solving the “Protein Folding Problem”
Ab-Initio Prediction
The algorithm:
1. Generate a reasonable set of candidate conformations by applying
force-fields
2. Score each with an appropriate scoring function to find the global
energy minimum
3. Choose the one with the best score
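The generate, score, choose loop can be illustrated with a toy model. The sketch below swaps the force field for the 2-D HP lattice model (residues are H, hydrophobic, or P, polar; each non-bonded H-H contact lowers the energy by 1). This is a deliberate simplification chosen for illustration, not how real ab-initio programs represent proteins, and exhaustive enumeration is only feasible for very short chains.

```python
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def conformations(n):
    """Step 1: yield every self-avoiding walk of n residues on a square lattice."""
    def extend(path):
        if len(path) == n:
            yield list(path)
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in path:
                path.append(nxt)
                yield from extend(path)
                path.pop()
    yield from extend([(0, 0)])

def energy(seq, path):
    """Step 2: score a conformation; -1 per H-H contact between residues
    that are lattice neighbors but not adjacent in the chain."""
    total = 0
    for i in range(len(seq)):
        for j in range(i + 2, len(seq)):
            if seq[i] == "H" and seq[j] == "H":
                (x1, y1), (x2, y2) = path[i], path[j]
                if abs(x1 - x2) + abs(y1 - y2) == 1:
                    total -= 1
    return total

def fold(seq):
    """Step 3: return the lowest-energy conformation and its energy."""
    best = min(conformations(len(seq)), key=lambda p: energy(seq, p))
    return best, energy(seq, best)

best_path, best_energy = fold("HPPH")
```

Even in this toy setting the number of conformations grows exponentially with chain length, which is one reason the method is resource intensive.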
Ab-initio Method
• Not always possible
• Resource intensive
• Need of improved, simplified procedure
• Still an ongoing research problem, but
becoming less essential as databases grow

protein Modeling Abi.pptx

  • 1. Secondary Structure Prediction Abida Shehezadi Centre of Excellence in Molecular Biology University of the Punjab, Lahore BIOINFORMATICS
  • 2. WHY PROTEIN STRUCTURE? • Function of the biological macromolecules is intricately related to their 3-D shape and structure • Structural knowledge is therefore an important step to understand the function • Structures better conserved than sequences • Designing site-directed mutants to test hypotheses about function • Identification of active/binding site • Modeling substrate specificity • Protein-protein Docking simulations
  • 3. WHY PROTEIN STRUCTURE? • Protein Engineering • Drug Designing • Identifying structure-function relationship of proteins
  • 4. HOW TO FIND STRUCTURE? • Experimental Procedures – X-ray Crystallography – NMR Spectroscopy – Cryo-EM • Prediction Methods
  • 5. X-Ray Crystallography • X-ray crystallography is the science of determining the arrangement of atoms within a crystal, crystal acts as 3-D grating and produce diffraction when a beam of X-rays is passes through it. The diffraction pattern contain the complete information of the placement of electrons in atoms. • By Fourier transformation of the diffraction pattern, we can obtain the structure of the molecule in the crystal. • The method also produces the 3-D picture of the density of electrons within the crystal, from which the mean atomic positions, their chemical bonds, their disorder and several other information can be derived. • A wide variety of materials can form crystals — such as salts, metals, minerals, semiconductors, as well as various inorganic, organic and biological molecules — which has made X-ray crystallography fundamental to many scientific fields.
  • 6. X-Ray Crystallography • Slow, resource intensive process • Pure and homogeneous Protein • Screening of crystallization conditions • Protein must be able to crystallize • Non-aqueous • Crystal packing – may deform structure of few proteins
  • 7. NMR Spectroscopy • NMR spectroscopy is used to obtain information about the structure and dynamics of proteins. • Protein nuclear magnetic resonance spectroscopy – protein NMR techniques are continually being used and improved in both academia and the biotech industry. • Structure determination by NMR spectroscopy usually consists of several phases, each using a separate set of highly specialized techniques. – sample preparation, – resonances assignment, – restraints generation and – a structure calculation and validation.
  • 8. Principles of NMR • Measures nuclear magnetism or changes in nuclear magnetism in a molecule • NMR spectroscopy measures the absorption of light due to changes in nuclear spin orientation • NMR only occurs when a sample is in a strong magnetic field • Different nuclei absorb light at different energies
  • 9. NMR • Crystal is not required • Protein samples are in aqueous media • Size of Protein is limited (20-30 kDa) • Protein must be soluble in high concentrations (30mg/ml)
  • 10. Cryo-Electron Microscopy • Cryo-Electron microscopy – cryo-EM is a form of electron microscopy (EM) where the sample is studied at cryogenic temperatures (generally liquid nitrogen temperatures). • Cryo-EM is developing popularity in structural biology. • A version of cryo-EM is cryo-electron tomography (CET) where a 3D reconstruction of a sample is created from tilted 2D images, again at cryogenic temperatures (either liquid nitrogen or helium).
  • 11. Cryo-Electron Microscopy • Frozen Hydrated samples used • Electron beam used to create an image • Proteins components as C,N,H,O could be studied. These give very low absorption hence image contrast is very low
  • 12. Prediction Methods Why Attempt? • A good guess is better than nothing! – Enables the design of experiments – Does not need material – Complementary to Crystallography/NMR/Cryo-EM – Pretty high accuracy • Crystallography/NMR/Cryo-EM don’t work always! – Many important proteins do not crystallize – Size limitations with NMR – Many important proteins have atoms other than C, N, H, O
  • 13. Prediction of Protein Structure • Sequence dictates structure • ideally, we should be capable of structure determination by using computer simulation programs that mimic the process of protein folding… BUT
  • 14. Prediction of Protein Structure • Protein folding problem is not solved yet • Folding occurs very rapidly with several intermediate states which are unstable • Structure determination methods fail to capture these unstable states
  • 15. What determines fold? Anfinsen’s experiments in 1957 demonstrated that proteins can fold spontaneously into their native conformations under physiological conditions. This implies that primary structure does indeed determine folding or 3-D structure.
  • 16. Other factors • Physical properties of protein that influence stability & therefore, determine its fold: – Rigidity of backbone – Amino acid interaction with water – Interactions among amino acids • Electrostatic interactions • Hydrogen, disulphide bonds
  • 17. Structure Prediction Methods • Secondary Structure Prediction • Tertiary Structure Prediction – Ab-initio prediction – Fold recognition – Homology modeling
  • 18. Why predict secondary structure? • Prediction of secondary structure is a step towards 3-D structure prediction (Ab-initio method) • Can be used in threading methods to identify distinctly related proteins • Provides information about class, architecture and therefore can provide clues to mine further aspects of structure and function
  • 19. Secondary Structure Prediction Methods • Single Sequence based Procedure – Statistical Methods (e.g. Chou-Fasman, GOR) • Multiple Sequence based procedure – Neural Network Approach (e.g. PHD)
  • 20. Chou-Fasman Method Biochemistry, 13:222-245, 1974 • The Chou-Fasman method is an empirical technique to predict the secondary structures of proteins, originally developed in the 1970s. • The method is based on analyses of the relative frequencies of each amino acid in α-helices, β-sheets, and turns in known protein structures solved with X-ray crystallography • Each amino acid is classified by how often it occurs in each secondary structure – A, E, L, and M: α-helix formers – P and G: helix breakers
  • 21. …continued • A table of predictive values is created for α-helices, β-sheets, and loops • The structure with the greatest overall predictive value, provided that value is greater than 1, is assigned to the region • The method is at most about 50-60% accurate in identifying correct secondary structures
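The propensity-table idea on the two Chou-Fasman slides can be sketched in a few lines. This is a simplified illustration, not the full Chou-Fasman procedure, which also uses nucleation and extension rules. The propensity values are the commonly quoted helix and strand parameters scaled so that 1.0 is average; they are reproduced here as an assumption and should be checked against the original table before any real use. The function name `predict_ss` and the five-residue window are hypothetical choices.

```python
# (helix propensity, strand propensity) per residue; values are an assumption
# based on the commonly quoted Chou-Fasman table, scaled so 1.0 is average.
PROPENSITY = {
    "A": (1.42, 0.83), "R": (0.98, 0.93), "N": (0.67, 0.89), "D": (1.01, 0.54),
    "C": (0.70, 1.19), "Q": (1.11, 1.10), "E": (1.51, 0.37), "G": (0.57, 0.75),
    "H": (1.00, 0.87), "I": (1.08, 1.60), "L": (1.21, 1.30), "K": (1.14, 0.74),
    "M": (1.45, 1.05), "F": (1.13, 1.38), "P": (0.57, 0.55), "S": (0.77, 0.75),
    "T": (0.83, 1.19), "W": (1.08, 1.37), "Y": (0.69, 1.47), "V": (1.06, 1.70),
}


def predict_ss(seq, window=5):
    """Label each residue H (helix), E (strand) or C (coil/other).

    For every position the helix and strand propensities are averaged over a
    window centred on that residue. The state with the larger average wins,
    but only if that average exceeds 1.0; otherwise the residue is coil.
    """
    half = window // 2
    labels = []
    for i in range(len(seq)):
        win = seq[max(0, i - half): i + half + 1]  # window is clipped at the ends
        helix = sum(PROPENSITY[aa][0] for aa in win) / len(win)
        strand = sum(PROPENSITY[aa][1] for aa in win) / len(win)
        best, state = max((helix, "H"), (strand, "E"))
        labels.append(state if best > 1.0 else "C")
    return "".join(labels)
```

Calling `predict_ss` on a stretch rich in A, E, L and M gives helix labels, while a stretch of G and P falls below the threshold of 1 and is labelled coil, in line with the former and breaker classes listed above.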
  • 22. GOR Method • GOR method (Garnier-Osguthorpe-Robson) is an information theory-based method for the prediction of secondary structures in proteins, developed in the late 1970s shortly after the Chou-Fasman method • Like Chou-Fasman, GOR method is based on probability parameters derived from empirical studies of known protein tertiary structures solved by X-ray crystallography • However, unlike Chou-Fasman, GOR method takes into account not only the tendency of individual amino acids to form particular secondary structures, but also the conditional probability of the amino acid to form a secondary structure given that its immediate neighbors have already formed that structure
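A rough sketch of the information-theoretic scoring behind GOR, under stated assumptions: each residue in a window around position j contributes log[P(S | residue at that offset) / P(S)] to the score of state S at j, and the state with the highest summed score is chosen. The training pairs in `TOY_EXAMPLES` are invented for illustration and are not real structural data; the function names and the pseudocount smoothing are also assumptions. A `half_window` of 8 gives the 17-residue window usually associated with GOR.

```python
import math
from collections import defaultdict

STATES = "HEC"  # helix, strand, coil

# Hypothetical (sequence, secondary structure) pairs, invented for this sketch.
TOY_EXAMPLES = [
    ("AEELLKKAEELLKK", "HHHHHHHHHHHHHH"),
    ("VIVTVYVIVTVYVI", "EEEEEEEEEEEEEE"),
    ("GPGSGNGPGSGNGP", "CCCCCCCCCCCCCC"),
]


def train_gor(examples, half_window=8, pseudocount=1.0):
    """Return info(state, offset, residue) estimated from labelled examples."""
    state_count = defaultdict(float)
    pair_count = defaultdict(float)     # (state, offset, residue) -> count
    residue_count = defaultdict(float)  # (offset, residue) -> count
    for seq, ss in examples:
        for j, state in enumerate(ss):
            state_count[state] += 1
            for m in range(-half_window, half_window + 1):
                if 0 <= j + m < len(seq):
                    pair_count[(state, m, seq[j + m])] += 1
                    residue_count[(m, seq[j + m])] += 1
    total = sum(state_count.values())

    def info(state, m, residue):
        # log [ P(state | residue at offset m) / P(state) ]; the pseudocount
        # pulls unseen combinations toward zero information.
        p_state = (state_count[state] + pseudocount) / (total + pseudocount * len(STATES))
        p_cond = (pair_count[(state, m, residue)] + pseudocount * p_state) / (
            residue_count[(m, residue)] + pseudocount)
        return math.log(p_cond / p_state)

    return info


def predict_gor(seq, info, half_window=8):
    """Assign to each residue the state with the largest summed information."""
    labels = []
    for j in range(len(seq)):
        scores = {
            s: sum(info(s, m, seq[j + m])
                   for m in range(-half_window, half_window + 1)
                   if 0 <= j + m < len(seq))
            for s in STATES
        }
        labels.append(max(STATES, key=scores.get))
    return "".join(labels)
```

Usage: `info = train_gor(TOY_EXAMPLES)` followed by `predict_gor("AEELLKK", info)`. The difference from the single-residue propensity approach is that every neighbour in the window contributes to the score of the central residue.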
  • 23. What are neural networks? • Artificial neural network (ANN) is a mathematical model or computational model based on biological neural networks. • It consists of an interconnected group of artificial neurons and processes information using a connectionist approach to computation. • In most cases an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network during the learning phase. • In more practical terms neural networks are non-linear statistical data modeling tools. They can be used to model complex relationships between inputs and outputs or to find patterns in data. • Parallel, distributed information processing structures which draw their ultimate inspiration from neurons in the brain • Main class = feed-forward network alias multi-layer perceptron • Paradigm for tackling pattern classification and regression tasks
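To make the feed-forward (multi-layer perceptron) idea concrete, here is a minimal sketch of a two-layer network evaluated by hand. The weights are set manually so that the network computes XOR, a relationship that no single linear unit can represent; in a real network they would be adjusted during the learning phase. All names and weight values are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Smooth non-linear activation squashing any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """One fully connected layer: weighted sum plus bias, then sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hand-set weights (assumed, not learned): the hidden units behave like
# OR and NAND, and the output unit behaves like AND of the two.
HIDDEN_W = [[20.0, 20.0], [-20.0, -20.0]]
HIDDEN_B = [-10.0, 30.0]
OUTPUT_W = [[20.0, 20.0]]
OUTPUT_B = [-30.0]


def xor_net(x1, x2):
    """Forward pass: input layer -> hidden layer -> single output unit."""
    hidden = layer([x1, x2], HIDDEN_W, HIDDEN_B)
    return layer(hidden, OUTPUT_W, OUTPUT_B)[0]
```

Rounding `xor_net(a, b)` for the four binary inputs gives 0, 1, 1, 0. The hidden layer is what lets the network model a non-linear relationship between inputs and output.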
  • 24. …continued • Neural network methods use training sets of solved structures to identify common sequence motifs associated with particular arrangements of secondary structures. • These methods are over 70% accurate in their predictions, although β-strands are still often underpredicted because of the lack of 3-D structural information that would allow assessment of the hydrogen bonding patterns that promote formation of the extended conformation required for a complete β-sheet. • Support vector machines have proven particularly useful for predicting the locations of turns, which are difficult to identify with statistical methods • The requirement of relatively small training sets has also been cited as an advantage in avoiding over-fitting to existing structural data
  • 25. Neural Network Models • Machine learning approach • Training sets of known structures (α-helices, non-α-helices) are provided • Computers are trained to recognize the patterns in known secondary structures
  • 26. …continued • The first successful implementation of a neural network in secondary structure prediction was by Rost and Sander (1993) – PHD • The PHD system uses a combination of multiple sequence alignment (MSA) and neural networks (NNs) • When a protein is input, PHD finds all the homologues, uses an MSA to find the residues allowed at every position, and feeds that information into a series of NNs • The design of the system was guided by the following observations: – MSA is useful (regular secondary structures (SSs) are mostly structurally conserved) – In predicting what is happening at a residue, it is useful to consider a local window around it – Helices and sheets occur in runs (you do not see αβαβ; typically you expect to see at least 4 α-helical residues in a row to form an α-helix)
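The three design observations can be sketched as three small helpers: a per-column residue profile from an MSA, a local window of profile rows around one residue (the kind of vector a network would take as input), and a filter that removes helix runs that are too short. This is an illustrative sketch, not the PHD implementation; the function names, the half-window of 6 and the zero padding at the sequence ends are assumptions.

```python
import re
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_profile(msa):
    """Residue frequencies per alignment column (gap characters are ignored)."""
    profile = []
    for column in zip(*msa):  # msa: list of equal-length aligned sequences
        counts = Counter(c for c in column if c in AMINO_ACIDS)
        n = sum(counts.values())
        profile.append([counts[aa] / n if n else 0.0 for aa in AMINO_ACIDS])
    return profile


def window_features(profile, center, half_window=6):
    """Concatenate the profile rows around `center` into one input vector."""
    zero_row = [0.0] * len(AMINO_ACIDS)
    features = []
    for k in range(center - half_window, center + half_window + 1):
        # Positions beyond either end of the sequence are zero-padded.
        features.extend(profile[k] if 0 <= k < len(profile) else zero_row)
    return features


def drop_short_helices(ss, min_len=4):
    """Relabel helix runs shorter than `min_len` residues as coil."""
    return re.sub(
        r"H+",
        lambda m: m.group() if len(m.group()) >= min_len else "C" * len(m.group()),
        ss,
    )
```

For an alignment such as `["AELK", "AEIK", "GEL-"]`, the first profile column holds A at 2/3 and G at 1/3, which is the "residues allowed at every position" information the slide refers to.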
  • 27. Some interesting facts • Accuracy 55% – 85% • Higher accuracy for α-helices than β-strands • Accuracy is dependent on protein families • Predictions for engineered proteins are less accurate
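Accuracy figures like these are usually per-residue percentages. A minimal sketch of how such a number can be computed, assuming the predicted and observed structures are given as equal-length strings of H, E and C; the three-state measure is commonly called Q3, and the function names here are assumptions.

```python
def q3(predicted, observed):
    """Fraction of residues whose predicted state matches the observed state."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("strings must be non-empty and of equal length")
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)


def per_state_accuracy(predicted, observed, state):
    """Fraction of residues observed in `state` that were predicted as `state`."""
    positions = [i for i, o in enumerate(observed) if o == state]
    if not positions:
        return None  # the state never occurs in the observed structure
    return sum(predicted[i] == state for i in positions) / len(positions)
```

Computing `per_state_accuracy` separately for "H" and "E" is one way to see the helix versus strand difference mentioned above.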
  • 28. Tertiary Structure Prediction • Ab-initio Method • Threading or Fold recognition Methods • Homology Modeling
  • 29. Ab-Initio Prediction The assumption: Native structure is at global energy minimum • Predicting the 3D structure of a protein without any “prior knowledge” • Used when homology modeling or fold recognition have failed (no homologues are evident) • Equivalent to solving the “Protein Folding Problem”
  • 30. Ab-Initio Prediction The algorithm: 1. Generate all reasonable conformations by applying force fields 2. Score each with an appropriate scoring function to locate the global energy minimum 3. Choose the one with the best score
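The generate, score and choose loop can be illustrated with a toy model. The sketch below swaps the real force field for the two-dimensional HP lattice model: residues are H (hydrophobic) or P (polar), conformations are self-avoiding walks on a square grid, and the energy is minus the number of H-H contacts between residues that are not chain neighbours. This is a deliberately simplified stand-in, not a real ab-initio method, and all function names are assumptions.

```python
from itertools import product

MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def conformations(n):
    """Step 1: yield every self-avoiding walk of n beads on a square lattice."""
    for path in product(MOVES, repeat=n - 1):
        coords = [(0, 0)]
        for step in path:
            dx, dy = MOVES[step]
            x, y = coords[-1]
            coords.append((x + dx, y + dy))
        if len(set(coords)) == n:  # reject walks that revisit a site
            yield coords


def energy(hp, coords):
    """Step 2: score a conformation; each non-bonded H-H contact counts -1."""
    e = 0
    for i in range(len(hp)):
        for j in range(i + 2, len(hp)):  # skip chain neighbours
            if hp[i] == "H" and hp[j] == "H":
                (x1, y1), (x2, y2) = coords[i], coords[j]
                if abs(x1 - x2) + abs(y1 - y2) == 1:
                    e -= 1
    return e


def fold(hp):
    """Step 3: return the conformation with the lowest energy."""
    return min(conformations(len(hp)), key=lambda c: energy(hp, c))
```

Even in this toy setting the number of candidate walks is 4^(n-1) before filtering, so exhaustive enumeration is only feasible for very short chains.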
  • 31. Ab-initio Method • Not always possible • Resource intensive • Need of improved, simplified procedure • Still an ongoing research problem, but becoming less essential as databases grow