Submitted by-
Ishi tandon
CT-IV
Gene:
• A sequence of nucleotides coding for a protein.
Central Dogma:
• Proposed in 1958 by Francis Crick.
• He postulated that not all possible transfers of information are viable.
• He restated it in a paper published in 1970.
CODONS:
• Discovered by Sydney Brenner and Francis Crick in 1961.
• A codon is a triplet of nucleotides; each codon codes for one amino acid in a protein.
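As a minimal sketch of the triplet code (an illustration, not part of the original slides), the snippet below builds the standard genetic code table and reads a DNA coding sequence three nucleotides at a time. The names `CODON_TABLE` and `translate` are hypothetical, and stopping at the first stop codon is a simplifying assumption.

```python
from itertools import product

# Standard genetic code, listed in T, C, A, G order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence codon by codon (hypothetical helper).

    Translation starts at position 0 and stops at the first stop codon ('*').
    Codons containing unknown letters are reported as 'X'.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)
```

For example, `translate("ATGGCCTAA")` reads the codons ATG, GCC and TAA and returns `"MA"`.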
DNA → RNA → PROTEIN → PHENOTYPE, with RNA → cDNA as the reverse step
1. TRANSCRIPTION (DNA → RNA)
2. TRANSLATION (RNA → PROTEIN)
3. GENE EXPRESSION (PROTEIN → PHENOTYPE)
4. REVERSE TRANSCRIPTION (RNA → cDNA)
DEFINITION
• Gene prediction is a prerequisite for detailed functional annotation of genes and genomes.
• It can detect the location of ORFs (open reading frames) and the structures of introns and exons.
• Its ultimate goal is to describe all genes computationally with near 100% accuracy.
• It can reduce the amount of experimental verification work required.
TYPES
• Ab initio-based: relies on gene signals (intron splice sites, transcription factor binding sites, ribosomal binding sites, polyadenylation sites), triplet codon structure and gene content.
• Homology-based: relies on significant matches of the query sequence with sequences of known genes.
• Probabilistic models such as Markov models or hidden Markov models (HMMs).
[Figure: sequence signals in a eukaryotic gene. Genomic DNA (exons and introns, start codon, stop codon, splice sites: GT donor site and AG acceptor site) → transcription → pre-mRNA (Cap, Poly(A)) → splicing → mRNA (Cap, Poly(A)) → translation → protein.]
Exons are usually shorter than introns.
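Prokaryotic gene prediction
• Gene prediction is easier in microbial genomes.
• They have smaller genomes, high gene density and very few repetitive sequences, and many sequenced genomes are available.
• The start codon is usually ATG (GTG and TTG are used occasionally).
• A ribosomal binding site (Shine-Dalgarno sequence) lies upstream of the start codon.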
Open reading frames
• A sequence defined by an in-frame start codon and stop codon, which in turn defines a putative amino acid sequence.
• A genome of length n comprises n/3 codons in any one reading frame.
• Stop codons break the genome into segments between consecutive stop codons.
• The sub-segments of these that begin at a start codon (ATG) are ORFs.
• DNA is translated in all six possible frames: three forward and three reverse.
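The six-frame scan described above can be sketched as follows. This is a minimal illustration under stated assumptions: only ATG is treated as a start codon, the function names and the `min_codons` parameter are hypothetical, and coordinates on the reverse strand refer to the reverse-complemented sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_frame(seq, frame):
    """Yield (start, end) for each ATG...stop ORF in one forward frame.

    `end` is exclusive and includes the stop codon.
    """
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon == "ATG" and start is None:
            start = i  # first in-frame ATG after the previous stop
        elif codon in STOP_CODONS and start is not None:
            yield (start, i + 3)
            start = None


def six_frame_orfs(seq, min_codons=2):
    """Scan three forward and three reverse frames for ORFs.

    Returns tuples (strand, frame, start, end, orf_sequence). For the '-'
    strand, start and end index into the reverse-complemented sequence.
    `min_codons` counts the stop codon as well.
    """
    seq = seq.upper()
    results = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for start, end in orfs_in_frame(strand_seq, frame):
                if (end - start) // 3 >= min_codons:
                    results.append((strand, frame, start, end, strand_seq[start:end]))
    return results
```

On the toy sequence `CCATGAAATGACC`, the only complete ORF found is `ATGAAATGA` in forward frame 2; the other ATG codons are never followed by an in-frame stop.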
[Figure: an open reading frame within a genomic sequence, running from a start codon (ATG) to a stop codon (TGA), followed by an example sequence and its complementary strand.]
CTGCAGACGAAACCTCTTGATGTAGTTGGCCTGACACCGACAATAATGAAGACTACCGTCTTACTAACAC
GACGTCTGCTTTGGAGAACTACATCAACCGGACTGTGGCTGTTATTACTTCTGATGGCAGAATGATTGTG
Probabilistic models
• Statistical description of a gene.
• Markov models and hidden Markov models.
• Used to distinguish the oligonucleotide distributions in coding regions from those in non-coding regions.
• In a Markov model of order k, the probability of a nucleotide in a DNA sequence depends on the k preceding nucleotides.
• Types of order: zero, first and second.
• The higher the order, the more accurately a gene can be predicted.
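To make the idea of order k concrete, here is a small sketch that trains one Markov chain on coding sequences and one on non-coding sequences, then scores a query by its log-odds. Everything here is illustrative: the function names, the pseudocount smoothing and the toy training sequences in the usage note are assumptions, not the method of any particular program.

```python
import math
from collections import defaultdict


def train_markov(sequences, k, alphabet="ACGT", pseudocount=1.0):
    """Estimate P(base | previous k bases) from training sequences.

    Returns a function prob(context, base). A pseudocount keeps unseen
    transitions from getting probability zero (an assumed smoothing choice).
    """
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        for i in range(k, len(seq)):
            counts[seq[i - k:i]][seq[i]] += 1

    def prob(context, base):
        context_counts = counts.get(context, {})
        total = sum(context_counts.values()) + pseudocount * len(alphabet)
        return (context_counts.get(base, 0.0) + pseudocount) / total

    return prob


def log_odds(seq, coding_prob, noncoding_prob, k):
    """Sum of log(P_coding / P_noncoding) over every position after the first k."""
    return sum(
        math.log(coding_prob(seq[i - k:i], seq[i]) / noncoding_prob(seq[i - k:i], seq[i]))
        for i in range(k, len(seq))
    )
```

With a first-order model trained on the toy "coding" sequence `ATGGCGGCGGCG` and the toy "non-coding" sequence `ATATATATATAT`, the query `GCGGCG` scores above zero and `ATATAT` scores below zero. Real gene finders train on much larger sets and typically use higher orders.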
Gene content and length distribution of prokaryotic genes
TYPICAL: Range from 100 to 500 amino acids, with a nucleotide distribution typical of the organism.
ATYPICAL: Shorter or longer, with different nucleotide statistics. These genes tend to escape detection when the typical gene model is used.
Gene finding programs in prokaryotes
• The programs are based on hidden Markov models (HMMs) or interpolated Markov models (IMMs).
  • GeneMark.hmm (microbial genomes).
  • Glimmer (UNIX program from TIGR). Computation involves two steps: model building and gene prediction.
  • FGENESB (bacterial sequences). It uses the Viterbi algorithm and linear discriminant analysis (LDA).
  • RBSfinder: searches for the ribosomal binding site (Shine-Dalgarno sequence) to predict the translation initiation site.
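The Viterbi algorithm mentioned for FGENESB finds the most probable sequence of hidden states in an HMM. The sketch below shows the algorithm on a toy two-state model; the states, probabilities and function name are invented for illustration and do not reproduce the model of FGENESB or any other program.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state path for the observations (log space)."""
    # Scores for the first observation.
    scores = [{
        s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
        for s in states
    }]
    backpointers = []
    for symbol in observations[1:]:
        row, pointers = {}, {}
        for s in states:
            # Best previous state from which to enter state s.
            best_prev = max(states, key=lambda p: scores[-1][p] + math.log(trans_p[p][s]))
            row[s] = (scores[-1][best_prev] + math.log(trans_p[best_prev][s])
                      + math.log(emit_p[s][symbol]))
            pointers[s] = best_prev
        scores.append(row)
        backpointers.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: scores[-1][s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy model (assumed values): "C" = coding, GC-rich; "N" = non-coding, AT-rich.
STATES = ("C", "N")
START = {"C": 0.5, "N": 0.5}
TRANS = {"C": {"C": 0.9, "N": 0.1}, "N": {"C": 0.1, "N": 0.9}}
EMIT = {
    "C": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    "N": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
}
```

Calling `viterbi("ATATGCGCGC", STATES, START, TRANS, EMIT)` labels the AT-rich prefix as non-coding and the GC-rich suffix as coding. Working in log space avoids numerical underflow on long sequences.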
Sensitivity: the ability to include correct predictions. It is the fraction of known genes that are correctly predicted.
Specificity: the ability to exclude incorrect predictions. It is the fraction of predicted genes that correspond to true genes.
• Both are proportions of true signals.
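The two definitions above translate directly into counts of true positives (TP), false negatives (FN) and false positives (FP): sensitivity = TP / (TP + FN) and specificity = TP / (TP + FP). The sketch below assumes genes are compared as exact identifiers; the function name and the gene labels in the usage note are hypothetical.

```python
def sensitivity_specificity(known_genes, predicted_genes):
    """Compute sensitivity and specificity as defined above.

    sensitivity = TP / (TP + FN): fraction of known genes that were predicted.
    specificity = TP / (TP + FP): fraction of predictions that are true genes.
    """
    known, predicted = set(known_genes), set(predicted_genes)
    tp = len(known & predicted)   # correctly predicted genes
    fn = len(known - predicted)   # known genes that were missed
    fp = len(predicted - known)   # predictions that match no known gene
    sensitivity = tp / (tp + fn) if known else 0.0
    specificity = tp / (tp + fp) if predicted else 0.0
    return sensitivity, specificity
```

With four known genes `{"a", "b", "c", "d"}` and three predictions `{"a", "b", "e"}`, sensitivity is 2/4 and specificity is 2/3. Note that specificity in this sense is what other fields call precision.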
Eukaryotic gene prediction
• Genomes are much larger than those of prokaryotes (10 Mbp to 670 Gbp).
• Low gene density.
• The space between genes is very large and rich in repetitive sequences and transposable elements.
• Genes are split by intervening noncoding sequences (introns), which must be removed so that the coding sequences (exons) can be joined.
• Splice junctions follow the GT-AG rule.
• An intron has the consensus motif GTAAGT at its 5' splice junction and NCAG at its 3' end.
• Genes have a high density of CG dinucleotides near the transcription start site. This region is called a CpG island. It helps to identify the transcription initiation site of a eukaryotic gene.
• Some post-transcriptional modifications turn the transcript into mature mRNA, viz. capping, splicing and polyadenylation.
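One common way to look for CpG islands is to slide a window along the sequence and compute its GC content and its observed/expected CpG ratio, (number of CG × window length) / (number of C × number of G). The sketch below assumes the frequently quoted thresholds of a 200 bp window, GC content above 50% and a ratio above 0.6; the function names and the step size are hypothetical.

```python
def cpg_stats(seq):
    """Return (GC content, observed/expected CpG ratio) for a sequence."""
    seq = seq.upper()
    n = len(seq)
    c_count, g_count = seq.count("C"), seq.count("G")
    cg_count = seq.count("CG")
    gc_content = (c_count + g_count) / n if n else 0.0
    obs_exp = (cg_count * n) / (c_count * g_count) if c_count and g_count else 0.0
    return gc_content, obs_exp


def cpg_windows(seq, window=200, step=50, min_gc=0.5, min_obs_exp=0.6):
    """Return (start, end) of every window that passes both thresholds."""
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        gc_content, obs_exp = cpg_stats(seq[start:start + window])
        if gc_content > min_gc and obs_exp > min_obs_exp:
            hits.append((start, start + window))
    return hits
```

Overlapping hits would normally be merged into a single island; that step is left out to keep the sketch short.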
[Figure: an intron between exon 1 and exon 2, with GT at the donor site and AG at the acceptor site.]
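The GT-AG rule suggests a simple first pass for locating candidate introns: collect every GT (possible donor) and every AG (possible acceptor) and pair them up. This is only a sketch; the function name and the `min_len` and `max_len` parameters are assumptions, and real predictors score the surrounding consensus rather than accepting every GT...AG pair.

```python
def candidate_introns(seq, min_len=4, max_len=None):
    """List (start, end) spans that begin with GT and end with AG.

    `start` is the index of the G in GT; `end` is exclusive, just past the
    G in AG, so seq[start:end] is the candidate intron.
    """
    seq = seq.upper()
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    acceptor_ends = [i + 2 for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    return [
        (donor, end)
        for donor in donors
        for end in acceptor_ends
        if end - donor >= min_len and (max_len is None or end - donor <= max_len)
    ]
```

For the toy sequence `ATGGTAAGTCCCAGGCC` this yields three candidates, including `(3, 14)`, which spans `GTAAGTCCCAG`. The number of candidates grows quickly with sequence length, which is why signal and content statistics are needed to rank them.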
o CAPPING: Occurs at the 5' end of the transcript. It involves methylation at the initial residue of the RNA.
o SPLICING: The process of removing introns and joining exons. It involves a large RNA-protein complex called the spliceosome.
o POLYADENYLATION: Addition of a stretch of As (~250) at the 3' end of the RNA. The process is accomplished by poly(A) polymerase.
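Splicing and polyadenylation can be mimicked on a sequence string, assuming the intron coordinates are already known. The function name, the half-open coordinate convention and the default tail length of 250 (taken from the figure above) are illustrative choices; capping is a chemical modification and is not modelled.

```python
def splice_and_polyadenylate(pre_mrna, introns, tail_length=250):
    """Remove introns, join the exons and append a poly(A) tail.

    `introns` is a list of non-overlapping (start, end) spans, end exclusive.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # final exon
    return "".join(exons) + "A" * tail_length
```

For example, `splice_and_polyadenylate("AUGGUAAGUCCCAGGCC", [(3, 14)], tail_length=5)` removes the intron `GUAAGUCCCAG` and returns `"AUGGCCAAAAA"`.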
Gene finding programs in eukaryotes
• Three categories of algorithms: ab initio-based, homology-based and consensus-based.
  • Ab initio-based:
It joins the exons in the correct order. It relies on two types of signal:
a) Gene signals: small patterns within the genomic DNA, including putative splice sites, start and stop sites of transcription or translation, branch points, transcription factor binding sites and recognizable consensus sequences.
b) Gene content: statistical properties of a region of genomic DNA, including nucleotide and amino acid distribution, synonymous codon usage and hexamer frequencies.
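Gene content measures such as hexamer frequencies and codon usage are simple counts over a sequence. The following sketch shows both; the function names are hypothetical, and hexamers are counted in overlapping windows, which is one possible convention among several.

```python
from collections import Counter


def kmer_frequencies(seq, k=6):
    """Relative frequency of every overlapping k-mer (hexamers by default)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: count / total for kmer, count in counts.items()}


def codon_usage(coding_sequence):
    """Count codons in frame 0 of a coding sequence."""
    cds = coding_sequence.upper()
    return Counter(cds[i:i + 3] for i in range(0, len(cds) - 2, 3))
```

A gene finder would compare such frequencies, measured on a candidate region, with those learned from known coding and non-coding regions of the same organism.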
  • Neural network-based algorithms
- Composed of a network of mathematical variables.
- Arranged in multiple layers: input, output and hidden layers.
- GRAIL (splice junctions, start and stop codons, poly-A sites, promoters and CpG islands). It scans the query sequence with windows of variable length and scores them.
  • Discriminant analysis
- Linear discriminant analysis (LDA) plots coding signals against all possible 3' splice site positions in a 2D graph; the classes are separated by a diagonal line.
- Quadratic discriminant analysis (QDA) uses a quadratic function; the classes are separated by a curved line.
- FGENES (LDA)
- FGENESH [Find Genes] (HMMs)
- FGENESH_C (similarity-based)
- FGENESH+ (combination of ab initio and similarity-based)
- MZEF [Michael Zhang's Exon Finder] (QDA)
  • HMMs
- GENSCAN (fifth-order HMMs); combines hexamer frequencies with coding signals; predictions with a probability score P > 0.5 are considered reliable.
- HMMgene (conditional maximum likelihood); combines ab initio and homology-based algorithms.
  • Homology-based:
Exon structures and sequences of related species are highly conserved.
Compares the query with homologous sequences derived from cDNA or expressed sequence tags (ESTs).
- GenomeScan (combines GENSCAN prediction results with BLASTX similarity searches)
- EST2Genome (intron-exon boundaries); compares an EST sequence with a genomic DNA sequence
- SGP-1 [Syntenic Gene Prediction] (similar to EST2Genome)
- TwinScan (gene-finding server; similar to GenomeScan)
  • Consensus-based:
Combines the results of multiple programs and keeps the predictions they agree on.
Improves specificity by correcting false positives and the problem of overprediction.
The trade-off is lowered sensitivity and more missed predictions.
- GeneComber (combines HMMgene and GENSCAN prediction results)
- DIGIT (combines FGENESH, GENSCAN and HMMgene)
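The consensus idea can be sketched as a simple vote: keep only the exons reported by at least a given number of programs. This is not how GeneComber or DIGIT combine their inputs; the function name, the exact-coordinate matching and the program labels in the usage note are assumptions made for illustration.

```python
from collections import Counter


def consensus_exons(predictions, min_votes=2):
    """Keep exons predicted by at least `min_votes` programs.

    `predictions` maps a program name to a list of (start, end) exons.
    Exons are matched by exact coordinates (a simplifying assumption).
    """
    votes = Counter()
    for exons in predictions.values():
        votes.update(set(exons))  # each program votes once per exon
    return sorted(exon for exon, count in votes.items() if count >= min_votes)
```

Given three hypothetical programs where `(10, 50)` is reported by all three, `(100, 180)` by two and `(300, 360)` by one, `min_votes=2` keeps the first two exons and `min_votes=3` keeps only `(10, 50)`. Raising the threshold removes false positives but also discards true exons that only one program found, which is the sensitivity cost noted above.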
GENE EXPRESSION
Two steps are required
1. Transcription
The synthesis of mRNA, using the gene on the DNA molecule as a template.
This happens in the nucleus of eukaryotes.
2. Translation
The synthesis of a polypeptide chain using the genetic code on the mRNA molecule as its guide.
Types of RNA
Messenger RNA (mRNA): <5%
Ribosomal RNA (rRNA): up to 80%
Transfer RNA (tRNA): about 15%
In eukaryotes: small nuclear ribonucleoproteins (snRNPs), the building blocks of spliceosomes
Structural characteristics of RNA molecules
Single polynucleotide strand which may be looped or coiled (not a double helix)
Sugar: ribose (not deoxyribose)
Bases used: adenine, guanine, cytosine and uracil (not thymine)
Transcription: The synthesis of a strand of mRNA (and other RNAs)
Uses the enzyme RNA polymerase
Proceeds in the same direction as replication (5' to 3')
Forms a strand of mRNA complementary to the DNA template
It begins at a promoter site, which signals that the beginning of the gene is near (about 20 to 30 nucleotides away)
After the end of the gene is reached, there is a terminator sequence that tells RNA polymerase to stop transcribing
NB Terminator sequence ≠ terminator codon
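Base pairing during transcription can be sketched on strings. The example assumes the template strand is written 3' to 5', so the mRNA it produces reads 5' to 3' in the same left-to-right order; the function names are hypothetical, and promoters and terminators are not modelled.

```python
# Pairing rules: A pairs with U in RNA, T with A, G with C, C with G.
TEMPLATE_TO_MRNA = str.maketrans("ACGT", "UGCA")


def transcribe_template(template_3_to_5):
    """Build the mRNA (5' to 3') complementary to a template strand (3' to 5')."""
    return template_3_to_5.upper().translate(TEMPLATE_TO_MRNA)


def transcribe_coding(coding_5_to_3):
    """Shortcut: the mRNA matches the coding strand with U in place of T."""
    return coding_5_to_3.upper().replace("T", "U")
```

The template `TACCGGATT` and the coding strand `ATGGCCTAA` describe the same gene, and both functions return `AUGGCCUAA`.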
RNA POLYMERASE
Editing the mRNA
In prokaryotes, transcribed mRNA
goes straight to the ribosomes in the
cytoplasm
In eukaryotes, freshly transcribed
mRNA in the nucleus is about 5000
nucleotides long
When the same mRNA is used for
translation at the ribosome it is only
1000 nucleotides long
The mRNA has been edited
The parts which are kept for gene
expression are called EXONS (exons =
expressed)
The parts which are edited out (by
spliceosomes) are called INTRONS.
Translation
[Figure: translation. Labels: ribosomes, start codon, stop codon, polypeptide chain, complete protein.]
© 2016 Paul Billiet ODWS
Translation
• Location: the ribosomes in the cytoplasm, which provide the environment for translation
• The genetic code is brought by the mRNA molecule.
An important discovery
• Retroviruses (e.g. HIV) carry RNA as their genetic information
• When they invade their host cell they convert their RNA into a DNA copy using reverse transcriptase
• Thus the central dogma is modified:
DNA ↔ RNA → Protein
• This has helped to explain an important paradox in the evolution of life.
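Reverse transcription can be sketched the same way as transcription, in the opposite direction. The example below returns the first cDNA strand written 5' to 3', which is the reverse complement of the RNA; the function name is hypothetical and the enzyme's priming and second-strand synthesis are not modelled.

```python
# Pairing rules for copying RNA into DNA: A-T, U-A, G-C, C-G.
RNA_TO_CDNA = str.maketrans("ACGU", "TGCA")


def reverse_transcribe(rna_5_to_3):
    """Return the first-strand cDNA (5' to 3') complementary to an RNA."""
    complement = rna_5_to_3.upper().translate(RNA_TO_CDNA)
    return complement[::-1]  # reverse so the result also reads 5' to 3'
```

For the RNA `AUGGCCUAA` the cDNA is `TTAGGCCAT`; its own complementary strand, `ATGGCCTAA`, carries the same sequence as the RNA with T in place of U.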
Reverse transcriptase
The paradox of DNA
• DNA is a very stable molecule
• It is a good medium for storing genetic material but…
• DNA can do nothing for itself
• It requires enzymes for replication
• It requires enzymes for gene expression
• The information in DNA is required to synthesise enzymes (proteins) but enzymes are required to make DNA function
• Which came first in the origin of life: DNA or enzymes?
RIBOZYMES: Both genetic and catalytic
• Certain forms of RNA have catalytic properties
• RIBOZYMES
• Ribosomes and spliceosomes are ribozymes
• RNA could have been the first genetic information synthesizing proteins…
• …and at the same time a biocatalyst
• Reverse transcriptase provides the possibility of producing DNA copies from RNA.
The ribosome a ribozyme
Gene prediction and expression

  • 2. Gene: • Asequence of nucleotides coding for protein. CentralDogma: • Proposed in 1958 by Francis Crick. • Hepostulated that all possibleinformation transferred, are not viable. • Hepublished apaper in 1970. CODONS: • Discovered by Sydney Brenner and Francis Crickin 1961. • In every triplet of nucleotides, each codoncodesfor one amino acid in aprotein.
  • 3. DNA RNA PROTEIN PHENOTYPE 2 4 cDNA 1 3 1. TRANSCRIPTION 2. TRANSLATION 3. GENE EXPRESSION 4. REVERSETRANSCRIPTION
  • 4. DEfiniTION • It is aprerequisite for detailed functionalannotation of genesand genomes. • It candetect location of ORFs(Open Reading Frames), structures of introns andexons. • It describes all the genescomputationally withnear 100% accuracy. • It canreduce the amount ofexperimental verification work required.
  • 5. TYPES • Abinitio- gene signals, intron splice, transcription factor binding site, ribosomal binding site, poly- adenylation site, triplet codon structure and gene content. • Homology- significant matches of query sequence with sequence of knowngenes. • Probabilistic models like Markov model or Hidden Markov Models (HMMs). Abinitio-based Homology- based
  • 6. Translation Protein Splicing mRNA Cap- -Poly(A) Transcription pre-mRNA Cap- -Poly(A) Genomic DNA Stop codon GT AG exon intron Splice sites Donor site Acceptor site SEQUENCE SIGNALS Start codon Exonsare usually shorter thanintrons.
  • 7. Prokaryoticgene prediction • Geneprediction is easier in microbialgenomes. • Smaller genomes, high gene density, very few repetitive sequence, more sequenced genomes. • Start codon is ATG. • Ribosomal binding site/Shine Dalgarno sequence.
  • 8. Openreadingframes • A sequence defined by in-frame start and stop codon, which in turn defines aputative amino acid sequence. • Agenome of length n is comprised of (n/3)codons. • Stop codons break genome into segments between consecutive stop codons. • Thesub-segments of these that start from the Start codon (ATG)areORFs. • DNA is translated in all six possible frames, three frames forward and three reverse. ATG TGA Genomic Sequence Open reading frame
  • 10. Probabilisticmodels • Statistical description of agene. • Markov Models &Hidden Markov Models. • Usedto distinguish oligonucleotide distributions in the coding regions from those for non-coding regions. • Probability of distribution of nucleotides inDNA sequence depends on the order k. • Typesof order- zero,first and second. • Order , gene canpredicted more accurately.
  • 11. Genecontent and length distribution of prokaryotic genes TYPICAL ATYPICAL Ranges from100 to 500amino acids with a nucleotide distribution typical ofthe organism. Shorter or longer with different nucleotidestatistics. Genes tend toescape detection when typical gene modelis used.
  • 12. Genefindingprogramsin prokaryotes • Theprograms are based on HMM/IMM.  GeneMark.hmm (microbial genomes)  Glimmer (UNIX program from TIGR). Computation involves two steps viz. model building & gene prediction.  FGENESB (bacterial sequences). It uses Vertibi algorithm & linear discriminant analysis(LDA).  RBSfinder- Searches from ribosomal binding site or shine dalgarno sequence for prediction of translation initiation site.
  • 13. Sensitivity Ability to include correct predictions. It is the fraction of known genescorrectlypredicted. Specificity Ability to exclude incorrect predictions. It is the fraction of predicted genes that correspond to true genes.  Both are the proportion of true signals.
  • 14. Eukaryoticgeneprediction • Genomes are much larger than prokaryotes(10Mbp to 670 Gbp). • Low gene density. • Spacebetween genesis very large and rich in repetitive sequences & transposableelements. • Splitting of genesby intervening noncodingsequences (introns) and joining of coding sequences(exons).
  • 15. • Splice junctions follow GT-AGrule. • An intron at the 5’ splice junction hasaconsensus motif GTAAGTand that at 3’ endNCAG. exon 1 exon 2 • Geneshave ahigh density of CGdinucleotides near the transcription start site. Thisregion is CpGisland. It helps to identify the transcription initiation site of an eukaryotic gene. • Somepost-transcriptional modification occur with the transcript to become mature mRNAviz. Capping, Splicing and Polyadenylation. Acceptor Site Donor Site GT AG
  • 16. • CAPPING: occurs at the 5’ end of the transcript. It involves methylation at the initial residue of the RNA. • SPLICING: the process of removing introns and joining exons. It involves a large RNA-protein complex called the spliceosome. • POLYADENYLATION: addition of a stretch of As (~250) at the 3’ end of the RNA. The process is accomplished by poly-A polymerase.
  • 17. Gene finding programs in eukaryotes • Three categories of algorithms: ab initio-based, homology-based and consensus-based. • Ab initio-based: joins the exons in the correct order. It relies on two kinds of signal: a) Gene signals: small patterns within the genomic DNA, including putative splice sites, start and stop sites of transcription or translation, branch points, transcription factor binding sites and recognizable consensus sequences. b) Gene content: properties of a region of genomic DNA, including nucleotide and amino acid distribution, synonymous codon usage and hexamer frequencies.
  • 18. • Neural network-based algorithm - Composed of a network of mathematical variables. - Multiple layers: input, output and hidden layers. - GRAIL (splice junctions, start and stop codons, poly-A sites, promoters and CpG islands). It scans the query sequence with windows of variable lengths and scores them. • Discriminant analysis - Linear discriminant analysis (LDA) represents a 2D graph of coding signals vs. all possible 3’ splice site positions; the boundary is a diagonal line. - Quadratic discriminant analysis (QDA) uses a quadratic function; the boundary is a curved line. - FGENES (LDA)
  • 19. - FGENESH [Find Genes] (HMMs) - FGENESH_C (similarity-based) - FGENESH+ (combination of ab initio and similarity-based) - MZEF [Michael Zhang’s Exon Finder] (QDA) • HMMs - GENSCAN (fifth-order HMMs); combination of hexamer frequencies with coding signals; probability score P > 0.5 - HMMgene (conditional maximum likelihood); combination of ab initio and homology-based algorithms
  • 20. • Homology-based: exon structures and sequences of related species are highly conserved. Predictions come from comparison with homologous sequences derived from cDNA or expressed sequence tags (ESTs). - GenomeScan (combination of GENSCAN prediction results with BLASTX similarity searches) - EST2Genome (intron-exon boundaries); comparison of an EST sequence with a genomic DNA sequence - SGP-1 [Syntenic Gene Prediction] (similar to EST2Genome) - TwinScan (gene-finding server; similar to GenomeScan)
  • 21. • Consensus-based: combines the results of multiple programs and keeps the predictions they agree on. This improves specificity by correcting false positives and the problem of overprediction, at the cost of lowered sensitivity and missed predictions. - GeneComber (combination of HMMgene and GenScan prediction results) - DIGIT (combination of FGENESH, GENSCAN and HMMgene)
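The consensus idea can be illustrated with a simple vote over predicted exons. This is only a sketch under assumed inputs: each program's output is represented as a list of (start, end) exon coordinates, exact coordinate agreement counts as a vote, and `consensus_exons` is a hypothetical helper. It does not reproduce how GeneComber or DIGIT actually combine their inputs.

```python
from collections import Counter


def consensus_exons(predictions, min_votes):
    """Keep exons reported by at least min_votes of the programs.

    predictions: one list of (start, end) tuples per program.
    An exon counts at most once per program.
    """
    votes = Counter(exon for program in predictions for exon in set(program))
    return sorted(exon for exon, n in votes.items() if n >= min_votes)


# Hypothetical exon calls from three programs.
program_a = [(100, 200), (300, 400)]
program_b = [(100, 200), (500, 600)]
program_c = [(100, 200), (300, 400)]

print(consensus_exons([program_a, program_b, program_c], min_votes=2))
print(consensus_exons([program_a, program_b, program_c], min_votes=3))
```

Raising `min_votes` removes more false positives (higher specificity) but also discards true exons that only some programs found (lower sensitivity), which is the trade-off described on the slide.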
  • 22. GENE EXPRESSION Two steps are required: 1. Transcription: the synthesis of mRNA, using the gene on the DNA molecule as a template. This happens in the nucleus of eukaryotes. 2. Translation: the synthesis of a polypeptide chain, using the genetic code on the mRNA molecule as its guide.
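The two steps can be mimicked on strings. The sketch below makes some simplifying assumptions: the input is the coding (sense) strand of the DNA, so transcription reduces to replacing T with U; translation uses the standard genetic code, starts at the first AUG and stops at the first in-frame stop codon. The function names are chosen for the example.

```python
from itertools import product

# Standard genetic code, codons enumerated in UCAG order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand):
    """mRNA has the same sequence as the coding strand, with U in place of T."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTAA")
print(mrna)
print(translate(mrna))
```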
  • 23. Types of RNA • Messenger RNA (mRNA): <5% • Ribosomal RNA (rRNA): up to 80% • Transfer RNA (tRNA): about 15% • In eukaryotes, small nuclear ribonucleoproteins (snRNPs, the components of spliceosomes). Structural characteristics of RNA molecules • Single polynucleotide strand, which may be looped or coiled (not a double helix). • Sugar: ribose (not deoxyribose). • Bases used: adenine, guanine, cytosine and uracil (not thymine).
  • 24. Transcription: the synthesis of a strand of mRNA (and other RNAs) • Uses the enzyme RNA polymerase. • Proceeds in the same direction as replication (5’ to 3’). • Forms a complementary strand of mRNA. • It begins at a promoter site, which signals that the beginning of the gene is near (about 20 to 30 nucleotides away). • After the end of the gene is reached, a terminator sequence tells RNA polymerase to stop transcribing. • NB: terminator sequence ≠ terminator (stop) codon. [Figure: RNA polymerase]
  • 25. Editing the mRNA • In prokaryotes, transcribed mRNA goes straight to the ribosomes in the cytoplasm. • In eukaryotes, freshly transcribed mRNA in the nucleus can be about 5000 nucleotides long. • When the same mRNA is used for translation at the ribosome, it may be only about 1000 nucleotides long. • The mRNA has been edited. • The parts which are kept for gene expression are called EXONS (exons = expressed). • The parts which are edited out (by spliceosomes) are called INTRONS.
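Once the exon boundaries are known, the editing step amounts to concatenating the exon segments and discarding everything in between. The sketch below assumes exons are given as 0-based, end-exclusive (start, end) coordinates on the pre-mRNA; the `splice` function and the toy sequence are hypothetical.

```python
def splice(pre_mrna, exons):
    """Join the exon segments of a pre-mRNA, dropping the introns.

    exons: list of (start, end) pairs, 0-based and end-exclusive,
    listed in 5' to 3' order.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


# Toy transcript: exon 1, an intron starting with GU and ending with AG, exon 2.
exon1 = "AUGGCC"
intron = "GUAAGUUUUCAG"
exon2 = "GGAUAA"
pre_mrna = exon1 + intron + exon2

mature = splice(pre_mrna, [(0, 6), (18, 24)])
print(len(pre_mrna), len(mature))
print(mature)
```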
  • 27. Translation • Location: the ribosomes in the cytoplasm, which provide the environment for translation. • The genetic code is brought by the mRNA molecule. © 2016 Paul Billiet ODWS
  • 28. An important discovery • Retroviruses (e.g. HIV) carry RNA as their genetic information. • When they invade their host cell, they convert their RNA into a DNA copy using reverse transcriptase. • Thus the central dogma is modified: DNA ↔ RNA → Protein. • This has helped to explain an important paradox in the evolution of life.
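At the sequence level, reverse transcription produces a DNA strand complementary to the RNA template. The sketch below is a string-level illustration only; the function name is hypothetical, and the result is written 5’ to 3’, which is why the complemented sequence is reversed.

```python
# RNA base -> complementary DNA base
RNA_TO_CDNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna):
    """Return the first-strand cDNA complementary to rna, written 5' to 3'."""
    return "".join(RNA_TO_CDNA[base] for base in reversed(rna.upper()))


print(reverse_transcribe("AUGGCCUAA"))
```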
  • 29. The paradox of DNA • DNA is a very stable molecule. • It is a good medium for storing genetic material, but DNA can do nothing for itself. • It requires enzymes for replication. • It requires enzymes for gene expression. • The information in DNA is required to synthesise enzymes (proteins), but enzymes are required to make DNA function. • Which came first in the origin of life: DNA or enzymes?
  • 30. RIBOZYMES: both genetic and catalytic • Certain forms of RNA have catalytic properties: RIBOZYMES. • Ribosomes and spliceosomes are ribozymes. • RNA could have been the first carrier of genetic information, directing protein synthesis and at the same time acting as a biocatalyst. • Reverse transcriptase provides the possibility of producing DNA copies from RNA.
  • 31. The ribosome: a ribozyme