C. elegans cosmid K06A5, 24323 bp.
Flat sequence file – 3955 bp shown.

>CEK06A5
acaagagagggcgcctcggccgtatgttgaatgggagatcgatggaaccgagacaacgagaaaaggaatagagacggagaaagagagagagagcgcgcgttgttggaaggatg
aaaaagaaaaaagacatgagctgcttcacaagagcttggcgaaagcaaagggcaaagtgttgacagcttagtggtggtagttggatcttctctcctcgttctctgctcacaac
tcgtctatcactcatatcacatttatttcccaatatcattttaacaacatcttccgatgcatgttcgtcaatattgcgcaaccactttgcaatattgtcaaaacttttcgcat
ttgtgatatcgtaaaccagcataattcccattgctccgcggtaatatgatgttgtgattgtgtggaatcgttcttgtccagctgtgtcccagatttgtaatttaatctttttt
ccttttaattcgatagttttaattttgaagtcgattcctgaatgaaaaaagaaaattattttgaaatcactagattctgaataaaaactaaccaatagttgagatgaatgtgg
tgttaaaggcatcatccgaaaatctgtacagaatgcaagtttttccaactcctgagtcgcctattagcagcaatttgaagagcatgtcatacggtcggcgagccatttttctt
ctgaaatgagaaaaagttgagaactaaagttgcacaaaagtaagagaaaagcacttgagtcatggcaaatagaacgaacactttgagatttcgaagaagttatcaagagttga
caattggaagatatttggaagaactttctaatttttttctagttttccaaaattaggtttttgtcataaaatgttgtcaaagaaaaaacaggacaaaatagttaattgttgtt
tccattataacaaaaaaaaatttgaacggagctattaacgcgtgcatgcgcaaatcacatcgattagctgtttctgggaaattctcgggaaaaggtgaacagcagctgctggc
ttcctctgcgggtcacgaaaacacaaagagatcattataattgttatttggaaaggaagcgaatctaaaacgggtacaggtggacgtttattgatcgaaagtgctttttattt
gaaattgaatggtgaactttgcaattttgtaatgcaaagtacgttatcagatggcatgagatgtgtgaagtgataaggaataaaatgtgaacgacatgttcaagaaactgtga
tttttcaataatttgtgatgaaatattttaggaacagaaatgaacatattaattgatataaaaacaataggaacactaactcataattatgataggtgaatatcaaaatgtgc
tagattttttgaagttaaaaaatacatttctaatattttttcaaataataagtttcagctgaaatttcagggtgatttcagaaagctatgttttgataaattgttttgaaaat
taaaagaagctacagcaaaaaaaaattaaagagaacatcgctccctcgtagtgtataatttttgattatcgaaaaaaatgagtcaatgatgaaaaggaagtcgcaatctcaaa
acttcaaaaatcaaaagaagccgttgcctctgtcatcaaaaattcagaagacaaggttgttgacaagggtcaattctcagtggtggagggcattgggcgtggtgaaatttttg
aaggctagtgtggttggacctctactagatagacaaaacccccgaaatagacgtttaatttgatgagatggtggagaaagaaaaggactcattctctagatgatagagagacc
agagatacagacaagagagggcgcctcggccgtatgttgaatgggagatcgatggaaccgagacaacgagaaaaggaatagagacggagaaagagagagagagcgcgcgttgt
tggaaggatgaaaaagaaaaaagacatgagctgcttcacaagagcttggcgaaagcaaagggcaaagtgttgacagcttagtggtggtagttggatcatgtgtttttatgttt
ccggtgggagaaggttcaacaaaaaatgaaaagaaaaagttcaagcggcatgaatcattctgagtttaaaacaaaattattgcgaaaattaatattaaaaccttttcacaaaa
cttcaagctaatctgttcatgaaaatttgaataatagttttttcccacctatttagaattaacttcatattaacgaaattaattaacgaatcgaaaattatgacttttcagaa
tcatctgaagttttttcacattccatgctgcatggaataatttgatcctggaatcgatatgtttttatggtatactttttaaccttcaatttagctggaaaagtatggaataa
ataattcccgaagctatgtacatatatgtagaattattgaatgattgtgagaacaacttgactttagcttgagtaggaatcggaatggctatcgaccgatcaacacttaggat
tgtaagaatggcagtaagaatatattgaagaaagaatgtttgttcataggaagagaaagagtattgcgaaatcatcatcgcccactttagaatggacgggcggtgagcggaca
tagagaattgtgaatgactaatgcttttgcagaatctagggcaaaatcgtaggaacaaacaattgtaatacggagaaaacaatcatatcgatcgatgatcatggagaaaaatg
tgatttaagtgagtagacttggaaaaattaataaaagcatgaattgtcgatatttttcatttattttcattataaagctctttaaaaacaaattaaatattgagaatggcttc
gaagaatattgtttcaaatatgttcaatggtgacaccttgcggataaaattaatgtaaaaatcatggaacacagattcactgatatctcattatctcaagcagtgtaattaga
gattttttggaacaattattttataaaactataaataaaccgtttatactactcaaagccaaatattcaagctattaccattttttttctaactaattcttgagcaattaaag
tattccccagtttttattttgcaacgactccaggcaaacacgctccgttgcacttgccgccaaggcgttgcattcaaatcagagagacatctcattccgatttctgtttttct
tccaataaacggtattttatgcctaatgggtgatacggaaattgttcctcttcgagtacaaaatgtacttgatagcgaaatcattcgtctcaacttgtggtccatgaaggtaa
ctgtctagtttttttaagttttcatgatttcaatatttttacagtttaacgcgaccagtttcaaactcgaaggttttgtgagaaatgaagaaggcactatgatgcagaaagtt
tgttccgaatttatttgtgtaagtcgagaaacatattcgtcaacaattttcattaaatattcagagacgcttcacttctacgttgcttttcgatgtttccggacgtttcttcg
acttggtcggacagattgatcgggaatatcaacaaaaaatgggaatgcctagtagaattattgatgaattttcaaatggaattcctgaaaattgggccgaccttatctattcc
tgcatgtcagccaaccaaagaagcgcacttcgccctatccaacaggctccaaaagaaccaattagaactagaacagaaccaattgttacgttggcagatgaaaccgagctaac
tggaggatgccagaaaaattccgaaaacgagaaagaaaggaacagacgtgagcgtgaagaacagcaaacaaaggaacgtgagagaagattagaagaagaaaaacaacgacgag
atgctgaagctgaggctgaaagaaggcgaaaagaagaggaagagctggaagaagctaattacacccttcgtgctccgaaatctcagaacggcgagccaatcactccgataaga
Genome sequence of C.elegans.

                                Sequence of entire genome.

                                Sequence of cDNA clones.

                                Approximately 19,500 PREDICTED protein
                                coding gene sequences.

                                Large number of various kinds of functional
                                RNAs – not discussed further.

                                For this lecture – focus on predicted proteins.




                                     Gene prediction? How?
Science, December 1998.
Computer based predictions

GENEFINDER (C.elegans), BLAST (all genomes) and other computer
programs.

Biases in coding sequence - in C. elegans non-coding is AT rich.
Splice site signals, initiator methionines, termination codons.
Likely exons and probable/possible splice patterns.
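
The AT bias of non-coding sequence can be explored with a sliding window. This is only a minimal sketch, assuming a plain DNA string; the function names, window size and step are illustrative choices and are not how GENEFINDER itself works.

```python
# Illustrative helpers (names and window sizes are assumptions, not GENEFINDER's method).

def at_fraction(seq: str) -> float:
    """Fraction of A and T bases in a DNA sequence (case-insensitive)."""
    seq = seq.lower()
    if not seq:
        return 0.0
    return (seq.count("a") + seq.count("t")) / len(seq)


def sliding_at(seq: str, window: int = 100, step: int = 50):
    """Yield (start, AT fraction) for each full window along the sequence."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, at_fraction(seq[start:start + window])


# Opening bases of the CEK06A5 flat file, used here only as sample input.
example = "acaagagagggcgcctcggccgtatgttgaatgggagatcgatggaaccgagacaacgag"
for start, frac in sliding_at(example, window=20, step=20):
    print(start, round(frac, 2))
```

Windows with a markedly lower AT fraction than their neighbours are candidates for coding sequence, but this signal alone is weak and is normally combined with splice-site and codon evidence.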

BLAST – compare the Translation of all 6 reading frames.
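
Translating all 6 reading frames means 3 offsets on the given strand and 3 on its reverse complement. A minimal sketch, assuming the standard genetic code; the function names are illustrative and unrecognised codons are shown as X.

```python
from itertools import product

# Standard genetic code, built in TCAG order (an assumption: no alternative codes).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate complete codons from position 0; '*' marks a stop codon."""
    seq = seq.upper()
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frames(seq: str) -> dict:
    """Return translations keyed '+1'..'+3' (given strand) and '-1'..'-3' (reverse strand)."""
    frames = {}
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames


for name, protein in six_frames("ATGGCCTAA").items():
    print(name, protein)
```

Each of the six translations can then be compared against protein databases, so a match is found whichever strand and frame the gene uses.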


      • Evidence that a prediction is correct?
      • Homology with genes in other organisms – homologues.
      • Known protein families.

      • Experimental evidence.
The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity
between sequences.

The program compares nucleotide or protein sequences to sequence databases and
calculates the statistical significance of matches.

http://www.ncbi.nlm.nih.gov/
The National Center for Biotechnology Information (NCBI), part of the U.S. National Library of Medicine.



How does BLAST work?
mqnpmillifclfcavicsrgtdsdiphef                                                    Protein Sequence
                                                                                  Single Letter code
                                             Search windows



BLAST compares small sequential blocks – or WINDOWS – of sequence against massive
databases.
It looks for regions of similarity and scores them.
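
The window idea can be sketched as a word index: every short word in the database is recorded, then each window of the query is looked up. This is a simplification, assuming exact 3-letter matches; real BLAST also accepts similar words and extends each hit. The database entry and the function names below are hypothetical.

```python
from collections import defaultdict


def index_words(db: dict, k: int = 3) -> dict:
    """Map every k-letter word to the (sequence name, position) pairs where it occurs."""
    index = defaultdict(list)
    for name, seq in db.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((name, i))
    return index


def seed_hits(query: str, index: dict, k: int = 3) -> list:
    """Return (query position, subject name, subject position) for each exact word match."""
    hits = []
    for q in range(len(query) - k + 1):
        for name, s in index.get(query[q:q + k], []):
            hits.append((q, name, s))
    return hits


# Hypothetical one-entry database; the query is the start of the protein shown above.
toy_db = {"subject1": "AAMQNPMKK"}
print(seed_hits("MQNPMILL", index_words(toy_db)))
```

Runs of hits on the same diagonal (same difference between query and subject position) mark a region worth extending and scoring.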
More BLAST

       High similarity BLAST score                    Conserved regions
                                                      Non-conserved regions
       Low similarity BLAST score
                                                                        Large Protein




Small windows of comparison - detect LOCAL regions of similarity.

Output - % identity and % similarity (similarity also counts conservative amino acid substitutions).

Gives overall score and probability of relatedness.

If the entire protein sequence were compared in one go, you might get a relatively low
overall similarity.
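
The difference between a local score and an overall comparison can be illustrated with a small local-alignment routine. This sketch uses the Smith-Waterman recurrence with made-up scores (match 2, mismatch -1, gap -2); BLAST itself is a faster heuristic and scores proteins with a substitution matrix, so the numbers here are illustrative only.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions in two equal-length, already aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return 100 * sum(x == y for x, y in zip(a, b)) / len(a)


def local_score(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score (Smith-Waterman) using simple illustrative scores."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


# A conserved core (ACGT) flanked by unrelated residues still scores well locally.
print(local_score("XXXACGTXXX", "YYACGTYY"))
print(percent_identity("ACGT", "ACGA"))
```

Because the score is never allowed to fall below zero, unrelated flanking sequence does not drag down a well-conserved region.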

How did genes and gene families evolve and what is meant by protein domains?
We need to come back to this – remember the question!
Below is the sequence of a protein:
                                                        HOMEWORK
   mqnpmillif clfcavicsr gtdsdiphef hkmlkhaksl nsllrdlhvi yspemtnrhvektdkhgaal slksgsmsaq
   rivsiqnisd demdgytlfh lqsmkdikqg ndtcnlqsvcvpipqlsddp qvlmypkcye vkqcvgsccn svetchpgti
   nlvkkhvael lyigngrfmfnmtkeitmee htscscfdcg sntpqcapgf vvgrsctcec ankeernncv
   gnatwnaetckcecdlkcee gkilhkdrcd cvrrrqhhgg prghhghrhh hrsrpidtee vqkigqlkvgrigg




Go to NCBI http://www.ncbi.nlm.nih.gov/
Go to Blast then look down the left for “Choose a BLAST program to run”
From within that section, select “protein blast”.
Copy the above protein sequence and paste it into the box on the top left of web page.
Scroll down the page and click the big blue BLAST button.

Have a look at the outcome – post any questions to the Forum on Moodle.



 BLAST is one of the powerful computational tools for Comparative Genomics
Computational biology is mostly predictive – not EXPERIMENTAL

     Let's look at simple experimental evidence for existence of genes.


       “The Central Dogma” of Molecular Biology

       DNA → mRNA → Protein

     Expressed sequence tags (ESTs) – cDNA clones.

To make cDNA, mRNA is copied into DNA by reverse transcriptase.

RNA → DNA
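
In sequence terms, the RNA → DNA step produces the DNA reverse complement of the mRNA. A minimal sketch, assuming the mRNA is written 5' to 3' using A, C, G and U; the function name is illustrative.

```python
def reverse_transcribe(mrna: str) -> str:
    """Return first-strand cDNA (5'->3') complementary to an mRNA written 5'->3'."""
    pairs = str.maketrans("ACGU", "TGCA")  # RNA base -> complementary DNA base
    return mrna.upper().translate(pairs)[::-1]


# A toy message ending in a short poly-A tail (sequence invented for illustration).
print(reverse_transcribe("AUGGCCAAAA"))
```

The poly-A tail of the message becomes a run of T at the 5' end of the cDNA, which is where an oligo dT primer sits.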



  Retroviruses (e.g. HIV).

  RNA genome → DNA → integration → mRNA → protein
Making cDNA
        Typical eukaryotic gene - double stranded DNA

                                                                               exon
                                                                               intron
  1.                           RNA Polymerase

         Primary transcript – single sense strand RNA – introns present
   5’                                                                                   3’OH
                                                                                 RNA exon
  2.                       Capping, splicing, poly-adenylation
         Messenger RNA (mRNA)
5’ CAP                                                    AAAAAAAAAAA 3’OH
                                                      OH-TTTTTTTT-5’         DNA primer
  3.     First strand cDNA synthesis -reverse transcriptase
                                                          AAAAAAAAAAA   RNA/cDNA duplex
                                                          TTTTTTTT


  4.      Second strand cDNA – DNA polymerase
                                                          AAAAAAAA
                                                          TTTTTTTT   Double stranded cDNA
EST sequencing was carried out in parallel to genome sequencing.

   Simplest experimental evidence that a bit of genomic DNA contains a gene.

                              Making cDNA
   cDNA synthesis oligo dT priming
   Messenger RNA (mRNA)
                                                  AAAAAAAAAAA 3’OH
                                             OH-TTTTTTTT-5’           DNA primer

    cDNA synthesis by random priming
                                                  AAAAAAAAAAA 3’OH
                                                                      DNA primer

              OH-NNNNNNNNN-5’
            Random 6-mers or 9-mers


The advantage of Random Priming is that cDNA clones are not biased towards the 3’ end of the gene.
Sequence data from Random Primed cDNA – ESTs (expressed sequence tags)
     Typical eukaryotic gene - double stranded DNA


       EST 1
                          EST 2
                                                                  EST 3
EST sequences

                EST 4




    The sequencing of ESTs uncovered frequent examples of differential splicing.

    A common example is exon skipping (above).

    Others include alternative 5’ exons, alternative splicing that alters stop
    codons, genes within genes, etc.

    The above is true for C. elegans, humans, flies, and many other species.
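
Exon skipping shows up when an EST contains the exons on either side of one that is missing. A toy sketch, assuming exact substring matches and invented exon sequences; real ESTs contain sequencing errors and are aligned to the genome with splice-aware tools.

```python
def exons_in_est(est: str, exons: dict) -> list:
    """Names of exons whose full sequence occurs in the EST, in order of appearance."""
    found = [(est.find(seq), name) for name, seq in exons.items() if seq in est]
    return [name for _, name in sorted(found)]


def skipped_exons(est: str, exons: dict) -> list:
    """Exons lying between the first and last exon seen in the EST but absent from it."""
    order = list(exons)  # assumes the dict lists exons in genomic order
    present = exons_in_est(est, exons)
    if not present:
        return []
    idx = [order.index(name) for name in present]
    return [name for name in order[min(idx):max(idx) + 1] if name not in present]


# Hypothetical three-exon gene; the sequences are invented for illustration.
gene = {"exon1": "ATGGCC", "exon2": "TTTAAACCC", "exon3": "GGGTGA"}
print(skipped_exons("ATGGCCGGGTGA", gene))
print(skipped_exons("ATGGCCTTTAAACCCGGGTGA", gene))
```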
• C. elegans EST data from approximately 50,000 cDNA clones.
       • Identified 9,356 different genes.



1.   Grind up thousands of worms.
2.   Prepare mRNA – convert to cDNA with reverse transcriptase – clone in plasmid.
3.   Some mRNAs exist at extremely low levels of abundance.
4.   Low abundance cDNAs may be impossible to clone randomly.
Reverse transcriptase PCR – very sensitive.
                                                                            Gene



                                                 AAAAAAAA            mRNA

Primer A.


                                      Primer B


  cDNA from mRNA using reverse transcriptase.

  Amplify cDNA by PCR – primers designed from predicted genes.

  Clone and analyse products.

  Number of experimentally confirmed genes raised to > 18,000.

  Full length cDNA– valuable for confirming intron/exon structure.
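
The expected RT-PCR product can be worked out from a predicted cDNA and the two primers. A minimal sketch, assuming exact primer matches, Primer A identical to the sense strand and Primer B complementary to it; the template and primers are invented, and real primer design also considers melting temperature and mismatches.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon(template: str, primer_a: str, primer_b: str):
    """Return the product between the primers on a sense-strand template, or None.

    primer_a matches the sense strand; primer_b anneals to it, so its reverse
    complement is searched for downstream of primer_a.
    """
    template = template.upper()
    site_a = primer_a.upper()
    site_b = reverse_complement(primer_b)
    start = template.find(site_a)
    if start == -1:
        return None
    end = template.find(site_b, start + len(site_a))
    if end == -1:
        return None
    return template[start:end + len(site_b)]


# Hypothetical predicted cDNA and primer pair.
print(amplicon("GGATGCCCAAATTTGGGTAACC", "ATGCCC", "TTACCC"))
```

Comparing the size of the observed product with the length predicted from the exons is a quick check on the proposed intron/exon structure.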
Summary of predicted and known gene sequences in C. elegans




1.   Predicted 19,500 genes.

2.   At least 18,000 expressed as RNA.

3.   Average of 1 gene per 5 kb.

4.   ~ 42% have detectable homologies to genes/proteins outside Nematoda.
Genome Size

Organism                          Genome     Genes

E.coli (bacteria)                 4.64 Mb    4,377
S. cerevisiae (fungal)            12.1 Mb    6,163
C.elegans (metazoan)              100 Mb     19,300
Arabidopsis (plant)               118 Mb     ~20,000
D. melanogaster (fruit fly)       135.6 Mb   13,472
Mus musculus (mouse)              3059 Mb    ~25,000
Homo sapiens (human)              3286 Mb    ~25,000
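
Gene density follows directly from the table by dividing genome size by gene number. A short worked sketch using the figures above (approximate counts are taken at face value); the function name is illustrative.

```python
# (genome size in Mb, gene count) copied from the table above.
genomes = {
    "E. coli": (4.64, 4377),
    "S. cerevisiae": (12.1, 6163),
    "C. elegans": (100, 19300),
    "Arabidopsis": (118, 20000),
    "D. melanogaster": (135.6, 13472),
    "Mus musculus": (3059, 25000),
    "Homo sapiens": (3286, 25000),
}


def kb_per_gene(size_mb: float, genes: int) -> float:
    """Average genomic DNA per gene, in kilobases."""
    return size_mb * 1000 / genes


for organism, (size_mb, genes) in genomes.items():
    print(f"{organism:16s} {kb_per_gene(size_mb, genes):7.1f} kb per gene")
```

C. elegans comes out at roughly 5 kb per gene, while mouse and human exceed 100 kb per gene, which shows that genome size grows much faster than gene number.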
The C. elegans Top 20 protein Homologies

Number   Description

650      7 TM chemoreceptor
410      Eukaryotic protein kinase domain
240      Zinc finger, C4 (transcription factor)
170      Collagen
140      7 TM receptor
130      Zinc finger, C2H2 (transcription factor)
120      Lectin C-type domain short and long forms
100      RNA recognition motif (RRM, RBD, or RNP domain)
90       Zinc finger, C3HC4 type (transcription factor)
90       Protein-tyrosine phosphatase
90       Ankyrin repeat
90       WD domain, G-beta repeats
80       Homeobox domain (transcription factor)
80       Neurotransmitter-gated ion channel
80       Cytochrome P450
80       Helicases conserved C-terminal domain
80       Alcohol/other dehydrogenases, short-chain type
70       UDP-glucuronosyl and UDP-glucosyl transferases
70       EGF-like domain
70       Immunoglobulin superfamily
Does the “Top 20” list tell us anything?


                Previous slide looked rather boring?

                Test your memory – what was on the list?



Many of the large gene families are implicated in developmental control.




  Core set of proteins needed for general cell biology/metabolism to make a cell
  – e.g. S. cerevisiae ~6,163 genes.

  Evolution of developmental complexity – amplification of families of
  regulatory molecules.

  The above in part explains the increase in number of genes in multicellular
  organisms – it does not explain fully the increase in DNA content.
How much does DNA sequence teach us?

Remember that what we can learn from protein similarities
is limited by what we know about the similar proteins.

We still need to connect genes/proteins with functions.
How has genomics influenced genetics?

                     C. elegans mutants
Wild Type

            dpy-7:   Short fat worm – exoskeletal defect.

            ced-4:   Programmed cell death defective.

            unc-51: Paralysed - abnormal axons.

            dec-2:   long defecation cycle – genetically constipated.
We wanted to investigate the molecular detail of genes defined by mutation.
   We knew where mutant genes mapped and we knew their phenotype.

            Chromosome I               Genetic mapping.

Left arm    m.u.   bli-3
                                       m.u. = map unit.
            -15    egl-30
                                       Genetic mapping – recombination.
                   mab-20
            -10
                                       1 m.u. is 1% recombination per meiosis.
             -5     fog-1
                    unc-73 unc-57
Central      0      dpy-5
                           dpy-14
cluster             fer-1
             5      lin-11 unc-29

                    unc-75                   Parent              Recombinant
            10
                    unc-101
            15

            20      glp-4                fog-1     +       fog-1           +
            25
                    unc-54               glp-4     +         +             glp-4
Right arm
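
The definition of a map unit translates into a one-line calculation. A minimal sketch, assuming recombinant and parental products can be counted directly and that the genes are close enough for double crossovers to be ignored; the counts are invented.

```python
def map_units(recombinants: int, total: int) -> float:
    """Map distance in m.u.: percentage of scored products that are recombinant."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100 * recombinants / total


# Hypothetical cross: 25 recombinant products among 1000 scored.
print(map_units(25, 1000))
```

For widely separated genes the observed recombinant fraction underestimates the true distance, which is why long intervals are built up from shorter ones.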
Sequence of genomes – individual chromosomes
    AGCCTTTATGGCGAGATGGATAGCT………………………..………………………………………….TATAA




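Physical Map of clones

[Figure: physical map of clones shown with the genetic map of chromosome I (scale -15 to 25 m.u.). Genes marked, left to right: bli-3, egl-30, mab-20, fog-1, unc-73, dpy-5, fer-1, lin-11, unc-75, unc-101, glp-4, unc-54.]
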
     How can the physical and genetic maps be aligned?
     Identify the sequence of genes defined by mutation.
[Figure: genetic map of chromosome I (scale -15 to 25 m.u.; genes bli-3 to unc-54) drawn against the physical map.]

   • An association or alignment between the physical and genetic maps.
Positional cloning of genes defined by mutation.

[Figure: genetic map of chromosome I (scale -15 to 25 m.u.; genes bli-3 to unc-54) drawn against the physical map.]

               Imagine lin-11 and unc-101 had both been cloned.

               Where on the physical map might unc-75 be?
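
A first guess is to interpolate between the two cloned genes. The sketch below assumes recombination is uniform across the interval, which is rarely true, so it only narrows the search. All coordinates are hypothetical and chosen for illustration.

```python
def interpolate_physical(genetic_pos: float, left: tuple, right: tuple) -> float:
    """Estimate a physical position (kb) by linear interpolation between two cloned genes.

    left and right are (genetic position in m.u., physical position in kb).
    """
    (g1, p1), (g2, p2) = left, right
    if not g1 <= genetic_pos <= g2:
        raise ValueError("gene must lie between the two flanking markers")
    fraction = (genetic_pos - g1) / (g2 - g1)
    return p1 + fraction * (p2 - p1)


# Hypothetical values, not real map data.
lin_11 = (5.0, 10000.0)   # (m.u., kb)
unc_101 = (13.0, 12000.0)
print(interpolate_physical(9.0, lin_11, unc_101))
```

The estimate identifies a set of candidate clones between the two landmarks; it does not identify the gene itself.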
Transgenic C.elegans – rescue of mutant phenotype.

     DNA injected into the gonads of the adult hermaphrodites.

     The injected DNA forms large heritable molecules termed "free arrays".
Phenotypic Rescue
1.   Inject cosmid into the mutant.
2.   Observe transgenic progeny for phenotypic rescue.
3.   Subclone individual genes from cosmid.
4.   Observe transgenic progeny for phenotypic rescue.



                        Cosmid sequence


                                             Genes




                             Inject unc-75 mutant worms.
Positional cloning of genes defined by mutation.

[Figure: genetic map of chromosome I (scale -15 to 25 m.u.; genes bli-3 to unc-54) drawn against the physical map.]

                         Attempt phenotypic rescue with cosmids.


       • The standard route to clone C. elegans genes defined by mutation.

       • The more genes are cloned the easier it becomes to clone others.
Can’t make transgenic humans – but the same positional
information is used to identify Human disease genes.
RNA Interference (RNAi)

RNAi - sequence-specific inactivation of gene function by either double stranded
RNA (dsRNA) or siRNA.

Since its discovery in C.elegans, it has been found to work in many organisms – e.g.
cultured vertebrate cells, plants, trypanosomes, Drosophila.
Mediators of RNAi - short interfering RNAs (siRNAs)

                 21-23 nt dsRNA duplexes.


DICER – Highly conserved family of RNaseIII enzymes.
Targets double stranded RNA.
Argonaute




Single Stranded interfering RNA
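
The route from long dsRNA to a single-stranded guide can be mimicked on strings. This is a deliberately simplified sketch: it cuts the sense strand into consecutive 21-nt pieces and looks for perfectly complementary sites in an mRNA. Real siRNA duplexes have 3' overhangs and tolerate some mismatches; the sequences and function names are invented.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def revcomp_rna(seq: str) -> str:
    """Reverse complement of an RNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def dice(sense: str, length: int = 21) -> list:
    """Cut the sense strand of a dsRNA into consecutive fragments; a short remainder is dropped."""
    return [sense[i:i + length] for i in range(0, len(sense) - length + 1, length)]


def target_sites(guide: str, mrna: str) -> list:
    """Positions in the mRNA that are perfectly complementary to the guide strand."""
    site = revcomp_rna(guide)
    mrna = mrna.upper()
    return [i for i in range(len(mrna) - len(site) + 1) if mrna.startswith(site, i)]


# Hypothetical 21-nt target site embedded in a short message.
site = "AUGGCCAAAUUUGGGCCCAAA"
guide = revcomp_rna(site)  # antisense strand of the siRNA
print(target_sites(guide, "GGG" + site + "UUU"))
print(len(dice("ACGU" * 11)))
```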
RNAi in C.elegans.




        ds RNA




Observe phenotype of F1 offspring.
It was noticed that the site of injection did not matter – even injection into the intestine works.
How could that affect embryos?
Systemic RNAi
Bacterial Feeding Method in C. elegans
Express dsRNA of a cloned C.elegans gene in a strain of E.coli.
Worms eat the bacteria as food.

RNAi of the gene can be obtained both in the worms that feed on the dsRNA
expressing bacteria, and in the F1 progeny of these worms.
sid-1 mutants are defective
in systemic RNAi




                                     SID-1 protein




                              Transport of dsRNA into Cells by
                              the Transmembrane Protein SID-1
                              Science 301, 1545 (2003)
RNAi as a tool for genetic analysis

Loss of function phenotype can be estimated by RNAi.

RNAi by feeding method – whole genome RNAi projects.

Clones of 16,757 predicted genes tested in genome wide screen.

10.3% gave obvious phenotype.
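
As a quick check on scale, the percentage can be turned back into an approximate gene count. The helper name is illustrative; the two figures are those quoted above.

```python
def genes_with_phenotype(tested: int, percent: float) -> int:
    """Approximate number of genes giving a phenotype, rounded to the nearest gene."""
    return round(tested * percent / 100)


print(genes_with_phenotype(16757, 10.3))
```

That is roughly 1,700 genes with an obvious RNAi phenotype, leaving about 15,000 tested genes without one.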



Redundancy between genes.

RNAi can inactivate more than one gene at a time.

Permits analysis of functionally redundant genes.
Summary, C. elegans Genomics

Permits comparisons with human genes.

Most human disease genes have C. elegans homologues.

Powerful genetic tools – experiments on genes.

Detailed anatomy – relate gene to function.


             Examples of processes investigated.

             Programmed cell death.
             Signalling.
             Cell adhesion.
             Axonal guidance.
             Oncogene function.
             Insulin Pathway
             Ageing
How did genes evolve and what are gene/protein families
Early genomes
– Early genomes made of RNA
     • RNA world - no cells (in modern sense), just RNA, starting with 1
       gene
     • Ribonucleotide polymerase activity - catalyse own synthesis.
     • Later on - translation - encoded info for production of proteins
        – Involves nucleic acids ‘coding for’ proteins
– Later emergence of DNA as the info store - genome stability - less
   labile
– Modern functions of nucleic acids
     •   coding - proteins via mRNA
     •   catalytic – ribozymes
     •   structural – rRNA, tRNA
     •   regulatory - miRNAs

[Figure: diagram linking an inorganic surface, nucleotides, RNA, DNA, mRNA, tRNA/rRNA and protein.]
Where did our genome come from?….

‘Tree of Life’
        - Tree of all Animals

Common ancestor
=> common genome



• Each species’ genome
  descended with modification
  from genome of ancestor

 Reconstruction of picture of ‘ancestral
 genome’?

 Comparative genomics - tells us about state
 of ancestor and changes along each branch
Genes and Genome evolution
• What processes lead to genome evolution…?
           Initial ligation to form early chromosomes


                             inversion

                                         duplication / deletion


    accumulation of point mutations


  Invasion - horizontal gene transfer & transposable elements
Structure of a typical eukaryotic gene

                         TSS                       ATG                            stop
 gene
              promoter                  Intron 1
                         Exon 1                      Exon 2              Exon 3   Exon 4


mRNA                                                                                        Poly A tail
                               5’-UTR                                              3’-UTR


protein
                                               Domain 1       Domain 2

        What features of all genes are missing from this diagram….?

Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 

Genomics lecture 3

  • 1. C. elegans cosmid K06A5, 24323 bp. Flat sequence file –3955 bp shown. >CEK06A5 acaagagagggcgcctcggccgtatgttgaatgggagatcgatggaaccgagacaacgagaaaaggaatagagacggagaaagagagagagagcgcgcgttgttggaaggatg aaaaagaaaaaagacatgagctgcttcacaagagcttggcgaaagcaaagggcaaagtgttgacagcttagtggtggtagttggatcttctctcctcgttctctgctcacaac tcgtctatcactcatatcacatttatttcccaatatcattttaacaacatcttccgatgcatgttcgtcaatattgcgcaaccactttgcaatattgtcaaaacttttcgcat ttgtgatatcgtaaaccagcataattcccattgctccgcggtaatatgatgttgtgattgtgtggaatcgttcttgtccagctgtgtcccagatttgtaatttaatctttttt ccttttaattcgatagttttaattttgaagtcgattcctgaatgaaaaaagaaaattattttgaaatcactagattctgaataaaaactaaccaatagttgagatgaatgtgg tgttaaaggcatcatccgaaaatctgtacagaatgcaagtttttccaactcctgagtcgcctattagcagcaatttgaagagcatgtcatacggtcggcgagccatttttctt ctgaaatgagaaaaagttgagaactaaagttgcacaaaagtaagagaaaagcacttgagtcatggcaaatagaacgaacactttgagatttcgaagaagttatcaagagttga caattggaagatatttggaagaactttctaatttttttctagttttccaaaattaggtttttgtcataaaatgttgtcaaagaaaaaacaggacaaaatagttaattgttgtt tccattataacaaaaaaaaatttgaacggagctattaacgcgtgcatgcgcaaatcacatcgattagctgtttctgggaaattctcgggaaaaggtgaacagcagctgctggc ttcctctgcgggtcacgaaaacacaaagagatcattataattgttatttggaaaggaagcgaatctaaaacgggtacaggtggacgtttattgatcgaaagtgctttttattt gaaattgaatggtgaactttgcaattttgtaatgcaaagtacgttatcagatggcatgagatgtgtgaagtgataaggaataaaatgtgaacgacatgttcaagaaactgtga tttttcaataatttgtgatgaaatattttaggaacagaaatgaacatattaattgatataaaaacaataggaacactaactcataattatgataggtgaatatcaaaatgtgc tagattttttgaagttaaaaaatacatttctaatattttttcaaataataagtttcagctgaaatttcagggtgatttcagaaagctatgttttgataaattgttttgaaaat taaaagaagctacagcaaaaaaaaattaaagagaacatcgctccctcgtagtgtataatttttgattatcgaaaaaaatgagtcaatgatgaaaaggaagtcgcaatctcaaa acttcaaaaatcaaaagaagccgttgcctctgtcatcaaaaattcagaagacaaggttgttgacaagggtcaattctcagtggtggagggcattgggcgtggtgaaatttttg aaggctagtgtggttggacctctactagatagacaaaacccccgaaatagacgtttaatttgatgagatggtggagaaagaaaaggactcattctctagatgatagagagacc 
agagatacagacaagagagggcgcctcggccgtatgttgaatgggagatcgatggaaccgagacaacgagaaaaggaatagagacggagaaagagagagagagcgcgcgttgt tggaaggatgaaaaagaaaaaagacatgagctgcttcacaagagcttggcgaaagcaaagggcaaagtgttgacagcttagtggtggtagttggatcatgtgtttttatgttt ccggtgggagaaggttcaacaaaaaatgaaaagaaaaagttcaagcggcatgaatcattctgagtttaaaacaaaattattgcgaaaattaatattaaaaccttttcacaaaa cttcaagctaatctgttcatgaaaatttgaataatagttttttcccacctatttagaattaacttcatattaacgaaattaattaacgaatcgaaaattatgacttttcagaa tcatctgaagttttttcacattccatgctgcatggaataatttgatcctggaatcgatatgtttttatggtatactttttaaccttcaatttagctggaaaagtatggaataa ataattcccgaagctatgtacatatatgtagaattattgaatgattgtgagaacaacttgactttagcttgagtaggaatcggaatggctatcgaccgatcaacacttaggat tgtaagaatggcagtaagaatatattgaagaaagaatgtttgttcataggaagagaaagagtattgcgaaatcatcatcgcccactttagaatggacgggcggtgagcggaca tagagaattgtgaatgactaatgcttttgcagaatctagggcaaaatcgtaggaacaaacaattgtaatacggagaaaacaatcatatcgatcgatgatcatggagaaaaatg tgatttaagtgagtagacttggaaaaattaataaaagcatgaattgtcgatatttttcatttattttcattataaagctctttaaaaacaaattaaatattgagaatggcttc gaagaatattgtttcaaatatgttcaatggtgacaccttgcggataaaattaatgtaaaaatcatggaacacagattcactgatatctcattatctcaagcagtgtaattaga gattttttggaacaattattttataaaactataaataaaccgtttatactactcaaagccaaatattcaagctattaccattttttttctaactaattcttgagcaattaaag tattccccagtttttattttgcaacgactccaggcaaacacgctccgttgcacttgccgccaaggcgttgcattcaaatcagagagacatctcattccgatttctgtttttct tccaataaacggtattttatgcctaatgggtgatacggaaattgttcctcttcgagtacaaaatgtacttgatagcgaaatcattcgtctcaacttgtggtccatgaaggtaa ctgtctagtttttttaagttttcatgatttcaatatttttacagtttaacgcgaccagtttcaaactcgaaggttttgtgagaaatgaagaaggcactatgatgcagaaagtt tgttccgaatttatttgtgtaagtcgagaaacatattcgtcaacaattttcattaaatattcagagacgcttcacttctacgttgcttttcgatgtttccggacgtttcttcg acttggtcggacagattgatcgggaatatcaacaaaaaatgggaatgcctagtagaattattgatgaattttcaaatggaattcctgaaaattgggccgaccttatctattcc tgcatgtcagccaaccaaagaagcgcacttcgccctatccaacaggctccaaaagaaccaattagaactagaacagaaccaattgttacgttggcagatgaaaccgagctaac 
tggaggatgccagaaaaattccgaaaacgagaaagaaaggaacagacgtgagcgtgaagaacagcaaacaaaggaacgtgagagaagattagaagaagaaaaacaacgacgag atgctgaagctgaggctgaaagaaggcgaaaagaagaggaagagctggaagaagctaattacacccttcgtgctccgaaatctcagaacggcgagccaatcactccgataaga
  • 2. Genome sequence of C. elegans. Sequence of the entire genome. Sequence of cDNA clones. Approximately 19,500 PREDICTED protein-coding gene sequences. A large number of functional RNAs of various kinds, not discussed further. For this lecture, the focus is on predicted proteins. Gene prediction? How? Science, December 1998.
  • 3. Computer-based predictions: GENEFINDER (C. elegans), BLAST (all genomes) and other computer programs. Biases in coding sequence: in C. elegans, non-coding DNA is AT rich. Splice site signals, initiator methionines, termination codons. Likely exons and probable/possible splice patterns. BLAST compares the translation of all 6 reading frames. • Evidence that a prediction is correct? • Homology with genes in other organisms (homologues). • Known protein families. • Experimental evidence.
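A minimal sketch of what "translation of all 6 reading frames" means, assuming the standard genetic code and a DNA string written 5’ to 3’. The function names are illustrative only and are not taken from GENEFINDER or BLAST.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the opposite strand, written 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; '*' is stop, 'X' is unknown."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frame_translation(dna):
    """Translate three offsets on each strand: frames +1, +2, +3, -1, -2, -3."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frame_translation("atggcctaa").items():
        print(name, protein)
```

Long stretches without a stop codon (`*`) in one frame are a first hint of a possible exon; a real gene finder also weighs splice signals and composition bias, which this sketch ignores.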
  • 4. The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity between sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches. http://www.ncbi.nlm.nih.gov/ The National Center for Biotechnology Information (NCBI), the U.S. National Library of Medicine. How does BLAST work? Example protein sequence (single-letter code): mqnpmillifclfcavicsrgtdsdiphef. Search windows: BLAST compares small sequential blocks, or WINDOWS, of sequence against massive databases. It looks for regions of similarity and scores them.
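The "window" idea can be sketched as follows. This is a simplified illustration, not the actual BLAST algorithm: it only finds exact matches of short words, whereas BLAST also scores similar (non-identical) words and then extends each hit. The function names and the word size of 3 are assumptions made for the example.

```python
from collections import defaultdict


def index_words(sequence, word_size=3):
    """Map every word (window) of length word_size to its start positions."""
    index = defaultdict(list)
    for i in range(len(sequence) - word_size + 1):
        index[sequence[i:i + word_size]].append(i)
    return index


def find_seed_hits(query, subject, word_size=3):
    """Return (query_pos, subject_pos, word) for every exact shared window."""
    subject_index = index_words(subject, word_size)
    hits = []
    for q in range(len(query) - word_size + 1):
        word = query[q:q + word_size]
        for s in subject_index.get(word, []):
            hits.append((q, s, word))
    return hits


if __name__ == "__main__":
    # First residues of the lecture's example protein against a made-up subject.
    for hit in find_seed_hits("mqnpmill", "aaqnpmzz"):
        print(hit)
```

Indexing the database side once and looking up each query window is what makes scanning a massive database feasible; neighbouring hits on the same diagonal would then be joined and scored.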
  • 5. More BLAST. [Figure: a large protein with conserved regions (high-similarity BLAST score) and non-conserved regions (low-similarity BLAST score).] Small windows of comparison detect LOCAL regions of similarity. Output: % identity and % similarity (similarity permits conservative substitutions of amino acids). Gives an overall score and a probability of relatedness. If the entire protein sequence were compared in one go, you might get a relatively low overall similarity. How did genes and gene families evolve, and what is meant by protein domains? We need to come back to this. Remember the question!
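The difference between % identity and % similarity can be sketched for two already-aligned, equal-length stretches of protein. The grouping of amino acids into "conservative" classes below is a simplified assumption for illustration; BLAST itself decides which substitutions count as positive from a scoring matrix.

```python
# Hypothetical, simplified classes of chemically similar amino acids.
CONSERVATIVE_GROUPS = ("ILVM", "FWY", "KRH", "DE", "ST", "NQ")


def is_conservative(x, y):
    """True if both residues fall in the same illustrative class."""
    return any(x in group and y in group for group in CONSERVATIVE_GROUPS)


def percent_identity_and_similarity(seq_a, seq_b):
    """Return (% identity, % similarity) for two aligned, gap-free sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, equal-length and non-empty")
    a, b = seq_a.upper(), seq_b.upper()
    identical = sum(x == y for x, y in zip(a, b))
    similar = sum(x == y or is_conservative(x, y) for x, y in zip(a, b))
    return 100 * identical / len(a), 100 * similar / len(a)


if __name__ == "__main__":
    identity, similarity = percent_identity_and_similarity("MKLV", "MRLA")
    print(f"identity {identity:.1f}%, similarity {similarity:.1f}%")
```

Similarity is always at least as high as identity, because every identical position also counts as similar.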
  • 6. Below is the sequence of a protein: HOMEWORK mqnpmillif clfcavicsr gtdsdiphef hkmlkhaksl nsllrdlhvi yspemtnrhvektdkhgaal slksgsmsaq rivsiqnisd demdgytlfh lqsmkdikqg ndtcnlqsvcvpipqlsddp qvlmypkcye vkqcvgsccn svetchpgti nlvkkhvael lyigngrfmfnmtkeitmee htscscfdcg sntpqcapgf vvgrsctcec ankeernncv gnatwnaetckcecdlkcee gkilhkdrcd cvrrrqhhgg prghhghrhh hrsrpidtee vqkigqlkvgrigg Go to NCBI http://www.ncbi.nlm.nih.gov/ Go to Blast then look down the left for “Choose a BLAST program to run” From within that section, select “protein blast”. Copy the above protein sequence and paste it into the box on the top left of web page. Scroll down the page and click the big blue BLAST button. Have a look at the outcome – any questions – post to the Forum on moodle. BLAST is one of the powerful computational tools for Comparative Genomics
  • 7. Computational biology is mostly predictive, not EXPERIMENTAL. Let's look at simple experimental evidence for the existence of genes. “The Central Dogma” of Molecular Biology: DNA → mRNA → Protein. Expressed sequence tags (ESTs): cDNA clones. To make cDNA, mRNA is copied into DNA with reverse transcriptase: RNA → DNA. Retroviruses (e.g. HIV): RNA genome → DNA → integration → mRNA → protein.
  • 8. Making cDNA. Typical eukaryotic gene: double-stranded DNA with exons and introns.
      1. RNA polymerase makes the primary transcript: single sense-strand RNA (5’ to 3’OH), introns still present.
      2. Capping, splicing and polyadenylation give messenger RNA (mRNA): 5’ CAP ... AAAAAAAAAAA 3’OH.
      3. First-strand cDNA synthesis by reverse transcriptase, primed by a DNA primer (OH-TTTTTTTT-5’) annealed to the poly(A) tail, gives an RNA/cDNA duplex.
      4. Second-strand cDNA synthesis by DNA polymerase gives double-stranded cDNA.
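At the sequence level, the two synthesis steps amount to taking reverse complements. A minimal sketch, assuming a spliced mRNA string written 5’ to 3’; the function names are illustrative and primer handling is ignored.

```python
# Base pairing used when copying RNA into DNA, and DNA into DNA.
RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def first_strand_cdna(mrna):
    """DNA strand made by reverse transcriptase, written 5' to 3'."""
    return mrna.upper().translate(RNA_TO_DNA_COMPLEMENT)[::-1]


def second_strand_cdna(first_strand):
    """DNA strand made by DNA polymerase from the first strand, 5' to 3'."""
    return first_strand.upper().translate(DNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    mrna = "AUGGCCAAAA"                  # toy mRNA ending in a short poly(A) tail
    first = first_strand_cdna(mrna)      # begins with T's copied from the poly(A) tail
    second = second_strand_cdna(first)   # same sequence as the mRNA, with T for U
    print(first, second)
```

The second strand carries the same sequence as the mRNA with T in place of U, which is why a cDNA sequence can be read directly as the expressed, intron-free version of the gene.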
  • 9. EST sequencing was carried out in parallel with genome sequencing. An EST is the simplest experimental evidence that a piece of genomic DNA contains a gene. Making cDNA: (a) oligo dT priming, in which a DNA primer (OH-TTTTTTTT-5’) anneals to the poly(A) tail (AAAAAAAAAAA 3’OH) of the messenger RNA (mRNA); (b) random priming, in which random 6-mer or 9-mer DNA primers (OH-NNNNNNNNN-5’) anneal along the mRNA. The advantage of random priming is that cDNA clones are not biased towards the 3’ end of the gene.
  • 10. Sequence data from random-primed cDNA: ESTs. [Figure: a typical eukaryotic gene (double-stranded DNA) with EST sequences EST 1 to EST 4 aligned to it.] The sequencing of ESTs uncovered frequent examples of differential splicing. Common examples are exon skipping (shown in the figure), alternative 5’ exons, alternative splicing that alters stop codons, genes within genes, etc. This is true for C. elegans, humans, flies and many other species.
  • 11. • C. elegans EST data from approximately 50,000 cDNA clones. • Identified 9,356 different genes. 1. Grind up thousands of worms. 2. Prepare mRNA, convert to cDNA with reverse transcriptase, clone in plasmid. 3. Some mRNAs exist at extremely low levels of abundance. 4. Low-abundance cDNAs may be impossible to clone randomly.
  • 12. Reverse transcriptase PCR (RT-PCR) is very sensitive. [Figure: a gene and its polyadenylated mRNA, with Primer A and Primer B.] Make cDNA from mRNA using reverse transcriptase. Amplify the cDNA by PCR with primers designed from predicted genes. Clone and analyse the products. Number of experimentally confirmed genes raised to > 18,000. Full-length cDNA is valuable for confirming intron/exon structure.
  • 13. Summary of predicted and known gene sequences in C. elegans 1. Predicted 19,500 genes. 2. At least 18,000 expressed as RNA. 3. Average of 1 gene per 5 kb. 4. ~ 42% have detectable homologies to genes/proteins outside Nematoda.
  • 14. Genome size
      Organism                      Genome     Genes
      E. coli (bacteria)            4.64 Mb    4,377
      S. cerevisiae (fungal)        12.1 Mb    6,163
      C. elegans (metazoan)         100 Mb     19,300
      Arabidopsis (plant)           118 Mb     ~20,000
      D. melanogaster (fruit fly)   135.6 Mb   13,472
      Mus musculus (mouse)          3059 Mb    ~25,000
      Homo sapiens (obvious)        3286 Mb    ~25,000
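A small sketch that turns the table into gene density (kb of genome per gene), using only the figures from the slide. Values marked "~" on the slide are approximate, so the results for those organisms are rough. The helper name is illustrative.

```python
# (genome size in Mb, number of genes) as given on the slide.
GENOMES = {
    "E. coli": (4.64, 4377),
    "S. cerevisiae": (12.1, 6163),
    "C. elegans": (100, 19300),
    "Arabidopsis": (118, 20000),       # gene count approximate on the slide
    "D. melanogaster": (135.6, 13472),
    "Mus musculus": (3059, 25000),     # gene count approximate on the slide
    "Homo sapiens": (3286, 25000),     # gene count approximate on the slide
}


def kb_per_gene(genome_mb, genes):
    """Average amount of genome per gene, in kilobases (1 Mb = 1000 kb)."""
    return genome_mb * 1000 / genes


if __name__ == "__main__":
    for organism, (genome_mb, genes) in GENOMES.items():
        print(f"{organism:16s} {kb_per_gene(genome_mb, genes):8.2f} kb per gene")
```

For C. elegans this gives roughly 5 kb per gene, in line with the "1 gene per 5 kb" figure in the summary slide, while the mammalian genomes come out at more than 100 kb per gene: gene number rises far more slowly than DNA content.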
  • 15. The C. elegans Top 20 protein homologies
      Number   Description
      650      7 TM chemoreceptor
      410      Eukaryotic protein kinase domain
      240      Zinc finger, C4 (transcription factor)
      170      Collagen
      140      7 TM receptor
      130      Zinc finger, C2H2 (transcription factor)
      120      Lectin C-type domain, short and long forms
      100      RNA recognition motif (RRM, RBD, or RNP domain)
      90       Zinc finger, C3HC4 type (transcription factor)
      90       Protein-tyrosine phosphatase
      90       Ankyrin repeat
      90       WD domain, G-beta repeats
      80       Homeobox domain (transcription factor)
      80       Neurotransmitter-gated ion channel
      80       Cytochrome P450
      80       Helicases conserved C-terminal domain
      80       Alcohol/other dehydrogenases, short-chain type
      70       UDP-glucuronosyl and UDP-glucosyl transferases
      70       EGF-like domain
      70       Immunoglobulin superfamily
  • 16. Does the “Top 20” list tell us anything? Previous slide looked rather boring? Test your memory – what was on the list? Many of the large gene families are implicated in developmental control. Core set of proteins needed for general cell biology/metabolism to make a cell – e.g. S. cerevisiae ~6,163 genes. Evolution of developmental complexity – amplification of families of regulatory molecules. The above in part explains the increase in number of genes in multicellular organisms – it does not explain fully the increase in DNA content.
  • 17. How much does DNA sequence teach us? Remember that what we can learn from protein similarities is limited by what we know about the similar proteins. We still need to connect genes/proteins with functions.
  • 18. How has genomics influenced genetics? C. elegans mutants Wild Type dpy-7: Short fat worm – exoskeletal defect. ced-4: Programmed cell death defective. unc-51: Paralysed - abnormal axons. dec-2: long defecation cycle – genetically constipated.
  • 19. We wanted to investigate the molecular detail of genes defined by mutation. We knew where mutant genes mapped and we knew their phenotype. Genetic mapping is based on recombination. m.u. = map unit; 1 m.u. is 1% recombination per meiosis. [Figure: genetic map of Chromosome I, from the left arm (about -15 m.u.) through the central cluster (about 0 m.u.) to the right arm (about 25 m.u.), with genes listed along the map: bli-3, egl-30, mab-20, fog-1, unc-73, unc-57, dpy-5, dpy-14, fer-1, lin-11, unc-29, unc-75, unc-101, glp-4, unc-54. Inset: parent and recombinant chromosomes for fog-1 and glp-4.]
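The definition "1 m.u. is 1% recombination per meiosis" can be sketched as a one-line calculation. The progeny counts in the example are made up, and the function name is illustrative. The simple formula is a reasonable approximation only for closely linked genes, since double crossovers go uncounted over longer intervals.

```python
def map_distance_mu(recombinant_progeny, total_progeny):
    """Genetic distance in map units: percentage of progeny that are recombinant."""
    if total_progeny <= 0:
        raise ValueError("total_progeny must be positive")
    if not 0 <= recombinant_progeny <= total_progeny:
        raise ValueError("recombinant_progeny must lie between 0 and total_progeny")
    return 100 * recombinant_progeny / total_progeny


if __name__ == "__main__":
    # Hypothetical cross: 25 recombinants among 1000 scored progeny.
    print(map_distance_mu(25, 1000), "m.u.")
```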
  • 20. Sequence of genomes: individual chromosomes. AGCCTTTATGGCGAGATGGATAGCT………………………..………………………………………….TATAA [Figure: physical map of clones drawn above the genetic map (-15 to 25 m.u.) marked with bli-3, egl-30, mab-20, fog-1, unc-73, dpy-5, fer-1, lin-11, unc-75, unc-101, glp-4 and unc-54.] How can the physical and genetic maps be aligned? Identify the sequence of genes defined by mutation.
  • 21. [Figure: genetic map (-15 to 25 m.u.) marked with bli-3, egl-30, mab-20, fog-1, unc-73, dpy-5, fer-1, lin-11, unc-75, unc-101, glp-4 and unc-54, connected to the physical map.] • An association or alignment between the physical and genetic maps.
  • 22. Positional cloning of genes defined by mutation. [Figure: genetic map (-15 to 25 m.u.) marked with bli-3, egl-30, mab-20, fog-1, unc-73, dpy-5, fer-1, lin-11, unc-75, unc-101, glp-4 and unc-54, drawn against the physical map.] Imagine lin-11 and unc-101 had both been cloned. Where on the physical map might unc-75 be?
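One way to make a first guess at the answer is linear interpolation between the two cloned flanking genes. The sketch below assumes recombination is spread evenly across the interval, which is often not true, so the result only indicates where to start testing cosmids. All positions in the example are hypothetical, not read from the slide, and the function name is illustrative.

```python
def estimate_physical_position(target_mu, left_marker, right_marker):
    """Interpolate a physical position (kb) from a genetic position (m.u.).

    Each marker is a (genetic_position_mu, physical_position_kb) pair for a
    cloned gene, with the target expected to lie between the two.
    """
    (left_mu, left_kb), (right_mu, right_kb) = left_marker, right_marker
    if left_mu == right_mu:
        raise ValueError("flanking markers must have different genetic positions")
    fraction = (target_mu - left_mu) / (right_mu - left_mu)
    return left_kb + fraction * (right_kb - left_kb)


if __name__ == "__main__":
    # Hypothetical values: lin-11 at 5.0 m.u. / 1000 kb, unc-101 at 10.0 m.u. / 3000 kb,
    # and unc-75 mapped genetically to 7.5 m.u.
    print(estimate_physical_position(7.5, (5.0, 1000.0), (10.0, 3000.0)), "kb")
```

The more flanking genes that have already been cloned, the shorter the interval and the fewer cosmids that need to be tested.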
  • 23. Transgenic C. elegans: rescue of mutant phenotype. DNA is injected into the gonads of adult hermaphrodites, where it forms large heritable DNA molecules termed "free arrays".
  • 24. Phenotypic rescue. 1. Inject cosmid into the mutant. 2. Observe transgenic progeny for phenotypic rescue. 3. Subclone individual genes from the cosmid. 4. Observe transgenic progeny for phenotypic rescue. [Figure: cosmid sequence and the genes it contains; inject into unc-75 mutant worms.]
  • 25. Positional cloning of genes defined by mutation. [Figure: genetic map (-15 to 25 m.u.) marked with bli-3, egl-30, mab-20, fog-1, unc-73, dpy-5, fer-1, lin-11, unc-75, unc-101, glp-4 and unc-54, drawn against the physical map.] Attempt phenotypic rescue with cosmids. • The standard route to clone C. elegans genes defined by mutation. • The more genes are cloned, the easier it becomes to clone others.
  • 26. Can’t make transgenic humans – but the same positional information is used to identify Human disease genes.
  • 27. RNA interference (RNAi). RNAi: sequence-specific inactivation of gene function by either double-stranded RNA or siRNA. Since its discovery in C. elegans, it has been found to work in many organisms, e.g. cultured vertebrate cells, plants, trypanosomes, Drosophila.
  • 28. Mediators of RNAi - short interfering RNAs (siRNAs) 21-23 nt dsRNA duplexes. DICER – Highly conserved family of RNaseIII enzymes. Targets double stranded RNA.
  • 30. RNAi in C. elegans. Inject dsRNA, then observe the phenotype of the F1 offspring. It was noticed that the site of injection did not matter: even the intestine works. How could that affect embryos? Systemic RNAi.
  • 31. Bacterial Feeding Method in C. elegans Express dsRNA of a cloned C.elegans gene in a strain of E.coli. Worms eat the bacteria as food. RNAi of the gene can be obtained both in the worms that feed on the dsRNA expressing bacteria, and in the F1 progeny of these worms.
  • 32. sid-1 mutants are defective in systemic RNAi. SID-1 protein. Reference: Transport of dsRNA into Cells by the Transmembrane Protein SID-1. Science 301, 1545 (2003).
  • 33. RNAi as a tool for genetic analysis Loss of function phenotype can be estimated by RNAi. RNAi by feeding method – whole genome RNAi projects. Clones of 16,757 predicted genes tested in genome wide screen. 10.3% gave obvious phenotype. Redundancy between genes. RNAi is capable of functioning for more than one gene at a time. Permits analysis of functionally redundant genes.
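As a quick piece of arithmetic on the screen figures above (the variable names are illustrative), 10.3% of 16,757 clones corresponds to roughly 1,700 genes with an obvious RNAi phenotype:

```python
CLONES_TESTED = 16757            # predicted genes tested in the genome-wide screen
FRACTION_WITH_PHENOTYPE = 0.103  # 10.3% gave an obvious phenotype

genes_with_phenotype = round(CLONES_TESTED * FRACTION_WITH_PHENOTYPE)
genes_without_phenotype = CLONES_TESTED - genes_with_phenotype

print(genes_with_phenotype, "genes with an obvious phenotype")
print(genes_without_phenotype, "genes without an obvious phenotype")
```

The large remainder fits the slide's point about redundancy: knocking down one gene at a time often shows nothing obvious when another gene can cover for it.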
  • 34. Summary, C. elegans. Genomics: permits comparisons with human genes. Most human disease genes have C. elegans homologues. Powerful genetic tools: experiments on genes. Detailed anatomy: relate gene to function. Examples of processes investigated: programmed cell death, signalling, cell adhesion, axonal guidance, oncogene function, insulin pathway, ageing.
  • 35. How did genes evolve and what are gene/protein families?
  • 36. Early genomes
      – Early genomes were made of RNA
          • RNA world: no cells (in the modern sense), just RNA, starting with 1 gene
          • Ribonucleotide polymerase activity: RNA catalyses its own synthesis
          • Later on, translation: encoded information for the production of proteins
      – Involves nucleic acids ‘coding for’ proteins
      – Later emergence of DNA as the information store: genome stability, less labile
      – Modern functions of nucleic acids
          • coding: proteins via mRNA
          • catalytic: ribozymes
          • structural: rRNA, tRNA
          • regulatory: miRNAs
      [Figure labels: inorganic surface, nucleotides, RNA, tRNA, rRNA, DNA, mRNA, protein.]
  • 37. Where did our genome come from? ‘Tree of Life’: tree of all animals. Common ancestor => common genome. • Each species’ genome descended with modification from the genome of an ancestor. Reconstruction of a picture of the ‘ancestral genome’? Comparative genomics tells us about the state of the ancestor and the changes along each branch.
  • 38. Genes and genome evolution • What processes lead to genome evolution?
      – Initial ligation to form early chromosomes
      – Inversion
      – Duplication / deletion
      – Accumulation of point mutations
      – Invasion: horizontal gene transfer and transposable elements
  • 39. Structure of a typical eukaryotic gene. [Figure: gene with promoter, transcription start site (TSS), ATG, Exon 1, Intron 1, Exon 2, Exon 3, Exon 4 and stop codon; mRNA with 5’-UTR, 3’-UTR and poly(A) tail; protein with Domain 1 and Domain 2.] What features of all genes are missing from this diagram?