A
C
G
T
Comparative Analysis of Human
Chromosome 22q11.1-q12.3 with
Syntenic Regions in the Chimpanzee,
Baboon, Bovine, Mouse, Pufferfish and
Zebrafish Genomes
Dr. Bruce A. Roe
George Lynn Cross Research Professor
Advanced Center for Genome Technology
Department of Chemistry and Biochemistry
University of Oklahoma
broe@ou.edu www.genome.ou.edu
LXVIII CSHL Symposium
“The Genome of Homo Sapiens”
May 28 - June 3, 2003
“The joy of science is the people
you meet along the way and how
they influence your life”
Jochanan Stenesh and Lilian Myers at Western Michigan University
and Bernie Dudock at SUNY Stony Brook
Bart Barrell and Alan Coulson
originally at the MRC-Hills
Road Cambridge and Ian
Dunham both now at the
Sanger Institute
Watson and Crick
Fred Sanger
Bev Emanuel at Children’s
Hospital of Philadelphia
Sanger,
Keio,
Wash U,
OU
Human Chromosome 22
Sequence Features
• 39 % of the sequence is occupied by genes including
their introns, 5’ and 3’ non-translated regions.
• 3 % of the complete sequence encodes the protein
products of these genes.
• 42 % of the sequence is composed of repetitive
sequences, compared to 46 % for the entire genome.
• Only slightly over half of the genes predicted for
human chromosome 22 can be experimentally
validated.*
* Shoemaker DD., et al. Experimental annotation of the human
genome using microarray technology. Nature. 409, 922-7 (2001).
An Individual’s Genome
Differs from the DNA of:
• Siblings by 1 to 2 million bases, ~99.98% identical, with
coding regions 99.99999% identical
• Unrelated humans by 6 million bases, ~99.8% identical
overall, with coding regions 99.9999% identical
• Chimpanzees by about 100 million base pairs ~98%
identical
• Baboons by about 300 million base pairs ~92% identical
• Mice by about 2.8 billion bases, but coding regions are
~90% identical
• Leaf spinach by about 2.9 billion bases, but coding
regions are ~40% identical
AGCCACACAGTGTCCACCGGATGGTTGATTTTGAAGCAGAGT
TAGCTTGTCACCTGCCTCCCTTTCCCGGGACAACAGAAGCTGA
CCTCTTTGNNTCTCTTGCGCAGATGATGAGTCTCCGGGGCTCTA
TGGGTTTCTGAATGTCATCGTCCACTCAGCCACTGGATTTAAG
CAGAGTTCAAGTAAGTACTGGTTTGGGGAGNNAGGGTTGCAGCG
GCNNGAGCCAGGGTCTCCACCCAGGAAGGACTNNATCGGGCAGGG
TGTGGGGAAACAGGGAGGTTGTTCAGATGACCACGGGACACCT
TTGACCCTGGCCGCTGTGGAGTGTTTGTGCTGGTTGATGCCTT
CTGGGTGTGGAATTGTTTTTCCCGGAGTGGCCTCTGCCCTCTC
CCCTAGCCTGTCTCAGATCCTGGGAGCTGGTGAGCTGCCCCCT
GCAGGTGGATCGAGTAATTGCAGGGGTTTGGCAAGGACTTTGA
CAGACATCCCCAGGGGTGCCCGGGAGTGTGGGGTCCNNAGCCAG
Differences between individuals
The yellow underlined sequence is the first exon of
the BCR gene involved in leukemia. Only 5 bases
(NN) differ in non-gene regions.
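As a rough illustration of how the marked differences on this slide could be tallied, the sketch below scans a sequence for runs of N and reports where each run starts. The excerpt is adapted from two lines of the slide's sequence; the function name `find_marked_differences` is hypothetical, and reading each NN marker as one variable position is an assumption based on the slide's "5 bases (NN)" wording.

```python
import re

def find_marked_differences(sequence):
    """Return (start, length) for every run of N in the sequence.

    Assumption: a run of N marks one position that differs
    between individuals, as on the slide.
    """
    return [(m.start(), len(m.group()))
            for m in re.finditer(r"N+", sequence.upper())]

# Short excerpt adapted from the slide (two of its twelve lines).
excerpt = (
    "CCTCTTTGNNTCTCTTGCGCAGATGATGAGTCTCCGGGGCTCTA"
    "CAGAGTTCAAGTAAGTACTGGTTTGGGGAGNNAGGGTTGCAGCG"
)

markers = find_marked_differences(excerpt)
print(f"{len(markers)} marked positions at offsets {[s for s, _ in markers]}")
```

Running the same function over the full sequence from the slide would give the total number of marked positions.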
Human Chromosome 22
Single Nucleotide Polymorphisms*
Number of overlaps: 335
Size of overlaps: 13,203,147 bp
Number of SNPs: 11,116 (~1/1000 bp)
Number of substitutions: 9,123 (82%)
Number of ins/del: 1,993 (18%)
Only 48 of the 11,116 SNPs were in coding
regions, ~10-fold lower than in non-coding regions.
* E. Dawson, et al. A SNP Resource For Human Chromosome 22: Extracting Dense
Clusters of SNPs from the Genomic Sequence. Genome Research, 11, 170-178 (2001).
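The density and percentage figures on this slide follow from simple ratios. The sketch below recomputes them from the counts quoted above; the function name `snp_summary` and the dictionary keys are hypothetical, chosen only for illustration.

```python
def snp_summary(overlap_bp, total_snps, substitutions, coding_snps):
    """Derive summary ratios from the raw counts quoted on the slide."""
    return {
        # Average spacing between polymorphisms in the overlaps.
        "bp_per_snp": overlap_bp / total_snps,
        # Share of polymorphisms that are base substitutions.
        "substitution_fraction": substitutions / total_snps,
        # Share of polymorphisms that fall in coding regions.
        "coding_fraction": coding_snps / total_snps,
    }

summary = snp_summary(
    overlap_bp=13_203_147,
    total_snps=11_116,
    substitutions=9_123,
    coding_snps=48,
)

print(f"one SNP per {summary['bp_per_snp']:.0f} bp")
print(f"substitutions: {summary['substitution_fraction']:.0%}")
print(f"coding SNPs:   {summary['coding_fraction']:.2%}")
```

The spacing comes out slightly above 1,000 bp, which the slide rounds to ~1/1000 bp.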
“We each are like a different symphony orchestra”
“All playing the same instruments slightly differently”
Good news and Bad news
• Good news: <40,000 genes (counting dark space?)
• Bad news:
  • 2-4 times as many proteins as other
  species due to extensive alternative
  splicing in humans.
  • We only know the function of about
  half the predicted genes.
  • Likely > 1 million different gene
  products based on alternative splicing
  and post-translational modifications.
Where we stand now
• We essentially have the ‘dictionary’ with all
the words (genes) spelled correctly, but only
slightly more than half of the words (genes)
have definitions.
• Through comparative genomic sequencing
we can annotate the human genome based
on evolutionarily conserved gene sequences
and use model systems to study gene
expression.
• Slightly over half of the genes predicted for
human chromosome 22 have been
experimentally validated.
Chimpanzee and Baboon
Genomic Sequencing
• Medically important model eukaryotic organisms
• The chimpanzee is our nearest evolutionary
relative with a genome that has ~98 %
sequence identity with the human genome
• The baboon genome has ~92 % sequence
identity with the human genome
PIP plot of a region of human chr22 compared to syntenic regions of baboon and mouse
Human-specific repeat regions
Questionable gene present in primates but not in rodents
Variations in the regions syntenic to the
human chr 22 immunoglobulin light chain
region from chimp, baboon, rat and mouse
34 Kbp deletion in baboon
Exons in one copy of a zebrafish duplicated gene with 75% homology to human
but greatly diverged, <50% homology, in the other copy
Instance of a rare Alu deletion in chimp and a gene having very low homology in fish
Conclusions from the analysis of
vertebrate genomic sequences
• Approximately 40% of the genome is expressed into
hnRNA which is processed to 10-fold smaller mature
mRNA with extensive alternative splicing (1 gene -->
multiple proteins).
• Approximately 40% repeat sequence density.
• Conserved coding sequences, promoters and enhancers
and exon spacing approximately proportional to
evolutionary distance from a common ancestor.
• Additional endogenous retroviral and Alu sequences in the
human genome and some regions not present in different
vertebrates.
• Sequence drift in duplicated gene families.
• About half of the predicted genes have yet to be assigned
any known function.
“Zebrafish are small people that swim
in the water and breathe through gills”
Han Wang, Dept. Zoology and Director of the
University of Oklahoma Zebrafish Facility
How much of the ~1.7 Gbp genome has been sequenced so far?
The whole genome shotgun project comprises roughly 11.6 million traces by
now. With an average quality clipped trace length of 517 bp this adds to 6 Gb in
total, so the genome is covered 3.5 times.
The new assembly Zv2 is built on 11.7 million traces with an average trace
length of 651 bp, adding up to 7.64 Gbp (4.5 x coverage).
The current Sanger Institute in-house statistics for the clone sequencing are:
* 322,712,747 bp unfinished
* 112,494,895 bp finished
* 435,207,642 bp total
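The coverage figures above are the total sequenced bases divided by the estimated genome size. The sketch below reproduces that arithmetic with the numbers quoted on the slide; the function name `fold_coverage` is hypothetical, and the ~1.7 Gbp genome size is the slide's own estimate.

```python
def fold_coverage(n_traces, mean_trace_bp, genome_bp):
    """Fold coverage = (number of traces * mean trace length) / genome size."""
    return n_traces * mean_trace_bp / genome_bp

GENOME_BP = 1_700_000_000  # ~1.7 Gbp zebrafish genome, as quoted on the slide

# Whole genome shotgun traces, quality clipped.
wgs = fold_coverage(11_600_000, 517, GENOME_BP)
# Traces used for the Zv2 assembly.
zv2 = fold_coverage(11_700_000, 651, GENOME_BP)

# Clone-based sequencing: unfinished plus finished bases.
clone_total = 322_712_747 + 112_494_895

print(f"WGS coverage: {wgs:.1f}x")
print(f"Zv2 coverage: {zv2:.1f}x")
print(f"Clone sequence: {clone_total:,} bp ({clone_total / GENOME_BP:.0%} of the genome)")
```

The product for Zv2 comes to about 7.62 Gbp rather than the quoted 7.64 Gbp, presumably because the trace count and mean length on the slide are rounded; the fold coverage still rounds to 4.5.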
Zebrafish Developmental Stages (HPF*) and Description
Zygote Period (0-3/4 h): The newly fertilized egg is in the zygote period
until the first cleavage occurs.
Cleavage Period (0.7-2.2 h): After the first cleavage, blastomeres divide at
approximately 15-minute intervals.
Blastula Period (2 1/4 - 5 1/4 h): Begins at the 128-cell stage or 8th zygotic cell cycle.
The embryo enters the midblastula transition (MBT), the onset of zygotic transcription.
The period ends at the onset of gastrulation.
Gastrula Period (5 1/4 - 10 1/3 h): Morphogenetic cell movements of involution,
convergence, and extension occur, producing the primary germ layers and the embryonic axis.
Segmentation Period (10 1/3 - 24 h): Somites develop, the rudiments of the primary
organs become visible, the tail bud becomes more prominent and the embryo elongates.
The first cells differentiate morphologically, and the first body movements appear.
Pharyngula Period (24-48 h): The embryo develops to the phylotypic stage,
when it possesses the classic vertebrate bauplan. Migration of the posterior lateral line
primordium. Rapid organogenesis continues.
Hatching Period (48-72 h): Individuals within a single developing clutch
hatch sporadically during the whole period.
Kimmel CB, et al. Stages of embryonic development of the zebrafish. Dev Dyn 203, 253-310 (1995).
Gene Expression in Zebrafish
• Created and sequenced 10,000 clones from a zebrafish brain
and eye cDNA library.
• After a BLAST search against human chromosome 22, obtained the set of
zebrafish cDNA clones corresponding to several predicted
human chromosome 22 genes.
• Picked an EST whose expression profile matched a hypothetical
protein with an EST from a human fetal brain library.
Gene Expression in Zebrafish (cont)
• An antisense RNA hybridization probe was generated by in vitro
transcription in the presence of dig-UTP after cloning into an
expression vector.
• Whole mount in situ hybridization was performed on zebrafish embryos at 24, 48, and 72 hours
post-fertilization.
• Hybridization was detected by anti-dig antibody.
1b6: AP000557.1.mRNA chr22 position:18495442-18504448 KIAA1020 hypothetical protein matches EST b6n20zf
24hpf 48hpf 72hpf
Probe1 b6
Probe1 b6 shows hybridization in the brain from 24 hours onward and in the eye
from 48 hours onward.
Exon-specific gene expression in zebrafish
embryos during development that is
amenable to automation
Incorporated mouse in situ methods for zebrafish that:
• shorten the probes from 1000 bp to 100 bp, thus allowing
exon-specific probes,
• run hybridizations in a 96-well multiplex microtiter plate format,
• use digoxigenin-labeled ssDNA probes generated by
asymmetric, single-primer amplification off PCR products (eliminating
sub-cloning of each PCR product into T3/T7 expression
vectors), and
• eliminate the spurious labeling of the eye by introducing
glycine as the reagent of choice to rapidly inhibit the
proteinase K used to increase permeability of the embryos.
Whole mount in situ hybridization with
ssDNA-digoxigenin labeled probe
made from a PCR product. Brain-
specific expression of this mRNA
during embryonic development
Anti-sense probe Sense probe No probe
Typically only the anti-sense probe hybridizes
and is therefore stained by the anti-dig antibody, with
some probe-independent staining in the eye.
A “no probe” antibody staining control is important
for determining whether any probe-independent
antibody staining occurs in the lens.
72 hour post fertilization embryo
A probe to the unique 3’ UTR if
there are multiple paralogs
One last experiment with a surprise ending
Hybridization probe a8h24 unique to 3’ UTR of zebrafish gene 2
based on our zebrafish EST sequence
Anti-sense probe Sense probe No probe
Both the anti-sense and sense probes hybridized
to 72 hour post fertilization embryonic brain.
Indicating RNA transcribed from
the opposite, non-coding strand?
One too many controls sometimes
results in a surprise observation
What’s next for our Genome Center?
• Participate in sequencing the mouse, chimp, baboon,
lemur, bovine, dog, cat, chicken and zebrafish
genomes concentrating on:
  • Regions of high biological interest and
  • Regions orthologous to human chromosome 22
• Sequence the Medicago truncatula (alfalfa) genome
using a mapped BAC-based approach concentrating
on coding regions
• Continued sequencing of selected pathogenic bacteria
• Investigate the function of the predicted genes with
unknown function in the zebrafish system first by
whole mount in situ and then expression knock down
experiments with morpholino oligos.
Laboratory Organization
Bruce Roe, PI
Informatics
Support Teams
Production
Administration
Jim White
Steve Kenton
Hongshing Lai
Sean Qian***
Rose Morales-Diaz*
Mounir Elharam*
Steve Shaull**
Doug White
Work-study Undergraduate students**
KayLynn Hale
Dixie Wishnuck
Tami Womack
Mary Catherine Williams
DNA Synthesis
Phoebe Loh*
Sulan Qi
Bart Ford*
Reagents & Equip. Maint.
Mounir Elharam*
Doug White
Clayton Powell**
Axin Hua***
Weihong Xu****
Yanhong Li
Jami Milam****
Sara Downard**
Ging Sobhraksha**
Limei Yang
Angie Prescott*
Audra Wendt**
Mandi Aycock**
Ziyun Yao***
Steve Shaull*
Youngju Yoon****
Trang Do
Anh Do
Lily Fu
Yang Ye**
Tessa Manning**
Fu Ying
Liping Zhou
Ruihua Shi****
Junjie Wu****
Stephan Deschamps***
Shelly Oommen****
Christopher Lau****
Research Teams
Doris Kupfer
Julia Kim*
Sun So
Graham Wiley**
Lin Song****
Ying Ni
Huarong Jiang
ShaoPing Lin***
Honggui Jia
Hongming Wu
Baifang Qin
Peng Zhang
Shuling Li
Fares Najar***
Chunmei Qu
Keqin Wang
Funding from the NHGRI, Noble Foundation, DOE, NSF (pending)
- Collaborators at Sanger, CWRU, CHOP, Keio, UIUC and Riken
* Previous undergraduate res. student
** Present undergraduate res. student
*** Previous graduate student
**** Present graduate student
The AACCGGTT Team
Peggy and Charles Stephenson Center

Contenu connexe

Tendances

L14 human genome
L14 human genomeL14 human genome
L14 human genomeMUBOSScz
 
Introduction to genomes
Introduction to genomesIntroduction to genomes
Introduction to genomesavrilcoghlan
 
Bio153 microbial genomics 2012
Bio153 microbial genomics 2012Bio153 microbial genomics 2012
Bio153 microbial genomics 2012Mark Pallen
 
Plant Chromosomes: European Cytogeneticists outline: Trude Schwarzacher and P...
Plant Chromosomes: European Cytogeneticists outline: Trude Schwarzacher and P...Plant Chromosomes: European Cytogeneticists outline: Trude Schwarzacher and P...
Plant Chromosomes: European Cytogeneticists outline: Trude Schwarzacher and P...Pat (JS) Heslop-Harrison
 
When is a genome finished?
When is a genome finished? When is a genome finished?
When is a genome finished? Keith Bradnam
 
Genome evolution - tales of scales DNA to crops,months to billions of years, ...
Genome evolution - tales of scales DNA to crops,months to billions of years, ...Genome evolution - tales of scales DNA to crops,months to billions of years, ...
Genome evolution - tales of scales DNA to crops,months to billions of years, ...Pat (JS) Heslop-Harrison
 
Yeast genome project
Yeast genome projectYeast genome project
Yeast genome projectNazish_Nehal
 
The language of life (all the subtitles)first ppt 2 bimester
The language of life (all the subtitles)first ppt 2 bimesterThe language of life (all the subtitles)first ppt 2 bimester
The language of life (all the subtitles)first ppt 2 bimesterSofia Paz
 
Human genetics evolutionary genetics
Human genetics   evolutionary geneticsHuman genetics   evolutionary genetics
Human genetics evolutionary geneticsDan Gaston
 
Genomics 101 jun 15 2012
Genomics 101 jun 15 2012Genomics 101 jun 15 2012
Genomics 101 jun 15 2012Genome Alberta
 
chloroplast genome ppt.
chloroplast genome ppt.chloroplast genome ppt.
chloroplast genome ppt.dbskkv
 
Application of genomics in animals
Application of genomics in animalsApplication of genomics in animals
Application of genomics in animalsUsman Arshad
 
Rice genome sequencing by utkarsh
Rice genome sequencing by utkarshRice genome sequencing by utkarsh
Rice genome sequencing by utkarshutkarsh2011
 

Tendances (20)

L14 human genome
L14 human genomeL14 human genome
L14 human genome
 
Introduction to genomes
Introduction to genomesIntroduction to genomes
Introduction to genomes
 
THE human genome
THE human genomeTHE human genome
THE human genome
 
Genome origin
Genome originGenome origin
Genome origin
 
Bio153 microbial genomics 2012
Bio153 microbial genomics 2012Bio153 microbial genomics 2012
Bio153 microbial genomics 2012
 
Plant Chromosomes: European Cytogeneticists outline: Trude Schwarzacher and P...
Plant Chromosomes: European Cytogeneticists outline: Trude Schwarzacher and P...Plant Chromosomes: European Cytogeneticists outline: Trude Schwarzacher and P...
Plant Chromosomes: European Cytogeneticists outline: Trude Schwarzacher and P...
 
Gene mapping
Gene mappingGene mapping
Gene mapping
 
When is a genome finished?
When is a genome finished? When is a genome finished?
When is a genome finished?
 
Genomics seminar copy
Genomics seminar   copyGenomics seminar   copy
Genomics seminar copy
 
Fogarty Report
Fogarty ReportFogarty Report
Fogarty Report
 
Human Genome
Human Genome Human Genome
Human Genome
 
Yeast Genome
Yeast Genome Yeast Genome
Yeast Genome
 
Genome evolution - tales of scales DNA to crops,months to billions of years, ...
Genome evolution - tales of scales DNA to crops,months to billions of years, ...Genome evolution - tales of scales DNA to crops,months to billions of years, ...
Genome evolution - tales of scales DNA to crops,months to billions of years, ...
 
Yeast genome project
Yeast genome projectYeast genome project
Yeast genome project
 
The language of life (all the subtitles)first ppt 2 bimester
The language of life (all the subtitles)first ppt 2 bimesterThe language of life (all the subtitles)first ppt 2 bimester
The language of life (all the subtitles)first ppt 2 bimester
 
Human genetics evolutionary genetics
Human genetics   evolutionary geneticsHuman genetics   evolutionary genetics
Human genetics evolutionary genetics
 
Genomics 101 jun 15 2012
Genomics 101 jun 15 2012Genomics 101 jun 15 2012
Genomics 101 jun 15 2012
 
chloroplast genome ppt.
chloroplast genome ppt.chloroplast genome ppt.
chloroplast genome ppt.
 
Application of genomics in animals
Application of genomics in animalsApplication of genomics in animals
Application of genomics in animals
 
Rice genome sequencing by utkarsh
Rice genome sequencing by utkarshRice genome sequencing by utkarsh
Rice genome sequencing by utkarsh
 

Similaire à CSHL

Marzillier_09052014.pdf
Marzillier_09052014.pdfMarzillier_09052014.pdf
Marzillier_09052014.pdf7006ASWATHIRR
 
Human genome project
Human genome projectHuman genome project
Human genome projectsabahayat3
 
Prokaryote genome
Prokaryote genomeProkaryote genome
Prokaryote genomemonanarayan
 
Genome concept, types, and function
Genome  concept, types, and functionGenome  concept, types, and function
Genome concept, types, and functionPraveen Garg
 
Content of the genome
Content of the genomeContent of the genome
Content of the genomeKiran Modi
 
Human genome project (2) converted
Human genome project (2) convertedHuman genome project (2) converted
Human genome project (2) convertedGAnchal
 
Chapter 7 genome structure, chromatin, and the nucleosome (1)
Chapter 7   genome structure, chromatin, and the nucleosome (1)Chapter 7   genome structure, chromatin, and the nucleosome (1)
Chapter 7 genome structure, chromatin, and the nucleosome (1)Roger Mendez
 
Clase 2 - Genoma Humano proyecto conicet.pdf
Clase 2 - Genoma Humano proyecto conicet.pdfClase 2 - Genoma Humano proyecto conicet.pdf
Clase 2 - Genoma Humano proyecto conicet.pdfNoraCRuizGuevara
 
2014 whitney-research
2014 whitney-research2014 whitney-research
2014 whitney-researchc.titus.brown
 
Ap Chapter 21
Ap Chapter 21Ap Chapter 21
Ap Chapter 21smithbio
 
Human genome project
Human genome projectHuman genome project
Human genome projectRakesh R
 
Human genetic diversity and origin of major human groups
Human genetic diversity and origin of major human groupsHuman genetic diversity and origin of major human groups
Human genetic diversity and origin of major human groupsMayank Sagar
 
Mitochondrial DNA in Taxonomy and Phylogeny
Mitochondrial DNA in Taxonomy and PhylogenyMitochondrial DNA in Taxonomy and Phylogeny
Mitochondrial DNA in Taxonomy and PhylogenyRachel Jacob
 
c elegans genome, life cycle and model organism
 c elegans genome, life cycle and model organism c elegans genome, life cycle and model organism
c elegans genome, life cycle and model organismSubhradeep sarkar
 
Human genetic variation and its contribution to complex traits
Human genetic variation and its contribution to complex traitsHuman genetic variation and its contribution to complex traits
Human genetic variation and its contribution to complex traitsgroovescience
 

Similaire à CSHL (20)

Marzillier_09052014.pdf
Marzillier_09052014.pdfMarzillier_09052014.pdf
Marzillier_09052014.pdf
 
Human genome project
Human genome projectHuman genome project
Human genome project
 
The Human Genome Project
The Human Genome Project The Human Genome Project
The Human Genome Project
 
Prokaryote genome
Prokaryote genomeProkaryote genome
Prokaryote genome
 
Genome concept, types, and function
Genome  concept, types, and functionGenome  concept, types, and function
Genome concept, types, and function
 
Human encodeproject
Human encodeprojectHuman encodeproject
Human encodeproject
 
Content of the genome
Content of the genomeContent of the genome
Content of the genome
 
Human genome project (2) converted
Human genome project (2) convertedHuman genome project (2) converted
Human genome project (2) converted
 
Markers
MarkersMarkers
Markers
 
Chapter 7 genome structure, chromatin, and the nucleosome (1)
Chapter 7   genome structure, chromatin, and the nucleosome (1)Chapter 7   genome structure, chromatin, and the nucleosome (1)
Chapter 7 genome structure, chromatin, and the nucleosome (1)
 
Clase 2 - Genoma Humano proyecto conicet.pdf
Clase 2 - Genoma Humano proyecto conicet.pdfClase 2 - Genoma Humano proyecto conicet.pdf
Clase 2 - Genoma Humano proyecto conicet.pdf
 
2014 whitney-research
2014 whitney-research2014 whitney-research
2014 whitney-research
 
Ap Chapter 21
Ap Chapter 21Ap Chapter 21
Ap Chapter 21
 
Human genome project
Human genome projectHuman genome project
Human genome project
 
Human genetic diversity and origin of major human groups
Human genetic diversity and origin of major human groupsHuman genetic diversity and origin of major human groups
Human genetic diversity and origin of major human groups
 
Mitochondrial DNA in Taxonomy and Phylogeny
Mitochondrial DNA in Taxonomy and PhylogenyMitochondrial DNA in Taxonomy and Phylogeny
Mitochondrial DNA in Taxonomy and Phylogeny
 
genetic variation
genetic variationgenetic variation
genetic variation
 
rapd.ppt
rapd.pptrapd.ppt
rapd.ppt
 
c elegans genome, life cycle and model organism
 c elegans genome, life cycle and model organism c elegans genome, life cycle and model organism
c elegans genome, life cycle and model organism
 
Human genetic variation and its contribution to complex traits
Human genetic variation and its contribution to complex traitsHuman genetic variation and its contribution to complex traits
Human genetic variation and its contribution to complex traits
 

CSHL

  • 1. A C G T Comparative Analysis of HumanComparative Analysis of Human Chromosome 22q11.1-q12.3 withChromosome 22q11.1-q12.3 with Syntenic Regions in the Chimpanzee,Syntenic Regions in the Chimpanzee, Baboon, Bovine, Mouse, Pufferfish andBaboon, Bovine, Mouse, Pufferfish and Zebrafish GenomesZebrafish Genomes Dr. Bruce A. RoeDr. Bruce A. Roe George Lynn Cross Research ProfessorGeorge Lynn Cross Research Professor Advanced Center for Genome TechnologyAdvanced Center for Genome Technology Department of Chemistry and BiochemistryDepartment of Chemistry and Biochemistry University of OklahomaUniversity of Oklahoma broe@ou.edu www.genome.ou.edubroe@ou.edu www.genome.ou.edu LXVIII CSHL Symposium “The Genome of Homo Sapiens” May 28 - June 3, 2003
  • 2. A C G T ““The joy of science is the peopleThe joy of science is the people you meet along the way and howyou meet along the way and how they influence your life”they influence your life” Jochanan Stenesh and Lilian Myers at Western Michigan University and Bernie Dudock at SUNY Stony Brook Bart Barrell and Alan Coulson originally at the MRC-Hills Road Cambridge and Ian Dunham both now at the Sanger Institute Watson and Crick Fred Sanger Bev Emanuel at Childrens Hospital of Philadelphia
  • 4. A C G T Human Chromosome 22Human Chromosome 22 Sequence FeaturesSequence Features • 39 % of the sequence is occupied by genes including39 % of the sequence is occupied by genes including their introns, 5’ and 3’ non-translated regions.their introns, 5’ and 3’ non-translated regions. • 3 % of the complete sequence encodes the protein3 % of the complete sequence encodes the protein products of these genes.products of these genes. • 42 % of the sequence is composed of repetitive42 % of the sequence is composed of repetitive sequences, compared to 46 % for the entire genome.sequences, compared to 46 % for the entire genome. • Only slightly over half of the genes predicted forOnly slightly over half of the genes predicted for human chromosome 22 can be experimentallyhuman chromosome 22 can be experimentally validated.*validated.* * Shoemaker DD., et al. Experimental annotation of the human genome using microarray technology. Nature. 409, 922-7 (2001).
  • 5. A C G T An Individual’s Genome Differs from the DNA of: • Siblings by 1 to 2 million bases, ~99.98% identical, with coding regions 99.99999% identical • Unrelated humans by 6 million bases, ~99.8% identical overall, with coding regions 99.9999% identical • Chimpanzees by about 100 million base pairs ~98% identical • Baboons by about 300 million base pairs ~92% identical • Mice by about 2.8 billion bases, but coding regions are ~90% identical • Leaf spinach by about 2.9 billion bases, but coding regions are ~40% identical
  • 6. A C G T AGCCACACAGTGTCCACCGGATGGTTGATTTTGAAGCAGAGTAGCCACACAGTGTCCACCGGATGGTTGATTTTGAAGCAGAGT TAGCTTGTCACCTGCCTCCCTTTCCCGGGACAACAGAAGCTGATAGCTTGTCACCTGCCTCCCTTTCCCGGGACAACAGAAGCTGA CCTCTTTGCCTCTTTGNNTCTCTTGCGCAGTCTCTTGCGCAGATGATGAGTCTCCGGGGCTCTAATGATGAGTCTCCGGGGCTCTA TGGGTTTCTGAATGTCATCGTCCACTCAGCCACTGGATTTAAGTGGGTTTCTGAATGTCATCGTCCACTCAGCCACTGGATTTAAG CAGAGTTCAACAGAGTTCAAGTAAGTACTGGTTTGGGGAGGTAAGTACTGGTTTGGGGAGNNAGGGTTGCAGCGAGGGTTGCAGCG GCGCNNGAGCCAGGGTCTCCACCCAGGAAGGACTGAGCCAGGGTCTCCACCCAGGAAGGACTNNATCGGGCAGGGATCGGGCAGGG TGTGGGGAAACAGGGAGGTTGTTCAGATGACCTGTGGGGAAACAGGGAGGTTGTTCAGATGACCACGGGACACCTACGGGACACCT TTGACCCTGGCCGCTGTGGAGTGTTTGTGCTGGTTGATGCCTTTTGACCCTGGCCGCTGTGGAGTGTTTGTGCTGGTTGATGCCTT CTGGGTGTGGAATTGTTTTTCCCGGAGTGGCCTCTGCCCTCTCCTGGGTGTGGAATTGTTTTTCCCGGAGTGGCCTCTGCCCTCTC CCCTAGCCTGTCTCAGATCCTGGGAGCTGGTGAGCTGCCCCCTCCCTAGCCTGTCTCAGATCCTGGGAGCTGGTGAGCTGCCCCCT GCAGGTGGATCGAGTAATTGCAGGGGTTTGGCAAGGACTTTGAGCAGGTGGATCGAGTAATTGCAGGGGTTTGGCAAGGACTTTGA CAGACATCCCCAGGGGTGCCCGGGAGTGTGGGGTCCCAGACATCCCCAGGGGTGCCCGGGAGTGTGGGGTCCNNAGCCAGAGCCAG Differences between individuals The yellow underlined sequence is the first exon of the BCR gene involved in leukemia. Only 5 bases (NN) differ in non-gene regions.
  • 7. A C G T Human Chromosome 22 Single Nucleotide Polymorphisms* Number of overlaps 335 Size of overlaps 13,203,147 bp Number of SNPs 11,116 (~1/1000 bp) Number of substitutions 9,123 (82%) Number of ins/del 1,193 (18%) Only 48 of the 11,116 SNPs were in coding regions ~ 10 fold lower than in non-coding * E. Dawson, et al. A SNP Resource For Human Chromosome 22: Extracting Dense Clusters of SNPs from the Genomic Sequence. Genome Research, 11, 170-178 (2001).
  • 8. A C G T ““We each are like a different symphony orchestra”We each are like a different symphony orchestra” ““All playing the same instruments slightly differently”All playing the same instruments slightly differently”
  • 9. A C G T Good news and Bad newsGood news and Bad news • Good news <40,000 genes (counting dark space?)Good news <40,000 genes (counting dark space?) • Bad newsBad news • 2-4 times as many proteins as other2-4 times as many proteins as other species due to extensive alternativespecies due to extensive alternative splicing in humans.splicing in humans. • We only know the function of aboutWe only know the function of about half the predicted genes.half the predicted genes. • Likely > 1 million different geneLikely > 1 million different gene products based on alternative splicingproducts based on alternative splicing and post-translational modifications.and post-translational modifications.
  • 10. A C G T Where we stand now • We essentially have the ‘dictionary’ with allWe essentially have the ‘dictionary’ with all the words (genes) spelled correctly, but onlythe words (genes) spelled correctly, but only slightly more than half of the words (genes)slightly more than half of the words (genes) have definitions.have definitions. • Through comparative genomic sequencingThrough comparative genomic sequencing we can annotate the human genome basedwe can annotate the human genome based on evolutionary conserved gene sequenceson evolutionary conserved gene sequences and use model systems to study geneand use model systems to study gene expression.expression. • Slightly over half of the genes predicted forSlightly over half of the genes predicted for human chromosome 22 have beenhuman chromosome 22 have been experimentally validated.experimentally validated.
  • 12. Chimpanzee and Baboon Genomic Sequencing • Medically important model eukaryotic organisms • The chimpanzee is our nearest evolutionary relative, with a genome that has ~98% sequence identity with the human genome • The baboon genome has ~92% sequence identity with the human genome
  • 13. PIP plot of a region of human chr22 compared to syntenic regions of baboon and mouse. Figure labels: human-specific repeat regions; questionable gene present in primates but not in rodents
  • 14. Variations in the regions syntenic to the human chr22 immunoglobulin light chain region from chimp, baboon, rat and mouse
  • 16. Exons in one copy of a zebrafish duplicated gene with 75% homology to human but greatly diverged, <50% homology, in the other copy
  • 17. Instance of a rare Alu deletion in chimp and a gene having very low homology in fish
  • 18. Conclusions from the analysis of vertebrate genomic sequences • Approximately 40% of the genome is expressed into hnRNA, which is processed to 10-fold smaller mature mRNA with extensive alternative splicing (1 gene -> multiple proteins). • Approximately 40% repeat sequence density. • Conserved coding sequences, promoters and enhancers, and exon spacing approximately proportional to evolutionary distance from a common ancestor. • Additional endogenous retroviral and Alu sequences in the human genome and some regions not present in different vertebrates. • Sequence drift in duplicated gene families. • About half of the predicted genes have yet to be assigned any known function.
  • 19. “Zebrafish are small people that swim in the water and breathe through gills” Han Wang, Dept. Zoology and Director of the University of Oklahoma Zebrafish Facility
  • 20. How much of the ~1.7 Gbp zebrafish genome has been sequenced so far? The whole genome shotgun project currently comprises roughly 11.6 million traces. With an average quality-clipped trace length of 517 bp, this adds up to 6 Gbp in total, so the genome is covered 3.5 times. The new assembly Zv2 is built on 11.7 million traces with an average trace length of 651 bp, adding up to 7.64 Gbp (4.5x coverage). The current Sanger Institute in-house statistics for the clone sequencing are: * 322,712,747 bp unfinished * 112,494,895 bp finished * 435,207,642 bp total
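The coverage figures on this slide are the number of traces times the mean trace length, divided by the genome size. A minimal Python sketch of that calculation, assuming the rounded inputs quoted on the slide (the `coverage` helper and variable names are illustrative, not from the original):

```python
def coverage(n_traces, mean_trace_bp, genome_bp):
    """Return (total sequenced bases, fold coverage) for a shotgun data set."""
    total_bp = n_traces * mean_trace_bp
    return total_bp, total_bp / genome_bp

GENOME_BP = 1_700_000_000  # ~1.7 Gbp genome size quoted on the slide

# Whole genome shotgun traces, quality clipped.
shotgun_bp, shotgun_x = coverage(11_600_000, 517, GENOME_BP)

# Traces used for the Zv2 assembly.
zv2_bp, zv2_x = coverage(11_700_000, 651, GENOME_BP)

# Clone-based sequencing: unfinished plus finished bases.
unfinished_bp = 322_712_747
finished_bp = 112_494_895
clone_total_bp = unfinished_bp + finished_bp

print(f"shotgun: {shotgun_bp / 1e9:.2f} Gbp, {shotgun_x:.1f}x")
print(f"Zv2:     {zv2_bp / 1e9:.2f} Gbp, {zv2_x:.1f}x")
print(f"clones:  {clone_total_bp:,} bp")
```

Because the trace counts and lengths on the slide are rounded, the Zv2 total computed this way is about 7.62 Gbp rather than the quoted 7.64 Gbp; the fold coverage still rounds to 4.5x.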
  • 21. Zebrafish developmental stages (HPF, hours post-fertilization) and description
      Zygote Period (0 - 3/4 h): The newly fertilized egg is in the zygote period until the first cleavage occurs.
      Cleavage Period (0.7 - 2.2 h): After the first cleavage, blastomeres divide at approximately 15 minute intervals.
      Blastula Period (2 1/4 - 5 1/4 h): Begins at the 128-cell stage or 8th zygotic cell cycle. Embryo enters midblastula transition (MBT), the onset of zygotic transcription. Period ends at the onset of gastrulation.
      Gastrula Period (5 1/4 - 10 1/3 h): Morphogenetic cell movements of involution, convergence, and extension occur, producing the primary germ layers and the embryonic axis.
      Segmentation Period (10 1/3 - 24 h): Somites develop, the rudiments of the primary organs become visible, the tail bud becomes more prominent and the embryo elongates. The first cells differentiate morphologically, and the first body movements appear.
      Pharyngula Period (24 - 48 h): Embryo develops to the phylotypic stage, when it possesses the classic vertebrate bauplan. Migration of the posterior lateral line primordium. Rapid organogenesis continues.
      Hatching Period (48 - 72 h): Individuals within a single developing clutch hatch sporadically during the whole period.
      Kimmel CB, et al. Stages of embryonic development of the zebrafish. Dev Dyn 203, 253-310 (1995).
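The stage table can be read as a lookup from hours post-fertilization to period name. A minimal Python sketch, under two assumptions not stated on the slide: the cleavage limits of 0.7 and 2.2 h are treated as 3/4 and 2 1/4 h to match the neighbouring rows, and a time that falls exactly on a boundary is assigned to the later period. `period_at` is a hypothetical helper, not part of any library:

```python
from bisect import bisect_right

# (start of period in hours post-fertilization, period name),
# following the Kimmel et al. (1995) table on the slide.
STAGES = [
    (0.0, "Zygote"),
    (0.75, "Cleavage"),        # assumption: 0.7 h on the slide read as 3/4 h
    (2.25, "Blastula"),
    (5.25, "Gastrula"),
    (10 + 1 / 3, "Segmentation"),
    (24.0, "Pharyngula"),
    (48.0, "Hatching"),
]
END_HPF = 72.0  # end of the hatching period
STARTS = [start for start, _ in STAGES]

def period_at(hpf):
    """Return the developmental period that contains the given time in hpf."""
    if not 0 <= hpf <= END_HPF:
        raise ValueError(f"{hpf} hpf is outside the 0-{END_HPF:g} h range of the table")
    # bisect_right finds the first period starting after hpf; step back one.
    return STAGES[bisect_right(STARTS, hpf) - 1][1]

# The three time points used for the in situ hybridizations in this talk.
for hpf in (24, 48, 72):
    print(hpf, "hpf:", period_at(hpf))
```

Under these assumptions the 24 hpf embryos sit at the start of the pharyngula period, and the 48 and 72 hpf embryos at the start and end of the hatching period.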
  • 22. Gene Expression in Zebrafish • Created and sequenced 10,000 clones from a zebrafish brain and eye cDNA library. • After a BLAST search against human chromosome 22, obtained the set of zebrafish cDNA clones corresponding to several predicted human chromosome 22 genes. • Picked an EST whose expression profile matched a hypothetical protein with an EST from a human fetal brain library.
  • 23. Gene Expression in Zebrafish (cont) • An antisense RNA hybridization probe was generated by in vitro transcription in the presence of dig-UTP after cloning into an expression vector. • Whole mount in situ hybridization was performed on zebrafish embryos at 24, 48, and 72 hours post-fertilization. • Hybridization was detected by anti-dig antibody. Probe 1b6: AP000557.1.mRNA, chr22 position 18495442-18504448, KIAA1020 hypothetical protein, matches EST b6n20zf. Panels: 24 hpf, 48 hpf, 72 hpf. Probe 1b6 shows hybridization in the brain from 24 hours onward and in the eye from 48 hours onward.
  • 24. Exon-specific gene expression in zebrafish embryos during development that is amenable to automation. Incorporated mouse in situ methods for zebrafish that: • shorten the length of probes from 1000 bp to 100 bp, thus giving exon-specific probes, • perform hybridizations in a 96-well multiplex microtiter plate format, • use digoxigenin-labeled ssDNA probes generated by asymmetric, single-primer amplification from PCR products (eliminating sub-cloning of each PCR product into T3/T7 expression vectors), and • eliminate the spurious labeling of the eye by introducing glycine as the reagent of choice to rapidly inhibit the proteinase K used to increase permeability of the embryos.
  • 25. Whole mount in situ hybridization with ssDNA-digoxigenin labeled probe made from a PCR product. Brain-specific expression of this mRNA during embryonic development
  • 26. Panels: anti-sense probe, sense probe, no probe (72 hours post-fertilization embryo). Typically only the anti-sense probe hybridizes and is therefore stained by anti-dig antibody, with some probe-independent staining in the eye. This shows the importance of a “no probe” antibody staining control to determine whether any probe-independent antibody staining occurs in the lens.
  • 27. One last experiment with a surprise ending: a probe to the unique 3’ UTR if there are multiple paralogs
  • 28. Hybridization probe a8h24 unique to 3’ UTR of zebrafish gene 2 based on our zebrafish EST sequence
  • 29. Panels: anti-sense probe, sense probe, no probe. Both the anti-sense and sense probes hybridized to 72 hours post-fertilization embryonic brain, possibly indicating RNA transcribed from the opposite, non-coding strand? One too many controls sometimes results in a surprise observation
  • 30. What’s next for our Genome Center? • Participate in sequencing the mouse, chimp, baboon, lemur, bovine, dog, cat, chicken and zebrafish genomes concentrating on: • Regions of high biological interest and • Regions orthologous to human chromosome 22 • Sequence the Medicago truncatula (alfalfa) genome using a mapped BAC-based approach concentrating on coding regions • Continued sequencing of selected pathogenic bacteria • Investigate the function of the predicted genes with unknown function in the zebrafish system first by whole mount in situ and then expression knock down experiments with morpholino oligos.
  • 31. Laboratory Organization. Bruce Roe, PI. Informatics; Support Teams; Production; Administration. Jim White; Steve Kenton; Hongshing Lai; Sean Qian***; Rose Morales-Diaz*; Mounir Elharam*; Steve Shaull**; Doug White; Work-study Undergraduate students**; KayLynn Hale; Dixie Wishnuck; Tami Womack; Mary Catherine Williams. DNA Synthesis: Phoebe Loh*; Sulan Qi; Bart Ford*. Reagents & Equip. Maint.; Mounir Elharam*; Doug White; Clayton Powell**; Axin Hua***; Weihong Xu****; Yanhong Li; Jami Milam****; Sara Downard**; Ging Sobhraksha**; Limei Yang; Angie Prescott*; Audra Wendt**; Mandi Aycock**; Ziyun Yao***; Steve Shaull*; Youngju Yoon****; Trang Do; Anh Do; Lily Fu; Yang Ye**; Tessa Manning**; Fu Ying; Liping Zhou; Ruihua Shi****; Junjie Wu****; Stephan Deschamps***; Shelly Oommen****; Christopher Lau****. Research Teams; Doris Kupfer; Julia Kim*; Sun So; Graham Wiley**; Lin Song****; Ying Ni; Huarong Jiang; ShaoPing Lin***; Honggui Jia; Hongming Wu; Baifang Qin; Peng Zhang; Shuling Li; Fares Najar***; Chunmei Qu; Keqin Wang.
      Funding from the NHGRI, Noble Foundation, DOE, NSF (pending). Collaborators at Sanger, CWRU, CHOP, Keio, UIUC and Riken.
      * Previous undergraduate res. student ** Present undergraduate res. student *** Previous graduate student **** Present graduate student
  • 33. Peggy and Charles Stephenson Center