Ulf Schmitz, Introduction to genomics and proteomics I 1
www. .uni-rostock.de
Bioinformatics
Introduction to genomics and proteomics I
Ulf Schmitz
ulf.schmitz@informatik.uni-rostock.de
Bioinformatics and Systems Biology Group
www.sbi.informatik.uni-rostock.de
Outline
Genomics/Genetics
1. The tree of life
• Prokaryotic Genomes
– Bacteria
– Archaea
• Eukaryotic Genomes
– Homo sapiens
2. Genes
• Expression Data
Genomics - Definitions
Genetics: is the science of genes, heredity, and the variation of organisms.
Humans began applying knowledge of genetics in prehistory with
the domestication and breeding of plants and animals.
In modern research, genetics provides tools in the investigation
of the function of a particular gene, e.g. analysis of genetic
interactions.
Genomics: attempts the study of large-scale genetic patterns across the
genome for a given species. It deals with the systematic use of
genome information to provide answers in biology, medicine, and
industry.
Genomics has the potential of offering new therapeutic methods
for the treatment of some diseases, as well as new diagnostic
methods.
Major tools and methods related to genomics are bioinformatics,
genetic analysis, measurement of gene expression, and
determination of gene function.
Genes
• a gene coding for a protein corresponds to a sequence of
nucleotides along one or more regions of a molecule of DNA
• in species with double stranded DNA (dsDNA), genes may appear
on either strand
• bacterial genes are continuous regions of DNA
bacterium:
• a string of 3N nucleotides encodes a string of N amino acids
• or a string of N nucleotides encodes a structural RNA molecule of N
residues
eukaryote:
• a gene may appear split into separated segments in the DNA
• an exon is a stretch of DNA retained in mRNA that the ribosomes translate
into protein
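The 3N-to-N relationship for a continuous (bacterial) coding region can be illustrated with a minimal Python sketch. The function name `translate` and the example sequence are made up for illustration; the codon table is the standard genetic code, and stop codons are shown as `*`.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(dna):
    """Translate a coding DNA sequence of 3N nucleotides into N amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

coding = "ATGGCTAAAGGT"        # hypothetical 12-nucleotide coding region
protein = translate(coding)    # 4 amino acids
print(len(coding), len(protein), protein)
```

In a eukaryotic gene the same function would only apply after the exons have been joined, because the coding region is split across the DNA.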
Genomics
Genome size comparison
Species                                      Chrom.          Base pairs     Genes
Human (Homo sapiens)                         46 (23 pairs)   3.1 billion    28-35,000
Mouse (Mus musculus)                         40              2.7 billion    22.5-30,000
Puffer fish (Fugu rubripes)                  44              365 million    31,000
Malaria mosquito (Anopheles gambiae)         6               289 million    14,000
Fruit Fly (Drosophila melanogaster)          8               137 million    14,000
Roundworm (C. elegans)                       12              97 million     19,000
Bacterium (E. coli)                          1               4.1 million    5,000
Genes
exon:
A section of DNA which carries the coding
sequence for a protein or part of it. Exons
are separated by intervening, non-coding
sequences (called introns). In eukaryotes
most genes consist of a number of exons.
intron:
An intervening section of DNA which occurs
almost exclusively within a eukaryotic gene, but
which is not translated to amino-acid sequences in
the gene product.
The introns are removed from the precursor
mRNA (pre-mRNA) through a process called splicing, which
leaves the exons untouched, to form a mature
mRNA.
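Splicing as described above can be sketched as joining the exon segments of a transcript and discarding everything between them. The transcript, the exon coordinates, and the function name `splice` are hypothetical; real exon boundaries come from annotation or experiment, not from this code.

```python
def splice(pre_mrna, exons):
    """Join exon segments given as 0-based, half-open (start, end) intervals."""
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))

# Hypothetical transcript: exons in upper case, introns in lower case.
pre_mrna = "AUGGCC" + "guaaguuuuag" + "AAAGGU" + "guccccag" + "UGA"
exons = [(0, 6), (17, 23), (31, 34)]

mature = splice(pre_mrna, exons)
print(mature)
```

Writing the introns in lower case makes it easy to see that only exon sequence survives in the result.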
Genes
Examples of the exon:intron mosaic of genes
[Figure: gene diagrams with exons and introns marked]
• Globin gene: 1525 bp, 622 in exons, 893 in introns
• Ovalbumin gene: ~7500 bp, 8 short exons comprising 1859 bp
• Conalbumin gene: ~10,000 bp, 17 short exons comprising ~2,200 bp
Picking out genes in genomes
• Computer programs for genome analysis identify ORFs
(open reading frames)
• An ORF begins with an initiation codon ATG (AUG in the mRNA)
• An ORF is a potential protein-coding region
• There are two approaches to identify protein coding
regions…
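A minimal sketch of ORF identification, assuming that an ORF runs from an ATG to the first in-frame stop codon (TAA, TAG or TGA). The function name `find_orfs`, the `min_codons` parameter, and the test sequence are illustrative only; the sketch scans the three reading frames of one strand, so the reverse complement would have to be scanned separately.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=2):
    """Return (start, end, frame) for each ORF on the given strand.

    start/end are 0-based, half-open coordinates; the stop codon is included.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first ATG opens the frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None                   # look for the next ATG
    return orfs

dna = "CCATGGCTAAATGAGG"      # hypothetical sequence
orfs = find_orfs(dna)
print(orfs, [dna[s:e] for s, e, _ in orfs])
```

An ORF found this way is only a potential protein-coding region; it still has to be checked by one of the two approaches that follow.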
Picking out genes in genomes
1. Detection of regions similar to known coding regions from other organisms
• Regions may encode amino acid sequences similar to known proteins
• Or may be similar to ESTs (expressed sequence tags, which correspond to genes known to be
expressed)
• A few hundred initial bases of a cDNA are sequenced to identify a gene
2. Ab initio methods, which seek to identify genes from the properties of the
DNA sequence itself
• Bacterial genes are easy to identify, because they are contiguous
• They have no introns and the space between genes is small
• Identifying exons in higher organisms is one problem, assembling
them is another…
Picking out genes in genomes
Ab initio gene identification in eukaryotic genomes
• The initial (5´) exon starts with a transcription start
point, preceded by a core promoter site such as the
TATA box (~30 bp upstream), which binds and directs RNA polymerase
to the correct transcriptional start site
– Free of stop codons
– Ends immediately before a GT splice signal
Picking out genes in genomes
[Figure: positions of the 5' splice signal and the 3' splice signal]
Picking out genes in genomes
Ab initio gene identification in eukaryotic genomes
• Internal exons are free of stop codons too
– Begin after an AG splice signal
– End before a GT splice signal
Picking out genes in genomes
Ab initio gene identification in eukaryotic genomes
• The final (3´) exon starts after an AG splice signal
– Ends with a stop codon (TAA, TAG, TGA)
– Followed by a polyadenylation signal sequence
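The rules for internal exons (begin after an AG, end before a GT, free of stop codons) can be turned into a naive candidate search. This is only a sketch under simplifying assumptions: the function names, the `min_len` cut-off, and the test sequence are hypothetical, and "free of stop codons" is read here as "at least one reading frame has no stop codon". Real ab initio gene finders score such candidates statistically; this code only enumerates them.

```python
import re

STOPS = {"TAA", "TAG", "TGA"}

def has_open_frame(segment):
    """True if at least one of the three reading frames contains no stop codon."""
    for frame in range(3):
        codons = (segment[i:i + 3] for i in range(frame, len(segment) - 2, 3))
        if not any(codon in STOPS for codon in codons):
            return True
    return False

def internal_exon_candidates(dna, min_len=6):
    """Yield (start, end) of segments that follow an AG and precede a GT."""
    dna = dna.upper()
    # Lookahead patterns find every occurrence, including overlapping ones.
    acceptors = [m.start() + 2 for m in re.finditer("(?=AG)", dna)]
    donors = [m.start() for m in re.finditer("(?=GT)", dna)]
    for start in acceptors:
        for end in donors:
            if end - start >= min_len and has_open_frame(dna[start:end]):
                yield (start, end)

dna = "TTAGATGGCCAAAGTAA"     # hypothetical sequence
candidates = list(internal_exon_candidates(dna))
print(candidates, [dna[s:e] for s, e in candidates])
```

Because AG and GT dinucleotides are very common, such a search produces many false candidates on real sequence, which is one reason why identifying exons in higher organisms is hard.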
Humans have
spliced genes…
DNA makes RNA makes Protein
Tree of life
Prokaryotes
Genomics – Prokaryotes
• the genome of a prokaryote comes
as a single double-stranded DNA
molecule in ring form
– on average 2 mm long
– whereas the cell's diameter is only
0.001 mm
– < 5 Mb
• prokaryotic cells can have plasmids
as well (see next slide)
• protein-coding regions have no
introns
• little non-coding DNA compared to
eukaryotes
– in E. coli only 11%
Genomics - Plasmids
• Plasmids are circular double-stranded DNA molecules that are separate
from the chromosomal DNA.
• They usually occur in bacteria, sometimes in eukaryotic organisms.
• Their size varies from 1 to 250 kilobase pairs (kbp). The copy number ranges
from one, for large plasmids, to hundreds of copies of the same plasmid
in a single cell.
Prokaryotic model organisms
E.coli (Escherichia coli)
Methanococcus jannaschii (archaeon)
Mycoplasma genitalium
(simplest organism known)
Genomics
• DNA of higher organisms is organized into chromosomes
(human – 23 chromosome pairs)
• not all DNA codes for proteins
• on the other hand some genes exist in multiple copies
• for these reasons the amount of protein sequence information
cannot easily be estimated from the genome size
Genomes of eukaryotes
• majority of the DNA is in the nucleus, separated into
bundles (chromosomes)
– small amounts of DNA appear in organelles (mitochondria and
chloroplasts)
• within single chromosomes gene families are common
– some family members are paralogues (related)
• they have duplicated within the same genome
• often diverged to provide separate functions in descendants
• e.g. human α and β globin
– orthologous genes
• are homologues in different species
• often perform the same function
• e.g. human and horse myoglobin
– pseudogenes
• lost their function
• e.g. human globin gene cluster (figure label: pseudogene)
Eukaryotic model organisms
• Saccharomyces cerevisiae (baker’s yeast)
• Caenorhabditis elegans (C. elegans, roundworm)
• Drosophila melanogaster (fruit fly)
• Arabidopsis thaliana (thale cress, a flowering plant)
• Homo sapiens (human)
The human genome
• ~3.2 × 10^9 bp (about thirty times larger than the genome of C. elegans or D. melanogaster)
• Coding sequences form only 5% of the human genome
• Repeat sequences make up over 50%
• Only ~32,000 genes
• The human genome is distributed over 22 chromosome pairs plus the X and
Y chromosomes
• Exons of protein-coding genes are relatively small compared to
those in other known eukaryotic genomes
• Introns are relatively long
• Protein-coding genes span long stretches of DNA (dystrophin,
coding a 3,685 amino acid protein, is >2.4 Mbp long)
• Average gene length: ~ 8,000 bp
• Average of 5-6 exons/gene
• Average exon length: ~200 bp
• Average intron length: ~2,000 bp
• ~8% genes have a single exon
• Some exons can be as small as 1 or 3 bp.
The human genome
Top categories in a function classification:
Function                                  Number       %
Nucleic acid binding                        2207    14.0
  DNA binding                               1656    10.5
  DNA repair protein                          45     0.2
  DNA replication factor                       7     0.0
  Transcription factor                       986     6.2
  RNA binding                                380     2.4
  Structural protein of ribosome             137     0.8
  Translation factor                          44     0.2
Transcription factor binding                   6     0.0
Cell Cycle regulator                          75     0.4
Chaperone                                    154     0.9
Motor                                         85     0.5
Actin binding                                129     0.8
Defense/immunity protein                     603     3.8
Enzyme                                      3242    20.6
  Peptidase                                  457     2.9
  Endopeptidase                              403     2.5
  Protein kinase                             839     5.3
  Protein phosphatase                        295     1.8
Enzyme activator                               3     0.0
Apoptosis inhibitor                          132     0.8
Signal transduction                         1790    11.4
  Receptor                                  1318     8.4
  Transmembrane receptor                    1202     7.6
  G-protein link receptor                    489     3.1
  Olfactory receptor                          71     0.0
Storage protein                                7     0.0
Cell adhesion                                189     1.2
Structural protein                           714     4.5
  Cytoskeletal structural protein            145     0.9
Transporter                                  682     4.3
  Ion channel                                269     1.7
  Neurotransmitter transporter                19     0.1
Ligand binding or carrier                   1536     9.7
  Electron transfer                           33     0.2
  Cytochrome P450                             50     0.3
Tumor suppressor                               5     0.0
Unclassified                                4813    30.6
Total                                      15683   100.0
The human genome
• Repeated sequences comprise over 50% of the genome:
– Transposable elements, or interspersed repeats include LINEs and
SINEs (almost 50%)
– Retroposed pseudogenes
– Simple ‘stutters’ - repeats of short oligomers (minisatellites and
microsatellites)
– Segment duplication, of blocks of ~10 - 300kb
– Blocks of tandem repeats, including gene families
Element                                        Size (bp)        Copy number   Fraction of genome %
Short Interspersed Nuclear Elements (SINEs)    100-300          1,500,000     13
Long Interspersed Nuclear Elements (LINEs)     6000-8000        850,000       21
Long Terminal Repeats                          15,000-110,000   450,000       8
DNA Transposon fossils                         80-3000          300,000       3
The human genome
• All people are different, but the DNA of different
people differs by only 0.2% or less.
• So, only up to 2 letters in 1000 are expected to be
different.
• Evidence from current genomics studies (Single
Nucleotide Polymorphisms or SNPs) implies that on
average only 1 letter out of 1400 is different
between individuals.
• This means that 2 to 3 million letters would differ
between individuals.
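The arithmetic behind these figures can be checked with a few lines of Python, assuming the genome size of roughly 3.2 × 10^9 bp quoted earlier in the slides. The variable and function names are illustrative.

```python
GENOME_SIZE_BP = 3.2e9   # approximate human genome size quoted in these slides

def expected_differences(genome_size_bp, one_in_n):
    """Expected number of differing positions if 1 base in `one_in_n` differs."""
    return genome_size_bp / one_in_n

upper_bound = GENOME_SIZE_BP * 0.002                       # 0.2%, i.e. 2 letters per 1000
snp_estimate = expected_differences(GENOME_SIZE_BP, 1400)  # 1 letter in 1400

print(f"upper bound: {upper_bound:,.0f} letters")
print(f"SNP-based estimate: {snp_estimate:,.0f} letters")
```

The SNP-based estimate comes out at about 2.3 million differing letters, inside the 2 to 3 million range stated above, while the 0.2% bound corresponds to about 6.4 million.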
From gene to function
[Figure labels: Genome, Expressome, Proteome, Metabolome; Functional Genomics; TERTIARY STRUCTURE (fold)]
DNA makes RNA makes Protein:
Expression data
• More copies of mRNA for a gene lead to more
protein
• mRNA can now be measured for all the genes in a
cell at once through microarray technology
• A single gene chip can have 60,000 spots (genes)
• Color change gives intensity of gene expression
(over- or under-expression)
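One common way to turn spot intensities into an over- or under-expression call is a log2 ratio between a sample and a reference measurement. The sketch below assumes such a two-measurement design, which the slides do not specify; the gene names, intensities, and the two-fold threshold are all made up for illustration.

```python
import math

def log2_ratio(sample_intensity, reference_intensity):
    """log2(sample / reference): > 0 means over-expressed, < 0 under-expressed."""
    return math.log2(sample_intensity / reference_intensity)

def classify(ratio, threshold=1.0):
    """Label a spot using a hypothetical two-fold cut-off (|log2 ratio| >= 1)."""
    if ratio >= threshold:
        return "over-expressed"
    if ratio <= -threshold:
        return "under-expressed"
    return "unchanged"

# Made-up (sample, reference) spot intensities for three hypothetical genes.
spots = {"geneA": (4000.0, 1000.0), "geneB": (500.0, 2000.0), "geneC": (1500.0, 1400.0)}
calls = {gene: classify(log2_ratio(s, r)) for gene, (s, r) in spots.items()}
print(calls)
```

The logarithm makes a four-fold increase (+2) and a four-fold decrease (-2) symmetric around zero, which is why ratios are usually reported on this scale.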
Genes and regulatory regions
regulatory mechanisms organize the
expression of genes
– genes may be turned on or off in response to
concentrations of nutrients or to stress
– control regions often lie near the segments
coding for proteins
– they can serve as binding sites for molecules
that transcribe the DNA
– or they bind regulatory molecules that can
block transcription
Expression data
Outlook – coming lecture
Proteomics
– Proteins
– post-translational modification
– Key technologies
• Maps of hereditary information
• SNPs (Single nucleotide polymorphisms)
• Genetic diseases
Thanks for your
attention!
Ulf Schmitz, Introduction to genomics and proteomics II 1
Bioinformatics
Introduction to genomics and proteomics II
ulf.schmitz@informatik.uni-rostock.de
Bioinformatics and Systems Biology Group
www.sbi.informatik.uni-rostock.de
Outline
1. Proteomics
• Motivation
• Post-translational modifications
• Key technologies
• Data explosion
2. Maps of hereditary information
3. Single nucleotide polymorphisms
Proteomics
Proteomics:
• is the large-scale study of proteins, particularly their structures
and functions
• This term was coined to make an analogy with genomics, and
is often viewed as the "next step",
• but proteomics is much more complicated than genomics.
• Most importantly, while the genome is a rather constant entity,
the proteome is constantly changing through its biochemical
interactions with the genome.
• One organism will have radically different protein expression in
different parts of its body and in different stages of its life cycle.
Proteome:
The entirety of proteins in existence in an organism is
referred to as the proteome.
Proteomics
If the genome is a list of the instruments in an orchestra, the
proteome is the orchestra playing a symphony.
R.Simpson
Proteomics
• Describing all 3D structures of proteins in the cell is called Structural
Genomics
• Finding out what these proteins do is called Functional Genomics
[Figure labels: GENOME, PROTEOME, DNA Microarray, Genetic Screens, Protein-Ligand Interactions, Protein-Protein Interactions, Structure]
Proteomics
Motivation:
• What kind of data would we like to measure?
• What mature experimental techniques exist to
determine them?
• The basic goal is a spatio-temporal description of
the deployment of proteins in the organism.
Proteomics
Things to consider:
• the rates of synthesis of different proteins vary among
different tissues and different cell types and states of activity
• methods are available for efficient analysis of transcription
patterns of multiple genes
• because proteins ‘turn over’ at different rates, it is also
necessary to measure proteins directly
• the distribution of expressed protein levels is a kinetic
balance between rates of protein synthesis and degradation
Why do Proteomics?
• are there differences between amino acid sequences determined
directly from proteins and those determined by translation from
DNA?
– pattern recognition programs addressing this question can make the following
errors:
• a genuine protein sequence may be missed entirely
• an incomplete protein may be reported
• a gene may be incorrectly spliced
• genes for different proteins may overlap
• genes may be assembled from exons in different ways in different tissues
– often, molecules must be modified to make a mature protein that differs
significantly from the one suggested by translation
• in many cases the missing post-translational modifications are quite
important and have functional significance
• post-translational modifications include addition of ligands, glycosylation,
methylation, excision of peptides, etc.
– in some cases mRNA is edited before translation, creating changes in
the amino acid sequence that are not inferrable from the genes
• a protein inferred from a genome sequence is a hypothetical object
until an experiment verifies its existence
Post-translational modification
• a protein is a polypeptide chain built from 20 possible amino acids
• there are far fewer genes that code for proteins in the human genome than there
are proteins in the human proteome (~33,000 genes vs ~200,000 proteins).
• each gene encodes as many as six to eight different proteins
– due to post-translational modifications such as phosphorylation, glycosylation or cleavage
• post-translational modification extends the range of possible functions a protein can
have
– changes may alter the hydrophobicity of a protein and thus determine if the modified
protein is cytosolic or membrane-bound
– modifications like phosphorylation are part of common mechanisms for controlling the
behavior of a protein, for instance, activating or inactivating an enzyme.
Post-translational modification
Phosphorylation
• phosphorylation is the addition of a phosphate (PO4) group to a protein
or a small molecule (usually to serine, tyrosine, threonine or histidine)
• In eukaryotes, protein phosphorylation is probably the most important
regulatory event
• Many enzymes and receptors are switched "on" or "off" by
phosphorylation and dephosphorylation
• Phosphorylation is catalyzed by various specific protein kinases,
whereas phosphatases dephosphorylate.
Acetylation
• the addition of an acetyl group, usually at the N-terminus of the protein
Farnesylation
• the addition of a farnesyl group
Glycosylation
• the addition of a glycosyl group to either asparagine, hydroxylysine,
serine, or threonine, resulting in a glycoprotein
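As a small illustration of the residue lists above, the following sketch marks every residue in a protein sequence that could in principle carry a phosphate group (serine, threonine, tyrosine, histidine). The function name and the peptide are hypothetical, and the result is only a list of candidate positions: whether a site is actually modified depends on the kinases present and has to be measured.

```python
PHOSPHO_RESIDUES = {"S": "serine", "T": "threonine", "Y": "tyrosine", "H": "histidine"}

def candidate_phosphosites(protein):
    """Return (1-based position, residue name) for every S, T, Y or H."""
    return [
        (position, PHOSPHO_RESIDUES[aa])
        for position, aa in enumerate(protein.upper(), start=1)
        if aa in PHOSPHO_RESIDUES
    ]

sites = candidate_phosphosites("MASTKYH")   # hypothetical peptide
print(sites)
```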
Proteomics
Key technologies for proteomics
1. 1-D electrophoresis and 2-D electrophoresis
• are for the separation and visualization of proteins.
2. mass spectrometry, X-ray crystallography, and NMR
(nuclear magnetic resonance)
• are used to identify and characterize proteins
3. chromatography techniques, especially affinity
chromatography
• are used to characterize protein-protein interactions.
4. Protein expression systems like the yeast two-hybrid
system and FRET (fluorescence resonance energy
transfer)
• can also be used to characterize protein-protein interactions.
Key technologies for proteomics
High-resolution two-dimensional polyacrylamide gel
electrophoresis (2D PAGE) shows the pattern of
protein content in a sample.
Reference map of lymphoblastoid
cell line PRI, soluble proteins.
• 110 µg of proteins loaded
• Strip 17 cm, pH gradient 4-7; SDS
PAGE gels 20 x 25 cm, 8-18.5% T.
• Staining by silver nitrate method
(Rabilloud et al.)
• Identification by mass spectrometry.
The pink labels on the spots indicate
the ID in the Swiss-Prot database.
Browse the SWISS-2DPAGE database for more 2D PAGE images.
Proteomics
X-ray crystallography is a means to
determine the detailed molecular
structure of a protein, nucleic acid or
small molecule.
Typically, a sample is purified to
homogeneity, crystallized, and subjected to an
X-ray beam, and diffraction data are collected.
With a crystal structure we can explain the
mechanism of an enzyme, the binding of an
inhibitor, the packing of protein domains, the
tertiary structure of a nucleic acid molecule,
etc.
High-throughput Biological Data
• Enormous amounts of biological data are being
generated by high-throughput capabilities; even
more are coming
– genomic sequences
– gene expression data (microarrays)
– mass spec. data
– protein-protein interaction (chromatography)
– protein structures (X-ray crystallography)
– ......
Protein structural data explosion
Protein Data Bank (PDB): 33,367 structures (1 November 2005)
28,522 X-ray crystallography, 4,845 NMR
Maps of hereditary information
The following maps are used to find out how hereditary information is
stored, passed on, and implemented.
1. Linkage maps of
genes
mini- / microsatellites
2. Banding patterns of chromosomes
physical objects with visible landmarks called banding patterns
3. DNA sequences
Contig maps (contiguous clone maps)
Sequence tagged sites (STSs)
SNPs (single nucleotide polymorphisms)
Linkage map
Maps of hereditary information
Variable number tandem repeats (VNTRs, also minisatellites)
• regions, 8-80 bp long, repeated a variable number of times
• the distribution and the size of repeats is the marker
• inheritance of VNTRs can be followed in a family and
mapped to a pathological phenotype
• first genetic data used for personal identification
– Genetic fingerprints; in paternity and in criminal cases
Short tandem repeat polymorphisms (STRPs, also microsatellites)
• Regions of 2-7 bp, repeated many times
– Usually 10-30 consecutive copies
centromere
CGTCGTCGTCGTCGTCGTCGTCGT...
GCAGCAGCAGCAGCAGCAGCAGCA...
3bp
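A short tandem repeat such as the 3 bp unit shown above can be located with a regular expression that matches a short unit followed by further copies of itself. This is a minimal sketch: the function name, the default unit lengths (2-7 bp, taken from the microsatellite definition), the `min_copies` cut-off, and the flanking bases of the example locus are assumptions for illustration.

```python
import re

def find_tandem_repeats(dna, unit_min=2, unit_max=7, min_copies=4):
    """Return (start, unit, copies) for runs of a short unit repeated in tandem."""
    # Group 1 is the shortest possible unit; \1{n,} demands n further copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (unit_min, unit_max, min_copies - 1)
    )
    hits = []
    for match in pattern.finditer(dna.upper()):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits

locus = "TTA" + "CGT" * 8 + "AAC"   # the CGT repeat from the slide, hypothetical flanks
repeats = find_tandem_repeats(locus)
print(repeats)
```

Counting the copies at a locus in different individuals is what turns such a repeat into a genetic marker.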
Maps of hereditary information
Banding patterns of
chromosomes
Maps of hereditary information
Banding patterns of chromosomes
p arm ("petit", the short arm)
centromere
q arm ("queue", the long arm)
Maps of hereditary information
Contig map (also contiguous clone map)
• Series of overlapping DNA clones of known
order along a chromosome from an organism
of interest, stored in yeast or bacterial cells as
YACs (Yeast Artificial Chromosomes) or
BACs (Bacterial Artificial Chromosomes)
• A contig map produces a fine mapping (high
resolution) of a genome
• A YAC can contain up to 10^6 bp, a BAC about
250,000 bp
Sequence tagged site (STS)
• Short, sequenced region of DNA, 200-600 bp
long, that appears in a unique location in the
genome
• One type arises from an EST (expressed
sequence tag), a piece of cDNA
Maps of hereditary information
Imagine we know that a disease results from a specific
defective protein:
1. if we know the protein involved, we can pursue
rational approaches to therapy
2. if we know the gene involved, we can devise
tests to identify sufferers or carriers
3. whereas knowledge of the chromosomal
location of the gene is unnecessary in many
cases for either therapy or detection;
• it is required only for identifying the gene, providing a
bridge between the patterns of inheritance and the
DNA sequence
Single nucleotide polymorphisms (SNPs)
• SNP (pronounced ‘snip’) is a genetic
variation between individuals
• single base pairs that can be substituted,
deleted or inserted
• SNPs are distributed throughout the
genome
– on average one every 2,000 bp
• provide markers for mapping genes
• not all SNPs are linked to diseases
Single nucleotide polymorphisms (SNPs)
• nonsense mutations:
– the altered codon codes for a stop, which can truncate the
protein
• missense mutations:
– the altered codon codes for a different amino acid
• silent mutations:
– the altered codon codes for the same amino acid, so the protein
sequence is unchanged
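The three categories can be told apart by translating the codon before and after the substitution. The sketch below uses the standard genetic code; the function name `classify_substitution` and the example codons are illustrative, and substitutions that change an existing stop codon are not treated separately here.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def classify_substitution(codon, position, new_base):
    """Classify a single-base substitution at position 0, 1 or 2 of a codon."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if after == before:
        return "silent"
    if after == "*":
        return "nonsense"
    return "missense"

# GAA codes for glutamate (E).
print(classify_substitution("GAA", 2, "G"))   # GAG still codes for E
print(classify_substitution("GAA", 0, "T"))   # TAA is a stop codon
print(classify_substitution("GAA", 1, "T"))   # GTA codes for valine (V)
```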
Outlook – coming lecture
• Bioinformatics Information Resources And Networks
– EMBnet – European Molecular Biology Network
• DBs and Tools
– NCBI – National Center For Biotechnology Information
• DBs and Tools
– Nucleic Acid Sequence Databases
– Protein Information Resources
– Metabolic Databases
– Mapping Databases
– Databases concerning Mutations
– Literature Databases
Thanks for your
attention!

Contenu connexe

Tendances

Evolution of transposons, genomes, and organisms (Hertweck Fall 2014)
Evolution of transposons, genomes, and organisms (Hertweck Fall 2014)Evolution of transposons, genomes, and organisms (Hertweck Fall 2014)
Evolution of transposons, genomes, and organisms (Hertweck Fall 2014)Kate Hertweck
 
Genome Sequencing Project
Genome Sequencing ProjectGenome Sequencing Project
Genome Sequencing Projectguestd53a1
 
Whole genome sequencing of arabidopsis thaliana
Whole genome sequencing of arabidopsis thalianaWhole genome sequencing of arabidopsis thaliana
Whole genome sequencing of arabidopsis thalianaBhavya Sree
 
Mitochondrial DNA in Taxonomy and Phylogeny
Mitochondrial DNA in Taxonomy and PhylogenyMitochondrial DNA in Taxonomy and Phylogeny
Mitochondrial DNA in Taxonomy and PhylogenyRachel Jacob
 
genome sequencing, types by kk sahu sir
genome sequencing, types by kk sahu sirgenome sequencing, types by kk sahu sir
genome sequencing, types by kk sahu sirKAUSHAL SAHU
 
Microbial Genomics and Bioinformatics: BM405 (2015)
Microbial Genomics and Bioinformatics: BM405 (2015)Microbial Genomics and Bioinformatics: BM405 (2015)
Microbial Genomics and Bioinformatics: BM405 (2015)Leighton Pritchard
 
Genomics and bioinformatics
Genomics and bioinformatics Genomics and bioinformatics
Genomics and bioinformatics Senthil Natesan
 
Comparative genomics and proteomics
Comparative genomics and proteomicsComparative genomics and proteomics
Comparative genomics and proteomicsNikhil Aggarwal
 
The Human Genome Project - Part I
The Human Genome Project - Part IThe Human Genome Project - Part I
The Human Genome Project - Part Ihhalhaddad
 
"Phylogenomics: Combining Evolutionary Reconstructions and Genome Analysis in...
"Phylogenomics: Combining Evolutionary Reconstructions and Genome Analysis in..."Phylogenomics: Combining Evolutionary Reconstructions and Genome Analysis in...
"Phylogenomics: Combining Evolutionary Reconstructions and Genome Analysis in...Jonathan Eisen
 
Phylogenomics talk in 2000 at University of Maryland by J. Eisen
Phylogenomics talk in 2000 at University of Maryland by J. EisenPhylogenomics talk in 2000 at University of Maryland by J. Eisen
Phylogenomics talk in 2000 at University of Maryland by J. EisenJonathan Eisen
 
Lecture1 1 Perl for bioinformatics Davide Pisani & James Cotton
Lecture1 1 Perl for bioinformatics Davide Pisani & James CottonLecture1 1 Perl for bioinformatics Davide Pisani & James Cotton
Lecture1 1 Perl for bioinformatics Davide Pisani & James Cottonnathanlawless
 

Tendances (20)

Evolution of transposons, genomes, and organisms (Hertweck Fall 2014)
Evolution of transposons, genomes, and organisms (Hertweck Fall 2014)Evolution of transposons, genomes, and organisms (Hertweck Fall 2014)
Evolution of transposons, genomes, and organisms (Hertweck Fall 2014)
 
GENOMICS
GENOMICSGENOMICS
GENOMICS
 
Genome &lt;imran>
Genome &lt;imran>Genome &lt;imran>
Genome &lt;imran>
 
Genomics types
Genomics typesGenomics types
Genomics types
 
Protocols for genomics and proteomics
Protocols for genomics and proteomics Protocols for genomics and proteomics
Protocols for genomics and proteomics
 
Genome Sequencing Project
Genome Sequencing ProjectGenome Sequencing Project
Genome Sequencing Project
 
Whole genome sequencing of arabidopsis thaliana
Whole genome sequencing of arabidopsis thalianaWhole genome sequencing of arabidopsis thaliana
Whole genome sequencing of arabidopsis thaliana
 
Mitochondrial DNA in Taxonomy and Phylogeny
Mitochondrial DNA in Taxonomy and PhylogenyMitochondrial DNA in Taxonomy and Phylogeny
Mitochondrial DNA in Taxonomy and Phylogeny
 
Genomics
GenomicsGenomics
Genomics
 
THE human genome
THE human genomeTHE human genome
THE human genome
 
genome sequencing, types by kk sahu sir
genome sequencing, types by kk sahu sirgenome sequencing, types by kk sahu sir
genome sequencing, types by kk sahu sir
 
Microbial Genomics and Bioinformatics: BM405 (2015)
Microbial Genomics and Bioinformatics: BM405 (2015)Microbial Genomics and Bioinformatics: BM405 (2015)
Microbial Genomics and Bioinformatics: BM405 (2015)
 
Genomics and bioinformatics
Genomics and bioinformatics Genomics and bioinformatics
Genomics and bioinformatics
 
Comparative genomics and proteomics
Comparative genomics and proteomicsComparative genomics and proteomics
Comparative genomics and proteomics
 
The Human Genome Project - Part I
The Human Genome Project - Part IThe Human Genome Project - Part I
The Human Genome Project - Part I
 
"Phylogenomics: Combining Evolutionary Reconstructions and Genome Analysis in...
"Phylogenomics: Combining Evolutionary Reconstructions and Genome Analysis in..."Phylogenomics: Combining Evolutionary Reconstructions and Genome Analysis in...
"Phylogenomics: Combining Evolutionary Reconstructions and Genome Analysis in...
 
Phylogenomics talk in 2000 at University of Maryland by J. Eisen
Phylogenomics talk in 2000 at University of Maryland by J. EisenPhylogenomics talk in 2000 at University of Maryland by J. Eisen
Phylogenomics talk in 2000 at University of Maryland by J. Eisen
 
Genomic Data Analysis
Genomic Data AnalysisGenomic Data Analysis
Genomic Data Analysis
 
Lecture1 1 Perl for bioinformatics Davide Pisani & James Cotton
Lecture1 1 Perl for bioinformatics Davide Pisani & James CottonLecture1 1 Perl for bioinformatics Davide Pisani & James Cotton
Lecture1 1 Perl for bioinformatics Davide Pisani & James Cotton
 
Types of genomics ppt
Types of genomics pptTypes of genomics ppt
Types of genomics ppt
 

Biotech 2011-01-intro

  • 1. Bioinformatics: Introduction to genomics and proteomics I. Ulf Schmitz, ulf.schmitz@informatik.uni-rostock.de, Bioinformatics and Systems Biology Group, www.sbi.informatik.uni-rostock.de
  • 2. Outline: Genomics/Genetics 1. The tree of life • Prokaryotic Genomes – Bacteria – Archaea • Eukaryotic Genomes – Homo sapiens 2. Genes • Expression Data
  • 3. Genomics - Definitions. Genetics is the science of genes, heredity, and the variation of organisms. Humans began applying knowledge of genetics in prehistory with the domestication and breeding of plants and animals. In modern research, genetics provides tools for investigating the function of a particular gene, e.g. analysis of genetic interactions. Genomics is the study of large-scale genetic patterns across the genome of a given species. It deals with the systematic use of genome information to provide answers in biology, medicine, and industry. Genomics has the potential to offer new therapeutic methods for the treatment of some diseases, as well as new diagnostic methods. Major tools and methods related to genomics are bioinformatics, genetic analysis, measurement of gene expression, and determination of gene function.
  • 4. Genes • a gene coding for a protein corresponds to a sequence of nucleotides along one or more regions of a molecule of DNA • in species with double-stranded DNA (dsDNA), genes may appear on either strand. Bacterium: • bacterial genes are continuous regions of DNA • a string of 3N nucleotides encodes a string of N amino acids • or a string of N nucleotides encodes a structural RNA molecule of N residues. Eukaryote: • a gene may appear split into separated segments in the DNA • an exon is a stretch of DNA retained in mRNA that the ribosomes translate into protein
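
The "3N nucleotides encode N amino acids" rule on slide 4 can be illustrated with a minimal sketch. The function name `translate` and the example sequence are hypothetical and not from the slides. The codon table is the standard genetic code, and the sketch assumes the input is a coding-strand DNA string that starts in frame.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame coding DNA string; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical 9-nucleotide example: 3 codons give 3 symbols (M, A, stop).
print(translate("ATGGCCTAA"))
```

A sequence of 3N nucleotides always yields N symbols here. Whether the final stop codon is counted as part of the N depends on convention.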
  • 5. Genomics: Genome size comparison
      Species | Chrom. | Genes | Base pairs
      Human (Homo sapiens) | 46 (23 pairs) | 28-35,000 | 3.1 billion
      Mouse (Mus musculus) | 40 | 22.5-30,000 | 2.7 billion
      Puffer fish (Fugu rubripes) | 44 | 31,000 | 365 million
      Malaria mosquito (Anopheles gambiae) | 6 | 14,000 | 289 million
      Fruit fly (Drosophila melanogaster) | 8 | 14,000 | 137 million
      Roundworm (C. elegans) | 12 | 19,000 | 97 million
      Bacterium (E. coli) | 1 | 5,000 | 4.1 million
  • 6. Genes. Exon: a section of DNA which carries the coding sequence for a protein or part of it. Exons are separated by intervening, non-coding sequences (called introns). In eukaryotes most genes consist of a number of exons. Intron: an intervening section of DNA which occurs almost exclusively within a eukaryotic gene, but which is not translated into amino-acid sequence in the gene product. The introns are removed from the precursor mRNA (pre-mRNA) through a process called splicing, which leaves the exons untouched, to form an active mRNA.
  • 7. Genes: Examples of the exon:intron mosaic of genes • Globin gene – 1525 bp: 622 in exons, 893 in introns • Ovalbumin gene – ~7500 bp: 8 short exons comprising 1859 bp • Conalbumin gene – ~10,000 bp: 17 short exons comprising ~2,200 bp
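
The exon:intron figures on slide 7 are easier to compare as fractions. The sketch below uses only the numbers given on the slide. The helper name `exon_fraction` is hypothetical, and the ovalbumin and conalbumin gene lengths are the approximate values quoted.

```python
def exon_fraction(exon_bp, gene_bp):
    """Fraction of a gene's length that is retained as exon sequence."""
    return exon_bp / gene_bp


# (exon bp, total gene bp) as quoted on the slide; "~" values are approximate.
genes = {
    "globin": (622, 1525),
    "ovalbumin": (1859, 7500),
    "conalbumin": (2200, 10000),
}

for name, (exon_bp, gene_bp) in genes.items():
    print(f"{name}: {exon_fraction(exon_bp, gene_bp):.1%} exon")
```

In all three examples the exons make up well under half of the gene, which is the point of the mosaic picture.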
  • 8. Picking out genes in genomes • Computer programs for genome analysis identify ORFs (open reading frames) • An ORF begins with an initiation codon ATG (AUG in the mRNA) • An ORF is a potential protein-coding region • There are two approaches to identifying protein-coding regions…
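
As a rough illustration of what ORF identification means, here is a minimal sketch of a forward-strand ORF scan. It is not the method of any particular genome-analysis program. The function name `find_orfs`, the `min_codons` threshold, and the test sequence are assumptions made for this example. An ORF is taken to run from an ATG to the first in-frame stop codon (TAA, TAG, TGA).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return sorted (start, end) pairs of ORFs on the given strand.

    `end` is exclusive and includes the stop codon. `min_codons` counts
    the codons before the stop (ATG included).
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):  # scan the three forward reading frames
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ATG in this frame
    return sorted(orfs)


# Hypothetical sequence with one ORF (ATG AAA TGA) in the third frame.
print(find_orfs("CCATGAAATGACC"))
```

A real gene finder would also scan the reverse complement, since genes may lie on either strand, and would apply a much larger length threshold.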
  • 9. Picking out genes in genomes. 1. Detection of regions similar to known coding regions from other organisms • Regions may encode amino acid sequences similar to known proteins • Or may be similar to ESTs (which correspond to genes known to be expressed) • A few hundred initial bases of cDNA are sequenced to identify a gene. 2. Ab initio methods, which seek to identify genes from the properties of the DNA sequence itself • Bacterial genes are easy to identify, because they are contiguous • They have no introns and the space between genes is small • Identifying exons in higher organisms is one problem, assembling them another…
  • 10. Picking out genes in genomes: Ab initio gene identification in eukaryotic genomes • The initial (5´) exon starts with a transcription start point, preceded by a core promoter site such as the TATA box (~30 bp upstream), which binds and directs RNA polymerase to the correct transcriptional start site – Is free of stop codons – Ends immediately before a GT splice signal
  • 11. Picking out genes in genomes (figure: 5' splice signal, 3' splice signal)
  • 12. Picking out genes in genomes: Ab initio gene identification in eukaryotic genomes • Internal exons are free of stop codons too – Begin after an AG splice signal – End before a GT splice signal
  • 13. Picking out genes in genomes: Ab initio gene identification in eukaryotic genomes • The final (3´) exon starts after an AG splice signal – Ends with a stop codon (TAA, TAG, TGA) – Followed by a polyadenylation signal sequence
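
The signal rules for internal exons (begin after AG, end before GT, no stop codons) can be written as a simple filter. This is only a sketch of the rule as stated on the slides. Real ab initio gene finders score these signals statistically instead of requiring exact matches. The names `has_stop` and `is_internal_exon_candidate`, the `phase` parameter, and the example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_stop(seq, phase=0):
    """True if `seq` contains a stop codon in the frame starting at `phase`."""
    return any(seq[i:i + 3] in STOP_CODONS
               for i in range(phase, len(seq) - 2, 3))


def is_internal_exon_candidate(dna, start, end, phase=0):
    """Check dna[start:end] against the internal-exon rules.

    The candidate must directly follow an AG (3' splice signal of the
    upstream intron), be directly followed by GT (5' splice signal of the
    downstream intron), and contain no in-frame stop codon.
    """
    dna = dna.upper()
    if start < 2 or end + 2 > len(dna):
        return False  # no room for the flanking splice signals
    after_ag = dna[start - 2:start] == "AG"
    before_gt = dna[end:end + 2] == "GT"
    return after_ag and before_gt and not has_stop(dna[start:end], phase)


# Hypothetical sequence: intron end "CCAG", exon "GCTGCA", intron start "GTCC".
print(is_internal_exon_candidate("CCAGGCTGCAGTCC", 4, 10))
```

The same pattern extends to initial and final exons by swapping in the promoter, stop-codon, and polyadenylation checks described above.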
  • 14. Humans have spliced genes…
  • 15. DNA makes RNA makes Protein
  • 16. Tree of life: Prokaryotes
  • 17. Genomics – Prokaryotes • the genome of a prokaryote comes as a single double-stranded DNA molecule in ring form – on average 2 mm long – whereas the cell's diameter is only 0.001 mm – < 5 Mb • prokaryotic cells can have plasmids as well (see next slide) • protein-coding regions have no introns • little non-coding DNA compared to eukaryotes – in E. coli only 11%
  • 18. Genomics - Plasmids • Plasmids are circular double-stranded DNA molecules that are separate from the chromosomal DNA. • They usually occur in bacteria, sometimes in eukaryotic organisms. • Their size varies from 1 to 250 kilobase pairs (kbp). A single cell may contain from one copy of a plasmid, for large plasmids, to hundreds of copies of the same plasmid.
  • 19. Prokaryotic model organisms • E. coli (Escherichia coli) • Methanococcus jannaschii (archaeon) • Mycoplasma genitalium (simplest organism known)
  • 20. Genomics • DNA of higher organisms is organized into chromosomes (human: 23 chromosome pairs) • not all DNA codes for proteins • on the other hand, some genes exist in multiple copies • that is why the amount of protein sequence information cannot easily be estimated from the genome size
  • 21. Genomes of eukaryotes • the majority of the DNA is in the nucleus, separated into bundles (chromosomes) – small amounts of DNA appear in organelles (mitochondria and chloroplasts) • within single chromosomes gene families are common – some family members are paralogues (related) • they have duplicated within the same genome • often diverged to provide separate functions in descendants • e.g. human α and β globin – orthologous genes • are homologues in different species • often perform the same function • e.g. human and horse myoglobin – pseudogenes • lost their function • e.g. human globin gene cluster pseudogene
  • 22. Eukaryotic model organisms • Saccharomyces cerevisiae (baker’s yeast) • Caenorhabditis elegans (C. elegans) • Drosophila melanogaster (fruit fly) • Arabidopsis thaliana (flower) • Homo sapiens (human)
  • 23. The human genome • ~3.2 × 10^9 bp (thirty times larger than C. elegans or D. melanogaster) • coding sequences form only 5% of the human genome • Repeat sequences make up over 50% • Only ~32,000 genes • The human genome is distributed over 22 chromosome pairs plus the X and Y chromosomes • Exons of protein-coding genes are relatively small compared to other known eukaryotic genomes • Introns are relatively long • Protein-coding genes span long stretches of DNA (dystrophin, coding for a 3,685-amino-acid protein, is >2.4 Mbp long) • Average gene length: ~8,000 bp • Average of 5-6 exons/gene • Average exon length: ~200 bp • Average intron length: ~2,000 bp • ~8% of genes have a single exon • Some exons can be as small as 1 or 3 bp.
  • 24. The human genome: Top categories in a function classification
      Function | Number | %
      Nucleic acid binding | 2207 | 14.0
      – DNA binding | 1656 | 10.5
      – DNA repair protein | 45 | 0.2
      – DNA replication factor | 7 | 0.0
      – Transcription factor | 986 | 6.2
      – RNA binding | 380 | 2.4
      – Structural protein of ribosome | 137 | 0.8
      – Translation factor | 44 | 0.2
      Transcription factor binding | 6 | 0.0
      Cell cycle regulator | 75 | 0.4
      Chaperone | 154 | 0.9
      Motor | 85 | 0.5
      Actin binding | 129 | 0.8
      Defense/immunity protein | 603 | 3.8
      Enzyme | 3242 | 20.6
      – Peptidase | 457 | 2.9
      – Endopeptidase | 403 | 2.5
      – Protein kinase | 839 | 5.3
      – Protein phosphatase | 295 | 1.8
      Enzyme activator | 3 | 0.0
      Apoptosis inhibitor | 132 | 0.8
      Signal transduction | 1790 | 11.4
      – Receptor | 1318 | 8.4
      – Transmembrane receptor | 1202 | 7.6
      – G-protein link receptor | 489 | 3.1
      – Olfactory receptor | 71 | 0.0
      Storage protein | 7 | 0.0
      Cell adhesion | 189 | 1.2
      Structural protein | 714 | 4.5
      – Cytoskeletal structural protein | 145 | 0.9
      Transporter | 682 | 4.3
      – Ion channel | 269 | 1.7
      – Neurotransmitter transporter | 19 | 0.1
      Ligand binding or carrier | 1536 | 9.7
      – Electron transfer | 33 | 0.2
      – Cytochrome P450 | 50 | 0.3
      Tumor suppressor | 5 | 0.0
      Unclassified | 4813 | 30.6
      Total | 15683 | 100.0
  • 25. The human genome • Repeated sequences comprise over 50% of the genome: – Transposable elements, or interspersed repeats, include LINEs and SINEs (almost 50%) – Retroposed pseudogenes – Simple ‘stutters’: repeats of short oligomers (minisatellites and microsatellites) – Segment duplication, of blocks of ~10-300 kb – Blocks of tandem repeats, including gene families
      Element | Size (bp) | Copy number | Fraction of genome (%)
      Short Interspersed Nuclear Elements (SINEs) | 100-300 | 1,500,000 | 13
      Long Interspersed Nuclear Elements (LINEs) | 6000-8000 | 850,000 | 21
      Long Terminal Repeats | 15,000-110,000 | 450,000 | 8
      DNA Transposon fossils | 80-3000 | 300,000 | 3
  • 26. The human genome • All people are different, but the DNA of different people varies by only 0.2% or less. • So, only up to 2 letters in 1000 are expected to be different. • Evidence from current genomics studies (Single Nucleotide Polymorphisms, or SNPs) implies that on average only 1 letter out of 1400 differs between individuals. • This means that 2 to 3 million letters differ between individuals.
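
The "2 to 3 million letters" figure follows from simple arithmetic. The sketch below assumes the genome size of ~3.2 × 10^9 bp quoted on slide 23. The function name `expected_differences` is hypothetical.

```python
def expected_differences(genome_bp, differing_fraction):
    """Expected number of differing positions between two genomes."""
    return genome_bp * differing_fraction


GENOME_BP = 3.2e9  # assumed human genome size, taken from slide 23

# Upper bound from the 0.2% figure (2 letters in 1000).
upper_bound = expected_differences(GENOME_BP, 0.002)

# SNP-based estimate: 1 letter in 1400.
snp_estimate = expected_differences(GENOME_BP, 1 / 1400)

print(f"upper bound:  {upper_bound / 1e6:.1f} million letters")
print(f"SNP estimate: {snp_estimate / 1e6:.1f} million letters")
```

With these inputs the SNP-based estimate comes to roughly 2.3 million positions, inside the 2 to 3 million range on the slide, while the 0.2% bound gives about 6.4 million.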
  • 27. From gene to function (figure: Genome, Expressome, Proteome, Metabolome; Functional Genomics; tertiary structure (fold))
  • 28. DNA makes RNA makes Protein: Expression data • More copies of mRNA for a gene lead to more protein • mRNA can now be measured for all the genes in a cell at once through microarray technology • A single gene chip can hold 60,000 spots (genes) • The color change gives the intensity of gene expression (over- or under-expression)
  • 30. Genes and regulatory regions: regulatory mechanisms organize the expression of genes – genes may be turned on or off in response to concentrations of nutrients or to stress – control regions often lie near the segments coding for proteins – they can serve as binding sites for molecules that transcribe the DNA – or they bind regulatory molecules that can block transcription
  • 31. Ulf Schmitz, Introduction to genomics and proteomics I 31 www. .uni-rostock.de Expression data
  • 32. Ulf Schmitz, Introduction to genomics and proteomics I 32 www. .uni-rostock.de Outlook – coming lecture Proteomics – Proteins – post-translational modification – Key technologies • Maps of hereditary information • SNPs (Single nucleotide polymorphisms) • Genetic diseases
  • 33. Thanks for your attention!
  • 34. Bioinformatics: Introduction to genomics and proteomics II. Ulf Schmitz, ulf.schmitz@informatik.uni-rostock.de, Bioinformatics and Systems Biology Group, www.sbi.informatik.uni-rostock.de
  • 35. Outline
      1. Proteomics
          • Motivation
          • Post-translational modifications
          • Key technologies
          • Data explosion
      2. Maps of hereditary information
      3. Single nucleotide polymorphisms
  • 36. Proteomics
      Proteomics:
      • is the large-scale study of proteins, particularly their structures and functions
      • the term was coined by analogy with genomics, and proteomics is often viewed as the "next step",
      • but proteomics is much more complicated than genomics
      • most importantly, while the genome is a rather constant entity, the proteome is constantly changing through its biochemical interactions with the genome
      • one organism will have radically different protein expression in different parts of its body and in different stages of its life cycle
      Proteome: the entirety of proteins in an organism is referred to as the proteome.
  • 37. Proteomics: "If the genome is a list of the instruments in an orchestra, the proteome is the orchestra playing a symphony." (R. Simpson)
  • 38. Proteomics
      • Describing all 3D structures of proteins in the cell is called Structural Genomics.
      • Finding out what these proteins do is called Functional Genomics.
      [Figure labels: Genome, Proteome; DNA microarray, genetic screens, protein-ligand interactions, protein-protein interactions, structure]
  • 39. Proteomics
      Motivation:
      • What kind of data would we like to measure?
      • What mature experimental techniques exist to determine them?
      • The basic goal is a spatio-temporal description of the deployment of proteins in the organism.
  • 40. Proteomics
      Things to consider:
      • the rates of synthesis of different proteins vary among different tissues and different cell types and states of activity
      • methods are available for efficient analysis of transcription patterns of multiple genes
      • because proteins "turn over" at different rates, it is also necessary to measure proteins directly
      • the distribution of expressed protein levels is a kinetic balance between rates of protein synthesis and degradation
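The kinetic balance in the last point can be illustrated with a very simple model. As an assumption for this sketch only, each protein is synthesized at a constant rate and degraded in proportion to its current level (first-order decay), so the level settles where synthesis equals degradation. The function name and the example rates are hypothetical.

```python
def steady_state_level(synthesis_rate, degradation_rate_constant):
    """Protein level at which synthesis equals degradation.

    Assumed model: dP/dt = synthesis_rate - degradation_rate_constant * P,
    which is zero when P = synthesis_rate / degradation_rate_constant.
    """
    if degradation_rate_constant <= 0:
        raise ValueError("degradation rate constant must be positive")
    return synthesis_rate / degradation_rate_constant


# Two hypothetical proteins made at the same rate (molecules per hour)
# but turned over at different rates (per hour).
stable = steady_state_level(synthesis_rate=100, degradation_rate_constant=0.5)
unstable = steady_state_level(synthesis_rate=100, degradation_rate_constant=2.0)

print(f"slow turnover: {stable:.0f} molecules")
print(f"fast turnover: {unstable:.0f} molecules")
```

Under this model, equal synthesis rates still give different protein levels when turnover differs, which is why transcription data alone is not enough and proteins have to be measured directly.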
  • 42. Why do proteomics?
      • Are there differences between amino acid sequences determined directly from proteins and those determined by translation from DNA?
          – Pattern recognition programs that address this question can make the following errors:
              • a genuine protein sequence may be missed entirely
              • an incomplete protein may be reported
              • a gene may be incorrectly spliced
              • genes for different proteins may overlap
              • genes may be assembled from exons in different ways in different tissues
          – Often, molecules must be modified to make a mature protein that differs significantly from the one suggested by translation:
              • in many cases the missing post-translational modifications are quite important and have functional significance
              • post-translational modifications include addition of ligands, glycosylation, methylation, excision of peptides, etc.
          – In some cases mRNA is edited before translation, creating changes in the amino acid sequence that are not inferable from the genes.
      • A protein inferred from a genome sequence is a hypothetical object until an experiment verifies its existence.
  • 43. Post-translational modification
      • A protein is a polypeptide chain built from 20 possible amino acids.
      • There are far fewer protein-coding genes in the human genome than there are proteins in the human proteome (~33,000 genes vs ~200,000 proteins).
      • Each gene encodes as many as six to eight different proteins
          – due to post-translational modifications such as phosphorylation, glycosylation or cleavage
      • Post-translational modification extends the range of possible functions a protein can have
          – changes may alter the hydrophobicity of a protein and thus determine whether the modified protein is cytosolic or membrane-bound
          – modifications like phosphorylation are part of common mechanisms for controlling the behavior of a protein, for instance activating or inactivating an enzyme
  • 44. Post-translational modification
      Phosphorylation
      • is the addition of a phosphate (PO4) group to a protein or a small molecule (usually to serine, tyrosine, threonine or histidine)
      • in eukaryotes, protein phosphorylation is probably the most important regulatory event
      • many enzymes and receptors are switched "on" or "off" by phosphorylation and dephosphorylation
      • phosphorylation is catalyzed by various specific protein kinases, whereas phosphatases dephosphorylate
      Acetylation
      • is the addition of an acetyl group, usually at the N-terminus of the protein
      Farnesylation
      • is the addition of a farnesyl group
      Glycosylation
      • is the addition of a glycosyl group to either asparagine, hydroxylysine, serine, or threonine, resulting in a glycoprotein
  • 45. Proteomics
  • 46. Key technologies for proteomics
      1. 1-D electrophoresis and 2-D electrophoresis
          • are used for the separation and visualization of proteins
      2. Mass spectrometry, X-ray crystallography, and NMR (nuclear magnetic resonance)
          • are used to identify and characterize proteins
      3. Chromatography techniques, especially affinity chromatography
          • are used to characterize protein-protein interactions
      4. Protein expression systems like the yeast two-hybrid and FRET (fluorescence resonance energy transfer)
          • can also be used to characterize protein-protein interactions
  • 47. Key technologies for proteomics
      High-resolution two-dimensional polyacrylamide gel electrophoresis (2D PAGE) shows the pattern of protein content in a sample.
      [Figure: reference map of the lymphoblastoid cell line PRI, soluble proteins]
      • 110 µg of proteins loaded
      • Strip 17 cm, pH gradient 4-7; SDS PAGE gels 20 x 25 cm, 8-18.5% T
      • Staining by the silver nitrate method (Rabilloud et al.)
      • Identification by mass spectrometry
      The pink labels on the spots indicate the ID in the Swiss-Prot database. Browse the SWISS-2DPAGE database for more 2D PAGE images.
  • 48. Proteomics
      X-ray crystallography is a means to determine the detailed molecular structure of a protein, nucleic acid or small molecule.
      Typically, a sample is purified to homogeneity, crystallized, and subjected to an X-ray beam, and diffraction data are collected.
      With a crystal structure we can explain the mechanism of an enzyme, the binding of an inhibitor, the packing of protein domains, the tertiary structure of a nucleic acid molecule, etc.
  • 49. High-throughput biological data
      • Enormous amounts of biological data are being generated by high-throughput capabilities; even more are coming
          – genomic sequences
          – gene expression data (microarrays)
          – mass spectrometry data
          – protein-protein interactions (chromatography)
          – protein structures (X-ray crystallography)
          – ...
  • 50. Protein structural data explosion
      Protein Data Bank (PDB): 33,367 structures (1 November 2005), of which 28,522 by X-ray crystallography and 4,845 by NMR
  • 51. Maps of hereditary information
      The following maps are used to find out how hereditary information is stored, passed on, and implemented:
      1. Linkage maps of genes
          – mini- / microsatellites
      2. Banding patterns of chromosomes
          – physical objects with visible landmarks called banding patterns
      3. DNA sequences
          – contig maps (contiguous clone maps)
          – sequence tagged sites (STS)
          – SNPs (single nucleotide polymorphisms)
  • 52. Linkage map
  • 53. Maps of hereditary information
      Variable number tandem repeats (VNTRs, also minisatellites)
      • regions 8-80 bp long, repeated a variable number of times
      • the distribution and the size of the repeats is the marker
      • inheritance of VNTRs can be followed in a family and mapped to a pathological phenotype
      • first genetic data used for personal identification
          – genetic fingerprints, in paternity and in criminal cases
      Short tandem repeat polymorphisms (STRPs, also microsatellites)
      • regions of 2-7 bp, repeated many times
          – usually 10-30 consecutive copies
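Because the marker is the number of consecutive copies of a repeat unit, two alleles can be compared by counting copies. The following is a minimal sketch using Python's `re` module; the function name and the two example alleles are hypothetical, and real genotyping works from fragment lengths or sequencing reads rather than exact string matching.

```python
import re


def longest_tandem_repeat(sequence, unit):
    """Return (start index, copy count) of the longest uninterrupted
    run of `unit` in `sequence`, or (-1, 0) if the unit never occurs."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    best_start, best_copies = -1, 0
    for match in pattern.finditer(sequence):
        copies = len(match.group(0)) // len(unit)
        if copies > best_copies:
            best_start, best_copies = match.start(), copies
    return best_start, best_copies


# Two hypothetical alleles that differ only in the number of CGT copies.
allele_a = "TTGA" + "CGT" * 8 + "AACC"
allele_b = "TTGA" + "CGT" * 12 + "AACC"

print(longest_tandem_repeat(allele_a, "CGT"))
print(longest_tandem_repeat(allele_b, "CGT"))
```

The differing copy counts at the same position are what make such a locus usable as a marker.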
  • 54. [Figure labels: centromere; tandem repeat with a 3 bp unit, strands CGTCGTCGTCGTCGTCGTCGTCGT... and GCAGCAGCAGCAGCAGCAGCAGCA...]
  • 55. Maps of hereditary information: banding patterns of chromosomes
  • 56. Maps of hereditary information: banding patterns of chromosomes [Figure labels: petite arm (p arm), centromere, queue arm (q arm)]
  • 57. Maps of hereditary information
      Contig map (also contiguous clone map)
      • series of overlapping DNA clones of known order along a chromosome from an organism of interest, stored in yeast or bacterial cells as YACs (Yeast Artificial Chromosomes) or BACs (Bacterial Artificial Chromosomes)
      • a contig map produces a fine (high-resolution) mapping of a genome
      • a YAC can contain up to 10^6 bp, a BAC about 250,000 bp
      Sequence tagged site (STS)
      • short, sequenced region of DNA, 200-600 bp long, that appears in a unique location in the genome
      • one type arises from an EST (expressed sequence tag), a piece of cDNA
  • 58. Maps of hereditary information
      Imagine we know that a disease results from a specific defective protein:
      1. if we know the protein involved, we can pursue rational approaches to therapy
      2. if we know the gene involved, we can devise tests to identify sufferers or carriers
      3. whereas knowledge of the chromosomal location of the gene is in many cases unnecessary for either therapy or detection;
          • it is required only for identifying the gene, providing a bridge between the patterns of inheritance and the DNA sequence
  • 59. Single nucleotide polymorphisms (SNPs)
      • a SNP (pronounced "snip") is a genetic variation between individuals
      • single base pairs that can be substituted, deleted or inserted
      • SNPs are distributed throughout the genome
          – on average one every 2000 bp
      • they provide markers for mapping genes
      • not all SNPs are linked to diseases
  • 60. Single nucleotide polymorphisms (SNPs)
      • nonsense mutations:
          – the changed codon codes for a stop, which can truncate the protein
      • missense mutations:
          – the changed codon codes for a different amino acid
      • silent mutations:
          – the changed codon codes for the same amino acid, so it has no effect on the protein sequence
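The three classes can be told apart by translating the codon before and after the change. This is a minimal sketch using the standard genetic code; the function name `classify_snp` and the example codons are chosen for illustration, and only substitutions inside a coding codon are handled (not insertions, deletions, or SNPs outside coding regions).

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; "*" marks a stop.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(reference_codon, variant_codon):
    """Classify a substitution as silent, nonsense, or missense.

    Simplification: a change from a stop codon to an amino acid is
    also reported as "missense" here.
    """
    reference_aa = CODON_TABLE[reference_codon]
    variant_aa = CODON_TABLE[variant_codon]
    if variant_aa == reference_aa:
        return "silent"
    if variant_aa == "*":
        return "nonsense"
    return "missense"


print(classify_snp("TAC", "TAA"))  # Tyr -> stop
print(classify_snp("GAA", "GAC"))  # Glu -> Asp
print(classify_snp("GAA", "GAG"))  # Glu -> Glu
```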
  • 61. Outlook: coming lecture
      • Bioinformatics information resources and networks
          – EMBnet (European Molecular Biology Network)
              • DBs and tools
          – NCBI (National Center for Biotechnology Information)
              • DBs and tools
          – Nucleic acid sequence databases
          – Protein information resources
          – Metabolic databases
          – Mapping databases
          – Databases concerning mutations
          – Literature databases
  • 62. Thanks for your attention!