12. Human Genome Project
Timeline:
1990: In October, the Human Genome Project started
2000: First publication
2003: Finished paper
NHGRI solicited proposals for ENCODE
2005: GWAS - 90% lies outside coding regions
RFAs were sought for full ENCODE
2007: ENCODE pilot published
2012: First report on ENCODE published
13. What happens next?
You have 10 million characters – what to do with them?
Locate genes
Determine the function of the gene
By similarity search
By domain search
By predicting signal peptide
By locating transmembrane region
Ref: http://www.nature.com/nature/journal/v406/n6797/pdf/406799a0.pdf
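As an illustration of the "Locate genes" step, here is a minimal sketch of a naive open reading frame (ORF) scan. It assumes the standard start codon ATG and the stop codons TAA, TAG and TGA, and it ignores introns, so it is closer to a prokaryotic gene finder than to a real annotation tool. The names find_orfs, reverse_complement and min_codons are hypothetical and do not come from the slides.

```python
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(seq, min_codons=3):
    """Scan both strands in all three frames for ATG...stop stretches.

    Returns a list of (strand, start, end, orf_sequence) tuples.
    Coordinates are 0-based, end-exclusive, and relative to the
    strand that was scanned (the reverse complement for "-").
    min_codons counts the start and stop codons as well.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == START:
                    start = i  # remember the first ATG in this frame
                elif start is not None and codon in STOPS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # look for the next ORF in this frame
    return orfs


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATAGCC"):
        print(orf)
```

Candidate ORFs found this way would then go on to the function-prediction steps listed above (similarity search, domain search, and so on).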
14. Genome Annotation
Input: genomic DNA sequence (e.g. ATGAAGATAGACAG...)
Run 6 frame translation
Run Blastp with nr
Match found: Product found
No match: Make an hmmsearch
hmmsearch match found: Product found
No match: Unknown Genes, Hypothesis
Product found: Pathway analysis, Other analysis
Also run: Repeat finding, miRNA finding, tRNAscan etc.
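The six-frame translation step of the annotation flow can be sketched as follows. This is a minimal illustration that assumes the standard genetic code; the names translate and six_frame_translation are hypothetical, stop codons are written as "*", and codons containing anything other than A, C, G or T become "X".

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate a DNA string codon by codon, dropping a trailing partial codon."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to protein strings."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    # First sequence line shown on the slide.
    for label, protein in six_frame_translation("ATGAAGATAGACAG").items():
        print(label, protein)
```

Each translated frame would then be used as a protein query in the Blastp step against nr.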
15. Genome Sizes
Gametic Nuclear DNA content
Represented as mass in pg (picograms) or length in megabases (Mb)
1 pg = 10^-12 g
1 Mb = 10^6 bases
1 pg = 978 Mb
Ref: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1669731/
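The conversion between the two units is a single multiplication. The sketch below uses the factor from the slide (1 pg = 978 Mb); the function names and the 3.0 pg input are hypothetical, chosen only to show the arithmetic.

```python
MB_PER_PG = 978  # conversion factor from the slide: 1 pg = 978 Mb


def pg_to_mb(picograms):
    """Convert DNA content in picograms to length in megabases."""
    return picograms * MB_PER_PG


def mb_to_pg(megabases):
    """Convert length in megabases to DNA content in picograms."""
    return megabases / MB_PER_PG


if __name__ == "__main__":
    # Hypothetical genome with 3.0 pg of gametic nuclear DNA.
    print(pg_to_mb(3.0), "Mb")
    print(mb_to_pg(978), "pg")
```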
22. Identifying Human Disease genes
ref: http://www.ncbi.nlm.nih.gov/books/NBK7561/
Before 1980, very few genes were recognized
Reverse Genetics: Know gene product and go back to
gene and do a positional cloning
Genetic Redundancy: Multiple genes have the same
function