The document summarizes the progress and findings of the Human Genome Project from its inception in 1990 through its completion in 2003. It discusses how the project established the foundations for genetic mapping and sequencing chromosomes. After the working draft was announced in 2000 and the project was completed in 2003, subsequent research focused on fully sequencing and analyzing the remaining chromosomes. This led to insights into genetic diseases and variation in gene and chromatin distribution across chromosomes. The document also discusses subsequent projects like ENCODE and HapMap that built upon the human genome sequence to map functional elements and genetic variations respectively.
What’s Beyond the Finished Human Genome Sequence?
Abstract
The Human Genome Project has changed the course of biomedicine and biotechnology, from its conception in 1990
to its ‘completion’ in 2003. Driven by an energetic scientific community, its reach has spread far and wide, with a
number of projects spawning from the initial results. Over the past decade research has built upon the foundations
of this project, and genomic analysis is now nearing completion. From the data collected, research is now moving
into a more understudied area of science with a great deal of untapped potential: the proteome.
Baby Steps for Big Science
The year is 1990. The U.S. Department of Energy (DOE) and the National Institutes of Health (NIH)
have committed $3 billion to a project that would completely alter the course of modern science.
It is at this point that the Human Genome Project (HGP) sprang into life. ‘The Five-Year Plan’ was
presented to members of congressional appropriations committees, detailing the project aims for
1991-1995 (Human Genome Program, 1990). Those five years encompassed the establishment of the
building blocks that would form the project core: the human gene mapping repository (Pearson, 1991)
and the International IMAGE Consortium (Lennon, et al., 1996). This was followed by the completion
of the first round of genetic mapping and the publication of a number of high- and
moderate-resolution physical maps of chromosomes 16, 19, 3, 11, 12, and 22 (Ashworth, et al., 1995;
Bell, et al., 1995; Doggett, et al., 1995; Gemmill, et al., 1995; Krauter, et al., 1995;
Quackenbush, et al., 1995).
From 1996 to 2003 the sequencing of the genome progressed in leaps and bounds, beginning with the
high-resolution mapping of chromosomes 7 and X in early 1997 (Bouffard, et al., 1997; Nagaraja, et
al., 1997). These successes were quickly followed in 1998 by the DOE and NIH revealing another
five-year plan, aiming to complete the HGP by 2003 (Collins, et al., 1998), and by the publication
of GeneMap’98, which doubled the number of known genes (Deloukas, et al., 1998). 1999 saw the
complete sequencing and analysis of chromosome 22 (Dunham, et al., 1999), the creation of the first
public SNP consortium by major drug firms (Thorisson & Stein, 2003), and the milestone of the first
billion base pairs sequenced.
On 26 June 2000, President Clinton and HGP leaders announced the completion of a working draft
sequence of the human genome at “a historic White House event” (The White House, 2000). The draft
was later published by the International Human Genome Sequencing Consortium and by Celera Genomics
in February 2001 (Consortium, 2001; Venter, et al., 2001). Between 2001 and early 2003 came the
publication of the SNP map of the human genome (International SNP Map Working Group, 2001), as well
as the complete sequence analysis of chromosome 14 (Heilig, et al., 2003).
The completion of the 13-year project was announced in April 2003, with the genome sequenced to
99.99% accuracy (National Human Genome Research Institute, 2003). The results obtained from the HGP
have acted as the basis of all human genome studies of the past 11 years.
Chromosomes and Chromatin: Unzipping Your Genes
The first four years after completion of the HGP (2003-2006) saw heavy focus on complete sequencing
and analysis of the remaining 19 unsequenced chromosomes (Table 1). The research painted an
interesting picture of genetic disease. Across the 23 human chromosomes, 16 possess links to a
number of diseases, including cancers and hereditary syndromes (Mungall, et al., 2003; Deloukas, et
al., 2004; Dunham, et al., 2004; Humphray, et al., 2004; Ross, et al., 2005; Gregory, et al., 2006;
Zody, et al., 2006). It has also given insight into gene and chromatin distribution across the 22
autosomes, with chromosome 19 possessing the highest gene density in the genome (Grimwood, et al.,
2004) and chromosome 18 the lowest (Nusbaum, et al., 2005). However, there were still holes in the
euchromatic sequence of the human genome, with 341 gaps unaccounted for 10 years ago (The
International Genome Sequencing Consortium, 2004). In recent years this number has fallen to 160
(Genovese, et al., 2013), but structural variation remains poorly understood to this date.
Single-molecule sequencing has indicated a 3:1 insertional bias in regions corresponding to complex
insertions and long short tandem repeats (Chaisson, et al., 2014), suggesting that the human genome
is more complex than first thought. There is, however, potential for these gaps to be resolved by
use of longer-read sequencing technology.
Decoding ENCODE
Sequencing of the 23 human chromosomes initiated the cataloguing of most human protein-coding genes
and other important elements (e.g. non-coding regulatory RNAs) (Venter, et al., 2001). This list
has become a keystone in systems biology, giving insight into how different structures and systems
are connected, their dynamics, and how function relates to them (Hood, 2008). The catalogue
manifested itself in the form of the ENCODE (Encyclopedia Of DNA Elements) Project, launched in
2003 to identify all functional elements in the human genome sequence (The ENCODE Project
Consortium, 2004). The pilot project focused on a 30-megabase region (roughly 1%) of the human
genome sequence. Its results were published in mid-2007, with a number of revelations as to
microRNA transcripts within the genome (Saini, et al., 2007). Primarily, the results implied that
the human genome is pervasively transcribed, with many novel non-protein-coding transcripts (The
ENCODE Project Consortium, 2007; Alexander, et al., 2010). Moreover, a number of non-coding regions
produce thousands of specifically regulated long intergenic non-coding RNAs (lincRNAs) enriched for
trait-associated single nucleotide polymorphisms (SNPs) (Hangaur, et al., 2013), as well as
offering landing spots for proteins that influence gene activity (Pennisi, 2012). These discoveries
allow for interrogation of newly discovered intergenic functional elements, as well as
loss-of-function alleles and variants in non-essential genes (MacArthur, et al., 2012).
SNP-ing Hap Hazardly
A map published in 2001 indicated that the human genome possesses roughly 1.42 million SNPs, with
roughly 60,000 falling within exons (coding regions and UTRs) and 85% of exons lying within 5 kb of
the nearest SNP (Sachidanandam, et al., 2001). Since the initial identification of these commonly
occurring DNA sequence variations, they have been associated with population diversity,
individuality, disease susceptibility, and individual response to medicine (Shastry, 2002). The
International HapMap Project was founded in 2003 as a means to determine sequence variation within
the human genome, using populations from Africa, Asia, and Europe (The International HapMap
Consortium, 2003). By October 2005, the HapMap project had produced a haplotype map of the entire
genome, identifying complete genotypes for over a million of the identified SNPs across 4
populations (The International HapMap Consortium, 2005). The results have helped significantly in
moving forward research into genetically inherited diseases (Manolio, et al., 2008), cancers (Hung,
et al., 2008; Cao, et al., 2013), and syndromes (Nezos, et al., 2014; Vattikuti, et al., 2012),
allowing identification of loci that heavily influence their manifestation. A Phase II HapMap was
generated in 2007, doubling the number of known SNPs. The results also revealed novel aspects of
linkage disequilibrium, showed that 10-30% of pairs of individuals within a population share at
least one region of extended genetic identity arising from recent ancestry, and found that 1% of
common variants are untaggable (The International HapMap Consortium, 2007).
Evidence has also shown that certain populations carry susceptibility loci for particular diseases,
dictated by haplotype and copy-number variation (Jakobsson, et al., 2008). For example, in East
Asian populations the SNP rs9485372, near the TAB2 gene, is associated with breast cancer
susceptibility (Long, et al., 2012). In addition to SNPs, deletion polymorphisms in the human
genome have been determined using the HapMap data. By analysing SNP genotype data from
parent-offspring trios, high-resolution population surveys of deletion polymorphisms were produced.
These identified 586 distinct regions harbouring deletion polymorphisms, 278 of which were observed
in unrelated individuals (Conrad, et al., 2006; McCarroll, et al., 2006).
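One way to picture how trio genotypes can point to a deletion: a child who inherits a deleted segment from one parent carries only one copy of each SNP in that segment, and is typically genotyped as homozygous for the remaining allele. This can look like a violation of Mendelian inheritance. The sketch below is only an illustration of that idea, not the pipeline used in the cited studies. The genotype encoding, the function names, and the `min_run` threshold are all assumptions made for the example.

```python
from itertools import product


def is_mendelian_consistent(father, mother, child):
    """Return True if the child's genotype can be formed by taking
    one allele from the father and one from the mother.

    Genotypes are two-character strings such as "AA", "AB" or "BB"
    (a hypothetical encoding chosen for this sketch).
    """
    child_alleles = sorted(child)
    return any(sorted((f, m)) == child_alleles
               for f, m in product(father, mother))


def flag_candidate_deletions(trio_genotypes, min_run=2):
    """Scan SNPs ordered by position and report runs of consecutive
    Mendelian-inconsistent SNPs as candidate deletion regions.

    trio_genotypes: list of (position, father, mother, child) tuples.
    min_run: hypothetical threshold; isolated inconsistencies are more
             likely to be genotyping errors than deletions.
    Returns a list of (start_position, end_position) tuples.
    """
    regions = []
    current_run = []
    for position, father, mother, child in trio_genotypes:
        if is_mendelian_consistent(father, mother, child):
            if len(current_run) >= min_run:
                regions.append((current_run[0], current_run[-1]))
            current_run = []
        else:
            current_run.append(position)
    if len(current_run) >= min_run:
        regions.append((current_run[0], current_run[-1]))
    return regions


# Made-up genotypes for one trio, ordered by position.
example_trio = [
    (100, "AA", "AB", "AB"),  # consistent
    (200, "AA", "BB", "AA"),  # inconsistent: child should be AB
    (300, "BB", "AA", "BB"),  # inconsistent: child should be AB
    (400, "AB", "AB", "AA"),  # consistent
    (500, "AA", "BB", "BB"),  # single inconsistency, below min_run
]
candidate_regions = flag_candidate_deletions(example_trio)
print(candidate_regions)
```

Requiring a run of adjacent inconsistent SNPs is one simple way to separate a plausible deletion from a one-off genotyping error. A real analysis would also need to account for SNP density and genotype quality.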
Chromosome | Initial sequence and analysis | Primary findings | Additional details
1 | 2006 | 3,141 protein-encoding genes and 991 pseudogenes | Mutations and rearrangements prevalent in cancer and many other diseases
2 | 2005 | 1,346 protein-coding genes and 1,239 pseudogenes | Unique to human lineage; product of head-to-head fusion of two intermediate-sized ancestral chromosomes
3 | 2006 | 1,425 protein-encoding genes, 8 novel genes, 27 novel transcripts, 3 putative genes, and 122 pseudogenes | Comprises just four contigs; lowest rate of segmental duplication in the genome; chemokine receptor gene cluster; numerous loci involved in human cancers, e.g. the gene encoding FHIT
4 | 2005 | 796 protein-coding genes and 778 pseudogenes | Genes associated with Huntington’s disease, Wolf-Hirschhorn syndrome, polycystic kidney disease, and muscular dystrophy
5 | 2004 | 923 manually curated protein-coding genes | One of the largest human chromosomes, yet one of the lowest gene densities; encodes protocadherin and interleukin gene families
6 | 2003 | 1,557 genes and 633 pseudogenes identified (6% of genome) | Genes directly implicated in cancer, schizophrenia, autoimmunity, and many more
7 | 2003 | 1,150 protein-coding genes and an additional 941 pseudogenes | Unusual amount of segmentally duplicated sequence
8 | 2006 | 793 protein-encoding genes and 301 pseudogenes | 15 Mb region on distal 8p with a high mutation rate, containing genes related to innate immunity and the nervous system, and the MCPH1 gene cluster
9 | 2004 | 1,149 genes and 426 pseudogenes identified | Largest autosomal block of heterochromatin; genes implicated in male-to-female sex reversal, cancer, and neurodegenerative disease
10 | 2004 | 816 protein-coding genes and 430 pseudogenes identified | 67 antisense transcripts identified; PTEN tumour suppressor and RET proto-oncogene identified
11 | 2006 | 1,524 protein-coding genes and 765 pseudogenes | 40% of olfactory receptor genes in the human genome located in 28 gene clusters along the chromosome; 85 genes associated with disorders
12 | 2006 | 1,400 coding genes, and 487 loci with direct implications for human disease | The q arm contains one of the largest blocks of linkage disequilibrium in the entire human genome
13 | 2004 | 633 genes identified, 296 pseudogenes, and 105 putative non-coding RNA genes | Genes of interest: BRCA2, RB1; DAOA locus (bipolar disorder and schizophrenia)
14 | 2003 | 1,050 genes identified and 393 pseudogenes | Two loci of crucial importance for the immune system; >60 disease genes localised
15 | 2006 | 695 protein-encoding genes and 250 pseudogenes | High rate of segmental duplication, which can result in Prader-Willi and Angelman syndromes
16 | 2004 | 880 protein-coding genes, 19 transfer RNA genes, 341 pseudogenes, and 3 RNA pseudogenes | Gene families identified: metallothionein, cadherin, and Iroquois; disease genes: polycystic kidney disease, acute myelomonocytic leukaemia
17 | 2006 | 1,226 protein-encoding genes and 274 pseudogenes | Second-highest gene density in the genome; implicated in a wide range of genetic diseases, including BRCA1, NF1, TP53, NAHR, HNPP, SMS, and CMT1A
18 | 2005 | 377 protein-encoding genes and 171 pseudogenes | Lowest gene density of any human chromosome; a number of genetic disorders arise from trisomy and aneuploidy of the chromosome
19 | 2004 | 1,461 protein-coding genes and 321 pseudogenes | Highest gene density of all human chromosomes; genes implicated in Mendelian disorders (hypercholesterolaemia, insulin-resistant diabetes)
20 | 2001 | 727 protein-encoding genes and 168 pseudogenes | Genes encoding protease inhibitors with antibacterial and antiviral activities, and reproductive proteins SEMG1 and SEMG2
21 | 2000 | 127 protein-encoding genes, 98 predicted genes, and 59 pseudogenes | Several anonymous loci for monogenic disorders and predispositions to common complex disorders mapped; loss of heterozygosity observed in regions associated with solid tumours
X | 2005 | 99 protein-encoding genes, 113 X-linked genes | A number of genes expressed in various tumour types; 10% associated with Mendelian diseases
Y | 2003 | MSY region is 95% of the chromosome’s length; 78 protein-coding genes | Mosaic of heterochromatic sequences and 3 classes of euchromatic sequence (X-transposed, X-degenerate, and ampliconic)
Table 1: The results obtained from sequencing chromosomes 1-21, X, and Y, both before and after completion of the HGP (2000-2006).
The number of functional genes associated with each was shown to be highly variable, and a number were shown to possess links to
genetic diseases (Hattori, et al., 2000; Deloukas, et al., 2001; Heilig, et al., 2003; Hillier, et al., 2003; Mungall, et al., 2003;
Skaletsky, et al., 2003; Deloukas, et al., 2004; Dunham, et al., 2004; Grimwood, et al., 2004; Humphray, et al., 2004; Martin, et al., 2004;
Schmutz, et al., 2004; Hillier, et al., 2005; Nusbaum, et al., 2005; Nusbaum, et al., 2005; Ross, et al., 2005; Scherer, et al., 2005; Gregory, et
al., 2006; Muzny, et al., 2006; Taylor, et al., 2006; Zody, et al., 2006; Zody, et al., 2006).
Methylation: How Well Does Our DNA Age?
DNA cytosine methylation is a stable epigenetic modification integral to genome regulation,
development, and disease, via modulation of the transcriptional plasticity of the genome (Eckhardt,
et al., 2006). It interferes with the transcription of genes either by directly impeding the
binding of transcription factors to their binding motifs (Choy, et al., 2010), or by recruiting
histone deacetylases via methyl-CpG-binding domain proteins (Esteller, 2006). Single-base-resolution
methylation maps have been generated for human embryonic stem cells (ESCs) and fetal fibroblasts
(Lister, et al., 2009). These showed that the pattern of methylation is variable and dependent on
cell type. Methylation in non-CG contexts is enriched in gene bodies and depleted in protein
binding sites and enhancers. There is also evidence that de novo methyltransferase activity is used
to maintain cellular pluripotency (Lister, et al., 2009). DNA methylation also varies between
species, and is one factor that differentiates humans from other mammalian species (Pai, et al.,
2011). Upon comparison of methylation patterns with chimpanzees, the patterns of tissue-specific
differentially methylated regions (T-DMRs) were found to be conserved between the two species.
Levels of methylation nonetheless distinguish them: for a subset of genes, differences in
methylation may underlie 12-18% of the differences in gene expression levels between humans and
chimpanzees (Pai, et al., 2011). As it stands, DNA methylation has taken us closer to understanding
the variation that affects gene expression between primate species. It has also been shown that DNA
methylation age is a potential measure of the cumulative effect of an epigenetic maintenance
system, and could be used to address gaps in knowledge of developmental biology, cancer, and
aging (Horvath, 2013).
Genomes Through The Ages
2007 saw the sequencing of the first full diploid genome of an individual human, assembled from ~32
million random DNA fragments sequenced by Sanger dideoxy technology (Levy, et al., 2007).
Comparison with the National Center for Biotechnology Information (NCBI) reference revealed over
4.1 million DNA variants, encompassing 12.3 Mb. Non-SNP DNA variation accounted for 22% of these
events, an indication of how important non-SNP genetic alterations are in the diploid genome
structure. The research provided a base for future genome comparisons and helped usher in the era
of individualised genomic information. This study prompted an explosion of research into the
genomes of past cultures and species. The most prominent of these was the genome of the
Neanderthals. In 2010, a draft sequence of our 30,000-year-old relatives was produced (Green, et
al., 2010), followed in 2013 by a complete sequence (Prufer, et al., 2013). This high-quality
sequence gave insights into the gene flow events that occurred between Neanderthals, Denisovans,
and early humans, giving interbreeding models that shed light on the ancestry of loci and the
genome haplotypes that have given rise to modern humans (Sankararaman, et al., 2014). Historical,
religious, and cultural traditions also influence gene flow and distribution, with ethno-religious
communities sharing common traits (SNPs) and phenotypic characteristics that can be traced back to
Old World populations (Behar, et al., 2010; Hellenthal, et al., 2014). We can also extrapolate to
produce global migration patterns and monitor the rise of new, unique populations, such as Native
Americans and Inuit (Rasmussen, et al., 2010; Rasmussen, et al., 2014). Evidence has also shown
that Khoisan and Bantu genomes are significantly different from those previously mentioned at both
the nuclear-marker and mitochondrial levels (Schuster, et al., 2010).
How VARY Interesting…
The human genome is an extremely complex system, built upon the principle of variation, giving rise
to individuality. With most SNPs having been assessed, research has begun to focus on heritable
components of complex traits and the variation that leads to their manifestation (Frazer, et al.,
2009). Genetic variation can be attributed to a number of things, primarily meiosis, where
recombination rates have been seen to vary tremendously across the genome, with recombination
occurring in narrow ‘hotspots’, as shown through linkage disequilibrium (LD) and sperm-typing
studies (Coop, et al., 2008). These studies have also shown variation between the sexes. In
addition, a number of structural variations affect stretches of DNA greater than 1 kilobase in
length, accounting for many insertion and deletion differences between individuals (Conrad, et al.,
2010; Kidd, et al., 2008). The variations observed within the genome range from common and
inconsequential to rare and deleterious (Pelak, et al., 2010; Robinson, et al., 2014). Although
there has been a great deal of progress in identifying disease variants, a large number remain
unexplained. Work is under way to develop a high-resolution map of functional human genetic
variation by studying numerous, geographically distinct populations (International HapMap 3
Consortium, 2010; The 1000 Genomes Project Consortium, 2012; Lappalainen, et al., 2013).
Proteins and You, the Future
With the human genome nearing complete sequencing, there was only one logical step to take: to
attempt to map the human proteome. The Human Proteome Project (HPP) was established in 2011 with
the intention of mapping the entire human proteome (Legrain, et al., 2011). It aims to observe all
of the proteins produced from sequences in the human genome; in 2011, about 30% of the estimated
20,300 protein-coding genes lacked sufficient protein-level evidence (Legrain, et al., 2011). Since
then, a draft map of the human proteome has been generated by use of high-resolution
Fourier-transform mass spectrometry (Kim, et al., 2014). The map covers over 84% of the total
20,300 protein-coding genes. For the remaining 16% of the proteome, a number of further proteomic
methodologies should be employed, including multiple protease analysis, N-termini capture,
enrichment of post-translationally modified peptides, and fractionation, along with technologies
such as top-down mass spectrometry and electron transfer dissociation. Broadening the range of
tissue types tested should also help.
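To put the percentages above into gene counts, the short sketch below converts them using the 20,300 protein-coding genes quoted in the text. The variable and function names are illustrative only. Because the source figures are approximate ("about 30%", "over 84%"), the resulting counts should be read as rough estimates rather than reported values.

```python
# Figures quoted in the text (Legrain, et al., 2011; Kim, et al., 2014).
TOTAL_PROTEIN_CODING_GENES = 20_300
PCT_LACKING_EVIDENCE_2011 = 30  # "about 30%"
PCT_COVERED_BY_DRAFT_MAP = 84   # "over 84%", so treated here as a lower bound


def genes_for_percentage(percent, total=TOTAL_PROTEIN_CODING_GENES):
    """Convert a percentage of protein-coding genes into an approximate count."""
    return round(total * percent / 100)


lacking_evidence_2011 = genes_for_percentage(PCT_LACKING_EVIDENCE_2011)
covered_by_draft_map = genes_for_percentage(PCT_COVERED_BY_DRAFT_MAP)
still_unmapped = TOTAL_PROTEIN_CODING_GENES - covered_by_draft_map

print(f"Genes lacking protein-level evidence in 2011: ~{lacking_evidence_2011}")
print(f"Genes covered by the 2014 draft map: at least ~{covered_by_draft_map}")
print(f"Genes still to be mapped: at most ~{still_unmapped}")
```

On these assumptions, the genes without protein-level evidence fall from roughly 6,090 in 2011 to at most about 3,248 after the draft map, which is the gap the further methodologies listed above are meant to close.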
References
Alexander, R. P. et al., 2010. Annotating non-coding regions of the
genome. Nature Reviews. Genetics, 11(8), pp. 559-571.
Ashworth, L. K. et al., 1995. An integrated metric physical map of
human chromosome 19. Nature Genetics, 11(4), pp. 422-427.
Behar, D. M. et al., 2010. The genome-wide structure of the
Jewish people. Nature, 466(7303), pp. 238-242.
Bell, C. J. et al., 1995. Integration of physical, breakpoint and
genetic maps of chromosome 22. Localization of 587 yeast
artificial chromosomes with 238 mapped markers. Human
Molecular Genetics, 4(1), pp. 59-69.
Bouffard, G. G. et al., 1997. A physical map of human chromosome
7: an integrated YAC contig map with average STS spacing of 79kb.
Genome Research, 7(7), pp. 673-92.
Cao, X. et al., 2013. RRM1 and RRM2 pharmacogenetics:
association with phenotypes in HapMap cell lines and acute
myeloid leukaemia patients. Pharmacogenomics, 14(2), pp. 1449-
1466.
Chaisson, M. J. P. et al., 2014. Resolving the complexity of the
human genome using single-molecule sequencing. Nature, 000(0),
pp. 1-11.
Choy, M.-K. et al., 2010. Genome-wide conserved consensus
transcription factor binding motifs are hyper-methylated. BMC
Genomics, 11(519), pp. 1-10.
Collins, F. S. et al., 1998. New goals for the U.S. Human Genome
Project 1998-2003. Science, 282(5389), pp. 682-689.
Conrad, D. F. et al., 2006. A high-resolution survey of deletion
polymorphism in the human genome. Nature Genetics, Volume 38,
pp. 75-81.
Conrad, D. F. et al., 2010. Origins and functional impact of copy
number variation in the human genome. Nature, Volume 464, pp.
704-712.
Consortium, 2001. Initial sequencing and analysis of the human
genome. Nature, 409(6822), pp. 860-921.
Coop, G. et al., 2008. High-Resolution Mapping of Crossovers
Reveals Extensive Variation in Fine-Scale Recombination Patterns
Among Humans. Science, 319(5868), pp. 1395-1398.
Deloukas, P. et al., 2004. The DNA sequence and comparative
analysis of human chromosome 10. Nature, Volume 429, pp. 375-
381.
Deloukas, P. et al., 1998. A physical map of 30,000 human
genes. Science, 282(5389), pp. 744-746.
Deloukas, P. et al., 2001. The DNA sequence and comparative
analysis of human chromosome 20. Nature, Volume 414, pp. 865-
871.
Doggett, N. A. et al., 1995. An integrated physical map of human
chromosome 16. Nature, 377(4), pp. 335-65.
Dunham, A. et al., 2004. The DNA sequence and analysis of human
chromosome 13. Nature, 428(6982), pp. 522-528.
Dunham, I. et al., 1999. The DNA sequence of human
chromosome 22. Nature, Volume 402, pp. 489-495.
Eckhardt, F. et al., 2006. DNA methylation profiling of human
chromosomes 6, 20, and 22. Nature Genetics, 38(12), pp. 1378-
1385.
Esteller, M., 2006. CpG island methylation and histone
modifications: biology and clinical significance. Ernst Schering
Research Foundation Workshop, Volume 57, pp. 115-126.
Frazer, K. A., Murray, S. S., Schork, N. J. & Topol, E. J., 2009.
Human genetic variation and its contribution to complex traits.
Nature Reviews. Genetics, 10(4), pp. 241-251.
Gemmill, R. M. et al., 1995. A second-generation YAC contig map
of human chromosome 3. Nature, 377(4), pp. 299-319.
Genovese, G. et al., 2013. Using population admixture to help
complete maps of the human genome. Nature Genetics, Volume
45, pp. 406-414.
Green, R. E. et al., 2010. A Draft Sequence of the Neanderthal
Genome. Science, 328(5979), pp. 710-722.
Gregory, S. G. et al., 2006. The DNA sequence and biological
annotation of human chromosome 1. Nature, Volume 441, pp.
315-321.
Grimwood, J. et al., 2004. The DNA sequence and biology of
human chromosome 19. Nature, Volume 428, pp. 529-535.
Hangaur, M. J., Vaughn, I. W. & McManus, M. T., 2013. Pervasive
Transcription of the Human Genome Produces Thousands of
Previously Unidentified Long Intergenic Noncoding RNAs. PLOS
Genetics, 9(6), pp. 1-13.
Hattori, M. et al., 2000. The DNA sequence of human chromosome
21. Nature, Volume 405, pp. 311-319.
Heilig, R. et al., 2003. The DNA sequence and analysis of human
chromosome 14. Nature, 421(6923), pp. 601-607.
Hellenthal, G. et al., 2014. A Genetic Atlas of Human Admixture
History. Science, 343(6172), pp. 747-751.
Hillier, L. D. et al., 2005. Generation and annotation of the DNA
sequences of human chromosomes 2 and 4. Nature, Volume 434,
pp. 724-731.
Hillier, L. W. et al., 2003. The DNA sequence of human
chromosome 7. Nature, 424(6945), pp. 157-164.
Hood, L., 2008. A personal journey of discovery: developing
technology and changing biology. Annual Review of Analytical
Chemistry, Volume 1, pp. 1-43.
Horvath, S., 2013. DNA methylation age of human tissues and cell
types. Genome Biology, 14(10), pp. 2-19.
Human Genome Program, 1990. Five-Year Plan Goes to Capitol
Hill. Human Genome News, 2(1).
Humphray, S. J. et al., 2004. DNA sequence and analysis of human
chromosome 9. Nature, 429(6990), pp. 369-374.
Hung, R. J. et al., 2008. A susceptibility locus for lung cancer maps
to nicotinic acetylcholine receptor subunit genes on 15q25.
Nature, Volume 452, pp. 633-637.
International HapMap 3 Consortium, 2010. Integrating common
and rare genetic variation in diverse human populations. Nature,
467(7311), pp. 52-58.
International SNP Map Working Group, 2001. A map of human
genome sequence variation containing 1.42 million single
nucleotide polymorphisms. Nature, 409(6822), pp. 928-933.
Jakobsson, M. et al., 2008. Genotype, haplotype and copy-number
variation in worldwide human populations. Nature, 451(7181),
pp. 998-1003.
Kidd, J. M. et al., 2008. Mapping and sequencing of structural
variation from eight human genomes. Nature, Volume 453, pp.
56-64.
Kim, M. S. et al., 2014. A draft map of the human proteome.
Nature, 509(7502), pp. 575-81.
Krauter, K. et al., 1995. A second-generation YAC contig map of
human chromosome 12. Nature, 377(4), pp. 321-333.
Lappalainen, T. et al., 2013. Transcriptome and genome
sequencing uncovers functional variation in humans. Nature,
501(7468), pp. 506-511.
Legrain, P. et al., 2011. The human proteome project: current
state and future direction. Molecular & Cellular Proteomics, 10(7).
Lennon, G., Auffray, C., Polymeropoulos, M. & Soares, M. B.,
1996. The I.M.A.G.E. Consortium: an integrated molecular analysis
of genomes and their expression. Genomics, 33(1), pp. 151-152.
Levy, S. et al., 2007. The Diploid Genome Sequence of an
Individual Human. PLOS Biology, 5(10), pp. 2113-2144.
Lister, R. et al., 2009. Human DNA methylomes at base resolution
show widespread epigenomic differences. Nature, Volume 462,
pp. 315-322.
Long, J. et al., 2012. Genome-Wide Association Study in East
Asians Identifies Novel Susceptibility Loci for Breast Cancer. PLOS
Genetics, 8(2), pp. 1-10.
MacArthur, D. G. et al., 2012. A systematic survey of loss-of-
function variants in human protein-coding genes. Science,
335(6070), pp. 823-828.
Manolio, T. A., Brooks, L. D. & Collins, F. S., 2008. A HapMap
harvest of insights into the genetics of common disease. The
Journal of Clinical Investigation, 118(5), pp. 1590-1605.
Martin, J. et al., 2004. The sequence and analysis of duplication-
rich human chromosome 16. Nature, 432(7020), pp. 988-994.
McCarroll, S. A. et al., 2006. Common deletion polymorphisms in
the human genome. Nature Genetics, 38(1), pp. 86-92.
Mungall, A. J. et al., 2003. The DNA sequence and analysis of
human chromosome 6. Nature, Volume 425, pp. 805-811.
Muzny, D. M. et al., 2006. The DNA sequence, annotation and
analysis of human chromosome 3. Nature, Volume 440, pp. 1194-
1198.
Nagaraja, R. et al., 1997. X chromosome map at 75-kb STS
resolution, revealing extremes of recombination and GC content.
Genome Research, 7(3), pp. 210-222.
National Human Genome Research Institute, 2003. All Goals
Achieved; New Vision for Genome Research Unveiled. [Online]
Available at: www.genome.gov/11006929
[Accessed 26 October 2014].
Nezos, A. et al., 2014. B-cell activating factor genetic variants in
lymphomagenesis associated with primary Sjögren's Syndrome.
Journal of Autoimmunity, Volume 51, pp. 89-98.
Nusbaum, C. et al., 2005. DNA sequence and analysis of human
chromosome 8. Nature, Volume 439, pp. 331-335.
Nusbaum, C. et al., 2005. DNA sequence and analysis of human
chromosome 18. Nature, Volume 437, pp. 551-555.
Pai, A. A. et al., 2011. A Genome-wide Study of DNA Methylation
Patterns and Gene Expression Levels in Multiple Human and
Chimpanzee Tissues. PLOS Genetics, 7(2), pp. 1-11.
Pearson, P. L., 1991. The genome data base (GDB)--a human gene
mapping repository. Nucleic Acids Research, Volume 19, pp. 2237-
2239.
Pelak, K. et al., 2010. The Characterization of Twenty Sequenced
Human Genomes. PLOS Genetics, 6(9), pp. 1-10.
Pennisi, E., 2012. ENCODE Project Writes Eulogy for Junk DNA.
Science, 337(6099), pp. 1159-1161.
Prufer, K. et al., 2013. The complete genome sequence of a
Neanderthal from the Altai Mountains. Nature, 505(7481), pp. 43-
49.
Quackenbush, J. et al., 1995. An STS content map of human
chromosome 11: localization of 910 YAC clones and 109 islands.
Genomics, 29(2), pp. 512-25.
Rasmussen, M. et al., 2014. The genome of a Late Pleistocene
human from a Clovis burial site in western Montana. Nature,
Volume 506, pp. 225-229.
Rasmussen, M. et al., 2010. Ancient human genome sequence of
an extinct Palaeo-Eskimo. Nature, Volume 463, pp. 757-762.
Robinson, M. R., Wray, N. R. & Visscher, P. M., 2014. Explaining
additional genetic variation in complex traits. Trends in Genetics,
30(4), pp. 124-132.
Ross, M. T. et al., 2005. The DNA sequence of the human X
chromosome. Nature, 434(7031), pp. 325-337.
Sachidanandam, R. et al., 2001. A map of human genome
sequence variation containing 1.42 million single nucleotide
polymorphisms. Nature, 409(6822), pp. 928-933.
Saini, H. K., Griffiths-Jones, S. & Enright, A. J., 2007. Genomic
analysis of human microRNA transcripts. Proceedings of the
National Academy of Sciences of the United States of America,
104(45), pp. 17719-17724.
Sankararaman, S. et al., 2014. The genomic landscape of
Neanderthal ancestry in present-day humans. Nature, 507(7492),
pp. 354-357.
Scherer, S. E. et al., 2005. The finished DNA sequence of human
chromosome 12. Nature, Volume 440, pp. 346-351.
Schmutz, J. et al., 2004. The DNA sequence and comparative
analysis of human chromosome 5. Nature, Volume 431, pp. 268-
274.
Schuster, S. C. et al., 2010. Complete Khoisan and Bantu genomes
from southern Africa. Nature, Volume 463, pp. 943-947.
Shastry, B. S., 2002. SNP alleles in human disease and evolution.
Journal of Human Genetics, 47(11), pp. 561-566.
Skaletsky, H. et al., 2003. The male-specific region of the human Y
chromosome is a mosaic of discrete sequence classes. Nature,
Volume 423, pp. 825-837.
Taylor, T. D. et al., 2006. Human chromosome 11 DNA sequence
and analysis including novel gene identification. Nature,
440(7083), pp. 497-500.
The 1000 Genomes Project Consortium, 2012. An integrated map
of genetic variation from 1,092 human genomes. Nature,
491(7422), pp. 56-65.
The ENCODE Project Consortium, 2004. The ENCODE
(ENCyclopedia Of DNA Elements) Project. Science, Volume 306,
pp. 636-640.
The ENCODE Project Consortium, 2007. Identification and analysis
of functional elements in 1% of the human genome by the
ENCODE pilot project. Nature, Volume 447, pp. 799-816.
The International Genome Sequencing Consortium, 2004.
Finishing the euchromatic sequence of the human genome.
Nature, Volume 431, pp. 931-945.
The International HapMap Consortium, 2003. The International
HapMap Project. Nature, Volume 426, pp. 789-796.
The International HapMap Consortium, 2005. A haplotype map of
the human genome. Nature, Volume 437, pp. 1299-1320.
The International HapMap Consortium, 2007. A second
generation human haplotype map of over 3.1 million SNPs. Nature,
Volume 449, pp. 851-861.
The White House, 2000. PRESIDENT CLINTON ANNOUNCES THE
COMPLETION OF THE FIRST SURVEY OF THE ENTIRE HUMAN
GENOME Hails Public and Private Efforts Leading to This Historic
Achievement, Washington DC: The White House Briefing Room.
Thorisson, G. A. & Stein, L. D., 2003. The SNP Consortium website:
past, present and future. Nucleic Acids Research, 31(1), pp. 124-
127.
Vattikuti, S., Guo, J. & Chow, C. C., 2012. Heritability and Genetic
Correlations Explained by Common SNPs for Metabolic Syndrome
Traits. PLOS Genetics, 8(3), pp. 1-8.
Venter, J. C. et al., 2001. The Sequence of the Human Genome.
Science, 291(5507), pp. 1304-1351.
Zody, M. C. et al., 2006. DNA sequence of human chromosome 17
and analysis of rearrangement in the human lineage. Nature,
Volume 440, pp. 1045-1049.
Zody, M. C. et al., 2006. Analysis of the DNA sequence and
duplication history of human chromosome 15. Nature, Volume
440, pp. 671-675.