2. FLOW OF PRESENTATION
1. Genomics
2. Types of genomics
3. History of genomics
4. Rice genomics
5. Pre-genome sequencing era in rice
6. Whole genome sequencing of rice
7. Post-genome sequencing era in rice
8. Applications
9. Case study
10. Conclusion
3. GENOMICS
(The word "genomics" was coined by Thomas Roderick in 1986.)
(Study of the structure and function of the entire genome of a living organism)
Structural Genomics
(Study of the structure of the entire genome of an organism)
Functional Genomics
(Study of the function of the entire genome of an organism)
Comparative Genomics
(Study of the relationship of genome structure and function across
different biological species or strains)
4. • 1980 – DNA markers (RFLP)
• 1983 – Kary Mullis invented the PCR technique; several PCR-based
markers were subsequently developed, e.g. RAPD, AFLP, SSR, SNP, CAPS,
STS, SCAR, EST, DFP, etc.
• 1986 – Leroy Hood and Lloyd Smith developed the first semi-automatic
DNA sequencer
• 1990 – Development of pyrosequencing by Pål Nyrén
• 1990 – The U.S. National Institutes of Health (NIH) begins
large-scale sequencing trials on Mycoplasma, Escherichia coli,
Caenorhabditis elegans, and Saccharomyces cerevisiae
• 1995 – Craig Venter and Hamilton Smith at The Institute for
Genomic Research (TIGR) publish the first complete genome
of a free-living organism, the bacterium Haemophilus
influenzae
• 1996 – Sequence of the Saccharomyces cerevisiae genome completed.
5. • 1998 – The genome of the first multicellular organism, the
roundworm Caenorhabditis elegans, was completed.
• 1999 – Sequence of the first human chromosome (chromosome 22)
• 2000 – Arabidopsis thaliana became the first plant to be completely
sequenced.
• 2001 – A draft sequence of the human genome is published.
• 2002 – Draft sequences of the rice genome were completed
• 2003 – Human genome sequencing was completed
• 2004 – 454 Life Sciences markets a pyrosequencing-based machine.
‘454 Sequencing’ was used for barley and many other genomes.
The first version of the machine reduced sequencing costs 6-fold
compared to automated Sanger sequencing methods.
Recently,
• 2012 – Wheat genome was sequenced – R. Brenchley et al.
• 2012 – Pigeonpea genome was sequenced – RKV et al.
• 2013 – Chickpea genome was sequenced – RKV et al.
Nature Review, 2010 & Nature Biotechnology, 2001-2013
7. Rice is a model cereal plant
• The small size of its genome (430 Mb)
• Its relatively short generation time
• Its relative genetic simplicity (it is diploid, or has two copies of
each chromosome).
• Easy to transform genetically.
• Belongs to the grass family
• It has the greatest biodiversity among the cereal crops
8. Developmental Milestones in Rice Genomics
Development of the first saturated (RFLP) map.
The application of PCR based markers such as SSR markers.
Identification of QTLs for many agronomically important traits
and marker assisted breeding.
Development of efficient techniques for genetic transformation
which makes rice the easiest cereal to transform.
Complete sequencing and annotation of indica and japonica rice
genomes and development of new generation markers.
Synteny between genomes of rice and other cereals.
9. Pre-genome sequencing era in rice
Markers in use in the early to mid 1990s:
RFLP and RAPD
Sequence Tagged Sites (STS) markers
Simple Sequence Repeats (SSR)
10. The first rice SSRs were reported in 1996 (O. Panaud et al.).
By 1997, 121 validated SSRs had been reported, but they had limited
use for MAS due to limited genome coverage.
By 2001, a total of ~500 SSRs had been developed from 57.8 Mb of
available rice genome data, which further increased the utility of
these markers.
12. Institutes and the chromosomes they sequenced
Sr no. | Rice sequencing participant | Chromosomes
1 | Rice Genome Research Program (RGP), Japan | 1, 6, 7, 8
2 | Korea Rice Genome Research Program, Korea | 1
3 | CCW, US: CUGI (Clemson University), Cold Spring Harbor Laboratory | 3, 10
4 | TIGR, US | 3, 10
5 | PGIR, US | 10
6 | University of Wisconsin, US | 11
7 | National Center of Gene Research, Chinese Academy of Sciences, China | 4
8 | Indian Rice Genome Program, University of Delhi, India | 11
9 | Academia Sinica Plant Genomic Center, Taiwan | 5
10 | Universidade Federal de Pelotas, Brazil | 12
11 | Kasetsart University, Thailand | 9
12 | McGill University, Canada | 9
13 | John Innes Centre, UK | 2
13. Milestones in rice genome sequencing
1) Sept 1997 – Sequencing of the rice genome was initiated as an
international collaboration among 10 countries.
2) Feb 1998 – IRGSP launched under the coordination of RGP.
3) April 2000 – Monsanto Co. produced a draft sequence from BACs
covering 260 Mb of the rice genome; 95% of rice genes were identified.
4) Feb 2001 – Syngenta produced a draft sequence with 99.8% sequence
accuracy and identified 32,000 to 50,000 genes, representing 99% of
the rice genes.
5) Dec 2002 – IRGSP finished a high-quality draft sequence
(clone-by-clone approach) with a sequence length, excluding overlaps,
of 366 Mb, corresponding to ~92% of the rice genome.
6) Dec 2004 – IRGSP produced the high-quality sequence of the entire
rice genome, with 99.99% accuracy and without any sequence gap.
14. India's work on the rice genome sequence
India joined the IRGSP in June 2000 and chose to sequence a part of chromosome 11.
India invested Rs. 48.83 crores in the "Indian Initiative for Rice Genome
Sequencing (IIRGS)".
The initiative is a joint effort by the Department of Plant Molecular Biology (DPMB),
University of Delhi South Campus (UDSC), and the National Research Centre on Plant
Biotechnology (NRCPB) at the Indian Agricultural Research Institute (IARI), New
Delhi.
Findings:
Chromosome 11 was known to carry several disease resistance genes, including the
Xac1 bacterial blight resistance gene.
The chromosome segment sequenced by IARI comprised ~6.825 million bp, in which 1005
genes with unknown function were predicted.
15. IRGSP
The IRGSP effort revolved around a few basic points:
The sequencing strategy.
The rice cultivar to be sequenced.
The accuracy of sequence and the sequence release policy.
16. Why Nipponbare?
The Rice Genome Research Program (RGP), Japan, had used it as a source
for EST sequencing and had constructed a dense linkage map and a YAC
physical map.
The guidelines for the method of sequencing, sequence quality and
release policy were developed largely along the same lines as the
Human Genome Project.
17. Backbone of IRGSP
A sequence-ready physical map was developed from a PAC library
comprising 71,040 clones and a BAC library consisting of 48,960
clones.
A BAC library (~90,000 clones) made at the Clemson University
Genomics Institute (CUGI) and BAC libraries made by Monsanto were
also used.
For each clone, two libraries were made, with insert sizes of 2 kb and
5 kb, respectively.
18. The IRGSP had set a target of finishing the rice genome sequence
by 2008. This goal changed when Monsanto released its draft
sequence of ‘japonica’ in 2000.
Two other groups, Syngenta and BGI, published drafts of
‘japonica’ and ‘indica’, respectively, simultaneously in 2002.
The consortium released its draft sequence at a meeting held
in Japan in December 2002. This task was sped up by Monsanto’s
decision to provide its BAC libraries, sequenced up to 5X coverage, to
the IRGSP.
20. Clone-by-clone sequencing
Clone-by-clone sequencing is also called directed sequencing of the BAC contigs.
The chromosomes were mapped.
They were then split up into sections.
A rough map was drawn for each of these sections.
The sections themselves were then split into smaller bits.
Each of these smaller bits was sequenced.
(* BAC clones (80-100 kb long DNA fragments) arranged in contigs.)
21. THE HIERARCHICAL SHOTGUN SEQUENCING METHOD
In this approach, genomic DNA is cut into pieces.
The pieces are inserted into BAC vectors.
The vectors are transformed into E. coli, where they are replicated.
The BAC inserts are isolated.
They are mapped to determine the order of each cloned fragment; this
ordered set is referred to as the Tiling Path.
Each BAC fragment in the Golden Path is fragmented randomly into smaller pieces.
Each piece is cloned into a plasmid.
It is sequenced on both strands.
These sequences are aligned so that identical sequences overlap.
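The final alignment step can be illustrated with a toy example. The sketch below is a minimal greedy overlap merge, assuming error-free reads from a single strand; the function names `overlap` and `greedy_assemble` are hypothetical, and this is not the assembly software used by the IRGSP.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b.

    Overlaps shorter than min_len are ignored and reported as 0.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_len, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)
```

Calling `greedy_assemble(["ATGGCGTA", "GCGTACGT", "ACGTTAGC"])` joins the three overlapping reads into one contig. Real shotgun data also needs handling of sequencing errors, both strands and repeats, which this sketch leaves out.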
22. Rice Genome Annotation
The accuracy of genome sequence should be evaluated by the
quality of annotation, i.e. assignment of biological function to the
sequence.
Gene modeling for a given sequence using gene prediction and
similarity search programmes facilitates gene discovery in a
systematic and comprehensive manner.
23. Rice GAAS (Rice Genome Automated Annotation System) has been
developed by combining
Coding region prediction programmes
Splice site prediction programmes (Sakata et al., 2002)
tRNA gene prediction programme
Similarity search analysis programmes
Although the interpretation of the coding region is fully automated,
gene modelling is completed with manual evaluation.
24. What does rice genome sequencing reveal?
The map-based sequence covered 95% of the 389 Mb rice
genome.
A total of 37,544 genes were predicted, with an average gene density
of one gene per 9.9 kb and an average gene length of 2,699 bp.
Chromosomes 1 and 3 have the highest gene density.
Chromosomes 11 and 12 have the lowest gene density.
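The gene density figure can be checked with simple arithmetic. This is a sketch using only the numbers on this slide; the function name is hypothetical, and the sequenced length is approximated as 95% of 389 Mb, which is an assumption because the slide does not state it.

```python
GENOME_SIZE_BP = 389_000_000   # estimated rice genome size (from the slide)
COVERAGE = 0.95                # fraction covered by the map-based sequence
GENE_COUNT = 37_544            # predicted genes
MEAN_GENE_LENGTH_BP = 2_699    # average gene length


def gene_density_kb(genome_bp: int, coverage: float, genes: int) -> float:
    """Average kilobases of sequence per predicted gene."""
    return genome_bp * coverage / genes / 1000


def genic_fraction(genome_bp: int, coverage: float, genes: int,
                   mean_gene_bp: int) -> float:
    """Fraction of the sequenced genome occupied by predicted genes."""
    return genes * mean_gene_bp / (genome_bp * coverage)


density = gene_density_kb(GENOME_SIZE_BP, COVERAGE, GENE_COUNT)
fraction = genic_fraction(GENOME_SIZE_BP, COVERAGE, GENE_COUNT,
                          MEAN_GENE_LENGTH_BP)
print(f"one gene per {density:.1f} kb; genes span {fraction:.0%} of the sequence")
```

Under this approximation the density comes out at about 9.8 kb per gene, close to the quoted 9.9 kb; the small difference presumably reflects the exact sequenced length.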
25. Transposable element content was highest for chromosomes 8 (38%) and
12 (38.3%) and lowest for chromosomes 1 (31%), 2 (29.8%) and 3 (29%).
The genome contains at least 35% repeat elements.
The japonica genome sequence showed that almost 60% of the genome is
duplicated.
There are 421 chloroplast and 909 mitochondrial DNA insertions,
contributing ~0.2% each of the nuclear genome.
The GC content is 43.6% overall, with 54.2% in exons and 38.3% in
introns.
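GC content is a simple base count. A minimal sketch of how such a percentage is computed for any sequence, ignoring ambiguous bases such as N (the function name is hypothetical):

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)
```

Applied separately to exon and intron sequences, the same function would give the kind of exon versus intron contrast quoted above.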
26. Post-genome sequencing era in rice
The post-genome sequencing era opened a “treasure chest” of new
rice markers. They are:
SSRs
SNPs
Indels
Custom-made markers
27. SSRs:
More than 2200 validated SSRs were released in 2002.
18,828 class 1 SSRs were released after the completion of the
Nipponbare genome sequence in 2005.
This corresponds to an extremely high density of SSRs (approx. 51 SSRs
per Mb).
Around 24,000 SSR markers are now available for rice in the
database.
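Mining SSRs from a finished sequence is essentially a pattern search. The sketch below uses a regular expression to find perfect di-, tri- and tetranucleotide repeats; the 20 bp minimum length is an assumption about what "class 1" means and is not stated on the slide, and `find_ssrs` is a hypothetical name rather than the tool used for the rice SSR sets.

```python
import re


def find_ssrs(seq: str, min_total_len: int = 20,
              unit_sizes: tuple[int, ...] = (2, 3, 4)) -> list[tuple[int, str, int]]:
    """Return (start, motif, repeat_count) for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for k in unit_sizes:
        min_repeats = -(-min_total_len // k)  # ceiling division
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit, e.g. "AA" or "GAGA".
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)
```

For a test sequence such as `"TTCC" + "GA" * 12 + "CCGA" + "CTT" * 7 + "AAGC"` the function reports a (GA)12 repeat and a (CTT)7 repeat. Primers for a marker would then be designed in the flanking sequence.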
28. Single nucleotide polymorphisms (SNPs)
They are the most abundant and ubiquitous polymorphisms.
Lower levels of SNP marker polymorphism are detected within
subspecies (indica × indica or japonica × japonica derived material)
than between the japonica and indica reference genotypes.
The frequency of SNPs between subspecies was 0.68% to 0.70%,
whereas it was 0.03% to 0.05% between japonica cultivars and
0.49% between indica cultivars.
Currently, the total number of collected SNPs is 23,458,338 in 17
accessions/cultivars.
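An SNP frequency of this kind is the percentage of compared positions that differ between two aligned sequences. A minimal sketch, assuming the two sequences are already aligned to equal length and that gaps or ambiguous bases are simply skipped (the function name is hypothetical):

```python
def snp_frequency(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned A/C/G/T positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":  # skip gaps and ambiguous bases
            compared += 1
            if a != b:
                mismatches += 1
    return 100.0 * mismatches / compared if compared else 0.0
```

On this scale, the 0.68% to 0.70% quoted between subspecies corresponds to roughly 7 differences per 1,000 aligned bases.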
29. Indels (insertions/deletions):
Identified in silico by direct comparison of japonica and indica
genome sequences.
A large number of indels were reported in indica × japonica populations.
Introns “tolerate” insertion/deletion mutations better than exons.
Many indels have been identified and exploited through the
development of Intron Length Polymorphic (ILP) markers.
The majority of them are reliable, co-dominant and
polymorphic between varieties within both subspecies.
30. “Custom-made” markers
Genome sequence information permits the development of
markers that are tightly linked to target loci, i.e. “custom-made” or
“tailor-made” markers.
A number of such markers have been generated using the rice genome
sequence.
Candidate gene (CG) identification can be integrated with
customized marker design; the resulting markers are usually more
tightly linked to the gene or QTL controlling the trait.
31. Comparison of Rice with cereal genomes
Analysis of the draft rice genome sequence showed that homologues
of almost 98% of wheat, barley and maize proteins could be
identified in rice.
32. Wheat–rice synteny analysis using 4,485 wheat ESTs revealed
a general conservation of genes and their order in the
two species.
Comparison of rice and maize revealed 656 putative orthologs with
several breaks in co-linearity.
Similar sequence-based alignments of rice with sorghum and
barley revealed some rearrangements along with a
general conservation of synteny.
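A break in co-linearity can be made concrete by comparing gene order. The sketch below is one simple way to count breaks, assuming each ortholog pair is given as a unique position in rice and a unique position in the other cereal; it is an illustration, not the method used in the studies cited.

```python
def colinearity_breaks(orthologs: list[tuple[int, int]]) -> int:
    """Count adjacent rice genes whose orthologs are not neighbours.

    orthologs holds (rice_position, other_species_position) pairs.
    Inverted blocks stay co-linear internally; only their edges count.
    """
    by_rice = sorted(orthologs)
    other_sorted = sorted(pos for _, pos in orthologs)
    rank = {pos: r for r, pos in enumerate(other_sorted)}
    ranks = [rank[pos] for _, pos in by_rice]
    return sum(1 for x, y in zip(ranks, ranks[1:]) if abs(y - x) != 1)
```

For six genes whose order in the other species is 1, 2, 5, 4, 3, 6, the function returns 2: one break at each end of the inverted block.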
33. Comparative genomics based on the syntenic relationship of rice
with other cereals has helped in studying loci in those species, such as:
QTL for malting quality in barley
Major heading date QTL in perennial ryegrass
Liguleless region in sorghum
Ror2, a gene for resistance to powdery mildew disease in barley
34. The 3K Rice Genome project
Rice is known for tremendous within-species and within-genus
genetic diversity.
Exploring this diversity at the sequence level has been, until
recently, only a dream of rice scientists.
“The 3,000 (3K) Rice Genomes Project” is the answer to it.
35. Joint organizers of the project:
1. The Chinese Academy of Agricultural Sciences (CAAS),
2. The Beijing Genomics Institute (BGI)
3. The International Rice Research Institute (IRRI)
It is a major step towards revealing the genomic diversity in all of
the world’s rice germplasm collections.
36. Current status and plans
Sequencing of 3,000 rice genomes has been completed.
The set contains:
A diverse set of accessions originating from 89 countries.
Accessions from the ~180,000 rice accessions conserved in the
International Rice Genebank Collection (IRGC) at IRRI and the
China National Crop Genebank (CNCG).
400 parental lines of popular varieties and genome-wide
introgression lines for multiple complex traits.
37. Outcomes of the 3K Rice Genomes Project
1. New population-specific genotyping arrays useful for a wide
range of genetic and breeding applications.
2. Insight into population structures that have been shaped by
evolution, domestication and selection.
3. Identification of unique cryptic structural genomic variants
across the rice genome.
38. Sequencing-based GWAS in rice
It allows efficient detection of the genetic diversity of germplasm
for mapping of agronomically important traits.
GWAS in rice showed that the integrated approach of sequence-
based GWAS and functional genome annotation can be used as a
complementary strategy to classical biparental cross mapping for
dissecting complex traits in rice.
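The core idea of a GWAS scan can be sketched as a per-SNP test of whether the two allele groups differ in trait value. The example below is a deliberately simplified single-marker scan using Welch's t statistic on biallelic genotypes coded 0/1, as for inbred lines. Published rice GWAS use mixed models that correct for population structure, which this sketch does not, and all names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a: list[float], group_b: list[float]) -> float:
    """Welch's t statistic for the difference between two group means."""
    se = sqrt(variance(group_a) / len(group_a)
              + variance(group_b) / len(group_b))
    diff = mean(group_a) - mean(group_b)
    if se == 0.0:
        return 0.0 if diff == 0 else float("inf")
    return diff / se


def single_marker_scan(genotypes: dict[str, list[int]],
                       phenotypes: list[float]) -> list[tuple[str, float]]:
    """Rank SNPs by |t| between accessions carrying allele 0 and allele 1."""
    results = {}
    for snp, alleles in genotypes.items():
        ref = [y for g, y in zip(alleles, phenotypes) if g == 0]
        alt = [y for g, y in zip(alleles, phenotypes) if g == 1]
        if len(ref) < 2 or len(alt) < 2:
            continue  # too few accessions in one allele class to test
        results[snp] = abs(welch_t(ref, alt))
    return sorted(results.items(), key=lambda kv: kv[1], reverse=True)
```

SNPs at the top of the ranking are candidates for association; combining them with functional annotation of nearby genes is the integrated approach described above.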
39. Rice breeding in the post-genomics era
Extensive efforts in the 1960s achieved the ‘Green Revolution’ (GR),
which doubled rice productivity under modern high-input agricultural
conditions.
The successful commercialization of hybrid rice in China since the late
1970s resulted in a second leap in rice productivity.
However, world rice production has to be doubled again by 2030 to
meet the projected demand of the increasing world population, and much of
this increase has to come from improved rice cultivars.
40. ‘Super inbred and hybrid rice’ cultivars have been produced by:
‘Ideotype’ breeding.
Exploiting inter-subspecific heterosis.
But,
‘super rice’ or ‘super hybrid rice’ cultivars require very high inputs
to realize their yield potentials,
resulting in serious environmental pollution and related
problems.
41. Modern semi-dwarf rice cultivars have rarely achieved their yield
potentials in farmers’ fields because of many abiotic and biotic
stresses.
To achieve sustainable yield increases in rice, there has been a call
for developing ‘Green Super Rice’ (GSR) cultivars that can
produce high and stable yields with fewer inputs.
In addition, high iron and zinc contents have become important
objectives in many breeding programs.
42. Future rice breeding will require breeders to improve many
‘green’ traits in addition to high yield potential and desirable
quality.
This can be achieved by:
Improving conventional breeding methodologies with
high-throughput genomic techniques.
44. In conventional breeding, the presence of genes can normally be
identified only when they are expressed.
45. With genomic research, we can now easily identify the presence
or absence of a gene at an early stage.
46. 1. Genotype identity testing
For simple F1 hybrids
Seed purity or intra-variety variation
Hybrid rice lines
SSRs from mitochondrial genes have been targeted for the
development of markers to study maternally inherited traits such as
cytoplasmic male sterility or the maternal origin of rice accessions.
47. 2. Genetic diversity analysis of breeding material
Hybrid rice breeding
3. Gene surveys in parental material
An example of this was demonstrated by Wang et al. (2007), who
used a set of dominant allele-specific markers to detect the
presence of the Pi-ta resistance gene for rice blast in a large
germplasm collection.
48. Some important genes tagged using molecular markers
Trait | Gene | Markers | Chromosome
Blast resistance | Pi-1 | RZ536, RG303, NpB181 | 11
Blast resistance | Pi-2 | RG64, XNpb294 | 6
Bacterial blight resistance | Xa-1 | XNpb235, XNpb120 | 4
Bacterial blight resistance | Xa-21 | Y03700 | 4
Gall midge | Gm(2) | RG329, RG476 | 4
etc.
49. 4. Marker-assisted backcrossing (MABC)
5. Pyramiding
6. Use with transgenes: for example, transgenic rice (southern
U.S. japonica-type varieties) with the inherent ability to produce
beta-carotene, developed by Syngenta, is available at IRRI and in
several other national programs.
GR1 events (GR1-146, GR1-309, and GR1-652) were used as donor
parents, while 2 IRRI-bred mega varieties (IR64 and IR36) and a
popular Bangladeshi variety (BR29) were used as recurrent
parents.
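The reason for backcrossing to a recurrent parent can be shown with the standard expectation for genome recovery: without any selection, the average proportion of recurrent parent genome after n backcrosses is 1 - (1/2)^(n+1). The sketch below computes that expectation; the function name is hypothetical, and the figures are population averages, not results from the crosses named above.

```python
def expected_recurrent_fraction(backcross_generation: int) -> float:
    """Expected recurrent parent genome fraction in generation BCn.

    Generation 0 is the F1, which carries half of each parent's genome.
    """
    if backcross_generation < 0:
        raise ValueError("backcross generation must be 0 or greater")
    return 1.0 - 0.5 ** (backcross_generation + 1)


for n in range(1, 5):
    print(f"BC{n}: {expected_recurrent_fraction(n):.2%}")
```

In MABC, background markers are used to pick individuals that exceed this average, so the recurrent parent genome can be recovered in fewer generations than by backcrossing alone.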
56. Large-scale rice molecular breeding: an example
A strategy has been pursued by IRRI and China since 1998 for developing
GSR cultivars.
This strategy contains two major, well-integrated components in
three steps:
1) Developing trait-specific introgression lines (ILs).
2) Large-scale gene/QTL discovery and allele mining.
3) Developing GSR cultivars with multiple green traits by designed
QTL pyramiding (DQP) or by marker-assisted recurrent
selection (MARS).
57. Step I
Large-scale backcross breeding activities were conducted using the 25
best commercial varieties and hybrid parents as recipients and
203 mini-core germplasm accessions from around the world as donors.
Advanced backcross populations developed from these crosses were
screened and progeny tested for a wide range of abiotic and
biotic stresses, resulting in the development of multiple sets of
trait-specific ILs.
58. Results obtained from the massive introgression breeding activities
showed that:
1. There are tremendous amounts of useful genetic diversity in the gene
pool of O. sativa for all complex target traits, hidden in the exotic
germplasm accessions.
2. Backcross breeding with strong phenotypic selection is a powerful way
to exploit this rich source of hidden genetic diversity.
3. Selection of parental lines for breeding based on target phenotype(s),
as practiced by most breeders, is a poor way of exploiting this hidden
genetic variation for complex traits.
59. Step II
Selected ILs are progeny tested in replicated experiments
for the selected target traits and important non-target traits, along
with genotyping to detect QTL and QTL networks.
The genetic information generated is used for characterizing
genome-wide responses to strong phenotypic selection in the
ILs.
60. Step III
Superior ILs carrying favorable alleles from different donors are
selected, based on the accurate genetic information generated in Step II,
to cross with one another.
Segregating populations from these crosses are subjected to
strong phenotypic selection and/or GS for developing new
cultivars.
Selected progeny are characterized for target and non-target
traits in genotyping and phenotyping experiments to verify loci for
target traits identified in Step II and to remove undesirable genetic
drag.
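Choosing which ILs to cross in Step III can be viewed as picking parents whose favorable alleles complement each other. The sketch below is a toy illustration of that idea for a single cross. The IL names, QTL names and the function are hypothetical, and real designed QTL pyramiding also weighs effect sizes, linkage and genetic background.

```python
from itertools import combinations


def best_pyramiding_cross(il_alleles: dict[str, set[str]],
                          targets: set[str]) -> tuple[str, str, set[str]]:
    """Pick the pair of ILs that together carry the most target QTL alleles."""
    best = None
    for a, b in combinations(sorted(il_alleles), 2):
        covered = (il_alleles[a] | il_alleles[b]) & targets
        if best is None or len(covered) > len(best[2]):
            best = (a, b, covered)
    if best is None:
        raise ValueError("need at least two introgression lines")
    return best


# Hypothetical ILs and the favorable QTL alleles detected in each (Step II).
ils = {
    "IL1": {"qDT1", "qSub1"},
    "IL2": {"qSal1"},
    "IL3": {"qDT1", "qSal1", "qBL6"},
}
wanted = {"qDT1", "qSub1", "qSal1", "qBL6"}
parent_a, parent_b, covered = best_pyramiding_cross(ils, wanted)
```

With these made-up inputs the cross IL1 × IL3 is selected, because its progeny can combine all four target alleles.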
62. Conclusion
The recent integration of advances in molecular biology,
genomic research, transgenic breeding and molecular marker
applications with conventional plant breeding practices has created
the foundation for molecular plant breeding or ‘precision’
breeding.
Rice genomics can play a significant role in enhancing the
quantity and quality of rice production in order to feed more of the
world’s population.