Genomics: Organization of Genome, Strategies of
Genome Sequencing, Model Plant Genome Project,
Functional Analysis of Genes
Promila Sheoran
PhD Biotechnology
GJU S&T Hisar
Genome Organization
• The word “genome,” coined by German botanist Hans Winkler
in 1920, combines “gene” with the final syllable of
“chromosome.”
• Unless otherwise specified, “genome” usually refers to the
nuclear genome.
•An organism’s genome is defined as the complete haploid
genetic complement of a typical cell.
• The genetic content of the organelles in the cell is not
considered part of the nuclear genome.
• In diploid organisms, sequence variations exist between the
two copies of each chromosome present in a cell.
•The genome is the ultimate source of information about an
organism.
Continue…
•The number of genomes sequenced in their entirety is
now in the thousands and includes organisms ranging from
bacteria to mammals.
•The first complete genome to be sequenced was that of
the bacterium Haemophilus influenzae, in 1995.
•The first eukaryotic genome sequence, that of the yeast
Saccharomyces cerevisiae, followed in 1996.
• The genome sequence of the bacterium Escherichia coli
became available in 1997.
Hierarchy of gene organization
Gene – single unit of genetic function
Operon – genes transcribed in a single transcript
Regulon – genes controlled by the same regulator
Modulon – genes modulated by the same stimulus
Element – plasmid, chromosome, phage
Genome
(listed in order of ascending complexity)
Prokaryotic and eukaryotic genomes
Prokaryotes | Eukaryotes
Single cell | Single or multiple cells
No nucleus | Nucleus
One piece of circular DNA | Chromosomes
No post-transcriptional modification of mRNA | Exon/intron splicing
Prokaryotic Genome Organization
Prokaryotes
• The genome of E. coli contains about 4 × 10^6 base pairs
• About 90% of the DNA encodes protein
• Lacks a membrane-bound nucleus
• Circular DNA organized into supercoiled domains
• Histones not present
o Prokaryotic genomes generally contain
one large circular piece of DNA
referred to as a "chromosome" (not a
true chromosome in the eukaryotic
sense).
o Some bacteria have linear
"chromosomes".
o Many bacteria have small circular DNA
structures called plasmids which can
be swapped between neighbors and
across bacterial species.
Continue…
o The term plasmid was first introduced
by the American molecular biologist
Joshua Lederberg in 1952.
o A plasmid is separate from, and can
replicate independently of, the
chromosomal DNA.
o Plasmid size varies from 1 to over 1,000 kbp.
Plasmid
Eukaryotic genome organization
More about the nuclear genome:
• Multiple linear chromosomes, 5000 to 50000 genes
• Mono-cistronic transcription units
• Discontinuous coding regions (introns and exons)
• Large amounts of non-coding DNA
• Transcription and translation take place in different compartments
• Variety of RNA genes: rRNA, tRNA, snRNA (small nuclear), snoRNA (small
nucleolar), microRNAs, etc.
• Often diploid genomes and obligatory sexual reproduction
• Standard mechanism of recombination: meiosis
• Multiple genomes: nuclear, mitochondrial, and plastid (chloroplast)
• Plastid genomes resemble prokaryotic genomes
EUKARYOTIC GENOME
‘The nucleus is the heart of the cell and serves as the main distinguishing
feature of eukaryotic cells. It is an organelle submerged in a sea of
turbulent cytoplasm and holds the genetic information encoding the past history
and future prospects of the cell. The nucleus contains many thread-like coiled
structures, suspended in the nucleoplasm, which are known as chromatin.’
Chromatin is the complex of DNA and proteins that makes up chromosomes.
The major proteins in chromatin are histones, although many other chromosomal
proteins have prominent roles too.
The functions of chromatin are to package DNA into a smaller volume to fit in
the cell, to strengthen the DNA to allow mitosis and meiosis, and to serve as a
mechanism to control gene expression and DNA replication.
ORGANIZATION OF CHROMATIN
In resting, non-dividing eukaryotic cells, the genome is in the form
of a nucleoprotein complex, the chromatin
(randomly dispersed in the nuclear matrix as an interwoven network of fine chromatin threads).
The information stored in DNA is organized, replicated and read with the help
of a variety of DNA-binding proteins:
Structural proteins, histones (packing proteins):
Main structural proteins found in eukaryotic cells
Low-molecular-weight basic proteins with a high proportion of
positively charged amino acids
Bound to DNA along most of its length
The positive charge helps histones bind to DNA and play a
crucial role in packing long DNA molecules
Functional proteins, non-histones:
Associated with gene regulation and other functions of chromatin
Hierarchy of Chromatin Organization in the Cell Nucleus:
Nuclear Matrix Associated Chromatin Loops
Next Generation Sequencing
• DNA sequencing is the process of determining the precise
order of nucleotides within a DNA molecule
• Next-generation sequencing (NGS) refers to non-Sanger-based,
high-throughput DNA sequencing technologies
ILLUMINA SEQUENCING
• Step 1: Sample Preparation
• Steps 2-6: Cluster Generation by Bridge Amplification
• Steps 7-12: Sequencing by Synthesis
1. Solid-phase amplification can produce 100-200 million spatially separated
clusters, providing free ends to which a universal sequencing primer can be
hybridized to initiate the NGS reaction
454 Sequencing
• Emulsion-based sample preparation (emPCR)
• Pyrosequencing: a non-electrophoretic bioluminescence
method that measures the release of inorganic
pyrophosphate by proportionally converting it into visible light
using a series of enzymatic reactions
Step 1:
Step 2: Loading DNA Sample onto
Beads
Step 3: Sequencing
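Because the light signal in pyrosequencing is proportional to the pyrophosphate released, the signal from each nucleotide flow indicates how many bases were incorporated during that flow. A minimal Python sketch of turning such signals into a sequence is given here. The cyclic flow order `TACG`, the normalized signal values, and the function name `decode_flowgram` are illustrative assumptions, not details taken from the slides.

```python
def decode_flowgram(signals, flow_order="TACG"):
    """Convert normalized light signals into a base sequence.

    Each signal belongs to one nucleotide flow; its rounded value is taken
    as the number of identical bases incorporated in that flow (assumption).
    """
    bases = []
    for i, signal in enumerate(signals):
        base = flow_order[i % len(flow_order)]  # nucleotide flowed at step i
        count = max(0, round(signal))           # ~1.0 per incorporated base
        bases.append(base * count)
    return "".join(bases)


# Hypothetical signals for two cycles of T, A, C, G flows
example_signals = [1.02, 0.05, 1.9, 0.1, 0.0, 2.1, 0.0, 0.95]
decoded = decode_flowgram(example_signals)
print(decoded)  # TCCAAG
```

Rounding becomes unreliable for long homopolymer runs, where neighbouring signal levels are hard to tell apart, so this simple rule is only a teaching aid.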
Sequence Assembly
• Sequence assembly refers to aligning and merging fragments
of a much longer DNA sequence in order to reconstruct the
original sequence.
• First sequence assemblers began to appear in the late 1980s
and early 1990s
Why We Need genome assemblers
• Terabytes of sequencing data which need processing
on computing clusters
• Identical and nearly identical sequences increase the time and
space complexity of algorithms exponentially;
• Errors in the fragments from the sequencing instruments
Basic Principles Of Assembly
• Sequence and quality data are read and the reads are
cleaned.
• Overlaps are detected between reads. False overlaps,
duplicate reads, chimeric reads and reads with self-matches
are also identified
• The reads are grouped to form a contig layout of the finished
sequence.
• A multiple sequence alignment of the reads is performed,
and a consensus sequence is constructed for each contig
layout
• Possible sites of mis-assembly are identified by combining
manual inspection with quality value validation.
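The overlap detection and contig layout steps above can be sketched in Python as a greedy overlap-and-merge loop. This is a toy illustration that assumes error-free reads from a single strand; the function names and the `min_len` threshold are hypothetical, and real assemblers also use quality values, screen for chimeric reads, and build the consensus from a multiple alignment.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if no overlap of at least min_len bases exists.
    """
    start = 0
    while True:
        start = a.find(b[:min_len], start)  # candidate start of the overlap
        if start == -1:
            return 0
        if b.startswith(a[start:]):         # suffix of a matches prefix of b
            return len(a) - start
        start += 1


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    contigs = list(dict.fromkeys(reads))    # drop duplicate reads, keep order
    while True:
        best_len, best_a, best_b = 0, None, None
        for a in contigs:
            for b in contigs:
                if a is not b:
                    olen = overlap(a, b, min_len)
                    if olen > best_len:
                        best_len, best_a, best_b = olen, a, b
        if best_len == 0:
            return contigs                  # nothing left to merge
        contigs.remove(best_a)
        contigs.remove(best_b)
        contigs.append(best_a + best_b[best_len:])


reads = ["ATGCGT", "CGTACC", "ACCGGA", "CGTACC"]
contigs = greedy_assemble(reads)
print(contigs)  # ['ATGCGTACCGGA']
```

The greedy choice is easy to follow but can mis-join reads across repeats, which is one reason sites of possible mis-assembly are checked afterwards.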
Types of sequence assembly
• Mapping assembly
• De novo assembly
Mapping Assembly
• Assembles reads against an existing backbone sequence,
building a sequence that is similar but not necessarily
identical to the backbone sequence
• Compared to de novo assembly, the mapping of resequenced
reads to a template genome is a computationally easier
problem
• Mapping programs typically use seeding techniques
• Seeds of fixed length allow for no more than one or two
mismatches. In addition, the capability to detect insertions
and deletions is very limited, and most programs can only
detect indels in subsequent alignment runs
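A seeding step of this kind can be sketched in Python as follows. As a simplifying assumption, the seed here must match the reference exactly and mismatches are counted only when the rest of the read is compared; indels are not handled. The function names, the seed length, and the sequences are hypothetical.

```python
from collections import defaultdict


def build_seed_index(reference, seed_len):
    """Map every seed_len-mer of the reference to its start positions."""
    index = defaultdict(list)
    for i in range(len(reference) - seed_len + 1):
        index[reference[i:i + seed_len]].append(i)
    return index


def map_read(read, reference, index, seed_len, max_mismatches=2):
    """Return (position, mismatches) for each acceptable placement of read."""
    hits = []
    for pos in index.get(read[:seed_len], []):  # exact seed lookup
        window = reference[pos:pos + len(read)]
        if len(window) < len(read):
            continue                            # read runs off the reference
        mismatches = sum(1 for x, y in zip(read, window) if x != y)
        if mismatches <= max_mismatches:
            hits.append((pos, mismatches))
    return hits


reference = "GATTACAGATTTCAGGCAT"
seed_index = build_seed_index(reference, seed_len=4)
hits = map_read("GATTTCTG", reference, seed_index, seed_len=4, max_mismatches=1)
print(hits)  # [(7, 1)]
```

The index lookup restricts the expensive base-by-base comparison to a few candidate positions, which is what makes mapping easier than de novo assembly.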
Tools for Mapping Assembly
• MAQ: designed particularly for Illumina reads
• SOAP: program for efficient gapped and ungapped alignment
of short oligonucleotides onto reference sequences
• SHRiMP: developed with Applied Biosystems data in mind
• SOCS: aligns SOLiD data
• ELAND: Efficient Large-Scale Alignment of Nucleotide
Databases
• GMAP: Genomic Mapping and Alignment Program for mRNA
and EST sequences
Other examples
• Bowtie
• BWA
• BFAST
• GenomeMapper
• Novocraft
• PASS
• SeqMap
• SXOligoSearch
• Zoom
De-novo assembly
• Assembles short reads to create full-length sequences.
• De novo assembly software must deal with sequencing errors,
repeat structures, and the computational complexity of
processing large volumes of data.
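One common way short-read assemblers cope with errors and data volume is to break reads into k-mers and link them in a de Bruijn graph. The Python sketch below illustrates that general idea only; it is not the algorithm of any particular tool listed in these slides, and the function names, `k`, and the `min_count` error filter are assumptions.

```python
from collections import Counter, defaultdict


def kmer_counts(reads, k):
    """Count every k-mer occurring in the reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def de_bruijn_edges(counts, min_count=2):
    """Build edges (k-1)-mer -> (k-1)-mer from sufficiently frequent k-mers.

    K-mers seen fewer than min_count times are treated as likely
    sequencing errors and skipped.
    """
    graph = defaultdict(set)
    for kmer, n in counts.items():
        if n >= min_count:
            graph[kmer[:-1]].add(kmer[1:])
    return graph


# The last read carries a hypothetical sequencing error (ATGTC vs ATGGC)
reads = ["ATGGC", "ATGGC", "TGGCA", "TGGCA", "ATGTC"]
counts = kmer_counts(reads, k=3)
graph = de_bruijn_edges(counts, min_count=2)
print(dict(graph))
```

Here the k-mers `TGT` and `GTC` occur only once and are dropped, leaving a single path AT → TG → GG → GC → CA that spells out the contig ATGGCA.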
De-novo assembly tools
• ABySS: Assembly By Short Sequences, designed for very short
reads
• ALLPATHS: de novo assembly of whole-genome shotgun
microreads
• Velvet: designed for short-read sequencing technologies
• Edena: Exact DE Novo Assembler
• MIRA2: Mimicking Intelligent Read Assembly, able to
perform true hybrid de novo assembly
Other examples
• EULER-SR
• SEQAN
• SHARCGS
• SSAKE
• SOAPdenovo
• VCAKE
• Newbler
• CAP3
• IDBA
• PE-Assembly
• Telescope
Overview of de novo short reads assemblers.
Recommended assemblers for different genome assembly
Plant Genome Project
• Arabidopsis
• Rice
• Tomato
• Chickpea
• Poplar
Arabidopsis thaliana genome project
Arabidopsis: The Model Plant
• Relative genetic simplicity
• Fast life cycle
• Susceptibility to manipulation through genetic engineering
• Convenience and abundance
• Basic similarities to other crops
Arabidopsis genome
• Contains about 125 Mb of sequence
• Contains 25,500 genes
• 5 chromosomes
• Has 35% unique genes
Arabidopsis Genome Initiative (AGI)
• Collaboration of the U.S. Department of Energy and the U.S.
Department of Agriculture, The European Union, the
Government of France, and the Chiba Prefectural Government
in Japan
• August 1996- National Science Foundation (NSF) in Arlington,
VA
Major Highlights of Genome Project
• 1990: Arabidopsis genome project initiated
• 1995: standard BAC and P1 libraries constructed
• 1996: Arabidopsis Genome Initiative organized
• 1997: physical maps of all chromosomes completed
• 1999: chromosomes 2 and 4 sequenced
• 2000: genome sequence completed
Applications
• Understanding Photosensitivity
• Creating Healthier Edible
• Manufacturing Biodegradable Plastics.
• Making Vegetables and Fruits Cheaper and Hardier
• Improving Erosion Resistance
• Understanding How Plants Flower
Rice Genome Project
Rice genome
• Smallest among grass genomes (Wheat, oat, rye, Barley, corn)
• Size: 430 Mbp (3.3 X Arabidopsis)
• 12 chromosomes
• Approximately 62,435 genes
• Repetitive elements: Most in intergenic regions versus in
introns in humans
IRGSP (International Rice Genome Sequencing
Project)
• Established in 1997
• Comprised of ten members: Japan, the United States of
America, China, Taiwan, Korea, India, Thailand, France, Brazil,
and the United Kingdom
• IRGSP adopts the clone-by-clone shotgun sequencing strategy
Milestones
• 1997: sequencing of the rice genome was initiated as an international
collaboration among 10 countries
• 1998: IRGSP (International Rice Genome Sequencing Project) was
launched under the coordination of the Rice Genome Project (RGP)
• 2000: Monsanto Co. produced a draft sequence of BAC contigs covering
260 Mb of the rice genome; 95% of rice genes were identified
• 2001: Syngenta produced a draft sequence, identified 32,000 to 50,000
genes with 99.8% accuracy, and identified 99% of rice genes
• 2002: IRGSP finished a high-quality draft sequence (clone-by-clone
approach) with a sequence length, excluding overlaps, of 366 Mb,
corresponding to ~92% of the rice genome
• 2004: IRGSP produced the high-quality sequence of the entire rice genome with
99.99% accuracy and without any sequence gap
Applications
• First crop plant to be sequenced, and therefore has a great
impact on agriculture
• Useful in understanding the genomes of other crops in the
grass family, including corn, wheat, barley, rye and sorghum
• Identification of agronomically important traits: genes that
affect growth habit to promote yield, and photoperiod genes
to extend the range of elite cultivars.
Tomato Genome Project
• Tomato (Solanum lycopersicum)
– economically important crop worldwide,
– intensively investigated and
– model system for genetic studies in plants.
• Characteristics:
– Simple diploid genetics: 12 chromosome pairs and
950 Mb genome size.
– Short generation time
– Routine transformation technology
– Rich genetic and genomic resources.
International Tomato Genome Sequencing
Project
• Started in 2004
• Participants were Korea, China, the United Kingdom, India,
the Netherlands, France, Japan, Spain, Italy and the United
States
• The initial approach was to sequence only the euchromatic
sequence using a BAC-by-BAC approach
• In 2009, a complementary whole-genome shotgun approach
was initiated, and the genome sequence was completed in 2012.
Applications
• Tomato as a reference genome sequence
• Understanding Diversification & Adaptation
• Exploring the Role of Natural Diversity in the Genetic
Improvement of Crops
Chickpea Genome Project
• Second most widely grown legume crop after soybean
• Approximately 28,269 genes of chickpea were identified
• Approximately 738 Mb genomic sequence
• Half (49.41%) of the chickpea genome is composed of
transposable elements and unclassified repeats
International Chickpea Genome Sequencing
Consortium
• Role of ICGSC:
1. To ensure data and information on the chickpea are
readily available to all researchers,
2. To help avoid duplication of research efforts,
3. To provide a framework for accessing national and
international collaboration,
4. To help keep chickpea research at the cutting edge of
genetic research.
Applications
• The sequence would help reduce the time needed to breed new
chickpea varieties, as plant breeders would now have access
to genes with the required traits.
• The availability of these genome sequences facilitates de novo
assembly of the genomes of other important but less-studied
legume crops.
Poplar genome project
• First tree genome to be sequenced, chosen for its relatively compact
genetic complement
• Genome sequence was published in 2006.
• Third plant genome to be published
• Contains a whole genome duplication
• Includes ~370 megabases of sequence
• 19 chromosomes
• 41,377 protein coding genes
International Populus Genome Consortium
Goals
• Examine the suite of genetic resources in Populus that are
currently available to the scientific community,
• Integrate genomics with physiology and ecology in an effort to
understand and manipulate tree growth, development and
function
• Develop the ability to attain predictive understanding of tree
growth, development, and complex function.
Applications
• Offers the opportunity to study and modify genes related to
commercially important traits
• Opportunity to better understand the distribution of genes
across the landscape
• The poplar genome project holds the promise of
uncovering and understanding mechanisms uniquely
associated with perennial woody plant growth, development
and ecology.
• Makes it possible to address questions about the annual cycling of
nutrients, water movement up dozens of meters in height,
perennial crown development and wood formation.
Functional analysis of genes
Different tools
1. Virus-induced gene silencing (VIGS)
2. CRES-T
3. RNA Interference
Virus-induced gene silencing (VIGS)
• Effective strategy for rapid functional analysis of genes in
plant tissues
• Elegant tool for functional characterization of genes
associated with abiotic stress response
• VIGS is rapid (3–4 weeks from infection to silencing)
• Does not require development of stable transformants
• Allows characterization of phenotypes that might be lethal in
stable lines
• Offers the potential to silence either individual or multiple
members of a gene family
Example
• Knockdown of TaNAC1 with barley stripe mosaic virus-induced
gene silencing (BSMV-VIGS) enhanced stripe rust resistance
CRES-T
• Chimeric REpressor Gene-Silencing Technology (CRES-T)
• Chimeric repressor produced by fusion of a transcription
factor to the plant-specific repression domain (SRDX)
suppresses the target genes of a transcription factor
• Useful tool for functional analysis of redundant plant
transcription factors and the manipulation of plant traits
About RNAi
• RNA interference (RNAi) is a system within living cells that
takes part in controlling which genes are active and how
active they are. Two types of small RNA molecules –
microRNA (miRNA) and small interfering RNA (siRNA) – are
central to RNA interference.
• RNAs are the direct products of genes, and these small RNAs
can bind to other specific RNAs (mRNA) and either increase or
decrease their activity, for example by preventing a
messenger RNA from producing a protein.
• RNA interference has an important role in defending cells
against parasitic genes – viruses and transposons – but also in
directing development as well as gene expression in general.
The Mechanism of RNA Interference
• The long dsRNAs enter a cellular pathway that is
commonly referred to as the RNA interference
(RNAi) pathway.
• First, the dsRNAs are processed into 20-25
nucleotide (nt) small interfering RNAs (siRNAs) by
an RNase III-like enzyme called Dicer.
• Then, the siRNAs assemble into endoribonuclease-containing
complexes known as RNA-induced
silencing complexes (RISCs), unwinding in the
process.
• The siRNA strands subsequently guide the RISCs to
complementary RNA molecules, where they cleave
and destroy the cognate RNA.
• Cleavage of the cognate RNA takes place near the
middle of the region bound by the siRNA strand.
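The pathway above can be mimicked with a few string operations in Python. This is a deliberately simplified sketch: the 21-nt fragment size is one value chosen from the 20-25 nt range, the cut site is approximated as the midpoint of the bound region, and all names and sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence (5'->3')."""
    return rna.translate(COMPLEMENT)[::-1]


def dice(sense_strand, size=21):
    """Mimic Dicer: cut the dsRNA into size-nt pieces, return guide strands.

    Each guide is the antisense strand of one fragment, so it can
    base-pair with the matching stretch of the target mRNA.
    """
    return [
        reverse_complement(sense_strand[i:i + size])
        for i in range(0, len(sense_strand) - size + 1, size)
    ]


def cleavage_sites(mrna, guide):
    """Mimic RISC: find regions complementary to the guide and report the
    approximate cut position (middle of the bound region)."""
    target = reverse_complement(guide)
    sites = []
    start = mrna.find(target)
    while start != -1:
        sites.append(start + len(target) // 2)
        start = mrna.find(target, start + 1)
    return sites


target_region = "AUGGCUAGCUAGGAUCCGAUC"  # 21-nt stretch of a made-up mRNA
mrna = "GGGCC" + target_region + "AAUU"
guides = dice(target_region)
sites = cleavage_sites(mrna, guides[0])
print(guides[0], sites)
```

With these inputs a single guide strand is produced, and it directs one cut at position 15 of the mRNA (0-based), in the middle of the 21-nt bound region.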
Approaches for candidate gene discovery
• Traditional candidate gene approach
– Position-dependent strategy
– Comparative genomics strategy
– Function-dependent strategy
– Combined strategy
• Digital candidate gene approach
Traditional candidate gene approach
Position dependent strategy
• Identification of candidate genes is based on physical
linkage information within a QTL-identified chromosomal segment
• Example: position of QTLs controlling field blast resistance in
rice
• Example: isolation of the Arabidopsis ABI3 gene
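In practice, the position-dependent strategy starts by listing the annotated genes that fall inside the QTL interval. A minimal Python sketch of that filtering step is shown here; the gene names, coordinates, and the `Gene` record are invented for illustration and do not refer to the rice or Arabidopsis examples above.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # first base of the gene (bp)
    end: int    # last base of the gene (bp)


def genes_in_qtl(genes, chrom, qtl_start, qtl_end):
    """Return names of genes that overlap the QTL interval on chrom."""
    return [
        g.name
        for g in genes
        if g.chrom == chrom and g.start <= qtl_end and g.end >= qtl_start
    ]


# Hypothetical annotation and QTL interval
annotation = [
    Gene("gene_a", "chr1", 1_000, 2_500),
    Gene("gene_b", "chr1", 48_000, 52_000),
    Gene("gene_c", "chr1", 90_000, 91_000),
    Gene("gene_d", "chr2", 50_000, 51_000),
]
candidates = genes_in_qtl(annotation, "chr1", 50_000, 95_000)
print(candidates)  # ['gene_b', 'gene_c']
```

Genes that only partly overlap the interval are kept, since QTL boundaries are usually approximate.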
Comparative genomics strategy
• Includes comparative functional genomics strategy and
comparative structural genomics strategy
• Candidate genes may be functionally conserved or structurally
homologous genes
Function dependent strategy
• Results in the functional candidate gene approach, in which a
putative candidate gene is the one that could be statistically
detected from the genes controlling large components of
inheritable gene expression variation.
• Example- identification of new disease resistance genes in
Tobacco
Combined strategy
• Combines at least two strategies
• Genetical genomic approach originating from function-
dependent strategy provides powerful means to identify
candidate genes.
• Example- selection of candidate genes for grape
proanthocyanidin pathway
Digital candidate gene approach
(DigiCGA)
• Novel web resource-based candidate gene identification approach.
• DigiCGA can be defined as an approach that objectively extracts,
filters, (re)assembles, or (re)analyzes all available resources
derived from public web databases, mainly in accordance with
the principles of biological ontology and complex statistical
methods, to computationally identify potential
candidate genes of specific interest.
• Example: a combination of RNA-seq and DGE (digital gene expression)
analysis based on next-generation sequencing technology was shown to be
a powerful method for identifying candidate genes encoding enzymes
responsible for the biosynthesis of novel secondary metabolites in a
non-model plant. Seven CYP450s and five UDPGs were selected as
potential candidates involved in mogroside biosynthesis. The
transcriptome data from that study provide an important resource
for understanding the formation of major bioactive constituents in
the fruit extract of S. grosvenorii.
Deciphering the function of genes in plant
secondary metabolism
• To complete the metabolic map for an entire class of
compounds, it is essential to identify the gene-metabolite
correlations of a metabolic pathway
• An effective approach to predicting genes involved in the same
metabolic pathway is co-expression analysis.
• Co-expression analysis can be conducted using RNA-seq or
microarray datasets obtained in expressly designed
experiments, or by comparing existing, publicly
available data
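At its simplest, co-expression analysis ranks genes by how closely their expression profiles track a known pathway gene across samples. The Python sketch below uses the Pearson correlation for this; the gene names, expression values, and the 0.9 threshold are made up for illustration, and other similarity measures could be substituted.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles.

    Profiles with zero variance would cause a division by zero and are
    not handled in this sketch.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpressed(bait, expression, threshold=0.9):
    """Genes whose profile correlates with the bait gene at >= threshold."""
    return sorted(
        gene
        for gene, profile in expression.items()
        if gene != bait and pearson(expression[bait], profile) >= threshold
    )


# Hypothetical expression levels across five samples
expression = {
    "bait_gene": [1, 2, 4, 8, 16],
    "gene_a": [2, 4, 8, 16, 32],  # rises with the bait
    "gene_b": [9, 7, 5, 3, 1],    # falls as the bait rises
    "gene_c": [5, 5, 6, 5, 5],    # essentially flat
}
partners = coexpressed("bait_gene", expression)
print(partners)  # ['gene_a']
```

Genes returned this way are only candidates; functional tests such as silencing are still needed to confirm a role in the pathway.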
Example
• Comparative co-expression analysis between tomato and
potato, coupled with chemical profiling, revealed an array of 10
genes that take part in steroidal glycoalkaloid (SGA) biosynthesis.
Following systematic functional analysis, a revised SGA biosynthetic
pathway was proposed, running from cholesterol up to the tetrasaccharide
moiety linked to the tomato SGA aglycone. Silencing GLYCOALKALOID
METABOLISM 4 prevented accumulation of SGAs in potato
tubers and tomato fruit. This may provide a means for
removing unsafe, antinutritional substances present in these
widely used food crops.
Gene Inactivation
•The ability to manipulate gene expression levels has been
essential to the study of gene function and biological
processes.
• Classically, whole body deletions of genes were generated
via homologous recombination.
• The last few years have seen a revolution in the
approaches scientists use to inactivate gene expression,
such as the development of highly efficient ribonucleic acid
interference (RNAi) delivery systems, gene knockout and
antisense technology.
Gene Knockout
•A gene knockout (abbreviation: KO) is a genetic technique in
which one of an organism's genes is made inoperative
("knocked out" of the organism).
• The resulting organisms, known as knockout organisms or simply
knockouts, are used to learn about a gene that has been sequenced
but has an unknown or incompletely known function.
•Researchers draw inferences from the difference between
the knockout organism and normal individuals.
KNOCK OUT MICE
• A mouse in which a gene has been deleted/mutated
(gene is inactivated)
• Specific gene is targeted
• The loss of gene activity often causes changes in a
mouse's phenotype and thus provides valuable
information on the function of the gene.
Researchers who developed the technology for
creating knockout mice won the Nobel Prize in
2007
• The Nobel Prize in Physiology or Medicine 2007 was awarded
jointly to Mario R. Capecchi, Sir Martin J. Evans and Oliver
Smithies "for their discoveries of principles for introducing
specific gene modifications in mice by the use of embryonic
stem cells".
GENERATION OF KNOCKOUT MICE BY
HOMOLOGOUS RECOMBINATION
• Create a knockout construct
• Introduce the knockout construct into mouse embryonic stem
(ES) cells in culture
• Screen the ES cells and select those whose DNA includes the new
gene
• Implant the selected cells into normal mouse embryos, making
“chimeras”
• Implant the chimeric embryos in pseudopregnant females
• Females give birth to chimeric offspring, which are
subsequently bred to verify transmission of the new gene,
producing a mutant mouse line
Knockout construct:
• The gene to be knocked out is isolated from a mouse gene
library. Then a new DNA sequence is engineered which is very
similar to the original gene and its immediate neighbour
sequence, except that it is changed sufficiently to make the
gene inoperable. Usually, the new sequence is also given
a marker gene, a gene that normal mice don't have and that
confers resistance to a certain toxic agent or that produces an
observable change (e.g. colour or fluorescence).
Knockout Mice to study genetic diseases
• Knockout mice make good model systems for investigating the
nature of genetic diseases and the efficacy of different types
of treatment and for developing effective gene therapies to
cure these often devastating diseases
• For instance, knockout mice for the CFTR gene show
symptoms similar to those of humans with cystic fibrosis
Drawbacks of knockout mice
• About 15% of gene knockouts are developmentally lethal, so the
embryos cannot grow into adult mice. This makes it
difficult to determine the gene's function in adults.
• Many genes that participate in interesting pathways are
essential for mouse development, viability or fertility.
A traditional knockout of such a gene therefore cannot lead
to the establishment of a knockout mouse strain for analysis
Antisense RNA-Technology
• Antisense RNA is a single-stranded RNA that is
complementary to a messenger RNA (mRNA) strand
transcribed within a cell.
• It is introduced into a cell to inhibit translation
by base pairing with the sense RNA and
activating RNase H, for example when developing a novel
transgenic organism.
mRNA sequence (sense): 5′-AUGAAACCCGUG-3′
Antisense RNA: 3′-UACUUUGGGCAC-5′
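The antisense strand in this example can be derived from the sense strand by applying the base-pairing rules (A with U, G with C). A small Python sketch, with illustrative function names, reproduces the pair of sequences shown on the slide:

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_aligned(mrna):
    """Complement each base; the result reads 3'->5' so that it lines up
    base-for-base underneath the 5'->3' mRNA."""
    return "".join(PAIR[base] for base in mrna)


def antisense_5_to_3(mrna):
    """The same antisense strand written in the conventional 5'->3' direction."""
    return antisense_aligned(mrna)[::-1]


sense = "AUGAAACCCGUG"
aligned = antisense_aligned(sense)
conventional = antisense_5_to_3(sense)
print(aligned)       # UACUUUGGGCAC
print(conventional)  # CACGGGUUUCAU
```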
How it differs from RNAi
• The intended effect of both techniques is the same, but the
processing differs slightly.
• Antisense technology degrades the mRNA via RNase H, whereas
RNAi relies on the enzyme Dicer to process the dsRNA and on RISC to cleave the target.
• The RNA molecules used in RNAi are about twice as large as antisense oligonucleotides.
Nature’s Antisense System
• The R1 plasmid in E. coli uses the hok (host killing)/sok (suppression
of killing) system for post-segregational killing.
• When an E. coli cell divides, the daughter cells
inherit hok toxin mRNA and sok antisense RNA from the parent, but
sok RNA has a short half-life and is degraded quickly.
• In a daughter cell that has lost the plasmid, sok RNA is not replenished,
so Hok protein is produced and the cell dies. If the cell inherits the R1
plasmid, which carries the sok gene and a sok-specific promoter, sok RNA is
transcribed in excess of hok mRNA and, by base pairing with it, inhibits
translation of the Hok protein.
Flavr-Savr
• Flavr Savr, developed by Calgene in 1992, was the first
FDA-approved GM food.
• Licensed on May 17, 1994.
• As a tomato ripens, it produces gradually increasing levels of the enzyme
polygalacturonase (PG), which is
responsible for softening of the tomato and eventually
leads to rotting.
• As a result, a ripe tomato normally does not last even a few extra days
without rotting.
• Calgene introduced a gene into the plant that produces an RNA
complementary to the PG mRNA, inhibiting synthesis
of the PG enzyme.
INDIAN CONTRIBUTION
• In February 2010, NIPGR (National Institute of Plant Genome Research)
reported a tomato, developed by antisense technology,
that can last up to 45 days. This removes the need to pick
tomatoes green and ripen them artificially with ethylene, and
eases the worry about whether they will reach
market shelves, or the kitchen, before they
go mushy.
• NIPGR scientists silenced the expression of two important
genes responsible for loss of firmness and texture
during ripening.
The two genes silenced, alpha-man and beta-hex, encode
glycosyl hydrolases, enzymes that break the
chemical bond holding a sugar
to either another sugar or
some other molecule, such as a
protein.
Challenges to antisense technology…
1. One major challenge to antisense technology (and RNAi) is
the difficulty of getting it into the body. Delivery of the
treatment to the brain, for use in diseases like Huntington's disease (HD), is
especially challenging because it must cross the blood-brain
barrier.
2. The second major challenge to antisense technology is its
potential for toxic effects. Although antisense technology is
engineered to be very specific, it can still cause unintended
damage because it may regulate both the mutant and
normal huntingtin alleles.
Thank You

Viruses as vector, binary, shuttle vector
 
Phagemid and bac vectors
Phagemid and bac vectorsPhagemid and bac vectors
Phagemid and bac vectors
 
P1, mac and pac vector
P1, mac and pac vectorP1, mac and pac vector
P1, mac and pac vector
 
Cloning and expression vectors
Cloning and expression vectorsCloning and expression vectors
Cloning and expression vectors
 
Expression vector, baculovirus expression vector
Expression vector, baculovirus expression vectorExpression vector, baculovirus expression vector
Expression vector, baculovirus expression vector
 
Molecular mechanism of suppression, somatic mutations
Molecular mechanism of suppression, somatic mutationsMolecular mechanism of suppression, somatic mutations
Molecular mechanism of suppression, somatic mutations
 
Molecular mechanism of spontaneous mutations
Molecular mechanism of spontaneous mutationsMolecular mechanism of spontaneous mutations
Molecular mechanism of spontaneous mutations
 
Molecular mechanism of induced mutations
Molecular mechanism of induced mutationsMolecular mechanism of induced mutations
Molecular mechanism of induced mutations
 
Dna repair
Dna repairDna repair
Dna repair
 

Dernier

Seismic Method Estimate velocity from seismic data.pptx
Seismic Method Estimate velocity from seismic  data.pptxSeismic Method Estimate velocity from seismic  data.pptx
Seismic Method Estimate velocity from seismic data.pptx
AlMamun560346
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Sérgio Sacani
 
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
ssuser79fe74
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
Sérgio Sacani
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Sérgio Sacani
 
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxSCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
RizalinePalanog2
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Sérgio Sacani
 
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
Lokesh Kothari
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
PirithiRaju
 

Dernier (20)

Seismic Method Estimate velocity from seismic data.pptx
Seismic Method Estimate velocity from seismic  data.pptxSeismic Method Estimate velocity from seismic  data.pptx
Seismic Method Estimate velocity from seismic data.pptx
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C P
 
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxSCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
 
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRLKochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICESAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
 
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
GUIDELINES ON SIMILAR BIOLOGICS Regulatory Requirements for Marketing Authori...
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptx
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
 

Genomics: Organization of Genome, Strategies of Genome Sequencing, Model Plant Genome Project, Functional Analysis of Genes

  • 1. Genomics: Organization of Genome, Strategies of Genome Sequencing, Model Plant Genome Project, Functional Analysis of Genes Promila Sheoran PhD Biotechnology GJU S&T Hisar
  • 2. Genome Organization •The word “genome,” coined by German botanist Hans Winkler in 1920, was derived simply by combining gene and the final syllable of chromosome. • If not specified, “genome” usually refers to the nuclear genome! •An organism’s genome is defined as the complete haploid genetic complement of a typical cell. • The genetic content of the organelles in the cell is not considered part of the nuclear genome. • In diploid organisms, sequence variations exist between the two copies of each chromosome present in a cell. •The genome is the ultimate source of information about an organism.
  • 3. Genome Organization (continued) • The number of genomes sequenced in their entirety is now in the thousands and includes organisms ranging from bacteria to mammals. • The first complete genome to be sequenced was that of the bacterium Haemophilus influenzae, in 1995. • The first eukaryotic genome sequence, that of the yeast Saccharomyces cerevisiae, followed in 1996. • The genome sequence for the bacterium Escherichia coli became available in 1997.
  • 4. Hierarchy of gene organization (in order of ascending complexity) • Gene: single unit of genetic function • Operon: genes transcribed in a single transcript • Regulon: genes controlled by the same regulator • Modulon: genes modulated by the same stimulus • Element: plasmid, chromosome, phage • Genome
  • 5. Prokaryotic and eukaryotic genomes (prokaryotes vs. eukaryotes) • Cells: single cell vs. single or multiple cells • Nucleus: absent vs. present • DNA: one piece of circular DNA vs. chromosomes • mRNA: no post-transcriptional modification vs. splicing of exons and introns
  • 6. Prokaryotic Genome Organization • The genome of E. coli contains about 4 × 10^6 base pairs • Roughly 90% of the DNA encodes protein • Lacks a membrane-bound nucleus • Circular DNA organized into supercoiled domains • Histones not present
  • 7. Prokaryotic Genome Organization (continued) • Prokaryotic genomes generally consist of one large circular piece of DNA referred to as a "chromosome" (not a true chromosome in the eukaryotic sense). • Some bacteria have linear "chromosomes". • Many bacteria have small circular DNA structures called plasmids, which can be swapped between neighbors and across bacterial species.
  • 8. Plasmid • The term plasmid was introduced by the American molecular biologist Joshua Lederberg in 1952. • A plasmid is separate from, and can replicate independently of, the chromosomal DNA. • Plasmid size varies from 1 to over 1,000 kilobase pairs (kbp).
  • 9. Eukaryotic genome organization More about the nuclear genome: • Multiple linear chromosomes, 5,000 to 50,000 genes • Monocistronic transcription units • Discontinuous coding regions (introns and exons) • Large amounts of non-coding DNA • Transcription and translation take place in different compartments • Variety of RNA genes: rRNA, tRNA, snRNA (small nuclear), snoRNA (small nucleolar), microRNAs, etc. • Often diploid genomes and obligatory sexual reproduction • Standard mechanism of recombination: meiosis • Multiple genomes: nuclear, mitochondrial and plastid (chloroplast) • Plastid genomes resemble prokaryotic genomes
  • 10. EUKARYOTIC GENOME ‘The nucleus is the heart of the cell and serves as the main distinguishing feature of eukaryotic cells. It is an organelle submerged in a sea of turbulent cytoplasm, and it holds the genetic information encoding the past history and future prospects of the cell. The nucleus contains many thread-like coiled structures suspended in the nucleoplasm, known as the chromatin substance.’ • Chromatin is the complex of DNA and proteins that makes up chromosomes. • The major proteins in chromatin are histones, although many other chromosomal proteins have prominent roles too. • The functions of chromatin are to package DNA into a smaller volume so that it fits in the cell, to strengthen the DNA to allow mitosis and meiosis, and to serve as a mechanism for controlling gene expression and DNA replication.
  • 11. ORGANIZATION OF CHROMATIN • In resting, non-dividing eukaryotic cells the genome takes the form of a nucleoprotein complex, the chromatin, which is randomly dispersed in the nuclear matrix as an interwoven network of fine chromatin threads. • The information stored in DNA is organized, replicated and read with the help of a variety of DNA-binding proteins. • Structural proteins, the histones (packing proteins): the main structural proteins found in eukaryotic cells; low-molecular-weight basic proteins with a high proportion of positively charged amino acids; bound to DNA along most of its length. The positive charge helps histones bind to DNA, and they play a crucial role in packing long DNA molecules. • Functional proteins, the non-histones: associated with gene regulation and other functions of chromatin.
  • 12. Hierarchy of Chromatin Organization in the Cell Nucleus: Nuclear Matrix Associated Chromatin Loops
  • 13. Next Generation Sequencing (NGS) • DNA sequencing is the process of determining the precise order of nucleotides within a DNA molecule • NGS refers to non-Sanger-based, high-throughput DNA sequencing technologies
  • 16. ILLUMINA SEQUENCING • Step 1: Sample Preparation • Steps 2-6: Cluster Generation by Bridge Amplification • Steps 7-12: Sequencing by Synthesis
  • 17. Solid-phase amplification can produce 100-200 million spatially separated clusters, providing free ends to which a universal sequencing primer can be hybridized to initiate the NGS reaction
  • 21. 454 Sequencing • Emulsion-based sample preparation (emPCR) • Pyrosequencing: a non-electrophoretic, bioluminescence method that measures the release of inorganic pyrophosphate by proportionally converting it into visible light through a series of enzymatic reactions
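Because the light signal in pyrosequencing is proportional to the pyrophosphate released, a run of identical bases gives a proportionally stronger signal in a single nucleotide flow. The following is a minimal, idealized sketch of that idea in Python. The flow order `TACG`, the function names and the example read are assumptions made for illustration; real instruments produce noisy, non-integer signals.

```python
def flowgram(read, flow_order="TACG", n_cycles=4):
    """Simulate idealized flow signals for the strand being synthesized.

    Each flow offers one nucleotide; the signal equals the number of
    consecutive bases incorporated (the homopolymer length), or 0.
    """
    pos = 0
    signals = []
    for _ in range(n_cycles):
        for nt in flow_order:
            run = 0
            while pos < len(read) and read[pos] == nt:
                run += 1
                pos += 1
            signals.append((nt, run))
    return signals


def call_bases(signals):
    """Convert (nucleotide, signal) pairs back into a base sequence."""
    return "".join(nt * n for nt, n in signals)


example_read = "TTACGGGA"  # hypothetical read
example_signals = flowgram(example_read)
print(example_signals[:4])
print(call_bases(example_signals))
```

In this idealized model the base caller recovers the read exactly; in practice, long homopolymers are harder to call because the difference between neighboring signal levels becomes small.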
  • 23. Step 2: Loading DNA Sample onto Beads
  • 27. Sequence Assembly • Sequence assembly refers to aligning and merging fragments of a much longer DNA sequence in order to reconstruct the original sequence. • First sequence assemblers began to appear in the late 1980s and early 1990s
  • 28. Why we need genome assemblers • Sequencing produces terabytes of data that must be processed on computing clusters • Identical and nearly identical sequences (repeats) can, in the worst case, increase the time and space complexity of assembly algorithms exponentially • The fragments produced by sequencing instruments contain errors
  • 29. Basic Principles Of Assembly • Sequence and quality data are read and the reads are cleaned. • Overlaps are detected between reads. False overlaps, duplicate reads, chimeric reads and reads with self-matches are also identified • The reads are grouped to form a contig layout of the finished sequence. • A multiple sequence alignment of the reads is performed, and a consensus sequence is constructed for each contig layout • Possible sites of mis-assembly are identified by combining manual inspection with quality value validation.
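The overlap and layout steps above can be sketched with a small greedy assembler. This is an illustrative simplification, not the method of any particular tool: it removes duplicate reads, finds exact suffix-prefix overlaps, and repeatedly merges the best-overlapping pair. It skips quality values, chimera detection and the consensus step, and the function names and example reads are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` equal to a prefix of `b`.

    Returns 0 when the best overlap is shorter than `min_len`.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge reads greedily by largest exact overlap; returns the contigs."""
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]


example_reads = ["ATGGCGTG", "GCGTGCAA", "GCAATCC", "ATGGCGTG"]
print(greedy_assemble(example_reads))  # ['ATGGCGTGCAATCC']
```

Greedy merging is easy to follow but can mis-assemble repeats, which is one reason real assemblers add the alignment, consensus and validation steps listed above.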
  • 30. Types of sequence assembly • Mapping assembly • De novo assembly
  • 31. Mapping Assembly • Assembles reads against an existing backbone sequence, building a sequence that is similar but not necessarily identical to the backbone sequence • Compared to de novo assembly, mapping resequenced reads to a template genome is a computationally easier problem • Mapping programs use seeding techniques • Seeds of fixed length allow no more than one or two mismatches. In addition, the ability to detect insertions and deletions is very limited, and most programs can only detect indels in subsequent alignment runs
  • 32. Tools for Mapping Assembly • MAQ: designed particularly for Illumina data • SOAP: program for efficient gapped and ungapped alignment of short oligonucleotides onto reference sequences • SHRiMP: developed with Applied Biosystems • SOCS: aligns SOLiD data • ELAND: Efficient Large-Scale Alignment of Nucleotide Databases • GMAP: Genomic Mapping and Alignment Program for mRNA and EST sequences
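The seeding idea can be illustrated with a small seed-and-verify mapper. This is a sketch under simplifying assumptions, not the algorithm of any specific program: it indexes every k-mer of the reference, uses exact seed matches to propose read positions, and then accepts a position only if the ungapped alignment has at most a given number of mismatches. Indels are not handled, which mirrors the limitation noted above. The function names, the seed length `k=4` and the example sequences are hypothetical.

```python
def build_index(reference, k):
    """Map every k-mer in the reference to the list of its start positions."""
    index = {}
    for i in range(len(reference) - k + 1):
        index.setdefault(reference[i:i + k], []).append(i)
    return index


def map_read(read, reference, index, k, max_mismatches=2):
    """Return sorted (start, mismatches) pairs for ungapped hits of `read`."""
    hits = {}
    for offset in range(len(read) - k + 1):
        seed = read[offset:offset + k]
        for pos in index.get(seed, []):
            start = pos - offset  # implied start of the whole read
            if start < 0 or start + len(read) > len(reference) or start in hits:
                continue
            window = reference[start:start + len(read)]
            mismatches = sum(1 for x, y in zip(read, window) if x != y)
            if mismatches <= max_mismatches:
                hits[start] = mismatches
    return sorted(hits.items())


example_reference = "ACGTTGCATGCCGATAGCTAGGCTA"
example_read = "TGCCGTTAGC"  # positions 8-17 of the reference with one substitution
example_index = build_index(example_reference, 4)
print(map_read(example_read, example_reference, example_index, 4))  # [(8, 1)]
```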
  • 33. Other examples • Bowtie • BWA • BFAST • GenomeMapper • Novocraft • PASS • SeqMap • SXOligoSearch • Zoom
  • 34. De-novo assembly • Assembles short reads to create full-length sequences. • De novo assembly software must deal with sequencing errors, repeat structures, and the computational complexity of processing large volumes of data.
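The slide does not name a specific algorithm. One widely used approach in short-read assemblers such as Velvet and ABySS is the de Bruijn graph, and the sketch below uses it only as an illustration of how errors and repeats show up in practice. It counts k-mers, discards rare k-mers as likely sequencing errors, and reports graph nodes with more than one successor, which is where errors or repeats create ambiguity. The function names, `k=4`, the count threshold and the reads are all hypothetical.

```python
from collections import Counter, defaultdict


def kmer_counts(reads, k):
    """Count every k-mer across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def de_bruijn(counts, min_count=2):
    """Build edges prefix -> suffix for k-mers seen at least `min_count` times."""
    graph = defaultdict(set)
    for kmer, n in counts.items():
        if n >= min_count:
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def branch_nodes(graph):
    """Nodes with several successors: candidate error or repeat sites."""
    return sorted(node for node, nxt in graph.items() if len(nxt) > 1)


# The last read carries a hypothetical single-base error (ATGCTT vs ATGCGT).
example_reads = ["ATGCGT", "ATGCGT", "TGCGTA", "TGCGTA", "ATGCTT"]
example_counts = kmer_counts(example_reads, 4)
print(branch_nodes(de_bruijn(example_counts, min_count=1)))  # ['TGC']
print(branch_nodes(de_bruijn(example_counts, min_count=2)))  # []
```

Filtering by k-mer count removes the branch caused by the erroneous read here, but it would not remove a branch caused by a true repeat, since repeat k-mers are well supported.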
  • 35. De novo assembly tools • ABySS: Assembly By Short Sequences, designed for very short reads • ALLPATHS: de novo assembly of whole-genome shotgun microreads • Velvet: designed for short-read sequencing technologies • Edena: Exact DE Novo Assembler • MIRA2: Mimicking Intelligent Read Assembly, able to perform true hybrid de novo assembly
  • 36. Other examples • EULER-SR • SEQAN • SHARCGS • SSAKE • SOAPdenovo • VCAKE • Newbler • CAP3 • IDBA • PE-Assembly • Telescope
  • 37. Overview of de novo short reads assemblers.
  • 38. Recommended assemblers for different genome assembly
  • 39. Plant Genome Project • Arabidopsis • Rice • Tomato • Chickpea • Poplar
  • 40. Arabidopsis thaliana genome project Arabidopsis: The Model Plant • Relative genetic simplicity • Fast life cycle • Susceptibility to manipulation through genetic engineering • Convenience and abundance • Basic similarities to other crops
  • 41. Arabidopsis genome • Contains about 125 Mb of sequence • Contains 25,500 genes • 5 chromosomes • Has 35% unique genes
  • 42. Arabidopsis Genome Initiative (AGI) • Collaboration of the U.S. Department of Energy, the U.S. Department of Agriculture, the European Union, the Government of France, and the Chiba Prefectural Government in Japan • Organized in August 1996 at the National Science Foundation (NSF) in Arlington, VA
  • 43. Major Highlights of the Genome Project • 1990: Arabidopsis genome project initiated • 1995: standard BAC and P1 libraries constructed • 1996: Arabidopsis Genome Initiative organized • 1997: physical maps of all chromosomes completed • 1999: chromosomes 2 and 4 sequenced • 2000: genome sequence completed
  • 44. Applications • Understanding photosensitivity • Creating healthier edible oils • Manufacturing biodegradable plastics • Making vegetables and fruits cheaper and hardier • Improving erosion resistance • Understanding how plants flower
  • 45. Rice Genome Project Rice genome • Smallest among the major grass genomes (wheat, oat, rye, barley, corn) • Size: 430 Mbp (about 3.4 × the 125 Mb Arabidopsis genome) • 12 chromosomes • Approximately 62,435 genes • Repetitive elements: mostly in intergenic regions, versus in introns in humans
  • 46. IRGSP (International Rice Genome Sequencing Project) • Established in 1997 • Comprised of ten members: Japan, the United States of America, China, Taiwan, Korea, India, Thailand, France, Brazil, and the United Kingdom • IRGSP adopts the clone-by-clone shotgun sequencing strategy
  • 47. Milestones • 1997: sequencing of the rice genome was initiated as an international collaboration among 10 countries • 1998: the IRGSP (International Rice Genome Sequencing Project) was launched under the coordination of the Rice Genome Project (RGP) • 2000: Monsanto Co. produced a draft sequence of BAC contigs covering 260 Mb of the rice genome; 95% of rice genes were identified • 2001: Syngenta produced a draft sequence with 99.8% accuracy, predicting 32,000 to 50,000 genes and identifying 99% of rice genes • 2002: the IRGSP finished a high-quality draft sequence (clone-by-clone approach) with a length, excluding overlaps, of 366 Mb, corresponding to ~92% of the rice genome • 2004: the IRGSP produced the high-quality sequence of the entire rice genome with 99.99% accuracy and without any sequence gaps
  • 48. Applications • First crop plant to be sequenced, and therefore of great impact in agriculture • Useful in understanding the genomes of other crops in the grass family, including corn, wheat, barley, rye and sorghum • Identification of agronomically important traits: genes that affect growth habit to promote yield, and photoperiod genes to extend the range of elite cultivars.
  • 49. Tomato Genome Project • Tomato (Solanum lycopersicum): economically important crop worldwide; intensively investigated; model system for genetic studies in plants. • Characteristics: simple diploid genetics with 12 chromosome pairs and a 950 Mb genome; short generation time; routine transformation technology; rich genetic and genomic resources.
  • 50. International Tomato Genome Sequencing Project • Started in 2004 • Participants were Korea, China, the United Kingdom, India, the Netherlands, France, Japan, Spain, Italy and the United States • The initial approach was to sequence only the euchromatic portion of the genome using a BAC-by-BAC approach • In 2009 a complementary whole-genome shotgun approach was initiated, and the genome sequence was completed in 2012.
  • 51. Applications • Tomato as a reference genome sequence • Understanding Diversification & Adaptation • Exploring the Role of Natural Diversity in the Genetic Improvement of Crops
  • 52. Chickpea Genome Project • Second most widely grown legume crop after soybean • Approximately 28,269 genes of chickpea were identified • Approximately 738 Mb genomic sequence • Half (49.41%) of the chickpea genome is composed of transposable elements and unclassified repeats
  • 53. International Chickpea Genome Sequencing Consortium • Role of ICGSC: 1. To ensure data and information on the chickpea is readily available to all researchers, 2. To help avoid duplication of research efforts, 3. To provide a framework for accessing national and international collaboration, 4. To help keep chickpea research at the cutting edge of genetic research.
  • 54. Applications • The sequence should reduce the time needed to breed new chickpea varieties, because plant breeders now have access to genes with the required traits. • The availability of this genome sequence facilitates de novo assembly of the genomes of other important but less-studied legume crops.
  • 55. Poplar genome project • First tree genome to be sequenced, chosen for its relatively compact genetic complement • Genome sequence was published in 2006 • Third plant genome to be published • Contains a whole-genome duplication • Includes ~370 megabases of sequence • 19 chromosomes • 41,377 protein-coding genes
  • 56. International Populus Genome Consortium Goals • Examine the suite of genetic resources in Populus that are currently available to the scientific community, • Integrate genomics with physiology and ecology in an effort to understand and manipulate tree growth, development and function • Develop the ability to attain predictive understanding of tree growth, development, and complex function.
  • 57. Applications • Offers the opportunity to study and modify genes related to commercially important traits • Opportunity to better understand the distribution of genes across the landscape • Holds the promise of uncovering and understanding mechanisms uniquely associated with perennial woody plant growth, development and ecology • Helps address questions about the annual cycling of nutrients, water movement up dozens of meters in height, perennial crown development and wood formation.
  • 58. Functional analysis of genes Different tools: 1. Virus-induced gene silencing (VIGS) 2. CRES-T 3. RNA interference
  • 59. Virus-induced gene silencing (VIGS) • Effective strategy for rapid functional analysis of genes in plant tissues • Elegant tool for functional characterization of genes associated with abiotic stress response • VIGS is rapid (3–4 weeks from infection to silencing) • Does not require development of stable transformants • Allows characterization of phenotypes that might be lethal in stable lines • Offers the potential to silence either individual or multiple members of a gene family • Example: knockdown of TaNAC1 by barley stripe mosaic virus-induced gene silencing (BSMV-VIGS) enhanced stripe rust resistance
  • 60. CRES-T • Chimeric REpressor Gene-Silencing Technology (CRES-T) • Chimeric repressor produced by fusion of a transcription factor to the plant-specific repression domain (SRDX) suppresses the target genes of a transcription factor • Useful tool for functional analysis of redundant plant transcription factors and the manipulation of plant traits
  • 61. About RNAi • RNA interference (RNAi) is a system within living cells that takes part in controlling which genes are active and how active they are. Two types of small RNA molecules – microRNA (miRNA) and small interfering RNA (siRNA) – are central to RNA interference. • RNAs are the direct products of genes, and these small RNAs can bind to other specific RNAs (mRNA) and either increase or decrease their activity, for example by preventing a messenger RNA from producing a protein. • RNA interference has an important role in defending cells against parasitic genes – viruses and transposons – but also in directing development as well as gene expression in general.
  • 62. The Mechanism of RNA Interference • Long dsRNAs enter a cellular pathway that is commonly referred to as the RNA interference (RNAi) pathway. • First, the dsRNAs are processed into 20-25 nucleotide (nt) small interfering RNAs (siRNAs) by an RNase III-like enzyme called Dicer. • Then, the siRNAs assemble into endoribonuclease-containing complexes known as RNA-induced silencing complexes (RISCs), unwinding in the process. • The siRNA strands subsequently guide the RISCs to complementary RNA molecules, where they cleave and destroy the cognate RNA. • Cleavage of the cognate RNA takes place near the middle of the region bound by the siRNA strand.
  • 63. Approaches for candidate gene discovery • Traditional candidate gene approach: position-dependent strategy, comparative genomics strategy, function-dependent strategy, combined strategy • Digital candidate gene approach
  • 64. Traditional candidate gene approach: position-dependent strategy • Identification of a candidate gene is based on physical linkage information in a QTL-identified chromosomal segment • Example: position of QTLs controlling field blast resistance in rice • Example: isolation of the Arabidopsis ABI3 gene
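The steps above can be turned into a toy model. The sketch below is a deliberate simplification written for illustration only: it cuts the sense strand of a long dsRNA into consecutive 21-nt pieces (a length chosen from the 20-25 nt range given above), derives the antisense guide strand for each piece, finds the complementary site on an mRNA, and reports the midpoint of that site as the approximate cleavage position. The function names and all sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def dice(sense_strand, length=21):
    """Cut the sense strand of a long dsRNA into consecutive siRNA-sized pieces."""
    return [sense_strand[i:i + length]
            for i in range(0, len(sense_strand) - length + 1, length)]


def cleavage_sites(mrna, sense_fragments):
    """Return (guide, site_start, approximate_cut) for each fragment found in the mRNA."""
    sites = []
    for fragment in sense_fragments:
        guide = reverse_complement(fragment)   # antisense strand loaded into RISC
        target = reverse_complement(guide)     # mRNA site complementary to the guide
        start = mrna.find(target)
        if start != -1:
            sites.append((guide, start, start + len(target) // 2))
    return sites


# Hypothetical 42-nt trigger embedded in a hypothetical mRNA.
trigger = "AUGGCUAGCUAGGAUCCGAUA" + "CGUUAGCAUGCCAUUGACGGA"
mrna = "GGGAAA" + trigger + "UUUCCC"
for guide, start, cut in cleavage_sites(mrna, dice(trigger)):
    print(guide, start, cut)
```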
  • 65. Comparative genomics strategy • Includes comparative functional genomics strategy and comparative structural genomics strategy • Candidate genes may be functionally conserved or structurally homologous genes
  • 66. Function dependent strategy • Results in the functional candidate gene approach, in which a putative candidate gene is the one that could be statistically detected from the genes controlling large components of inheritable gene expression variation. • Example- identification of new disease resistance genes in Tobacco
  • 67. Combined strategy • Combines at least two strategies • The genetical genomics approach, which originates from the function-dependent strategy, provides a powerful means to identify candidate genes. • Example: selection of candidate genes for the grape proanthocyanidin pathway
  • 68. Digital candidate gene approach (DigiCGA) • Novel web resource-based candidate gene identification approach. • DigiCGA can be defined as an approach that objectively extracts, filters, (re)assembles, or (re)analyzes all available resources derived from public web databases, mainly in accordance with the principles of biological ontology and complex statistical methods, to identify potential candidate genes of specific interest computationally. • Example: a combination of RNA-seq and DGE analysis based on next-generation sequencing technology was shown to be a powerful method for identifying candidate genes encoding enzymes responsible for the biosynthesis of novel secondary metabolites in a non-model plant. Seven CYP450s and five UDPGs were selected as potential candidates involved in mogroside biosynthesis. The transcriptome data from this study provide an important resource for understanding the formation of major bioactive constituents in the fruit extract from S. grosvenorii.
  • 69. Deciphering the function of genes in plant secondary metabolism • To complete the metabolic map for an entire class of compounds, it is essential to identify the gene-metabolite correlations of a metabolic pathway • Co-expression analysis is an effective approach for predicting genes involved in the same metabolic pathway • Co-expression analysis can be conducted using RNA-seq or microarray datasets obtained in expressly designed experiments, or by comparing existing publicly available data
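A common way to carry out co-expression analysis is to correlate the expression profile of a known pathway gene (a "bait") with every other gene across samples. The sketch below uses the Pearson correlation coefficient for this; the slide does not prescribe a particular statistic, so that choice, the threshold of 0.9, the gene names and the expression values are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpressed(expression, bait, threshold=0.9):
    """Genes whose profile correlates with the bait at or above the threshold."""
    reference = expression[bait]
    scores = {gene: pearson(reference, profile)
              for gene, profile in expression.items() if gene != bait}
    return sorted((g for g, r in scores.items() if r >= threshold),
                  key=lambda g: -scores[g])


# Hypothetical expression values across five samples.
example_expression = {
    "bait":  [1, 2, 3, 4, 5],
    "geneA": [2, 4, 6, 8, 10.5],   # rises with the bait
    "geneB": [5, 4, 3, 2, 1],      # falls as the bait rises
    "geneC": [3, 1, 4, 1, 5],      # weakly related
}
print(coexpressed(example_expression, "bait"))  # ['geneA']
```

Genes that pass the threshold are only candidates; as the surrounding slides note, functional analysis is still needed to confirm their role in the pathway.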
  • 71. Example • Comparative coexpression analysis between tomato and potato coupled with chemical profiling revealed an array of 10 genes that partake in SGA biosynthesis. Following systematic functional analysis, a revised SGA biosynthetic pathway starting from cholesterol up to the tetrasaccharide moiety linked to the tomato SGA aglycone. Silencing GLYCOALKALOID METABOLISM 4 prevented accumulation of SGAs in potato tubers and tomato fruit. This may provide a means for removal of unsafe, antinutritional substances present in these widely used food crops.
  • 72. Gene Inactivation •The ability to manipulate gene expression levels has been essential to the study of gene function and biological processes. • Classically, whole body deletions of genes were generated via homologous recombination. • The last few years have seen a revolution in the approaches scientists use to inactivate gene expression, such as the development of highly efficient ribonucleic acid interference (RNAi) delivery systems, Gene knock out and anti-sense.
  • 73. Gene Knockout • A gene knockout (abbreviation: KO) is a genetic technique in which one of an organism's genes is made inoperative ("knocked out" of the organism). • The resulting organisms, known as knockout organisms or simply knockouts, are used to learn about a gene that has been sequenced but whose function is unknown or incompletely known. • Researchers draw inferences from the differences between the knockout organism and normal individuals.
  • 74. KNOCKOUT MICE • A knockout mouse is a mouse in which a gene has been deleted or mutated so that it is inactivated. • A specific gene is targeted. • The loss of gene activity often causes changes in the mouse's phenotype and thus provides valuable information on the function of the gene.
  • 75. The researchers who developed the technology for creating knockout mice won the Nobel Prize in 2007 • The Nobel Prize in Physiology or Medicine 2007 was awarded jointly to Mario R. Capecchi, Sir Martin J. Evans and Oliver Smithies "for their discoveries of principles for introducing specific gene modifications in mice by the use of embryonic stem cells".
  • 76. GENERATION OF KNOCKOUT MICE BY HOMOLOGOUS RECOMBINATION • Create a knockout construct. • Introduce the knockout construct into mouse embryonic stem (ES) cells in culture. • Screen the ES cells and select those whose DNA includes the new gene. • Implant the selected cells into normal mouse embryos, making "chimeras". • Implant the chimeric embryos in pseudopregnant females. • The females give birth to chimeric offspring, which are subsequently bred to verify transmission of the new gene, producing a mutant mouse line.
  • 77. Knockout construct: • The gene to be knocked out is isolated from a mouse gene library. • A new DNA sequence is then engineered that is very similar to the original gene and its immediately neighbouring sequence, except that it is changed sufficiently to make the gene inoperable. • Usually, the new sequence is also given a marker gene: a gene that normal mice do not have and that either confers resistance to a certain toxic agent or produces an observable change (e.g. colour or fluorescence).
  • 80. Knockout mice to study genetic diseases • Knockout mice make good model systems for investigating the nature of genetic diseases, for testing the efficacy of different types of treatment, and for developing effective gene therapies for these often devastating diseases. • For instance, mice with a knockout of the CFTR gene show symptoms similar to those of humans with cystic fibrosis.
  • 81. Drawbacks of knockout mice • About 15% of gene knockouts are developmentally lethal, so the embryos cannot grow into adult mice. This makes it difficult to determine the function of the gene in adults. • Many genes that participate in interesting pathways are essential for mouse development, viability, or fertility. A traditional knockout of such a gene therefore cannot lead to the establishment of a knockout mouse strain for analysis.
  • 82. Antisense RNA Technology • Antisense RNA is a single-stranded RNA that is complementary to a messenger RNA (mRNA) strand transcribed within a cell. • It is introduced into a cell to inhibit translation by base pairing with the sense RNA; antisense DNA oligonucleotides additionally activate RNase H, which cleaves the RNA strand of the resulting RNA-DNA hybrid. The approach is used to develop novel transgenic organisms. • mRNA sequence (sense): 5'-AUGAAACCCGUG-3' • Antisense RNA: 3'-UACUUUGGGCAC-5'
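The base pairing between the two sequences on this slide (AUGAAACCCGUG and UACUUUGGGCAC) can be checked with a short sketch. The function names are hypothetical, and the strand orientation (mRNA written 5' to 3') is an assumption based on the AUG start codon.

```python
# Watson-Crick pairing rules for RNA.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def paired_strand(mrna):
    """Complement of the mRNA in the same left-to-right order.

    If the mRNA is written 5'->3', this is the antisense strand read 3'->5',
    i.e. aligned base by base under the mRNA.
    """
    mrna = mrna.upper()
    unknown = set(mrna) - set(COMPLEMENT)
    if unknown:
        raise ValueError(f"not an RNA sequence, unexpected bases: {sorted(unknown)}")
    return "".join(COMPLEMENT[base] for base in mrna)


def antisense_rna(mrna):
    """Antisense strand written in the conventional 5'->3' direction (reverse complement)."""
    return paired_strand(mrna)[::-1]


mrna = "AUGAAACCCGUG"
print("mRNA      5'-" + mrna + "-3'")
print("antisense 3'-" + paired_strand(mrna) + "-5'")
print("antisense 5'-" + antisense_rna(mrna) + "-3'")
```

Written in the same left-to-right order as the mRNA, the antisense strand matches the slide's UACUUUGGGCAC; when ordering or reporting an antisense molecule it is normally written 5' to 3', which is the reversed string.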
  • 83. How it differs from RNAi • The intended effect of both techniques is the same, but the mechanisms differ. • Antisense technology degrades the mRNA through RNase H, whereas in RNAi the enzyme Dicer processes double-stranded RNA into short interfering RNAs (siRNAs) that guide cleavage of the target mRNA. • siRNAs are double-stranded, so they are roughly twice the size of a single-stranded antisense oligonucleotide.
  • 84. Nature's Antisense System • The R1 plasmid in E. coli uses the hok (host killing)/sok (suppression of killing) system for post-segregational killing. • When an E. coli cell divides, the daughter cells inherit hok mRNA and sok antisense RNA from the parent cell, but sok RNA has a short half-life and is degraded quickly. • In a daughter cell that has lost the R1 plasmid, sok RNA cannot be replenished, so the more stable hok mRNA is translated, the Hok toxin accumulates, and the cell dies. • In a daughter cell that inherits the R1 plasmid, the sok gene is transcribed continuously from its own promoter, so sok RNA stays in excess over hok mRNA and, by base pairing with it, inhibits translation of the Hok protein.
  • 86. Flavr-Savr • Flavr-Savr, the first FDA-approved GM food, was developed by Calgene in 1992. • It was licensed on May 17, 1994. • During ripening, the tomato produces gradually increasing levels of the enzyme polygalacturonase (PG), which softens the fruit and eventually causes it to rot. • As a result, a ripe tomato does not last more than a few days without rotting. • Calgene introduced into the plant a gene that produces an RNA complementary (antisense) to the PG mRNA, inhibiting synthesis of the PG enzyme.
  • 87. INDIAN CONTRIBUTION • In February 2010, NIPGR (National Institute of Plant Genome Research) used antisense technology to develop a tomato that can last up to 45 days. There is therefore no need to pick tomatoes green and ripen them artificially with ethylene, to worry about whether they will reach market shelves, or to hurry to use them in the kitchen before they go mushy. • NIPGR scientists silenced the expression of two important genes responsible for the loss of firmness and texture during ripening.
  • 88. The two genes silenced are alpha-Man (alpha-mannosidase) and beta-Hex (beta-D-N-acetylhexosaminidase). Both encode glycosyl hydrolases, enzymes that break the chemical bond holding a sugar to another sugar or to some other molecule, such as a protein.
  • 89. Challenges to antisense technology • 1. One major challenge to antisense technology (and RNAi) is the difficulty of delivering it into the body. Delivery to the brain, for use in diseases such as Huntington's disease (HD), is especially challenging because the treatment must cross the blood-brain barrier. • 2. The second major challenge is potential toxicity. Although antisense molecules are engineered to be very specific, they can still cause unintended damage; in HD, for example, a treatment may suppress both the mutant and the normal huntingtin alleles.