By Florian Maumus and Hadi Quesneville
We present our opinions, recent developments and perspectives regarding whole-genome repeatome annotation.
This talk was presented by Florian Maumus at the Barbados Workshop on the Computational Identification and Analysis of Transposable Elements, Holetown, Barbados, April 18-24, 2014.
Bits of the Green Junk
1. Barbados Workshop on the Computational Identification
and Analysis of Transposable Elements
April 18th - 25th, 2014
Florian Maumus with Hadi Quesneville (URGI-INRA, Versailles, France)
4. De novo repeatome detection
Deep repeatome annotation
Repeat annotation in large genomes
5. De novo repeatome detection
Deep repeatome annotation
Repeat annotation in large genomes
6.
7. Repeat complement = Repeatome
The Repeatome includes:
Transposable elements
Endogenous viruses
Tandem repeats
Ribozymes
Genes
…
= What you get with repeat-finders!
9. Dark matter, the genomic humus
"Repeats": detected (burst)
Old repeats: detectable? (decay)
Dark matter: background noise (melt)
10. Complexity of the repeatome
Turnover ++
Recent activity +++
Turnover -
Recent activity -
young
old
11. Different history, different challenges
Maize
2.3 Gb genome
About 85% repeats
Human
3.2 Gb genome
About 50% repeats
12. LECA (last eukaryotic common ancestor):
Core eukaryotic genes +
Copia, Gypsy, LINEs,
DNA transposons…
TEs have been jumping around genes over evolutionary times
13.
14. Archeology toolbox
[Image: contents of an archaeologist's tool kit]
26. [Bar chart: genome coverage increase (%), 0 to 35, for TEdenovo, RepeatModeler, and RepeatScout]
REPET, RepeatScout, and RepeatModeler employ
complementary computational methods that together
give a better representation of repeatome complexity.
27. Conclusions I
TEdenovo outperforms RepeatModeler and RepeatScout
Greater coverage with
Fewer consensus sequences
Longer consensus sequences
Longer copies
Complementarity of TEdenovo, RepeatModeler, and RepeatScout
Comprehensive annotation of complex repeatomes
28. De novo repeatome detection
Deep repeatome annotation
Repeat annotation in large genomes
29. Arabidopsis
120 Mb
Experimental model
[Genome composition bar, 0% to 100%: CDS, repeatome, dark matter]
Three strategies with REPET:
Annotate genome with genomic copies
Use relaxed parameters for HSP detection
Use P-clouds to detect short repeat fragments
35. Dinucleotide composition
[Figure: frequencies of the 16 dinucleotides (AA to TT) for the CDS, TEdenovo, delta_2vs1, delta_3vs2, and delta_4vs3 sequence sets; y-axis ticks at -0.05, 0.05, and 0.15]
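The slide does not say how the plotted dinucleotide values were derived, so the following is only a minimal sketch of one common way to compare dinucleotide composition between two sequence sets. The function names `dinucleotide_frequencies` and `composition_difference` are hypothetical, and the choices to count overlapping windows and to ignore any window containing a non-ACGT character are assumptions.

```python
from collections import Counter
from itertools import product


def dinucleotide_frequencies(seq):
    """Return relative frequencies of the 16 A/C/G/T dinucleotides.

    Overlapping windows are counted; windows containing any other
    character (e.g. N) are ignored. Both choices are assumptions.
    """
    seq = seq.upper()
    keys = ["".join(p) for p in product("ACGT", repeat=2)]
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts[k] for k in keys)
    if total == 0:
        return {k: 0.0 for k in keys}
    return {k: counts[k] / total for k in keys}


def composition_difference(seq_a, seq_b):
    """Per-dinucleotide frequency difference (seq_a minus seq_b)."""
    freq_a = dinucleotide_frequencies(seq_a)
    freq_b = dinucleotide_frequencies(seq_b)
    return {k: freq_a[k] - freq_b[k] for k in freq_a}
```

In practice each sequence set (for example all CDS, or all copies from one annotation round) would be concatenated or summed per sequence before comparison; that aggregation step is left out here.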
36. Relevance
Genome annotation using the delta_2vs1 copies
masks as much as 23 Mb (19.5%) of the genome
Covers 66% of the reference annotation
and 56% of the TEdenovo annotation
The supplementary annotations from
TEdenovo_2 are highly representative of the A.
thaliana repeatome.
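Coverage figures such as those above are typically obtained by merging overlapping annotation intervals and dividing the covered length by the genome size. The sketch below illustrates that arithmetic only; it is not the REPET implementation. The names `covered_bases` and `coverage_percent` are hypothetical, and intervals are assumed to be 0-based, half-open `(start, end)` pairs on a single sequence.

```python
def covered_bases(intervals):
    """Count bases covered by at least one (start, end) interval.

    Intervals are assumed 0-based and half-open, all on one sequence.
    Overlapping or adjacent intervals are merged before counting.
    """
    total = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Close the previous merged block, if any, and open a new one.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlap or adjacency: extend the current block.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


def coverage_percent(intervals, genome_size):
    """Percentage of the genome covered by the merged intervals."""
    return 100.0 * covered_bases(intervals) / genome_size
```

For a multi-chromosome genome, the same function would be applied per sequence and the covered lengths summed before dividing by the total genome size.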
42. Deep annotation of the A. thaliana repeatome
Input libraries: RepeatScout, RepeatModeler, TEdenovo, Repbase (+ Buisine et al.)
Remove redundancy
Bundle library
TEannot
Consensus size
43. Deep annotation of the A. thaliana repeatome
selected
not selected
TEannot
P-clouds
Complete bundle annotation
48. Bundle + P-clouds
=> Repeated and repeat-derived sequences make up
at least 30% of the A. thaliana genome
Enhanced repeat detection in gene-rich regions
49. Arabidopsis repeats browser
Genes
Buisine et al.
RepeatScout
RepeatModeler
REPET
Deep annotations
24-nt sRNA
50. Conclusions II
Innovative approaches for deep repeatome annotation
About one third of the A. thaliana genome is of repetitive origin (vs 24%)
Increased sensitivity and detection of old repeat remnants
Improved genome evolution and epigenetic analyses
Continuum between repeatome and genomic dark matter
Time
51. De novo repeatome detection
Deep repeatome annotation
Repeat annotation in large genomes
52. All genomes should benefit from the greater quality of
TEdenovo
Adapted from Nina V. Fedoroff (2012) and Steven M. Carr
53. Limitations with REPET
All-by-all genome comparison => LOTS (Gb) of high-scoring pairs (HSPs)
HSP files > 1 Gb are not handled by Piler
Grouper can run for weeks
Until recently, it was impossible to run TEdenovo on whole large and/or highly
repetitive genomes
54. Solutions
Use a sample of the whole genome as input for TEdenovo (e.g. 300 Mb),
as recommended for RepeatModeler
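The slide does not describe how the genome sample is drawn. One simple possibility, sketched here under that caveat, is to cut each sequence into fixed-size chunks and pick chunks at random until the target size is reached. The function name `sample_chunks`, the 1 Mb chunk size, and the fixed seed are all hypothetical choices, not REPET or RepeatModeler settings.

```python
import random


def sample_chunks(seq_lengths, target_bp, chunk_bp=1_000_000, seed=0):
    """Pick random non-overlapping chunks totalling about target_bp.

    seq_lengths maps sequence name -> length in bp. Returns a sorted list
    of (name, start, end) half-open coordinates and the total sampled size.
    """
    chunks = []
    for name, length in seq_lengths.items():
        for start in range(0, length, chunk_bp):
            chunks.append((name, start, min(start + chunk_bp, length)))

    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    rng.shuffle(chunks)

    picked, total = [], 0
    for name, start, end in chunks:
        if total >= target_bp:
            break
        picked.append((name, start, end))
        total += end - start
    return sorted(picked), total
```

The returned coordinates could then be used to extract the corresponding subsequences from the genome FASTA with any sequence toolkit. Random chunking is only one option; sampling whole chromosomes or scaffolds would be an equally plausible reading of the slide.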
59. De novo repeat annotation in large genomes
Future developments
Parallelize Grouper
Parallelize the “Long join” procedure
Establish phyla-specific approaches
Develop strategies to annotate genomes with different
composition (e.g. old, complex repeatomes as compared to large plant
genomes)
60. De novo repeat annotation in large genomes
Future challenges & perspectives
Offer the TEdenovo and TEannot pipelines on Galaxy
Deliver a REPET compilation for use on a cloud
61. Véronique Jamilloux
Tina Alaeitabar
Timothée Chaumier
Olivier Inizan
Mark Moissette
Hadi Quesneville
THANK YOU!