Venice Juanillas at #ICG13: Rice Galaxy: an open resource for plant science
1. Rice Galaxy: an open resource for rice science
13th International Conference on Genomics
October 24, 2018
Venice Margarette B. Juanillas
Bioinformatics Cluster
Strategic Innovation Platform
2. Open source and free bioinformatics software is …
• typically run in a command-line environment (no mouse, no graphics!)
• Example: blat (BLAST-Like Alignment Tool), performing local
alignment between 2 (multi-)FASTA files…
Type in your command…
3. $ blat
blat - Standalone BLAT v. 34 fast sequence search command line tool
usage:
   blat database query [-tileSize=8] [-maxIntron=3000] [-out=psl] output.psl
where:
   database and query are each either a .fa, .nib or .2bit file,
   or a list of these files, one file name per line.
   -maxIntron=N   Sets maximum intron size. Default is 750000
   -out=type      Controls output file format. Type is one of:
                    psl  - Default. Tab-separated format, no sequence
                    pslx - Tab-separated format with sequence
   -tileSize=N    Sets the size of match that triggers an alignment.
                  Usually between 8 and 12
                  Default is 11 for DNA and 5 for protein.
   output.psl is where to put the output.
& so many other parameters to set …
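To make the usage text above concrete, here is a minimal Python sketch that assembles a blat command line from named parameters, roughly what a form-based front end would do behind the scenes. The function name `build_blat_command` and the file names are hypothetical; only the three options shown on the slide are handled, and the command is built but not executed.

```python
import shlex


def build_blat_command(database, query, output,
                       tile_size=None, max_intron=None, out_format=None):
    """Return the argument list for a blat run (hypothetical helper).

    Only the options shown in the usage text are supported here:
    -tileSize, -maxIntron and -out (psl or pslx).
    """
    args = ["blat", database, query]
    if tile_size is not None:
        args.append(f"-tileSize={tile_size}")
    if max_intron is not None:
        args.append(f"-maxIntron={max_intron}")
    if out_format is not None:
        # This sketch accepts only the two output types listed on the slide.
        if out_format not in ("psl", "pslx"):
            raise ValueError(f"unsupported output type in this sketch: {out_format}")
        args.append(f"-out={out_format}")
    args.append(output)
    return args


# Example with placeholder file names (assumptions, not from the talk).
cmd = build_blat_command("database.fa", "query.fa", "output.psl",
                         tile_size=8, max_intron=3000, out_format="psl")
print(shlex.join(cmd))
```

Returning a list rather than a single string means the result could be passed directly to `subprocess.run` without going through a shell.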
4. What if you could design a GUI for blat?
$ blat database query [-tileSize=8] [-maxIntron=3000] [-out=psl] output.psl
6. Analyses are often sequential…
Commonly called an analysis workflow or pipeline
– Use software1 with its own input file to generate
<output file 1>, then…
– Manipulate the text of <output file 1> so that it can be
used as the input file <manipulated outfile2> for software2
– Use software2 with <manipulated outfile2> as input
to generate <outfile3>
– Use <outfile3> as input for software3, then generate
the final output of the analysis…
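The chaining described above can be sketched in Python as a list of steps, each consuming the previous step's output. All four step functions below are toy stand-ins invented for illustration (they are not real tools), and data is passed as lists of text lines rather than files to keep the sketch self-contained.

```python
def run_pipeline(lines, steps):
    """Feed the output of each step into the next one, in order."""
    data = list(lines)
    for step in steps:
        data = step(data)
    return data


def software1(lines):
    # Toy stand-in: report the length of each "id<TAB>sequence" record,
    # preceded by a comment-style header line.
    out = ["#id\tlength"]
    for line in lines:
        seq_id, seq = line.split("\t")
        out.append(f"{seq_id}\t{len(seq)}")
    return out


def manipulate(lines):
    # Text manipulation between tools: drop header lines the next tool cannot read.
    return [line for line in lines if not line.startswith("#")]


def software2(lines):
    # Toy stand-in: keep records whose length is at least 4.
    return [line for line in lines if int(line.split("\t")[1]) >= 4]


def software3(lines):
    # Toy stand-in: produce the final summary of the analysis.
    return [f"{len(lines)} sequences kept"]


result = run_pipeline(
    ["a\tACGT", "b\tAC", "c\tACGTAC"],
    [software1, manipulate, software2, software3],
)
print(result)  # ['2 sequences kept']
```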
8. https://galaxyproject.org
The Galaxy Project is supported in part by NSF,
NHGRI, The Huck Institutes of the Life Sciences, The
Institute for CyberScience at Penn State, and Johns
Hopkins University.
9. Galaxy has features that fit our needs
“Open, web-based platform for accessible,
reproducible, and transparent computational
biomedical research”
• Accessible: Users without programming experience can
easily specify parameters and run tools and workflows
• Reproducible: Galaxy captures information so that any user
can repeat and understand a complete computational
analysis
• Transparent: Users share and publish analyses via
the web and create interactive, web-based documents
that describe a complete analysis.
12. Rice Galaxy Project
Rice Galaxy: Bioinformatics tools, datasets, and
reusable workflows for rice genomic and genetic
analyses
Collaboration of Institutions
IRRI: Philippines
IRD, CIRAD: France
Colorado State University, Texas A&M University, Indiana University: USA
Advanced Institute of Science and Technology: Japan
16. Data accessible/integrated into Rice Galaxy
• 3,000 genomes SNP / indel & phenotype data
• Rice HDRA genotyping and phenotyping data
• 7 (+2 older Nipponbare) published rice genomes
and annotations
17. Tools Integrated into Rice Galaxy
Workflows and tools dedicated to rice from both
bioinformatics platforms (South Green
Bioinformatics and IRRI):
1. 3k RG and HDRA Toolkit (IRRI)
2. SNP Data Analysis Tools
3. TASSEL bioinformatics (South Green, IRRI) for GBS data management
4. OGHMA genomic prediction tool (IRRI)
5. RAVE (Rapid Allelic Variant Extractor) to extract variants from the 3,000 genomes
6. SNiPlay workflows (7) for diversity and population structure analysis and GWAS studies (South Green)
7. Uniqprimer microbial pathogen diagnostic design toolkit (CSU/USDA/South Green/IRRI)
22. Genomic Prediction Tool suite
Aim: Tools that use machine learning algorithms to decipher
genotypes and understand how they affect phenotype in rice
26. Basic Use Case
• Find the position of a Nipponbare gene in IR8
Nipponbare: chr01 11218-12435
How about in IR8?
1. Get the gene sequence from Nipponbare (GD -> Get Gene Sequence)
2. Align to another reference genome, IR8 (SDT -> Find-seq)
3. Post-process, clean up the alignment
– Cut columns 14, 16, 17 (TM -> cut columns)
– Remove the first 5 lines (TM -> remove beginning…)
4. Extract liftover sequences (SDT -> batch-get-subseq)
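The post-processing in step 3 can be sketched in Python as follows. This assumes the alignment output is tab-separated with 5 header lines, as in the PSL format, where columns 14, 16 and 17 hold the target name, start and end; the talk does not state the output format of the alignment tool, so treat that as an assumption. The function name `clean_alignment` and the sample row (including its IR8 coordinates) are invented for illustration.

```python
def clean_alignment(lines, skip=5, keep_columns=(14, 16, 17)):
    """Drop the first `skip` lines, then keep only the given 1-based columns."""
    cleaned = []
    for line in lines[skip:]:
        if not line.strip():
            continue  # ignore blank lines after the header
        fields = line.rstrip("\n").split("\t")
        cleaned.append("\t".join(fields[c - 1] for c in keep_columns))
    return cleaned


# Five placeholder header lines followed by one made-up 21-column alignment row.
header = [f"header line {i}" for i in range(1, 6)]
row = "\t".join([
    "1218", "0", "0", "0", "0", "0", "0", "0", "+",   # columns 1-9
    "gene1", "1218", "0", "1218",                      # columns 10-13 (query)
    "chr01", "43000000", "10500", "11718",             # columns 14-17 (target)
    "1", "1218,", "0,", "10500,",                      # columns 18-21 (blocks)
])

for hit in clean_alignment(header + [row]):
    print(hit)  # chr01	10500	11718
```

The three remaining columns give a region (sequence name, start, end) that a subsequence-extraction step such as step 4 could take as input.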
28. Rice Galaxy Open Access
All meaningful data objects must have a globally unique
and persistent identifier (PID) for CGIAR open access
compliance
29. Rice Galaxy Tool shed
• Allow other researchers to use the tools
• Allow tool shed enrichment by hosting tools from other
researchers in the rice community
• Rice Galaxy Tool shed: http://52.76.88.51:8081/
30. Conclusion
• Rice Galaxy is a federated Galaxy resource tailored for
rice genetics, genomics and breeding
• Rice Galaxy integrates publicly available rice datasets
and tools from other researchers in the rice community
31. Thank You!
• Alexis Dereeper
• Nicolas Beaume
• Gaetan Droc
• Joshua Dizon
• John Robert Mendoza
• Jon Peter Perdon
• Locedie Mansueto
• Lindsay Triplett
• Jillian Lang
• Gabriel Zhou
• Jay Santos
• Dennis Diaz
• DOST-ASTI
• Kunalan Ratharanjan
• Beth Plale
• Jason Haga
• Jan E. Leach
• Manuel Ruiz
• Michael Thomson
• Nickolai Alexandrov
• Pierre Larmande
• Ramil P. Mauleon