Bioinformatics Data Manipulation:
Molecular Online Tools & BioExtract Server
Theme: FXN Gene and Pancreatic Cancer.
Lab #1
Etienne Z. Gnimpieba
BRIN WS 2013
Mount Marty College – June 24th 2013
Etienne.gnimpieba@usd.edu
Context
0. Specification & Aims
Statement of problem / Case study: The FXN gene provides instructions for making a protein called frataxin. This protein is found in cells throughout the body, with the highest levels in the heart,
spinal cord, liver, pancreas, and the muscles used for voluntary movement (skeletal muscles). Within cells, frataxin is found in energy-producing structures called mitochondria. Although
its function is not fully understood, frataxin appears to help assemble clusters of iron and sulfur molecules that are critical for the function of many proteins, including those needed for energy
production. Mutations in the FXN gene cause Friedreich ataxia. Friedreich ataxia is a genetic condition that affects the nervous system and causes movement problems. Most people with Friedreich
ataxia begin to experience the signs and symptoms of the disorder around puberty.
Bioinformatics Molecular Online Tools and Server
Keywords:
Bio: FXN, Frataxin, pancreatic cancer, CDKN2A
Math: HMM
Informatics: programming, bioinformatics tools, getting and exporting data
Reduced expression of frataxin is the cause of Friedreich's ataxia (FRDA), a lethal neurodegenerative disease. How about liver cancer?
Aim: The purpose of this lab is to introduce online tools for exploring
large-scale biological data (metabolic, proteomic, genomic, …) for the
human model. We apply them to the FXN gene and pancreatic cancer to
show how a researcher can cross-reference the biological knowledge
available in data banks.
Acquired skills
Online and server tools:
- Query biological DBs (FASTA, HTML, txt, figure formats)
- Sequence tools (protein and gene): alignment (showalign, clustalw2), similarity, …
- Manage result data (select, keep, map, export)
- Build and reuse workflow
Biological Hypothesis
[Figure: FXN on chromosome 9; frataxin molecule structure (PyMOL); pancreatic cancer and pancreas anatomy; biological databases and tools]
Resolution Process
T1. Metabolomics
Objective: Use a metabolic data repository to understand the frataxin protein mechanism.
T1.1. Finding the Enzyme and Pathway related to Frataxin using KEGG
T1.2. Finding the Reaction involved with Frataxin using Reactome
T1.3. Using BRENDA for enzyme data on Frataxin
T1.4. Using Collected data for Analysis
T1.5. Redo the process with Pancreatic Cancer
T2. Genome exploration
Objective: Use Ensembl to localize the FXN gene on the human genome and identify the genes implicated in pancreatic cancer.
T2.1. Locate a given gene on human genome
T2.2. Get a genomic sequence from NCBI
T2.3. Get the protein data and sequence from EBI
T2.4. Save the exported sequence data in the data folder
T3. Sequences manipulation
Objective: Find similar sequences using BLAST tools and make an alignment on given sequences.
T3.1. Find similar sequences using BLAST tool
T3.2. Align generated sequences with ClustalW tool
T3.3. Visualize results using a phylogenetic tree in Jalview
T4. Protein Data and Structural Biology Knowledge
Objective: To study frataxin at the protein level and its connection with pancreatic cancer (functional and structural data)
T4.1. Structural Knowledge on Frataxin using SBKB
T4.2. Using Uniprot for Frataxin Protein Study
T4.3. Protein-Protein Interaction using STRING
T4.4. Using same method for Pancreatic Cancer and compare
T5. BioExtract server
Objective: Use server tools to optimize the data manipulation process, applied on the BioExtract server.
T5.1. Server Initialization
T5.2. Pancreatic cancer & Frataxin (FXN)
T5.3. Mapping, Alignment
T5.4. Workflow save & reuse
Results
T1. Metabolomics
Objective: Use a metabolic data repository to understand the frataxin protein mechanism
Theme: Frataxin (FXN) implication in the pancreatic cancer genesis
T1.1. Finding the Enzyme and Pathway related to Frataxin using KEGG
On the KEGG Database website: http://www.genome.jp/kegg/
o Search frataxin, and select the first result under KEGG Gene Database (hsa:2395)
o Copy the E.C. number given in “Definition” (EC:1.16.3.1)
o To find the related pathway, search the E.C. number in the general KEGG Database search (click on the KEGG logo at the top)
o Select the result given in the KEGG Enzyme Database at the bottom. Here you can see how this enzyme is involved in the given metabolism.
T1.2. Finding the Reaction involved with Frataxin using Reactome
On the Reactome website: http://www.reactome.org/ReactomeGWT/entrypoint.html
o Search frataxin and select the 4th result with Frataxin in the title. This shows the pathway model related to frataxin and how frataxin is involved in it.
T1.3. Using BRENDA to find information on Frataxin
On the BRENDA Database website: http://www.brenda-enzymes.org/
o Search using the E.C. number obtained in T1.1 and select the result given. This website gives a wide range of information on the enzyme, including the reaction, related species, and so on. At the very bottom of the webpage you can select other databases that have information on the same compound or protein.
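The KEGG lookup in T1.1 can also be scripted. The sketch below is a minimal, hedged example: it assumes the public KEGG REST service at rest.kegg.jp (its "get" and "link" operations), builds the two request URLs without contacting the server, and pulls the E.C. number out of a toy definition string shaped like the one quoted in T1.1. The helper names are hypothetical and not part of the lab.

```python
import re
from urllib.parse import quote

# Assumption: the public KEGG REST service; no request is sent here.
KEGG_REST = "https://rest.kegg.jp"


def kegg_get_url(entry: str) -> str:
    """URL that returns the flat-file record for one KEGG entry."""
    return f"{KEGG_REST}/get/{quote(entry, safe=':')}"


def kegg_link_url(target_db: str, entry: str) -> str:
    """URL listing the entries of target_db that are linked to entry."""
    return f"{KEGG_REST}/link/{target_db}/{quote(entry, safe=':')}"


def extract_ec_numbers(text: str) -> list:
    """Return every fully specified EC number written as 'EC:a.b.c.d'."""
    return re.findall(r"EC:(\d+\.\d+\.\d+\.\d+)", text)


# Toy definition string (hypothetical wording, real E.C. number from T1.1).
definition = "frataxin [EC:1.16.3.1]"
ec_numbers = extract_ec_numbers(definition)
gene_url = kegg_get_url("hsa:2395")
pathway_url = kegg_link_url("pathway", "ec:" + ec_numbers[0])

print(ec_numbers)
print(gene_url)
print(pathway_url)
```

Opening the printed URLs in a browser should give plain-text versions of the pages visited by hand above. Partial E.C. numbers such as 1.16.3.- are deliberately not matched by this pattern.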
T1.4. Using Collected Information to Analyze the Data
On the BioModels website: http://www.ebi.ac.uk/biomodels-main/
o Search using the E.C. number obtained in T1.1 and select the first result given. Here you can download the SBML file for this pathway (top left corner), save it in the student folder, and analyze it on the semanticSBML website: http://semanticsbml.org/semanticSBML/simple/index
o Click on the first box “Find Similar Models”, click “Browse”, and select the file you just saved from BioModels. On this website you can use multiple tools to analyze the model and compare it with other models.
T1.5. Same Process Searching for Pancreatic Cancer Results (Optional)
o Use the same process, searching instead for pancreatic cancer results.
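An SBML file such as the one downloaded in T1.4 is plain XML, so it can be inspected locally before uploading it anywhere. The sketch below is a minimal example using only the Python standard library; the embedded model is a made-up toy (its ids are hypothetical), not the BioModels file, and the function name is an assumption for illustration.

```python
import xml.etree.ElementTree as ET


def local(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def summarize_sbml(xml_text: str) -> dict:
    """List the ids of all <species> and <reaction> elements in a model."""
    root = ET.fromstring(xml_text)
    species, reactions = [], []
    for elem in root.iter():
        name = local(elem.tag)
        if name == "species":
            species.append(elem.get("id"))
        elif name == "reaction":
            reactions.append(elem.get("id"))
    return {"species": species, "reactions": reactions}


# Toy model with hypothetical ids, only to exercise the function.
TOY_SBML = """<sbml xmlns="http://www.sbml.org/sbml/level2/version4" level="2" version="4">
  <model id="toy_model">
    <listOfSpecies>
      <species id="Fe2" compartment="c"/>
      <species id="Fe3" compartment="c"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="iron_oxidation">
        <listOfReactants>
          <speciesReference species="Fe2"/>
        </listOfReactants>
      </reaction>
    </listOfReactions>
  </model>
</sbml>"""

summary = summarize_sbml(TOY_SBML)
print(summary)
```

For a file saved on disk, `ET.parse(path).getroot()` can replace `ET.fromstring`. Note that `speciesReference` elements are not counted as species, because only the exact local name is matched.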
T2. Genome Exploration
Objective: Use Ensembl online tools to localize the FXN gene on the human genome and identify the genes implicated in pancreatic cancer. Next, retrieve the corresponding sequence data in FASTA format.
Theme: Frataxin (FXN) implication in the pancreatic cancer genesis
T2.1. Locate a given gene on human genome
On the Ensembl website: http://uswest.ensembl.org/index.html
o Select our species “human”
o Do a keyword search using the term “FXN”
o Follow the link of the “Gene” drop down feature
o Click the link for “Location”
o Export this gene as a FASTA sequence in an HTML file by clicking “Export data” (left side bar)
o Click Next
o Click the “HTML” link
o Do the same process by searching for “pancreatic cancer”. When you find the list of genes, select the CDKN2A gene
T2.2. Get a genomic sequence from NCBI (42 databases)
On the NCBI website: http://www.ncbi.nlm.nih.gov/guide/
o Pull down “All Databases” and select the “Gene” database, then do a keyword search using the term FXN
o Click the corresponding Homo sapiens FXN gene (first result)
o Scroll down and look for the “NCBI Reference Sequences” title and go to the subtitle “mRNA and Proteins”
o Click on the accession number of the first transcript variant (NM_000144.4)
o Get the same sequence in FASTA format by clicking on the “FASTA” link
o Click “Send” at the top right (in blue), select complete record, file, FASTA, and “Create File”, then save in the student folder if possible (the file is saved in Downloads automatically)
T2.3. Get the protein information and sequence from EBI
The common protein name for FXN is Frataxin
On the EBI website: http://www.ebi.ac.uk/
o Type “FXN” in the search and click on “find”
o Select the Homo sapiens Frataxin to get all the information about the protein (function, domains, structure, gene expression, …)
o Don’t close the window
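The FASTA download in T2.2 can also be requested programmatically, and the resulting file is simple to read. The sketch below is a hedged example: it assumes the NCBI E-utilities `efetch` endpoint with the `db`, `id`, `rettype`, and `retmode` parameters, builds the URL for the accession used above without sending a request, and parses a made-up FASTA string (the two toy records are not real sequences).

```python
from urllib.parse import urlencode

# Assumption: the NCBI E-utilities efetch endpoint; no request is sent here.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession: str, db: str = "nuccore") -> str:
    """Build the URL that would return one record as FASTA text."""
    params = {"db": db, "id": accession, "rettype": "fasta", "retmode": "text"}
    return EFETCH + "?" + urlencode(params)


def parse_fasta(text: str) -> list:
    """Return (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


url = efetch_fasta_url("NM_000144.4")
# Made-up records, only to show the multi-line layout of a FASTA file.
toy = ">seq1 made-up example\nATGTGG\nACTC\n>seq2 made-up example\nGGCC\n"
records = parse_fasta(toy)

print(url)
print(records)
```

The parser joins the wrapped sequence lines of each record, which is the main thing to get right when reading the exported files by hand or by script.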
T3. Sequences Manipulation
Objective: Find similar sequences using BLAST tools and make an alignment on given sequences.
Theme: Frataxin (FXN) implication in the pancreatic cancer genesis
T3.1. Find similar sequences using BLAST tool
o Continuing from Task T2.3, select the “Protein” tab on the left and select “view sequence in Uniprot”
o You can get the FASTA format of the protein by clicking on “fasta” in the top right
o Go back to the previous page (using the browser’s back button) and check the box next to the first sequence under the “Sequences” title.
o Select the “Blast” tool in the drop down menu, then click on “Go”.
o The best matched sequences will appear on the first page (green indicates a better match). To see other sequences you can click on “next”. BLAST parameters can be modified by clicking on “Options” at the top
T3.2. Align generated sequences with ClustalW tool
o Select about 10 different species, then click on “Align” at the bottom of the screen. The selected sequences will be inserted directly into the ClustalW tool and the tool will run automatically.
o From the right menu, it is possible to highlight similarities, polar residues, aromatic residues, etc., if interested
o Through the same page you may add further sequences to the same alignment if needed. You can also access the phylogenetic tree. More details about the residues and the distances can be obtained by clicking on “Jalview” on the top right in orange. (You may have to open Jalview manually)
o In Jalview, click “file”, “add sequences”, “from file”, then select the sequence file you saved earlier.
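To make "similarity" between aligned sequences concrete, the sketch below computes percent identity for one pair of rows taken from an alignment. It is a simplified illustration, not the score reported by BLAST or ClustalW: it assumes both rows have equal length, uses "-" for gaps, and ignores any column where either row has a gap (other definitions of identity use a different denominator). The two sequences are made up.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of gap-free alignment columns where both residues match."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Made-up aligned rows: 6 gap-free columns, 5 of them identical.
row_a = "MKT-AIL"
row_b = "MKSWAIL"
identity = percent_identity(row_a, row_b)
print(round(identity, 1))
```

Applying the function to every pair of rows in an alignment gives a simple pairwise identity table, which is one way to sanity-check the groupings shown in a phylogenetic tree.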
T4. Protein Data and Structure Data
Objective: To study frataxin at the protein level and its connection with pancreatic cancer (functional and structural data)
Theme: Frataxin (FXN) implication in the pancreatic cancer genesis
T4.1. Structural Knowledge on Frataxin using SBKB
On the Structural Biology Knowledgebase (SBKB): http://www.sbkb.org/
o Select “by text” (options on left) and search “frataxin”.
o For our example, select the link next to “Structures and annotations…”. Here you can obtain information on all the different hits, such as structure, by looking under all the given tabs.
T4.2. Using Uniprot for Frataxin Protein Study
On the Uniprot Database: http://www.uniprot.org/
o Search frataxin, select the first 3 results given, and click “Download” in the top right. You can then “Open” or “Download” any of the results given
T4.3. Protein-Protein Interaction using STRING
On the STRING Database: http://string-db.org/
o Under “search by name”, search “FXN”.
o Select the first result given and click “Continue”. Here you can look at the Protein-Protein Interaction model and obtain more information on a given protein or interaction by clicking on it in the model, as well as use many other useful tools.
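The FASTA files downloaded from UniProt in T4.2 carry useful metadata in each header line. The sketch below parses that header, assuming the UniProtKB layout `>db|accession|entry_name protein OS=organism OX=taxon_id GN=gene ...`. The sample header is a trimmed illustration for human frataxin (the trailing PE and SV fields are omitted), and the function name is hypothetical.

```python
import re

# Assumed UniProtKB FASTA header layout; the GN field is treated as optional.
HEADER_RE = re.compile(
    r"^>?(?P<db>sp|tr)\|(?P<accession>[^|]+)\|(?P<entry>\S+)\s+"
    r"(?P<protein>.*?)\s+OS=(?P<organism>.*?)\s+OX=(?P<taxid>\d+)"
    r"(?:\s+GN=(?P<gene>\S+))?"
)


def parse_uniprot_header(header: str) -> dict:
    """Split a UniProtKB FASTA header into named fields."""
    match = HEADER_RE.match(header.strip())
    if match is None:
        raise ValueError("not a UniProtKB-style FASTA header: " + header)
    return match.groupdict()


# Trimmed illustrative header (PE and SV fields left out on purpose).
sample = ">sp|Q16595|FRDA_HUMAN Frataxin, mitochondrial OS=Homo sapiens OX=9606 GN=FXN"
info = parse_uniprot_header(sample)
print(info["accession"], info["organism"], info["gene"])
```

Extracting the organism this way makes it easy to check that a downloaded set of sequences really covers the species you intended to compare.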
T4.4. Using same method for Pancreatic Cancer and compare
o Go back to the STRING Database home page and, under “multiple names”, search “frataxin” and “pancreatic cancer”. Select the first result.
o Select all three results given and click “Continue”. The page shows the 3 proteins we selected; however, no interactions are shown between them in this database.
o You can widen the results by changing the search to cancer in general.
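The "multiple names" check in T4.4 amounts to asking whether any pair of the chosen proteins appears in the interaction table. The sketch below is a hedged illustration: it assumes the STRING web API network endpoint with tab-separated output and the column names `preferredName_A`, `preferredName_B`, and `score`; it only builds the request URL and then filters a made-up table with placeholder gene names. The 0.4 cutoff is an assumed default, and the helper names are hypothetical.

```python
import csv
import io
from urllib.parse import urlencode

# Assumption: STRING API network endpoint with TSV output; nothing is fetched.
STRING_API = "https://string-db.org/api/tsv/network"


def string_network_url(names, species=9606):
    """Build a network query; identifiers are joined by carriage returns."""
    params = {"identifiers": "\r".join(names), "species": species}
    return STRING_API + "?" + urlencode(params)


def interacting_pairs(tsv_text, min_score=0.4):
    """Return the unordered name pairs whose combined score passes the cutoff."""
    pairs = set()
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        if float(row["score"]) >= min_score:
            pairs.add(frozenset((row["preferredName_A"], row["preferredName_B"])))
    return pairs


url = string_network_url(["FXN", "CDKN2A"])

# Made-up table with placeholder names and scores, not STRING data.
toy_tsv = (
    "preferredName_A\tpreferredName_B\tscore\n"
    "GENE_A\tGENE_B\t0.9\n"
    "GENE_A\tGENE_C\t0.2\n"
)
pairs = interacting_pairs(toy_tsv)
print(url)
print(frozenset(("GENE_A", "GENE_B")) in pairs)
```

An empty result for the proteins of interest corresponds to the "no interactions shown" outcome described above, and lowering `min_score` is the scripted counterpart of widening the search.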
T5. BioExtract Server
Objective: Use Workflow Management Systems (WMS) to optimize data manipulation processes (BioExtract Server).
Theme: Frataxin (FXN) implication in the pancreatic cancer genesis
On the BioExtract Server: http://bioextract.org
T5.1. Server Initialization
o Register on the BioExtract Server to be able to create and save your own workflows.
o Click on the “workflows” tab, then click “create and import workflows”. Now click “record workflow”, then “close”.
o To obtain the workflow at the end of the lab: from the “workflows” tab click on “create and import workflows”, then click on “save records”.
T5.2. Pancreatic cancer & Frataxin (FXN) data
o Select the “query” tab. Then select the protein sequences and check the box next to the NCBI protein database. Select “gene” as the search field and type “FXN”. Click on “Add Search Line”, select “Species”, and type “Human”. Submit the query.
o Results will appear on the “extract” page. You can get the GenBank view of each sequence by clicking on “View record”. We need only the Homo sapiens frataxin: click “select records”, check the corresponding box, then click on “keep only selected records”. The results can be saved or extracted in FASTA or txt format (export the records in FASTA format).
o Click on the “tools” tab, then click on “Alignment Tools” and “showalign”. Select “Use records on extract page formatted in Fasta”.
o Click on “execute” to run the tool. When execution is complete, results can be retrieved by selecting the desired format and clicking on “view results”.
o Repeat the search process with “pancreatic cancer”. Make sure you change the first search field to “all text” (Optional)
T5.3. Mapping, Alignment
o (If the previous step was skipped, skip this step as well.) Go to the “query” tab again and search “FXN”. Select a few listings and export them as done in T5.2. Go to the “tools” tab.
o Select similarity search tools, then select “blastp”. Select “use records on extract page formatted as Fasta”. Under “choose search set”, select the database “swissprot”.
o When execution is complete, go to the “extract” page and select 10 different sequences belonging to 10 different species, including human, then click “keep only selected records”. Export the records again.
o Go to the “tools” tab again, select “iPlant”, then “clustalw2”. Select “use records on extract page formatted as Fasta”. Your 10 protein sequences will be automatically used as input to the clustalw2 tool. Execute the tool. Use the pull down for “Search Results” and select “clustalw2.fa” before viewing the results.
T5.4. Workflow save & reuse
o Go back to the “workflows” tab and click “create and import workflows”. Write a name and a description for your workflow, then click on “Save”. All the previous steps will be saved in this workflow.
o Once the workflow is saved, you will find it at the bottom of the workflow list. Click on the name of the workflow to see a schematic view of it. Run the workflow by clicking on “start”.
o Get and verify all the results by clicking on “provenance”. The general report can be saved for later analysis. Results of each tool can be viewed or saved by clicking on “view file”.
o The same workflow can be executed for another query by simply modifying the accession number of the protein. (Click “save” in the “create and import workflows” section to temporarily save the new query)
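The idea of recording a workflow once and re-running it on a new query can be sketched in a few lines. This is a conceptual illustration only, not how the BioExtract Server is implemented: a workflow is modeled as an ordered list of named steps, each step is a plain function on a list of (species, sequence) records, and a small provenance log notes how many records remain after each step. All names and sequences below are made up.

```python
from typing import Callable, List, Tuple

Record = Tuple[str, str]  # (species, sequence); hypothetical record layout
Step = Tuple[str, Callable[[List[Record]], List[Record]]]


def drop_short(records: List[Record], min_len: int = 5) -> List[Record]:
    """Discard records whose sequence is shorter than min_len."""
    return [r for r in records if len(r[1]) >= min_len]


def one_per_species(records: List[Record]) -> List[Record]:
    """Keep only the first record seen for each species."""
    seen, kept = set(), []
    for species, seq in records:
        if species not in seen:
            seen.add(species)
            kept.append((species, seq))
    return kept


def run_workflow(steps: List[Step], records: List[Record]):
    """Apply each step in order and log the record count after it."""
    provenance = []
    for name, func in steps:
        records = func(records)
        provenance.append((name, len(records)))
    return records, provenance


# The "saved" workflow: defined once, reusable for any input.
WORKFLOW = [
    ("drop short sequences", drop_short),
    ("keep one per species", one_per_species),
]

# Made-up hits for a first query, then a second query reusing the workflow.
hits_a = [("human", "MWTLGR"), ("human", "MWTLGRR"), ("mouse", "MWAFGG"), ("yeast", "MI")]
result_a, log_a = run_workflow(WORKFLOW, hits_a)
result_b, log_b = run_workflow(WORKFLOW, [("rat", "MKKLLA")])

print(result_a, log_a)
print(result_b, log_b)
```

Keeping the steps separate from the input is what makes the second run trivial: only the query changes, while the recorded steps and the provenance log stay the same shape.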

New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 

Session i lab bioinfo dm and app mmc

  • 1. Bioinformatics Data Manipulation: Molecular Online Tools & BioExtract Server Theme: FXN Gene and Pancreatic Cancer. Lab #1 Etienne Z. Gnimpieba BRIN WS 2013 Mount Marty College – June 24th 2013 Etienne.gnimpieba@usd.edu
  • 2. Context
    0. Specification & Aims
    Statement of problem / Case study: The FXN gene provides instructions for making a protein called frataxin. This protein is found in cells throughout the body, with the highest levels in the heart, spinal cord, liver, pancreas, and muscles. The protein is used for voluntary movement (skeletal muscles). Within cells, frataxin is found in energy-producing structures called mitochondria. Although its function is not fully understood, frataxin appears to help assemble clusters of iron and sulfur molecules that are critical for the function of many proteins, including those needed for energy production. Mutations in the FXN gene cause Friedreich ataxia, a genetic condition that affects the nervous system and causes movement problems. Most people with Friedreich ataxia begin to experience the signs and symptoms of the disorder around puberty.
    Bioinformatics Molecular Online Tools and Server
    Keywords:
      Bio: FXN, Frataxin, pancreatic cancer, CDKN4
      Math: HMM
      Informatics: programming, bioinformatics tools, getting and exporting data
    Reduced expression of frataxin is the cause of Friedreich's ataxia (FRDA), a lethal neurodegenerative disease. How about liver cancer?
    Aim: The purpose of this lab is to introduce online tools for exploring large-scale human biological data (metabolic, proteomic, genomic, …). We apply them to the FXN gene and pancreatic cancer as a case study, to show how a researcher can identify and cross-reference the biological knowledge available in data banks.
    Acquired skills
    Online and server tools:
      - Query biological DB (fasta, Html, txt, figure formats)
      - Sequence tools (protein and gene): alignment (showalign, clustalw2), similarity, …
      - Manage data result (select, keep, map, export)
      - Build and reuse workflow
    Biological Hypothesis (figure labels): FXN on chromosome 9; frataxin molecule structure (PyMOL); pancreatic cancer; pancreas anatomy
    Biological DB / Tools
    Resolution Process
    T1. Metabolomics
    Objective: Use metabolic data repositories to understand the frataxin protein mechanism.
      T1.1. Finding the Enzyme and Pathway related to Frataxin using KEGG
      T1.2. Finding the Reaction involved with Frataxin using Reactome
      T1.3. Using BRENDA for enzyme data on Frataxin
      T1.4. Using Collected data for Analysis
      T1.5. Redo the process with Pancreatic Cancer Results
    T2. Genome exploration
    Objective: Use Ensembl to locate FXN on the human genome and identify the genes implicated in pancreatic cancer.
      T2.1. Locate a given gene on human genome
      T2.2. Get a genomic sequence from NCBI
      T2.3. Get the protein data and sequence from EBI
      T2.4. Save the exported sequence data in the data folder
    T3. Sequences manipulation
    Objective: Find similar sequences using BLAST tools and align the given sequences.
      T3.1. Find similar sequences using BLAST tool
      T3.2. Align generated sequences with ClustalW tool
      T3.3. Visualize the result as a phylogenetic tree in Jalview
    T4. Protein Data and Structural Biology Knowledge
    Objective: Study frataxin at the protein level and its connection with pancreatic cancer (functional and structural data).
      T4.1. Structural Knowledge on Frataxin using SBKB
      T4.2. Using Uniprot for Frataxin Protein Study
      T4.3. Protein-Protein Interaction using STRING
      T4.4. Using same method for Pancreatic Cancer and compare
    T5. BioExtract server
    Objective: Use server tools to optimize the data manipulation process, applied on the BioExtract server.
      T5.1. Server Initialization
      T5.2. Pancreatic cancer & Frataxin (FXN)
      T5.3. Mapping, Alignment
      T5.4. Workflow save & reuse
  • 3. Data Manipulation: Molecular Online Tools and BioExtract Server
    T1. Metabolomics
    Objective: Use metabolic data repositories to understand the frataxin protein mechanism.
    Theme: Frataxin (FXN) implication in the pancreatic cancer genesis
    T1.1. Finding the Enzyme and Pathway related to Frataxin using KEGG
    On the KEGG Database website: http://www.genome.jp/kegg/
      o Search frataxin, and select the first result under KEGG Gene Database (hsa:2395).
      o Copy the E.C. number given in “Definition” (EC:1.16.3.1).
      o To find the related pathway, search the E.C. number in the general KEGG Database search (click on the KEGG logo on top).
      o Select the result given in the KEGG Enzyme Database at the bottom. Here you can see how this enzyme is involved in the metabolism given.
    T1.2. Finding the Reaction involved with Frataxin using Reactome
    On the Reactome website: http://www.reactome.org/ReactomeGWT/entrypoint.html
      o Search frataxin and select the 4th result with Frataxin in the title. This shows you the pathway model related to frataxin and how frataxin is involved in it.
    T1.3. Using BRENDA to find information on Frataxin
    On the BRENDA Database website: http://www.brenda-enzymes.org/
      o Search using the E.C. number obtained in T1.1 and select the result given. This website gives a wide range of information on the enzyme, including the reaction, related species, and so on. At the very bottom of the webpage you can select other databases that have information on the same compound or protein.
    T1.4. Using Collected Information to Analyze the Data
    On the BioModels website: http://www.ebi.ac.uk/biomodels-main/
      o Search using the E.C. number obtained in T1.1 and select the first result given. Here you can download the SBML file (in student folder) for this pathway (top left corner) and analyze it on the semanticSBML website: http://semanticsbml.org/semanticSBML/simple/index
      o Click on the first box “Find Similar Models”, click “Browse”, and select the file you just saved from BioModels. On this website you can use multiple tools to analyze the model and compare it with other models as well.
    T1.5. Same Process Searching for Pancreatic Cancer Results (Optional)
      o Use the same process, searching instead for pancreatic cancer results.
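The T1.1 step of copying the E.C. number out of the KEGG “Definition” field can also be scripted. The following is a minimal sketch, assuming the number appears in the text in the form `EC:1.16.3.1` as quoted above. The function name `extract_ec_numbers` and the sample definition string are illustrative and are not taken from actual KEGG output.

```python
import re
from typing import List

# E.C. numbers written as "EC:a.b.c.d"; trailing fields may be "-" when unassigned.
EC_PATTERN = re.compile(r"EC:\s*(\d+\.(?:\d+|-)\.(?:\d+|-)\.(?:\d+|-))")


def extract_ec_numbers(definition: str) -> List[str]:
    """Return every E.C. number found in a KEGG-style definition string."""
    return EC_PATTERN.findall(definition)


# Hypothetical definition text, modelled on the "(EC:1.16.3.1)" quoted in T1.1.
sample_definition = "frataxin (EC:1.16.3.1)"
print(extract_ec_numbers(sample_definition))
```

The extracted number can then be pasted into the KEGG, BRENDA, or BioModels search boxes used in T1.1 to T1.4.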
  • 4. T2. Genome Exploration
    Objective: Use Ensembl online tools to locate FXN on the human genome and identify the genes implicated in pancreatic cancer. Next, retrieve the corresponding sequence data in FASTA format.
    T2.1. Locate a given gene on human genome
    On the Ensembl website: http://uswest.ensembl.org/index.html
      o Select the species “human”.
      o Do a keyword search using the term “FXN”.
      o Follow the link of the “Gene” drop-down feature.
      o Click the link for “Location”.
      o Export this gene as a FASTA sequence in an HTML file by clicking “Export data” (left sidebar).
      o Click “Next”.
      o Click the “HTML” link.
      o Repeat the process, searching for “pancreatic cancer”. When you find the list of genes, select the CDKN2A gene.
    T2.2. Get a genomic sequence from NCBI (42 databases)
    On the NCBI website: http://www.ncbi.nlm.nih.gov/guide/
      o Pull down “All Databases” and select the “Gene” database, then do a keyword search using the term FXN.
      o Click the corresponding Homo sapiens FXN gene (first result).
      o Scroll down to the “NCBI Reference Sequences” title and go to the subtitle “mRNA and Proteins”.
      o Click on the accession number of the first transcript variant (NM_000144.4).
      o Get the same sequence in FASTA format by clicking on the “FASTA” link.
      o Click “Send” (top right, in blue), select complete record, file, FASTA, and “Create File”, then save the file in the student folder if possible (it is saved in the downloads folder automatically).
    T2.3. Get the protein information and sequence from EBI
    The common protein name for FXN is Frataxin.
    On the EBI website: http://www.ebi.ac.uk/
      o Type “FXN” in the search box and click on “find”.
      o Select the Homo sapiens Frataxin to get all the information about the protein (function, domains, structure, gene expression, ...).
      o Don’t close the window.
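FASTA files exported from NCBI or Ensembl can be read back with a few lines of code. This is a minimal sketch, assuming the standard FASTA layout (a header line starting with `>` followed by one or more sequence lines). The function `parse_fasta` is an illustrative name, and the toy record below uses made-up bases, not the real NM_000144.4 sequence.

```python
from typing import Dict


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {header: sequence}, joining wrapped sequence lines."""
    records: Dict[str, str] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


# Placeholder record: the header mimics the accession used in T2.2,
# but the bases are invented for illustration only.
toy_fasta = """>NM_000144.4 toy example
ATGTGGACTC
TCGGGCGCCG
"""

for name, seq in parse_fasta(toy_fasta).items():
    print(name, len(seq))
```

In practice the text would come from the file saved in the student folder, for example via `open(path).read()`.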
  • 5. T3. Sequences Manipulation
    Objective: Find similar sequences using BLAST tools and align the given sequences.
    T3.1. Find similar sequences using BLAST tool
      o Continuing from Task T2.3, select the “Protein” tab on the left and select “view sequence in Uniprot”.
      o You can get the FASTA format of the protein by clicking on “fasta” in the top right.
      o Go back to the previous page (using the browser’s back button) and check the box next to the first sequence under the “Sequences” title.
      o Select the “Blast” tool in the drop-down menu, then click on “Go”.
      o The best-matched sequences appear on the first page (green indicates a better match). To see other sequences, click on “next”. BLAST parameters can be modified by clicking on “Options” at the top.
    T3.2. Align generated sequences with ClustalW tool
      o Select about 10 different species, then click on “Align” at the bottom of the screen. The selected sequences are inserted directly into the ClustalW tool, which runs automatically.
      o From the right menu, you can highlight similarities, polar residues, aromatic residues, etc.
      o On the same page you can add further sequences to the alignment if needed. You can also access the phylogenetic tree. More details about the residues and the distances can be obtained by clicking on “Jalview” (top right, in orange). You may have to open Jalview manually.
      o In Jalview, click “file”, “add sequences”, “from file”, then select the sequence file you saved earlier.
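Once ClustalW has produced an alignment, a common first check is the pairwise identity between two aligned rows. The sketch below assumes two rows of equal length copied from an alignment, with `-` marking gaps. The rows shown are invented toy strings, not frataxin sequences, and `percent_identity` is an illustrative helper rather than part of ClustalW or Jalview.

```python
def percent_identity(row_a: str, row_b: str) -> float:
    """Percent of identical residues over columns where neither row has a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(row_a.upper(), row_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Invented aligned rows, for illustration only.
row_1 = "ACDEFGH-IK"
row_2 = "ACDQFGHLIK"
print(f"{percent_identity(row_1, row_2):.1f}% identity")
```

Note that this definition ignores gap columns. Other conventions divide by the full alignment length, so the convention should be stated when reporting a value.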
  • 6. T4. Protein Data and Structure Data
    Objective: Study frataxin at the protein level and its connection with pancreatic cancer (functional and structural data).
    T4.1. Structural Knowledge on Frataxin using SBKB
    On the Structural Biology Knowledgebase (SBKB): http://www.sbkb.org/
      o Select “by text” (options on left) and search “frataxin”.
      o For our example, select the link next to “Structures and annotations…”. Here you can obtain information on all the different hits, such as structure, by looking under the given tabs.
    T4.2. Using Uniprot for Frataxin Protein Study
    On the Uniprot Database: http://www.uniprot.org/
      o Search frataxin, select the first 3 results given, and click “Download” in the top right. You can then “Open” or “Download” any of the results given.
    T4.3. Protein-Protein Interaction using STRING
    On the STRING Database: http://string-db.org/
      o Under “search by name”, search “FXN”.
      o Select the first result given and click “Continue”. Here you can look at the protein-protein interaction model, obtain more information on a given protein or interaction by clicking on it in the model, and use many other useful tools.
    T4.4. Using same method for Pancreatic Cancer and compare
      o Go back to the STRING Database home page and, under “multiple names”, search “frataxin” and “pancreatic cancer”. Select the first result.
      o Select all three results given and click “Continue”. The page shows the 3 proteins we have selected; however, no interactions between them are shown in this database.
      o The result can be widened by changing the search to cancer in general.
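The T4.4 observation (several selected proteins with no interactions shown between them) amounts to a lookup in an edge list. The following is a minimal sketch, assuming the interactions have been exported as pairs of protein names. The protein names and edges below are hypothetical placeholders rather than STRING results, and `interactions_among` is an illustrative helper, not a STRING function.

```python
from itertools import combinations
from typing import Iterable, List, Tuple


def interactions_among(
    selected: Iterable[str], edges: Iterable[Tuple[str, str]]
) -> List[Tuple[str, str]]:
    """Return the pairs of selected proteins that are directly linked in `edges`."""
    # Store edges as unordered pairs so (A, B) and (B, A) are equivalent.
    edge_set = {frozenset(pair) for pair in edges}
    return [
        (a, b)
        for a, b in combinations(sorted(set(selected)), 2)
        if frozenset((a, b)) in edge_set
    ]


# Hypothetical edge list: names are placeholders, not database output.
toy_edges = [("PROT_A", "PROT_B"), ("PROT_B", "PROT_C")]

print(interactions_among(["PROT_A", "PROT_C"], toy_edges))
print(interactions_among(["PROT_A", "PROT_B", "PROT_C"], toy_edges))
```

An empty result for the selected set corresponds to the situation described in T4.4, where widening the search is the suggested next step.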
  • 7. T5. BioExtract Server
    Objective: Use Workflow Management Systems (WMS) to optimize data manipulation processes (BioExtract Server).
    http://bioextract.org
    T5.1. Server Initialization
      o Register on the BioExtract Server to be able to create and save your own workflows.
      o Click on the “workflows” tab, then click “create and import workflows.” Now click “record workflow”, then “close.”
      o To obtain the workflow at the end of the lab: from the “workflows” tab click on “create and import workflows”, then click on “save records”.
    T5.2. Pancreatic cancer & Frataxin (FXN) data
      o Select the query tab. Then select the protein sequences and check the box next to the NCBI protein database. Select “gene” as the search field and type “FXN”. Click on “Add Search Line”, select “Species”, and type “Human”. Submit the query.
      o Results appear on the “extract page”. You can get the GenBank view of each sequence by clicking on “View record”. We need only the Homo sapiens Frataxin: click “select records”, check the corresponding box, then click on “keep only selected records”. The results can be saved or extracted in FASTA or txt format (export the records in FASTA format).
    T5.3. Mapping, Alignment
      o Click the “tools” tab, then click on “Alignment Tools” and “showalign”. Select “Use records on extract page formatted in Fasta”.
      o Click on “execute” to run the tool. When execution is complete, results can be retrieved by selecting the desired format and clicking on “view results”.
      o (Optional) Repeat the search process with “pancreatic cancer”. Make sure you change the first search field to “all text”.
      o (If the previous step was skipped, skip this step as well.) Go to the query tab again and search “FXN”. Search and select a few listings. Export them as done in T5.2. Go to the tools tab.
      o Select similarity search tools, then select “blastp”. Select “use records on extract page formatted as Fasta”. Under “choose search set”, select the database “swissprot”.
      o When execution is complete, go to the extract page and select 10 different sequences belonging to 10 different species, including human, then click “keep only selected records.” Export the records again.
      o Go to the tools tab again, select “iPlant”, then “clustal w2”. Select “use records on extract page formatted as Fasta”. Your 10 protein sequences are automatically used as input to the clustalw2 tool. Execute the tool. Use the pull-down for “Search Results” and select “clustalw2.fa” before viewing the results.
    T5.4. Workflow save & reuse
      o Go back to the “workflow” tab and click “create and import workflows”. Write a name and a description for your workflow, then click on “Save”. All the previous steps are saved in this workflow.
      o Once the workflow is saved, you will find it at the bottom of the workflow list. Click on the name of the workflow to see a schematic view of it. Run the workflow by clicking on “start”.
      o Get and verify all the results by clicking on “provenance”. The general report can be saved for later analysis. Results of each tool can be viewed or saved by clicking on “view file”.
      o The same workflow can be executed for another query by simply modifying the accession number of the protein. (Click save in the “create and import workflows” section to temporarily save the new query.)
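The idea behind recording a workflow and re-running it on a different query can be illustrated in plain Python. This sketch does not use or imitate the BioExtract Server API. The `Workflow` class, its `record` and `run` methods, the two step functions, and the toy records are all hypothetical names and data chosen for illustration.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]
Step = Callable[[List[Record]], List[Record]]


class Workflow:
    """Ordered list of named steps that can be replayed on new input."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.steps: List[Tuple[str, Step]] = []

    def record(self, label: str, step: Step) -> "Workflow":
        """Append a step and return self so calls can be chained."""
        self.steps.append((label, step))
        return self

    def run(self, records: List[Record]) -> List[Record]:
        """Apply every recorded step, in order, to the given records."""
        for _label, step in self.steps:
            records = step(records)
        return records


def keep_species(species: str) -> Step:
    """Build a step that keeps only records from one species."""
    return lambda records: [r for r in records if r["species"] == species]


def to_fasta(records: List[Record]) -> List[Record]:
    """Add a FASTA-formatted string to each record."""
    return [{**r, "fasta": f">{r['id']}\n{r['seq']}"} for r in records]


# Toy records with made-up identifiers and sequences.
toy_records = [
    {"id": "rec1", "species": "human", "seq": "ACDE"},
    {"id": "rec2", "species": "mouse", "seq": "ACDQ"},
]

wf = Workflow("toy").record("filter", keep_species("human")).record("export", to_fasta)
for rec in wf.run(toy_records):
    print(rec["fasta"])
```

Calling `wf.run` again with a different record list replays the same steps, which mirrors re-executing a saved workflow after changing the query.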

Editor's notes

  1. Welcome to this bioinformatics lab on data manipulation using online and server tools. As the theme, we have chosen to study the interaction between frataxin and pancreatic cancer.
  2. This is the lab template: the context is a biological context based on a real biological problem and a given hypothesis. I don’t use computer science, strong word. When you read this template, you have a different view than a computer scientist. You want to understand the process used to build the tools: the architecture of the system, the algorithm implementation, the quality of the resulting data, and so on.