1. Using BOLD Data in Bioinformatics Workflows. Dr. Justin Schonfeld, Biodiversity Institute of Ontario
2.
3.
4. High-level data flow: Museums, Private collections, Regulatory Agencies, Researchers, CCDB, BOLD, GenBank, Mirrors, Educators, Researchers, Regulatory Agencies, Australian Museum
5. Typical Informatics Workflow: BOLD, Extract Data, Local Copy, Filter Data, Filtered Data, Align Data, Aligned Data, Identify Problematic Sequences, Cleaned Data, Analyze Data
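The "Filter Data" step of this workflow can be sketched as follows. This is a minimal illustration, not the presenter's pipeline: it assumes the local copy is a tab-separated export with `processid` and `nucleotides` columns, and the length and ambiguity thresholds are hypothetical defaults chosen only for the example.

```python
import csv
import io


def filter_records(tsv_text, min_length=500, max_n_fraction=0.01):
    """Keep records whose sequence is long enough and has few ambiguous bases.

    Assumptions (not from the slides): the input is TSV text with a
    'nucleotides' column; both thresholds are illustrative.
    """
    kept = []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        # Strip alignment gaps so that only real bases are counted.
        seq = (row.get("nucleotides") or "").replace("-", "").upper()
        if not seq or len(seq) < min_length:
            continue
        if seq.count("N") / len(seq) > max_n_fraction:
            continue
        kept.append(row)
    return kept
```

Records that pass the filter keep all of their original columns, so later steps (alignment, cleaning) can still reach the identifiers and taxonomy fields.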
6.
7.
8.
9.
10.
11. Impact of Alignment. The alignment feeds into: Build Phylogenetic Trees, Nearest Neighbor Analysis, Clustering, Distance Matrices
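To illustrate why the alignment matters for downstream distance matrices, here is a small sketch that computes uncorrected p-distances from aligned sequences. The choice of p-distance and the rule of skipping gap and `N` columns are assumptions made for the example; the slides do not specify a distance measure.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') or an 'N' are skipped
    (an assumption made for this sketch).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        if x != y:
            diffs += 1
    return diffs / compared if compared else float("nan")


def distance_matrix(aligned):
    """Square matrix of pairwise p-distances for a list of aligned sequences."""
    n = len(aligned)
    return [[p_distance(aligned[i], aligned[j]) for j in range(n)]
            for i in range(n)]


# The same two sequences under two different alignments:
good = p_distance("ACGTACGT", "ACG-ACGT")  # gap placed at the deleted base
poor = p_distance("ACGTACGT", "ACGACGT-")  # gap pushed to the end
```

With the gap placed at the deleted base the two sequences are identical over every compared site, while pushing the gap to the end makes four of the seven compared sites disagree. Trees, nearest-neighbor searches, and clustering built on such distances inherit the same error.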
13. Aligning Animal Barcode Data. Even a gene as straightforward as CO1 can provide alignment challenges. (Figure: Barcode, Short CO1, and 3’ CO1 regions positioned along the full CO1 sequence, 5’ to 3’.)
14.
15.
16.
17.
18.
19.
20. Example Workflow: Occurrence of Indels. Download public BOLD Hymenoptera records using web services; select sequences with full taxonomy; align sequences using MAFFT, MUSCLE, TransAlign; select one representative per species; remove problematic sequences; build tree; map sequences onto phylogeny.
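The two selection steps in this workflow (full taxonomy, one representative per species) might look like the sketch below. The field names (`order_name`, `family_name`, `genus_name`, `species_name`, `nucleotides`) are assumed to match the columns of a downloaded record table, "full taxonomy" is taken to mean that all four ranks are filled in, and picking the longest sequence as the representative is a hypothetical rule; the slides do not say how the representative was chosen.

```python
# Ranks that must be present for a record to count as fully identified
# (an assumption for this sketch).
RANKS = ("order_name", "family_name", "genus_name", "species_name")


def has_full_taxonomy(rec):
    """True if every rank in RANKS has a non-empty value."""
    return all((rec.get(rank) or "").strip() for rank in RANKS)


def one_per_species(records):
    """Return one record per species, preferring the longest sequence."""
    best = {}
    for rec in records:
        if not has_full_taxonomy(rec):
            continue
        species = rec["species_name"].strip()
        seq_len = len((rec.get("nucleotides") or "").replace("-", ""))
        # Keep the first record seen for a species unless a longer one appears.
        if species not in best or seq_len > best[species][0]:
            best[species] = (seq_len, rec)
    return [rec for _, rec in best.values()]
```

Reducing to one sequence per species before tree building keeps the phylogeny small enough to inspect, at the cost of hiding within-species variation.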
21. Example Workflow: Code shifts. Download public BOLD Hymenoptera records using web services (80,000 sequences); align pairwise; scan sequences for code shifts; remove problematic sequences; analyze results.
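One way to sketch the "scan sequences for code shifts" step is shown below. It assumes that "code shift" means a reading-frame shift, detected as an internal gap run whose length is not a multiple of three in a pairwise alignment against a reference. That interpretation, the function name, and the rule of ignoring leading and trailing gaps are assumptions for illustration, not the presenter's method.

```python
import re


def frameshift_gaps(aligned_query, aligned_ref):
    """List internal gap runs whose length is not a multiple of 3.

    Returns sorted (start, length, kind) tuples, where kind is 'deletion'
    (gap in the query) or 'insertion' (gap in the reference).
    """
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("sequences must be aligned to the same length")
    hits = []
    for kind, seq in (("deletion", aligned_query), ("insertion", aligned_ref)):
        for match in re.finditer(r"-+", seq):
            # Leading and trailing gaps usually reflect differing sequence
            # lengths rather than indels, so they are ignored here.
            if match.start() == 0 or match.end() == len(seq):
                continue
            if len(match.group()) % 3 != 0:
                hits.append((match.start(), len(match.group()), kind))
    return sorted(hits)
```

A sequence with any hit would be flagged as problematic and removed before analysis; a gap of three bases (a whole codon) is left alone because it does not disturb the reading frame.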