A practical introduction
to handling 16S data
Mads Albertsen
Internal workshop 2013
CENTER FOR MICROBIAL COMMUNITIES
Agenda
• Introduction
• Case story: GAO Reactors
• Generating OTU tables (Hands on)
• Analyzing data in Excel (Hands on)
• Analyzing data in R (Hands on)
CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
Who - when, where and why?
Accumulibacter, Competibacter, Bacillus anthracis
Image credits: http://phil.cdc.gov/phil/details.asp?pid=2226 ; http://en.wikipedia.org/wiki/File:EBPR_FISH_Floc.jpg ; P. Larsen 2012
Taking advantage of evolution
The affinities of all the beings of the same class have
sometimes been represented by a great tree... The
green and budding twigs may represent existing
species; and those produced during former years
may represent the long succession of extinct species.
C. Darwin, 1872
Nothing in biology makes sense,
except in the light of evolution.
T. Dobzhansky, 1973
http://tolweb.org
Why do we use the 16S gene?
Ribosomes are universal
rRNA = Structural RNA
http://www.rna.icmb.utexas.edu/SAE/2B/ConsStruc/Diagrams/cons.16.b.Bacteria.pdf
8F universal primer
Ashelford et al. AEM. 2005;71:7724-7736
• Advantages:
• Universal gene (No horizontal gene transfer)
• Conserved regions
• Variable regions
• Great databases and alignments
• Problems:
• Variable copy number
• No universal (unbiased) primers
• (Not directly correlated with activity)
• (Lack of functional information)
Typical workflow
Sampling → Extraction → Sample prep → Sequencing → Bioinformatics
The focus of the workshop is bioinformatics.
However, the preceding steps influence how we
handle the data.
• Standardisation, standardization, standardizasion..!
• Use biological replicates and evaluate your variation…!
• Design a good experiment with realistic expectations of
the outcome (Most studies fail here!!!)
Typical workflow
AAU activated sludge standard @ midasfieldguide.org
[Figure: DNA extraction parameters tested: input (mg), bead beating intensity (m s-1) and duration (s)]
eDNA removal: PMA, 650 W, 10 min
Storage:
• Fresh
• 24 h @ 4 °C
• 24 h @ 20 °C
Typical workflow
AAU activated sludge standard @ midasfieldguide.org
[Figure: mean frequency of the most common residue in a 50 bp window along the 16S gene (x-axis: position in bp, 0 to 1500), with variable regions V1 to V9 and the amplicon targets V1.3, V3.4 and V4 marked]
Ashelford et al. AEM. 2005;71:7724-7736
Typical workflow
PCR with modified 16S primers
Forward: 5’-AATGATACGGCGACCACCGAGATCTACAC GTACGTACG GT AGAGTTTGATCCTGGCTCAG-3’
         Illumina adapter | Pad | linker | 27F
Reverse: 5’-CAAGCAGAAGACGGCATACGAGAT TCCCTTGTCTCC ACGTACGTAC CCG ATTACCGCGGCTGCTGG-3’
         Illumina adapter | Barcode | Pad | linker | 534R
[Figure: target region amplified over PCR cycles 1 to 3]
AAU activated sludge standard @ midasfieldguide.org
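A minimal sketch of how the two fusion primers on this slide are put together from their parts. The part sequences are copied from the slide; the function name `assemble_primer` and the second barcode are hypothetical and only serve the illustration.

```python
# Parts of the two fusion primers, copied from the slide (5' to 3').
FORWARD_PARTS = {
    "illumina_adapter": "AATGATACGGCGACCACCGAGATCTACAC",
    "pad": "GTACGTACG",
    "linker": "GT",
    "primer_27F": "AGAGTTTGATCCTGGCTCAG",
}
REVERSE_PARTS = {
    "illumina_adapter": "CAAGCAGAAGACGGCATACGAGAT",
    "barcode": "TCCCTTGTCTCC",
    "pad": "ACGTACGTAC",
    "linker": "CCG",
    "primer_534R": "ATTACCGCGGCTGCTGG",
}


def assemble_primer(parts, barcode=None):
    """Join the parts in order; optionally swap in another barcode."""
    parts = dict(parts)  # copy so the template is not modified
    if barcode is not None:
        if "barcode" not in parts:
            raise ValueError("this primer has no barcode slot")
        parts["barcode"] = barcode
    return "".join(parts.values())


forward = assemble_primer(FORWARD_PARTS)
reverse = assemble_primer(REVERSE_PARTS)
# Hypothetical second barcode: same primer design, different sample.
reverse_other = assemble_primer(REVERSE_PARTS, barcode="ACGTACGTACGT")
print(len(forward), len(reverse), len(reverse_other))
```

Only the barcode changes between samples, so the same forward primer can be combined with one reverse primer per sample.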
Typical workflow
Mardis, 2008 (PMID 18576944)
≈ 500 bp target amplicon
Typical workflow
After sequencing:
Read 1: 300 bp
Read 2: 300 bp
Barcode
≈ 500 bp target amplicon
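The read lengths and amplicon size on this slide determine how much the two reads overlap, which is what makes merging possible. A small sketch of the arithmetic, assuming both reads are sequenced to full length and start at opposite ends of the amplicon (the function name is hypothetical):

```python
def expected_overlap(read1_len, read2_len, amplicon_len):
    """Bases covered by both reads of a pair; a negative value means a gap."""
    return read1_len + read2_len - amplicon_len


# Values from the slide: 2 x 300 bp reads, about 500 bp amplicon.
overlap = expected_overlap(300, 300, 500)
print(f"expected overlap: about {overlap} bp")
```

With shorter reads on the same amplicon the value turns negative, meaning the pair cannot be merged.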
Typical workflow
How many sequences are needed? It depends on your question!
(although 50,000 raw sequences per sample are usually fine)
AAU raw kit and chemical costs (DKK)
Item                                    Cost      Cost v2
DNA extraction                          105       70 (a)
Library preparation                     40        40
Sequencing (min 100k reads / sample)    190 (b)   70 (c)
Total                                   335       180
(a) Kits discounted
(b) 50 samples per run
(c) 150 samples per run (can run up to 300)
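The totals in the cost table can be re-derived from the individual items. A short sketch using the numbers from the slide; the dictionary layout and variable names are assumptions made for the example, and the reads-per-run figure simply multiplies the stated minimum of 100k reads per sample by the samples per run.

```python
# Costs in DKK, copied from the slide.
costs_dkk = {
    "Cost": {"DNA extraction": 105, "Library preparation": 40, "Sequencing": 190},
    "Cost v2": {"DNA extraction": 70, "Library preparation": 40, "Sequencing": 70},
}
samples_per_run = {"Cost": 50, "Cost v2": 150}
min_reads_per_sample = 100_000

totals = {name: sum(items.values()) for name, items in costs_dkk.items()}
saving = totals["Cost"] - totals["Cost v2"]
min_reads_per_run = {
    name: n * min_reads_per_sample for name, n in samples_per_run.items()
}
print(totals, saving, min_reads_per_run)
```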
Typical workflow
Merge → Cluster → Assign taxonomy (compare to database)
OTU table
Count  OTU
3      Accumulibacter
11     Unknown
3      Competibacter
1      Bacillus anthracis
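A toy sketch of the last two steps on this slide: counting reads per cluster and attaching a name from a database lookup. The cluster labels (`otu1` to `otu4`) and the lookup dictionary are hypothetical stand-ins; a real run would take them from the clustering and classification tools.

```python
from collections import Counter

# Hypothetical cluster label for each merged read (18 reads in total,
# matching the counts on the slide).
read_to_otu = ["otu1"] * 3 + ["otu2"] * 11 + ["otu3"] * 3 + ["otu4"] * 1

# Hypothetical database hits; otu2 has no match in the database.
taxonomy = {
    "otu1": "Accumulibacter",
    "otu3": "Competibacter",
    "otu4": "Bacillus anthracis",
}

counts = Counter(read_to_otu)
# OTUs without a database hit are reported as "Unknown".
otu_table = [(n, taxonomy.get(otu, "Unknown")) for otu, n in counts.items()]
for n, name in otu_table:
    print(f"{n}\t{name}")
```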
Typical workflow
Merge → Cluster → Assign taxonomy (compare to database)
[Figure: 18 reads labelled by barcode, 9 from sample A and 9 from sample B]
OTU table
A  B  OTU
2  1  Accumulibacter
3  8  Unknown
3  0  Competibacter
1  0  Bacillus anthracis
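A small sketch of how the barcode turns a single count column into one column per sample. The list of (barcode, OTU) pairs is made up so that the counts match the table on the slide; the variable names are hypothetical.

```python
from collections import defaultdict

# Hypothetical (sample barcode, OTU) pair for each merged read, chosen so
# that the counts match the table on the slide.
reads = (
    [("A", "Accumulibacter")] * 2 + [("B", "Accumulibacter")] * 1
    + [("A", "Unknown")] * 3 + [("B", "Unknown")] * 8
    + [("A", "Competibacter")] * 3
    + [("A", "Bacillus anthracis")] * 1
)

samples = ["A", "B"]
# One row per OTU, one zero-initialised count per sample.
table = defaultdict(lambda: dict.fromkeys(samples, 0))
for sample, otu in reads:
    table[otu][sample] += 1

print("\t".join(samples + ["OTU"]))
for otu, row in table.items():
    print("\t".join(str(row[s]) for s in samples) + "\t" + otu)
```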
Typical workflow
Sequence errors, chimeras and weird stuff..
The chance of a perfect read as a
function of the read length
Chimeras
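The relation between read length and the chance of an error-free read can be sketched with a simple model. This assumes every base fails independently with the same probability, which is a simplification, and the error rates used are illustrative values, not measurements from the workshop data.

```python
def perfect_read_probability(read_length, error_rate):
    """Chance that a read has no errors, assuming every base fails
    independently with the same probability."""
    return (1.0 - error_rate) ** read_length


# The error rates here are illustrative assumptions, not measured values.
for error_rate in (0.001, 0.01):
    for length in (100, 300, 500):
        p = perfect_read_probability(length, error_rate)
        print(f"error rate {error_rate}, {length} bp: {p:.3f}")
```

Under this model the chance of a perfect read falls exponentially with read length, so long amplicons contain many reads with at least one error.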
Typical workflow
Merge → Cluster → Assign taxonomy (compare to database)
OTU table
Count  OTU
3      Accumulibacter
11     Unknown
3      Competibacter
Removing unique sequences makes the
subsequent steps 10-100x faster and removes
the majority of errors and chimeras.
This depends on sequencing depth and
sample complexity! Be careful!
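A minimal sketch of removing unique sequences, reading "unique" as sequences that occur exactly once in the data set. The function name and the `min_count` parameter are hypothetical, and the reads are toy data.

```python
from collections import Counter


def remove_unique_sequences(sequences, min_count=2):
    """Dereplicate, then keep only sequences seen at least `min_count`
    times. Returns {sequence: count}."""
    counts = Counter(sequences)
    return {seq: n for seq, n in counts.items() if n >= min_count}


# Toy reads: one sequence seen three times plus one read that differs
# in its last base (for example a sequencing error).
reads = ["ACGTACGT", "ACGTACGT", "ACGTACGT", "ACGTACGA"]
kept = remove_unique_sequences(reads)
print(kept)
```

The same filter also drops genuinely rare organisms that happen to be seen only once, which is why the outcome depends on sequencing depth and sample complexity.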
AAU workflow
16SAMP-145
16SAMP-146
16SAMP-147
16SAMP-148
16SAMP-149
16SAMP-150
16S.V13.workflow.sh
Find sample IDs on Google drive
OTU table (+ R version)
Plain text file
A  B  OTU
2  1  Accumulibacter
3  8  Unknown
3  0  Competibacter
AAU workflow
What 16S.V13.workflow.sh does:
1. Find and unpack your samples
2. Optional subsampling
3. Remove potential phiX contamination (bowtie2)
4. Merge read 1 and read 2 (flash)
5. Remove reads outside length criteria
6. Optional removal of unique reads and subsampling to even depth
7. Format reads for QIIME
8. Cluster reads to OTUs (Uclust, QIIME)
9. Assign taxonomy (RDP classifier, QIIME + database: MiDAS, Greengenes or Silva)
10. Generate OTU table (QIIME)
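Steps 5 and 6 in the list above can be illustrated in a few lines. This is a plain-Python sketch of the idea, not the code inside 16S.V13.workflow.sh; the function names, the length thresholds, the depth and the toy reads are all assumptions made for the example.

```python
import random


def filter_by_length(reads, min_len, max_len):
    """Step 5: keep merged reads whose length is within [min_len, max_len]."""
    return [r for r in reads if min_len <= len(r) <= max_len]


def subsample_even(samples, depth, seed=0):
    """Step 6: draw `depth` reads without replacement from every sample.
    Samples with fewer than `depth` reads are dropped."""
    rng = random.Random(seed)
    return {
        name: rng.sample(reads, depth)
        for name, reads in samples.items()
        if len(reads) >= depth
    }


# Toy data with hypothetical read lengths and thresholds.
samples = {
    "16SAMP-145": ["A" * 480, "A" * 500, "A" * 120, "A" * 510],
    "16SAMP-146": ["A" * 490, "A" * 700, "A" * 505],
}
filtered = {
    name: filter_by_length(reads, 400, 550) for name, reads in samples.items()
}
even = subsample_even(filtered, depth=2)
print({name: len(reads) for name, reads in even.items()})
```

Fixing the random seed keeps the subsample reproducible between runs.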

Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 

[13.09.19] 16S workshop introduction

  • 1. A practical introduction to handling 16S data. Mads Albertsen. Internal workshop 2013. CENTER FOR MICROBIAL COMMUNITIES
  • 2. Agenda. CENTER FOR MICROBIAL COMMUNITIES | AALBORG UNIVERSITY
      • Introduction
      • Case story: GAO Reactors
      • Generating OTU tables (Hands on)
      • Analyzing data in Excel (Hands on)
      • Analyzing data in R (Hands on)
  • 3. Who - when, where and why?
  • 4. Who - when, where and why? Images: Accumulibacter, Competibacter, Bacillus anthracis. Sources: http://phil.cdc.gov/phil/details.asp?pid=2226; http://en.wikipedia.org/wiki/File:EBPR_FISH_Floc.jpg; P. Larsen 2012.
  • 5. Taking advantage of evolution. "The affinities of all the beings of the same class have sometimes been represented by a great tree... The green and budding twigs may represent existing species; and those produced during former years may represent the long succession of extinct species." C. Darwin, 1872. "Nothing in biology makes sense, except in the light of evolution." T. Dobzhansky, 1973. Image: http://tolweb.org
  • 6. Why do we use the 16S gene? Ribosomes are universal. rRNA = structural RNA. Figure: http://www.rna.icmb.utexas.edu/SAE/2B/ConsStruc/Diagrams/cons.16.b.Bacteria.pdf
  • 7. Why do we use the 16S gene? Universal primer: 8F. Figure: http://www.rna.icmb.utexas.edu/SAE/2B/ConsStruc/Diagrams/cons.16.b.Bacteria.pdf
  • 8. Why do we use the 16S gene? Sources: http://www.rna.icmb.utexas.edu/SAE/2B/ConsStruc/Diagrams/cons.16.b.Bacteria.pdf; Ashelford et al. AEM. 2005;71:7724-7736
      • Advantages:
          • Universal gene (no horizontal gene transfer)
          • Conserved regions
          • Variable regions
          • Great databases and alignments
      • Problems:
          • Variable copy number
          • No universal (unbiased) primers
          • (Not directly correlated with activity)
          • (Lack of functional information)
  • 9. Typical workflow: Sampling → Extraction → Sample prep → Sequencing → Bioinformatics. The focus of the workshop is bioinformatics. However, the preceding steps influence how we handle the data.
  • 10. Typical workflow. AAU activated sludge standard @ midasfieldguide.org
      • Standardisation, standardization, standardizasion..!
      • Use biological replicates and evaluate your variation!
      • Design a good experiment with realistic expectations of the outcome (Most studies fail here!!!)
  • 11. Typical workflow. AAU activated sludge standard @ midasfieldguide.org
      • Bead beating: intensity (m s⁻¹) 4, 6; duration (s) 400, 160, 80, 40, 20
      • Input (mg): 1, 2, 4, 9, 22
      • Storage: fresh; 24 h @ 4 °C; 24 h @ 20 °C
      • eDNA removal: PMA, 650 W, 10 min
  • 12. Typical workflow. Figure: mean frequency of the most common residue in a 50 bp window (y-axis, 0.6 to 1.0) against position in bp (x-axis, 0 to 1500), with variable regions V1 to V9 and the regions V1.3, V4 and V3.4 marked. Ashelford et al. AEM. 2005;71:7724-7736. AAU activated sludge standard @ midasfieldguide.org
  • 13. Typical workflow: PCR with modified 16S primers. AAU activated sludge standard @ midasfieldguide.org
      Forward primer (Illumina adapter, pad, linker, 27F):
      5’-AATGATACGGCGACCACCGAGATCTACAC GTACGTACG GT AGAGTTTGATCCTGGCTCAG-3’
      Reverse primer (Illumina adapter, barcode, pad, linker, 534R):
      5’-CAAGCAGAAGACGGCATACGAGAT TCCCTTGTCTCC ACGTACGTAC CCG ATTACCGCGGCTGCTGG-3’
      Figure: target region between the primers, shown for PCR cycles 1, 2 and 3.
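The primer layout on slide 13 can be sketched as a simple concatenation of parts. This is only an illustration: the sequences are copied from the slide and split where the slide has spaces, while the dictionary keys and the helper names `assemble` and `with_barcode` are hypothetical.

```python
# Sketch: assemble the fusion primers from slide 13 out of their parts.
# Sequences are copied from the slide; all names here are illustrative.

FORWARD_PARTS = {
    "illumina_adapter": "AATGATACGGCGACCACCGAGATCTACAC",
    "pad": "GTACGTACG",
    "linker": "GT",
    "primer_27F": "AGAGTTTGATCCTGGCTCAG",
}

REVERSE_PARTS = {
    "illumina_adapter": "CAAGCAGAAGACGGCATACGAGAT",
    "barcode": "TCCCTTGTCTCC",  # the example barcode shown on the slide
    "pad": "ACGTACGTAC",
    "linker": "CCG",
    "primer_534R": "ATTACCGCGGCTGCTGG",
}


def assemble(parts):
    """Join primer parts 5' to 3' in dictionary (insertion) order."""
    return "".join(parts.values())


def with_barcode(parts, barcode):
    """Return a copy of the parts with a different sample barcode."""
    swapped = dict(parts)
    swapped["barcode"] = barcode
    return swapped


forward = assemble(FORWARD_PARTS)
reverse = assemble(REVERSE_PARTS)
print(len(forward), forward)
print(len(reverse), reverse)
```

Only the barcode differs between samples in this sketch, which is what later lets each read be traced back to its sample.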
  • 14. Typical workflow: ≈ 500 bp target amplicon. Figure: Mardis, 2008 (PMID 18576944)
  • 15. Typical workflow, after sequencing: Read 1: 300 bp; Read 2: 300 bp; Barcode; ≈ 500 bp target amplicon.
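With two 300 bp reads covering a ≈ 500 bp amplicon, the reads should overlap by roughly 100 bp, and that overlap is what makes merging possible. The sketch below shows the arithmetic and a deliberately naive exact-match merge. It is not how FLASH works internally; the function names and the `min_overlap` parameter are assumptions made for illustration.

```python
# Sketch: expected read overlap and a naive exact-match merge.
# This is NOT FLASH's algorithm; it only illustrates the idea.

def expected_overlap(read1_len, read2_len, amplicon_len):
    """Bases covered by both reads (a negative value means a gap)."""
    return read1_len + read2_len - amplicon_len


def reverse_complement(seq):
    """Reverse-complement a DNA sequence (A, C, G, T, N only)."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def naive_merge(read1, read2, min_overlap=10):
    """Merge read 1 with the reverse complement of read 2.

    Tries the longest overlap first and requires an exact match.
    Returns None when no overlap of at least min_overlap is found.
    """
    read2_rc = reverse_complement(read2)
    longest = min(len(read1), len(read2_rc))
    for size in range(longest, min_overlap - 1, -1):
        if read1[-size:] == read2_rc[:size]:
            return read1 + read2_rc[size:]
    return None


print(expected_overlap(300, 300, 500))  # 100
```

Real merging tools tolerate mismatches in the overlap and use base qualities, so this exact-match version would discard many reads that a real tool keeps.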
  • 16. Typical workflow: how many sequences are needed? It depends on your question! (Although 50,000 raw sequences per sample is usually fine.)
      AAU raw kit and chemical costs (DKK)
                                                 Cost      Cost v2
      DNA extraction                             105       70 (a)
      Library preparation                        40        40
      Sequencing (min 100k reads / sample)       190 (b)   70 (c)
      Total                                      335       180
      (a) Kits discounted
      (b) 50 samples per run
      (c) 150 samples per run (can run up to 300)
  • 17. Typical workflow: Merge → Cluster → Assign taxonomy (compare to database) → OTU table
      OTU                  Count
      Accumulibacter       3
      Unknown              11
      Competibacter        3
      Bacillus anthracis   1
  • 18. Typical workflow: Merge → Cluster → Assign taxonomy (compare to database) → OTU table. The barcode assigns each read to sample A or B (9 reads each).
      OTU                  A    B
      Accumulibacter       2    1
      Unknown              3    8
      Competibacter        3    0
      Bacillus anthracis   1    0
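How barcodes and clusters turn into the OTU table on slide 18 can be sketched with a counter over (sample, OTU) pairs. The toy reads below are generated from the slide's own counts; the function name `otu_table` and the data layout are assumptions for illustration, not the QIIME implementation.

```python
# Sketch: build a sample-by-OTU count table from (sample, OTU) pairs.
from collections import Counter

# Counts taken from the slide; used here only to generate toy reads.
slide_counts = {
    "Accumulibacter": {"A": 2, "B": 1},
    "Unknown": {"A": 3, "B": 8},
    "Competibacter": {"A": 3, "B": 0},
    "Bacillus anthracis": {"A": 1, "B": 0},
}

# One (sample, OTU) pair per read, as barcode sorting plus clustering
# would give us.
reads = [
    (sample, otu)
    for otu, per_sample in slide_counts.items()
    for sample, n in per_sample.items()
    for _ in range(n)
]


def otu_table(assignments):
    """Count reads per OTU and sample; missing combinations become 0."""
    counts = Counter(assignments)
    samples = sorted({sample for sample, _ in assignments})
    otus = sorted({otu for _, otu in assignments})
    return {o: {s: counts[(s, o)] for s in samples} for o in otus}


table = otu_table(reads)
for otu, row in table.items():
    print(f"{otu:<20}" + "".join(f"{n:>4}" for n in row.values()))
```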
  • 19. Typical workflow: sequence errors, chimeras and weird stuff. Figures: the chance of a perfect read as a function of the read length; chimeras.
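The relationship between read length and the chance of a perfect read on slide 19 can be approximated with a simple model. The sketch assumes independent errors with the same per-base error rate at every position, which is a simplification (real error rates typically rise towards the end of a read); the error rate of 0.01 is an illustrative value, not a measurement.

```python
# Sketch: chance of an error-free read under a simple error model.
# Assumes independent errors and one fixed per-base error rate.

def p_perfect_read(read_length, per_base_error):
    """Probability that every base in the read is correct."""
    return (1.0 - per_base_error) ** read_length


for length in (50, 100, 150, 250, 300):
    # 0.01 is an illustrative error rate, not a measured value
    print(length, round(p_perfect_read(length, 0.01), 3))
```

Under this model the probability drops geometrically with length, which is why long reads rarely come through without at least one error.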
  • 20. Typical workflow: Merge → Cluster → Assign taxonomy (compare to database) → OTU table. Removing unique sequences (sequences observed only once) makes the subsequent steps 10-100x faster and removes the majority of errors and chimeras. Dependent on sequencing depth and sample complexity! Be careful!
      OTU                  Count
      Accumulibacter       3
      Unknown              11
      Competibacter        3
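Removing unique sequences, as on slide 20, amounts to counting identical reads and keeping only those seen more than once. A minimal sketch, assuming exact string identity defines "the same sequence"; the function name `drop_unique` and the `min_count` parameter are hypothetical.

```python
# Sketch: drop reads whose exact sequence occurs fewer than min_count times.
from collections import Counter


def drop_unique(sequences, min_count=2):
    """Keep only reads whose sequence is seen at least min_count times."""
    counts = Counter(sequences)
    return [seq for seq in sequences if counts[seq] >= min_count]


toy_reads = ["AAA", "AAA", "CCC", "GGG", "GGG", "GGG"]
kept = drop_unique(toy_reads)
print(len(toy_reads), "->", len(kept))
```

As the slide warns, a rare but real organism in a shallowly sequenced or very complex sample can be represented only by unique reads, so this filter would remove it too.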
  • 21. AAU workflow. Find sample IDs on Google Drive (16SAMP-145, 16SAMP-146, 16SAMP-147, 16SAMP-148, 16SAMP-149, 16SAMP-150); plain text file; 16S.V13.workflow.sh; OTU table (+ R version).
      OTU                  A    B
      Accumulibacter       2    1
      Unknown              3    8
      Competibacter        3    0
  • 22. AAU workflow. What 16S.V13.workflow.sh does:
      1. Find and unpack your samples
      2. Optional subsampling
      3. Remove potential phiX contamination (bowtie2)
      4. Merge read 1 and read 2 (flash)
      5. Remove reads outside length criteria
      6. Optional removal of unique reads and subsampling to even depth
      7. Format reads for QIIME
      8. Cluster reads to OTUs (Uclust, QIIME)
      9. Assign taxonomy (RDP classifier, QIIME + database: MiDAS, Greengenes or Silva)
      10. Generate OTU table (QIIME)
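Steps 5 and 6 of the list above (length filtering and subsampling to even depth) can be sketched in a few lines. This is not the content of 16S.V13.workflow.sh, which is not shown in the slides; the function names, the length bounds and the fixed seed are all assumptions chosen for illustration.

```python
# Sketch of steps 5 and 6: length filter and subsampling to even depth.
# Not taken from 16S.V13.workflow.sh; names and thresholds are illustrative.
import random

MIN_LEN = 425  # hypothetical lower length bound
MAX_LEN = 525  # hypothetical upper length bound


def length_filter(reads, min_len=MIN_LEN, max_len=MAX_LEN):
    """Keep merged reads whose length lies within [min_len, max_len]."""
    return [r for r in reads if min_len <= len(r) <= max_len]


def subsample_even(samples, depth=None, seed=0):
    """Randomly subsample every sample to the same number of reads.

    samples maps a sample ID to its list of reads. If depth is None,
    the size of the smallest sample is used. Samples with fewer reads
    than depth are dropped from the result.
    """
    rng = random.Random(seed)  # fixed seed for reproducible output
    if depth is None:
        depth = min(len(reads) for reads in samples.values())
    return {
        sample_id: rng.sample(reads, depth)
        for sample_id, reads in samples.items()
        if len(reads) >= depth
    }


toy = {"16SAMP-145": list("abcdef"), "16SAMP-146": list("xyz")}
even = subsample_even(toy)
print({sample_id: len(reads) for sample_id, reads in even.items()})
```

Subsampling to the smallest sample keeps every sample but throws away reads from the deeper ones; choosing a higher `depth` keeps more reads per sample at the price of dropping shallow samples.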