SlideShare une entreprise Scribd logo
1  sur  13
Télécharger pour lire hors ligne
Quick-StartGuide

GeneRead DNAseq variant analysis
This guide describes how to analyze sequencing reads generated from a GeneRead DNAseq Gene
Panel kit for targeted exon enrichment. The first section provides instructions for uploading reads,
setting analysis parameters, and downloading the variant report. Next, technical details are
provided for how the sequence reads are mapped to the reference genome. Thirdly, the method
for variant identification and annotation from the aligned reads is described. Finally, definitions for
the terms used in the variant report are provided.

Minimum computer requirements
GeneRead Targeted Exon Enrichment Panel Data Analysis is web-based and compatible with the
following web browsers:


Mozilla® Firefox®



Google® Chrome®

Note: We apologize for the inconvenience, but due to technical issues our software is not
compatible with Microsoft® Internet Explorer®.

Getting started with GeneRead Targeted Exon
Enrichment Panel Data Analysis
1.

Go to http://ngsdataanalysis.sabiosciences.com.

2.

Click on “Add files” and select either the .SFF file (generated from Ion Torrent™ PGM)
or the .FASTQ file (generated from Illumina® sequencers such as MiSeq).
Note: Read files should contain reads from only one sample. For example, if your sequencing
library contains barcoded samples, the reads must already be split by sample.

Sample & Assay Technologies
3.

Click “Start” to start uploading the selected file.

4.

The upload process may take a few minutes or longer to complete.

5.

Select several pieces of information about each sequencing file in the “Analysis
Setup” tab:



Sequencing platform: Ion Torrent PGM / Ion ProtonTM or Illumina



Analysis Mode: Germline or Somatic



6.

GeneRead catalog number

Read Mode: Single-end or Paired-end

For paired-end analysis, specify which file pairs correspond to paired-end reads
(“Make A Pair”). Select the first and second file in a pair, and click on “Make A Pair”.
Note: The files must be paired two at a time, but the order in which the files are paired is not
important. The “Read Mode” for both files will be automatically updated to “Paired-end.”

GeneRead DNAseq variant analysis

page 2 of 13
7.

Select the file to analyze and click on “Add Jobs to Queue.” For a paired-end run,
select only the first file in the pair. Analysis will start.
Note: Analysis will typically take 30 minutes to a few hours. Run time depends on the file size,
the number of files, and how many processes are running concurrently on the servers.

8.

The status and results of the analysis can be checked by clicking the “View Jobs” tab.

GeneRead DNAseq variant analysis

page 3 of 13
9.

While analysis is running, the status will display, “Job in progress.”

10. The status will change to “Job Done Successfully” when analysis of the reads is
complete.
11. Click on “View Analysis Report” to see the results summary and links to download
the results.

12. Files available for download are:


summary.txt: summary statistics for read mapping and variant
calling



summaryByGene.txt: summary statistics for read mapping and
variant calling by gene



reads.trimmed.sff (IonTorrent) or reads.trimmed.fastq (Illumina):
reads in the same format as they were uploaded, after quality
filtering and primer trimming



reads.trimmed.bam: reads aligned to human genome reference



reads.trimmed.bam.bai: index for BAM file (which is necessary to
view the alignments in IGV)



variants.vcf: variants in variant call format (VCF)



variants.xls: variants in Excel® format

Note: Finished jobs and the associated data file will be stored at the analysis site for a
maximum of 30 days. After that, all files will be automatically deleted.
Note: See the following sections for technical background on how the software maps
sequence reads and identifies variants.

GeneRead DNAseq variant analysis

page 4 of 13
Note: Analysis files available for download will appear in a pop-up box.

GeneRead DNAseq variant analysis

page 5 of 13
Read alignment workflow
The uploaded sequence reads are aligned to the reference genome in several steps.

Preliminary alignment to reference genome
A preliminary alignment is performed using the full read set. Illumina reads (such as from MiSeq)
are aligned using Bowtie2 (http://bowtie-bio.sourceforge.net/bowtie2/index.shtml). Ion Torrent
PGM

and

Ion

Proton

reads

are

aligned

using

TMAP

(https://github.com/nh13/TMAP/tree/master/doc). Reads are aligned with hard-clipping on the 5’
end of the read, and soft-clipping on the 3’ end of the read.

Trimming primer region from reads
Priming sites are identified, and primer regions are trimmed at primer binding sites. The trimmed
read file is available for download for users who wish to perform their own alignment and variant
calling.

Quality filter
Reads with an untrimmed length of less than 45 bp are discarded. These short reads can result
from either undesired PCR products or poor sequencing quality.

Final read alignment
Next, final alignment of trimmed reads to the reference genome is performed. Alignment
parameters are identical to those used in the preliminary alignment. This final read alignment is
used for variant calling.

Alignment summary
Finally, basic read and coverage statistics are generated, including the total number of reads
mapped, the median depth of coverage, and the number of targeted bases covered at a depth of
at least 10, 30, and 100. An alignment summary is provided for the entire gene panel, and also for
each gene included in the gene panel.

GeneRead DNAseq variant analysis

page 6 of 13
Variant calling workflow
This section describes the steps involved in the GeneRead variant calling pipeline. The pipeline
incorporates tools and programs from multiple sources, including many that are part of the
Genome Analysis Toolkit (GATK: http://www.broadinstitute.org/gatk/) from the Broad Institute.

Inputs
The following files are inputs for the variant calling pipeline:


reads.trimmed.bam: This file contains the mapped reads after primer trimming.



target.bed: This file contains the coordinates of the target amplicons. This file is
automatically generated from the catalog number of the GeneRead DNAseq Gene Panel
used.



params.txt: This file contains various parameters for variant calling. Some user-supplied
parameters are included, including sequencing platform (such as Ion Torrent PGM or
MiSeq) and the desired workflow (germline or somatic).

Variant calling
Software used for variant calling varies depending on the sequencing platform. For Ion Torrent /
Ion Proton data, the Torrent Variant Caller (TVC) version 2.2 is used for calling the variants. For
Illumina (MiSeq) data, the GATK Unified Genotyper program (GATKLite version 2.1–8) is used. Both
programs generate a variant call format (VCF) file containing both SNPs and indels identified from
the input reads. The GeneRead data analysis workflow runs the GATK Variant Annotator program
using the output from TVC in order to populate the INFO field in the VCF file with parameters
needed for downstream filtering. The INFO field in the GATK Unified Genotyper output is already
populated with all necessary parameters, so the Variant Annotator step is not necessary for these
data (Figure 8).

Variant filtering
Variant filtering is performed in two stages. Stage 1 of the filtering marks variants that fail some of
the thresholds for variant calling. However, note that the variants that fail the filters are not
removed from the VCF file. Instead, the FILTER field in the VCF file is marked with the names of the
filters failed by the variants. We use two filters derived from the recommendations in the GATK best
practices document.
GatkSNPFilter includes the following thresholds:


Fisher Strand (FS: Fisher’s exact test for strand bias). Variants with FS>60 are marked.



RMS Mapping Quality (MQ). Variants with MQ<40 are marked.



ReadPosRankSum. Variants with ReadPosRankSum<–8.0 are marked.

GeneRead DNAseq variant analysis

page 7 of 13
GatkIndelFilter includes the following thresholds:


Variants with FS>200 are marked.



Variants with ReadPosRankSum<–20.0 are marked.

Stage 2 of filtering removes SNPs and indels that do not pass certain variant frequency and
coverage thresholds. Unlike in Stage 1, the variants that fail Stage 2 are removed from the VCF file.
The following filters are used:


SNPs with <4% variant allele frequency are removed in the somatic workflow, and SNPs
with <20% variant allele frequency are removed in the germline workflow.



Indels with <20% variant allele frequency are removed in the somatic workflow, and
indels with <25% variant allele frequency are removed in the germline workflow.

Functional annotation
After filtering, the variants undergo various annotation steps to add useful information. The
following annotation steps are part of the variant calling workflow:


snpEff annotations: The snpEff program (http;//snpeff.sourceforge.net/) predicts the effects
of variants on genes. Information such as gene name, transcript ID, codon change, animo
acid change, and others are added.



dbSNP binding: If the variant is present in dbSNP
(http://www.ncbi.nlm.nih.gov/projects/SNP/), the corresponding dbSNP ID(s) is added to
the variant.



Cosmic binding: If the variant is present in Cosmic
(http://www.sanger.ac.uk/perl/genetics/CGP/cosmic), the corresponding Cosmic ID is
added to the variant.



dbNSFP annotations: SNPSift (http://snpeff.sourceforge.net/SnpSift.html) is used to extract
available annotations from dbNSFP (database of nonsynonymous SNPs’ functional
predictions). Annotations derived from dbNSFP include SIFT score, Polyphen score,
Interpro domain, and Uniport ID.

The final output from the variant calling pipeline is a VCF file named “variants.vcf”, which contains
all the added annotations. This VCF file is also converted to Excel format (“variants.xls”) for
convenience.

GeneRead DNAseq variant analysis

page 8 of 13
BAM file
(mapped reads)

BED file for
amplicons used

Parameters
file

Variant calling and filtering
Torrent Variant
Caller (TVC)

IonTorrent

VCF

GATK Variant Annotator
VCF

VCF

Sequencing
platform

VCF

GATK Unified
Genotyper

MiSeq

GATK Variant Filtration
Additional filtering (based on
frequency and coverage)

VCF

Annotation
snpEff
(basic annotation)
VCF

SnpSift (links to dbSNP Cosmic
,
and computation of Sift scores,
etc.)

dbSNP
Cosmic
dbNSFP

VCF

VCF to Excel/html

Variants in Excel format

Figure 1. GeneRead variant calling workflow.

GeneRead DNAseq variant analysis

page 9 of 13
Functional annotation
After filtering, the variants undergo various annotation steps to add useful information. The
following annotation steps are part of the variant calling workflow:


snpEff annotations: The snpEff program (http;//snpeff.sourceforge.net/) predicts the effects
of variants on genes. Information such as gene name, transcript ID, codon change, animo
acid change, and others are added.



dbSNP binding: If the variant is present in dbSNP
(http://www.ncbi.nlm.nih.gov/projects/SNP/), the corresponding dbSNP ID(s) is added to
the variant.



Cosmic binding: If the variant is present in Cosmic
(http://www.sanger.ac.uk/perl/genetics/CGP/cosmic), the corresponding Cosmic ID is
added to the variant.



dbNSFP annotations: SNPSift (http://snpeff.sourceforge.net/SnpSift.html) is used to extract
available annotations from dbNSFP (database of nonsynonymous SNPs’ functional
predictions). Annotations derived from dbNSFP include SIFT score, Polyphen score,
Interpro domain, and Uniport ID.

The final output from the variant calling pipeline is a VCF file named “variants.vcf”, which contains
all the added annotations. This VCF file is also converted to Excel format (“variants.xls”) for
convenience.

GeneRead DNAseq variant analysis

page 10 of 13
Terms and definitions in the variant report
Position: Position within the chromosome.
ID: dbSNP or Cosmic ID. Currently, dbSNP version 135 and cosmic version 60 are used. This field
can have multiple IDs separated by a space.
Ref: Reference allele at the position. UCSC hg19 is currently used as the reference. For point
mutations, the reference allele will be a single nucleotide. For InDels or MNPs, the reference allele
can be longer than one nucleotide.
Alt: A list of alternate alleles separated by commas.
Gene Name: Gene within which the mutation is contained.
Mutations Type: Type of mutation or variant. The value will be either SNP, MNP, INS (insertion),
or DEL (deletion).
Codon Change:

Change in the nucleotide space. Codon change is presented in the format

c.999A>B, where “999” is the chromosomal position, “A” is the reference allele, and “B” is the
alternate allele.
AA Change: Change in the amino acid space. AA change is presented in the format p.A999B,
where “A” is a string with the reference amino acid(s), “999” is the first position within the residue
that is different, and “B” is a string with the alternative amino acid(s) present in the variant.
Filtered Coverage: Coverage depth which is taken into consideration in variant calling. This does
not include reads with low base quality at the current position, and hence is generally smaller than
the actual read depth at the position.
Allele Coverage: Coverage depth of each allele at the position. The REF allele is listed first,
followed by each of the alleles in ALT field, in the same order as they appear in the ALT field.
Allele Frequency: Frequency of the REF and ALT alleles, listed in the same order as in the “Allele
Coverage” field.
Variant Frequency: Frequency of the most common ALT allele (within the sample) at the current
position.
SNPEFF Effect: Most significant effect of the variant, as predicted by snpEff. For the list of all
possible effects, please refer to
http://snpeff.sourceforge.net/faq.html#What_effects_are_predicted?.
SNPEFF Impact: Impact of the SNP, as predicted by snpEff. The value will be listed as “HIGH,”
“MODERATE,” “LOW,” or “MODIFIER,” in decreasing order of importance. For detailed
information about the possible values, please refer to
http://snpeff.sourceforge.net/faq.html#How_is_impact_categorized?_(VCF_output).
Filter: “PASS” indicates that variant passes all the filters. Otherwise, this field contains a comma
separated list of filters that the variant files. “BroadSNPFilter” indicates that the variant fails some of
the thresholds recommended by the Broad Institute for SNPs, as listed below:
GeneRead DNAseq variant analysis

page 11 of 13


MQ<40 (MQ: RMS Mapping Quality)
http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_anno
tator_RMSMappingQuality.html



FS>60 (FS: Fisher Strand)
http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_anno
tator_FisherStrand.html



ReadPosRankSum<–8.0
http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_anno
tator_ReadPosRankSumTest.html

“BroadIndelFilter” indicates that the variant fails some of the thresholds recommended by the Broad
Institute for indels:


FS>200



ReadPosRankSum<–20.0

P-Value: The Phred quality score of the SNP converted to a p-value. The value indicates the
likelihood that the variant call is a false positive.
SIFT Score: A score assigned by the SIFT program that indicates if the amino acid substitution
affects protein function. Please refer to http://sift.jcvi.org/.
Interpro Domain: Domain or conserved site in which the variant is located.
Uniprot Acc: Uniprot accession number.
Polyphen Score: Score computed by the Polyphen program that predicts the impact of amino acid
substitution of the structure and function of the protein. Please refer to
http://genetics.bwh.harvard.edu/pph/pph_help_text.html.

GeneRead DNAseq variant analysis

page 12 of 13
For up-to-date licensing and product-specific disclaimers, see the respective QIAGEN kit handbook
or user manual. QIAGEN handbooks can be requested from QIAGEN Technical Service or your
local QIAGEN distributor. Selected handbooks can be downloaded from
www.qiagen.com/literature. Safety data sheets (SDS) for any QIAGEN product can be downloaded
from www.qiagen.com/safety.

GeneRead DNAseq Gene Panels are intended for molecular biology applications. These products
are not intended for the diagnosis, prevention, or treatment of a disease.

Trademarks: QIAGEN® (QIAGEN Group); Chrome®, Google® (Google, Inc.); Illumina® (Illumina, Inc.); Ion
Proton™, Ion Torrent™ (Life Technologies Corporation); Excel®, Internet Explorer®, Microsoft® (Microsoft
Corporation); Firefox®, Mozilla® (Mozilla Foundation).
1074970 Nov-12 © 2012 QIAGEN, all rights reserved.

www.qiagen.com

France  01-60-920-930

The Netherlands  0800 0229592

Australia  1-800-243-800

Germany  02103-29-12000

Norway  800-18859

Austria  0800/281010

Hong Kong  800 933 965

Singapore  65-67775366

Belgium  0800-79612

Ireland  1800 555 049

Spain  91-630-7050

Canada  800-572-9613

Italy  800 787980

Sweden  020-790282

China  021-51345678

Japan  03-6890-7300

Switzerland  055-254-22-11

Denmark  80-885945

Korea (South)  1544 7145

UK  0808-2343665

Finland  0800-914416

Luxembourg  8002 2076

USA  800-426-8157

Sample & Assay Technologies

Contenu connexe

En vedette

Bioinformatics t2-databases wim-vancriekinge_v2013
Bioinformatics t2-databases wim-vancriekinge_v2013Bioinformatics t2-databases wim-vancriekinge_v2013
Bioinformatics t2-databases wim-vancriekinge_v2013Prof. Wim Van Criekinge
 
Evolution2014 desert mouse talk
Evolution2014 desert mouse talkEvolution2014 desert mouse talk
Evolution2014 desert mouse talkMatthew MacManes
 
ASHG sequencing workshop
ASHG sequencing workshopASHG sequencing workshop
ASHG sequencing workshopruthburton
 
Introducing data analysis: reads to results
Introducing data analysis: reads to resultsIntroducing data analysis: reads to results
Introducing data analysis: reads to resultsAGRF_Ltd
 
2012 erin-crc-nih-seattle
2012 erin-crc-nih-seattle2012 erin-crc-nih-seattle
2012 erin-crc-nih-seattlec.titus.brown
 
Workshop NGS data analysis - 3
Workshop NGS data analysis - 3Workshop NGS data analysis - 3
Workshop NGS data analysis - 3Maté Ongenaert
 
Next Generation Sequencing
Next Generation SequencingNext Generation Sequencing
Next Generation SequencingSurya Saha
 
Assembly: before and after
Assembly: before and afterAssembly: before and after
Assembly: before and afterLex Nederbragt
 
Sequencing: The Next Generation
Sequencing: The Next GenerationSequencing: The Next Generation
Sequencing: The Next GenerationSurya Saha
 
RNA-seq quality control and pre-processing
RNA-seq quality control and pre-processingRNA-seq quality control and pre-processing
RNA-seq quality control and pre-processingmikaelhuss
 
Workshop NGS data analysis - 2
Workshop NGS data analysis - 2Workshop NGS data analysis - 2
Workshop NGS data analysis - 2Maté Ongenaert
 
ECCB 2010 Next-gen sequencing Tutorial
ECCB 2010 Next-gen sequencing TutorialECCB 2010 Next-gen sequencing Tutorial
ECCB 2010 Next-gen sequencing TutorialThomas Keane
 
Introduction to second generation sequencing
Introduction to second generation sequencingIntroduction to second generation sequencing
Introduction to second generation sequencingDenis C. Bauer
 
Examining gene expression and methylation with next gen sequencing
Examining gene expression and methylation with next gen sequencingExamining gene expression and methylation with next gen sequencing
Examining gene expression and methylation with next gen sequencingStephen Turner
 

En vedette (20)

Bioinformatica 06-10-2011-t2-databases
Bioinformatica 06-10-2011-t2-databasesBioinformatica 06-10-2011-t2-databases
Bioinformatica 06-10-2011-t2-databases
 
Bioinformatics t2-databases wim-vancriekinge_v2013
Bioinformatics t2-databases wim-vancriekinge_v2013Bioinformatics t2-databases wim-vancriekinge_v2013
Bioinformatics t2-databases wim-vancriekinge_v2013
 
Evolution2014 desert mouse talk
Evolution2014 desert mouse talkEvolution2014 desert mouse talk
Evolution2014 desert mouse talk
 
ASHG sequencing workshop
ASHG sequencing workshopASHG sequencing workshop
ASHG sequencing workshop
 
De Novo
De NovoDe Novo
De Novo
 
NGS - QC & Dataformat
NGS - QC & Dataformat NGS - QC & Dataformat
NGS - QC & Dataformat
 
Introducing data analysis: reads to results
Introducing data analysis: reads to resultsIntroducing data analysis: reads to results
Introducing data analysis: reads to results
 
Eshg sequencing workshop
Eshg sequencing workshopEshg sequencing workshop
Eshg sequencing workshop
 
2012 erin-crc-nih-seattle
2012 erin-crc-nih-seattle2012 erin-crc-nih-seattle
2012 erin-crc-nih-seattle
 
Workshop NGS data analysis - 3
Workshop NGS data analysis - 3Workshop NGS data analysis - 3
Workshop NGS data analysis - 3
 
Next Generation Sequencing
Next Generation SequencingNext Generation Sequencing
Next Generation Sequencing
 
Assembly: before and after
Assembly: before and afterAssembly: before and after
Assembly: before and after
 
Sequencing: The Next Generation
Sequencing: The Next GenerationSequencing: The Next Generation
Sequencing: The Next Generation
 
RNA-seq quality control and pre-processing
RNA-seq quality control and pre-processingRNA-seq quality control and pre-processing
RNA-seq quality control and pre-processing
 
Curso de Genómica - UAT (VHIR) 2012 - Análisis de datos de NGS
Curso de Genómica - UAT (VHIR) 2012 - Análisis de datos de NGSCurso de Genómica - UAT (VHIR) 2012 - Análisis de datos de NGS
Curso de Genómica - UAT (VHIR) 2012 - Análisis de datos de NGS
 
Workshop NGS data analysis - 2
Workshop NGS data analysis - 2Workshop NGS data analysis - 2
Workshop NGS data analysis - 2
 
ECCB 2010 Next-gen sequencing Tutorial
ECCB 2010 Next-gen sequencing TutorialECCB 2010 Next-gen sequencing Tutorial
ECCB 2010 Next-gen sequencing Tutorial
 
Introduction to second generation sequencing
Introduction to second generation sequencingIntroduction to second generation sequencing
Introduction to second generation sequencing
 
RNAseq Analysis
RNAseq AnalysisRNAseq Analysis
RNAseq Analysis
 
Examining gene expression and methylation with next gen sequencing
Examining gene expression and methylation with next gen sequencingExamining gene expression and methylation with next gen sequencing
Examining gene expression and methylation with next gen sequencing
 

Similaire à Hb 1486-001 1074970 qsg-gene_readdataanalysis_1112

EPANET Programmer's Toolkit
EPANET Programmer's ToolkitEPANET Programmer's Toolkit
EPANET Programmer's ToolkitMawar 99
 
Tools for Transcriptome Data Analysis
Tools for Transcriptome Data AnalysisTools for Transcriptome Data Analysis
Tools for Transcriptome Data AnalysisSANJANA PANDEY
 
1st KeyStone Summer School - Hackathon Challenge
1st KeyStone Summer School - Hackathon Challenge1st KeyStone Summer School - Hackathon Challenge
1st KeyStone Summer School - Hackathon ChallengeJoel Azzopardi
 
Java Performance & Profiling
Java Performance & ProfilingJava Performance & Profiling
Java Performance & ProfilingIsuru Perera
 
Quick start with pallaral meta on window10 virtual desktop-virtualbox linux u...
Quick start with pallaral meta on window10 virtual desktop-virtualbox linux u...Quick start with pallaral meta on window10 virtual desktop-virtualbox linux u...
Quick start with pallaral meta on window10 virtual desktop-virtualbox linux u...Thomas K. Y. Lam
 
Functional ANNOTATION OF GENOME.pptx
Functional ANNOTATION OF GENOME.pptxFunctional ANNOTATION OF GENOME.pptx
Functional ANNOTATION OF GENOME.pptxUmerjibranRaza
 
LUGM-Update of the Illumina Analysis Pipeline
LUGM-Update of the Illumina Analysis PipelineLUGM-Update of the Illumina Analysis Pipeline
LUGM-Update of the Illumina Analysis PipelineHai-Wei Yen
 
PhyloPipe.v1.1_manual_20150610
PhyloPipe.v1.1_manual_20150610PhyloPipe.v1.1_manual_20150610
PhyloPipe.v1.1_manual_20150610Yixuan Guo
 
CDS Filtering Program - User Manual
CDS Filtering Program - User ManualCDS Filtering Program - User Manual
CDS Filtering Program - User ManualYoann Pageaud
 
Creation of a Test Bed Environment for Core Java Applications using White Box...
Creation of a Test Bed Environment for Core Java Applications using White Box...Creation of a Test Bed Environment for Core Java Applications using White Box...
Creation of a Test Bed Environment for Core Java Applications using White Box...cscpconf
 
BC-Cancer ChimeraScan Presentation
BC-Cancer ChimeraScan PresentationBC-Cancer ChimeraScan Presentation
BC-Cancer ChimeraScan PresentationElijah Willie
 
Dgaston dec-06-2012
Dgaston dec-06-2012Dgaston dec-06-2012
Dgaston dec-06-2012Dan Gaston
 
Cool Informatics Tools and Services for Biomedical Research
Cool Informatics Tools and Services for Biomedical ResearchCool Informatics Tools and Services for Biomedical Research
Cool Informatics Tools and Services for Biomedical ResearchDavid Ruau
 
Project 2 Assigned Tuesday February 21tst2023 Due Tuesd.pdf
Project 2 Assigned Tuesday February 21tst2023 Due Tuesd.pdfProject 2 Assigned Tuesday February 21tst2023 Due Tuesd.pdf
Project 2 Assigned Tuesday February 21tst2023 Due Tuesd.pdfabhaykush25
 

Similaire à Hb 1486-001 1074970 qsg-gene_readdataanalysis_1112 (20)

EPANET Programmer's Toolkit
EPANET Programmer's ToolkitEPANET Programmer's Toolkit
EPANET Programmer's Toolkit
 
Raptor user manual3.0
Raptor user manual3.0Raptor user manual3.0
Raptor user manual3.0
 
Tools for Transcriptome Data Analysis
Tools for Transcriptome Data AnalysisTools for Transcriptome Data Analysis
Tools for Transcriptome Data Analysis
 
1st KeyStone Summer School - Hackathon Challenge
1st KeyStone Summer School - Hackathon Challenge1st KeyStone Summer School - Hackathon Challenge
1st KeyStone Summer School - Hackathon Challenge
 
Java Performance & Profiling
Java Performance & ProfilingJava Performance & Profiling
Java Performance & Profiling
 
11i Logs
11i Logs11i Logs
11i Logs
 
Quick start with pallaral meta on window10 virtual desktop-virtualbox linux u...
Quick start with pallaral meta on window10 virtual desktop-virtualbox linux u...Quick start with pallaral meta on window10 virtual desktop-virtualbox linux u...
Quick start with pallaral meta on window10 virtual desktop-virtualbox linux u...
 
Functional ANNOTATION OF GENOME.pptx
Functional ANNOTATION OF GENOME.pptxFunctional ANNOTATION OF GENOME.pptx
Functional ANNOTATION OF GENOME.pptx
 
LUGM-Update of the Illumina Analysis Pipeline
LUGM-Update of the Illumina Analysis PipelineLUGM-Update of the Illumina Analysis Pipeline
LUGM-Update of the Illumina Analysis Pipeline
 
PhyloPipe.v1.1_manual_20150610
PhyloPipe.v1.1_manual_20150610PhyloPipe.v1.1_manual_20150610
PhyloPipe.v1.1_manual_20150610
 
CDS Filtering Program - User Manual
CDS Filtering Program - User ManualCDS Filtering Program - User Manual
CDS Filtering Program - User Manual
 
Creation of a Test Bed Environment for Core Java Applications using White Box...
Creation of a Test Bed Environment for Core Java Applications using White Box...Creation of a Test Bed Environment for Core Java Applications using White Box...
Creation of a Test Bed Environment for Core Java Applications using White Box...
 
BC-Cancer ChimeraScan Presentation
BC-Cancer ChimeraScan PresentationBC-Cancer ChimeraScan Presentation
BC-Cancer ChimeraScan Presentation
 
Dgaston dec-06-2012
Dgaston dec-06-2012Dgaston dec-06-2012
Dgaston dec-06-2012
 
Rna rocket demo
Rna rocket demoRna rocket demo
Rna rocket demo
 
Bioinformatics
BioinformaticsBioinformatics
Bioinformatics
 
DNA Baser and Darwin
DNA Baser and DarwinDNA Baser and Darwin
DNA Baser and Darwin
 
Dna baser
Dna baserDna baser
Dna baser
 
Cool Informatics Tools and Services for Biomedical Research
Cool Informatics Tools and Services for Biomedical ResearchCool Informatics Tools and Services for Biomedical Research
Cool Informatics Tools and Services for Biomedical Research
 
Project 2 Assigned Tuesday February 21tst2023 Due Tuesd.pdf
Project 2 Assigned Tuesday February 21tst2023 Due Tuesd.pdfProject 2 Assigned Tuesday February 21tst2023 Due Tuesd.pdf
Project 2 Assigned Tuesday February 21tst2023 Due Tuesd.pdf
 

Plus de Elsa von Licy

Styles of Scientific Reasoning, Scientific Practices and Argument in Science ...
Styles of Scientific Reasoning, Scientific Practices and Argument in Science ...Styles of Scientific Reasoning, Scientific Practices and Argument in Science ...
Styles of Scientific Reasoning, Scientific Practices and Argument in Science ...Elsa von Licy
 
Strategie Decisions Incertitude Actes conference fnege xerfi
Strategie Decisions Incertitude Actes conference fnege xerfiStrategie Decisions Incertitude Actes conference fnege xerfi
Strategie Decisions Incertitude Actes conference fnege xerfiElsa von Licy
 
Neuropsychophysiologie
NeuropsychophysiologieNeuropsychophysiologie
NeuropsychophysiologieElsa von Licy
 
L agressivite en psychanalyse (21 pages 184 ko)
L agressivite en psychanalyse (21 pages   184 ko)L agressivite en psychanalyse (21 pages   184 ko)
L agressivite en psychanalyse (21 pages 184 ko)Elsa von Licy
 
C1 clef pour_la_neuro
C1 clef pour_la_neuroC1 clef pour_la_neuro
C1 clef pour_la_neuroElsa von Licy
 
Vuillez jean philippe_p01
Vuillez jean philippe_p01Vuillez jean philippe_p01
Vuillez jean philippe_p01Elsa von Licy
 
Spr ue3.1 poly cours et exercices
Spr ue3.1   poly cours et exercicesSpr ue3.1   poly cours et exercices
Spr ue3.1 poly cours et exercicesElsa von Licy
 
Plan de cours all l1 l2l3m1m2 p
Plan de cours all l1 l2l3m1m2 pPlan de cours all l1 l2l3m1m2 p
Plan de cours all l1 l2l3m1m2 pElsa von Licy
 
Bioph pharm 1an-viscosit-des_liquides_et_des_solutions
Bioph pharm 1an-viscosit-des_liquides_et_des_solutionsBioph pharm 1an-viscosit-des_liquides_et_des_solutions
Bioph pharm 1an-viscosit-des_liquides_et_des_solutionsElsa von Licy
 
Poly histologie-et-embryologie-medicales
Poly histologie-et-embryologie-medicalesPoly histologie-et-embryologie-medicales
Poly histologie-et-embryologie-medicalesElsa von Licy
 
Methodes travail etudiants
Methodes travail etudiantsMethodes travail etudiants
Methodes travail etudiantsElsa von Licy
 
Atelier.etude.efficace
Atelier.etude.efficaceAtelier.etude.efficace
Atelier.etude.efficaceElsa von Licy
 
There is no_such_thing_as_a_social_science_intro
There is no_such_thing_as_a_social_science_introThere is no_such_thing_as_a_social_science_intro
There is no_such_thing_as_a_social_science_introElsa von Licy
 

Plus de Elsa von Licy (20)

Styles of Scientific Reasoning, Scientific Practices and Argument in Science ...
Styles of Scientific Reasoning, Scientific Practices and Argument in Science ...Styles of Scientific Reasoning, Scientific Practices and Argument in Science ...
Styles of Scientific Reasoning, Scientific Practices and Argument in Science ...
 
Strategie Decisions Incertitude Actes conference fnege xerfi
Strategie Decisions Incertitude Actes conference fnege xerfiStrategie Decisions Incertitude Actes conference fnege xerfi
Strategie Decisions Incertitude Actes conference fnege xerfi
 
Rainville pierre
Rainville pierreRainville pierre
Rainville pierre
 
Neuropsychophysiologie
NeuropsychophysiologieNeuropsychophysiologie
Neuropsychophysiologie
 
L agressivite en psychanalyse (21 pages 184 ko)
L agressivite en psychanalyse (21 pages   184 ko)L agressivite en psychanalyse (21 pages   184 ko)
L agressivite en psychanalyse (21 pages 184 ko)
 
C1 clef pour_la_neuro
C1 clef pour_la_neuroC1 clef pour_la_neuro
C1 clef pour_la_neuro
 
Hemostase polycop
Hemostase polycopHemostase polycop
Hemostase polycop
 
Antiphilos
AntiphilosAntiphilos
Antiphilos
 
Vuillez jean philippe_p01
Vuillez jean philippe_p01Vuillez jean philippe_p01
Vuillez jean philippe_p01
 
Spr ue3.1 poly cours et exercices
Spr ue3.1   poly cours et exercicesSpr ue3.1   poly cours et exercices
Spr ue3.1 poly cours et exercices
 
Plan de cours all l1 l2l3m1m2 p
Plan de cours all l1 l2l3m1m2 pPlan de cours all l1 l2l3m1m2 p
Plan de cours all l1 l2l3m1m2 p
 
M2 bmc2007 cours01
M2 bmc2007 cours01M2 bmc2007 cours01
M2 bmc2007 cours01
 
Feuilletage
FeuilletageFeuilletage
Feuilletage
 
Chapitre 1
Chapitre 1Chapitre 1
Chapitre 1
 
Biophy
BiophyBiophy
Biophy
 
Bioph pharm 1an-viscosit-des_liquides_et_des_solutions
Bioph pharm 1an-viscosit-des_liquides_et_des_solutionsBioph pharm 1an-viscosit-des_liquides_et_des_solutions
Bioph pharm 1an-viscosit-des_liquides_et_des_solutions
 
Poly histologie-et-embryologie-medicales
Poly histologie-et-embryologie-medicalesPoly histologie-et-embryologie-medicales
Poly histologie-et-embryologie-medicales
 
Methodes travail etudiants
Methodes travail etudiantsMethodes travail etudiants
Methodes travail etudiants
 
Atelier.etude.efficace
Atelier.etude.efficaceAtelier.etude.efficace
Atelier.etude.efficace
 
There is no_such_thing_as_a_social_science_intro
There is no_such_thing_as_a_social_science_introThere is no_such_thing_as_a_social_science_intro
There is no_such_thing_as_a_social_science_intro
 

Hb 1486-001 1074970 qsg-gene_readdataanalysis_1112

  • 1. Quick-StartGuide GeneRead DNAseq variant analysis This guide describes how to analyze sequencing reads generated from a GeneRead DNAseq Gene Panel kit for targeted exon enrichment. The first section provides instructions for uploading reads, setting analysis parameters, and downloading the variant report. Next, technical details are provided for how the sequence reads are mapped to the reference genome. Thirdly, the method for variant identification and annotation from the aligned reads is described. Finally, definitions for the terms used in the variant report are provided. Minimum computer requirements GeneRead Targeted Exon Enrichment Panel Data Analysis is web-based and compatible with the following web browsers:  Mozilla® Firefox®  Google® Chrome® Note: We apologize for the inconvenience, but due to technical issues our software is not compatible with Microsoft® Internet Explorer®. Getting started with GeneRead Targeted Exon Enrichment Panel Data Analysis 1. Go to http://ngsdataanalysis.sabiosciences.com. 2. Click on “Add files” and select either the .SFF file (generated from Ion Torrent™ PGM) or the .FASTQ file (generated from Illumina® sequencers such as MiSeq). Note: Read files should contain reads from only one sample. For example, if your sequencing library contains barcoded samples, the reads must already be split by sample. Sample & Assay Technologies
  • 2. 3. Click “Start” to start uploading the selected file. 4. The upload process may take a few minutes or longer to complete. 5. Select several pieces of information about each sequencing file in the “Analysis Setup” tab:   Sequencing platform: Ion Torrent PGM / Ion ProtonTM or Illumina  Analysis Mode: Germline or Somatic  6. GeneRead catalog number Read Mode: Single-end or Paired-end For paired-end analysis, specify which file pairs correspond to paired-end reads (“Make A Pair”). Select the first and second file in a pair, and click on “Make A Pair”. Note: The files must be paired two at a time, but the order in which the files are paired is not important. The “Read Mode” for both files will be automatically updated to “Paired-end.” GeneRead DNAseq variant analysis page 2 of 13
  • 3. 7. Select the file to analyze and click on “Add Jobs to Queue.” For a paired-end run, select only the first file in the pair. Analysis will start. Note: Analysis will typically take 30 minutes to a few hours. Run time depends on the file size, the number of files, and how many processes are running concurrently on the servers. 8. The status and results of the analysis can be checked by clicking the “View Jobs” tab. GeneRead DNAseq variant analysis page 3 of 13
  • 4. 9. While analysis is running, the status will display, “Job in progress.” 10. The status will change to “Job Done Successfully” when analysis of the reads is complete. 11. Click on “View Analysis Report” to see the results summary and links to download the results. 12. Files available for download are:  summary.txt: summary statistics for read mapping and variant calling  summaryByGene.txt: summary statistics for read mapping and variant calling by gene  reads.trimmed.sff (IonTorrent) or reads.trimmed.fastq (Illumina): reads in the same format as they were uploaded, after quality filtering and primer trimming  reads.trimmed.bam: reads aligned to human genome reference  reads.trimmed.bam.bai: index for BAM file (which is necessary to view the alignments in IGV)  variants.vcf: variants in variant call format (VCF)  variants.xls: variants in Excel® format Note: Finished jobs and the associated data file will be stored at the analysis site for a maximum of 30 days. After that, all files will be automatically deleted. Note: See the following sections for technical background on how the software maps sequence reads and identifies variants. GeneRead DNAseq variant analysis page 4 of 13
  • 5. Note: Analysis files available for download will appear in a pop-up box. GeneRead DNAseq variant analysis page 5 of 13
  • 6. Read alignment workflow The uploaded sequence reads are aligned to the reference genome in several steps. Preliminary alignment to reference genome A preliminary alignment is performed using the full read set. Illumina reads (such as from MiSeq) are aligned using Bowtie2 (http://bowtie-bio.sourceforge.net/bowtie2/index.shtml). Ion Torrent PGM and Ion Proton reads are aligned using TMAP (https://github.com/nh13/TMAP/tree/master/doc). Reads are aligned with hard-clipping on the 5’ end of the read, and soft-clipping on the 3’ end of the read. Trimming primer region from reads Priming sites are identified, and primer regions are trimmed at primer binding sites. The trimmed read file is available for download for users who wish to perform their own alignment and variant calling. Quality filter Reads with an untrimmed length of less than 45 bp are discarded. These short reads can result from either undesired PCR products or poor sequencing quality. Final read alignment Next, final alignment of trimmed reads to the reference genome is performed. Alignment parameters are identical to those used in the preliminary alignment. This final read alignment is used for variant calling. Alignment summary Finally, basic read and coverage statistics are generated, including the total number of reads mapped, the median depth of coverage, and the number of targeted bases covered at a depth of at least 10, 30, and 100. An alignment summary is provided for the entire gene panel, and also for each gene included in the gene panel. GeneRead DNAseq variant analysis page 6 of 13
  • 7. Variant calling workflow This section describes the steps involved in the GeneRead variant calling pipeline. The pipeline incorporates tools and programs from multiple sources, including many that are part of the Genome Analysis Toolkit (GATK: http://www.broadinstitute.org/gatk/) from the Broad Institute. Inputs The following files are inputs for the variant calling pipeline:  reads.trimmed.bam: This file contains the mapped reads after primer trimming.  target.bed: This file contains the coordinates of the target amplicons. This file is automatically generated from the catalog number of the GeneRead DNAseq Gene Panel used.  params.txt: This file contains various parameters for variant calling. Some user-supplied parameters are included, including sequencing platform (such as Ion Torrent PGM or MiSeq) and the desired workflow (germline or somatic). Variant calling Software used for variant calling varies depending on the sequencing platform. For Ion Torrent / Ion Proton data, the Torrent Variant Caller (TVC) version 2.2 is used for calling the variants. For Illumina (MiSeq) data, the GATK Unified Genotyper program (GATKLite version 2.1–8) is used. Both programs generate a variant call format (VCF) file containing both SNPs and indels identified from the input reads. The GeneRead data analysis workflow runs the GATK Variant Annotator program using the output from TVC in order to populate the INFO field in the VCF file with parameters needed for downstream filtering. The INFO field in the GATK Unified Genotyper output is already populated with all necessary parameters, so the Variant Annotator step is not necessary for these data (Figure 8). Variant filtering Variant filtering is performed in two stages. Stage 1 of the filtering marks variants that fail some of the thresholds for variant calling. However, note that the variants that fail the filters are not removed from the VCF file. Instead, the FILTER field in the VCF file is marked with the names of the filters failed by the variants. We use two filters derived from the recommendations in the GATK best practices document. GatkSNPFilter includes the following thresholds:  Fisher Strand (FS: Fisher’s exact test for strand bias). Variants with FS>60 are marked.  RMS Mapping Quality (MQ). Variants with MQ<40 are marked.  ReadPosRankSum. Variants with ReadPosRankSum<–8.0 are marked. GeneRead DNAseq variant analysis page 7 of 13
  • 8. GatkIndelFilter includes the following thresholds:  Variants with FS>200 are marked.  Variants with ReadPosRankSum<–20.0 are marked. Stage 2 of filtering removes SNPs and indels that do not pass certain variant frequency and coverage thresholds. Unlike in Stage 1, the variants that fail Stage 2 are removed from the VCF file. The following filters are used:  SNPs with <4% variant allele frequency are removed in the somatic workflow, and SNPs with <20% variant allele frequency are removed in the germline workflow.  Indels with <20% variant allele frequency are removed in the somatic workflow, and indels with <25% variant allele frequency are removed in the germline workflow. Functional annotation After filtering, the variants undergo various annotation steps to add useful information. The following annotation steps are part of the variant calling workflow:  snpEff annotations: The snpEff program (http;//snpeff.sourceforge.net/) predicts the effects of variants on genes. Information such as gene name, transcript ID, codon change, animo acid change, and others are added.  dbSNP binding: If the variant is present in dbSNP (http://www.ncbi.nlm.nih.gov/projects/SNP/), the corresponding dbSNP ID(s) is added to the variant.  Cosmic binding: If the variant is present in Cosmic (http://www.sanger.ac.uk/perl/genetics/CGP/cosmic), the corresponding Cosmic ID is added to the variant.  dbNSFP annotations: SNPSift (http://snpeff.sourceforge.net/SnpSift.html) is used to extract available annotations from dbNSFP (database of nonsynonymous SNPs’ functional predictions). Annotations derived from dbNSFP include SIFT score, Polyphen score, Interpro domain, and Uniport ID. The final output from the variant calling pipeline is a VCF file named “variants.vcf”, which contains all the added annotations. This VCF file is also converted to Excel format (“variants.xls”) for convenience. GeneRead DNAseq variant analysis page 8 of 13
  • 9. BAM file (mapped reads) BED file for amplicons used Parameters file Variant calling and filtering Torrent Variant Caller (TVC) IonTorrent VCF GATK Variant Annotator VCF VCF Sequencing platform VCF GATK Unified Genotyper MiSeq GATK Variant Filtration Additional filtering (based on frequency and coverage) VCF Annotation snpEff (basic annotation) VCF SnpSift (links to dbSNP Cosmic , and computation of Sift scores, etc.) dbSNP Cosmic dbNSFP VCF VCF to Excel/html Variants in Excel format Figure 1. GeneRead variant calling workflow. GeneRead DNAseq variant analysis page 9 of 13
  • 10. Functional annotation After filtering, the variants undergo various annotation steps to add useful information. The following annotation steps are part of the variant calling workflow:  snpEff annotations: The snpEff program (http;//snpeff.sourceforge.net/) predicts the effects of variants on genes. Information such as gene name, transcript ID, codon change, animo acid change, and others are added.  dbSNP binding: If the variant is present in dbSNP (http://www.ncbi.nlm.nih.gov/projects/SNP/), the corresponding dbSNP ID(s) is added to the variant.  Cosmic binding: If the variant is present in Cosmic (http://www.sanger.ac.uk/perl/genetics/CGP/cosmic), the corresponding Cosmic ID is added to the variant.  dbNSFP annotations: SNPSift (http://snpeff.sourceforge.net/SnpSift.html) is used to extract available annotations from dbNSFP (database of nonsynonymous SNPs’ functional predictions). Annotations derived from dbNSFP include SIFT score, Polyphen score, Interpro domain, and Uniport ID. The final output from the variant calling pipeline is a VCF file named “variants.vcf”, which contains all the added annotations. This VCF file is also converted to Excel format (“variants.xls”) for convenience. GeneRead DNAseq variant analysis page 10 of 13
  • 11. Terms and definitions in the variant report Position: Position within the chromosome. ID: dbSNP or Cosmic ID. Currently, dbSNP version 135 and cosmic version 60 are used. This field can have multiple IDs separated by a space. Ref: Reference allele at the position. UCSC hg19 is currently used as the reference. For point mutations, the reference allele will be a single nucleotide. For InDels or MNPs, the reference allele can be longer than one nucleotide. Alt: A list of alternate alleles separated by commas. Gene Name: Gene within which the mutation is contained. Mutations Type: Type of mutation or variant. The value will be either SNP, MNP, INS (insertion), or DEL (deletion). Codon Change: Change in the nucleotide space. Codon change is presented in the format c.999A>B, where “999” is the chromosomal position, “A” is the reference allele, and “B” is the alternate allele. AA Change: Change in the amino acid space. AA change is presented in the format p.A999B, where “A” is a string with the reference amino acid(s), “999” is the first position within the residue that is different, and “B” is a string with the alternative amino acid(s) present in the variant. Filtered Coverage: Coverage depth which is taken into consideration in variant calling. This does not include reads with low base quality at the current position, and hence is generally smaller than the actual read depth at the position. Allele Coverage: Coverage depth of each allele at the position. The REF allele is listed first, followed by each of the alleles in ALT field, in the same order as they appear in the ALT field. Allele Frequency: Frequency of the REF and ALT alleles, listed in the same order as in the “Allele Coverage” field. Variant Frequency: Frequency of the most common ALT allele (within the sample) at the current position. SNPEFF Effect: Most significant effect of the variant, as predicted by snpEff. For the list of all possible effects, please refer to http://snpeff.sourceforge.net/faq.html#What_effects_are_predicted?. SNPEFF Impact: Impact of the SNP, as predicted by snpEff. The value will be listed as “HIGH,” “MODERATE,” “LOW,” or “MODIFIER,” in decreasing order of importance. For detailed information about the possible values, please refer to http://snpeff.sourceforge.net/faq.html#How_is_impact_categorized?_(VCF_output). Filter: “PASS” indicates that variant passes all the filters. Otherwise, this field contains a comma separated list of filters that the variant files. “BroadSNPFilter” indicates that the variant fails some of the thresholds recommended by the Broad Institute for SNPs, as listed below: GeneRead DNAseq variant analysis page 11 of 13
  • 12.  MQ<40 (MQ: RMS Mapping Quality) http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_anno tator_RMSMappingQuality.html  FS>60 (FS: Fisher Strand) http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_anno tator_FisherStrand.html  ReadPosRankSum<–8.0 http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_anno tator_ReadPosRankSumTest.html “BroadIndelFilter” indicates that the variant fails some of the thresholds recommended by the Broad Institute for indels:  FS>200  ReadPosRankSum<–20.0 P-Value: The Phred quality score of the SNP converted to a p-value. The value indicates the likelihood that the variant call is a false positive. SIFT Score: A score assigned by the SIFT program that indicates if the amino acid substitution affects protein function. Please refer to http://sift.jcvi.org/. Interpro Domain: Domain or conserved site in which the variant is located. Uniprot Acc: Uniprot accession number. Polyphen Score: Score computed by the Polyphen program that predicts the impact of amino acid substitution of the structure and function of the protein. Please refer to http://genetics.bwh.harvard.edu/pph/pph_help_text.html. GeneRead DNAseq variant analysis page 12 of 13
  • 13. For up-to-date licensing and product-specific disclaimers, see the respective QIAGEN kit handbook or user manual. QIAGEN handbooks can be requested from QIAGEN Technical Service or your local QIAGEN distributor. Selected handbooks can be downloaded from www.qiagen.com/literature. Safety data sheets (SDS) for any QIAGEN product can be downloaded from www.qiagen.com/safety. GeneRead DNAseq Gene Panels are intended for molecular biology applications. These products are not intended for the diagnosis, prevention, or treatment of a disease. Trademarks: QIAGEN® (QIAGEN Group); Chrome®, Google® (Google, Inc.); Illumina® (Illumina, Inc.); Ion Proton™, Ion Torrent™ (Life Technologies Corporation); Excel®, Internet Explorer®, Microsoft® (Microsoft Corporation); Firefox®, Mozilla® (Mozilla Foundation). 1074970 Nov-12 © 2012 QIAGEN, all rights reserved. www.qiagen.com France  01-60-920-930 The Netherlands  0800 0229592 Australia  1-800-243-800 Germany  02103-29-12000 Norway  800-18859 Austria  0800/281010 Hong Kong  800 933 965 Singapore  65-67775366 Belgium  0800-79612 Ireland  1800 555 049 Spain  91-630-7050 Canada  800-572-9613 Italy  800 787980 Sweden  020-790282 China  021-51345678 Japan  03-6890-7300 Switzerland  055-254-22-11 Denmark  80-885945 Korea (South)  1544 7145 UK  0808-2343665 Finland  0800-914416 Luxembourg  8002 2076 USA  800-426-8157 Sample & Assay Technologies