Population-Based DNA
Variant Analysis
with Golden Helix SVS
Jan 21, 2015
Bryce Christensen, PhD
Statistical Geneticist / Director of Services
Questions during the presentation:
use the Questions pane in your GoToWebinar window
About Golden Helix
Golden Helix: Leaders in Genetic Analytics
- Founded in 1998
- Multi-disciplinary: computer science, bioinformatics, statistics, genetics
- Software and analytic services
SNP & Variation Suite (SVS)
Core Features
- Powerful Data Management
- Rich Visualizations
- Robust Statistics
- Flexible
- Easy-to-use
Applications
- Genotype Analysis
- DNA sequence analysis
- CNV Analysis
- RNA-seq differential expression
- Family Based Association
Agenda
1. Define the problem: What is rare variant (RV) analysis?
2. Overview of RV analysis methods
3. NGS workflow design in SVS
4. Method Comparisons
5. Upstream analysis and QC considerations
6. Q&A
The Problem
 Array-based GWAS has been the primary technology for gene-
finding research for the past decade
 NGS technology, particularly whole-exome sequencing, makes it
possible to include rare variants (RVs) in the analysis
 Individual RVs lack statistical power for standard GWAS
approaches
- How do we utilize that information?
 Proposed solution: combine RVs into logical groups and analyze
them as a single unit
- AKA “Collapsing” or “Burden” tests.
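To make the collapsing idea concrete, here is a minimal Python sketch. It is an illustration only, not the SVS implementation: the toy genotypes, the function names, and the choice of a one-sided Fisher exact test are all assumptions made for this example. Each rare variant is carried by a single case, so no single site is informative, but the collapsed carrier indicator is.

```python
from math import comb

def collapse_carriers(genotypes):
    """Collapse sites: 1 if the sample carries a minor allele at any site, else 0."""
    return [1 if any(g > 0 for g in sample) else 0 for sample in genotypes]

def carrier_table(carrier, is_case):
    """Return (case carriers, case non-carriers, control carriers, control non-carriers)."""
    a = sum(1 for k, y in zip(carrier, is_case) if k and y)
    b = sum(1 for k, y in zip(carrier, is_case) if not k and y)
    c = sum(1 for k, y in zip(carrier, is_case) if k and not y)
    d = sum(1 for k, y in zip(carrier, is_case) if not k and not y)
    return a, b, c, d

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact p-value for an excess of carriers among cases."""
    n_cases, n_carriers, n = a + b, a + c, a + b + c + d
    upper = min(n_cases, n_carriers)
    tail = sum(comb(n_cases, k) * comb(n - n_cases, n_carriers - k)
               for k in range(a, upper + 1))
    return tail / comb(n, n_carriers)

# Toy data (hypothetical): rows are samples, columns are three rare variant sites.
genotypes = [
    [1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0],   # four cases
    [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0],   # four controls
]
is_case = [1, 1, 1, 1, 0, 0, 0, 0]

# Test the first site alone, then the collapsed gene-level indicator.
first_site = [1 if sample[0] > 0 else 0 for sample in genotypes]
p_single = fisher_exact_greater(*carrier_table(first_site, is_case))
p_collapsed = fisher_exact_greater(*carrier_table(collapse_carriers(genotypes), is_case))

print(f"single site p = {p_single:.3f}")      # 0.500
print(f"collapsed   p = {p_collapsed:.3f}")   # 0.071
```

With only eight samples neither result is significant, but the drop from 0.5 to about 0.07 shows why grouping RVs can recover power that individual sites lack.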
Two Primary Approaches
 Direct search for susceptibility variants
- Assume highly penetrant variant and/or Mendelian disease
- Extensive reliance on bioinformatics for variant annotation and filtering
- Sample sizes usually small
 Rare Variant (RV) “collapsing” methods
- Increasingly common in complex disease research
- May require very large sample sizes!
- Assume that any of several LOF variants in a susceptibility gene may lead to
same disease or trait
- Many statistical tests available
- Also relies heavily on bioinformatics
Families of Collapsing Tests
 Burden Tests
- Combine minor alleles across multiple variant sites…
- Without weighting (CMC, CAST, CMAT)
- With fixed weights based on allele frequency (WSS, RWAS)
- With data-adaptive weights (Lin/Tang, KBAC)
- With data-adaptive thresholds (Step-Up, VT)
- With extensions to allow for effects in either direction (Ionita-Laza/Lange, C-alpha)
 Kernel Tests
- Allow for individual variant effects in either direction and permit covariate
adjustment based on kernel regression
- Kwee et al., AJHG, 2008
- SKAT
- SKAT-O
Credit: Schaid et al., Genet Epi, 2013
Burden Test Methods in SVS
 CMC: Combined Multivariate and Collapsing test
- Multivariate test: simultaneous test for association of common
and rare variants in gene
- Testing methods include Hotelling T² and Regression
- Li and Leal, AJHG, 2008
 KBAC: Kernel-Based Adaptive Clustering
- Test models the risk associated with multi-locus genotypes in gene regions
- Adaptive weighting procedure that gives higher weights to genotypes with higher
sample risks
- Permutation testing, regression or mixed-model significance testing options
- Liu and Leal, PLoS Genetics, 2010
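The permutation testing option mentioned above can be sketched generically in Python. This is not the KBAC statistic and not SVS code: `carrier_excess` is a simple stand-in statistic chosen for illustration, and the add-one p-value estimator and the toy data are assumptions of this sketch.

```python
import random

def carrier_excess(carrier, labels):
    """Stand-in statistic: carriers among cases minus carriers among controls."""
    in_cases = sum(k for k, y in zip(carrier, labels) if y == 1)
    in_controls = sum(k for k, y in zip(carrier, labels) if y == 0)
    return in_cases - in_controls

def permutation_p(carrier, is_case, n_perm=2000, seed=0):
    """One-sided permutation p-value using the add-one estimator (hits + 1) / (n_perm + 1)."""
    rng = random.Random(seed)
    observed = carrier_excess(carrier, is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # break any link between genotype and phenotype
        if carrier_excess(carrier, labels) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy data (hypothetical): 20 cases and 20 controls, 10 carriers, all of them cases.
carrier = [1] * 10 + [0] * 10 + [0] * 20
is_case = [1] * 20 + [0] * 20

p = permutation_p(carrier, is_case)
print(f"permutation p = {p:.4f}")
```

Under this estimator the smallest reportable p-value is 1 / (n_perm + 1), so the resolution of a permutation test is bounded by the number of permutations run.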
KBAC: Kernel Based Adaptive Clustering
 Test models the risk associated with multi-locus genotypes at a
per-gene level
 Adaptive weighting procedure that gives higher weights to
genotypes with higher sample risks
 SVS implementation includes option for 1- or 2-tailed test
- But most powerful when all variants in gene have unidirectional effect
 Permutation testing, regression or mixed-model significance
testing options
 Liu and Leal, PLoS Genetics, 2010
SKAT: Sequence Kernel Association Test
 Utilizes kernel machine methods
 Aggregates test statistics of variants over gene region to compute
region level p-values
 Many extensions of the method
 “This method can be more powerful when causal variants have
bidirectional effects and/or a large proportion of the variants within
gene region are non-causal.”
 “SKAT is less powerful than burden tests when causal variant
effects are unidirectional.”
- Liu and Leal, PLoS Genetics, 2012
SKAT-O: Optimized SKAT approach
 Combines a burden test with SKAT test in unified approach
 Burden tests are more powerful when most variants in a region are
causal
 SKAT test is more powerful when variants have effects in different
directions or some variants have no effect
 SKAT-O optimizes power by adaptively weighting the SKAT and
burden results in a combined test
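The contrast between the two statistics can be shown with a small Python sketch. It assumes the published form of the combined statistic, Q_rho = (1 - rho) * Q_SKAT + rho * Q_burden, with per-variant score statistics computed under an intercept-only null model. The function names and toy data are hypothetical, and the p-value step (which SKAT-O evaluates over a grid of rho values) is left out.

```python
def score_statistics(genotypes, phenotype):
    """Per-variant score S_j = sum_i g_ij * (y_i - mean(y)), with no covariates."""
    mean_y = sum(phenotype) / len(phenotype)
    n_sites = len(genotypes[0])
    return [sum(sample[j] * (y - mean_y) for sample, y in zip(genotypes, phenotype))
            for j in range(n_sites)]

def q_skat(scores, weights):
    """Variance-component statistic: squares each weighted score, so signs cannot cancel."""
    return sum((w * s) ** 2 for w, s in zip(weights, scores))

def q_burden(scores, weights):
    """Burden statistic: sums weighted scores first, so opposite effects cancel."""
    return sum(w * s for w, s in zip(weights, scores)) ** 2

def q_rho(scores, weights, rho):
    """Combined statistic for a given rho between 0 (pure SKAT) and 1 (pure burden)."""
    return (1 - rho) * q_skat(scores, weights) + rho * q_burden(scores, weights)

# Toy data (hypothetical): variant 1 is seen only in cases, variant 2 only in controls.
genotypes = [
    [1, 0], [1, 0], [0, 0], [0, 0],   # cases
    [0, 1], [0, 1], [0, 0], [0, 0],   # controls
]
phenotype = [1, 1, 1, 1, 0, 0, 0, 0]
weights = [1.0, 1.0]

scores = score_statistics(genotypes, phenotype)
print("scores  :", scores)                          # [1.0, -1.0]
print("Q_SKAT  :", q_skat(scores, weights))         # 2.0
print("Q_burden:", q_burden(scores, weights))       # 0.0
print("Q_0.5   :", q_rho(scores, weights, 0.5))     # 1.0
```

In this toy case the two variants act in opposite directions, so the burden statistic collapses to zero while the SKAT statistic does not, which is the situation the slide describes.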
Variant Weighting in SKAT Tests
 Default weighting scheme is based on Beta(1,25) distribution
 Gives much greater weight to the rarest variants
[Figure: variant weight (y-axis, 0 to 25) versus MAF (x-axis, 0.0 to 0.5) for three weighting schemes: Beta(1,25), WSS, and Uniform]
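The three weighting schemes can be compared numerically with a short Python sketch. The Beta(1,25) weight is taken here to be the Beta density evaluated at the MAF, which matches a y-axis running from 0 to 25. The WSS-style curve is assumed to be proportional to 1/sqrt(MAF * (1 - MAF)); its scaling in the original plot is not known, so only its shape is meaningful. All function names are illustrative.

```python
from math import gamma, sqrt

def beta_density(x, a, b):
    """Density of the Beta(a, b) distribution at x."""
    norm = gamma(a + b) / (gamma(a) * gamma(b))
    return norm * x ** (a - 1) * (1 - x) ** (b - 1)

def beta_1_25_weight(maf):
    """Weight from the Beta(1, 25) density, equal to 25 * (1 - maf) ** 24."""
    return beta_density(maf, 1.0, 25.0)

def wss_style_weight(maf):
    """Assumed WSS-style weight, proportional to 1 / sqrt(maf * (1 - maf)); requires maf > 0."""
    return 1.0 / sqrt(maf * (1.0 - maf))

def uniform_weight(maf):
    """Every variant gets the same weight regardless of frequency."""
    return 1.0

for maf in (0.001, 0.01, 0.05, 0.2):
    print(f"MAF={maf:<6} Beta(1,25)={beta_1_25_weight(maf):7.3f} "
          f"WSS-style={wss_style_weight(maf):7.3f} Uniform={uniform_weight(maf):.1f}")
```

The Beta(1,25) weight starts at 25 for the rarest variants and falls off steeply as MAF grows, so variants above a few percent MAF contribute very little.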
Bioinformatic Filtering and Annotation
 The genomics community has spent years producing vast
resources of data about DNA sequence variants
- Some data is observational, like variant frequencies from the 1000 Genomes
Project or the NHLBI Exome Sequencing Project
- Other data is based on predictive algorithms, like PolyPhen or SIFT.
- Even “simple” annotations, like mapping data for genes, segmental duplications
and other sequence features are extremely valuable for analytic workflows.
 These data sources can be used to annotate variants identified in
an NGS experiment
- Annotations may be used for both QC and analysis purposes.
 Once annotated, variants may be filtered, sorted, and prioritized to
help us identify disease-causing mutations
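The annotate, filter, and prioritize sequence described above can be sketched in Python. Everything here is hypothetical: the record fields (`pop_maf`, `effect`, `damaging_calls`), the thresholds, and the sort order are assumptions for illustration and do not correspond to any particular annotation source or to SVS.

```python
# Hypothetical annotated variant records; every field name and value is illustrative.
variants = [
    {"id": "v1", "gene": "GENE_A", "pop_maf": 0.0004, "effect": "missense", "damaging_calls": 3},
    {"id": "v2", "gene": "GENE_A", "pop_maf": 0.2100, "effect": "missense", "damaging_calls": 1},
    {"id": "v3", "gene": "GENE_B", "pop_maf": 0.0020, "effect": "synonymous", "damaging_calls": 0},
    {"id": "v4", "gene": "GENE_B", "pop_maf": 0.0050, "effect": "nonsense", "damaging_calls": 5},
    {"id": "v5", "gene": "GENE_C", "pop_maf": 0.0001, "effect": "missense", "damaging_calls": 0},
]

def is_candidate(variant, max_maf=0.01, min_damaging=1):
    """Keep rare, non-synonymous variants flagged by at least one predictor."""
    return (variant["pop_maf"] < max_maf
            and variant["effect"] != "synonymous"
            and variant["damaging_calls"] >= min_damaging)

# Filter, then prioritize: most predictor support first, rarest first among ties.
candidates = sorted(
    (v for v in variants if is_candidate(v)),
    key=lambda v: (-v["damaging_calls"], v["pop_maf"]),
)

for v in candidates:
    print(v["id"], v["gene"], v["pop_maf"], v["damaging_calls"])
```

Here v2 is dropped for being common, v3 for being synonymous, and v5 for lacking predictor support, leaving v4 and v1 in priority order.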
Interactive Demonstration
 SVS 8.3
 Exploratory analysis workflow
- Simulate the development of a burden test
 Formal RV association test workflows
- SKAT and SKAT-O
- KBAC and MM-KBAC
- CMC
NGS Analysis Workflow Development in SVS
 SVS is very flexible in workflow design.
 SVS includes a broad range of tools for data manipulation and
variant annotation and visualization that can be used together to
guide us on an interactive exploration of the data.
 We begin by defining the final goal and the steps needed to help us
reach that goal:
- Are we looking for a very rare, non-synonymous variant that causes a dominant
Mendelian trait?
- Are we looking for a gene with excess rare variation in cases vs controls?
 Once we know what we are looking for, we can identify the
available annotation sources that will help us answer the question.
Data Simulation Process
 Begin with 2504 samples from 1000 Genomes Phase 3 data.
 Define LOF variants as:
- MAF<0.01 in NHLBI/ESP 6500 Exomes data
- MAF<0.01 according to dbSNP
- Predicted damaging by at least one of 5 prediction methods in dbNSFP
- (Automatically excludes synonymous, non-coding, and InDel variants)
- Results in 395k variant sites
 Create random binary phenotype
 Adjust phenotypes based on carrying LOF variants in any of eight
genes previously implicated in asthma
 Final: 1338 cases, 1166 controls
 About 18k genes in tests
Data Simulation
Gene     Chr  Rare NS Variants  LOF Variants  Samples w/ LOF variant  Cases w/ LOF variant  % of Total Cases
IL12B    5    18                11            19                      19                    1.42
TNF      6    11                9             17                      17                    1.27
COL26A1  7    23                23            58                      53                    3.96
TPSG1    16   48                48            80                      72                    5.38
TPSAB1   16   22                19            80                      68                    5.08
TPSD1    16   12                11            23                      21                    1.57
IL4R     16   39                20            30                      28                    2.09
DHX8     17   20                18            26                      23                    1.72
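A rough Python sketch of the phenotype simulation step follows. The slides do not state the exact rule used to adjust phenotypes for LOF carriers, so the `p_case_if_carrier` parameter and its default value are purely hypothetical, as are the function names. The `pct_of_total_cases` helper reproduces how the last table column relates to the 1338 simulated cases.

```python
import random

def simulate_phenotypes(lof_carrier, p_case_if_carrier=0.9, seed=1):
    """Random binary phenotype, shifted toward case status for LOF carriers.

    p_case_if_carrier is an assumed parameter; the original rule is not given.
    """
    rng = random.Random(seed)
    phenotypes = []
    for carries in lof_carrier:
        if carries:
            phenotypes.append(1 if rng.random() < p_case_if_carrier else 0)
        else:
            phenotypes.append(rng.randint(0, 1))  # fair coin for non-carriers
    return phenotypes

def pct_of_total_cases(cases_with_lof, total_cases):
    """Percentage of all cases that carry an LOF variant in a given gene."""
    return round(100 * cases_with_lof / total_cases, 2)

# Toy cohort (hypothetical): 200 samples, the first 20 carry an LOF variant.
lof_carrier = [1] * 20 + [0] * 180
phenotypes = simulate_phenotypes(lof_carrier)
print("cases:", sum(phenotypes), "controls:", len(phenotypes) - sum(phenotypes))

# Check two rows of the table against the 1338 simulated cases.
print(pct_of_total_cases(19, 1338))   # IL12B -> 1.42
print(pct_of_total_cases(72, 1338))   # TPSG1 -> 5.38
```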
Demonstration
Results in Ideal Conditions (only rare LOF variants)
Gene     Samples w/ variant  Cases w/ variant  KBAC p (1M sims)  SKAT p   SKAT-O p
IL12B    19                  19                7e-6              0.02     6.8e-5
TNF      17                  17                2.8e-5            0.03     2.0e-4
COL26A1  58                  53                1e-6              9.9e-5   7.8e-9
TPSG1    80                  72                1e-6              0.001    2.6e-10
TPSAB1   80                  68                1e-6              3.7e-5   1.3e-8
TPSD1    23                  21                5.3e-4            0.07     4.3e-4
IL4R     30                  28                1e-6              0.25     2.6e-5
DHX8     26                  23                0.004             0.37     6.1e-4
[Figure panels: SKAT vs SKAT-O; KBAC vs SKAT-O]
Results in Noisy Conditions (All rare NS variants)
Gene     Rare NS Variants  LOF Variants  KBAC p (1M sims)  SKAT p   SKAT-O p
IL12B    18                11            2.6e-4            0.06     1.0e-4
TNF      11                9             4.5e-5            0.03     5.4e-4
COL26A1  23                23            1e-6              9.9e-5   7.8e-9
TPSG1    48                48            1e-6              0.002    2.6e-10
TPSAB1   22                19            1e-6              3.8e-5   1.5e-8
TPSD1    12                11            4.1e-4            0.07     2.8e-5
IL4R     39                20            0.0037            0.12     0.017
DHX8     20                18            0.0038            0.43     0.001
[Figure panels: SKAT vs SKAT-O; KBAC vs SKAT-O]
Results in Noisy Conditions (All Functional Variants)
Gene     Functional Variants  LOF Variants  CMC p    SKAT p  SKAT-O p
IL12B    14                   11            4.6e-4   0.13    0.026
TNF      9                    9             1.1e-4   0.029   2.0e-4
COL26A1  28                   23            5.1e-7   0.043   1.3e-5
TPSG1    67                   48            0.002    0.35    0.0078
TPSAB1   22                   19            7.2e-7   0.004   2.0e-4
TPSD1    21                   11            4.1e-4   0.12    0.0033
IL4R     28                   20            0.022    0.04    0.075
DHX8     20                   18            0.014    0.31    0.021
[Figure panels: SKAT vs SKAT-O; CMC vs SKAT-O]
NGS Analysis
Primary Analysis
- Analysis of hardware generated data, on-machine real-time stats.
- Production of sequence reads and quality scores
- Typical product is “FASTQ” file
Secondary Analysis
- Recalibrating, de-duplication, QA and clipping/filtering reads
- Alignment/Assembly of reads
- Variant calling on aligned reads
- Typical products are “BAM” and/or “VCF” files
Tertiary Analysis (“Sense Making”)
- QA and filtering of variant calls
- Annotation and filtering of variants
- Multi-sample integration
- Visualization of variants in genomic context
- Experiment-specific inheritance/population analysis
- “Small-N” and “Large-N” approaches
Secondary Analysis and QC Considerations
 What did we do in GWAS?
- Call rate
- HWE
- MAF
- But those aren’t really applicable for RV analysis…
 What do we use for NGS?
- Coverage depth
- Quality scores per variant and per genotype call
- Singleton counts
- Ts/Tv ratios
- Mappability of the region
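Two of the NGS metrics listed above, the Ts/Tv ratio and singleton counts, are simple enough to sketch in Python. The input layouts (a list of ref/alt pairs for SNVs, and a samples-by-sites matrix of alternate allele counts) and the function names are assumptions made for this illustration.

```python
# Transitions are purine-purine (A<->G) or pyrimidine-pyrimidine (C<->T) changes.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}

def ts_tv_ratio(snvs):
    """Ratio of transitions to transversions for a list of (ref, alt) SNVs."""
    ts = sum(1 for ref, alt in snvs if (ref, alt) in TRANSITIONS)
    tv = len(snvs) - ts
    return ts / tv if tv else float("inf")

def singleton_counts(genotypes):
    """Per-sample count of sites whose alternate allele is seen exactly once overall."""
    counts = [0] * len(genotypes)
    for j in range(len(genotypes[0])):
        column = [sample[j] for sample in genotypes]
        if sum(column) == 1:                 # one heterozygous call in the whole cohort
            counts[column.index(1)] += 1
    return counts

# Toy data (hypothetical).
snvs = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("T", "C"), ("G", "T")]
genotypes = [
    [1, 0, 0, 1],
    [0, 0, 2, 1],
    [0, 1, 0, 0],
]

print("Ts/Tv:", ts_tv_ratio(snvs))                   # 2.0
print("singletons:", singleton_counts(genotypes))    # [1, 0, 1]
```

Comparing these values across samples or batches is one way to apply the consistency check the slide calls for: a sample with an unusual singleton count or Ts/Tv ratio relative to its peers deserves a closer look.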
Most Importantly: Be Consistent!
Gholson Lyon, 2012
What about common variants?
 Yes, you can run GWAS-style
common variant analysis
procedures with sequence data
 Helpful to have homozygous
reference genotype calls in the
data
 Certain procedures may require
careful consideration in terms
of variant selection
- PCA
- IBD analysis
- Mixed model regression
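Variant selection for procedures such as PCA usually means filtering on frequency and call rate and then pruning markers in high LD. The Python sketch below shows one simple way to do this. It is not the SVS procedure: the greedy strategy, the `max_r2` threshold, and the comparison against every kept marker (rather than a sliding window) are assumptions of this example.

```python
def call_rate(column):
    """Fraction of non-missing genotype calls (missing coded as None)."""
    return sum(g is not None for g in column) / len(column)

def maf(column):
    """Minor allele frequency from 0/1/2 genotypes, ignoring missing calls."""
    called = [g for g in column if g is not None]
    freq = sum(called) / (2 * len(called))
    return min(freq, 1 - freq)

def r_squared(x, y):
    """Squared Pearson correlation between two genotype columns."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    return 0.0 if sxx == 0 or syy == 0 else sxy ** 2 / (sxx * syy)

def select_markers(columns, min_maf=0.01, min_call_rate=0.99, max_r2=0.5):
    """Keep markers passing MAF and call-rate filters, then greedily prune on LD."""
    kept = []
    for idx, col in enumerate(columns):
        if call_rate(col) <= min_call_rate or maf(col) <= min_maf:
            continue
        if all(r_squared(col, columns[k]) <= max_r2 for k in kept):
            kept.append(idx)
    return kept

# Toy data (hypothetical): each inner list is one marker across six samples.
columns = [
    [0, 1, 2, 0, 1, 2],      # marker 0: kept
    [0, 1, 2, 0, 1, 2],      # marker 1: identical to marker 0, pruned for LD
    [0, 0, 0, 0, 0, 0],      # marker 2: monomorphic, fails MAF
    [2, 0, 1, 1, 0, 2],      # marker 3: uncorrelated with marker 0, kept
    [0, 1, None, 0, 1, 2],   # marker 4: fails call rate
]

print(select_markers(columns))   # [0, 3]
```

Production tools normally restrict the LD comparison to a window of nearby markers; comparing against all kept markers, as here, is only practical for small examples.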
Marker Selection Process
Affymetrix SNP6
 Full content: 906k
 Autosomes: 867k
 MAF>0.01, CR>0.99: 806k
 LD Pruned: 74k
1kG Phase 3
 Whole genome: 81M
 Exons +/- 5bp: 2.2M
 MAF>0.01: 263k
 LD-Pruned: 70k
PCA Comparison
[Figure panels: PCA plots for Affymetrix SNP6 and 1kG Phase 3]
Measuring the same thing?
Conclusions
 SVS is a comprehensive platform for analysis of common and rare
sequence variants
 SKAT-O is an optimized combination of SKAT and burden test
approaches
 The power of collapsing tests is affected by many factors:
- Variant weighting schemes
- Variant filtering/selection process
- Causal vs. “passenger” variants
- Sample size
- The true underlying biology of the disease
Questions or
more info:
 info@goldenhelix.com
 Request a copy of SVS at
www.goldenhelix.com
 Download GenomeBrowse
for free at
www.GenomeBrowse.com
Any Questions?

Sexy Call Girl Dharmapuri Arshi 💚9058824046💚 Dharmapuri Escort ServiceSexy Call Girl Dharmapuri Arshi 💚9058824046💚 Dharmapuri Escort Service
Sexy Call Girl Dharmapuri Arshi 💚9058824046💚 Dharmapuri Escort Service
 
💚 Punjabi Call Girls In Chandigarh 💯Lucky 🔝8868886958🔝Call Girl In Chandigarh
💚 Punjabi Call Girls In Chandigarh 💯Lucky 🔝8868886958🔝Call Girl In Chandigarh💚 Punjabi Call Girls In Chandigarh 💯Lucky 🔝8868886958🔝Call Girl In Chandigarh
💚 Punjabi Call Girls In Chandigarh 💯Lucky 🔝8868886958🔝Call Girl In Chandigarh
 
Bhagalpur Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
Bhagalpur Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real MeetBhagalpur Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
Bhagalpur Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
 
ooty Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
ooty Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meetooty Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
ooty Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
 
Call Girl in Bangalore 9632137771 {LowPrice} ❤️ (Navya) Bangalore Call Girls ...
Call Girl in Bangalore 9632137771 {LowPrice} ❤️ (Navya) Bangalore Call Girls ...Call Girl in Bangalore 9632137771 {LowPrice} ❤️ (Navya) Bangalore Call Girls ...
Call Girl in Bangalore 9632137771 {LowPrice} ❤️ (Navya) Bangalore Call Girls ...
 
Tirupati Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
Tirupati Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real MeetTirupati Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
Tirupati Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
 
Nanded Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
Nanded Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real MeetNanded Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
Nanded Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
 
palanpur Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
palanpur Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meetpalanpur Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
palanpur Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
 
Ozhukarai Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
Ozhukarai Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real MeetOzhukarai Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
Ozhukarai Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
 
raisen Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
raisen Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meetraisen Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
raisen Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
 
Bihar Sharif Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
Bihar Sharif Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real MeetBihar Sharif Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
Bihar Sharif Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
 
Patna Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
Patna Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real MeetPatna Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
Patna Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
 
Rajkot Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
Rajkot Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real MeetRajkot Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
Rajkot Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
 
Sambalpur Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
Sambalpur Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real MeetSambalpur Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
Sambalpur Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
 
bhubaneswar Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
bhubaneswar Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meetbhubaneswar Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
bhubaneswar Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
 
kozhikode Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
kozhikode Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meetkozhikode Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
kozhikode Call Girls 👙 6297143586 👙 Genuine WhatsApp Number for Real Meet
 

Population-Based DNA Variant Analysis

  • 1. Population-Based DNA Variant Analysis with Golden Helix SVS
      - Jan 21, 2015
      - Bryce Christensen, PhD, Statistical Geneticist / Director of Services
  • 2. Questions during the presentation
      - Use the Questions pane in your GoToWebinar window
  • 3. About Golden Helix
      - Golden Helix: Leaders in Genetic Analytics
      - Founded in 1998
      - Multi-disciplinary: computer science, bioinformatics, statistics, genetics
      - Software and analytic services
  • 4. SNP & Variation Suite (SVS)
      (Diagram labels: Core Features, Packages)
      - Core Features
          - Powerful Data Management
          - Rich Visualizations
          - Robust Statistics
          - Flexible
          - Easy-to-use
      - Applications
          - Genotype Analysis
          - DNA sequence analysis
          - CNV Analysis
          - RNA-seq differential expression
          - Family Based Association
  • 5. Agenda
      1. Define the problem: What is rare variant (RV) analysis?
      2. Overview of RV analysis methods
      3. Method Comparisons
      4. NGS workflow design in SVS
      5. Upstream analysis and QC considerations
      6. Q&A
  • 6. The Problem
      - Array-based GWAS has been the primary technology for gene-finding research for the past decade
      - NGS technology, particularly whole-exome sequencing, makes it possible to include rare variants (RVs) in the analysis
      - Individual RVs lack statistical power for standard GWAS approaches
          - How do we utilize that information?
      - Proposed solution: combine RVs into logical groups and analyze them as a single unit
          - AKA “Collapsing” or “Burden” tests.
  • 7. Two Primary Approaches
      - Direct search for susceptibility variants
          - Assume highly penetrant variant and/or Mendelian disease
          - Extensive reliance on bioinformatics for variant annotation and filtering
          - Sample sizes usually small
      - Rare Variant (RV) “collapsing” methods
          - Increasingly common in complex disease research
          - May require very large sample sizes!
          - Assume that any of several LOF variants in a susceptibility gene may lead to the same disease or trait
          - Many statistical tests available
          - Also rely heavily on bioinformatics
  • 8. Families of Collapsing Tests
      - Burden Tests
          - Combine minor alleles across multiple variant sites…
              - Without weighting (CMC, CAST, CMAT)
              - With fixed weights based on allele frequency (WSS, RWAS)
              - With data-adaptive weights (Lin/Tang, KBAC)
              - With data-adaptive thresholds (Step-Up, VT)
              - With extensions to allow for effects in either direction (Ionita-Laza/Lange, C-alpha)
      - Kernel Tests
          - Allow for individual variant effects in either direction and permit covariate adjustment based on kernel regression
          - Kwee et al., AJHG, 2008
          - SKAT
          - SKAT-O
      Credit: Schaid et al., Genet Epi, 2013
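
The unweighted end of the burden-test family can be illustrated with a minimal sketch. It assumes genotypes are coded as minor-allele counts (0, 1, 2) per site and collapses each sample to a carrier indicator, in the spirit of CAST-style tests. The function names are hypothetical, and this is not the SVS implementation.

```python
from typing import List, Sequence, Tuple


def collapse_carriers(genotypes: Sequence[Sequence[int]]) -> List[int]:
    """Collapse each sample's rare-variant genotypes in one gene to 0/1.

    genotypes[i][j] is the minor-allele count (0, 1 or 2) of sample i at
    site j (an assumed coding). A sample is a carrier if it has at least
    one minor allele at any site.
    """
    return [1 if any(g > 0 for g in sample) else 0 for sample in genotypes]


def carrier_table(carriers: Sequence[int],
                  is_case: Sequence[bool]) -> Tuple[int, int, int, int]:
    """Return the 2x2 counts (case carriers, case non-carriers,
    control carriers, control non-carriers)."""
    case_car = sum(1 for c, k in zip(carriers, is_case) if k and c)
    case_non = sum(1 for c, k in zip(carriers, is_case) if k and not c)
    ctrl_car = sum(1 for c, k in zip(carriers, is_case) if not k and c)
    ctrl_non = sum(1 for c, k in zip(carriers, is_case) if not k and not c)
    return case_car, case_non, ctrl_car, ctrl_non


# Three samples, three rare sites in one gene
geno = [[0, 0, 1], [0, 0, 0], [2, 0, 0]]
carriers = collapse_carriers(geno)
table = carrier_table(carriers, [True, True, False])
```

The resulting 2x2 table could then be passed to any standard test of association; weighted and adaptive burden tests replace the simple indicator with a weighted allele sum.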
  • 9. Burden Test Methods in SVS
      - CMC: Combined Multivariate and Collapsing test
          - Multivariate test: simultaneous test for association of common and rare variants in gene
          - Testing methods include Hotelling T² and Regression
          - Li and Leal, AJHG, 2008
      - KBAC: Kernel-Based Adaptive Clustering
          - Test models the risk associated with multi-locus genotypes in gene regions
          - Adaptive weighting procedure that gives higher weights to genotypes with higher sample risks
          - Permutation testing, regression or mixed-model significance testing options
          - Liu and Leal, PLoS Genetics, 2010
  • 10. KBAC: Kernel-Based Adaptive Clustering
      - Test models the risk associated with multi-locus genotypes at a per-gene level
      - Adaptive weighting procedure that gives higher weights to genotypes with higher sample risks
      - SVS implementation includes option for 1- or 2-tailed test
          - But most powerful when all variants in gene have unidirectional effect
      - Permutation testing, regression or mixed-model significance testing options
      - Liu and Leal, PLoS Genetics, 2010
  • 11. SKAT: Sequence Kernel Association Test
      - Utilizes kernel machine methods
      - Aggregates test statistics of variants over gene region to compute region-level p-values
      - Many extensions of the method
      - “This method can be more powerful when causal variants have bidirectional effects and/or a large proportion of the variants within gene region are non-causal.”
      - “SKAT is less powerful than burden tests when causal variant effects are unidirectional.”
          - Liu and Leal, PLoS Genetics, 2012
  • 12. SKAT-O: Optimized SKAT approach
      - Combines a burden test with SKAT test in unified approach
      - Burden tests are more powerful when most variants in a region are causal
      - SKAT test is more powerful when variants have effects in different directions or some variants have no effect
      - SKAT-O optimizes power by adaptively weighting the SKAT and burden results in a combined test
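
The adaptive weighting of SKAT and burden results is commonly written as a one-parameter family of statistics. The formula below is not on the slide; it is the form usually given in the SKAT-O literature, with $Q_{\mathrm{SKAT}}$ and $Q_{\mathrm{burden}}$ denoting the two test statistics and $\rho$ the mixing parameter.

```latex
Q_{\rho} = (1 - \rho)\, Q_{\mathrm{SKAT}} + \rho\, Q_{\mathrm{burden}},
\qquad 0 \le \rho \le 1
```

Setting $\rho = 0$ recovers SKAT and $\rho = 1$ recovers the burden test; the combined test searches over a grid of $\rho$ values and corrects the resulting p-value for that search.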
  • 13. Variant Weighting in SKAT Tests
      - Default weighting scheme is based on Beta(1,25) distribution
      - Gives much greater weight to the rarest variants
      [Figure: variant weight (0 to 25) plotted against MAF (0.0 to 0.5) for three weighting schemes: Beta(1,25), WSS, and Uniform]
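
To make the Beta(1,25) weighting concrete, here is a small sketch that evaluates the Beta density at a given minor allele frequency. It assumes the plotted weight is simply the Beta(1,25) density at the MAF; `beta_weight` is a hypothetical helper name, not an SVS or SKAT function.

```python
import math


def beta_weight(maf: float, a: float = 1.0, b: float = 25.0) -> float:
    """Beta(a, b) probability density evaluated at a minor allele frequency.

    With the default a=1, b=25 this reduces to 25 * (1 - maf) ** 24.
    """
    if not 0.0 <= maf <= 1.0:
        raise ValueError("maf must be between 0 and 1")
    # Normalising constant B(a, b) = Gamma(a) * Gamma(b) / Gamma(a + b)
    beta_fn = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    return (maf ** (a - 1.0)) * ((1.0 - maf) ** (b - 1.0)) / beta_fn


# Weights fall off quickly as variants become more common
weights = {maf: beta_weight(maf) for maf in (0.0, 0.001, 0.01, 0.05, 0.1, 0.5)}
```

A uniform scheme would instead give every variant the same weight regardless of MAF, which is why the choice of weights changes which variants dominate the test.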
  • 14. Bioinformatic Filtering and Annotation
      - The genomics community has spent years producing vast resources of data about DNA sequence variants
          - Some data is observational, like variant frequencies from the 1000 Genomes Project or the NHLBI Exome Sequencing Project
          - Other data is based on predictive algorithms, like PolyPhen or SIFT.
          - Even “simple” annotations, like mapping data for genes, segmental duplications and other sequence features, are extremely valuable for analytic workflows.
      - These data sources can be used to annotate variants identified in an NGS experiment
          - Annotations may be used for both QC and analysis purposes.
      - Once annotated, variants may be filtered, sorted, and prioritized to help us identify disease-causing mutations
  • 15. Interactive Demonstration
      - SVS 8.3
      - Exploratory analysis workflow
          - Simulate the development of a burden test
      - Formal RV association test workflows
          - SKAT and SKAT-O
          - KBAC and MM-KBAC
          - CMC
  • 16. NGS Analysis Workflow Development in SVS
      - SVS is very flexible in workflow design.
      - SVS includes a broad range of tools for data manipulation, variant annotation, and visualization that can be used together to guide an interactive exploration of the data.
      - We begin by defining the final goal and the steps needed to help us reach that goal:
          - Are we looking for a very rare, non-synonymous variant that causes a dominant Mendelian trait?
          - Are we looking for a gene with excess rare variation in cases vs controls?
      - Once we know what we are looking for, we can identify the available annotation sources that will help us answer the question.
  • 17. Data Simulation Process
      - Begin with 2504 samples from 1000 Genomes Phase 3 data.
      - Define LOF variants as:
          - MAF<0.01 in NHLBI/ESP 6500 Exomes data
          - MAF<0.01 according to dbSNP
          - Predicted damaging by at least one of 5 prediction methods in dbNSFP
          - (Automatically excludes synonymous, non-coding, and InDel variants)
          - Results in 395k variant sites
      - Create random binary phenotype
      - Adjust phenotypes based on carrying LOF variants in any of eight genes previously implicated in asthma
      - Final: 1338 cases, 1166 controls
      - About 18k genes in tests
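
The three LOF criteria above can be expressed as a simple per-variant predicate. This is only a sketch: the dictionary keys (`esp_maf`, `dbsnp_maf`, `n_damaging`) are hypothetical names for annotation fields, and treating a missing frequency as 0 (never observed) is an assumption, not something stated in the slides.

```python
from typing import Mapping, Optional


def is_rare_damaging(variant: Mapping[str, Optional[float]],
                     maf_cutoff: float = 0.01,
                     min_damaging: int = 1) -> bool:
    """Apply the slide's LOF definition to one annotated variant.

    Expected keys (all hypothetical field names):
      esp_maf    - MAF in the NHLBI/ESP 6500 exomes, or None if absent
      dbsnp_maf  - MAF reported by dbSNP, or None if absent
      n_damaging - how many of the 5 dbNSFP predictors call it damaging
    """
    # Assumption: a variant missing from a frequency source counts as MAF 0.
    esp = variant.get("esp_maf") or 0.0
    dbsnp = variant.get("dbsnp_maf") or 0.0
    n_damaging = variant.get("n_damaging") or 0
    return esp < maf_cutoff and dbsnp < maf_cutoff and n_damaging >= min_damaging


variants = [
    {"esp_maf": 0.002, "dbsnp_maf": None, "n_damaging": 2},   # kept
    {"esp_maf": 0.05, "dbsnp_maf": 0.04, "n_damaging": 5},    # too common
    {"esp_maf": 0.001, "dbsnp_maf": 0.001, "n_damaging": 0},  # not damaging
]
lof = [v for v in variants if is_rare_damaging(v)]
```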
  • 18. Data Simulation
      Gene     Chr  Rare NS Variants  LOF Variants  Samples w/ LOF variant  Cases w/ LOF variant  % of Total Cases
      IL12B    5    18                11            19                      19                    1.42
      TNF      6    11                9             17                      17                    1.27
      COL26A1  7    23                23            58                      53                    3.96
      TPSG1    16   48                48            80                      72                    5.38
      TPSAB1   16   22                19            80                      68                    5.08
      TPSD1    16   12                11            23                      21                    1.57
      IL4R     16   39                20            30                      28                    2.09
      DHX8     17   20                18            26                      23                    1.72
  • 20. Results in Ideal Conditions (only rare LOF variants)
      Gene     Samples w/ variant  Cases w/ variant  KBAC p (1M sims)  SKAT p   SKAT-O p
      IL12B    19                  19                7e-6              0.02     6.8e-5
      TNF      17                  17                2.8e-5            0.03     2.0e-4
      COL26A1  58                  53                1e-6              9.9e-5   7.8e-9
      TPSG1    80                  72                1e-6              0.001    2.6e-10
      TPSAB1   80                  68                1e-6              3.7e-5   1.3e-8
      TPSD1    23                  21                5.3e-4            0.07     4.3e-4
      IL4R     30                  28                1e-6              0.25     2.6e-5
      DHX8     26                  23                0.004             0.37     6.1e-4
  • 23. Results in Noisy Conditions (All rare NS variants)
      Gene     Rare NS Variants  LOF Variants  KBAC p (1M sims)  SKAT p   SKAT-O p
      IL12B    18                11            2.6e-4            0.06     1.0e-4
      TNF      11                9             4.5e-5            0.03     5.4e-4
      COL26A1  23                23            1e-6              9.9e-5   7.8e-9
      TPSG1    48                48            1e-6              0.002    2.6e-10
      TPSAB1   22                19            1e-6              3.8e-5   1.5e-8
      TPSD1    12                11            4.1e-4            0.07     2.8e-5
      IL4R     39                20            0.0037            0.12     0.017
      DHX8     20                18            0.0038            0.43     0.001
  • 26. Results in Noisy Conditions (All Functional Variants)
      Gene     Functional Variants  LOF Variants  CMC p    SKAT p   SKAT-O p
      IL12B    14                   11            4.6e-4   0.13     0.026
      TNF      9                    9             1.1e-4   0.029    2.0e-4
      COL26A1  28                   23            5.1e-7   0.043    1.3e-5
      TPSG1    67                   48            0.002    0.35     0.0078
      TPSAB1   22                   19            7.2e-7   0.004    2.0e-4
      TPSD1    21                   11            4.1e-4   0.12     0.0033
      IL4R     28                   20            0.022    0.04     0.075
      DHX8     20                   18            0.014    0.31     0.021
  • 29. NGS Analysis
      - Primary Analysis
          - Analysis of hardware generated data, on-machine real-time stats.
          - Production of sequence reads and quality scores
          - Typical product is “FASTQ” file
      - Secondary Analysis
          - Recalibrating, de-duplication, QA and clipping/filtering reads
          - Alignment/Assembly of reads
          - Variant calling on aligned reads
          - Typical products are “BAM” and/or “VCF” files
      - Tertiary Analysis (“Sense Making”)
          - QA and filtering of variant calls
          - Annotation and filtering of variants
          - Multi-sample integration
          - Visualization of variants in genomic context
          - Experiment-specific inheritance/population analysis
          - “Small-N” and “Large-N” approaches
  • 31. Secondary Analysis and QC Considerations
      - What did we do in GWAS?
          - Call rate
          - HWE
          - MAF
          - But those aren’t really applicable for RV analysis…
      - What do we use for NGS?
          - Coverage depth
          - Quality scores per variant and per genotype call
          - Singleton counts
          - Ts/Tv ratios
          - Mappability of the region
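
Of the NGS QC metrics listed above, the Ts/Tv ratio is the easiest to show in a few lines. The sketch below assumes variants are supplied as (ref, alt) allele pairs and counts only single-nucleotide substitutions; `ts_tv_ratio` is a hypothetical helper, not an SVS function.

```python
from typing import Iterable, Optional, Tuple

# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T);
# every other single-base substitution is a transversion.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}
BASES = {"A", "C", "G", "T"}


def ts_tv_ratio(variants: Iterable[Tuple[str, str]]) -> Optional[float]:
    """Return transitions / transversions over (ref, alt) pairs.

    Indels, non-ACGT alleles and ref == alt records are skipped.
    Returns None when there are no transversions to divide by.
    """
    ts = tv = 0
    for ref, alt in variants:
        ref, alt = ref.upper(), alt.upper()
        if ref not in BASES or alt not in BASES or ref == alt:
            continue
        if (ref, alt) in TRANSITIONS:
            ts += 1
        else:
            tv += 1
    return ts / tv if tv else None


calls = [("A", "G"), ("C", "T"), ("A", "C"), ("G", "GT")]  # last one is an indel
ratio = ts_tv_ratio(calls)
```

Computing the ratio per sample or per variant class and comparing it with values from a trusted call set is one way to flag batches with excess false-positive calls.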
  • 32. Most Importantly: Be Consistent!
      Credit: Gholson Lyon, 2012
  • 33. What about common variants?
      - Yes, you can run GWAS-style common variant analysis procedures with sequence data
      - Helpful to have homozygous reference genotype calls in the data
      - Certain procedures may require careful consideration in terms of variant selection
          - PCA
          - IBD analysis
          - Mixed model regression
  • 34. Marker Selection Process
      - Affymetrix SNP6
          - Full content: 906k
          - Autosomes: 867k
          - MAF>0.01, CR>0.99: 806k
          - LD Pruned: 74k
      - 1kG Phase 3
          - Whole genome: 81M
          - Exons +/- 5bp: 2.2M
          - MAF>0.01: 263k
          - LD Pruned: 70k
  • 37. Conclusions
      - SVS is a comprehensive platform for analysis of common and rare sequence variants
      - SKAT-O is an optimized combination of SKAT and burden test approaches
      - The power of collapsing tests is affected by many factors:
          - Variant weighting schemes
          - Variant filtering/selection process
          - Causal vs. “passenger” variants
          - Sample size
          - The true underlying biology of the disease
  • 38. Questions or more info:
      - info@goldenhelix.com
      - Request a copy of SVS at www.goldenhelix.com
      - Download GenomeBrowse for free at www.GenomeBrowse.com
  • 39. Any Questions?
      - Use the Questions pane in your GoToWebinar window