Presented at the 2014 Bio-IT World Expo in Boston, this slideshow describes the use of Lyons-Weiler's entropy-based measures of genotypic signal to improve concordance among alternative variant-calling algorithms and to evaluate individual steps in the GATK best practices pipeline. The second part of the talk presents data showing a demarcation bias in fold change, a widely used measure for selecting differentially expressed genes, transcripts, or proteins from microarray and RNASeq data.
http://www.bio-itworldexpo.com/Next-Gen-Sequencing-Informatics/
Avoiding Nonsense Results in your NGS Variant Studies
1. Avoiding Nonsense Results
in your NGS Variant Studies
James Lyons-Weiler, PhD
Scientific Director/
Senior Research Scientist
Bioinformatics Analysis Core
Genomics & Proteomics Core Laboratories
University of Pittsburgh
Pittsburgh, PA
May 1, 2014
2. Two Parts
• Identifying sites with low genotypic signal
increases concordance among variant callers
• Hazards in finding differentially expressed
genes in RNASeq – how to do it more robustly.
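As a rough illustration of the first point, the sketch below scores a site by the Shannon entropy of a genotype posterior computed from read counts. It is a generic toy model, not the pw or pHom measures from the talk: the binomial likelihood, the flat prior, the 1% error rate, and all function names are assumptions made for this example.

```python
import math

def genotype_posteriors(ref_reads, alt_reads, error_rate=0.01):
    """Posterior over diploid genotypes under a simple binomial read model.

    Assumptions (not from the talk): flat prior, fixed error rate,
    biallelic site, independent reads.
    """
    # Expected fraction of alternate reads under each genotype.
    expected_alt = {"0/0": error_rate, "0/1": 0.5, "1/1": 1.0 - error_rate}
    # The binomial coefficient is the same for every genotype, so it cancels.
    log_lik = {
        g: alt_reads * math.log(p) + ref_reads * math.log(1.0 - p)
        for g, p in expected_alt.items()
    }
    # Subtract the maximum before exponentiating to avoid underflow.
    top = max(log_lik.values())
    weights = {g: math.exp(v - top) for g, v in log_lik.items()}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}

def entropy_bits(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def genotypic_signal(ref_reads, alt_reads):
    """1 = one genotype clearly favored, 0 = all three equally likely."""
    posterior = genotype_posteriors(ref_reads, alt_reads)
    return 1.0 - entropy_bits(posterior.values()) / math.log2(3)

if __name__ == "__main__":
    for ref, alt in [(20, 10), (2, 1), (1000, 175)]:
        print(ref, alt, round(genotypic_signal(ref, alt), 3))
```

Under these assumptions a 20:10 site scores close to 1 and a 2:1 site scores lower. A deep site with 1000 reference and 175 alternate reads also scores lower, because its allele fraction falls between the values this toy model expects for 0/0 and 0/1.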
23. Part 2: Good and Bad News for
RNASeq (and everything else):
The Bad News:
Fold Change is Biased.
The Good News:
We have identified a much less biased method.
24. T-test is not appropriate
for small N, large P data
(such as RNASeq)
31. FC Bias in
Amyotrophic Lateral Sclerosis
[Figure: scatter plot; axis ticks run from 0 to 200000 and from 0 to 350000. Labels: Control, ALS. Legend: DEGy, FCDEGy.]
Black circles = FC(A/B). Pink = Gthr-J5 genes
34. FC(A/B) Bias in
Alcohol-Induced Hepatitis
Black circles = FC(A/B). Pink = Gthr-J5 genes
35. Conclusions
• Not all NGS/HTS sites have sufficient genotypic signal to warrant a
base call. High coverage alone does not provide a solution.
• By measuring genotypic signal, we can determine which sites we
can call with confidence.
• Fold change, FC(A/B), is blind to highly expressed genes and should
be abandoned as a measure of differential expression altogether,
even for single-gene or single-protein studies!
• Published microarray data sets analyzed to date using only FC(A/B)
are a gold mine for re-analysis using less biased methods.
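A small worked example may make the fold-change conclusion above concrete. The gene names and expression values below are hypothetical, and ranking by absolute difference is only a simple contrast chosen for this sketch. It is not the J5-based method referred to in the slides, and it has its own dependence on expression level.

```python
def fold_change(mean_a, mean_b):
    """FC(A/B): ratio of mean expression in group A to group B."""
    return mean_a / mean_b

def select_by_fold_change(genes, threshold=2.0):
    """Keep genes whose ratio is at least `threshold` in either direction."""
    selected = []
    for name, (mean_a, mean_b) in genes.items():
        fc = fold_change(mean_a, mean_b)
        if fc >= threshold or fc <= 1.0 / threshold:
            selected.append(name)
    return selected

def select_by_difference(genes, top_n):
    """Rank genes by absolute difference in means and keep the top `top_n`."""
    ranked = sorted(
        genes,
        key=lambda name: abs(genes[name][0] - genes[name][1]),
        reverse=True,
    )
    return ranked[:top_n]

# Hypothetical (mean in A, mean in B) values, for illustration only.
genes = {
    "low_expr": (30.0, 10.0),            # ratio 3.0, difference 20
    "mid_expr": (1200.0, 1000.0),        # ratio 1.2, difference 200
    "high_expr": (150000.0, 100000.0),   # ratio 1.5, difference 50000
}

if __name__ == "__main__":
    print(select_by_fold_change(genes))    # genes passing the 2-fold cutoff
    print(select_by_difference(genes, 1))  # gene with the largest shift
```

With a 2-fold cutoff only the weakly expressed gene is selected, although the highly expressed gene changes by 50000 units. A fixed ratio threshold leaves out large absolute shifts at the top of the expression range.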
36. Credits and Contact
• pw, pHom, etc.: James Lyons-Weiler, Alan Twaddle, Rahil Sethi.
– (MS in preparation)
– Our software is called Gconf (not yet available)
• Fold-Change Bias: James Lyons-Weiler, Tamanna Sultana, Rick
Jordan, Rahil Sethi
– (Paper in review)
– For now, read
• Mariani TJ, Budhraja V, Mecham BH, Gu CC, Watson MA, Sadovsky Y. 2003. A
variable fold change threshold determines significance for expression microarrays.
FASEB J. 17:321-3. doi: 10.1096/fj.02-0351fje
• Pearson K. 1897. On a form of spurious correlation which may arise when indices are
used in the measurement of organs. Proc Roy Soc Lond. 60:489-498. doi:
10.1098/rspl.1896.0076