"The prospects for Nextgen surveillance of pathogens: A view from a Public Health Lab" presentation at the Standards for Pathogen Identification via NGS (SPIN) workshop hosted by the National Institute for Standards and Technology in October 2014 by William Wolfgang, PhD from Wadsworth Center NYSDOH.
1. The prospects for Nextgen surveillance of
pathogens: A view from a Public Health Lab
William Wolfgang
Wadsworth Center NYSDOH
NIST Workshop 10/20/14
3. We collaborate with a number of different
groups on Pathogen sequencing projects
• Global Microbial Identifier (GMI) initiative
• CDC: Listeria monocytogenes initiative and AMD initiative
• FDA: GenomeTrakr initiative
• Minnesota and Washington Departments of Health
• And we hope to do more
4. At the Wadsworth we need standards
• As we transition to new technologies we need to know whether they are:
• Faster
• Cheaper
• Better
• For this we need to make accurate and meaningful comparisons.
• To do this we need standards.
5. Why use Nextgen for Salmonella typing?
• PFGE has low discriminatory power.
Each year
• 1 million cases of Salmonella in the US.
• 19,000 hospitalizations and 378 deaths.
• The Wadsworth receives about 1,500/yr.
7. Proof of principle study on a Salmonella
Enteritidis outbreak
• Sept. 2010 Connecticut Dept. of Health identifies a Salmonella
outbreak in a long term care facility (LTCF).
• Outbreak was linked to cannoli from a Westchester bakery.
• Both NY and CT cases consumed cannoli.
• Isolates had the most common PFGE pattern.
9. Whole genome Cluster Analysis (WGCA) can identify an outbreak cluster not detected by PFGE
[Figure: tree of isolates identified by seven-digit IDs, some flagged with "+"; node values range from 68 to 100; labels mark groups A and B and the LTCF isolates; annotation: 7.3 SNPs.]
All isolates are PFGE PATTERN 4
10. Implementing WGCA for SE in real-time.
• Evaluate WGCA compared to PFGE.
• Speed - Faster
• Cost - Cheaper
• More Actionable Clusters – Better
• Develop an in-house bioinformatics pipeline.
• Develop communication pipeline to epidemiologists.
• Determine cluster parameters that represent an outbreak from a single
source (assign a probability).
• Use data sets to evaluate evolving informatics methods.
• Become proficient (PT programs).
11. Over the past 12 months
• Sequenced all Salmonella Enteritidis (379 genomes).
• All data at NCBI
• Developed an in-house pipeline to analyze the data.
• SNP-based phylogenetic trees were constructed in real time.
• 63 phylogenetic clusters were reported to epidemiologists.
• 0 to 5 SNP differences
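As a rough illustration of how isolates might be grouped into clusters at a 0 to 5 SNP threshold, the sketch below applies single-linkage clustering to a table of pairwise SNP distances. The isolate names, the distances, and the choice of single linkage are all assumptions made for illustration; the slides do not describe how the in-house pipeline defines a cluster.

```python
from itertools import combinations


def single_linkage_clusters(isolates, distances, max_snps=5):
    """Group isolates so that each one is within max_snps of at least one
    other member of its cluster (single linkage, using union-find)."""
    parent = {name: name for name in isolates}

    def find(name):
        # Walk up to the cluster representative, shortening the path as we go.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(isolates, 2):
        if distances[frozenset((a, b))] <= max_snps:
            parent[find(a)] = find(b)

    groups = {}
    for name in isolates:
        groups.setdefault(find(name), set()).add(name)
    # Report only groups with two or more isolates.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


# Hypothetical isolates and distances, not data from the talk.
isolates = ["iso1", "iso2", "iso3", "iso4"]
distances = {
    frozenset(("iso1", "iso2")): 2,
    frozenset(("iso1", "iso3")): 7,
    frozenset(("iso2", "iso3")): 5,
    frozenset(("iso1", "iso4")): 40,
    frozenset(("iso2", "iso4")): 41,
    frozenset(("iso3", "iso4")): 38,
}
clusters = single_linkage_clusters(isolates, distances)
print(clusters)
```

Single linkage chains isolates together: here iso1 and iso3 are 7 SNPs apart but land in the same cluster because both are within 5 SNPs of iso2. Whether that behavior is desirable for outbreak detection is one of the open questions a cluster standard would need to settle.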
15. Can we develop cluster metrics that give a
probability of linkage to a single source?
• Perform phylogenetic analysis of epidemiologically confirmed
outbreaks.
I. Calculate SNP distance.
II. Examine tree structure.
• Do for many serovars of each species.
• This could be done relatively easily by freezer diving and HiSeq
runs.
• For SE, SNP distances appear to be small (0-3 SNPs)
I. Based on 9 bona fide outbreaks from NY and MN.
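Step I (calculating SNP distance) could be sketched as follows: count the positions at which two aligned sequences differ, then summarize the range of pairwise distances within one confirmed outbreak. The function names and toy sequences are hypothetical, and skipping ambiguous bases (N) and gaps is an assumption rather than a description of the Wadsworth pipeline.

```python
from itertools import combinations

VALID_BASES = set("ACGT")


def snp_distance(seq_a, seq_b):
    """Count aligned positions where both sequences have an unambiguous
    base and the bases differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for x, y in zip(seq_a.upper(), seq_b.upper())
        if x in VALID_BASES and y in VALID_BASES and x != y
    )


def outbreak_snp_range(aligned):
    """Return (min, max) pairwise SNP distance among the isolates of one outbreak."""
    dists = [
        snp_distance(aligned[a], aligned[b])
        for a, b in combinations(sorted(aligned), 2)
    ]
    return min(dists), max(dists)


# Toy alignment for one hypothetical outbreak, not data from the talk.
outbreak = {
    "isoA": "ACGTACGTAC",
    "isoB": "ACGTACGTAC",
    "isoC": "ACGTTCGTAN",
}
snp_range = outbreak_snp_range(outbreak)
print(snp_range)
```

Collecting these ranges across many confirmed outbreaks, per serovar, is the kind of data set from which a probability of single-source linkage might eventually be estimated.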
16. Large Western NY pattern 5 cluster
• 13 isolates collected over 10 months (0 to 6 SNPs distance)
• First isolates in fall 2013.
• February 2014, two distinct clades form.
• Suggests bug has evolved
• Does this cluster represent 1, 2, or 3 sources?
[Figure: phylogenetic tree of isolates labeled swgs1008 through NY-swgs1387_NEW, annotated with collection date and county: 9/25/13 Niagara; 10/15/13 Monroe; 12/12/13 Cattaraugus; 2/6/14 Onondaga; 2/21/14 Onondaga; 2/22/14 Oneida; 2/22/14 Onondaga; 5/12/14 Ontario; 6/3/14 Erie; 6/11/14 Oswego; 6/22/14 Seneca; 7/05/14 Oneida; 7/22/14 Onondaga. Scale: 1 SNP.]
17. One person, two outbreaks
• GC-35 appears 6/9/14
• PFGE pattern 4
• Isolate from a food handler
• Total of 5 cases through 8/7/14
• GC-40 appears 6/30/14
• PFGE pattern 21
• Isolate recovered from the same food handler
[Figure labels: JEGX01.0021, JEGX01.0004]
18. WGCA is better, but is it faster and cheaper?
Metric                         PFGE            WGCA
TAT: extraction to analysis    2 days          6 days
Cost                           $69             $294
Technician time                8 h             10 h
Actionable clusters            3 non-endemic   63
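To make the comparison concrete, the short sketch below computes the WGCA-to-PFGE ratio for each metric on this slide. The dictionary keys are hypothetical names, and the ratio for actionable clusters is only indicative, since the PFGE figure counts non-endemic clusters specifically.

```python
# Figures from the slide; key names are illustrative.
pfge = {"tat_days": 2, "cost_usd": 69, "tech_hours": 8, "actionable_clusters": 3}
wgca = {"tat_days": 6, "cost_usd": 294, "tech_hours": 10, "actionable_clusters": 63}

# Ratio above 1 means WGCA is higher than PFGE for that metric.
fold = {metric: wgca[metric] / pfge[metric] for metric in pfge}

for metric, ratio in fold.items():
    print(f"{metric}: WGCA is {ratio:.2f}x PFGE")
```

On these numbers WGCA takes about three times as long and costs a little over four times as much, while yielding roughly twenty times as many actionable clusters.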
19. We have created a two-state network
• Collaborating with Minnesota.
• Minnesota currently has no informatics in house.
• We pull their sequences off BaseSpace.
• Run them through our pipeline.
20. Tree from Merged data
• Does pipeline used to merge the data affect tree structure?
• Do sequence metrics affect merged tree structure?
[Figure label: Travel associated]
21. National Genomic Surveillance Machine
• State labs feed the machine by uploading sequences from isolates
received through surveillance.
• Federal and other support for reagents and equipment.
• NCBI analyzes the products of this machine and reports results to
state and federal agencies.
22. Current FDA GenomeTrakr network
State Health labs
• New York
• Florida
• Arizona
• Washington
• Minnesota
• Virginia
• Maryland
FDA labs
• 9 FDA field labs
• CFSAN - MOD1
• CFSAN - Wiley
• IEH (contracting lab)
International labs
• Mexico
• Ireland
• UK (FERA)
• Colombia
Contributors
• Turkey
• Brazil
• Italy
24. Expected Outcomes for WGS surveillance
• Laboratory
• Improve outbreak cluster detection.
• Clusters will be detected more rapidly and from fewer isolates.
• Epi
• Allow identification of clusters within endemic patterns.
• Solve more clusters.
• Public Health
• More efficient identification and removal of pathogen sources.
25. Challenges exist
• Creating a network.
• Increasing amounts of data.
• Metadata: how much should be public?
• In real time?
• What elements?
• Paying
• As sequencing technology and bioinformatics evolve:
• Need to maintain backward compatibility
• Transitioning:
• What to do first.
• Integration with serology and PFGE typing.
26. Standards I would like to see
• Pipeline quality and reproducibility.
• Tree quality and reproducibility.
• Probability metrics that a cluster is from a single source.
27. Summary
• WGS can improve surveillance activities and outbreak traceback.
• It is practical to develop a network.
• We need standards.
28. Acknowledgments
• Cornell
Martin Wiedmann
Henk den Bakker
• FDA
Eric Brown
Peter Evans
Marc Allard
Errol Strain
Ruth Timme
• Connecticut DOH
Stacey Kinney
John Fontana
• Minnesota DOH
David Boxrud
Angie Jones
Victoria Lappi
• Washington State DOH
Ailyn Perez-Osorio
Zhen Li
• Wadsworth Center Genomics Core
Matt Shudt
Zhen Zhang
Charles MacGowan
Melissa Leisner
Danielle Loranger
Mike Palumbo
Pascal LaPierre
• Kara Michell
• Wadsworth PulseNet Lab
Dianna Bopp
Deb Baker
Lisa Thompson
• NCBI
Bill Klimke
Martin Shumway