Trouble with nomenclatures in personalized medicine
Asst.-Prof. Mag. Dr. Matthias Samwald
CeMSIIS, Medical University of Vienna
SUMMER SCHOOL: GENOMIC MEDICINE – Bridging research and the clinic, May 6 2016,
Portoroz, Slovenia
One man's *1 is another man's *13?
Funded by Austrian Science Fund (FWF): [P 25608-N15]
This project has received funding from the European Union’s Horizon
2020 research and Innovation programme under grant agreement
No 668353 (KB and MS).
What's the problem?
We simulated the accuracy of various targeted, low-cost assays
suitable for pre-emptive testing compared to next-gen sequencing
Venn diagram displaying the numbers and overlaps of polymorphisms covered by constrained
views derived from four pharmacogenomic assays. DMET: derived from the Affymetrix
DMET™ Plus assay, VERA: Illumina VeraCode® ADME Core Panel, TAQM: TaqMan® OpenArray®
PGx Panel, FLOR: University of Florida and Stanford Custom Array.
Fraction of tested genes resulting in aberrations in haplotype calling with restricted assay
compared to next-gen sequencing. Based on full genome sequences of 2504 persons. Manuscript
currently under review at ‘Pharmacogenomics’.
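A toy sketch of how a restricted assay can change a haplotype call, which is the kind of aberration the simulation counts. The allele definitions, the variant ids (v1, v2) and the matching rule in call_haplotype are hypothetical illustrations, not the method used in the study.

```python
def call_haplotype(observed, definitions, covered=None):
    """Return the name of the best-matching allele definition.

    observed:    set of variant ids found on one chromosome copy
    definitions: {allele name: frozenset of defining variant ids};
                 assumed to contain a reference allele with an empty set
    covered:     variant ids the assay can see (None = sequencing, sees all)
    """
    def visible(variants):
        return variants if covered is None else variants & covered

    seen = visible(observed)
    # An allele matches if all of its visible defining variants were observed.
    matches = [name for name, variants in definitions.items()
               if visible(variants) <= seen]
    # Prefer the most specific match; ties go to the earlier definition.
    return max(matches, key=lambda name: len(visible(definitions[name])))


def mismatch_fraction(samples, definitions, covered):
    """Fraction of samples whose restricted call differs from the full call."""
    differing = sum(
        call_haplotype(s, definitions) != call_haplotype(s, definitions, covered)
        for s in samples
    )
    return differing / len(samples)


# Hypothetical star-allele definitions for one gene.
DEFINITIONS = {
    "*1": frozenset(),
    "*2": frozenset({"v1"}),
    "*13": frozenset({"v1", "v2"}),
}
SAMPLES = [set(), {"v1"}, {"v1", "v2"}, {"v2"}]

print(call_haplotype({"v1", "v2"}, DEFINITIONS))          # full view
print(call_haplotype({"v1", "v2"}, DEFINITIONS, {"v1"}))  # assay without v2
print(mismatch_fraction(SAMPLES, DEFINITIONS, {"v1"}))
```

In this toy setup the same chromosome is reported under two different allele names depending on which variants the assay covers.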
Where to go from here?
Allele Registry project
From the lab: experimental mnemonic nomenclature
• Idea: Experiment with human-friendly nomenclature
o No human committee
o Less cryptic alphanumeric descriptors
• Synthetic pseudo-words can encode a lot of information
• CVCVCV pattern examples (C = consonant, V = vowel):
o binoru
o nivudi
o pekuvo
o jutoxu
o hacifi
o dejula
• CVCVCV tuple (Y as vowel) can denote: 20 * 6 * 20 * 6 * 20 * 6 = 1 728 000 variants
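A minimal sketch of the capacity arithmetic and of mapping integers to pseudo-words and back. The exact alphabet is an assumption (the slide only states 20 consonants and 6 vowels with Y as a vowel), and the function names capacity, encode and decode are hypothetical.

```python
# Assumed alphabet: 20 consonants and 6 vowels (Y counted as a vowel).
CONSONANTS = "bcdfghjklmnpqrstvwxz"
VOWELS = "aeiouy"
BASE = len(CONSONANTS) * len(VOWELS)  # 120 consonant-vowel syllables


def capacity(syllables: int) -> int:
    """Number of distinct pseudo-words with this many CV syllables."""
    return BASE ** syllables


def encode(index: int, syllables: int = 3) -> str:
    """Map an integer in [0, capacity) to a pseudo-word (CVCVCV by default)."""
    if not 0 <= index < capacity(syllables):
        raise ValueError("index out of range for this word length")
    parts = []
    for _ in range(syllables):
        index, digit = divmod(index, BASE)
        c, v = divmod(digit, len(VOWELS))
        parts.append(CONSONANTS[c] + VOWELS[v])
    return "".join(reversed(parts))


def decode(word: str) -> int:
    """Inverse of encode: map a pseudo-word back to its integer."""
    index = 0
    for c, v in zip(word[0::2], word[1::2]):
        index = index * BASE + CONSONANTS.index(c) * len(VOWELS) + VOWELS.index(v)
    return index


print(capacity(3))               # 1728000
print(encode(decode("binoru")))  # binoru
```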
Algorithm (no human curation / committee)
• Take a large dataset containing variant data (e.g., 1000
Genomes, 100,000 Genomes, 1M genomes…) as reference
• Create a list of genome loci and the variants observed there (some
loci might have more than 2 possible variants)
• For each gene:
o For each locus:
▪ Sort observed variants by frequency
▪ Define the most frequently observed variant as ‘wild type’;
remove these variants from the table used for constructing
the mnemonics (they are considered to be the default)
o Sort loci by the frequency of the most frequent non-wild-type
variant of each locus
o Assign mnemonics to each variant systematically, starting with
shorter mnemonic strings (e.g., 2-character tuples)
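The steps above can be sketched for a single gene as follows. This is an illustration under stated assumptions, not the project's implementation: the input layout ({locus: {variant: frequency}}), the descending sort order for loci, the CV syllable alphabet and the names mnemonics and assign_mnemonics are all hypothetical.

```python
from itertools import count, product

# Assumed alphabet: 20 consonants, 6 vowels (Y as vowel).
CONSONANTS = "bcdfghjklmnpqrstvwxz"
VOWELS = "aeiouy"


def mnemonics():
    """Yield pseudo-words by increasing length: CV, then CVCV, and so on."""
    syllables = [c + v for c, v in product(CONSONANTS, VOWELS)]
    for n in count(1):
        for combo in product(syllables, repeat=n):
            yield "".join(combo)


def assign_mnemonics(gene_data):
    """Assign mnemonics to the non-wild-type variants of one gene.

    gene_data: {locus: {variant: frequency}}
    Returns {(locus, variant): mnemonic}.
    """
    non_wild = {}
    for locus, freqs in gene_data.items():
        ranked = sorted(freqs.items(), key=lambda kv: kv[1], reverse=True)
        # ranked[0] is the most frequent variant ('wild type'); drop it.
        if len(ranked) > 1:
            non_wild[locus] = ranked[1:]

    # Assumption: loci with the most frequent non-wild-type variant come
    # first, so that common variants receive the shortest mnemonics.
    loci = sorted(non_wild, key=lambda l: non_wild[l][0][1], reverse=True)

    names = mnemonics()
    table = {}
    for locus in loci:
        for variant, _ in non_wild[locus]:
            table[(locus, variant)] = next(names)
    return table


# Made-up frequencies for a made-up gene with three loci.
example_gene = {
    "pos100": {"A": 0.9, "G": 0.1},
    "pos200": {"C": 0.6, "T": 0.3, "G": 0.1},
    "pos300": {"T": 1.0},
}
print(assign_mnemonics(example_gene))
```

Calling assign_mnemonics once per gene restarts the mnemonic sequence for every gene, so the same short string can be reused across genes.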
Example mnemonic code sequences
VKORC1: cy-do-du | be-do-du
CYP2D6: nai / nai-pek
CYP2D6: nai / be-wi / nai-pek (copy number variation)
TPMT: be-fu-fy | ba-bi-fi-tek
Mnemonic code + reference to variants/regions covered by assay =
automatic decompression to full sequence / genotype result
Sets of co-occurring SNP variants could automatically be assigned
an identifier of their own and combined with individual SNP variant
identifiers
Currently creating a modest proof of concept based on 1000
Genomes data
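A rough sketch of the decompression idea: a mnemonic code plus a reference table is expanded into alleles at every locus the assay covers. The reading of the separators ("|" between the two haplotypes, "-" between variant mnemonics), the table contents and the function name decompress are assumptions for illustration only.

```python
def decompress(code, table, wild_type, covered_loci):
    """Expand a mnemonic code into one {locus: allele} dict per haplotype.

    code:         e.g. "be-do | do" (assumed separator semantics)
    table:        {mnemonic: (locus, variant)} for non-wild-type variants
    wild_type:    {locus: default allele}
    covered_loci: loci the assay reports on
    """
    haplotypes = []
    for part in code.split("|"):
        # Start from the default allele at every covered locus.
        alleles = {locus: wild_type[locus] for locus in covered_loci}
        for word in (w.strip() for w in part.split("-")):
            if not word:
                continue
            locus, variant = table[word]
            if locus not in alleles:
                raise ValueError(f"{word!r} refers to a locus the assay does not cover")
            alleles[locus] = variant
        haplotypes.append(alleles)
    return haplotypes


# Hypothetical reference data; the mnemonics do not correspond to real variants.
TABLE = {"be": ("locus1", "T"), "do": ("locus2", "A")}
WILD_TYPE = {"locus1": "C", "locus2": "G", "locus3": "A"}

print(decompress("be-do | do", TABLE, WILD_TYPE, ["locus1", "locus2", "locus3"]))
```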
Local team (Medical University of Vienna)
Asst.-Prof. Mag. Dr. Matthias Samwald (PI)
Dr. Kathrin Blagec
Mag. Sebastian Hofer
Hong Xu, BSc
Wolfgang Kuch
Web
http://samwald.info/
http://safety-code.org/
http://upgx.eu
Thanks!
Further info
• Reference: Matthias Samwald, Kathrin Blagec, Sebastian Hofer and Robert R. Freimuth. “Analysing
the potential for incorrect haplotype calls with different pharmacogenomic assays in different
populations: a simulation based on 1000 Genomes data.” Pharmacogenomics, September 30,
2015. doi:10.2217/pgs.15.108
• Code availability: The curated resources and the IPython notebooks are available at
https://gitlab.com/medication-safety/ms-ipython
