One man's *1 is another man's *13? Trouble with nomenclatures in personalized medicine

1. One man's *1 is another man's *13? Trouble with nomenclatures in personalized medicine
Asst.-Prof. Mag. Dr. Matthias Samwald, CeMSIIS, Medical University of Vienna
Summer School: Genomic Medicine – Bridging research and the clinic, May 6 2016, Portoroz, Slovenia
Funded by Austrian Science Fund (FWF): [P 25608-N15]. This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 668353 (KB and MS).

2. What's the problem?

7. We simulated the accuracy of various targeted, low-cost assays suitable for pre-emptive testing compared to next-gen sequencing
Figure: Venn diagram displaying the numbers and overlaps of polymorphisms covered by constrained views derived from four pharmacogenomic assays. DMET: derived from the Affymetrix DMET™ Plus assay; VERA: Illumina VeraCode® ADME Core Panel; TAQM: TaqMan® OpenArray® PGx Panel; FLOR: University of Florida and Stanford Custom Array.

8. We simulated the accuracy of various targeted, low-cost assays suitable for pre-emptive testing compared to next-gen sequencing

9. We simulated the accuracy of various targeted, low-cost assays suitable for pre-emptive testing compared to next-gen sequencing

10.

11. We simulated the accuracy of various targeted, low-cost assays suitable for pre-emptive testing compared to next-gen sequencing
Figure: Fraction of tested genes resulting in aberrations in haplotype calling with a restricted assay compared to next-gen sequencing. Based on full genome sequences of 2504 persons. Manuscript currently under review at ‘Pharmacogenomics’.
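To make the idea of an "aberration in haplotype calling" concrete, here is a minimal toy sketch in Python. It is not the method used in the study: the allele definitions, the variant names (`v1`, `v2`), and the calling rule (pick the matching allele with the fewest defining variants) are all assumptions made for illustration, and the fraction is computed over samples of one invented gene rather than over tested genes.

```python
# Toy star-allele definitions (hypothetical, NOT real definitions of any gene):
# each allele is the set of variants that defines it.
DEFINITIONS = {
    "*1": frozenset(),              # reference allele: no variants
    "*2": frozenset({"v1"}),
    "*13": frozenset({"v1", "v2"}),
}
ALL_VARIANTS = frozenset().union(*DEFINITIONS.values())


def call_haplotype(observed, covered):
    """Call an allele using only the variants an assay covers.

    Assumed rule: an allele matches if its definition, restricted to the
    covered variants, equals the restricted observation. Among matches,
    the allele with the fewest defining variants wins.
    """
    seen = frozenset(observed) & covered
    matches = [a for a, d in DEFINITIONS.items() if d & covered == seen]
    if not matches:
        return None
    return min(matches, key=lambda a: (len(DEFINITIONS[a]), a))


def aberration_fraction(samples, covered):
    """Fraction of samples whose restricted call differs from the full call."""
    differing = sum(
        call_haplotype(s, covered) != call_haplotype(s, ALL_VARIANTS)
        for s in samples
    )
    return differing / len(samples)


# Four invented samples; the assay below covers v1 but not v2.
samples = [frozenset(), {"v1"}, {"v1", "v2"}, {"v1", "v2"}]
restricted = frozenset({"v1"})
print(call_haplotype({"v1", "v2"}, ALL_VARIANTS))  # *13
print(call_haplotype({"v1", "v2"}, restricted))    # *2
print(aberration_fraction(samples, restricted))    # 0.5
```

In this toy setting, an assay that covers no variants at all would report every sample as `*1`, which is the situation the talk title alludes to.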

13. Where to go from here?

14.

15.

16.

17. Allele Registry project

18. From the lab: experimental mnemonic nomenclature
• Idea: Experiment with human-friendly nomenclature
  o No human committee
  o Less cryptic alphanumeric descriptors

19. From the lab: experimental mnemonic nomenclature
• Synthetic pseudo-words can encode a lot of information
• CVCVCV pattern examples (C = consonant, V = vowel):
  o binoru
  o nivudi
  o pekuvo
  o jutoxu
  o hacifi
  o dejula
• A CVCVCV tuple (Y as vowel) can denote: 20 * 6 * 20 * 6 * 20 * 6 = 1 728 000 variants
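The capacity figure can be checked with a short Python sketch. It assumes the 20 consonants and 6 vowels (Y counted as a vowel) of the English alphabet; the names `pseudo_words` and `capacity` are hypothetical helpers, not part of any existing tool.

```python
from itertools import product

# Assumption: 6 vowels (Y counted as a vowel) and the remaining 20 consonants.
VOWELS = "aeiouy"
CONSONANTS = "bcdfghjklmnpqrstvwxz"


def pseudo_words(syllables):
    """Yield every pseudo-word made of `syllables` consonant-vowel pairs."""
    alphabet = [CONSONANTS, VOWELS] * syllables
    for letters in product(*alphabet):
        yield "".join(letters)


def capacity(syllables):
    """Number of distinct pseudo-words with the given number of CV pairs."""
    return (len(CONSONANTS) * len(VOWELS)) ** syllables


print(capacity(3))            # 1728000
print(next(pseudo_words(3)))  # bababa
```

Shorter patterns fill up quickly: a single CV pair gives 120 names and CVCV gives 14 400, which is why the longer CVCVCV form is needed for large variant sets.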

20. Algorithm (no human curation / committee)
• Take a large dataset containing variant data (the usual ones: 1000 Genomes, 100,000 Genomes, 1M genomes…) as reference
• Create a list of genome loci and the variants observed there (some loci might have more than 2 possible variants)
• For each gene:
  o For each locus:
    ▪ Sort observed variants by frequency
    ▪ Define the most frequently observed variant as ‘wild type’; remove these variants from the table used for constructing the mnemonics (they are considered to be the default)
  o Sort loci by the frequency of the most frequent non-wild-type variant of each locus
  o Assign mnemonics to each variant systematically, starting with shorter mnemonic strings (i.e., 2-character tuple)
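The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the input format (tuples of gene, locus, allele, frequency), the function names, and the choice to restart the mnemonic sequence for each gene are all hypothetical.

```python
from collections import defaultdict
from itertools import count, product

VOWELS = "aeiouy"                    # assumption: Y counted as a vowel
CONSONANTS = "bcdfghjklmnpqrstvwxz"


def mnemonic_stream():
    """Yield CV strings, then CVCV, then CVCVCV, ... (shortest first)."""
    for syllables in count(1):
        for letters in product(*([CONSONANTS, VOWELS] * syllables)):
            yield "".join(letters)


def assign_mnemonics(observations):
    """Map each non-wild-type variant to a mnemonic.

    observations: iterable of (gene, locus, allele, frequency) tuples
    (a hypothetical input format). Returns {gene: {(locus, allele): name}}.
    """
    by_gene = defaultdict(lambda: defaultdict(list))
    for gene, locus, allele, freq in observations:
        by_gene[gene][locus].append((freq, allele))

    result = {}
    for gene, loci in by_gene.items():
        ranked = []
        for locus, variants in loci.items():
            variants = sorted(variants, key=lambda v: -v[0])
            non_wild = variants[1:]  # most frequent variant = 'wild type'
            if non_wild:
                ranked.append((non_wild[0][0], locus, non_wild))
        # Loci with the most frequent non-wild-type variant come first.
        ranked.sort(key=lambda item: -item[0])
        names = mnemonic_stream()    # assumption: names restart per gene
        table = {}
        for _, locus, non_wild in ranked:
            for _, allele in non_wild:
                table[(locus, allele)] = next(names)
        result[gene] = table
    return result


# Invented frequencies for an invented gene, for illustration only.
observations = [
    ("GENE_A", 100, "C", 0.70), ("GENE_A", 100, "T", 0.30),
    ("GENE_A", 200, "G", 0.55), ("GENE_A", 200, "A", 0.40),
    ("GENE_A", 200, "T", 0.05),
    ("GENE_A", 300, "A", 1.00),
]
print(assign_mnemonics(observations))
# {'GENE_A': {(200, 'A'): 'ba', (200, 'T'): 'be', (100, 'T'): 'bi'}}
```

Because the most common variants are named first, they receive the shortest mnemonics; locus 300 has only a wild-type variant and therefore gets no name.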

22.

23. Example mnemonic code sequences
• VKORC1: cy-do-du | be-do-du
• CYP2D6: nai / nai-pek
• CYP2D6: nai / be-wi / nai-pek (copy number variation)
• TPMT: be-fu-fy | ba-bi-fi-tek
• Mnemonic code + reference to variants/regions covered by assay = automatically decompress to full sequence / genotype result
• Sets of co-occurring SNP variants could automatically be assigned an identifier of their own and combined with individual SNP variant identifiers
• Currently creating a humble proof-of-concept based on 1000 Genomes data

24. Local team (Medical University of Vienna)
• Asst.-Prof. Mag. Dr. Matthias Samwald (PI)
• Dr. Kathrin Blagec
• Mag. Sebastian Hofer
• Hong Xu, BSc
• Wolfgang Kuch
Web: http://samwald.info/ http://safety-code.org/ http://upgx.eu
Thanks!

25. Further info
• Reference: Matthias Samwald, Kathrin Blagec, Sebastian Hofer and Robert R. Freimuth. “Analysing the potential for incorrect haplotype calls with different pharmacogenomic assays in different populations: a simulation based on 1000 Genomes data.” Pharmacogenomics, September 30, 2015. doi:10.2217/pgs.15.108
• Code availability: The curated resources and the IPython notebooks are available at https://gitlab.com/medication-safety/ms-ipython
