MyVariant.info--Community Aggregated Variant Annotation as a Service (NGS2016, Barcelona)

Jiwen (Kevin) Xin, Cyrus Afrasiabi, Sean D. Mooney, Andrew I. Su, Chunlei Wu
kevinxin@scripps.edu
The Scripps Research Institute
La Jolla, CA, USA
NGS 2016
04/05/2016
MyVariant.info
Community-aggregated Variant Annotations As a Service

So many variant annotation resources
dbNSFP

Schematic view of MyVariant.info architecture
Each data source is updated individually. Colors indicate their different update schedules.

HGVS name examples
Table. Examples of HGVS (Human Genome Variation Society) nomenclature.
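The variant IDs used in the rest of this deck (for example chr1:g.31349647C>T) are genomic HGVS names. The following is a minimal sketch of building and parsing that one form, assuming only single-nucleotide substitutions; `format_snv` and `parse_snv` are hypothetical helper names, not functions of any MyVariant.info client.

```python
import re

# Pattern for a genomic HGVS substitution such as "chr1:g.31349647C>T".
# Assumption: only single-nucleotide substitutions are handled here.
_SNV_RE = re.compile(r"^chr([0-9]{1,2}|X|Y|MT):g\.([0-9]+)([ACGT])>([ACGT])$")


def format_snv(chrom, pos, ref, alt):
    """Build a genomic HGVS ID for a single-nucleotide substitution."""
    chrom = str(chrom)
    if chrom.startswith("chr"):
        chrom = chrom[3:]
    hgvs = "chr{}:g.{}{}>{}".format(chrom, int(pos), ref.upper(), alt.upper())
    if _SNV_RE.match(hgvs) is None:
        raise ValueError("not a simple substitution: {!r}".format(hgvs))
    return hgvs


def parse_snv(hgvs):
    """Split a substitution ID into (chrom, pos, ref, alt)."""
    match = _SNV_RE.match(hgvs)
    if match is None:
        raise ValueError("unsupported HGVS ID: {!r}".format(hgvs))
    chrom, pos, ref, alt = match.groups()
    return chrom, int(pos), ref, alt
```

Insertions, deletions and other HGVS forms are deliberately left out of this sketch.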

MyVariant.info for the end users:
http://MyVariant.info
(currently v1 API, two endpoints)
http://MyVariant.info/v1/query?q=<query>
  input: any query term(s)
  output: matching variant hits
http://MyVariant.info/v1/variant/<variantid>
  input: HGVS ID(s)
  output: matching variant object(s)
Both support batch mode via POST
Simple API. No sign-up. No API key.
Try our live API and documentation
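The two endpoints above can be addressed with nothing but the Python standard library. This is a sketch, not an official client: the helper names are hypothetical, and the `ids` form field used for the batch POST is an assumption about the batch interface that should be checked against the API documentation.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "http://myvariant.info/v1"


def variant_url(variant_id):
    """URL for the /variant endpoint (one HGVS ID)."""
    # ">" is percent-encoded; ":" is kept readable.
    return "{}/variant/{}".format(BASE_URL, quote(variant_id, safe=":"))


def query_url(q):
    """URL for the /query endpoint (any query term)."""
    return "{}/query?{}".format(BASE_URL, urlencode({"q": q}, safe=":"))


def batch_variant_request(variant_ids):
    """Prepare (but do not send) a POST request for several HGVS IDs.

    Assumption: the batch endpoint reads a comma-separated "ids" form field.
    """
    body = urlencode({"ids": ",".join(variant_ids)}).encode("ascii")
    return Request(BASE_URL + "/variant", data=body, method="POST")
```

Nothing here touches the network; passing a URL or the prepared request to `urllib.request.urlopen` would perform the actual call.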

Retrieving a single variant
http://myvariant.info/v1/variant/chr1:g.31349647C>T
Integrated annotations across resources in a well-formatted data structure
Always up-to-date

Filtering returned fields
http://myvariant.info/v1/variant/chr1:g.31349647C>T?fields=dbnsfp
http://myvariant.info/v1/variant/chr1:g.31349647C>T?fields=dbnsfp.clinvar
http://myvariant.info/v1/variant/chr1:g.31349647C>T?fields=dbnsfp.clinvar,dbsnp.gmaf,clinvar.hgvs.coding
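The dotted names passed to `fields` mirror the nesting of the returned JSON object. As an illustration, this sketch reads dotted paths out of an already-fetched variant object on the client side. The helper names are hypothetical, and `example_doc` holds placeholder values rather than real API output.

```python
def get_field(doc, dotted_path, default=None):
    """Follow a dotted path such as "dbnsfp.clinvar" through nested dicts."""
    current = doc
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def select_fields(doc, fields):
    """Return {dotted_path: value} for the comma-separated paths found in doc."""
    missing = object()  # sentinel, so stored None values are still reported
    selected = {}
    for path in fields.split(","):
        path = path.strip()
        value = get_field(doc, path, missing)
        if value is not missing:
            selected[path] = value
    return selected


# Hypothetical document: placeholder values, not real annotation data.
example_doc = {
    "_id": "chr1:g.31349647C>T",
    "dbnsfp": {"clinvar": {"placeholder": "value-a"}},
    "dbsnp": {"gmaf": "value-b"},
}
```

Paths that are absent from the object (here `clinvar.hgvs.coding`) are simply left out of the result.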

Making flexible queries
• All variants with dbNSFP annotation:
http://myvariant.info/v1/query?q=_exists_:dbnsfp
• All non-synonymous variants on gene "BTK":
http://myvariant.info/v1/query?q=dbnsfp.genename:BTK
• All variants within a genomic range:
http://myvariant.info/v1/query?q=chr1:69000-70000
• Query Wellderly variants together with other annotation sources:
http://myvariant.info/v1/query?q=_exists_:wellderly AND cadd.polyphen.cat:possibly_damaging&fields=wellderly,cadd.polyphen
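The query strings above are plain text, so they can be composed from small pieces before being URL-encoded. A minimal sketch, with hypothetical helper names (`exists`, `term`, `all_of`, `build_query_url`), that reproduces the Wellderly example:

```python
from urllib.parse import urlencode

QUERY_ENDPOINT = "http://myvariant.info/v1/query"


def exists(field):
    """Clause matching variants that carry the given field."""
    return "_exists_:{}".format(field)


def term(field, value):
    """Clause matching a field against a value."""
    return "{}:{}".format(field, value)


def all_of(*clauses):
    """Combine clauses with boolean AND."""
    return " AND ".join(clauses)


def build_query_url(q, fields=None):
    """Encode the query (and optional field list) into a request URL."""
    params = {"q": q}
    if fields:
        params["fields"] = ",".join(fields)
    # Spaces become "+"; ":" and "," are kept readable.
    return "{}?{}".format(QUERY_ENDPOINT, urlencode(params, safe=":,"))


q = all_of(exists("wellderly"), term("cadd.polyphen.cat", "possibly_damaging"))
url = build_query_url(q, fields=["wellderly", "cadd.polyphen"])
```

Encoding the space in " AND " explicitly avoids relying on the browser or HTTP library to do it.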

Many more ways of querying, across resources
• Full-text queries
• Wildcard queries
• Range queries
• Boolean queries
• Regex queries
• Field existing/missing
• Faceting
• Paging
• Sorting
• Batch queries
• JSONP and CORS support
• …

MyVariant.info stats
• total (334,293,820)
• dbNSFP (82,030,830; v3.0)
• dbSNP (145,132,257; v144)
• ClinVar (131,383; 201602)
• EVS (1,977,300; v2)
• CADD (226,932,858; v1.3)
• MutDB (420,221)
• gwassnps (15,243; from UCSC)
• COSMIC (1,024,498; v68 from UCSC)
• DOCM (1,119)
• SNPedia (5,907)
• EMVClass (12,066)
• Wellderly (21,240,519)
• ExAC (10,195,872; v0.3)
• GRASP (2,212,148; v2.0.0.0)
As of April 2016

MyVariant.info official Python/R Clients
myvariant Python client hosted on PyPI
(initial release in Aug 2015)
myvariant R client hosted on Bioconductor
(initial release in Oct 2015)
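A sketch of how batch results from the Python client might be sorted into annotated and unmatched variants. It assumes a client object with a `getvariants(ids, fields=...)` method that returns one dict per input ID, with the input under `query` and a `notfound` flag for misses. So that the sketch runs offline, a stand-in `FakeClient` (hypothetical) is used in place of the real client.

```python
def annotate(client, variant_ids, fields="all"):
    """Split batch results into found annotations and IDs without a match."""
    found, missing = {}, []
    for hit in client.getvariants(variant_ids, fields=fields):
        if hit.get("notfound"):
            missing.append(hit["query"])
        else:
            found[hit["query"]] = hit
    return found, missing


class FakeClient:
    """Stand-in for the real client so the sketch runs without network access."""

    def getvariants(self, ids, fields="all"):
        known = {"chr1:g.31349647C>T"}
        return [
            {"query": i, "_id": i} if i in known else {"query": i, "notfound": True}
            for i in ids
        ]


found, missing = annotate(FakeClient(), ["chr1:g.31349647C>T", "chr1:g.1A>C"])
```

With the `myvariant` package installed, `myvariant.MyVariantInfo()` would be passed in place of `FakeClient()`.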

Use case 1
An easy resource to retrieve
well-structured variant
annotations

Use case 2
Direct queries integrated in your
analysis pipeline

Use case 2: An example workflow for variant prioritization
input variants
output variants
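# vars is assumed to be a list of data frames of annotated variants
# Filter 1: keep variants with a protein-altering or splice-site consequence
filter1 <- lapply(vars, function(i) subset(i,
    cadd.consequence %in% c("NON_SYNONYMOUS", "STOP_GAINED", "STOP_LOST",
                            "CANONICAL_SPLICE", "SPLICE_SITE")))
# Filter 2: keep variants with ExAC allele frequency below 1%
filter2 <- lapply(filter1, function(i) subset(i, exac.af < 0.01))
# Filter 3: keep variants with 1000 Genomes (phase 1) allele frequency below 1%
filter3 <- lapply(filter2, function(i)
    subset(i, sapply(dbnsfp.1000gp1.af, function(j) j < 0.01)))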
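The same three-step prioritization can be sketched in Python. This is an illustrative port, not the authors' pipeline: it assumes each variant is a flat dict keyed by the dotted field names used in the R code, and, like R's `subset`, it drops variants whose frequency value is missing.

```python
# Consequence categories kept by the first filter (taken from the R code).
DAMAGING = {"NON_SYNONYMOUS", "STOP_GAINED", "STOP_LOST",
            "CANONICAL_SPLICE", "SPLICE_SITE"}


def is_rare(value, cutoff=0.01):
    """True when an allele frequency is present and below the cutoff."""
    return value is not None and value < cutoff


def prioritize(variants, cutoff=0.01):
    """Apply the consequence filter, then the two allele-frequency filters."""
    step1 = [v for v in variants if v.get("cadd.consequence") in DAMAGING]
    step2 = [v for v in step1 if is_rare(v.get("exac.af"), cutoff)]
    step3 = [v for v in step2 if is_rare(v.get("dbnsfp.1000gp1.af"), cutoff)]
    return step3
```

Keeping the steps as separate lists mirrors `filter1` to `filter3` and makes it easy to inspect how many variants survive each stage.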

Use case 3
For curators/data providers:
A platform for
• integrating with other resources (saving repetitive effort)
• distributing your valuable data (under your own source field)

Use case 4
For variant curation itself:
Identify discrepancies
Serve as the basis of a community-engaged curation process

Linked data
URI (Uniform Resource Identifier):
Provides a unique identifier for anything or any concept on the web
Connective: connecting data, concepts, applications and ultimately people
URL (Uniform Resource Locator):
Provides a unique identifier for webpages
Text files, images, music, videos
Interactive: Twitter, Facebook, blogs

Why Linked Data?
Providing a unique identifier for a concept
Genename, e.g. CDK2
genename (database1)
gene_name (database2)
{'gene': {'name': …}} (database3)
URI: http://identifiers.org/hgnc.symbol
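The point of the slide can be sketched in code: three databases store the same gene name under different keys, and a shared URI gives them one common identifier. The record layouts below follow the three key styles on the slide. Appending the symbol to the URI prefix is an assumption about the URI pattern, and all names in the sketch are hypothetical.

```python
HGNC_SYMBOL_URI = "http://identifiers.org/hgnc.symbol"

# One extractor per database layout (layouts taken from the slide).
EXTRACTORS = {
    "database1": lambda rec: rec["genename"],
    "database2": lambda rec: rec["gene_name"],
    "database3": lambda rec: rec["gene"]["name"],
}


def gene_uri(source, record):
    """Map a source-specific record to a shared gene URI.

    Assumption: a concept URI is the prefix followed by "/" and the symbol.
    """
    return "{}/{}".format(HGNC_SYMBOL_URI, EXTRACTORS[source](record))


records = [
    ("database1", {"genename": "CDK2"}),
    ("database2", {"gene_name": "CDK2"}),
    ("database3", {"gene": {"name": "CDK2"}}),
]
uris = {gene_uri(source, rec) for source, rec in records}
```

All three records collapse to a single URI, which is what makes cross-resource linking possible.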

Data Discrepancy: Example
http://myvariant.info/v1/variant/chr12:g.111351981C>T?fields=clinvar.rsid,dbsnp.rsid,evs.rsid
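A query like the one above returns the rsID recorded by each source, so disagreements can be flagged programmatically. A minimal sketch, assuming each source stores a single rsID string under `<source>.rsid`; `rsid_discrepancy` is a hypothetical helper and the documents in the example use placeholder rsIDs, not real data.

```python
def rsid_discrepancy(doc, sources=("clinvar", "dbsnp", "evs")):
    """Return {source: rsid} when sources disagree, else an empty dict."""
    rsids = {}
    for source in sources:
        value = doc.get(source, {}).get("rsid")
        if value is not None:
            rsids[source] = value
    # Sources with no rsID are ignored; one distinct value means agreement.
    return rsids if len(set(rsids.values())) > 1 else {}


# Hypothetical documents with placeholder rsIDs.
agreeing = {"clinvar": {"rsid": "rs0000001"}, "dbsnp": {"rsid": "rs0000001"}}
conflicting = {
    "clinvar": {"rsid": "rs0000001"},
    "dbsnp": {"rsid": "rs0000001"},
    "evs": {"rsid": "rs0000002"},
}
```

Running this over many variants would give curators a list of candidates to review.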

Data Discrepancy: Example 2
EVS web browser vs. EVS txt data file

Acknowledgement
Funding and Support
U54GM114833
U01HG008473
Washington U:
Ben Ainscough
Obi Griffith
TSRI:
Chunlei Wu
Andrew Su
Jiwen Xin
Cyrus Afrasiabi
Ginger Tsueng
Adam Mark
Greg Stupp
Tim Putman
STSI:
Eric Topol
Ali Torkamani
Galina Erikson
U. Washington:
Sean Mooney
Moritz Juchler
Nikhil Gopal
OICR:
Robin Haw
UC Berkeley:
Chris Mungall
UCSD:
Trish Whetzel

MyVariant.info Clients
API:
https://myvariant.info
Python Client:
https://pypi.python.org/pypi/myvariant/
R Client:
http://bioconductor.org/packages/release/bioc/html/myvariant.html
Jupyter Notebook Tutorial for Python Client (Focus on ClinVar):
https://cdn.rawgit.com/SuLab/myvariant.info/master/docs/ipynb/myvariant_clinvar_demo.html
