Advanced genomics v_medical_pitt_kent_osu
1. Ben Busby, Ph.D.
Genomics Outreach Coordinator
NCBI
ben.busby@nih.gov
Genomic Variation in the Rising Era of Individual Genome Sequence
4. Quick Description
Case Study
Medical Genetics Summary Resource
Other Variation Resources
Additional General Information
12/01/16 4
5. • Aggregates data from many sources
• Term hierarchy (UMLS, GeneReviews, GTR and
other vocabularies)
• Phenotypes using standard vocabularies
• Links to NCBI and outside resources
6. Now at www.omim.org
Still searchable at NCBI
• Compendium of human genes and
related phenotypes
• Each entry a review article
• Human curated from biomedical
literature
<ncbi>/omim/
7. • Clinical significance of sequence variations and
relationship to phenotypes
• Submitted and curated assertions
• Represents variants using HGVS and reference
sequences including RefSeqGene
8. • NIH’s international registry of available genetic tests voluntarily
provided by testing labs
• Submitted tests for Mendelian disorders (including pharmacogenetic tests)
• Provides searches by
• Disorder
• Test
• Laboratory
• Condition pages with links to other resources (Gene, GeneReviews)
12/01/16 NCBI Public Services
9. Genome Reference Consortium (GRC)
dbGaP
PheGenI (joint with NHGRI)
GeneReviews
Medical Genetics Summaries
…and much, much more
10. The clinic secretary schedules this visit for you:
• Boy age 9 years, chief complaint:
needs medical clearance to play soccer
• Referred to genetics because of family history:
• paternal uncle died of a dissecting thoracic aortic
aneurysm at age 52
• paternal grandmother died in childbirth
You do some background reading to prepare for the case
24. • Patient does not meet revised Ghent criteria for Marfan
syndrome
– …but he is young and could develop diagnostic features
later in life (too young for accurate clinical diagnosis)
25. • Leading diagnosis is Marfan syndrome
– Could follow patient over time but he wants medical
release to play soccer. You are concerned about:
• Possibility of EDS IV with risk of fatal vascular rupture
• Missing the potentially severe vascular manifestations of LDS
• Familial thoracic aortic aneurysm conditions
• You decide to find a gene panel test which includes
genes for all these conditions
35. Used MedGen to research the condition
Used GTR to find tests
Received lab report:
FBN1:c.4786C>T
Where to find information about this variant in
the fibrillin gene?
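One way to start is to split the reported variant into its parts and turn it into a search term. The sketch below is illustrative only: it assumes the simple substitution form shown on the lab report (GENE:c.POSITIONREF>ALT), the function names are made up for this example, and the search term is just one plausible way to phrase a ClinVar query, not a recommended one.

```python
import re

# Matches only simple coding substitutions such as FBN1:c.4786C>T.
# Other HGVS forms (deletions, duplications, intronic offsets) are not handled.
HGVS_SUBSTITUTION = re.compile(
    r"^(?P<gene>[A-Za-z0-9-]+):c\.(?P<pos>\d+)(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


def parse_substitution(expression):
    """Split a GENE:c.123A>G style expression into its components."""
    match = HGVS_SUBSTITUTION.match(expression.strip())
    if match is None:
        raise ValueError(f"not a simple coding substitution: {expression!r}")
    return {
        "gene": match.group("gene"),
        "position": int(match.group("pos")),
        "ref": match.group("ref"),
        "alt": match.group("alt"),
    }


def clinvar_search_term(expression):
    """Build a search term (hypothetical phrasing) from the parsed variant."""
    v = parse_substitution(expression)
    change = "c.{}{}>{}".format(v["position"], v["ref"], v["alt"])
    return '{}[gene] AND "{}"'.format(v["gene"], change)
```

For the variant on the lab report, clinvar_search_term("FBN1:c.4786C>T") yields a term that restricts the search to the FBN1 gene and quotes the coding change.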
38. Practice guideline
Reviewed by expert panel
Multiple interpretations with assertion criteria that agree
One interpretation with assertion criteria OR multiple interpretations with assertion criteria but conflicting
No interpretations with assertion criteria OR no interpretation provided
http://www.ncbi.nlm.nih.gov/clinvar/docs/assertion_criteria/
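The review levels listed on this slide correspond to ClinVar's star ratings, from four stars down to none. A minimal sketch of that mapping follows, assuming the star counts from ClinVar's review status documentation; the dictionary keys reuse the slide's wording rather than ClinVar's exact labels, and the function name is made up for this example.

```python
# Star counts follow ClinVar's review status documentation (assumption:
# the keys are the slide's wording, not the exact strings ClinVar emits).
REVIEW_STATUS_STARS = {
    "practice guideline": 4,
    "reviewed by expert panel": 3,
    "multiple interpretations with assertion criteria that agree": 2,
    "one interpretation with assertion criteria": 1,
    "multiple interpretations with assertion criteria but conflicting": 1,
    "no interpretations with assertion criteria": 0,
    "no interpretation provided": 0,
}


def stars_for(status):
    """Return the star count for a review status, ignoring case and padding."""
    return REVIEW_STATUS_STARS[status.strip().lower()]
```

Sorting a list of interpretations by stars_for, highest first, puts the most thoroughly reviewed assertions at the top.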
44. Interactively on the web; updated weekly
Monthly full releases
Comprehensive XML extraction
VCF files
Tab-delimited summary files for genes, variants
E-utilities as web service or via command line
Annotation on graphic sequence displays
Variation Viewer - www.ncbi.nlm.nih.gov/variation/view/
Variation Reporter
www.ncbi.nlm.nih.gov/variation/tools/reporter
45. Diagnostic criteria are met for Marfan syndrome in
this patient
Applied the revised Ghent nosology for diagnosing
Marfan syndrome
Found a pathogenic variation in FBN1
46. What are the guidelines for sports participation?
Address the primary reason for referral
Can he play soccer?
http://www.ncbi.nlm.nih.gov/pubmed/15184297
Recommendations for
this case study
47. ‡Assumes no or only mild aortic dilatation
*Recreational sports are categorized with regard to high, moderate, and low levels of exercise and graded on a relative scale (from 0 to 5) for eligibility, with 0 to 1 indicating generally not advised or strongly discouraged; 4 to 5 indicating probably permitted; and 2 to 3 indicating intermediate and to be assessed clinically on an individual basis.
In practical terms, this means cardiovascular evaluation for structural defects and arrhythmias, possible permission to play soccer if normal, and monitoring over time
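The 0 to 5 eligibility scale in the footnote can be written as a small lookup. This is only a sketch of the grading bands quoted above; the function name is made up for this example, and it is no substitute for the clinical assessment the guideline calls for.

```python
def eligibility(score):
    """Translate a 0-5 recreational sports score into the footnote's wording."""
    if score not in range(6):
        raise ValueError("score must be a whole number from 0 to 5")
    if score <= 1:
        return "generally not advised or strongly discouraged"
    if score <= 3:
        return "intermediate; assess clinically on an individual basis"
    return "probably permitted"
```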
49. Concise, structured reviews about genetic variants
and drug responses
• Includes genetic testing strategy and dosing
recommendations
• Expert-reviewed
• Regularly updated
• Free to access
• Integrated with GTR and MedGen
http://www.ncbi.nlm.nih.gov/books/NBK109194/
53. Quick Description
Case Study
Medical Genetics Summary Resource
Other Variation Resources
Additional General Information
54. How does NCBI calculate clinical significance?
www.ncbi.nlm.nih.gov/variation
Which tool do I use for…?
55. dbSNP
Submitted (ss) and reference (rs)
dbVar
Structural variants (SV)
dbGaP
genome-wide association studies, molecular diagnostic
assays
Controlled access to individual level data
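A quick way to tell which of these resources an identifier belongs to is to look at its prefix. The sketch below assumes the commonly used accession prefixes (rs and ss for dbSNP; nsv/esv, nssv/essv and nstd/estd for dbVar; phs for dbGaP studies). That prefix list is a summary added here rather than something stated on the slide, and the function name is made up for this example.

```python
import re

# (pattern, description) pairs; the first match wins.
ACCESSION_PATTERNS = [
    (re.compile(r"^rs\d+$"), "dbSNP reference SNP (rs)"),
    (re.compile(r"^ss\d+$"), "dbSNP submitted SNP (ss)"),
    (re.compile(r"^[en]ssv\d+$"), "dbVar variant call"),
    (re.compile(r"^[en]sv\d+$"), "dbVar variant region"),
    (re.compile(r"^[en]std\d+$"), "dbVar study"),
    (re.compile(r"^phs\d{6}(\.v\d+(\.p\d+)?)?$"), "dbGaP study"),
]


def classify_accession(accession):
    """Return a description of the resource an accession likely belongs to."""
    cleaned = accession.strip().lower()
    for pattern, description in ACCESSION_PATTERNS:
        if pattern.match(cleaned):
            return description
    return "unrecognized"
```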
56. Quick Description
Case Study
Medical Genetics Summary Resource
Other Variation Resources
Additional General Information
60. Webinar: MedGen, GTR, and ClinVar
Webinar: NCBI Resources and Variant Interpretation Tools
for the Clinical Community
Webinar: NCBI Human Variation and Medical Genetics
Resources
Search ClinVar with Ease
Tutorials: dbGaP
Tutorials: Genetic Testing Registry (GTR)
Using NCBI Data with Tools that Predict the Functional
Impact of Genomic Variants
Explore Gene Pages at NCBI: Variation & Expression
The Variation Viewer
64. My View of Data Transfer Principles
• Metadata Search
• Rapid NoSQL (for now)
• Integration
• Non-ambiguous identifiers
• Transferring Small amounts of Data
• Data still gets transferred in the cloud
• Underlying structure
• Finding specific data from validated formats
• Democratization of Data
• Rapid comparison by domain experts
• Reporting
• Metrics to report data upload and [unique IP] download of datasets
• Post-publication User Review
• The NCBI LinkOut Mechanism as a test suite
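As an illustration of the reporting principle, unique-IP download counts can be derived from an access log in a few lines. The record layout, a list of (dataset, IP address) pairs, is an assumption made for this sketch, and the sample values in the usage note are placeholders.

```python
from collections import defaultdict


def unique_ip_downloads(log_records):
    """Count distinct downloading IP addresses per dataset.

    log_records: iterable of (dataset_id, ip_address) pairs (hypothetical layout).
    """
    ips_by_dataset = defaultdict(set)
    for dataset_id, ip_address in log_records:
        ips_by_dataset[dataset_id].add(ip_address)
    return {dataset: len(ips) for dataset, ips in ips_by_dataset.items()}
```

Repeated downloads from the same address count once, so a call with two records for ("dataset_A", "192.0.2.1") and one for ("dataset_A", "192.0.2.2") reports two unique downloaders.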
119. Introducing… Entrez Direct
The E-utilities on the UNIX command line
esearch -db gene -query "foxp2[gene] AND human[orgn]" |
elink -target protein -name gene_protein_refseq |
efetch -format fasta
ftp.ncbi.nlm.nih.gov/entrez/entrezdirect/
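The same three steps (search, link, fetch) can also be expressed as plain E-utilities web requests. The sketch below only builds the request URLs and sends nothing. It assumes the public E-utilities base URL and the standard esearch, elink and efetch parameters; the helper names are made up for this example, and any IDs passed in are placeholders rather than real search results.

```python
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"


def eutils_url(tool, **params):
    """Build a URL for one E-utilities tool (esearch, elink, efetch, ...)."""
    return f"{EUTILS_BASE}{tool}.fcgi?{urlencode(params)}"


def esearch_url(db, term):
    """Step 1: search a database and get matching record IDs."""
    return eutils_url("esearch", db=db, term=term)


def elink_url(dbfrom, db, linkname, ids):
    """Step 2: follow a named link from one database to another."""
    return eutils_url("elink", dbfrom=dbfrom, db=db, linkname=linkname, id=",".join(ids))


def efetch_fasta_url(db, ids):
    """Step 3: fetch the linked records as FASTA text."""
    return eutils_url("efetch", db=db, id=",".join(ids), rettype="fasta", retmode="text")
```

In practice, the IDs returned by the esearch request would be passed to elink_url with dbfrom="gene", db="protein" and linkname="gene_protein_refseq", and the linked protein IDs would then go to efetch_fasta_url.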
134. NCBI Genomics Hackathon March 20-22 NIH
Campus, Bethesda
BioFrontiers Institute Hackathon May 22-24,
Boulder, CO
NYGC Hackathon June 2017
135. 1) BaseSpace (Illumina)
2) Independent Consultants
3) mothur (for metagenomics)
4) CyVerse (iPlant)
5) NCBI Submission Portal
• If you have another submission platform, or offer this as a service, please send
us an email at webinars@ncbi.nlm.nih.gov, and we will include it in the FTP release notes
137. Easily Import Samples into BaseSpace with the SRA Import App
1. Launch the SRA Import App: launch the SRA Import app from the Apps page
2. Input Accession Numbers: enter SRA Accession number(s) and Continue
3. View Downloaded FASTQs (Samples): after the app completes, view the imported Samples in your BaseSpace project
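Before pasting accession numbers into the import form, it can help to check that they look like SRA accessions. The pattern below is an assumption based on the common accession shapes (SRR/ERR/DRR runs, plus the matching experiment, sample and study prefixes). The slide does not describe which accession types or separators the app accepts, and the function name is made up for this example.

```python
import re

# Assumed shape: S/E/D + R + R/X/S/P, followed by at least six digits.
SRA_ACCESSION = re.compile(r"^[SED]R[RXSP]\d{6,}$")


def split_accessions(text):
    """Split comma- or whitespace-separated input into (valid, invalid) lists."""
    tokens = [token for token in re.split(r"[,\s]+", text.strip()) if token]
    valid = [token for token in tokens if SRA_ACCESSION.match(token)]
    invalid = [token for token in tokens if not SRA_ACCESSION.match(token)]
    return valid, invalid
```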
138. Submit Your Data Using the SRA Submission App
1. Register a BioSample and BioProject with NCBI
2. Launch the SRA Submission App: launch the SRA Submission app from the Apps page
3. Enter your submission information: fill out the input form with your submission and sample details
4. When the app completes, receive email confirmations from NCBI: one upon data receipt and one once the data is available in SRA
140. Originally published in 2009
(doi:10.1128/AEM.01541-09)
Most cited tool for analyzing 16S
rRNA gene sequences
3,410 citations (WoS: 1/8/2016)
Working on 37th release
Overview
100% open source, GPL v3
OS independent
Command line interface
Written in C/C++
http://www.mothur.org
141. Deposition of 16S rRNA gene sequences to the SRA has
been a major problem
Worked with SRA staff to make a customized portal to
simplify submission of PCR-generated 16S rRNA gene
sequences
Command enforces co-submission of sample and
processing metadata
Originally released in March 2015. So far 86 submissions,
61 studies, 6367 runs, 116 Gbp total
http://www.mothur.org/wiki/make.sra
142. 1. Provide the necessary MIMARKS* metadata about samples with get.mimarkspackage
2. Create a project file describing the user and their project using the supplied template file
3. Parse the MIMARKS file, project file, and sff or fastq files to generate an XML file for submission using make.sra
4. Email the SRA to let them know about the submission using the mothur-created files and await further instructions
* http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3367316/
http://www.mothur.org/wiki/Creating_a_new_submission
143. MIMARKS: minimum information about a marker gene
sequence (doi:10.1038/nbt.1823)
Command supplies user with a blank text file with sample
names and necessary metadata for their environment.
User fills in details.
For each environmental package and environment there
is a wiki page with required and optional parameters and
allowed values. Can also be extended to include
additional metadata
http://www.mothur.org/wiki/Get.mimarkspackage
http://www.mothur.org/wiki/Human_gut
144. USERNAME [UserName]
Last [LastName]
First [FirstName]
EMAIL [Email@mail.com]
CENTER [University or Center Name]
TYPE institute
WEBSITE [www.Website.org]
ProjectName [ProjectName]
ProjectTitle [Project Title]
Description [Project Description]
Grant id=[GrantID], agency=[GrantAgency],
title=[GrantTitle]
User completes information in brackets
http://www.mothur.org/wiki/Project_File
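Filling in the bracketed fields by hand is easy to get wrong across many submissions, so the template can also be generated. The sketch below follows the field labels and order shown on the slide. The dictionary keys and function name are made up for this example, the usage note uses placeholder values, and the exact format make.sra expects should be checked against the mothur wiki page above.

```python
# (label in the project file, key in the input dictionary) in slide order.
PROJECT_FIELDS = [
    ("USERNAME", "username"),
    ("Last", "last_name"),
    ("First", "first_name"),
    ("EMAIL", "email"),
    ("CENTER", "center"),
    ("TYPE", "type"),
    ("WEBSITE", "website"),
    ("ProjectName", "project_name"),
    ("ProjectTitle", "project_title"),
    ("Description", "description"),
]


def render_project_file(info):
    """Return project file text with one 'LABEL value' line per field."""
    lines = [f"{label} {info[key]}" for label, key in PROJECT_FIELDS]
    lines.append(
        "Grant id={}, agency={}, title={}".format(
            info["grant_id"], info["grant_agency"], info["grant_title"]
        )
    )
    return "\n".join(lines) + "\n"
```

The returned text can be written to a file and passed to mothur as the project file. A missing key raises KeyError, which flags an incomplete template early.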
145. CyVerse-Enabled SRA Submission Pipeline
• 10-year, $100 million, NSF-funded mandate to support all of Life
Science (formerly The iPlant Collaborative)
• “Transforming science through data-driven discovery.”
• People + Cyberinfrastructure, empowering researchers, and
fostering interoperability
www.cyverse.org jdebarry@cyverse.org
146. GUI connection to CyVerse Data Store
Built on iRODS, supports ~30k users and >1.25 PB of data
A platform that can run almost any bioinformatics application
Seamlessly integrated with data and HPC resources
Point and click tools for data and metadata management
Easy, fast, secure data transfer, management, and analysis
CyVerse Discovery Environment
Submit to SRA via Discovery Environment
• Submission package = sequence data and metadata XML
• Checksums, etc. are automatically created
• Create or update BioProject during SRA submission
• Create BioSample(s) during SRA submission
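For readers curious what the automatic checksum step involves, here is a minimal sketch that computes a checksum for every file in a package folder. MD5 is an assumption, since the slide does not say which algorithm the Discovery Environment uses, and the function names are made up for this example.

```python
import hashlib
from pathlib import Path


def md5sum(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_manifest(folder):
    """Map each file's path (relative to folder) to its MD5 digest."""
    root = Path(folder)
    return {
        str(path.relative_to(root)): md5sum(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }
```

Reading in chunks keeps memory use flat even for multi-gigabyte sequence files.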
147. Submission tutorial (https://goo.gl/LMe5kQ)
with video instructions and example
submission package in the CyVerse wiki
1. Efficient command line and point and
click tools available to transfer data to
the Discovery Environment
CyVerse supports
Apps available to compress data if needed
2. Create submission package folder with
dedicated tool
Single BioProject folder contains one or
more BioSample folders, each with one or
more Library folders
Drag and drop sequence files to organize
submission package
1) Upload data to Discovery Environment
2) Create submission package
3) Enter and save metadata
4) Submit data and metadata to SRA
5) Submission notification from SRA
6) Error Correction (if needed)
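The nesting described for the submission package (one BioProject folder containing BioSample folders, each with one or more Library folders) can be pictured with a short local sketch. It only mirrors the layout on a local disk. The dedicated Discovery Environment tool does the real work, and the function and folder names here are made up for this example.

```python
from pathlib import Path


def create_submission_package(root, project, samples):
    """Create BioProject/BioSample/Library folders and return the library paths.

    samples: mapping of BioSample folder name -> list of Library folder names.
    """
    project_dir = Path(root) / project
    created = []
    for sample, libraries in samples.items():
        for library in libraries:
            library_dir = project_dir / sample / library
            library_dir.mkdir(parents=True, exist_ok=True)
            created.append(library_dir)
    return created
```

Sequence files would then be placed in the Library folder they belong to, which is what the drag-and-drop step accomplishes in the GUI.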
148. 3. BioProject, BioSample, Sequencing Library metadata entered via templates
Choose from available templates to create/update BioProject, appropriate BioSample type
Metadata term guide in the Discovery Environment defines metadata fields
Metadata applied to each folder, copy metadata to folders to limit entry for large submissions
After metadata entry, save single file of metadata for submission package
149. 4. Discovery Environment App requires only top-level BioProject folder and file of
saved metadata as input
App creates XML metadata file and submits to SRA via Aspera Connect
5. SRA uses contact email in submission metadata to transfer ownership to
submitter’s NCBI account
Notification emails for successful submission or necessary error corrections sent to submitter
6. Discovery Environment App can retrieve error report from SRA
To correct, edit metadata and submission package and resubmit from Discovery Environment
153. Pro tip: Before starting your submission, search Google for “biosample template” and collect your metadata
169. My View of Data Transfer Principles
• Metadata Search
• Rapid NoSQL (for now)
• Integration
• Non-ambiguous identifiers
• Transferring Small amounts of Data
• Data still gets transferred in the cloud
• Underlying structure
• Finding specific data from validated formats
• Democratization of Data
• Rapid comparison by domain experts
• Reporting
• Metrics to report data upload and [unique IP] download of datasets
• Post-publication User Review
Editor's notes
Since I’m a little unsure which conditions could be responsible for this patient’s family history, I’ll start by using the MedGen advanced search. [CLICK] From the homepage, I click the link under the main search bar, shown here as ‘advanced’.
Mention ISCA as example of expert curation group
Now… with AMR data!
163 studies with >50,000 cancer patients, generally with matched controls
Make sure you make metadata points here!
All cancer cells arise from a normal somatic cell; therefore, most primary cancers express adequate amounts of HLA.
Identify the specific peptides that mark the tumor as ‘dangerous’.
T cells recognize peptides that are presented by human leukocyte antigen (HLA).
Tumors harbor hundreds of putative neoepitopes.
Without the benefit of information from T cell responses, it is virtually impossible to develop a vaccine, but we can aim at narrowing down the candidate peptides.
BaseSpace is the Illumina cloud-based genomics hub. BaseSpace provides:
Tight instrument integration. In BaseSpace users can prepare NeoPrep sample prep runs, and create and pool libraries for sequencing on the MiniSeq and NextSeq desktop sequencing instruments
Additionally, users can monitor their sequencing runs, from MiniSeq, MiSeq, NextSeq, HiSeq, and HiSeq X, in real-time. View metrics charts including %Q30 and per-lane metrics as the sequencing run progresses
Automatic conversion of raw run data (basecalls, or BCL files) to FASTQ files for use in downstream analysis
Over 70 powerful push-button analysis applications providing solutions for the most popular sequencing applications including RNA-Seq, whole genome resequencing, 16S metagenomics, data quality assessment tools, and more!
Ability to instantly and easily share/transfer data with collaborators and peers
Ability to submit BioSamples to the SRA database
Ability to download BioSamples/BioProjects into BaseSpace from the SRA database
An open platform which allows users to import their own pipelines and tools for use within BaseSpace. Users can keep these “apps” private, share them with peers, or submit them for publication for general use.
An iOS mobile app to enable run and analysis monitoring on-the-go
BaseSpace is free to sign up for and use.
Importing data from the SRA into your BaseSpace account is simple and easy:
Launch the SRA Import app from the Apps page in BaseSpace
Enter the SRA Accession number(s) for the data you wish to import into your account
Note: The SRA Import app currently only accepts data generated on Illumina instruments. We plan to remove this restriction.
Note: There is a maximum import size of 25GB per app session. We plan to remove this restriction soon.
View the downloaded FASTQs (Samples) in your BaseSpace project
Submit your BaseSpace datasets to the SRA without headaches in a few steps:
Register a BioSample and BioProject with the NCBI
Launch the SRA Submission app from the Apps page in BaseSpace
Enter your submission information on the input form. This includes BioProject and BioSample information, BaseSpace samples to submit, and sample-type information
Receive an email from NCBI confirming receipt of your submission
Receive email confirmation that your data is available in SRA
CyVerse introduction slide
CyVerse aims to enable, empower, and train the next generation of Life Scientists
Everything is free to the users.
CyVerse-enabled SRA submissions are made through our flagship GUI, the Discovery Environment
The Discovery Environment is the main interface to the CyVerse Data Store, built on the iRODS software.
From the Discovery Environment, users can manage and analyze big data with hundreds of bioinformatics algorithms
An SRA submission package is composed of sequence data, and a metadata XML
Users can create BioProjects and BioSamples while submitting to the SRA
Users can create an account, load data, and do an SRA submission; they don't have to be regular users.
There are no restrictions associated with that.
Not showing data upload options on slide
SRA submission folder tool in the Discovery Environment creates folders for BioProject, BioSamples, and Sequencing libraries.
Tiny URLs can be sent to people
Metadata definitions in term guide harvested from SRA webpages
Metadata fields are pulled directly from BioSample, etc...
Ontologies are linked in metadata guide
3 Apps total: One for creating a BioProject, one for updating an existing BioProject, one for retrieving SRA submission report
If error correction is needed, users can edit submission package in Discovery Environment and resubmit there