Grand Challenges in Genomics
A Joint NHGRI and Wellcome Trust Strategic Meeting
25 and 26 February 2019
https://www.wellcomeevents.org/WELLCOME/media/uploaded/EVWELLCOME/event_661/Draft_agenda_for_WT_December_2018.pdf
Joint lecture: Nicky Mulder, Han Brunner and Joaquin Dopazo
2. Areas of health ecosystem, role of genomics
[Diagram: genomics across the health ecosystem. Recoverable labels:]
• Areas: treatment; screening/surveillance; diagnostics, risk assessment; pathogen outbreak & response; risk stratification; drug discovery
• Data: genomics/omics; microbiome; environmental exposures; mHealth
• Infrastructure: biorepositories; databases; health systems
• Tools: genetic tests; omics facilities; analysis tools; interpretation tools; clinical facilities
3. Other applications of genomics
• Agriculture (>1.2 million species of plants & animals)
• Animal and plant health
• Food security
• Aquaculture
• Biodiversity (Earth biogenome project)
• Bio-products
• …..
4. General requirements for exploiting genomics data (for health)
• Consent from participants
• Research perspective
    • Reference datasets: need background information on the healthy/normal state
    • Access to enough samples
    • Access to other data
    • Good phenotyping
    • Data management and analysis tools
• Clinical perspective
    • Adequate screening and diagnostic tools (array versus WGS)
    • Access to latest research in the field
    • Evidence for genotype-phenotype link
    • Evidence of clinical actionability
    • Resources and skills for data analysis: training in genomics, genomic medicine, clinical data interpretation, data governance and ethics, genetic counselling
5. Technical requirements for genomics data
• Reference datasets and (meta)databases
• Mobile device data collection and integration
• Integration of genomic data with EHRs
• Clinical decision support tools
• Data must be well curated, including provenance
• Data must be harmonized or standardized
• Data storage facilities
• Data transfer facilities
• Data submission (& utilization) facilities
• Authentication and authorization
• Training in all data related skills
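The curation and provenance requirement in the list above can be illustrated with a minimal sketch. It assumes that provenance means, at least, a checksum plus a note of where a file came from and how it was produced. The field names, `site-A` and `caller-x 1.0` are hypothetical placeholders, not part of any standard named in the talk.

```python
import hashlib
import json
from datetime import datetime, timezone


def provenance_record(data: bytes, source: str, pipeline: str) -> dict:
    """Describe a data file: what it is, where it came from, how it was made."""
    return {
        "sha256": hashlib.sha256(data).hexdigest(),  # detects silent corruption
        "size_bytes": len(data),
        "source": source,      # hypothetical: submitting lab or site
        "pipeline": pipeline,  # hypothetical: tool and version that made the file
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


def verify(data: bytes, record: dict) -> bool:
    """Return True if the data still matches its recorded checksum."""
    return hashlib.sha256(data).hexdigest() == record["sha256"]


# Toy payload standing in for a variant file.
payload = b"#CHROM\tPOS\tREF\tALT\n1\t12345\tA\tG\n"
record = provenance_record(payload, source="site-A", pipeline="caller-x 1.0")
print(json.dumps(record, indent=2))
print("intact:", verify(payload, record))
```

Storing such a record next to every transferred or submitted file lets a receiving repository check integrity without trusting the transfer channel.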
6. Limits to sharing and reuse of genomics data
• Clinical data:
    • Privacy
    • Fear of it getting into the hands of the wrong people (medical insurance)
• Human genetic data:
    • Anonymized, but risk of identification
    • History of vulnerable populations and exploitation
• Pathogen data:
    • Potential for discovery or commercialization?
• Research perspective for data reuse:
    • Recognizing the contribution of researchers who generated the data
    • Maximizing the timely availability of research data
    • Ensuring responsible secondary use of data
    • Robust data sharing model with implementation strategy for data access and transfer, data access agreements and MoUs
7. Barriers to accelerating benefits of genomics and data sharing in Africa (LMICs)
• Data infrastructure challenges:
    • Data transfer and storage
    • Data processing, analysis and interpretation
    • Data curation and submission, skills for access
• Previous exploitation: fear of loss of scientific discovery
• Conservative ethics review system that considers specific consent to be sacred
• Resource limitations (tools and skills) for analysing and exploiting data
• Sample sizes (budget)
• Researchers & clinicians usually don’t budget for data
• Metadata is not well curated; data quality and accuracy are not a high priority for clinicians and some researchers
• Insufficient training in data management and analysis
• Intellectual property rights management
• Capacity for innovation and translation (practically)
8. Technical solutions for global use of genomics data
• Improved resources for safe, responsible sharing of data (is local repository recognised?)
• Data compression formats
• Tiered access to data:
    • Complete data files available under controlled access (e.g. EGA, dbGaP)
    • Pooled summary data available under restricted access: EGA or locally managed website for registered users?
    • Minimum data available with no restrictions, e.g. Beacons (need extensions)
• Is it findable? Metadata available for searching in catalogues
• Authentication and authorization
• Federated data analysis tools (though still some limitations if raw data cannot be moved)
• New tools
    • New genotyping arrays: screening versus discovery
    • Reference graphs
    • Variant calling
    • Meta analysis
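The tiered access model listed above can be sketched as a simple permission check. This is an illustration under stated assumptions, not how EGA, dbGaP or any Beacon implementation works: the tier names, the `CATALOGUE` entries and both functions are hypothetical.

```python
from enum import IntEnum


class Tier(IntEnum):
    OPEN = 0        # minimum data, no restrictions (Beacon-style yes/no answers)
    REGISTERED = 1  # pooled summary data for registered users
    CONTROLLED = 2  # complete data files, approved applications only


# Hypothetical catalogue: each resource is labelled with the tier it requires.
CATALOGUE = {
    "variant_presence": Tier.OPEN,
    "allele_frequencies": Tier.REGISTERED,
    "individual_genotypes": Tier.CONTROLLED,
}


def can_access(user_tier: Tier, resource: str) -> bool:
    """A user may read a resource if their tier is at least the one required."""
    return user_tier >= CATALOGUE[resource]


def beacon_query(observed_variants: set, variant: str) -> bool:
    """Answer only 'has this variant been observed?', never who carries it."""
    return variant in observed_variants


observed = {"1:12345:A:G", "2:67890:C:T"}  # toy variant identifiers
print(beacon_query(observed, "1:12345:A:G"))
print(can_access(Tier.REGISTERED, "individual_genotypes"))
```

Keeping the open tier to a yes/no answer is what allows it to be offered without restrictions. Richer answers (counts, frequencies) move a query up a tier.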
9. Future perspective
• Embrace the 4th Industrial Revolution: fusing physical, digital and
biological
• Big data available, use AI to turn it into information then knowledge
(clinical decision support, biomarker discovery, drug repositioning)
• Is priority data generation or harmonizing and using what we have
already so that new data can be more rapidly interpreted?
• Move to functional/validation studies
• Look at the bigger picture (genotype + epigenetics + phenotype +
environment + microbiome + other omics)
• Improve the interface between the data generator, data analyst and end user (e.g. clinician): interpreters and interpretation tools
• Go global
    • Population (& pathogen) migration and admixture
    • Population comparisons (common versus novel but relevant)
• Lift as you rise!
10. The future of Genomics in the
clinical space
Han G. Brunner
Grand Challenges
London February 2019
11. Let’s prove that Genomics diagnostics pays for itself
Now is the time to study the environment
Saturation Genome Editing to understand the coding genome
Towards Universal Genomic NIPT
Preconception Carrier screening for consanguineous couples
How much Genomic quality control of IVF embryos?
From trait to state?
15. Now is the time to study the environment
It often takes more than just bad habits to have a disease
Alpha 1 antitrypsin deficiency + smoking = emphysema
G6PD deficiency + fava beans = hemolysis
Pharmacogenetics
DPYD deficiency + 5-FU = liver failure
16. Now is the time to study the environment
Job Verdonschot unpublished
17. Now is the time to study the environment
It may take more than just bad habits to have a disease
Alpha 1 antitrypsin deficiency + smoking = emphysema
G6PD deficiency + fava beans = hemolysis
Pharmacogenetics
DPYD deficiency + 5-FU = liver failure
Titin mutation + Environment = Cardiomyopathy
18. We need Saturation Genome Editing to understand the
human coding genome
There is not enough observational evidence to guide
diagnostics
19. We need Saturation Genome Editing to understand
the human coding genome
There is not enough observational evidence to guide
diagnostics
Every clinical exome generates ~25 new UNIQUE variants
~25% of missense variants in disease genes are Variants of Unknown Significance (VUS)
22. >60% of SEVERE HANDICAP is caused by new mutations
Gilissen et al., Nature, 2014
[Pie chart labels: No diagnosis; De novo SNVs; De novo SVs; Inherited recessive 2%; De novo 60%]
24. So should we offer prenatal testing to everyone?
We cannot prevent de novo mutations
Non-invasive prenatal testing (NIPT)
Why offer this for Down syndrome (risk 1/1000)
and not for other forms of intellectual disability (ID) (risk 5/1000)?
Towards universal NIPT by Genome
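The comparison on this slide can be made concrete with a short calculation. It takes the two risks exactly as quoted, reads them as per-pregnancy probabilities, and applies them to an assumed cohort of 100,000 pregnancies. The cohort size is an illustrative assumption, and nothing here models test sensitivity or specificity.

```python
from fractions import Fraction

# Risks as quoted on the slide, read here as per-pregnancy probabilities.
RISK_DOWN_SYNDROME = Fraction(1, 1000)
RISK_OTHER_ID = Fraction(5, 1000)

PREGNANCIES = 100_000  # assumed cohort size, for illustration only


def expected_cases(risk: Fraction, n: int) -> Fraction:
    """Expected number of affected pregnancies in a cohort of size n."""
    return risk * n


down = expected_cases(RISK_DOWN_SYNDROME, PREGNANCIES)
other = expected_cases(RISK_OTHER_ID, PREGNANCIES)

print(f"Down syndrome: {down} expected cases")
print(f"Other forms of ID: {other} expected cases")
print(f"Ratio (other ID / Down syndrome): {other / down}")
```

Under these figures, screening only for Down syndrome addresses one sixth of the combined expected cases.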
29. Preconception Carrier Testing is efficacious for consanguineous couples
And the risks are high
7/22 couples tested prospectively are at 25% risk
[Figure labels: COL7A1, ZMPSTE24]
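The 25% figure is the textbook recurrence risk when both partners carry a pathogenic variant in the same autosomal recessive gene. The sketch below enumerates the four equally likely allele combinations and combines the result with the 7/22 share from the slide. It assumes autosomal recessive inheritance with full penetrance; the allele labels `A` and `a` are illustrative.

```python
from fractions import Fraction
from itertools import product


def affected_fraction(parent1=("A", "a"), parent2=("A", "a")) -> Fraction:
    """Fraction of offspring genotypes that are homozygous for the variant 'a'."""
    offspring = list(product(parent1, parent2))  # one allele from each parent
    affected = [g for g in offspring if g == ("a", "a")]
    return Fraction(len(affected), len(offspring))


per_pregnancy = affected_fraction()  # both parents are carriers
couples_at_risk = Fraction(7, 22)    # share of tested couples, from the slide

print(f"Risk per pregnancy for a carrier couple: {per_pregnancy}")
print(f"Share of tested couples at risk: {float(couples_at_risk):.1%}")
print(f"Average risk per pregnancy across all tested couples: {couples_at_risk * per_pregnancy}")
```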
35. Can Medical Genomics move from Trait to State?
Can we think of things other than cancer that we
would want to catch early?
Genomics in the clinical space
38. Personalized medicine: current scenario
[Figure: stages of medicine by increasing degree of personalization, from today to tomorrow]
• Intuitive medicine: intuitive, based on trial and error
• Empirical medicine: identification of probabilistic patterns
• Precision medicine: decisions and actions based on knowledge
[Figure labels: Data generation; Knowledge generation; Clinical use of data]
Future challenges in the way in which we generate data, information and knowledge, and the way in which we store and manage them
39. Personalized medicine
• First tier: use of patient genomic data for precision diagnosis (typically RDs) and treatment recommendation (cancer).
    • Extensively implemented in hospitals
    • Requires information on gene-to-phenotype association
• Second tier: use of clinical data (EHR) along with genomic data for preventive medicine and biomarker discovery.
    • Andalusian Population Health Database, with over 12M people since 2001.
    • Aim: converting the whole Public Health System (SAS) into a huge prospective clinical study (GDPR compliance within SAS)
40. Transition to models that integrate omic and clinical data
[Diagram labels: Genome, Clinic, Clinical study]
• Treatment of genomic data for research purposes (GDPR)
• Principle of use of minimal personal data
• Data pseudonymization
Today’s knowledge generation:
• Each study requires a specific genomic and clinical data collection into an external database
• Serious security concerns (genomic + clinical data outside the hospital)
• Static clinical data (e.g. if a control becomes a case, the external DB will not be updated)
• Limited genomic data reuse for purposes different from the original study
Possibilities in systems with universal EHR:
[Diagram labels: Genome, Clinic, Query engine, Study 1 … Study n]
• Clinical data dynamically associated to genomic data
• Possibility of many clinical studies by reanalyzing genomic data under diverse perspectives (with no extra investment)
• Growing genomic DB with increasing study possibilities
• The whole health system becomes an enormous potential prospective clinical study
Possibly the largest database ever created with detailed clinical data, storing information on 12,083,681 patients since 2001
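The contrast on this slide, a static external copy versus clinical data dynamically associated with genomic data, can be sketched in a few lines. Everything below is a toy illustration under stated assumptions: the keyed-hash pseudonymization, the `SECRET_KEY`, the patient IDs and the variant label are hypothetical and do not describe the Andalusian system.

```python
import hashlib
import hmac

SECRET_KEY = b"held-inside-the-hospital"  # hypothetical key, never exported


def pseudonym(patient_id: str) -> str:
    """Keyed hash: stable per patient, not reversible without the key."""
    digest = hmac.new(SECRET_KEY, patient_id.encode(), hashlib.sha256)
    return digest.hexdigest()[:16]


# Toy stores standing in for the genomic DB and the live EHR.
genomes = {
    pseudonym("P001"): {"GENE1:var"},
    pseudonym("P002"): {"GENE1:var"},
    pseudonym("P003"): set(),
}
clinical = {
    pseudonym("P001"): "control",
    pseudonym("P002"): "case",
    pseudonym("P003"): "control",
}


def carriers_by_status(variant: str) -> dict:
    """Count carriers of a variant by their *current* clinical status."""
    counts = {}
    for pid, variants in genomes.items():
        if variant in variants:
            status = clinical[pid]
            counts[status] = counts.get(status, 0) + 1
    return counts


before = carriers_by_status("GENE1:var")
clinical[pseudonym("P001")] = "case"  # the control later becomes a case
after = carriers_by_status("GENE1:var")
print(before, after)
```

Because the query reads the clinical record at query time, the change of status is picked up without re-collecting data. An external study database frozen at collection time would still count that patient as a control.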
41. Possible future models for large-scale data sharing
[Diagram: two models, federated and external repository. Labels: Genomic, Clinic, Study 1 … Study n, Risk, Data encryption]
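As a minimal sketch of the federated model, the example below pools an allele frequency across two sites while only summary counts leave each site. The function names and the toy genotype lists are assumptions for illustration. Real federated analyses face further constraints, for example small counts that can themselves be identifying.

```python
def local_summary(genotypes):
    """Runs inside each site. Genotypes are alt-allele counts (0, 1 or 2) per person."""
    return {"alt_alleles": sum(genotypes), "total_alleles": 2 * len(genotypes)}


def pooled_frequency(summaries):
    """Runs at the coordinator, which sees only the per-site summaries."""
    alt = sum(s["alt_alleles"] for s in summaries)
    total = sum(s["total_alleles"] for s in summaries)
    return alt / total


site_a = [0, 1, 0, 2]        # raw data, stays at site A
site_b = [1, 1, 0, 0, 0, 0]  # raw data, stays at site B

freq = pooled_frequency([local_summary(site_a), local_summary(site_b)])
print(f"Pooled alternate allele frequency: {freq}")
```

The pooled value equals what a central analysis of the combined raw data would give, which is why simple counts and frequencies federate well.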
42. Data generation
$1 genome?
• DNA sequencing prices will soon be comparable to any other conventional test.
• Actually, they already are, if the whole treatment is considered
[Figure labels: Panel, WES, WGS]
Obstacles: lack of information about most of the findings
43. Generation of information: rare diseases and
cancer
About 6000 rare diseases, over 80% with a genetic cause
In less than 5 years most of the rare variation will be known
Today 1-3 new therapies enter hospitals every quarter
In less than 5 years WES and WGS will increase therapeutic options for patients
RD & cancer are genetic diseases with strong penetrance
44. Generation of information: the reality of cancer and
complex diseases beyond biomarkers
• Conventional single-gene biomarkers have demonstrated clinical utility. However, their success is purely probabilistic and often modest, and they frequently lack any mechanistic anchoring to the fundamental cellular processes responsible for the disease or therapeutic response. Modular nature of genetic diseases: causative genes for the same or phenotypically similar diseases may generally reside in the same biological module. More sophisticated biomarkers (mechanistic models of cell activity) need to be considered.
• Complex diseases: complex genetics plus the strong role of the environment; other omics need to be considered (transcriptomics, methylomics, metabolomics, human microbiome…)
Mechanistic models will play a major role as dynamic biomarkers. They will predict the effect of interventions.
Environment constantly influences cell behaviour, causing changes in the epigenome, transcriptome, metabolome and microbiome that must be dynamically interpreted in the context of the genomic profile and related to phenotypes: multi-omic data integration
Currently, only one third of the genome can be modelled
45. Knowledge generation (AI in medicine)
Topol, 2019, Nat. Med.
[Figure labels: Variables, Samples; Curse of dimensionality; Optimal ML scenario]
Learning biological knowledge from the data is currently quite complex. New methods for feature selection, dimensionality reduction, multi-view learning and network learning need to be developed.
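One well-known symptom of the curse of dimensionality is that distances between samples become nearly indistinguishable as the number of variables grows. The sketch below measures this on uniform random data. It is a generic illustration, not an analysis from the talk; the function name, point count and dimensions are arbitrary choices.

```python
import math
import random


def distance_contrast(n_points: int, n_dims: int, seed: int = 0) -> float:
    """Relative gap between the farthest and nearest neighbour of a random query."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(n_dims)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(n_dims)]
        dists.append(math.dist(query, point))
    return (max(dists) - min(dists)) / min(dists)


# Fixed number of samples, growing number of variables.
for dims in (2, 10, 100, 1000):
    print(dims, round(distance_contrast(100, dims), 2))
```

With few variables the nearest sample is far closer than the farthest one. With many variables and the same number of samples the contrast collapses, which is part of why feature selection and dimensionality reduction are listed as needs.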
46. Precision systems medicine
[Figure: stages of medicine by increasing degree of personalization, from today to tomorrow]
• Intuitive medicine: intuitive, based on trial and error
• Empirical medicine: identification of probabilistic patterns
• Systems medicine: decisions and actions based on knowledge
The real disruption will come with the leap from empirical medicine, based on pattern identification, to systems medicine, based on mechanistic biological knowledge.
Mechanistic models of cells, organs, etc. will allow even the management of new, yet unseen pathologic scenarios.