17. How often are we hurt by going from
the particular to the general
in very complex systems driven by context?
Is this going from the particular to the general
a central problem in
Hypothesis Driven Biomedical Research?
How often do we inappropriately praise
findings that go on to have awkward adjacencies?
30. Open Social Media allows citizens and experts to use gaming to solve problems
31. 1- Now possible to generate massive amounts of human “omics” data
2- Network Modeling Approaches for Diseases are emerging
3- IT Infrastructure and Cloud compute capacity allow
a generative open approach to biomedical problem solving
4- Nascent Movement for patients to Control Sensitive information
allowing sharing
5- Open Social Media allows citizens and experts to use gaming to
solve problems
A HUGE OPPORTUNITY -- A HUGE RESPONSIBILITY
32. We focus on a world where biomedical research is about
to fundamentally change. We think it will often be
conducted in an open, collaborative way where teams of
teams far beyond the current guilds of experts will
contribute to making better, faster, relevant discoveries.
34. 1) Identifying key disease systems and genes- Alzheimer’s
Gaiteri et al.
1.) Identify groups of genes that move together – coexpressed “modules”
- correlated expression of multiple genes across many patients
- coexpression calculated separately for Disease/healthy groups
- these gene groups are often coherent cellular subsystems, enriched in one or
more GO functions
Example “modules” of coexpressed genes, color-coded
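A minimal sketch of the module idea above, not the authors' pipeline: it correlates each pair of genes across patients, keeps pairs above a cutoff, and reads modules off as connected components. The gene names, the toy values, and the 0.8 threshold are assumptions for illustration; published analyses typically use hierarchical clustering of a weighted network instead of a hard cutoff.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def coexpression_modules(expression, threshold=0.8):
    """Group genes whose |correlation| across patients reaches `threshold`.

    `expression` maps gene name -> list of values, one per patient.
    Modules are connected components of the thresholded correlation graph.
    """
    genes = list(expression)
    neighbors = {g: set() for g in genes}
    for g1, g2 in combinations(genes, 2):
        if abs(pearson(expression[g1], expression[g2])) >= threshold:
            neighbors[g1].add(g2)
            neighbors[g2].add(g1)

    modules, seen = [], set()
    for gene in genes:
        if gene in seen:
            continue
        stack, component = [gene], set()
        while stack:  # depth-first walk over the correlation graph
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbors[current] - component)
        seen |= component
        modules.append(sorted(component))
    return modules


# Toy data: hypothetical gene names, five patients each.
toy_expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],
    "geneD": [10.0, 2.2, 8.1, 3.9, 6.0],
}
modules = coexpression_modules(toy_expression)
print(modules)
```

To mirror the slide, the same function would be run once on the disease group and once on the healthy group, and the resulting modules compared.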
35. 1) Identifying key disease systems and genes- Alzheimer’s
1.) Identify groups of genes that move together – coexpressed “modules”
2.) Prioritize the disease-relevance of the modules by clinical and network measures
Prioritize modules through expression synchrony with clinical
measures or tendency to reconfigure themselves in disease
36. 1) Identifying key disease systems and genes- Alzheimer’s
1.) Identify groups of genes that move together – coexpressed “modules”
2.) Prioritize the disease-relevance of the modules by clinical and network measures
3.) Incorporate genetic information to find directed relationships between genes
Prioritize modules through expression synchrony with clinical measures or tendency
to reconfigure themselves in disease
Infer directed/causal relationships and clear hierarchical structure by
incorporating eSNP information (no hair-balls here)
37. 1) Identifying key disease systems and genes- Alzheimer’s
Example network finding: microglia activation
Module selection – what identifies these modules as relevant to Alzheimer’s disease?
The eigengene of a module of ~400 probes correlates with Braak score, age, cognitive
disease severity and cortical atrophy. Members of this module are on average differentially
expressed (both up- and down-regulated).
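One way to picture the eigengene test above is the following sketch, under stated assumptions: the eigengene is taken as the first principal component of the standardized module (found here by power iteration), and it is then correlated with a clinical score. The three-gene module and the score values are invented toy data, and the sign of the eigengene is fixed so that the first gene loads positively.

```python
from math import sqrt


def standardize(values):
    """Center to mean 0 and scale to sample standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def pearson(x, y):
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)


def module_eigengene(module_expression, iterations=100):
    """First principal component of a module, one value per patient.

    `module_expression` is a list of per-gene lists (same patients, same order).
    Power iteration on the gene-gene correlation matrix gives gene loadings;
    projecting the standardized data onto them gives the eigengene.
    """
    z = [standardize(gene) for gene in module_expression]
    n_genes, n_patients = len(z), len(z[0])
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / (n_patients - 1)
             for j in range(n_genes)] for i in range(n_genes)]
    # Start from the first gene so its loading stays positive.
    loadings = [1.0] + [0.0] * (n_genes - 1)
    for _ in range(iterations):
        new = [sum(corr[i][j] * loadings[j] for j in range(n_genes))
               for i in range(n_genes)]
        norm = sqrt(sum(v * v for v in new))
        loadings = [v / norm for v in new]
    return [sum(loadings[g] * z[g][p] for g in range(n_genes))
            for p in range(n_patients)]


# Toy module: two up-regulated genes and one down-regulated gene (hypothetical).
clinical_score = [0, 1, 2, 3, 4, 5]
module = [
    [1.0, 2.1, 2.9, 4.2, 5.1, 5.8],
    [2.0, 2.9, 4.1, 5.0, 6.2, 6.9],
    [9.0, 8.1, 6.8, 6.1, 5.0, 4.2],
]
eigengene = module_eigengene(module)
trait_correlation = pearson(eigengene, clinical_score)
print(round(trait_correlation, 3))
```

Because the down-regulated gene receives a negative loading, a module with members moving in both directions can still yield a single eigengene that tracks the clinical measure.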
Evidence these modules are related to microglia function
The members of this module are enriched with GO categories (p<.001) such as “response to biotic
stimulus” that are indicative of immunologic function for this module.
The microglia markers CD68 and CD11b/ITGAM are contained in the module (this is rare – even when a
module appears to represent a specific cell-type, the histological markers may be lacking).
Numerous key drivers (SYK, TREM2, DAP12, FC1R, TLR2) are important elements of microglia signaling.
Alzgene hits found in co-regulated microglia module:
38. 1) Identifying key disease systems and genes- Alzheimer’s
Figure key:
Five main immunologic families
found in Alzheimer’s-associated
module
Square nodes in surrounding network
denote literature-supported nodes.
Node size is proportional to
connectivity in the full module.
Core family members are shaded.
(Interior circle) Width of
connections between the 5
immune families is
linearly scaled to the
number of inter-family
connections.
Labeled nodes are either highly
connected in the original network,
implicated by at least 2 papers as
associated with Alzheimer’s disease,
or core members of one of the 5
immune families.
39. 1) Identifying key disease systems and genes- Alzheimer’s
Transforming networks into biological hypotheses
40. 1) Identifying key disease systems and genes- Alzheimer’s
Design-stage AD projects at Sage
Fusing our expertise in… Gene regulatory networks
Diffusion Spectrum Imaging
Feedback
Microcircuits &
neuronal diversity
Join us in uniting genes, circuits and regions
to build multi-scale biophysical disease models.
Contact chris.gaiteri@sagebase.org
41. 2) Identifying genetic biomarkers of statin response from
cellular expression changes in treated LCLs
Clinical simvastatin trial Cellular Simvastatin exposure
Control
2M simvastatin
N=480
N=944,
P<0.0001 Genotypes
N=587
P<0.0001
Differential eQTL analysis
Identifying local “cis” acting genetic effects
Differential network analysis
Identifying “trans” acting genetic effects.
Lara Mangravite
42. Differential eQTL analysis identifies loci for which genetic association
with gene expression is altered by statin treatment
Control (AA, AG, GG): log10BF=0.52
Simvastatin (AA, AG, GG): log10BF=7.1*
Difference, Control vs. Simvastatin (AA, AG, GG): log10BF=5.7*
Diff-eQTL locus is associated with reduced incidence of statin-induced
myopathy
Lara Mangravite
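The differential eQTL idea can be sketched as follows. This is an illustration only: the slide reports log10 Bayes factors, whereas this toy version swaps in ordinary least-squares slopes of expression on genotype dosage (0, 1, 2 copies of one allele). All values and names are assumptions.

```python
def slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def differential_eqtl(dosage, control_expr, treated_expr):
    """Genotype effect on expression in each condition and on the paired change.

    Each list has one entry per cell line, in the same order.
    """
    paired_change = [t - c for c, t in zip(control_expr, treated_expr)]
    return {
        "control": slope(dosage, control_expr),
        "treated": slope(dosage, treated_expr),
        "difference": slope(dosage, paired_change),
    }


# Toy data: six cell lines, two per genotype (AA=0, AG=1, GG=2).
dosage = [0, 0, 1, 1, 2, 2]
control_expr = [5.0, 5.2, 5.1, 4.9, 5.0, 5.2]   # no genotype effect
treated_expr = [5.0, 5.2, 6.1, 5.9, 7.0, 7.2]   # effect appears under treatment
result = differential_eqtl(dosage, control_expr, treated_expr)
print(result)
```

Regressing the paired change on genotype gives the same slope as subtracting the two per-condition slopes, which is why a locus with no control effect but a clear treated effect stands out in the difference.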
43. Differential network analysis:
By integrating statin-mediated
changes in gene correlation with
eQTLs, we identify genes
predicted to alter cholesterol
homeostasis and lipoprotein
metabolism.
(including one involved in creatine biosynthesis)
78.1±8.0% gene knockdown, Huh7 cells
Knockdown of candidate gene in
hepatocytes confirms alterations in
lipoprotein metabolism
Partial correlation,
FDR=5% and PP>0.90 Lara Mangravite
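For readers unfamiliar with partial correlation, here is a minimal sketch of the first-order case: the correlation between two genes after removing what a third variable explains in both. It uses the standard formula from pairwise correlations; the FDR and posterior-probability filtering named on the slide is not reproduced, and the data are invented.

```python
from math import sqrt


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def partial_correlation(x, y, z):
    """Correlation of x and y controlling for z (first-order formula)."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data: genes x and y both follow a shared driver z plus unrelated noise.
driver = [1, 2, 3, 4, 5, 6, 7, 8]
gene_x = [1.5, 1.5, 3.5, 3.5, 5.5, 5.5, 7.5, 7.5]
gene_y = [1.5, 2.5, 2.5, 3.5, 5.5, 6.5, 6.5, 7.5]

raw = pearson(gene_x, gene_y)
partial = partial_correlation(gene_x, gene_y, driver)
print(round(raw, 3), round(partial, 3))
```

The raw correlation is high because both genes follow the driver; the partial correlation is close to zero, which is the kind of indirect link that partial-correlation networks are meant to prune.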
44. 3) Classification of transporter-mediated hepatotoxicity
Bile Salt Exporter BSEP (Amgen)
1. Characterization of differential expression following compound exposures in rat liver
2. Classification of response to compounds by BSEP Inhibitor Status (rat IC50)
3. Development of classifier for predicting BSEP inhibition of unknown compounds
4. Validation: AUC=0.98, 5-fold cross-validation
Mangravite, Jang, Mecham, Derry
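The validation step can be illustrated with a small sketch of 5-fold cross-validated AUC. The classifier here is a nearest-centroid score chosen for brevity, not the classifier used in the study, and the two-feature "expression signatures" and labels are invented.

```python
def auc(scores, labels):
    """Probability that a random positive scores above a random negative."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def centroid(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cross_validated_auc(features, labels, k=5):
    """Score every sample with a model trained on the other folds, then pool."""
    scores = [0.0] * len(labels)
    for fold in range(k):
        train = [i for i in range(len(labels)) if i % k != fold]
        test = [i for i in range(len(labels)) if i % k == fold]
        pos_c = centroid([features[i] for i in train if labels[i] == 1])
        neg_c = centroid([features[i] for i in train if labels[i] == 0])
        for i in test:
            # Higher score = closer to the inhibitor centroid.
            scores[i] = sq_dist(features[i], neg_c) - sq_dist(features[i], pos_c)
    return auc(scores, labels)


# Toy data: 1 = inhibitor, 0 = non-inhibitor (hypothetical compounds).
labels = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
features = [
    [2.0, 1.8], [0.1, 0.2], [1.9, 2.2], [-0.2, 0.1], [2.1, 2.0],
    [0.0, -0.1], [1.8, 1.9], [0.2, 0.0], [2.2, 2.1], [-0.1, 0.3],
]
cv_auc = cross_validated_auc(features, labels)
print(cv_auc)
```

Each compound is scored only by a model that never saw it during training, so the pooled AUC estimates performance on unknown compounds.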
45. How It All Fits Together
Diagram (2009-2010): Synapse; FEDERATION; Access to Data Sets; DREAM Challenges; Portable Legal Consent; BRIDGE; Data Generation; Data Activation; On-Line Open Generative Communities
46. How It All Fits Together
Diagram (2010-2011): FEDERATION; Synapse; DREAM Challenges; Portable Legal Consent; BRIDGE; Data Generation; Data Activation; On-Line Open Generative Communities
47. TECHNOLOGY PLATFORM
two approaches to building common scientific knowledge
Text summary of the completed project
Assembled after the fact
Every code change versioned
Every issue tracked
Every project the starting point for new work
All evolving and accessible in real time
Social Coding
48. Synapse is GitHub for Biomedical Data
• Every code change versioned
• Every issue tracked
• Every project the starting point for new work
• Social/Interactive Coding
• Data and code versioned
• Analysis history captured in real time
• Work anywhere, and share the results with anyone
• Social/Interactive Science
49. Data Analysis with Synapse
Run Any Tool
On Any Platform
Record in Synapse
Share with Anyone
50. “Synapse is a nascent compute
platform for transparent, reproducible,
and modular collaborative research.”
52. Download analysis and meta-analysis
Download another Cluster Result Download Evaluation and view more stats
• Perform Model averaging
• Compare/contrast models
• Find consensus clusters
• Visualize in Cytoscape
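As an illustration of the "find consensus clusters" step, the sketch below merges genes that fall in the same cluster in more than half of several downloaded cluster results. The majority rule, the gene names, and the input format (one dict of gene to cluster label per result) are assumptions, not Synapse functionality.

```python
from itertools import combinations


def consensus_clusters(clusterings, min_agreement=0.5):
    """Merge items that co-cluster in more than `min_agreement` of the results.

    Each clustering maps item -> cluster label; labels need not match
    between results, only co-membership is compared.
    """
    items = sorted(clusterings[0])
    parent = {item: item for item in items}

    def find(item):
        # Union-find root lookup with path halving.
        while parent[item] != item:
            parent[item] = parent[parent[item]]
            item = parent[item]
        return item

    for a, b in combinations(items, 2):
        together = sum(c[a] == c[b] for c in clusterings)
        if together / len(clusterings) > min_agreement:
            parent[find(a)] = find(b)

    groups = {}
    for item in items:
        groups.setdefault(find(item), []).append(item)
    return sorted(groups.values())


# Three hypothetical cluster results over the same four genes.
results = [
    {"g1": 0, "g2": 0, "g3": 1, "g4": 1},
    {"g1": "a", "g2": "a", "g3": "b", "g4": "b"},
    {"g1": 0, "g2": 1, "g3": 1, "g4": 1},
]
consensus = consensus_clusters(results)
print(consensus)
```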
54. Objective assessment of factors influencing model
performance (>1 million predictions evaluated)
Cross validation prediction accuracy (R2): Sanger (130 compounds), CCLE (24 compounds)
Prediction accuracy improved by…
Not discretizing data
Including expression data
Elastic net regression
In Sock Jang
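For context on the elastic net result, here is a compact sketch of elastic net regression fitted by coordinate descent, with R2 computed on the training data for brevity (the slide reports cross-validated R2). The penalty values and the toy data are assumptions; with `alpha=1` and a constant feature column the update would divide by zero, so the sketch assumes `alpha` below 1.

```python
def soft_threshold(value, penalty):
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def elastic_net(X, y, lam=0.1, alpha=0.5, iterations=100):
    """Minimize (1/2n)*RSS + lam*(alpha*|b|_1 + (1-alpha)/2*|b|_2^2).

    X is a list of rows. Columns are centered internally and the
    intercept is left unpenalized.
    """
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    cols = [[row[j] - col_means[j] for row in X] for j in range(p)]
    coef = [0.0] * p
    residual = [v - y_mean for v in y]
    for _ in range(iterations):
        for j in range(p):
            col = cols[j]
            z = sum(v * v for v in col) / n
            # Fit of feature j to the residual with its own contribution added back.
            rho = sum(c * r for c, r in zip(col, residual)) / n + z * coef[j]
            new = soft_threshold(rho, lam * alpha) / (z + lam * (1 - alpha))
            residual = [r - c * (new - coef[j]) for r, c in zip(residual, col)]
            coef[j] = new
    intercept = y_mean - sum(c * m for c, m in zip(coef, col_means))
    return intercept, coef


def r_squared(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1 - ss_res / ss_tot


# Toy data: response depends on feature 0 only; feature 1 is irrelevant.
X = [[1.0, 1.0], [2.0, 1.0], [3.0, 1.0], [4.0, 1.0],
     [1.0, -1.0], [2.0, -1.0], [3.0, -1.0], [4.0, -1.0]]
y = [1.0 + 3.0 * row[0] for row in X]
intercept, coef = elastic_net(X, y)
predictions = [intercept + sum(c * v for c, v in zip(coef, row)) for row in X]
fit_r2 = r_squared(y, predictions)
print(coef, round(fit_r2, 4))
```

The L1 part of the penalty sets the irrelevant feature's coefficient to exactly zero, while the L2 part shrinks the relevant coefficient slightly below its true value of 3.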
55. How It All Fits Together
Diagram (2011-2012): Synapse; DREAM Challenges; Portable Legal Consent; BRIDGE; FEDERATION; Data Generation; Data Activation; On-Line Open Generative Communities
57. How It All Fits Together
Diagram (2012-2013): DREAM Challenges; Synapse; Portable Legal Consent; BRIDGE; FEDERATION; Data Generation; Data Activation; On-Line Open Generative Communities
58. Sage-DREAM Breast Cancer Prognosis Challenge #1
Building better disease models together
Caldos/Aparicio
breast cancer data
154 participants; 27 countries
334 participants; >35 countries
Sep 26 Status
Challenge Launch: July 17
>500 models posted to Leaderboard
59. How It All Fits Together
Diagram (2012-2013): DREAM Challenges; Synapse; BRIDGE; FEDERATION; Portable Legal Consent; Data Generation; Data Activation; On-Line Open Generative Communities
60. GOVERNANCE: PORTABLE LEGAL CONSENT
Control of Private information by Citizens allows sharing
weconsent.us
John Wilbanks
TED Talk: “Let’s pool our medical data”
• Online educational wizard
• Tutorial video
• Legal Informed Consent Document
• Profile registration
• Data upload
61. How It All Fits Together
Diagram (2012-2013): DREAM Challenges; Synapse; BRIDGE; Data Generation; FEDERATION; Portable Legal Consent; Data Activation; On-Line Open Generative Communities
64. How It All Fits Together
Diagram (2013-2014): On-Line Open Generative Communities; DREAM Challenges; Synapse; IMPACT; BRIDGE; Data Generation; FEDERATION; Portable Legal Consent; Data Activation
65. A ‘clearScience’ way of modeling PI3K pathway activation in breast cancer
sage bionetworks metaGenomics/pan-cancer project
collaboration with david haussler @ ucsc for “analysis-ready” tcga data
tcga breast RNAseq data
tcga breast exome seq data
R code for a pathway heuristic
web-accessible random forest model of pi3k
activation
DATA
web-accessible
executable pi3k model
SOURCE CODE binary
web-accessible
MODEL
web-accessible
PROVENANCE world wide web consortium (w3c) specification PROVENANCE for
all the interconnections above
all of these elements can be housed in a
virtual machine
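To make the provenance idea concrete, the sketch below records how data, code, and model connect, using vocabulary borrowed from the W3C PROV family (`entity`, `activity`, `used`, `wasGeneratedBy`). It is loosely shaped like a PROV-JSON document but is not validated against the specification, and every `ex:` identifier is a hypothetical placeholder rather than a real resource.

```python
import json

# Hypothetical provenance record for the PI3K model described on the slide.
provenance = {
    "entity": {
        "ex:tcga_breast_rnaseq": {"prov:type": "data"},
        "ex:tcga_breast_exome_seq": {"prov:type": "data"},
        "ex:pathway_heuristic_r_code": {"prov:type": "source code"},
        "ex:pi3k_random_forest_model": {"prov:type": "model"},
    },
    "activity": {
        "ex:train_pi3k_model": {},
    },
    "used": {
        "_:u1": {"prov:activity": "ex:train_pi3k_model",
                 "prov:entity": "ex:tcga_breast_rnaseq"},
        "_:u2": {"prov:activity": "ex:train_pi3k_model",
                 "prov:entity": "ex:tcga_breast_exome_seq"},
        "_:u3": {"prov:activity": "ex:train_pi3k_model",
                 "prov:entity": "ex:pathway_heuristic_r_code"},
    },
    "wasGeneratedBy": {
        "_:g1": {"prov:entity": "ex:pi3k_random_forest_model",
                 "prov:activity": "ex:train_pi3k_model"},
    },
}


def upstream_inputs(record, entity_id):
    """Entities used by whichever activities generated `entity_id`."""
    activities = {
        g["prov:activity"]
        for g in record["wasGeneratedBy"].values()
        if g["prov:entity"] == entity_id
    }
    return sorted(
        u["prov:entity"]
        for u in record["used"].values()
        if u["prov:activity"] in activities
    )


model_inputs = upstream_inputs(provenance, "ex:pi3k_random_forest_model")
serialized = json.dumps(provenance, indent=2)  # shareable alongside the model
print(model_inputs)
```

A record like this lets a reader start from the model and walk back to the exact data and code that produced it.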
66. THE DREAM PROJECT JOINS
SAGE BIONETWORKS TO ENABLE
COLLABORATIVE SCIENCE
67. How to incent the joint evolution of ideas in a rapid
learning space- prepublication?
How to fund where data generators and analysts are
not always the same people- repeatedly?
Should we consider
Centralized Guilds and Distributed Dynamic Teams to
perform gene-environment model building?
68. SYNAPSE
If not FEDERATION
PORTABLE LEGAL CONSENT
CHALLENGES
BRIDGE
CITIZEN ENGAGEMENT