Technology R&D Theme 1: Differential Networks

TRD 1: DIFFERENTIAL NETWORKS – PROJECT SUMMARY
A major limitation of most network mapping and analysis efforts is that they implicitly consider the
system under static conditions, while real biological systems are under constant change. The
dynamics of these biological systems are a reflection of context specificity (e.g., cell type), responses
to environmental perturbations (e.g., chemical perturbations or viral infections), and genetic alterations
(e.g., somatic mutations). Ultimately, we must understand how these dynamics affect – or are affected
by – the underlying physical and genetic networks active at a particular time. Differential analysis of
biological systems under multiple conditions (or in multiple systems) allows us to gain fundamental
understanding of these biological responses and how biological networks are re-wired in response to
perturbations and alterations. In this project, we will develop a series of tools and methodologies for
conducting differential analyses of biological networks altered under multiple conditions. We will
pursue novel algorithmic methods that allow us to make use of high-throughput, proteomic-level data
to recover biological networks under specific biological perturbations. The software tools developed in
this project will allow researchers to further predict, analyze, and visualize the effects of these
perturbations and alterations, while aggregating additional information regarding the known roles of
the dynamic interactions and their participants.

TRD 1: DIFFERENTIAL NETWORKS – PROJECT NARRATIVE
Network models are frequently used to integrate molecular data with prior biological knowledge, with
the goal of elucidating disease pathways and identifying potential drug targets. We will develop novel
bioinformatic tools that allow researchers to use high-throughput proteomic and genomic data to
model the effects of dynamic network perturbations and gain an understanding of how these
perturbations re-wire biological networks. These tools will help make clinically relevant diagnoses and
predictions about an individual and their potential response to therapeutic interventions.

TRD 1: DIFFERENTIAL NETWORKS – SPECIFIC AIMS
Network Biology has achieved significant advances in biological data integration, by enabling
experimental data to be ‘explained’ via mapping to large networks of known biological interactions.
Such techniques are enabled by a series of infrastructure developments, including the development of
visualization and analysis tools (e.g., Cytoscape) and development of large-scale databases (e.g.,
BioGRID) that describe molecular interactions from literature and high-throughput experimentation.
However, much of this analysis and infrastructure has been developed without considering the
dynamics of biological networks across diverse physiological, environmental, disease and
evolutionary contexts. Networks active in different contexts are re-wired in different ways and may
possess different properties. Our premise is that understanding these differential networks will aid the
ability to predict cellular responses to network perturbations. This TRD focuses on tools for modeling
the differential states of biological networks under various perturbations and genomic alterations. It
has the following Specific Aims:
Aim 1: Tools for inference of differential networks from dynamic protein states. We will develop
algorithms and tools for inference and analysis of dynamic protein networks. The proposed approach
combines perturbation response data (e.g., cell growth in response to targeted drugs or growth
factors) and pathway information (e.g., information from signaling pathway databases). The
algorithms will generate network models that provide information on collective molecular changes in
response to perturbations. We will also extend existing network visualization software to support the
results of differential network analysis.
Aim 2: Protein network alignment algorithm and viewer technology. Similar to sequence
alignment algorithms and viewers, network alignment algorithms and viewers will enable the study of
networks altered over evolution and via genetic and other disruptions within a population. Network
alignment algorithms are under development, or have already been developed, by multiple groups,
but very little work on network alignment visualization has taken place. We will 1) implement and
improve network alignment algorithms, including novel algorithms that we have developed for multi-
scale protein interface interaction networks (IIN), 2) develop a network alignment viewer in 2D and 3D
in Cytoscape and 3) aggregate and make IIN data available to Cytoscape via standard PSICQUIC
web services.
Aim 3: Facilitating the interpretation of affinity purification mass spectrometry (AP-MS) data as
interaction networks. We will develop tools to streamline mass-spectrometry analyses of protein
interactions. The developed tools will enable importing, augmenting and clustering, as well as
visualization of MS-derived data within Cytoscape. Additional tools will be developed that will enable
users to access public repositories of AP-MS data for the purposes of data aggregation and
annotation. These tools will support the quantitative analysis and visualization of AP-MS experiments,
as well as allow these results to be viewed in the context of other ‘omics data.

TRD 1: DIFFERENTIAL NETWORKS – RESEARCH STRATEGY
A cell is a collection of molecules that evolved to robustly achieve particular functions while under
constant influence of environmental (e.g., changes in levels of nutrients and growth factors) and
internal (e.g., mutations) perturbations. In the last decade, cellular systems have been frequently
modeled as networks of interacting molecules, where functional information flows through pathways of
interactions between signaling molecules. However, much of this analysis and infrastructure has been
developed without considering the dynamics of biological networks across diverse physiological,
environmental, disease, temporal, and evolutionary contexts. The focus of this TRD is to understand
how biological systems, modeled as networks, respond to various forms of perturbation and alteration.
In particular, we will develop algorithmic technology and supportive software tools for differentially
perturbed network analysis and visualization. These capabilities will enable new comparative
analyses looking at how protein-protein interaction networks change across contexts, such as over
time, between uninfected and infected states, and in normal vs. cancerous tissues. We will also
develop inference methods to predict network level responses to perturbations such as drug
treatment.
A major goal of this work is its application to human health, where despite the success of network
models for describing cellular systems, predicting cell-type specific responses to perturbations still
remains a grand challenge. For instance, it remains extremely difficult to predict individual patient
response to targeted cancer therapies as diverse oncogenic alterations may exist in various
combinations in each tumor and vary substantially among different patients. Cancer cells form
heterogeneous systems that adapt rapidly to perturbations (e.g., targeted cancer drugs) through
diverse mechanisms. Understanding the variations in molecular response to perturbations in different
conditions, genomic backgrounds and time points will enable us to better predict the response of
cellular systems to perturbations.
1.1 TOOLS FOR INFERENCE OF DIFFERENTIAL NETWORKS FROM PROTEIN
STATES AND ABUNDANCES OVER TIME
Project Leader: Chris Sander (MSKCC)
Overview. In this project, we will develop methods for the statistical analysis of molecular response
data (e.g., total and phosphorylated protein abundances) to targeted perturbations using two different
methods: 1) the perturbation biology method and its extensions and 2) a causal pathway analysis
using differential network response. In the first method, we propose improvements in the perturbation
biology method developed in the Sander lab. Perturbation biology involves inferring predictive models
of cell signaling based on response to rich perturbations (i.e., hundreds of combinations of targeted
drugs). Cellular response (e.g., cellular growth) to untested perturbations is predicted using these
models and then experimentally tested; this approach was used in previous work on cancer drug
resistance by the Sander lab. In the second method, we propose developing a methodology for pathway
analysis for use when perturbation data are available for only a small number of targeted agents. This new
methodology will map the statistically significant differential responses between varying conditions to
biological networks and identify the responsive modules. Further analysis of these differential
networks will reveal potentially causal links between drug perturbations and module activities. Finally,
we will extend existing tools to support the visualization of the results for the above methods.
Objectives. Prediction of cellular response (e.g., cellular growth, apoptosis, migration) to targeted
perturbations is a central challenge in biology for the development of mechanistic explanations and
the design of effective therapeutic interventions.
Task 1: Improve perturbation biology models using cell-type specific prior information. In order to
improve the perturbation biology method (described below), we developed the Pathway Extraction
and Reduction Algorithm (PERA) to build a cellular signaling network model based on a list of proteins
and post-translationally modified proteins. PERA-derived models serve as prior information in the
inference of models based on the perturbation biology method. Use of this prior information
significantly improves the power of generated network models to predict cellular response to untested
perturbations. We will improve the PERA to make use of a novel confidence score to generate cell-
type specific prior information models using text-mining approaches to collect cell-type specific
information for PERA-generated interactions. This will result in more accurate models with improved
predictions of cellular responses [1-4].
Task 2: Develop a method for causal pathway analysis using differential networks. We will develop a
method for pathway analysis to uncover causal relationships between perturbations and changes in
protein abundances. This method will be used to analyze response data from projects with a limited
number of perturbation experiments. Differential responses to perturbations will be statistically filtered
(e.g., correlation-based filtering) to select proteins with a dose dependent response and significant
differential response over multiple conditions. These selected proteins will be used to extract a
biological sub-network from the Pathway Commons database and map the differential responses to
this sub-network. Finally, responsive modules will be identified using a modified version of module
detection software developed by the Sander lab known as NetBox. Increases in module activities in
response to perturbations can be associated with adaptive responses and their causality tested.
These modules may potentially be targeted therapeutically to overcome resistance in cancer cells.
Task 3: Develop software tools for the visualization of differential networks models. To support Tasks
1 and 2, we will develop software tools that allow users to visualize the differences between multiple
biological perturbations, enabling the visualization of absolute values, differences and differences of
differences using a configurable interface. For projects involving time series or multi-condition data,
this software will be able to handle changes in network topology due to response differences of the
biological entities within the network. Lastly, this software will be able to visualize significantly varying
sub-networks as a mechanism for managing network complexity.
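The "differences and differences of differences" that Task 3 aims to visualize can be illustrated with a small calculation. The sketch below is only an illustration: the protein names, the two cell lines and all abundance values are hypothetical, and the helper `difference` is not part of any existing tool.

```python
from typing import Dict

Profile = Dict[str, float]


def difference(a: Profile, b: Profile) -> Profile:
    """Per-protein difference a - b over the proteins measured in both profiles."""
    return {protein: a[protein] - b[protein] for protein in a.keys() & b.keys()}


# Hypothetical log-scale abundances for two cell lines, untreated vs. drug-treated.
line_a_control = {"pRB": 1.0, "pAKT": 0.5}
line_a_treated = {"pRB": 0.25, "pAKT": 0.75}
line_b_control = {"pRB": 1.0, "pAKT": 0.5}
line_b_treated = {"pRB": 0.75, "pAKT": 1.5}

# Differences: the drug response within each cell line.
response_a = difference(line_a_treated, line_a_control)
response_b = difference(line_b_treated, line_b_control)

# Difference of differences: how the drug response itself differs between cell lines.
diff_of_diff = difference(response_a, response_b)
```

Each of the three layers (absolute values, `response_a` or `response_b`, and `diff_of_diff`) could be mapped onto node colors in a network view, which is the kind of configurable choice the proposed interface would expose.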
Background and Significance
Task 1: Perturbation biology method. Quantitative characterization and prediction of the response
landscape to perturbations is a central challenge in biology for the mechanistic understanding of
biological systems and design of effective therapies. Quantitative models (e.g., differential equation
models) that link perturbations to cellular response can capture predictive details that qualitative
interrogation (e.g., Boolean network models) of cellular processes cannot. However, the building of
such models is hard due to the astronomically large number of possible models that represent underlying
biological processes and the frequent lack of experimentally tested model parameters; a review of modeling
formalisms in systems biology was recently conducted [5]. To address this challenge, the Sander lab
recently developed a computational and experimental method, termed perturbation biology. The
perturbation biology method involves construction of quantitative network models using proteomic
response data to large numbers of perturbations and network inference algorithms. These models are
then used to predict system response (i.e., phospho-proteomic and cellular response such as cell
viability) to novel perturbations through quantitative simulations (Figure 1). The reason for the use of
proteomic-level data is that primary targets for most targeted therapeutic inhibitions (used by this
method as perturbations) are phospho-proteins, and many drugs with a known mechanism may not
correlate with the gene expression of their targets [6]. The significance of this work lies in the first
application of a statistical physics method known as belief propagation to deal effectively with this
complexity and to accurately predict phenotypic responses to untested perturbations [1,3].
In the belief propagation inference method, we first calculate probabilistically the most likely
interactions in the vast space of all possible solutions and then derive a set of individual, highly
probable solutions in the form of executable models (Figure 1). In the generated network models, the
nodes represent measured levels of proteins, phospho-proteins, or cellular phenotypes; and the edge
weights represent the influence of upstream nodes on the time derivatives of their downstream
effectors in a manner similar to ODE-based mathematical descriptions of models. Using the resulting
models, it is possible to perform quantitative simulations of in silico perturbations and predict system
responses (e.g., cell viability or changes to protein abundance) to novel perturbations of interest; the
most promising predictions are then experimentally tested. The perturbation biology method is
particularly powerful for nominating synergistic and effective drug combinations to overcome drug
resistance in cancer [1]. For example, work in the Sander lab using perturbation biology models has
accurately predicted 1) the dependence of the synergistic response to CDK4 and IGF1R inhibition on AKT
pathway activity and 2) synergistic effects of combined BRAF and MYC inhibition in RAF inhibitor
resistant melanoma [1,2].
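To make the idea of an executable model concrete, the following minimal sketch simulates a network in which edge weights act on the time derivatives of downstream nodes. The functional form (a tanh saturation with linear decay), the parameters `alpha` and `beta`, the forward Euler integrator and the three-node chain are all assumptions chosen for illustration; they are not taken from the published perturbation biology models.

```python
import math
from typing import List


def simulate(w: List[List[float]], u: List[float],
             alpha: float = 1.0, beta: float = 1.0,
             dt: float = 0.01, steps: int = 5000) -> List[float]:
    """Euler-integrate dx_i/dt = beta*tanh(sum_j w[i][j]*x_j + u_i) - alpha*x_i from x = 0.

    x_i is the level of node i relative to the unperturbed state, and u_i is
    the external perturbation acting on node i (assumed model form).
    """
    n = len(u)
    x = [0.0] * n
    for _ in range(steps):
        dxdt = [
            beta * math.tanh(sum(w[i][j] * x[j] for j in range(n)) + u[i]) - alpha * x[i]
            for i in range(n)
        ]
        x = [x[i] + dt * dxdt[i] for i in range(n)]
    return x


# Hypothetical 3-node chain: drug target -> phospho-protein -> cell viability.
# w[i][j] is the influence of node j on the time derivative of node i.
w = [
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
]
unperturbed = simulate(w, u=[0.0, 0.0, 0.0])
inhibited = simulate(w, u=[-1.0, 0.0, 0.0])  # in silico inhibition of the target
```

In this toy setting the inhibition propagates down the chain and lowers the simulated viability node, which mirrors how an inferred model would be queried for untested perturbations before experimental follow-up.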
In the work on combined BRAF and MYC inhibition, use of prior information to narrow the search
space in inference significantly improved the predictive power of the network models, as shown by
cross-validation studies. Prior data was gathered using the Pathway Extraction and Reduction Algorithm
(PERA) [1,7,8], which uses a list of proteins and phospho-proteins to build a molecular interaction network
using information from the Pathway Commons resource (developed by the Sander and Bader labs).
Currently, the perturbation biology method does not support all available detail in the network
captured from Pathway Commons. Therefore, a network reduction step is used to simplify the network
and map Pathway Commons nodes one-to-one to experimental observables used in the perturbation
biology method.
Task 2: Causal pathway analysis. In this task, we introduce a causal pathway analysis for perturbation
studies when the perturbation biology method is not applicable due to a limited number of
experimental perturbing agents. Often, available proteomic datasets, such as those collected by the
White lab (DBP), contain measurements of a limited number of perturbations; the community has
seen a surge of these datasets, and more are expected through programs such as NIH
LINCS [9]. To determine causes of molecular response to drug perturbations for these datasets, we
propose an alternative analysis method that makes use of some of the developments from the
perturbation biology method, such as the PERA. Development of methods for causal pathway
analysis has gained attention within the community as the number of large-scale datasets increases
[10-14]. Our proposed methodology will differ from these previous efforts in that we will focus on the
analysis of drug perturbations 1) using proteomic and phospho-proteomic data (proteomic-level
measurements provide a better reflection of drug activity), unlike several studies that have focused on
gene expression [11-13], and 2) using module-based pathway analysis with NetBox, a less biased
form of pathway analysis than approaches that rely on canonical pathways, whose definitions differ
between resources [11,15].
Figure 1. The Perturbation Biology method involves systematic perturbations of cells with targeted drug
combinations (Boxes 1-2), high-throughput measurements of response profiles (Box 2), automated
extraction of prior signaling information from databases (Boxes 3-4), construction of ordinary differential
equation (ODE) based signaling models (Box 5) with inference algorithms (Box 6) and prediction of system
response to novel perturbations (Box 7) [1].

As part of the collaboration between the Forest White (MIT) and Chris Sander (MSKCC) labs, we aim
to determine, using our proposed methodology, the proteomic and pathway module level predictors of
response and mechanisms of drug resistance to CDK4 inhibition in dedifferentiated liposarcoma.
Dedifferentiated liposarcoma is a rare but aggressive cancer with a high recurrence rate and low
response rates to targeted therapies. We recently showed that targeting CDK4, either singly or in
combination with other kinases, may be a strong candidate for targeted therapy in liposarcoma [2].
Cyclin-dependent kinase 4 (CDK4) is important for cell cycle G1 phase progression. CDK4 is up-regulated
in >40% of all soft-tissue sarcomas and nearly all cases of dedifferentiated liposarcoma,
either at the copy number or mRNA expression level (based on genomic profiling data from 207 soft
tissue sarcoma patients, of which 50 are dedifferentiated liposarcoma patients) [16]. In our ongoing
collaboration, a liposarcoma cell line (DDLS8817) is perturbed using different doses of CDK4 inhibitor
and proteomic response data are collected at different time points (30 min, 8 h, 24 h, 72 h) using
mass spectrometry. In initial experiments, phospho-peptide enrichment, performed using a cocktail of
anti-phospho-tyrosine antibodies, yielded readouts for 190 peptides corresponding to 170 proteins.
These data are used as input to our methods.
Task 3: Develop software tools for the visualization of differential networks models. One focus of the
Sander and Bader labs has been the development of standardized formats for community use in
systems biology, including the use of the Biological Pathway Exchange (BioPAX) format in Pathway
Commons and the Systems Biology Graphical Notation (SBGN) for the visual representation of
molecular interaction networks; usage of these standards simplifies pathway data reuse and improves
the readability of these data [8,17]. Of the nearly 30 software tools that support the SBGN notation
(http://www.sbgn.org/SBGN_Software), only two provide a direct interface for Pathway Commons
(Cytoscape and ChiBE), and of these two only ChiBE natively supports the SBGN notation [18,19]. We
will focus on the extension of these tools for the visualization of results from the analyses conducted
in this project. Both Cytoscape and ChiBE natively support the visualization of user data and
comparison of changes within a network across two different contexts. Visualization of these
differential networks is a powerful method for understanding how processes are modified. Examples
include changes in (i) proteomic response to drug perturbations between two cell lines, (ii) emergence
of genomic alterations in different tumors in response to treatment, (iii) genetic conservation of
enzymes in metabolic networks between evolutionary branches, (iv) gene expression during
development in different lineages, (v) different sets of host-pathogen interactions, and (vi) longitudinal
change over time in the Framingham study network (Table 1).
Table 1. Types of networks and corresponding perturbations
Network Type      | Context             | Perturbation/Change    | Observation                 | DBP
Kinase            | Cell lines          | Drug concentration     | Protein levels              | NCI-60
Kinase Signaling  | Tumor subtypes      | Before/After treatment | Genetic alterations         | Vidal, Sage, ICGC
Gene Regulation   | Cell lineages       | Developmental stage    | Gene expression             | Sage, ICGC
Metabolic         | Evolutionary branch | Evolutionary time      | Genetic conservation        | ICGC, NCI-60
Host-pathogen     | Cell lines          | Virus type             | Protein interactions        | Krogan
Social            | Friend network      | Time                   | E.g. Obesity, Heart disease | Social
We have recently prototyped support for phospho-proteomic data visualization within ChiBE (Figure
2). We use the background color of the nodes to visualize differences in values between two proteomic
measurements. ChiBE can also visualize total abundance versus phospho-protein levels using a node
decoration. Currently, this visualization mechanism is not available within Cytoscape using CySBGN,
the Cytoscape App that provides SBGN support, because there is no simple method for mapping
SBGN elements to biological entities within SBGN files. This useful feature is missing from SBGN-ML,
the SBGN file format [20,21]; thus, we will add it to the SBGN programming interface, libSBGN, to support
NRNB tools that use SBGN (Cytoscape, ChiBE, and PathVisio) and the analyses described here. This
will also support NCI-60 analysis results from the Pommier DBP (see TRD2).
Secondly, to support the analyses described in this project, there is a need for a mechanism for moving
between detailed network views and more abstract representations. This multi-scale representation
functionality allows researchers to address a range of biological questions (from qualitative to
quantitative) that make use of the same underlying data. Through Apps, Cytoscape (and independent
tools, such as VisANT) supports related features for simple graphs, but none of these Apps supports this
functionality in the detailed SBGN notation [22,23]. There is some previous work on this topic by Vogt
et al. and by members of the Sander and Bader labs on the reduction of detailed BioPAX models; however,
neither of these methods is supported by libSBGN [24,25].
Methods
Task 1: Perturbation biology method. Perturbation biology is a powerful network modeling method
based on a belief propagation inference algorithm to predict system response to experimentally
untested perturbations of interest. The method has been enhanced by use of prior information network
models extracted from the Pathway Commons database, which the Sander and Bader labs develop. We will
develop a new version of the PERA tool to generate confidence scores related to each interaction in the
prior model. The confidence score (i.e., a measure of the existence of an interaction across various
conditions) for each interaction will be estimated based on co-citation results under particular
conditions using literature mining approaches, co-expression data for input samples, and evidence code
information (indicating how particular interactions are supported). For example, the co-citation count
for RAF and MEK phosphorylation events (e.g., pMEK, pRAF, as measured by the White DBP lab at MIT) in
melanoma may be higher than the count for the same phosphorylation events in breast cancer. This
literature-curated interaction information will also be used in network models for melanoma samples
and the perturbation biology method. Methods for extraction of co-citation information from literature
have been previously developed in projects such as iHOP, and we will use this as a resource [26]. In
parallel, distance metric learning algorithms from machine learning will be used to further improve the
prior confidence scores [27]. The goal of distance metric learning algorithms is to generate distance
functions for particular tasks. In this scenario, we will use them in combination with text-mining
approaches to identify textual fragments that are likely to contain novel interaction or cell-type
specific information. This will be determined by the similarity of the new textual data and text
identified in Pathway Commons (in the form of annotations) as containing interaction information.
These innovations will improve the prior data used in the PERA and the perturbation biology
method.
Figure 2. A prototype application that uses background colors to visualize changes in a network based on
proteomic data that include phosphorylation state data.
Figure 3. A flowchart of the integrated pathway and statistical tools for analysis of response to
perturbations in biological systems.
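One way to picture the proposed confidence score is as a weighted combination of the three evidence sources named above. The sketch below is a hypothetical scoring rule only: the saturating transform of the co-citation count, the evidence-code categories and their weights, and the mixing weights are assumptions introduced for illustration, since the actual scoring scheme remains to be developed.

```python
import math
from typing import Tuple

# Hypothetical weights for how strongly each kind of evidence code supports an interaction.
EVIDENCE_WEIGHT = {"experimental": 1.0, "curated": 0.8, "predicted": 0.4}


def confidence(cocitations: int, coexpression: float, evidence: str,
               weights: Tuple[float, float, float] = (0.4, 0.3, 0.3)) -> float:
    """Combine three evidence sources into a score in [0, 1] (assumed form)."""
    # Co-citation counts in the cell type of interest, saturating toward 1.
    citation_term = 1.0 - math.exp(-cocitations / 10.0)
    # Magnitude of the co-expression correlation in the input samples.
    expression_term = min(abs(coexpression), 1.0)
    # Unknown evidence codes contribute nothing.
    evidence_term = EVIDENCE_WEIGHT.get(evidence, 0.0)
    w_cit, w_exp, w_ev = weights
    return w_cit * citation_term + w_exp * expression_term + w_ev * evidence_term


# Hypothetical counts: the same interaction is co-cited more often in melanoma
# literature than in breast cancer literature, so it scores higher in melanoma.
score_melanoma = confidence(cocitations=40, coexpression=0.6, evidence="experimental")
score_breast = confidence(cocitations=3, coexpression=0.6, evidence="experimental")
```

Scores of this kind could then be passed to the inference step as cell-type specific priors, with low-scoring interactions given less influence on the search.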
Task 2: Causal pathway analysis. In cases where it is not feasible to use our perturbation biology
method due to a low number of perturbation experiments, we will employ an analysis pipeline, which
we call causal pathway analysis (Figure 3). The first step in this pipeline is the statistical filtering out
of biological entities that do not respond significantly to perturbations. For instance, in the
collaboration between the Sander and White labs (DBP), a combination of 190 peptides (i.e., total
protein levels and phospho-protein levels) were collected across four different time points in the
liposarcoma cell line DDLS8817 where the cell line was treated with a CDK4 inhibitor. For each
individual time point, we will develop network models by first applying a correlation-based (e.g.,
Spearman correlation) filter of peptide levels in relation to cell viability measurements in different
doses. In the second step, we will extract biological sub-networks from the Pathway Commons
database using established PERA methodology and map response data values on to these network
models. The third step will apply the NetBox module detection software we developed on the resulting
network models. NetBox uses the edge betweenness algorithm by Girvan and Newman for modularity
detection [15,28]. One advantage of the NetBox software is its ability to consider linked proteins that are
outside of the context of the experimentally measured proteins, with additional filtering for proteins with
many interaction partners. This capacity will potentially enable us to observe affected modules that
may not otherwise be seen with available peptide measurements. The final step of this analysis will be
to examine changes in pathway modules, for example across time (e.g., between early and late time
points), and determine 1) if peptides within a given module are significantly altered between time points
and 2) if a given module has significant biological relevance, as assessed through a gene set enrichment
analysis of the kind originally developed for use with the NetBox algorithm [15,29]. Examination of the
affected modules may reveal additional therapeutic intervention points in liposarcoma; known drug
targets within these modules will be identified using the PiHelper tool developed by the Sander lab [30].
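The first step of this pipeline, the correlation-based filter, can be sketched in a few lines. The example below computes Spearman correlation from scratch (average ranks followed by Pearson correlation) and keeps peptides whose levels track cell viability across doses. The peptide names, measurements and the 0.8 threshold are hypothetical; with only a handful of doses, a real analysis would also need an appropriate significance test.

```python
import math
from typing import Dict, List, Sequence


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0.0 or sd_y == 0.0:
        return 0.0
    return cov / (sd_x * sd_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def dose_responsive(peptides: Dict[str, List[float]], viability: List[float],
                    threshold: float = 0.8) -> List[str]:
    """Names of peptides whose |Spearman rho| with viability meets the threshold."""
    return sorted(name for name, levels in peptides.items()
                  if abs(spearman(levels, viability)) >= threshold)


# Hypothetical measurements at four increasing drug doses, for one time point.
viability = [1.0, 0.8, 0.5, 0.2]
peptides = {
    "pep_A": [2.0, 1.5, 1.1, 0.4],   # falls with viability
    "pep_B": [1.0, 1.2, 0.9, 1.1],   # no monotonic trend
    "pep_C": [0.3, 0.6, 0.9, 1.4],   # rises as viability falls
}
selected = dose_responsive(peptides, viability)
```

The selected peptides would then seed the sub-network extraction and module detection steps described above.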
Task 3: Develop software tools for the visualization of differential networks models. As discussed
above, we will develop differential network result visualization methods. The first will be the
development of a mapping mechanism for the SBGN notation of the biological pathway models used
above and the second will be development of software to support multi-scale representation in SBGN
(also linking to TRD3). For mapping, we will extend the SBGN-ML format and the libSBGN
programming interface to support a simplified mechanism for annotating nodes to external database
references for biological entities. To do this, we will make use of the extension mechanism available
for SBGN and build support for Identifiers.org and MIRIAM-based identifiers (community projects that
support unique and perennial identifiers for data) [31] that we already use in Pathway Commons. This
will enable users to map the results of differential response analyses onto annotated SBGN-based
network views; use of this feature in SBGN-compatible tools (such as Cytoscape and PathVisio) will
be necessary for widespread use. For multi-scale representation, we will implement support for
network reduction rules provided by Paxtools (a programming interface for BioPAX used by the PERA
tool) and export these to SBGN-ML files, which Paxtools supports. We will support a detailed
representation (i.e., the SBGN Process Description language) and a simplified representation known
as the SBGN Activity Flow (AF) language. This work will involve developing a mapping between the
network-reduced representation from Paxtools and the SBGN AF notation. Several members of
the Sander lab are involved in both SBGN and BioPAX efforts and are experts in both formats.
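The annotation-based mapping described for Task 3 can be illustrated independently of any particular SBGN library. In the sketch below, each node carries an Identifiers.org URI and differential response values are keyed by compact identifier ("prefix:accession"). The node ids, the response values and the helper names are hypothetical, the UniProt accessions are included only as examples, and no libSBGN or SBGN-ML API is used or implied.

```python
from typing import Dict, Optional


def compact_id(uri: str) -> Optional[str]:
    """Return 'prefix:accession' for an Identifiers.org URI, or None otherwise.

    Handles both the compact form (.../uniprot:P11802) and the older
    slash-separated form (.../uniprot/P11802).
    """
    for base in ("https://identifiers.org/", "http://identifiers.org/"):
        if uri.startswith(base):
            rest = uri[len(base):]
            if ":" not in rest:
                rest = rest.replace("/", ":", 1)
            return rest
    return None


def map_values(annotations: Dict[str, str],
               values: Dict[str, float]) -> Dict[str, float]:
    """Attach a value to every node whose annotation resolves to a measured entity."""
    mapped = {}
    for node_id, uri in annotations.items():
        key = compact_id(uri)
        if key is not None and key in values:
            mapped[node_id] = values[key]
    return mapped


# Hypothetical node ids and annotations, as they might be read from an annotated map.
node_annotations = {
    "glyph1": "https://identifiers.org/uniprot:P11802",
    "glyph2": "http://identifiers.org/uniprot/P06400",
    "glyph3": "https://example.org/not-an-identifier",
}
# Hypothetical differential responses keyed by compact identifier.
differential_response = {"uniprot:P11802": -1.2, "uniprot:P06400": -0.4}

node_values = map_values(node_annotations, differential_response)
```

Nodes without a resolvable annotation are simply left unmapped, so a viewer could render them in a neutral style.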

1.2 DEVELOP PROTEIN NETWORK ALIGNMENT ALGORITHM AND VIEWER
TECHNOLOGY
Project Leaders: Gary Bader (University of Toronto) and Trey Ideker (UCSD)
Overview. In this project, we will develop technology to align networks across contexts, with a
particular focus on multi-scale protein-protein interface interaction networks (IINs). Similar to
sequence alignment algorithms and viewers, network alignment algorithms and viewers will enable
the study of networks altered over evolution and via genetic and other disruptions within a population
leading to a better understanding of what network elements are shared and thus generally important,
and which are different and thus important for specific contexts. This work will be driven by the need
to study network changes in the context of protein sequence changes over evolution and to analyze
edgetic interaction networks that identify interactions gained or lost in response to protein sequence
mutation, mapped at Marc Vidal’s Center for Cancer Systems Biology at the Dana Farber Cancer
Institute in Boston.
Objectives. The major goal of this project is to develop new algorithmic and visualization technology
for comparing networks. We will focus on protein-protein interaction networks and the binding site
interfaces that mediate the interactions and can be affected by mutations across diseases or evolution
to rewire the network, as mapped in our DBP. Specifically, we will:
• Implement and improve network alignment algorithms for IINs. We will implement the GreedyPlus
IIN alignment algorithm and a selected set of state-of-the-art network alignment algorithms as a
Cytoscape App, called NetworkAligner.
• Develop a network alignment viewer. Protein and DNA sequence alignment viewers are at the
core of the bioinformatics toolset. As the need to compare networks across contexts grows, we
envision increased need for network alignment viewers, which currently don’t exist as a class of
software. We will develop a Cytoscape App, called NetworkAlignmentViewer, for this purpose.
• Aggregate and make IIN data available to Cytoscape via PSICQUIC server. No central source of
IIN data exists, thus we will collect this data from multiple sources, including our DBP, normalize it
using the PSI-MI standard format and make it available to Cytoscape via a web service.
Background and Significance. The increasing ease and accuracy of experimental methods to
detect protein-protein interactions has recently resulted in the ability to map the interactomes of
multiple species and experimental contexts. We now have an opportunity to compare these
interactomes to better understand how they evolve over time – to identify which regions are
conserved, and thus globally important, and which are not conserved, and thus likely important for
species- or context-specific traits. Comparison tools such as network alignment algorithms are
essential for our ability to exploit and extract information from the many protein interaction-mapping
efforts currently underway. The alignment of protein-protein interaction networks (PPINs) serves as a
systems-level analog to biological sequence alignments. Such alignments enable inferences to be
drawn between proteins of different species, which may agree or disagree with conclusions drawn
from sequence-based alignments. Examining aligned networks or sub-networks will help answer how
these interactomes evolved, just as sequence alignment has done with genes, proteins and genomes.
Network alignment will also reveal interaction network “mutations” just as sequence alignments have
done for genetic mutations. A number of network alignment methods have been developed for
protein-protein interaction networks. These are broadly categorized into local and global alignment
algorithms. Local alignment algorithms seek small subnetworks that are topologically similar,
emphasizing regions of high-confidence alignment between the two networks. The first network
alignment algorithms, developed by the Ideker group, were local (e.g., PathBLAST [32] and
NetworkBLAST [33]). Others have been developed, such as NetAligner [34] and MaWISH [35]. Global network
alignment methods attempt to align all or most of the proteins in two or more PPINs and include
IsoRank [36], IsoRankN [37], GRAAL [38], H-GRAAL [39], MI-GRAAL [40], C-GRAAL [41], Graemlin [42], Graemlin
2.0 [43] and others. Metabolic pathway alignment algorithms have also been developed [44-47].
While network alignment is well-studied, it does not consider important aspects of PPINs, such as
binding interfaces that are responsible for mediating the interaction. Interface-interaction networks
(IINs) are a refinement of PPINs where proteins are subdivided into their separate interaction
interfaces [48]. In an IIN, a vertex represents a binding site and an edge represents a direct physical
interaction between two binding sites on their respective proteins. The higher resolution of IINs
supports new biological insights that cannot be derived from standard PPINs. For example, IINs can
distinguish between “date hubs” – proteins that interact with many partners, but at different times
or in different locations – and “party hubs” – proteins that interact with many partners
simultaneously [49], while these distinct hub types appear identical in a PPIN. The study of IINs will
also help interpret how domain and binding site gain and loss affect the PPIN and how binding site
sequence mutations over evolution and in disease cause PPI gain and loss [50-52]. This latter point is the
goal of our DBP with Marc Vidal’s Center for Cancer Systems Biology (CCSB) at the Dana Farber
Cancer Institute. The Vidal group is leading a project to identify human disease mutations in binding
sites that affect specific interactions and not others of a given protein. Given a large number of these
“edgotypic” interactions, network alignment algorithm technology that we develop will be valuable to
automatically identify and visualize gain, loss or swap of protein interactions and the changes to
mapped or known binding sites that are responsible. Such applications will be relevant to a broader
range of our DBPs where networks will be perturbed in any manner over conditions or time, including
Nevan Krogan’s AP-MS-derived host-pathogen network comparisons, ICGC, Sage, NCI-60 and social
networks.
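As an illustration only, the IIN definition above (a vertex is a binding site, an edge is a physical contact between two binding sites) can be sketched in Python. All protein and interface names are invented, and the hub rule used here (partners that share one interface cannot bind at the same time) is a simplifying assumption for the example, not a method specified in this proposal.

```python
from collections import defaultdict

# An IIN vertex is a binding site, written here as (protein, interface).
# Every protein and interface name below is made up for illustration.
IIN_EDGES = [
    (("HubA", "site1"), ("P1", "s")),
    (("HubA", "site2"), ("P2", "s")),
    (("HubA", "site3"), ("P3", "s")),
    (("HubB", "site1"), ("Q1", "s")),
    (("HubB", "site1"), ("Q2", "s")),
    (("HubB", "site1"), ("Q3", "s")),
]


def collapse_to_ppin(edges):
    """Merge all binding sites of a protein into one vertex (the PPIN view)."""
    return {frozenset((a[0], b[0])) for a, b in edges}


def ppin_degree(edges, protein):
    """Number of distinct interactions the protein has in the collapsed PPIN."""
    return sum(1 for pair in collapse_to_ppin(edges) if protein in pair)


def partners_per_interface(edges, protein):
    """Map each interface of `protein` to the set of partner proteins it binds."""
    usage = defaultdict(set)
    for a, b in edges:
        if a[0] == protein:
            usage[a[1]].add(b[0])
        if b[0] == protein:
            usage[b[1]].add(a[0])
    return dict(usage)


def classify_hub(edges, protein):
    """Assumed rule: partners sharing one interface must bind at different times."""
    usage = partners_per_interface(edges, protein)
    shared = any(len(partners) > 1 for partners in usage.values())
    return "date-like" if shared else "party-like"
```

In this toy network both hubs have the same degree once collapsed to a PPIN, yet the interface-level view separates them, which is the extra resolution the paragraph describes.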
In preliminary work, the Bader group has shown that traditional network alignment algorithms do not
function well with IINs due to the topology differences present. To address this, we have developed a
novel alignment algorithm for IINs, called GreedyPlus (Figure 4). In this project, we will develop
technology to implement select network alignment algorithms, including GreedyPlus, and develop a
network alignment viewer within Cytoscape.
Figure 4. Example network alignment using GreedyPlus.

Methods
Task 1: Implement and improve network alignment algorithms for IINs. We will implement the
GreedyPlus IIN alignment algorithm and a selected set of state of the art network alignment
algorithms as a Cytoscape App, called NetworkAligner. Implementation will be similar to our popular
ClusterMaker app, which makes available a range of network clustering algorithms within
Cytoscape [53]. ClusterMaker was developed in collaboration with authors of various network clustering
algorithms, thus benefitted from crowdsourcing development, motivated by co-authorship on a paper.
We will follow the same model to build NetworkAligner. We will define an application programming
interface (API) for network alignment algorithms to make this easier, as multiple developers can then
code to the API. We will select algorithms to implement to cover major classes of local and global
aligners, in addition to those already available as open source Java code, or those that can be ported
to Java easily by the original authors. We have already implemented our own Java versions of
GreedyPlus, IsoRank [36] and GRAAL [54] that will form the basis of the initial NetworkAligner app.
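The idea of a shared API that several developers can code against might look like the following minimal sketch. It is written in Python for brevity, although the app itself would be Java. The class names, the registry and the toy degree-based aligner are hypothetical and are not GreedyPlus, IsoRank or GRAAL.

```python
from abc import ABC, abstractmethod


class NetworkAlignmentAlgorithm(ABC):
    """Hypothetical plug-in contract: map nodes of network A onto nodes of B.

    Networks are plain adjacency dicts: node -> set of neighbouring nodes.
    """

    name = "unnamed"

    @abstractmethod
    def align(self, net_a, net_b):
        """Return a dict {node in net_a: node in net_b}, at most one-to-one."""


def _by_degree(net):
    # Highest degree first; ties broken by node name for reproducibility.
    return sorted(net, key=lambda n: (-len(net[n]), str(n)))


class DegreeGreedyAligner(NetworkAlignmentAlgorithm):
    """Toy global aligner that pairs nodes by degree rank (illustration only)."""

    name = "degree-greedy"

    def align(self, net_a, net_b):
        return dict(zip(_by_degree(net_a), _by_degree(net_b)))


REGISTRY = {}


def register(algorithm):
    """Make an aligner discoverable by name, as an app menu might require."""
    REGISTRY[algorithm.name] = algorithm
    return algorithm


def conserved_edges(net_a, net_b, mapping):
    """Count edges of A whose mapped endpoints are also connected in B."""
    seen = set()
    for u, neighbours in net_a.items():
        for v in neighbours:
            if u in mapping and v in mapping:
                if mapping[v] in net_b.get(mapping[u], set()):
                    seen.add(frozenset((u, v)))
    return len(seen)
```

Keeping the contract to a single `align` method is one possible design choice: each contributed algorithm stays independent, and shared code such as scoring or visualization only depends on the returned mapping.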
Task 2: Develop network alignment viewer. Protein and
DNA sequence alignment viewers, such as JalView [55], are
at the core of the bioinformatics toolset. As cross-species
network information grows, we envision increased need
for network alignment viewers, which currently don’t exist
as a class of software. Cytoscape can be used to show a
network alignment result, and a rudimentary viewer is
available in the recently developed GASOLINE app [56]
(Figure 5). However, a general network alignment viewer
requires additional features, including: 1) a standard
network alignment file format useful for importing
alignments from third-party alignment tools, 2) an aligned
node viewer showing information supporting the
alignment, such as a protein sequence alignment or
functional similarity score, 3) support for multi-scale
network alignment results, as determined by e.g.,
GreedyPlus, and 4) highlighting of missing and gained nodes and interactions. We will develop these
features in a Cytoscape App, called NetworkAlignmentViewer, including interoperability with third-
party alignment tools as a key feature. We will develop the viewer in 2D, but also take advantage of a
3D rendering system the Bader lab has developed (beta release at
http://wiki.cytoscape.org/Cytoscape_3/3D_Renderer) to develop a 3D visualization mode, where
networks are shown as planes with alignment links connecting the planes (similar to what is simulated
in 2D by the Gasoline app, Figure 5). This will enable larger networks to be visualized. We will work
with the network alignment community to support additional features.
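Two of the viewer features listed in Task 2, importing an alignment from a third-party tool and highlighting missing and gained interactions, can be sketched as follows. The tab-separated layout (node in A, node in B, optional score) is an assumed placeholder, since no standard alignment file format exists yet, and the function names are hypothetical.

```python
import csv
import io


def read_alignment(text):
    """Parse an assumed tab-separated alignment: nodeA, nodeB, optional score."""
    mapping = {}
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        mapping[row[0]] = row[1]
    return mapping


def diff_aligned_networks(edges_a, edges_b, mapping):
    """Classify interactions under a node mapping from network A to network B.

    edges_a, edges_b: iterables of undirected (u, v) pairs.
    Conserved, lost and gained edges are reported in network B's node names.
    """
    set_b = {frozenset(edge) for edge in edges_b}
    projected, unaligned = set(), set()
    for u, v in edges_a:
        if u in mapping and v in mapping:
            projected.add(frozenset((mapping[u], mapping[v])))
        else:
            unaligned.add(frozenset((u, v)))
    return {
        "conserved": projected & set_b,  # present in both networks
        "lost": projected - set_b,       # present in A, missing in B
        "gained": set_b - projected,     # present in B only
        "unaligned": unaligned,          # A edges with an unmapped endpoint
    }
```

A viewer could then style each class differently, for example drawing lost interactions as dashed lines.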
Task 3: Aggregate and make IIN data available to Cytoscape via PSICQUIC server. While protein
interaction network information is widely available, interface interaction network (IIN) data is currently
difficult to collect from multiple heterogeneous sources, such as the DOMINO database [57], atomic-level
molecular interaction structures in the PDB [58], multiple PDB-derived databases, such as BioLiP [59] and
Interactome3D [60], and experimental data such as that generated by the Bader and Vidal labs [50,61,62].
We will collect protein binding site level information from multiple sources, including those cited
above, and make it available in the standard PSI-MITAB version 2.7 format, which supports binding
site features. We will focus on supporting organisms with the largest amount of this type of data, such
as worm, yeast and human. This data will be made available as a PSICQUIC web service [63]. The
Bader lab already maintains three public PSICQUIC web services – BIND [64], GeneMANIA [65], and
InteroPORC [66] – thus it will be straightforward to set up additional servers. Once a PSICQUIC server is
set up and registered with the central registry, it automatically becomes available for querying within
Cytoscape based on previously developed PSICQUIC import functionality. However, Cytoscape
currently does not recognize binding site information returned from a PSICQUIC server, thus we will
implement that feature. This will enable a workflow where a user downloads IIN data for alignment
using GreedyPlus and visualization with the NetworkAlignmentViewer app. IIN data will also be
available for other Cytoscape apps that consider binding sites and will connect with our goals for
multi-scale network analysis and visualization in the Multi-scale Representations TRD.
Figure 5. NetworkAlignmentViewer
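To make the binding-site step in Task 3 concrete, the sketch below reads the feature columns of one PSI-MITAB 2.7 record. It assumes the 2.7 layout of 42 tab-separated columns with interactor features in columns 37 and 38, and a simplified `type:range(text)` feature syntax; both should be checked against the PSI-MI specification. The identifiers and ranges in the example record are invented.

```python
MITAB27_COLUMNS = 42             # assumed column count for MITAB 2.7
FEATURES_A, FEATURES_B = 36, 37  # zero-based indices of columns 37 and 38


def parse_features(field):
    """Split a feature field such as 'binding site:10-25(note)' into dicts."""
    if field in ("", "-"):
        return []  # '-' marks an empty MITAB field
    features = []
    for item in field.split("|"):
        feature_type, _, rest = item.partition(":")
        feature_range, _, note = rest.partition("(")
        features.append(
            {"type": feature_type, "range": feature_range, "text": note.rstrip(")")}
        )
    return features


def parse_mitab27_line(line):
    """Return interactor identifiers and their binding-site features."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != MITAB27_COLUMNS:
        raise ValueError(f"expected {MITAB27_COLUMNS} columns, got {len(cols)}")
    return {
        "id_a": cols[0],
        "id_b": cols[1],
        "features_a": parse_features(cols[FEATURES_A]),
        "features_b": parse_features(cols[FEATURES_B]),
    }


# Invented example record: every field is empty except the ones set below.
example = ["-"] * MITAB27_COLUMNS
example[0], example[1] = "uniprotkb:EXAMPLE_A", "uniprotkb:EXAMPLE_B"
example[FEATURES_A] = "binding site:10-25(hypothetical SH3 domain)"
example[FEATURES_B] = "binding site:101-110"
record = parse_mitab27_line("\t".join(example))
```

Each parsed feature corresponds to one IIN vertex, so a record with features on both interactors yields one interface-level edge.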
Links with other TRDs. GreedyPlus represents the first multi-scale network alignment algorithm, and
thus relates to work in the Multi-scale Representations TRD. The NetworkAlignmentViewer app will
benefit from multi-scale network visualization knowledge and technology gained in this TRD, which
will support the development of multi-scale network alignment visualization options.
1.3 FACILITATING THE INTERPRETATION OF AP-MS DATA AS INTERACTION
NETWORKS
Project Leaders: Alexander Pico (Gladstone Institutes) and John H. Morris (UCSF)
Overview. Affinity purification mass spectrometry (AP-MS) is a proven technique for determining
large-scale and high-quality protein-protein interaction networks (c.f. [67-69]). AP-MS is now being used
to map networks across biological contexts, e.g., species, viruses, cell lines, conditions, host states.
Recent advances in AP-MS techniques and instrumentation promise a significant increase in the
scale of protein-protein interaction networks derived from AP-MS and the variety of use cases for
AP-MS (c.f. [70,71]). In addition, new public data repositories are becoming available to support mass
spectrometry proteomics experiments. Combined with efforts to determine the entire HEK293T human
interactome through AP-MS [72], these repositories will enable new differential analyses using targeted
AP-MS to explore network changes in response to disease, infection, or other perturbations.
The networks resulting from AP-MS have the significant advantages of being quantitative, giving a
measure of the abundance of the association, and of providing additional biological information such as
the state of post-translational modifications, which are critical to understanding molecular
function. However, the quantitative nature of AP-MS data also leads to one of the challenges of
interpreting the data, requiring both computational and visualization advances over traditional
protein-protein interaction networks (e.g., Y2H-derived) [73].
The goal of this project is to support the quantitative analysis and visualization of AP-MS experiments
and enable the broader community to access quantitative AP-MS networks and visualize those results
within the context of other -omics data.
Objectives. We will develop tools to make specialized methods more accessible and to bridge gaps
to improve frequent workflows. The project has two specific tasks:
• Support researchers using MS-derived data by augmenting Cytoscape with tools to streamline the
typical MS analysis pipeline. These tools will enable data import, filtering, scoring and clustering,
as well as visualization.
• Support the broader research community by augmenting Cytoscape with tools to access public
repositories of quantitative AP-MS data and analyze and visualize that data in context with other -
omics data already supported by Cytoscape. These tools will focus on annotation, data
integration, network augmentation and network comparison, taking advantage of differential
network analysis technology developed in this TRD.
Background and Significance. MS proteomics experiments have been used to explore the
interactome of yeast [67-69], host-pathogen interactions [74-77], signaling networks [78,79], network rewiring in
cancer [80], and even as a component of protein complex structural understanding [81,82]. In all of these
biological applications, a major result is an (explicit or implicit) network model capturing the
interactions of the proteins or, in the case of structural analysis using crosslinking, the interactions of
individual amino acids within those proteins. This creates a natural affinity between network
visualization and analysis tools such as Cytoscape and AP-MS derived data.

As our primary DBP for this work, Dr. Krogan has pioneered strategies for large-scale protein-protein
interaction analysis and is applying these methods to the study of host-pathogen interactions. A
comprehensive and unbiased survey of host-pathogen interactions is revealing critical biology
underlying virus protein homeostasis and evolution. Such an approach will yield global insight towards
chaperone networks, quality control networks, and how these modulate virus replication efficiency,
adaptation and pathogenesis. The differential network analysis of HIV protein interactions with two
human cell lines, for example, served as the driving biomedical project behind the recent Nature
Protocols paper co-authored by Drs. Pico and Morris [83], and drives the technology development project described
here to facilitate the network analysis of AP-MS data.
Expressing AP-MS data as a network and providing users access to quantitative AP-MS data is
challenging, as we learned while preparing the 30-page protocol and 16-page supplemental tutorial.
There is much room for improvement in streamlining this workflow, which will save substantial AP-MS
analysis time. The standard scoring protocol for raw AP-MS data can be completed in 2-3 hours,
resulting in a table of scored interactions that is then imported, augmented, analyzed and visualized in
Cytoscape. The import step alone can take up to 2 hours, and network augmentation can take
another 4 hours, which has been highly frustrating for Krogan lab members and creates a huge
barrier to entry for AP-MS practitioners in general. Providing tools to streamline these two steps
10-fold would significantly expedite the availability and utility of AP-MS networks and increase the
throughput of the Krogan lab and other AP-MS labs. This would also enable researchers using AP-MS
to use the wealth of Cytoscape apps and publicly available data to visualize and analyze their data in
a biologically meaningful context.
In addition to challenges associated with direct analysis of AP-MS data, there are also challenges
associated with providing AP-MS data in a form that can be used by other researchers. Currently,
most AP-MS data is deposited as binary interactions between proteins in public repositories, such as
IMEx [84]. Most journals do not yet require deposition of raw data, though repositories such as PRIDE [85]
exist. A wealth of information present in AP-MS data is lost when the quantitative information on
abundance, reproducibility and specificity is reduced to a simple binary interaction [73]. The results of
scoring protocols can provide more subtle information relevant to distinguishing indirect vs. direct
associations or potentially weak associations that might indicate transient interactions. Furthermore,
while existing interaction repositories provide information about proteins, AP-MS data can include
information about protein post-translational modification (PTM) state, providing another source of
biologically meaningful information (see TRD1.1 and White liposarcoma and Pommier NCI-60 DBPs).
Providing appropriate tools to access AP-MS repositories that include quantitative information and
support appropriate visualizations of the information, including information about scoring results,
abundances, and PTM data will provide researchers with additional biologically meaningful
information that is currently not considered in traditional analysis workflows.
The PTM data and quantitative results from our DBPs will generate networks in various states. The
quantitative information attached to these associations will be used, for example, to find
associations that are transitory or weak in the base network but much more tightly bound in the
perturbed network. This may indicate a pathological condition due to change in PTM state or protein
mutation. To support differential network analysis that uses this quantitative data, we will work with
TRD1.2 to incorporate quantitative information as edge weights into network alignment algorithms,
through modification of the alignment-scoring step.
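A minimal sketch of how quantitative scores could enter differential analysis and alignment scoring is given below. The thresholds, the `1 - |w_a - w_b|` similarity term and all names are assumptions chosen for illustration; the actual scoring modification would be decided with TRD1.2.

```python
def key(u, v):
    """Order-independent key for an undirected association."""
    return tuple(sorted((u, v)))


def strengthened(base, perturbed, weak=0.3, strong=0.7):
    """Associations weak or absent in the base network but strong when perturbed.

    base, perturbed: dicts mapping key(u, v) -> score in [0, 1].
    The thresholds `weak` and `strong` are arbitrary placeholder values.
    """
    return sorted(
        edge
        for edge, score in perturbed.items()
        if score >= strong and base.get(edge, 0.0) <= weak
    )


def weighted_conservation(weights_a, weights_b, mapping):
    """Alignment score in which each conserved edge counts by weight similarity.

    An unweighted scorer would add 1 per conserved edge; here the assumed
    contribution is 1 - |w_a - w_b|, so equally strong edges count fully.
    """
    total = 0.0
    for (u, v), w_a in weights_a.items():
        if u in mapping and v in mapping:
            w_b = weights_b.get(key(mapping[u], mapping[v]))
            if w_b is not None:
                total += 1.0 - abs(w_a - w_b)
    return total
```

When the two networks are conditions of the same system, the mapping is simply the identity, and `strengthened` gives the candidate associations described in the paragraph above.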
Methods
We will develop a set of tools for mass spectrometry practitioners to seamlessly transition into
network-based visualizations and analyses, without needlessly reducing the wealth of information
contained in AP-MS data. As discussed above, and directed by our collaborators and DBPs, these
tools will address the challenges of visualizing and interpreting AP-MS data from scoring and
annotation to network augmentation and comparison. Building on the Cytoscape platform, these tools
will be implemented as a set of interoperable apps, maximizing accessibility and ease-of-use.

Task 1: Tools to support the use of Cytoscape by AP-MS practitioners. These tools will streamline the
formatting and import of a typical AP-MS data set, based on what we have learned by working closely
with Krogan lab members on their analysis. As captured in the Morris et al. protocol (in press), even
after the scoring procedure is completed, the process of importing AP-MS results into Cytoscape is an
unnecessarily complicated set of steps. This initial barrier effectively excludes all but expert
Cytoscape users from working with AP-MS data sets in Cytoscape. The first tool will thus be a
specialized file importer that takes an AP-MS data file and performs network import, edge attribute
import, prey attribute import and bait attribute import, all in a single step. With protein identities and
interactions properly associated with quantitative information in Cytoscape's network and table model,
these data can then leverage relevant existing apps and tools in Cytoscape, such as the clustering
and visualization tool, clusterMaker [86], which is used to identify modules and protein complexes in
AP-MS data. We will also develop AP-MS specific apps, such as one for viewing abundances, MS
spectra, and peptide lists associated with proteins in the network, and one for accessing public
AP-MS data from repositories, such as MassIVE (http://massive.ucsd.edu), including information about
scores and (if available) information such as abundance counts.
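The single-step import described in Task 1 could be sketched as follows. The column names (`Bait`, `Prey`, `Score`, `SpectralCount`) and the assumption that every remaining column is numeric are hypothetical; real scored AP-MS tables vary by scoring tool.

```python
import csv
import io


def import_apms_table(text, bait_col="Bait", prey_col="Prey"):
    """Build node and edge tables from a tab-separated scored AP-MS table.

    Returns (nodes, edges):
      nodes: protein name -> {"roles": set of "bait" and/or "prey"}
      edges: (bait, prey) -> dict of numeric edge attributes
    """
    nodes, edges = {}, {}
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        bait, prey = row.pop(bait_col), row.pop(prey_col)
        # One pass records the network, the bait/prey roles and edge attributes.
        nodes.setdefault(bait, {"roles": set()})["roles"].add("bait")
        nodes.setdefault(prey, {"roles": set()})["roles"].add("prey")
        edges[(bait, prey)] = {name: float(value) for name, value in row.items()}
    return nodes, edges


# Invented example table; P1 appears both as a bait and as a prey.
EXAMPLE_TABLE = (
    "Bait\tPrey\tScore\tSpectralCount\n"
    "B1\tP1\t0.95\t12\n"
    "B1\tP2\t0.40\t3\n"
    "P1\tP2\t0.80\t7\n"
)
example_nodes, example_edges = import_apms_table(EXAMPLE_TABLE)
```

Recording the bait and prey roles at import time is what later allows the augmentation step to treat the two kinds of protein differently.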
Task 2: Tools for augmenting AP-MS interaction networks. Once AP-MS data is more easily
accessed, imported and assessed from within Cytoscape, a host of methods become more
applicable. In our work with the Krogan DBP and other collaborators, one of the most frequently faced
challenges relates to network augmentation, which involves loading additional data onto a network to
support integrative analysis. Augmentation is often a prerequisite for other types of network analysis,
to remedy sparse interaction data, and leverages context from orthogonal and curated sources.
Relevant data types include:
• other protein oriented data (e.g., protein interactions from GeneMANIA and other AP-MS data)
• human genetic information, including disease linked genes (e.g., GWAS; see TRD2)
• known pathways (e.g., KEGG, WikiPathways, Pathway Commons)
• gene function information (e.g., GO; see TRD3)
• gene expression datasets (e.g., ArrayExpress, GEO)
While many of these are already accessible via apps in Cytoscape, users face a convoluted set of
steps – once again requiring expertise – to distinguish bait proteins from prey proteins when forming a
database query or when filtering returned results. In host-pathogen networks, for example, this
distinction is relevant to organism context in the augmentation step. For this, we will develop a
dedicated interface for augmenting AP-MS interaction networks that considers these factors (based
on the information provided by our importer tool) and provides lists or calls to corresponding
Cytoscape apps and web services.
Analysis and visualization of the differential networks being mapped using AP-MS methods is crucial
to gaining insight into changes to pathways and complexes during infection, disease and other
perturbations. By providing tools in the main areas described above, the Krogan lab and others will be
able to increase the throughput and analyses of AP-MS data sets by streamlining both the processing
and contextualization of AP-MS data with other biological data sets. Furthermore, the broader
research community will have better access to quantitative AP-MS data to use as a platform for
viewing their own data in context or as a baseline for comparing against perturbed (e.g., diseased)
sub-networks.

TRD 1: DIFFERENTIAL NETWORKS –
BIBLIOGRAPHY AND REFERENCES CITED
1. A., K. et al. Perturbation biology models predict c-Myc as an effective co-target in RAF inhibitor
resistant melanoma cells. Biorxiv (2014).
2. Miller, M.L. et al. Drug synergy screen and network modeling in dedifferentiated liposarcoma
identifies CDK4 and IGF1R as synergistic drug targets. Sci Signal 6, ra85 (2013).
3. Molinelli, E.J. et al. Perturbation biology: inferring signaling networks in cellular systems. PLoS
Comput Biol 9, e1003290 (2013).
4. Nelander, S. et al. Models from experiments: combinatorial drug perturbations of cancer cells.
Mol Syst Biol 4, 216 (2008).
5. Machado, D. et al. Modeling formalisms in Systems Biology. AMB Express 1, 45 (2011).
6. Li, K.C. & Yuan, S. A functional genomic study on NCI's anticancer drug screen.
Pharmacogenomics J 4, 127-35 (2004).
7. Cerami, E.G. et al. Pathway Commons, a web resource for biological pathway data. Nucleic
Acids Res 39, D685-90 (2011).
8. Demir, E. et al. The BioPAX community standard for pathway data sharing. Nat Biotechnol 28,
935-42 (2010).
9. Schurer, S.C. & Muskal, S.M. Kinome-wide activity modeling from diverse public high-quality
data sets. J Chem Inf Model 53, 27-38 (2013).
10. Alekseyenko, A.V. et al. Causal graph-based analysis of genome-wide association data in
rheumatoid arthritis. Biol Direct 6, 25 (2011).
11. Kim, Y.A., Wuchty, S. & Przytycka, T.M. Identifying causal genes and dysregulated pathways
in complex diseases. PLoS Comput Biol 7, e1001095 (2011).
12. Kramer, A., Green, J., Pollard, J., Jr. & Tugendreich, S. Causal analysis approaches in
Ingenuity Pathway Analysis. Bioinformatics 30, 523-30 (2014).
13. Li, J. & Lu, Z. Pathway-based drug repositioning using causal inference. BMC Bioinformatics
14 Suppl 16, S3 (2013).
14. Shin, S.Y. et al. Interrogating causal pathways linking genetic variants, small molecule
metabolites, and circulating lipids. Genome Med 6, 25 (2014).
15. Cerami, E., Demir, E., Schultz, N., Taylor, B.S. & Sander, C. Automated network analysis
identifies core pathways in glioblastoma. PLoS ONE 5, e8918 (2010).
16. Barretina, J. et al. Subtype-specific genomic alterations define new targets for soft-tissue
sarcoma therapy. Nat Genet 42, 715-21 (2010).
17. Le Novere, N. et al. The Systems Biology Graphical Notation. Nat Biotechnol 27, 735-41
(2009).
18. Babur, O. et al. Integrating biological pathways and genomic profiles with ChiBE 2. BMC
Genomics 15, 642 (2014).
19. Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular
interaction networks. Genome Res 13, 2498-504 (2003).
20. Goncalves, E., van Iersel, M. & Saez-Rodriguez, J. CySBGN: a Cytoscape plug-in to integrate
SBGN maps. BMC Bioinformatics 14, 17 (2013).
21. van Iersel, M.P. et al. Software support for SBGN maps: SBGN-ML and LibSBGN.
Bioinformatics 28, 2016-21 (2012).
22. Hu, Z. et al. VisANT 4.0: Integrative network platform to connect genes, drugs, diseases and
therapies. Nucleic Acids Res 41, W225-31 (2013).
23. Royer, L., Reimann, M., Andreopoulos, B. & Schroeder, M. Unraveling protein networks with
power graph analysis. PLoS Comput Biol 4, e1000108 (2008).
24. Demir, E. et al. Using biological pathway data with paxtools. PLoS Comput Biol 9, e1003194
(2013).
25. Vogt, T., Czauderna, T. & Schreiber, F. Translation of SBGN maps: Process Description to
Activity Flow. BMC Syst Biol 7, 115 (2013).

26. Hoffmann, R. & Valencia, A. A gene network for navigating the literature. Nat Genet 36, 664
(2004).
27. Xing, E.P., Jordan, M.I., Russell, S. & Ng, A.Y. Distance metric learning with application to
clustering with side-information. Advances in Neural Information Processing Systems 15,
505-512 (2002).
28. Newman, M.E. & Girvan, M. Finding and evaluating community structure in networks. Phys
Rev E Stat Nonlin Soft Matter Phys 69, 026113 (2004).
29. Subramanian, A. et al. Gene set enrichment analysis: a knowledge-based approach for
interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A 102, 15545-50 (2005).
30. Aksoy, B.A. et al. PiHelper: an open source framework for drug-target and antibody-target
data. Bioinformatics 29, 2071-2 (2013).
31. Juty, N., Le Novere, N. & Laibe, C. Identifiers.org and MIRIAM Registry: community resources
to provide persistent identification. Nucleic Acids Res 40, D580-6 (2012).
32. Kelley, B.P. et al. PathBLAST: a tool for alignment of protein interaction networks. Nucleic
Acids Res 32, W83-8 (2004).
33. Kalaev, M., Smoot, M., Ideker, T. & Sharan, R. NetworkBLAST: comparative analysis of
protein networks. Bioinformatics 24, 594-6 (2008).
34. Pache, R.A. & Aloy, P. A novel framework for the comparative analysis of biological networks.
PLoS One 7, e31220 (2012).
35. Koyuturk, M. et al. Pairwise alignment of protein interaction networks. J Comput Biol 13, 182-
99 (2006).
36. Singh, R., Xu, J. & Berger, B. Global alignment of multiple protein interaction networks with
application to functional orthology detection. Proc Natl Acad Sci U S A 105, 12763-8 (2008).
37. Liao, C.S., Lu, K., Baym, M., Singh, R. & Berger, B. IsoRankN: spectral methods for global
alignment of multiple protein networks. Bioinformatics 25, i253-8 (2009).
38. Kuchaiev, O., Milenkovic, T., Memisevic, V., Hayes, W. & Przulj, N. Topological network
alignment uncovers biological function and phylogeny. J R Soc Interface 7, 1341-54 (2010).
39. Milenkovic, T., Ng, W.L., Hayes, W. & Przulj, N. Optimal network alignment with graphlet
degree vectors. Cancer Inform 9, 121-37 (2010).
40. Kuchaiev, O. & Przulj, N. Integrative network alignment reveals large regions of global network
similarity in yeast and human. Bioinformatics 27, 1390-6 (2011).
41. Memisevic, V. & Przulj, N. C-GRAAL: common-neighbors-based global GRAph ALignment of
biological networks. Integr Biol (Camb) 4, 734-43 (2012).
42. Flannick, J., Novak, A., Srinivasan, B.S., McAdams, H.H. & Batzoglou, S. Graemlin: general
and robust alignment of multiple large interaction networks. Genome Res 16, 1169-81 (2006).
43. Flannick, J., Novak, A., Do, C.B., Srinivasan, B.S. & Batzoglou, S. Automatic parameter
learning for multiple local network alignment. J Comput Biol 16, 1001-22 (2009).
44. Pinter, R.Y., Rokhlenko, O., Yeger-Lotem, E. & Ziv-Ukelson, M. Alignment of metabolic
pathways. Bioinformatics 21, 3401-8 (2005).
45. Cheng, Q., Harrison, R. & Zelikovsky, A. MetNetAligner: a web service tool for metabolic
network alignments. Bioinformatics 25, 1989-90 (2009).
46. Wernicke, S. & Rasche, F. Simple and fast alignment of metabolic pathways by exploiting local
diversity. Bioinformatics 23, 1978-85 (2007).
47. Cakmak, A. & Ozsoyoglu, G. Mining biological networks for unknown pathways. Bioinformatics
23, 2775-83 (2007).
48. Johnson, M.E. & Hummer, G. Interface-Resolved Network of Protein-Protein Interactions.
PLoS Comput Biol 9, e1003065 (2013).
49. Han, J.-D.J. et al. Evidence for dynamically organized modularity in the yeast protein-protein
interaction network. Nature 430, 88-93 (2004).
50. Xin, X. et al. SH3 interactome conserves general function over specific form. Mol Syst Biol 9,
652 (2013).
51. Reimand, J., Hui, S., Jain, S., Law, B. & Bader, G.D. Domain-mediated protein interaction
prediction: From genome to network. FEBS Lett 586, 2751-63 (2012).
52. Reimand, J., Wagih, O. & Bader, G.D. The mutational landscape of phosphorylation signaling
in cancer. Sci Rep 3 (2013).
53. Morris, J.H. et al. clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC
Bioinformatics 12, 436 (2011).
54. Kuchaiev, O., Stevanovic, A., Hayes, W. & Przulj, N. GraphCrunch 2: Software tool for network
modeling, alignment and clustering. BMC Bioinformatics 12, 24 (2011).
55. Waterhouse, A.M., Procter, J.B., Martin, D.M., Clamp, M. & Barton, G.J. Jalview Version 2--a
multiple sequence alignment editor and analysis workbench. Bioinformatics 25, 1189-91
(2009).
56. Micale, G., Pulvirenti, A., Giugno, R. & Ferro, A. GASOLINE: a Greedy And Stochastic
algorithm for optimal Local multiple alignment of Interaction NEtworks. PLoS ONE 9, e98750
(2014).
57. Ceol, A. et al. DOMINO: a database of domain-peptide interactions. Nucleic Acids Res 35,
D557-60 (2007).
58. Berman, H.M. et al. The Protein Data Bank. Nucleic Acids Res 28, 235-42 (2000).
59. Yang, J., Roy, A. & Zhang, Y. BioLiP: a semi-manually curated database for biologically
relevant ligand-protein interactions. Nucleic Acids Res 41, D1096-103 (2013).
60. Mosca, R., Ceol, A. & Aloy, P. Interactome3D: adding structural details to protein networks.
Nat Methods 10, 47-53 (2013).
61. Tonikian, R. et al. Bayesian modeling of the yeast SH3 domain interactome predicts
spatiotemporal dynamics of endocytosis proteins. PLoS Biol 7, e1000218 (2009).
62. Zhong, Q. et al. Edgetic perturbation models of human inherited disorders. Mol Syst Biol 5,
321 (2009).
63. Aranda, B. et al. PSICQUIC and PSISCORE: accessing and scoring molecular interactions.
Nat Methods 8, 528-9 (2011).
64. Isserlin, R., El-Badrawi, R.A. & Bader, G.D. The Biomolecular Interaction Network Database in
PSI-MI 2.5. Database (Oxford) 2011, baq037 (2011).
65. Zuberi, K. et al. GeneMANIA prediction server 2013 update. Nucleic Acids Res 41, W115-22
(2013).
66. Michaut, M. et al. InteroPORC: automated inference of highly conserved protein interaction
networks. Bioinformatics 24, 1625-31 (2008).
67. Ho, Y. et al. Systematic identification of protein complexes in Saccharomyces cerevisiae by
mass spectrometry. Nature 415, 180-3 (2002).
68. Gavin, A.C. et al. Functional organization of the yeast proteome by systematic analysis of
protein complexes. Nature 415, 141-7 (2002).
69. Krogan, N.J. et al. Global landscape of protein complexes in the yeast Saccharomyces
cerevisiae. Nature 440, 637-43 (2006).
70. Hebert, A.S. et al. Neutron-encoded mass signatures for multiplexed proteome quantification.
Nat Methods 10, 332-4 (2013).
71. Hebert, A.S. et al. NeuCode Mouse and One Hour Proteomes. in Eleventh International
Symposium on Mass Spectrometry in the Health and Life Sciences: Molecular & Cellular
Proteomics (ed. Burlingame, A.L.) (The American Society for Biochemistry and Molecular
Biology, Inc., San Francisco, CA USA, 2014).
72. Huttlin, E. et al. High-Throughput Proteomic Mapping of Protein Interaction Networks: Toward
a Global View of the Human Interactome. in Eleventh International Symposium on Mass
Spectrometry in the Health and Life Sciences: Molecular & Cellular Proteomics (ed.
Burlingame, A.L.) (The American Society for Biochemistry and Molecular Biology, Inc., San
Francisco, CA USA, 2014).
73. Gingras, A.C. & Raught, B. Beyond hairballs: The use of quantitative mass spectrometry data
to understand protein-protein interactions. FEBS Lett 586, 2723-31 (2012).
74. Jager, S. et al. Global landscape of HIV-human protein complexes. Nature 481, 365-70
(2012).

75. White, E.A. et al. Systematic identification of interactions between host cell proteins and E7
oncoproteins from diverse human papillomaviruses. Proc Natl Acad Sci U S A 109, E260-7
(2012).
76. Pichlmair, A. et al. Viral immune modulators perturb the human molecular network by common
and unique strategies. Nature 487, 486-90 (2012).
77. Munday, D.C. et al. Using SILAC and quantitative proteomics to investigate the interactions
between viral and host proteomes. Proteomics 12, 666-72 (2012).
78. Bisson, N. et al. Selected reaction monitoring mass spectrometry reveals the dynamics of
signaling through the GRB2 adaptor. Nat Biotechnol 29, 653-8 (2011).
79. Song, J., Wang, Z. & Ewing, R.M. Integrated analysis of the Wnt responsive proteome in
human cells reveals diverse and cell-type specific networks. Mol Biosyst 10, 45-53 (2014).
80. Song, J., Hao, Y., Du, Z., Wang, Z. & Ewing, R.M. Identifying novel protein complexes in
cancer cells using epitope-tagging of endogenous human genes and affinity-purification mass
spectrometry. J Proteome Res 11, 5630-41 (2012).
81. Stengel, F., Aebersold, R. & Robinson, C.V. Joining forces: integrating proteomics and
cross-linking with the mass spectrometry of intact complexes. Mol Cell Proteomics 11,
R111.014027 (2012).
82. Walzthoeni, T., Leitner, A., Stengel, F. & Aebersold, R. Mass spectrometry supported
determination of protein complex structure. Curr Opin Struct Biol 23, 252-60 (2013).
83. Morris, J.H., Knight, G.M., Verschueren, E., Johnson, J.R., Cimermancic, P., Greninger, A.L. &
Pico, A.R. Affinity Purification-Mass Spectrometry and Network Analysis to Understand
Protein-Protein Interactions (accepted, pending publication). Nat Protoc (2014).
84. Orchard, S. et al. Protein interaction data curation: the International Molecular Exchange
(IMEx) consortium. Nat Methods 9, 345-50 (2012).
85. Vizcaino, J.A. et al. The PRoteomics IDEntifications (PRIDE) database and associated tools:
status in 2013. Nucleic Acids Res 41, D1063-9 (2013).
86. Morris, J.H. et al. clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC
Bioinformatics 12, 436 (2011).
