Summary
The Cytoscape Cyberinfrastructure (CI) extends the successful Cytoscape development and community model by enabling network biologists to contribute and leverage microservices deployable at scale. The CI solves many of Cytoscape’s limitations while also delivering novel and dynamic functionality to both Cytoscape and standalone workflows, thus further empowering the already vital network biology community.
Abstract
Cytoscape is an indispensable tool for network data analysis and visualization. One of Cytoscape’s greatest strengths is that it is powered by a vibrant array of developer-contributed apps. However, as network biologists’ requirements evolve, Cytoscape is challenged not only to keep pace, but to lead new and existing developers to create even greater value. Currently, multiscale and multifaceted networks push the memory limits of a Cytoscape workstation, while complex calculations such as Network Based Stratification and Network Based GWAS strain workstation processors. Increasingly, users demand support for collaborative projects, reproducible workflows, and interoperability with external tool chains. Finally, economic pressures favor solutions that promote code and algorithm reusability and evolvability.
In response, we have created the Cytoscape Cyberinfrastructure (CI), which is both an Internet-scale distributed system (based on Microservices [1]) and the network biology community it serves. Its mission is to enable and encourage network biologists to create and deploy high quality, innovative and scalable services focusing on network-based computation, collaboration and visualization.
Microservices can be written in any language, and are highly testable and evolvable. They can run on servers ranging from a single thread to a large cloud-based cluster. They can easily be reused in reproducible workflows or can serve as components in larger services. The CI links microservices via a lightweight REST-based aspect-oriented interchange protocol (called CX), which enables tailored data streams while supporting service innovation via evolvable standards. CI infrastructure services support user authentication, long duration job execution, and a service repository that enables researchers to publish their services or discover services published by others. This model builds on the successful Cytoscape app community, which is based on similar mechanisms though at the scale of individual workstations.
Prominent examples of microservices include NDEx [2] (a repository for biological networks), NodeWalker (which uses heat dispersion to identify the most relevant subnetworks containing a given set of genes), cyNetShare [3] (which visualizes a network in a browser) and Cytoscape itself (which can also call CI services). Interfaces are available for Python, IPython, R and Matlab. Future work includes adding clustering, analysis, layout, publishing and display microservices and interfaces to Galaxy and Taverna workflows.
2. Cytoscape’s 3 Wishes
More …
memory for networks
cores for analysis
code reusability
languages/libraries for coding
Better browser presence
Access to long running calculations
Quicker/cheaper novel workflows
Higher quality, more shareable code
Even more vibrant NB community
3. Cytoscape Cyberinfrastructure (CI)
Internet-based computing ecosystem that
Complements Cytoscape
Supports producers, consumers, and operators (as COIs)
Scales and evolves to support data acquisition, computing, storage, management, integration, mining, and visualization
Sharable and Testable
Coevolution – community w/ CI / community
Service Oriented Architecture (SOA)
Microservices + data bus + discovery
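The "discovery" element above can be pictured with a small sketch. This is only an illustration of the publish/discover idea, not the CI service repository itself: the `ServiceRecord` and `ServiceRegistry` names, the tag scheme, and the example.org URLs are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class ServiceRecord:
    """Hypothetical description of one published microservice."""
    name: str
    url: str
    tags: FrozenSet[str] = frozenset()


class ServiceRegistry:
    """Toy in-memory repository: producers publish, consumers discover."""

    def __init__(self) -> None:
        self._records: Dict[str, ServiceRecord] = {}

    def publish(self, record: ServiceRecord) -> None:
        # Re-publishing under the same name replaces the earlier record.
        self._records[record.name] = record

    def discover(self, tag: str) -> List[ServiceRecord]:
        # Return matching services in a stable, name-sorted order.
        matches = [r for r in self._records.values() if tag in r.tags]
        return sorted(matches, key=lambda r: r.name)


# Illustrative entries only; these URLs are placeholders, not real services.
registry = ServiceRegistry()
registry.publish(
    ServiceRecord("id-translation", "http://example.org/idmap", frozenset({"annotation"}))
)
registry.publish(
    ServiceRecord("force-layout", "http://example.org/layout", frozenset({"layout"}))
)
layout_urls = [r.url for r in registry.discover("layout")]
```

A client that only knows it needs a "layout" service can look one up at run time instead of hard-coding an address, which is what lets new services appear without changes to existing clients.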
4. Roadmap
Existing Ecosystem
Cyberinfrastructure (CI) & Network Biology
Use Cases
Strategy
Technology
SOA & REST
CX & middleware
CI Now & Later
Support
Call to Community
6. CI & Network Biology
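[Figure: a network biology workflow with the steps Identify network, Add data to network, Layout nodes, Color nodes, and Publish. Clients are shown on one side and services on the other; BridgeDB and several boxes labeled "New Service" appear among the services.]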
Critical CI Outcomes
Cheap services ~ innovation
Reproducible workflows
Interoperable tool chains
Code & algorithm reusability
Community
9. Generic Microservices
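[Figure: two sequence diagrams along a time axis. Direct call: Producer sends StoreData(xxx) to Database, and Database replies OK. Via message bus: Producer sends StoreData(xxx) to the Message Bus, which passes it to Database, and OK is returned.]
For a service, the meaning of life: y = f(x)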
Benefits
Loose Coupling
Late Binding
Decentralized Governance
Scalability
Reusability
Distributability
Portability
Composability
Interoperability
Testability
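The loose coupling and late binding listed above can be shown with a toy bus. This is a minimal in-process sketch, assuming a topic named after the StoreData message on the slide; the `MessageBus`, `Database`, and `produce` names are hypothetical, and a real deployment would use a network message broker rather than a Python dictionary.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class MessageBus:
    """Toy in-process bus: producers publish to a topic, subscribers handle it."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[dict], str]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], str]) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> List[str]:
        # Late binding: handlers are looked up when the message is sent.
        return [handler(message) for handler in self._handlers[topic]]


class Database:
    """Stand-in consumer that stores whatever it receives."""

    def __init__(self) -> None:
        self.rows: List[dict] = []

    def store_data(self, message: dict) -> str:
        self.rows.append(message)
        return "OK"


def produce(bus: MessageBus, payload: dict) -> List[str]:
    # Loose coupling: the producer knows the topic name, not the Database class.
    return bus.publish("StoreData", payload)


bus = MessageBus()
database = Database()
bus.subscribe("StoreData", database.store_data)
replies = produce(bus, {"value": "xxx"})
```

Because the producer never imports `Database`, the consumer can be replaced, duplicated, or moved to another machine without touching producer code.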
10. Cytoscape CI
[Figure: Cytoscape CI architecture. Applications and services are connected through a message bus (the Internet). Labels in the diagram: Cytoscape Desktop, Data Model, Layouts, R/Python/Matlab, cyNetShare, Journal Publishing, Personal Publishing, NDEx (Store/Retrieve), NeXO, GeneMANIA, BridgeDB, MCODE, Analytics, Layout.]
CX is an aspect-oriented transfer format
CX carries networks and related data
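The aspect-oriented idea can be sketched as a stream of independent fragments, each carrying one kind of data, from which a consumer keeps only what it understands. The sketch below is an illustration under stated assumptions, not the CX specification: the aspect names (`nodes`, `edges`, `layout`) and element fields are invented for the example, and `select_aspects` is a hypothetical helper.

```python
import json
from typing import Any, Dict, List, Set

Fragment = Dict[str, List[Dict[str, Any]]]

# Hypothetical network split into independent aspects.
STREAM: List[Fragment] = [
    {"nodes": [{"id": 1, "name": "TP53"}, {"id": 2, "name": "MDM2"}]},
    {"edges": [{"id": 3, "source": 1, "target": 2}]},
    {"layout": [{"node": 1, "x": 0.0, "y": 0.0}, {"node": 2, "x": 50.0, "y": 10.0}]},
]


def select_aspects(stream: List[Fragment], wanted: Set[str]) -> List[Fragment]:
    """Keep only the fragments whose aspect name the consumer understands."""
    selected: List[Fragment] = []
    for fragment in stream:
        kept = {name: elems for name, elems in fragment.items() if name in wanted}
        if kept:
            selected.append(kept)
    return selected


def encode(stream: List[Fragment]) -> str:
    """Serialize the stream as JSON for transfer between services."""
    return json.dumps(stream)


# An analysis service that ignores coordinates asks only for topology.
topology = select_aspects(STREAM, {"nodes", "edges"})
wire_text = encode(topology)
```

A service that adds a new kind of aspect does not break older consumers, since they simply skip fragments they do not recognize.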
13. API Perspective - Simple
[Figure: a client and a service, each with a CX library, communicate over REST. The client sends a service call (with CX) and the service returns results (with CX).]
Long running jobs require long running clients
Allows only one service at a time
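The simple pattern amounts to one blocking request per service, which is why a long-running job ties up the client. A minimal sketch follows, assuming the service accepts a JSON-encoded CX stream by HTTP POST and returns one in the response body; the `call_service` function, the `opener` parameter, the example.org URL, and the offline `fake_echo_service` stand-in are all hypothetical.

```python
import io
import json
import urllib.request
from typing import Any, Callable, List

CxStream = List[dict]


def call_service(
    url: str,
    cx: CxStream,
    opener: Callable[..., Any] = urllib.request.urlopen,
    timeout: float = 30.0,
) -> CxStream:
    """POST a CX stream and block until the service sends its result back."""
    request = urllib.request.Request(
        url,
        data=json.dumps(cx).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # The client waits here for the whole duration of the computation.
    with opener(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def fake_echo_service(request: urllib.request.Request, timeout: float) -> io.BytesIO:
    """Offline stand-in for a real service: returns the CX it was sent."""
    return io.BytesIO(request.data)


# Uses the stand-in so the example runs without a network connection.
result = call_service(
    "http://example.org/echo",
    [{"nodes": [{"id": 1}]}],
    opener=fake_echo_service,
)
```

Swapping `fake_echo_service` for the default `urllib.request.urlopen` would make a real HTTP call, and the client would remain blocked until the response arrives or the timeout expires.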
14. API Perspective - Elaborated
[Figure: elaborated architecture. The client (with CX library) talks over REST to a Submit Agent, a Status Monitor, and a Results Collector. A service call (with CX) returns a jobID; the call is passed through a message broker (MQ) and a load balancer to nodes, each hosting a service with its interface and CX library. Jobs are marked Queued, Running, or Complete in a Monitor Database, and results are saved to a Results Database. The client issues a status call (jobID) and receives a status return, then issues a results call (jobID) and receives the results return (with CX).]
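The submit, status, and results calls in this diagram can be sketched as a small job service. This is a single-process illustration under stated assumptions, not the CI implementation: the `JobService` class, the job ID format, and the in-memory dictionaries standing in for the monitor and results databases are hypothetical, and `run_next` plays the part of a worker node.

```python
import itertools
from collections import deque
from typing import Callable, Deque, Dict, List, Optional, Tuple

CxStream = List[dict]


class JobService:
    """Toy submit/status/results service for long-running jobs."""

    def __init__(self, service: Callable[[CxStream], CxStream]) -> None:
        self._service = service
        self._ids = itertools.count(1)
        self._queue: Deque[Tuple[str, CxStream]] = deque()
        self._status: Dict[str, str] = {}          # stands in for the monitor database
        self._results: Dict[str, CxStream] = {}    # stands in for the results database

    def submit(self, cx: CxStream) -> str:
        # Returns immediately with a job ID instead of waiting for the result.
        job_id = f"job-{next(self._ids)}"
        self._queue.append((job_id, cx))
        self._status[job_id] = "Queued"
        return job_id

    def run_next(self) -> Optional[str]:
        # One worker step: take the oldest queued job and run it to completion.
        if not self._queue:
            return None
        job_id, cx = self._queue.popleft()
        self._status[job_id] = "Running"
        self._results[job_id] = self._service(cx)
        self._status[job_id] = "Complete"
        return job_id

    def status(self, job_id: str) -> str:
        return self._status.get(job_id, "Unknown")

    def results(self, job_id: str) -> CxStream:
        if self._status.get(job_id) != "Complete":
            raise LookupError(f"results for {job_id} are not available")
        return self._results[job_id]


def count_aspects(cx: CxStream) -> CxStream:
    """Hypothetical service: reports how many fragments it received."""
    return [{"summary": [{"fragments": len(cx)}]}]


jobs = JobService(count_aspects)
job_id = jobs.submit([{"nodes": []}, {"edges": []}])
status_before = jobs.status(job_id)
jobs.run_next()
status_after = jobs.status(job_id)
output = jobs.results(job_id)
```

Since the client holds only a job ID, it can disconnect after submitting and return later to poll status and collect results, and several jobs can be queued at once.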
16. CI Now
[Figure: current CI components. Labels in the diagram: Cytoscape, App Store, cyREST, R / Python / Matlab / C#, cyNetShare, ScienceDirect, Cyrface, NDEx, NAV, cytoscape.js (shown in several components), Network Based Stratification, Heat Dissipation, ID Translation (BridgeDB). Connection labels: XGMML, .cyjs, WS/SOAP.]
17. CI Later
[Figure: planned CI components. Labels in the diagram: Cytoscape, App Store, cyREST/CX, CIAuth, R / Python / Matlab / C#, cyNetShare, ScienceDirect, Cyrface, NDEx, NAV, cytoscape.js (shown in several components), Network Based Stratification, Heat Dissipation, ID Translation (BridgeDB), Layouts, Clustering (?MCODE?), Network Prediction (?GeneMANIA?), Attribute Merge, Enrichment, ?DREAM?, ?GIANT?, ?Taverna?, ?Galaxy?. Connections are labeled CX.]
18. CI Later w/Reuse
[Figure: planned CI components with reuse among services. Labels in the diagram: Cytoscape, App Store, cyREST/CX, CIAuth, R / Python / Matlab / C#, cyNetShare, ScienceDirect, Cyrface, NDEx, NAV, cytoscape.js (shown in several components), Network Based Stratification, Heat Dissipation, ID Translation (BridgeDB), Layouts, Clustering (?MCODE?), Network Prediction (?GeneMANIA?), Attribute Merge, Enrichment, ?DREAM?, ?GIANT?, ?Taverna?, ?Galaxy?. Connections are labeled CX.]
19. Support
National Resource for Network Biology (NRNB)
Supports software and staging hardware
Pharma & NCI support NDEx
Elsevier
All sources open and on GitHub
20. Call to Community
App authorship: Cytoscape community thrives
Pride of authorship, listing in App Store
Tangible realization of useful research
Valuable workflows for all to use
Publishable results (e.g., F1000)
CI community inherits all of these! … but also:
More direct path from algorithm to useful code
Wider audience
Easier coding & dissemination
Better coding practices
More resources
More Information
bdemchak@ucsd.edu
21. Reading List
[1] http://martinfowler.com/articles/microservices.html
[2] http://home.ndexbio.org/about-ndex-2
[3] http://idekerlab.github.io/cy-net-share
[4] Lincoln Stein. Towards a cyberinfrastructure for the biological sciences: progress, visions and challenges. http://www.nature.com/nrg/journal/v9/n9/full/nrg2414.html
[5] Barry Demchak, et al. PALMS: A Modern Coevolution of Community and Computing Using Policy Driven Development. https://sosa.ucsd.edu/ResearchCentral/view.jsp?id=203
[6] Stephen Goff, et al. The iPlant collaborative: cyberinfrastructure for plant biology. http://journal.frontiersin.org/article/10.3389/fpls.2011.00034/pdf