1. Big Data goes 3D: BioLayout Express3D
Prof Tom Freeman
University of Edinburgh
2. Network Graphs of (Biological) Relationships
Many types of data, biological or otherwise, are best
viewed and interrogated as networks, visualised as so-called
network graphs.
In biology these may include:
• Social interactions between individuals
• Transmission of disease
• Relationships (evolutionary, homology) between genes and proteins
• Interactions between proteins (data, co-citation, pathway models)
• ‘omics data
[Figure panels: Spread of TB via contact tracing; Protein homology; Pathways; Protein interaction]
3. Example: Microarray Gene Expression Data
• We can sequence and measure the tissue-specific activity of 23,000
genes in the human body
• Microarrays comprise 1000s to millions of DNA probes and are
routinely used to measure activity across the genome
• They produce highly complex data; analysis and visualisation are
challenging
• BioLayout Express3D was originally developed to analyse this kind
of data through the use of 3D network graphs
[Figure panels: Microarrays; Display of statistical hits; Display of clusters]
4. Example (cont.): Steps Involved in Analyzing Gene Expression Data
• Microarray data (many measurements over many samples)
imported
• Co-expression defined using correlation measure (read: is gene A
upregulated in the same samples as gene B?)
• Genes (nodes) are connected to each other in a network based on
their level of co-expression (edges) (read: pretty graphs!)
[Figure: a 50,000 × 50,000 correlation matrix requires about 1.25 billion calculations; only correlations with r above the chosen threshold are kept]
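A minimal sketch of the co-expression step, assuming each probe is a list of measurements across the same samples. The gene names and values are made-up toy data, and `pearson` and `pair_count` are illustrative helpers, not BioLayout functions. It also shows where the 1.25 billion figure comes from: 50,000 probes give 50,000 × 49,999 / 2 unique pairs.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # flat profile: correlation is undefined, treated as 0 here
    return cov / math.sqrt(var_x * var_y)


def pair_count(n_probes):
    """Number of unique probe pairs, i.e. correlations to calculate."""
    return n_probes * (n_probes - 1) // 2


# Hypothetical toy data: probe name -> expression across four samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # rises with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],  # falls as geneA rises
}

# One correlation per unique pair of probes.
correlations = {
    (a, b): pearson(expression[a], expression[b])
    for a, b in combinations(sorted(expression), 2)
}

for pair, r in correlations.items():
    print(pair, round(r, 3))
print(pair_count(50_000), "pairs for 50,000 probes")
```

A pure-Python double loop like this would be far too slow at 50,000 probes, which is why the calculation is a natural fit for parallel hardware.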
5. Example (cont.): The program’s work-flow in detail
• Data quality control, normalisation and annotation
• Gene-to-gene Pearson correlation calculated for every probe set on the array
• Correlations file filtered on a user-defined threshold (0 to 1.0), i.e. weak correlations are excluded
• Edges drawn between nodes (genes) whose correlation is greater than the selected threshold
• 2D or 3D visualisation
• Clustering and visual exploration
• CPU or GPU parallelization used for all computationally intensive algorithms
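The filtering and edge-drawing steps can be sketched as follows. The correlation values are hypothetical, and grouping genes by connected components is only a simple stand-in for the clustering step; the slides do not say which clustering algorithm the program uses.

```python
from collections import defaultdict


def filter_edges(correlations, threshold):
    """Keep only gene pairs whose correlation exceeds the threshold."""
    return [(a, b, r) for (a, b), r in correlations.items() if r > threshold]


def connected_components(edges):
    """Group nodes that are linked, directly or indirectly, by an edge."""
    adjacency = defaultdict(set)
    for a, b, _r in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen = set()
    components = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first walk from the start node
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(sorted(component))
    return components


# Hypothetical gene-to-gene correlations (not real data).
correlations = {
    ("g1", "g2"): 0.95,
    ("g1", "g3"): 0.91,
    ("g2", "g3"): 0.40,
    ("g3", "g4"): 0.10,
    ("g4", "g5"): 0.88,
}

edges = filter_edges(correlations, threshold=0.85)
clusters = connected_components(edges)
print(edges)
print(clusters)
```

Raising the threshold removes edges and tends to split the graph into smaller, tighter groups; lowering it merges them.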
7. Advantages of Graph-based Analyses of Complex Data using
BioLayout Express3D
• Rapid calculation of networks from primary data
• Support for the visualization of large (10s of thousands of nodes, millions
of edges) network graphs
• Rendering of the networks in 3D space with real-time interactive
navigation
• Full range of tools for network visualization, inspection, querying and
analysis
• Rapid calculations as CPU and GPU are used for parallel calculations
• Can in principle be used to visualise data from all kinds of fields, and
can link to primary data manipulation programs such as Excel
8. Modelling and Visualization of Stochastic Flow through Large
Network Systems – e.g. biological pathways
• Standardized graphical notation system depicts the complex network of
relationships within e.g. biological pathways
• Previously no way of using these models as a basis for the computational
modeling of pathway function
• BioLayout can dynamically model the stochastic flow of ‘activity’ through large
networks/pathways
• Can represent this flow visually
Basically: Can model and animate how components of a complex network
influence each other over time & compare to real data to test the model.
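To make the idea of stochastic flow concrete, here is a generic token-passing sketch. It is not BioLayout's algorithm: the rule (each token moves to a random downstream node with a fixed probability per step), the `simulate_flow` function, its parameters and the three-node pathway are all assumptions chosen for illustration.

```python
import random


def simulate_flow(successors, tokens, steps, move_probability=0.5, seed=0):
    """Return the token count per node at every time step (step 0 included)."""
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    nodes = set(successors) | set(tokens)
    for targets in successors.values():
        nodes.update(targets)
    state = {node: tokens.get(node, 0) for node in sorted(nodes)}
    history = [dict(state)]
    for _step in range(steps):
        next_state = {node: 0 for node in state}
        for node, count in state.items():
            targets = successors.get(node, [])
            for _token in range(count):
                # A token either moves downstream or stays where it is.
                if targets and rng.random() < move_probability:
                    next_state[rng.choice(targets)] += 1
                else:
                    next_state[node] += 1
        state = next_state
        history.append(dict(state))
    return history


# Hypothetical linear pathway: receptor -> kinase -> transcription_factor.
pathway = {"receptor": ["kinase"], "kinase": ["transcription_factor"]}
history = simulate_flow(pathway, {"receptor": 100}, steps=20)

for step in (0, 5, 20):
    print(step, history[step])
```

The per-step token counts are the kind of quantity that could drive node size or colour in an animation, or be compared against measured data.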
9. Modelling and Visualization of Stochastic Flow through Large
Network Systems
1. Pathway models drawn in yEd graph editor, parameterized and saved as .graphml files
2. Models imported into BioLayout and used to calculate time-dependent stochastic flow through network
3. The results of flow simulations can be visualised as graphs (mouse-over function) or viewed as real-time animations where the size and colour of nodes are used to represent their activity
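As an illustration of the file format in step 1, the sketch below reads nodes and edges from a small GraphML document using Python's standard library. The pathway and the `read_graphml` helper are hypothetical, and only core GraphML elements are read; files saved by yEd also carry editor-specific layout and label data, which this sketch ignores.

```python
import xml.etree.ElementTree as ET

# Namespace used by core GraphML elements.
GRAPHML_NS = {"g": "http://graphml.graphdrawing.org/xmlns"}


def read_graphml(text):
    """Return (node ids, (source, target) edge pairs) from GraphML text."""
    root = ET.fromstring(text)
    nodes = [n.get("id") for n in root.findall(".//g:node", GRAPHML_NS)]
    edges = [
        (e.get("source"), e.get("target"))
        for e in root.findall(".//g:edge", GRAPHML_NS)
    ]
    return nodes, edges


# Hypothetical three-step pathway, written by hand for this example.
example = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="pathway" edgedefault="directed">
    <node id="receptor"/>
    <node id="kinase"/>
    <node id="tf"/>
    <edge source="receptor" target="kinase"/>
    <edge source="kinase" target="tf"/>
  </graph>
</graphml>"""

nodes, edges = read_graphml(example)
print(nodes)
print(edges)
```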
10. What we’re looking for
The code is open source for non-commercial use – we’d love for you to use it in your
research, be it in biology or anywhere else
•Where do you see this tool making an impact in a research setting?
Maybe you’re a programmer who’d like to get involved in:
•Adapting the software to new applications
•Integrating it with other tools
•Exploring the visualisation capabilities of the tool in new settings
We’re also looking to develop the technology commercially
•Can you think of any great market opportunities for BioLayout?
•Who should we be partnering with to develop the tool for this application? Who might
want to license it?
Either way, we’d love to hear from you!
11. BioLayout Express3D Team
The Roslin Institute
Tim Angus
Derek Wright
Tom Freeman
EMBL-EBI
Anton Enright
Stijn van Dongen
Thanks to the challenge sponsors