LICT Human-Machine-Interface
1. BIODATA ANALYSIS & VISUALIZATION
Jan Aerts
Faculty of Engineering - ESAT/SCD
http://saaientist.blogspot.com
@jandot
Tuesday 1 February 2011
2. Involved in genomics research:
•chicken, cow, human genome DNA sequencing
•search for genetic variation responsible for phenotype/disease
Issues with
•filtering: finding the correct set of parameters
•pattern searching: grasping the significance and effect of
the mutations
=> visual analytics
3. A. Filtering
Investigating parameter space...
4. [Figure: putative mutations pass through filter 1, filter 2 and filter 3; columns A, B and C show different settings for the filters]
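The idea of running the same putative mutations through the filters under several settings can be sketched as a small parameter sweep. This is only an illustration: the fields `depth`, `quality` and `allele_fraction`, the thresholds and the five example calls are all made-up assumptions, not the filters used in the talk.

```python
from itertools import product

# Hypothetical putative mutations; field names and values are invented for illustration.
CALLS = [
    {"id": "m1", "depth": 12, "quality": 35, "allele_fraction": 0.48},
    {"id": "m2", "depth": 4, "quality": 50, "allele_fraction": 0.90},
    {"id": "m3", "depth": 30, "quality": 18, "allele_fraction": 0.25},
    {"id": "m4", "depth": 22, "quality": 41, "allele_fraction": 0.10},
    {"id": "m5", "depth": 9, "quality": 28, "allele_fraction": 0.52},
]


def apply_filters(calls, min_depth, min_quality, min_fraction):
    """Return the ids of calls that pass all three (assumed) filters."""
    return [
        c["id"]
        for c in calls
        if c["depth"] >= min_depth
        and c["quality"] >= min_quality
        and c["allele_fraction"] >= min_fraction
    ]


def sweep(calls, depths, qualities, fractions):
    """Apply the filters for every combination of settings.

    Returns a dict mapping (min_depth, min_quality, min_fraction)
    to the list of surviving call ids.
    """
    return {
        (d, q, f): apply_filters(calls, d, q, f)
        for d, q, f in product(depths, qualities, fractions)
    }


if __name__ == "__main__":
    results = sweep(CALLS, depths=[5, 10, 20], qualities=[20, 30, 40], fractions=[0.2, 0.4])
    for setting, kept in sorted(results.items()):
        print(setting, len(kept), kept)
```

A table like this is what an interactive view would draw on: each row is one point in parameter space, and comparing neighbouring rows shows how a change in one filter setting changes what survives the others.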
16. Aim: use interactive visualization of the “raw” data
to:
•peep inside the black box
•get feel for the data
•get feel for how filter settings influence each other
17. Aim: use interactive visualization of the “raw” data
to:
•peep inside the black box
•get feel for the data
•get feel for how filter settings influence each other
[Overlay: Eradicate disease]
21. Typical example: gene networks
=> can we identify patterns?
[Figure label: same network]
22. How do these networks differ?
23. Hive Plots, taken from http://mkweb.bcgsc.ca/linnet/
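A hive plot places nodes on a few radial axes according to properties of the network rather than by a force-directed layout, so the same network always yields the same picture. A minimal sketch of that idea follows. The assignment rule used here (sources, sinks and everything else on three axes, with position along the axis given by degree) is one possible choice assumed for illustration, and `hive_layout` is a hypothetical helper, not code from the linnet site.

```python
import math
from collections import defaultdict


def hive_layout(edges):
    """Compute hive-plot style coordinates for a directed edge list.

    Assumed rule: axis 0 holds nodes with no incoming edges (sources),
    axis 1 holds nodes with no outgoing edges (sinks), axis 2 holds the rest.
    The distance from the centre is the node's total degree.
    Returns {node: (axis, radius, x, y)}.
    """
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    nodes = set()
    for source, target in edges:
        out_deg[source] += 1
        in_deg[target] += 1
        nodes.update((source, target))

    layout = {}
    for node in nodes:
        if in_deg[node] == 0:
            axis = 0
        elif out_deg[node] == 0:
            axis = 1
        else:
            axis = 2
        radius = in_deg[node] + out_deg[node]
        angle = 2 * math.pi * axis / 3  # three axes, 120 degrees apart
        layout[node] = (axis, radius, radius * math.cos(angle), radius * math.sin(angle))
    return layout


if __name__ == "__main__":
    example = [("a", "b"), ("b", "c"), ("a", "c")]
    for node, (axis, radius, x, y) in sorted(hive_layout(example).items()):
        print(node, axis, radius, round(x, 2), round(y, 2))
```

Because the coordinates depend only on the network's structure, two networks laid out this way can be compared directly, which is the point of the question on the previous slide.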
24. Aim: help researchers make sense of complicated
data:
• gene networks
• structural variation in the genome
• linked data
• ...
25. Aim: help researchers make sense of complicated
data:
• gene networks
• structural variation in the genome
• linked data
• ...
[Overlay: Eradicate disease]
26. Hurdles:
• big data (millions/billions of datapoints)
=> makes interactivity difficult
solution: indexing methods, data formats,
dimensionality reduction, ...
• visual encoding
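As a rough illustration of how indexing and data reduction help interactivity, the sketch below keeps genomic positions in a sorted list so that a range can be looked up by binary search, and then reduces whatever falls in the visible window to a fixed number of bin counts. The function names and the example positions are assumptions made for this sketch; real tools use more elaborate indexes and file formats.

```python
from bisect import bisect_left, bisect_right


def build_index(positions):
    """Sort positions once so later range lookups can use binary search."""
    return sorted(positions)


def range_query(index, start, end):
    """Return all positions p with start <= p <= end."""
    return index[bisect_left(index, start):bisect_right(index, end)]


def bin_counts(index, start, end, n_bins):
    """Summarise the window [start, end] as n_bins counts.

    Drawing n_bins bars instead of every datapoint keeps the amount of
    work per redraw bounded by the screen resolution, not the data size.
    """
    width = (end - start) / n_bins
    counts = [0] * n_bins
    for p in range_query(index, start, end):
        # Clamp so that p == end falls into the last bin.
        i = min(int((p - start) / width), n_bins - 1)
        counts[i] += 1
    return counts


if __name__ == "__main__":
    index = build_index([990, 5, 400, 130, 1000, 120, 950])
    print(range_query(index, 100, 500))
    print(bin_counts(index, 0, 1000, 4))
```

Zooming then amounts to calling `bin_counts` again with a narrower window, which touches only the positions inside it.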
30. So:
•visual analytics: visually identifying patterns in large
datasets to inform statistical analysis
•use visualization to make sense of complex data