10 Tips for Better Visualization of Scientific Data
1. 10 Tips for Better Visualization of Scientific Data
Sercan Taha Ahi (tahaahi@gmail.com)
Yamaguchi Laboratory @ Tokyo Institute of Technology
2012/7/26
2. Before starting
Each plot, each figure, and each drawing is there to communicate a
scientifically interesting idea to a scientific community.
They are not intended to be laundry lists of experimental outcomes.
Please do not forget the purpose, and do not forget the audience.
3. 1. Get rid of “empty” dimensions.
[Figure: four panels, "A pie chart" and "A bar graph" next to "A better pie chart" and "A better bar graph".
The pie slices are 45%, 36%, 9%, 5%, and 4%; the bar graphs show five bars (1 to 5).]
4. 2. Maximize data-ink ratio.
[Figure: "A bar graph" and "A better bar graph", each with a y-axis from 0 to 100 and four bars (1 to 4).]
An optimized data-ink ratio (1) is eco-friendly, (2) provides better visibility, even in greyscale prints, and
(3) communicates ideas more efficiently.
5. 3. Show the entire scale.
[Figure: "A line plot" with a y-axis from 90.5 to 93.5, and "A better line plot" with a y-axis from 10 to 100.
Both x-axes run from 100 to 500.]
Is this a significant drop?
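One way to see why the truncated axis misleads is to compute how much of the plot height a change occupies.
The sketch below uses the two y-axis ranges from the plots above (90.5 to 93.5, and 10 to 100). The accuracy
values 93.0 and 91.0 and the function name are hypothetical, chosen only for illustration.

```python
def visible_fraction(change, axis_min, axis_max):
    """Return the share of the axis height that a change of this size spans."""
    return abs(change) / (axis_max - axis_min)


# Hypothetical drop in accuracy, in percentage points.
drop = 93.0 - 91.0

truncated = visible_fraction(drop, 90.5, 93.5)  # y-axis range of "A line plot"
full = visible_fraction(drop, 10.0, 100.0)      # y-axis range of "A better line plot"

print(f"truncated axis: {truncated:.0%} of the plot height")
print(f"full axis:      {full:.0%} of the plot height")
```

Under these assumptions the same 2-point drop fills about two thirds of the truncated plot, but only about 2%
of the plot that shows the entire scale.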
6. 3. Show the entire scale.
[Figure: a group of line plots, one of which is marked "This one is better".]
7. 4. State the axis labels, units, and title.
[Figure: "A line plot", and "A better line plot" titled "Classification accuracy" with the y-axis label
"Accuracy (%)" (10 to 100) and the x-axis label "Number of training samples" (100 to 500).]
For arbitrary units, use (a.u.)
8. 5. Set the aspect ratio appropriately.
[Figure: "A 2D plot" and "A better 2D plot", both titled "2-dim representation of the data by PCA", showing
clusters A, B, and C with PC#1 on the x-axis (-4 to 6) and PC#2 on the y-axis.]
Although the ranges of the x and y coordinates of the data samples are unequal, the first figure has x and y
axes of equal length, which might mislead the viewer into believing that the distance between clusters A and B
is equal to the distance between clusters A and C.
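A minimal sketch of the idea: choose the size of the plot area so that one data unit has the same length on
both axes. The axis ranges are read off the first plot (PC#1 from -4 to 6, PC#2 from -1.5 to 1.5). The function
name and the 6-unit width are assumptions, and margins for labels are ignored.

```python
def figure_size(x_range, y_range, width=6.0):
    """Return (width, height) so that one data unit is equally long on both axes."""
    x_span = x_range[1] - x_range[0]
    y_span = y_range[1] - y_range[0]
    return width, width * y_span / x_span


# Axis ranges read off the first plot: PC#1 from -4 to 6, PC#2 from -1.5 to 1.5.
width, height = figure_size((-4.0, 6.0), (-1.5, 1.5))
print(f"{width:.1f} x {height:.1f}")
```

Many plotting libraries can enforce this directly. In Matplotlib, for example, `ax.set_aspect("equal")` should
have the same effect on an existing axes.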
9. 6. Indicate and label uncertainty.
[Figure: two plots of "Normalized absorption coefficients (a.u.)" versus "Wavelength (nm)" from 400 to 700 nm,
with the legend entries "Mean plot" and "One data sample" and the note "(Given confidence intervals are for
1 std.)".]
There is uncertainty in every experiment and in every measurement. Depending on the deviation in the data,
the conclusions that you draw might be drastically different. Therefore, you should always show the
uncertainty in the data, and state what the interval represents.
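As a sketch of what goes into such a plot, the code below computes the mean and a band of 1 standard deviation
at each wavelength. The measurement values and the function name are hypothetical. The sample standard
deviation is an assumption, since the slides only say "1 std.".

```python
from statistics import mean, stdev

# Hypothetical repeated measurements (a.u.) at three wavelengths (nm).
measurements = {
    450: [0.21, 0.25, 0.23, 0.19],
    550: [0.48, 0.52, 0.41, 0.55],
    650: [0.30, 0.31, 0.29, 0.30],
}


def mean_with_band(samples, n_std=1.0):
    """Return (mean, lower, upper) for a band of n_std sample standard deviations."""
    m = mean(samples)
    s = stdev(samples)  # sample standard deviation (n - 1 in the denominator)
    return m, m - n_std * s, m + n_std * s


for wavelength, samples in measurements.items():
    m, lower, upper = mean_with_band(samples)
    print(f"{wavelength} nm: mean {m:.3f}, 1 std band [{lower:.3f}, {upper:.3f}]")
```

The lower and upper values are what an error bar or a shaded band would show. Whatever the band represents
(1 std., a standard error, a confidence interval), the label on the plot should say so.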
10. 7. Do not use bitmap graphics; prefer EPS or PDF when possible.
[Figure: "A line plot" and "A better line plot", titled "Classification accuracy", with the y-axis
"Accuracy (%)" (10 to 100) and the x-axis "Number of training samples" (100 to 500).]
Bitmap graphics do not scale well, and graphics that do not scale well make your study look amateurish.
Use vector graphics instead. If you have no access to proprietary tools, then please create high-resolution
bitmap images, or better, consider using free programming languages such as R and Python for plots, and
free graphics tools such as Inkscape and GIMP for drawings.
11. 8. Set the precision of the real numbers appropriately.
[Figure: "A line plot" whose y-axis ticks read 0, 0.323523, 0.647047, 0.970570, 1.294094, 1.617617, and
"A better line plot" whose y-axis ticks read 0.0, 0.2, 0.4, ..., 1.8.]
Ask yourself: What is the minimum precision (number of decimal places) needed to convey my idea?
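One possible way to answer that question for tick labels is sketched below: try 0, 1, 2, ... decimal places and
stop at the first precision that reproduces every tick value. The function name and the tolerance are
assumptions made for this example.

```python
def min_decimals(ticks, max_decimals=6):
    """Return the fewest decimal places that reproduce every tick value."""
    span = max(ticks) - min(ticks)
    tolerance = 1e-6 * span if span else 1e-12
    for decimals in range(max_decimals + 1):
        labels = [f"{t:.{decimals}f}" for t in ticks]
        if all(abs(float(label) - t) <= tolerance for label, t in zip(labels, ticks)):
            return decimals
    return max_decimals


# Tick values of "A better line plot": 0.0, 0.2, ..., 1.8.
better_ticks = [i * 0.2 for i in range(10)]
decimals = min_decimals(better_ticks)
print([f"{t:.{decimals}f}" for t in better_ticks])
```

With round tick positions one decimal place is enough. The ticks of the first plot (0.323523, 0.647047, ...)
need six, which suggests that the tick positions should be fixed first and the label format second.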
12. 9. Choose colors carefully.
[Figure: "A bar graph" and "A better bar graph" comparing "Our method" with "Their method1" to
"Their method4", each with a y-axis from 0 to 1 and four groups of bars (1 to 4).]
When you want to compare your method with a number of well-established approaches on a graph, pick an
easily distinguishable color for your results. Do not make the listeners or readers search for it.
13. 9. Choose colors carefully.
[Figure: "A scatter plot" and "A better scatter plot", each with x and y axes from 0 to 1.]
Be gentle, and do not forget color-blind viewers.
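A rough check, not a simulation of color blindness: colors that differ clearly in lightness tend to stay
distinguishable for more viewers and in greyscale prints. The sketch below compares two color pairs using the
sRGB relative-luminance and contrast-ratio formulas defined in WCAG. The palette and the choice of this metric
are assumptions made for illustration and are not taken from the slides.

```python
def relative_luminance(rgb):
    """Relative luminance of an sRGB color given as (r, g, b) in 0..255."""
    def linear(channel):
        c = channel / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a, color_b):
    """Contrast ratio of two colors: 1 for equal luminance, 21 for black on white."""
    la, lb = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


# Hypothetical palette: pure red and pure green versus dark blue and orange.
red, green = (255, 0, 0), (0, 255, 0)
dark_blue, orange = (0, 0, 139), (255, 165, 0)

print(f"red vs green:        {contrast_ratio(red, green):.2f}")
print(f"dark blue vs orange: {contrast_ratio(dark_blue, orange):.2f}")
```

In this example the dark blue and orange pair differs much more in lightness than the red and green pair.
A dedicated color-blindness simulator remains the more reliable test.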
14. 10. Put your data into a context.
A plot that depicts the total snowfall in Boston for the winter of 2010-2011
http://www.boston.com/news/weather/graphics/2011_snowfall/
Inches?? Can you quickly imagine how high 80.1 inches is?
15. 10. Put your data into a context.
A better plot that depicts the total snowfall in Boston for the winter of 2010-2011
http://www.boston.com/news/weather/graphics/2011_snowfall/
Now you can, right?
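The same idea can be expressed in numbers: convert the value to a familiar unit and relate it to something the
audience already knows. In the sketch below only the 80.1 inches comes from the plot. The reference heights are
hypothetical and chosen only for illustration.

```python
CM_PER_INCH = 2.54  # exact by definition

snowfall_in = 80.1  # total snowfall quoted for the winter of 2010-2011
snowfall_cm = snowfall_in * CM_PER_INCH

# Hypothetical reference heights in cm, chosen only for illustration.
references = {"a 170 cm adult": 170.0, "a 200 cm doorway": 200.0}

print(f"{snowfall_in} in = {snowfall_cm:.1f} cm")
for name, height_cm in references.items():
    print(f"  {snowfall_cm / height_cm:.2f} times the height of {name}")
```

A drawing of those reference objects next to the bar does the same job visually.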
16. THANK YOU.
I would also like to thank Dr. Mehmet Cagatay Tarhan for his valuable comments.