These slides accompany a lecture I gave in the "Green Lab" course of the Computer Science master's programme (Software Engineering and Green IT track) at the Vrije Universiteit Amsterdam: http://masters.vu.nl/en/programmes/computer-science-software-engineering-green-it/index.aspx
http://www.procaccianti.me
1. It starts with an idea
Data Visualization in R
Giuseppe Procaccianti
2. Vrije Universiteit Amsterdam
2 Giuseppe Procaccianti / S2 group / The Green Lab
Data Visualization in R
● Goals for this lab session:
○ learn the basics of the ggplot2 package
■ understand the basic concepts behind ggplot2
■ perform a simple tutorial
A ggplot2 graph
● The ggplot2 package is based on the grammar of graphics (gg)
concept [1]
● Complex statistical graphs can be built by composing
basic elements and changing their properties
[1] Wilkinson, Leland. The grammar of graphics. Springer Science & Business Media, 2006.
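As a minimal sketch of this compositional idea, the snippet below builds a scatter plot step by step from R's built-in mtcars dataset. The variables (wt, mpg) and the object names are illustrative assumptions, not part of the lecture.

```r
library(ggplot2)

# Start from an empty plot: data plus aesthetic mappings, nothing drawn yet
base <- ggplot(mtcars, aes(x = wt, y = mpg))

# Compose elementary elements: points first, then axis labels
scatter <- base + geom_point()
labelled <- scatter + labs(x = "Weight (1000 lbs)", y = "Miles per gallon")

# Change a property of an element instead of adding a new one
big_points <- base + geom_point(size = 3, colour = "steelblue")

# Each `+` returns a new plot object; the earlier objects are unchanged
length(base$layers)     # number of layers in the empty plot
length(scatter$layers)  # number of layers after adding the points
```

Typing the name of any of these objects at the R console (for example labelled) draws the corresponding graph.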
Example: a pie chart
pie <- ggplot(mtcars, aes(x = factor(1), fill = factor(cyl))) +
geom_bar(width = 1, position = "fill", color = "black")
pie + coord_polar(theta = "y")
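The pie chart is a single stacked bar drawn in polar coordinates. The sketch below, using only the built-in mtcars data, computes the share each slice represents and keeps the bar and the pie as separate objects. The names shares, bar and pie_chart are illustrative.

```r
library(ggplot2)

# Share of cars per cylinder count: these are the slice sizes
shares <- prop.table(table(mtcars$cyl))
round(shares, 3)

# A single stacked bar whose segments fill the full height (0 to 1)
bar <- ggplot(mtcars, aes(x = factor(1), fill = factor(cyl))) +
  geom_bar(width = 1, position = "fill", color = "black")

# Mapping y onto the angle turns the stacked bar into a pie
pie_chart <- bar + coord_polar(theta = "y")
```

Drawing bar and pie_chart one after the other shows that only the coordinate system differs between them.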
The qplot() function
● The qplot() function lets you create quick graphs
● Simple one-line syntax, similar to plot(), but not very powerful
qplot(CPUusr, Watts, data=run)
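For comparison, here is a sketch of the same scatter plot written with ggplot(). The lab's run data frame is not included in these slides, so the code generates a synthetic stand-in with the same column names; the values and the linear relation between the columns are assumptions made purely for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for the lab's `run` data frame (synthetic values)
set.seed(42)
run <- data.frame(CPUusr = runif(50, min = 0, max = 100))
run$Watts <- 60 + 0.4 * run$CPUusr + rnorm(50, sd = 2)

# Same scatter plot as qplot(CPUusr, Watts, data = run), written with ggplot():
# the first two qplot() arguments become the x and y aesthetics
quick <- ggplot(run, aes(x = CPUusr, y = Watts)) +
  geom_point()
```

Typing quick at the console draws the plot. The ggplot() form is longer, but further layers can be added to it with +.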
The ggplot() function
● The ggplot() function is the main function of ggplot2
● It builds graphs by adding successive layers
● Three main layers: aesthetics, geometries (geoms), statistics (stats)
● The aesthetics layer maps the variables you want to show to
graphical properties (position, color, size)
● The geometries layer contains the actual graphical elements (geoms)
a_graph <- ggplot(data = run,
aes(x = Watts, y = CPUusr))
a_graph <- a_graph + geom_point()
● The statistics layer adds summaries computed from the data, such as a smoothed trend line
a_graph <- a_graph + stat_smooth()
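Put together, the three layers can be sketched as one runnable script. The run data frame below is a synthetic stand-in (the real measurements are not part of these slides), and method = "lm" is an assumption chosen here to make the smoother explicit; the slide itself uses the default.

```r
library(ggplot2)

# Hypothetical stand-in for the lab's `run` data frame (synthetic values)
set.seed(1)
run <- data.frame(Watts = runif(80, min = 60, max = 110))
run$CPUusr <- pmin(100, pmax(0, 2 * (run$Watts - 60) + rnorm(80, sd = 5)))

# Aesthetics: map variables to the x and y positions
a_graph <- ggplot(data = run, aes(x = Watts, y = CPUusr))

# Geometries: draw one point per observation
a_graph <- a_graph + geom_point()

# Statistics: add a fitted trend line with a confidence band
# (method = "lm" is an assumption; the slide uses the default smoother)
a_graph <- a_graph + stat_smooth(method = "lm")

# Inspect which geom each layer uses
layer_geoms <- vapply(a_graph$layers,
                      function(l) class(l$geom)[1],
                      character(1))
layer_geoms
```

Note that stat_smooth() also brings its own geom: the statistic computes the fit, and a smooth geom draws it.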
Example: the violin plot
● From a boxplot...
# The + must end the line: R treats a line that starts with + as a new statement
a_graph <- ggplot(data = run, aes(x = Watts, y = CPUusr)) +
  geom_boxplot()
● ...to a violin plot
# Same plot as the boxplot, with the geom swapped
a_graph <- ggplot(data = run, aes(x = Watts, y = CPUusr)) +
  geom_violin()
Why violin plots are great
● More informative than boxplots: they show the full data distribution
● You can draw boxplots on them!
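A minimal sketch of a boxplot drawn on top of a violin plot follows. It assumes a hypothetical grouping column, workload, which is not in the slides, and uses synthetic power readings; only geom_violin() and geom_boxplot() come from the lecture.

```r
library(ggplot2)

# Synthetic data: power readings for three hypothetical workload levels
set.seed(7)
run <- data.frame(
  workload = factor(rep(c("idle", "medium", "high"), each = 40),
                    levels = c("idle", "medium", "high")),
  Watts = c(rnorm(40, mean = 62, sd = 2),
            rnorm(40, mean = 80, sd = 5),
            rnorm(40, mean = 100, sd = 4))
)

# Violin first, then a narrow boxplot layered on top of it
violin <- ggplot(run, aes(x = workload, y = Watts)) +
  geom_violin(fill = "grey85") +
  geom_boxplot(width = 0.1, outlier.shape = NA)  # NA hides outlier points

# The layer order is the drawing order
violin_geoms <- vapply(violin$layers,
                       function(l) class(l$geom)[1],
                       character(1))
violin_geoms
```

The narrow width keeps the box inside the violin, so the median and quartiles stay readable against the density outline.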
Tutorial: qplot() + ggplot()
● Perform a basic exercise using both qplot() and ggplot()...
● ...using real experimental data (similar to yours)
Further links and references
● Official ggplot2 website
● qplot() tutorial - Edwin Chen
● ggplot2 tutorial - Josef Fruehwald
● Wilkinson, Leland. The grammar of graphics. Springer Science & Business Media, 2006.
Thank you!
g.procaccianti@vu.nl
i.malavolta@vu.nl