[Unclear] words are denoted in brackets
Webinar: Data Visualisation – Design and Principles
22 March 2018
Video & slides available from ANDS website
START OF TRANSCRIPT
Gerry Ryder: So good afternoon, everyone, and welcome to the webinar today. My name
is Gerry Ryder and it's my pleasure today to host this webinar about data
visualisation. It's my pleasure to introduce Martin Schweitzer. Martin's
currently working with ANDS as a data technologist. He has a background in
computer science and a particular interest in data visualisation, data science
and user interface design. He has a varied professional background which
includes photography, working on large IT systems, lecturing, as well as
running workshops and training courses.
Martin is currently seconded to ANDS from the Bureau of Meteorology
where he's largely responsible for the climate record of Australia. Today
Martin is presenting for us the first in a series of two webinars focused on
data visualisation. This first webinar will focus on visualisation design and
principles while the second will focus on tools and techniques. So having
covered off on those introductions it's my pleasure now to hand over to
Martin for our presentation today. Thank you, Martin.
Martin Schweitzer: Thanks very much, Gerry, and hello, everybody. I'll just jump straight in. So
when asked to present a series on visualisation the first question, I guess,
that everybody will have asked is what is a visualisation. I wanted it to be
slightly broader than just presenting graphic data so my definition of a
visualisation is that it's a visual explanation. It's anything that helps us
understand something by looking at it. A typical example is something that
should be familiar to most people, a map of the underground. One of the
things that makes this a good visualisation is that it shows the relationships
between the different objects in it and, in this case, helps people
understand how to get from one point to another.
If you're trying to imagine looking at a text description of how to get, for
example, from Edgware Road to Blackfriars, it would be particularly
complex, particularly if, for example, somebody told you that Tottenham
Court Road is closed. One of the things that made this visualisation famous
was that the designer discovered that when you're underground it's really
only the relationships that matter. The actual exact geographical
location is a lot less interesting, and that can be seen in this visualisation. So
we'll just have a look at this. This shows the actual place on the map and
then it morphs to what it looks like on the underground map.
It just cycles through what the locations really look like and the underground
map. So once again a beautiful visualisation of how the underground map
actually maps to the real locations in London. Yet another example, often
for people who may be a bit 3D challenged would be familiar - many people
would be familiar with this IKEA visualisation that shows us the correct way
to construct a bookcase. Why are visualisations important? Why don't we
just have text description? We have a lot of descriptive statistics. Well, one
of my favourite examples and something that really made my hair stand on
end the first time I saw it was this thing called Anscombe's quartet.
Many people may be familiar with this. It's a famous example. What we
have here are four data sets: one, two, three and four in Roman numerals.
Each one is a series of X and Y values. Just looking at them it's very hard to
read much into them, but we can look at their summary statistics and - sorry
- for example, they all have the same average value for the X. They all have
the same average value for the Y. The sample variance of both the X and Y is
the same in all four of them. The correlation between the X and Y is almost
identical in all four of them. The linear regression is exactly the same. So a
statistician may be tempted to just say, well, these numbers are pretty much
the same.
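The summary statistics described above can be checked directly. The sketch below uses the published Anscombe (1973) values and only the Python standard library; the function name `summary` is just illustrative. Run as written, it should show the four data sets agreeing to about two decimal places on every statistic (so "exactly the same" holds only after rounding), which is why the numbers alone cannot tell them apart.

```python
# Anscombe's quartet (Anscombe, 1973). Data sets I-III share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summary(xs, ys):
    """Return means, sample variances, correlation and least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "var_x": sxx / (n - 1),
        "var_y": syy / (n - 1),
        "corr": sxy / (sxx * syy) ** 0.5,
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


for name, (xs, ys) in quartet.items():
    stats = summary(xs, ys)
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

Plotting each pair of lists as a scatter plot is what reveals the four very different shapes.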
However, as soon as we look at the visualisation - in other words see the
values plotted - we see something quite different. So just one example of
how seeing a visualisation is very different to looking at the raw data.
Another example that I've taken is - we'll just go to some text and we will
have a look at a file. This is the contents of a file. As you can see, it's
probably not easy to interpret what's in the file. Most files when they're
stored on a disc are just bunches of numbers. If I told you that these
numbers represent RGB values arranged according to an XY grid, once again
it may not be obvious what the numbers represent.
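The idea of RGB values arranged on an XY grid can be sketched with the plain-text PPM image format, which is little more than a header followed by the numbers themselves. The pixel values below are made up for illustration and are not the file shown in the webinar; `to_ppm` is a hypothetical helper name.

```python
def to_ppm(rows):
    """Render a grid of (r, g, b) tuples, values 0-255, as plain-text PPM (P3)."""
    height = len(rows)
    width = len(rows[0])
    lines = ["P3", f"{width} {height}", "255"]  # magic number, size, max value
    for row in rows:
        lines.append(" ".join(f"{r} {g} {b}" for r, g, b in row))
    return "\n".join(lines) + "\n"


# Hypothetical 2 x 2 image: red and green on top, blue and white below.
pixels = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]
ppm_text = to_ppm(pixels)
print(ppm_text)
```

Saving `ppm_text` to a file with a `.ppm` extension gives something many image viewers can open, so the same numbers can be read either as text or as a picture.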
However, if I do this and present them as an image - excuse me - suddenly
we see, okay, we have an image. So as numbers the numbers meant - or as
data the numbers meant absolutely nothing. However, as soon as we
visualise it as an image it will make sense. So as Gerry mentioned, I've been
interested in visualisation for a very long time. In fact, over 20 years. One
of the first books I came across was by Edward Tufte and was one of the
seminal works. At that time I think it wasn't really realised that it would
become a seminal work. He wrote a book called The Visual Display of
Quantitative Information.
In it he says, excellence in statistical graphics consists of complex ideas
communicated with clarity, precision and efficiency. For the rest of the
presentation I'm going to try and expand some of these ideas. So he came
up with a few principles. The first one is graphical displays should show the
data. We'll go through these principles first and then we'll look at examples.
It should induce the viewer to think about the substance rather than about
methodology, avoid distorting what the data has to say, present many
numbers in a small space, make large data sets coherent, and reveal the
data at several levels of detail from a broad overview to the fine structure.
In this book he's got many fine examples; however, I've tried to find more
modern examples and I've taken some of the examples from the work that I
do. So the first one, show the data. What we're looking at here is a rainfall
map of Australia and the government instituted a plan where they said they
would give farmers concessionary loans if they were in a region that had
suffered a one in 10 year rainfall deficiency or one in 20 year rainfall
deficiency. So the map we're seeing here is a map where users can typically
zoom in and out, but what we've done is to show only those areas where -
that are affected or covered by this concessionary loan.
So I guess one of the things is we could have shown a typical rainfall map,
but ideally make this simple as possible and show only the data, so the pink
and red areas are the areas that had been affected by either a one in 10 or
one in 20 year rainfall deficiency. Next, induce the viewer to think about the
substance rather than about the methodology. So what we're looking at
here is in Kyoto, Japan cherry blossoms are a big thing. In Kyoto they've
been recording the peak of the cherry blossom season since the year 800.
So they have over 1000 years of data. What somebody's done is to plot all
this data.
What we see is that for centuries they pretty much peak between 10
and 20 April. However, since the early twentieth century they start
blossoming earlier and earlier and a lot of people would say, well, this is a
signal of climate change. However, what we wanted to show about this
graph is that the person has plotted the actual data points using a little
image of cherry blossoms which is quite cute. But they also noted in an
article they wrote about it that initially they had plotted it with a cherry
blossom with six petals until somebody pointed out that cherry blossoms
only have five petals.
The point about that is if people are thinking about how many petals the
cherry blossoms have rather than about what the graph is saying maybe
they should have thought more about the substance than the methodology.
But nonetheless, I think with any of these rules often it's a good thing to
break a rule now and again because in this case, for example, I certainly
remembered this graph long after I'd seen it because I remembered the
issue with the cherry blossoms. The next one was avoid distorting the data
and here we're going to do something exciting and that's do it live. So what
I've done is we're now seeing what's known as Jupyter Notebook.
I imagine a lot of people would be familiar with Jupyter Notebook. Jupyter
Notebook allows us to run Python code and in the next webinar the whole
webinar will be based around looking at our work in Jupyter Notebook;
however, this is a small demo that I've got in this presentation. What we're
looking at here is storage levels in the dams that are around Melbourne. So
the first graph I'll pull up I'll just - so this is fantastic at work. What we see in
this graph is it looks like the Thomson, Cardinia and Upper Yarra dams are
really low and all the rest of them are almost full. So we may worry a bit
about that.
However, when we look at this graph we see that we started - the base of it
was 60 per cent full. So Cardinia, for example, is - well, let's take Thomson.
It's actually almost 65 per cent full so it's really not that bad. When we look
at the graph plotted against - starting at zero we note as well it doesn't look
that bad. We may also look at this and say, well, the other dams are all over
80 per cent so we've got nothing to worry about. However, not all these
dams are the same size, so looking at only the percentage can be a bit
misleading. So let's run this one.
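The effect of starting the axis at 60 per cent can be put into numbers. This is a minimal sketch, not the notebook code used in the webinar: it takes the rounded figures mentioned in the talk (Thomson at about 65 per cent, the other dams at about 80 per cent) and compares how tall the bars look under each baseline. The function names are illustrative.

```python
def drawn_height(value, baseline):
    """Height of a bar on screen when the axis starts at `baseline`."""
    return value - baseline


def visual_ratio(a, b, baseline):
    """How many times taller bar `a` looks than bar `b`."""
    return drawn_height(a, baseline) / drawn_height(b, baseline)


thomson = 65.0  # per cent full, rounded figure from the talk
other = 80.0    # per cent full, rounded figure from the talk

truncated = visual_ratio(other, thomson, baseline=60.0)
from_zero = visual_ratio(other, thomson, baseline=0.0)
print(f"axis from 60%: other dam looks {truncated:.1f}x as tall as Thomson")
print(f"axis from 0%:  other dam looks {from_zero:.2f}x as tall as Thomson")
```

With the axis starting at 60, a dam that holds roughly a quarter more (as a percentage) is drawn several times taller, which is the distortion being described.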
What we see here is that the amount of space in the Thomson Dam, there's
probably not enough water in all of these smaller dams to even fill that gap
that's in the Thomson Dam. So that's what we mean when we say avoid
distorting the data. Try and make sure that we're telling a story with
integrity. The next principle was to present many numbers in a small space.
The map that we're looking at here is Australian rainfall deciles. So this is
that - the areas that are in this bright red have received the least rainfall this
December, they're in the lowest one per cent of December rainfalls.
These tiny dark blue patches are in the highest one per cent of rainfall that -
this record goes back to 1910 so they take every year from 1910. We say
present many numbers in a small space. So what we're looking at here is a
grid, and they're roughly 640 by 800 grid cells. So each one is calculated and
for each one there's 117 years of data. So what we're looking at is almost 36
million data points; however, we've condensed those 36 million data points
into one, well, simple map. So I think this is a fantastic example of
presenting many numbers in a small space. Sometimes, as I said, we want
to break the rules and get something where we break the rules.
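The count of data points behind the map is a simple product of grid columns, grid rows and years. A quick check of the figures as transcribed gives a larger total than the 36 million quoted; a 640 by 480 grid would give almost exactly 36 million, but which figure was intended is not clear from the transcript, so both are shown here as an assumption-labelled comparison.

```python
def data_points(columns, rows, years):
    """Number of values behind a gridded map with one value per cell per year."""
    return columns * rows * years


as_transcribed = data_points(640, 800, 117)
# Assumption: a 640 x 480 grid, which is NOT stated in the talk, would match
# the "almost 36 million" figure.
alternative = data_points(640, 480, 117)

print(f"640 x 800 x 117 = {as_transcribed:,}")
print(f"640 x 480 x 117 = {alternative:,}")
```

Either way, the point stands: tens of millions of values are condensed into a single map.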
This was the recent tropical cyclone. We've got a visualisation that shows
the current position of the cyclone. This arguably is just one data point;
however, it's a really important data point, particularly if you're living in the
north of Western Australia and you want to know how close the cyclone is
or whether it's got a chance. Also we can - by clicking on that one point we
see a far more detailed image which then takes us into seeing the data at
different levels. The next one was around making large data sets coherent.
This is something that at the bureau we're very interested in. How do you
communicate things like probability?
When people hear almost certainly do they think that an event is more
probable or less probable than if they hear highly likely or if they hear very
good chance? So what they've done here is taken all these terms and
presented them using a technique known as KDE on one graph. So we can
very easily compare that, for example, if somebody says, chances are slight,
that people think that there's actually slightly more chance of an event
happening than if we, for example, say, it's highly unlikely, or if we say,
there's almost no chance. So that covers off on Tufte.
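KDE stands for kernel density estimation: each survey answer is replaced by a small bell curve, and the curves are added up to give a smooth distribution for each phrase. The sketch below is a minimal Gaussian version using only the standard library. The survey answers and the bandwidth of 3 are invented for illustration and are not the data behind the graph shown.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function giving the estimated density at any point x."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Hypothetical answers: the probability (per cent) people assign to each phrase.
responses = {
    "almost no chance": [1, 2, 2, 3, 5, 5],
    "highly unlikely": [3, 5, 5, 8, 10, 10],
    "chances are slight": [5, 10, 10, 15, 15, 20],
}

for phrase, answers in responses.items():
    density = gaussian_kde(answers, bandwidth=3.0)
    peak = max(range(0, 101), key=density)
    print(f"{phrase}: density peaks near {peak}%")
```

Drawing each `density` curve on a shared axis is what makes the phrases directly comparable.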
The next few slides are some of my ideas and some of my experience in
developing visualisations and some things that I feel are important. One of
the most important things in any visualisation is that you actually have
something interesting to say about the data. Whenever I see somebody
saying, we've got this data, it looks pretty boring. Can we just create a
visualisation, well, that's when the hairs on my neck prickle a bit. So this is a
famous video. It started off as a TED Talk by the Swede Hans Rosling.
[Video playing]
Martin Schweitzer: Okay. I think people get the idea. Now, one of the things that strikes me
about that video is that Rosling was talking about inequality, et cetera, when
he gave this TED Talk. At a similar time, Thomas Piketty, who was famous for
his book on capitalism, also gave a TED Talk. I watched both talks. Both were
impressive. I thought Piketty's was the more impressive. However,
Rosling's - the one you've just seen - got 10 times as many views roughly as
Piketty's, and I think the real reason it got so many views was because it had
such a story here. It had such remarkable visualisation and graphics.
So it certainly says that it's important. Obviously Rosling is a very - or was a
very impressive storyteller and was just a very impressive presenter and so
did it really well. Of course, not all of us have his talents; however, we can
all do good or great visualisations. So here's a simpler graphic and this one
shows the trend in maximum temperatures from 1970 to 2016. So
wherever the graph is red the average maximum temperature has been
increasing and wherever the graph is blue the maximum temperature has
been decreasing over the years. I think this one tells quite an alarming
story.
Here's another visualisation and this one I've got three slides which show a
progression of how we're trying to convey something. So in the first slide
the person has just taken the data and they've put it - this is rainfall data.
They've started at 1900 and shown how much rainfall fell up to the year 2010.
Now, there are two large influences on rainfall. One is ENSO, which is -
often we hear of that as a La Niña system or an El Niño system. The other one
is what is marked as IOD, which is the Indian Ocean Dipole. Once again, these
can be either positive or negative.
So we've got two, four, six, seven different colours in the graph showing that
when this rainfall fell what kind of system we were in. However, this doesn't
really tell a good story. If we look at it having been rearranged we see that
the blue lines on the right when - all the years where we had a lot of rainfall
all tended to be where we had a La Niña and a negative Indian Ocean
Dipole. The red and brown on the left were during generally El Niño years.
However, we can improve this as well because we've got seven different
things. We have to keep looking at the colours, move forwards and
backwards.
So here's a graph where what we've done is we've plotted the IOD along the
bottom going from negative to positive. We've plotted the ENSO along the
left-hand side. So these numbers in the top right we can see had a strong
ENSO signal, strong La Niña, and a positive IOD, while these numbers to the
left had a - sorry. These are the La Niña and the negative IOD. We can see
as it gets stronger how it affects the rainfall. Here's another graph which
also tells quite an alarming story. This is the water supply in Cape Town and
in 2013/14 we can see they typically get their rainfall in winter.
So around about - from October onwards the dam levels start falling.
Because for about the last five years they haven't been - there hasn't been
good rain, they've continually been falling each year progressively. That's
2013, 2014, 2015, 2016, up till this year which is 2017/18. We see when I
pulled up this graph it was between January and February and we were over
there and they were projecting that around April/May/June Cape Town
could run out of water. There were a few projections. One is if people use
600 megalitres a day of water, one with 500. One is if they were using 600
megalitres and they've started up desal plants so what would happen.
All of them show pretty dire consequences. A visualisation like this really
does tell a story. So the next principle is keep your graph as simple as
possible. I've made a very quick 3D graph. I've just made a fictitious one,
which is how many morning teas people attended and maybe the person
that attends the most morning teas at the end of the year gets a prize and
the person who has attended the least gets a wooden spoon. So this was
my first graph and I felt, well, this can always be improved. Whenever I see
a 3D graph if it's not displaying 3D data I'm a little bit disturbed. So I
modified it so we've now got a 2D graph.
However, the numbers are in the [box]. We probably don't need those grids
and as many of them. We certainly don't need our dotted and solid line
grids, so I cleaned that up a bit. So there's a simpler graph. However, when
looking at that graph - and often I see graphs like this - the first question I
ask is, what do those colours mean? Why are there different colours? Well,
in this case the colours mean absolutely nothing, so I've got rid of the
colours. The next thing is getting back to this idea maybe of telling a story.
What am I trying to say? Well, really what I'm trying to do is find out who
attended the most and least morning teas.
So maybe by improving the graph, well, I've now put the least - I've ordered
them from least to most and now it's quite obvious who's attended the least
and who's attended the most. So is there anything else we can do to make
this presentation simpler or to remove any unnecessary data et cetera? This
is a trick question, but of course there is. Well, in this particular case I think
we can just remove the graph altogether. I don't think that that
visualisation has given us any more information than simply looking at a
table of numbers. The table remains ordered. I get exactly that same
information.
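The last two steps, ordering from least to most and then replacing the chart with a table, take only a few lines. The names and counts below are hypothetical (the example in the talk is itself fictitious), and `ordered_table` is an illustrative helper.

```python
# Hypothetical attendance counts, not the figures from the slides.
attendance = {"Alex": 7, "Bo": 12, "Casey": 3, "Dee": 9}


def ordered_table(counts):
    """Format name/count pairs as a plain-text table, least to most."""
    rows = sorted(counts.items(), key=lambda item: item[1])
    width = max(len(name) for name in counts)
    return "\n".join(f"{name:<{width}}  {count:>3}" for name, count in rows)


table = ordered_table(attendance)
print(table)
```

The first row is the wooden spoon and the last row is the prize winner, which is all the bar chart was conveying.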
So it's probably important to ask that question occasionally. Do we really
need a graph for this data, or do we really need a visualisation for this data?
I think Antoine de Saint-Exupéry said it best when he said, perfection is
achieved not when there's nothing more to add but when there's nothing
left to take away. However - this was a however - Einstein was apparently
famous for saying, make it as simple as possible but no simpler. So here's
another example of a visualisation. This is called a skew-T log-P graph, and
this is used by meteorologists every single day. Temperature is on these
diagonals. The pressure is going along this way.
The reason it's called log-P is because at the bottom you see the gap
between 100 - 900 and 1000 is much smaller than the gap between 200 and
300. So even the scale appears to be changing. There are two different
colour lines. Each of those lines has a meaning. The red line is what was
recorded today and the blue yesterday. In case - well, I imagine most
people aren't familiar with these graphs. So what this is actually plotting is
at a lot of locations around the world they send up weather balloons or
sondes. So this is plotting the temperature as the balloon is moving up
through the atmosphere. So we can see that it's getting cooler, et cetera.
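Why the pressure gaps shrink towards the bottom of a log-P axis can be shown with a short calculation: on a logarithmic axis the distance between two levels depends on their ratio, not their difference. The sketch assumes the pressure values are in hectopascals, which the talk does not state; the units do not affect the comparison.

```python
import math


def log_p_gap(p_lower, p_upper):
    """Distance between two pressure levels on a log10 pressure axis."""
    return math.log10(p_upper) - math.log10(p_lower)


near_surface = log_p_gap(900, 1000)  # assumed hPa
aloft = log_p_gap(200, 300)          # assumed hPa

print(f"900 to 1000: {near_surface:.3f} axis units")
print(f"200 to 300:  {aloft:.3f} axis units")
print(f"the upper gap is drawn about {aloft / near_surface:.1f} times as large")
```

Both intervals span 100 units of pressure, but 300 is 1.5 times 200 while 1000 is only about 1.1 times 900, so the upper interval takes up much more of the axis.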
The second line is the dew point. So we can see, for example, if the dew
point crosses the temperature we're going to get precipitation or rainfall
and so on. On the right-hand side we've got another particularly interesting
thing being visualised here and these are called wind barbs. The direction of
the wind barb shows the direction of the wind. So these ones pointing
upwards show northerly wind. The number of feathers shows the speed of
the wind. So the short ones are five knots. The long one is 10 knots. A long
and a short is 15 knots and so on. I won't go too much into this. But the fact
is that for a meteorologist, this is a really important graph.
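The feather encoding on a wind barb is effectively a small counting system. Under the standard convention a short barb is 5 knots and a long barb is 10 knots; a filled pennant for 50 knots is also part of that convention, although it is not mentioned in the talk. The sketch below splits a wind speed into those symbols; `wind_barb` is an illustrative name.

```python
def wind_barb(speed_knots):
    """Split a wind speed into (pennants, long barbs, short barbs).

    Pennant = 50 knots, long barb = 10 knots, short barb = 5 knots.
    The speed is first rounded to the nearest 5 knots.
    """
    rounded = 5 * int(speed_knots / 5 + 0.5)
    pennants, rest = divmod(rounded, 50)
    long_barbs, rest = divmod(rest, 10)
    short_barbs = rest // 5
    return pennants, long_barbs, short_barbs


for speed in (5, 10, 15, 35, 65):
    pennants, long_barbs, short_barbs = wind_barb(speed)
    print(f"{speed:>3} kt: {pennants} pennant(s), "
          f"{long_barbs} long, {short_barbs} short")
```

Reading a barb is the reverse: count the symbols and add up their values.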
It's not as simple as a bar chart or a line graph, et cetera, but it's serving its
purpose. That's the most important thing. A visualisation has to be fit for
purpose. The next thing we'll look at is colour. I'm not going to go into
colour in a lot of detail, the main reason being because you can spend hours
talking about colour to really understand it thoroughly. I've got a few
suggestions, but the most important one, I think, is for colour if it's
important please try and find somebody who's an expert. There are lots of
different factors to consider, things like colour blindness, common
conventions, cultural differences and so on.
This is just a very simple example. These are from images of blood travelling
through an artery. This one - well, basically they showed these different
images to a lot of doctors and asked which one they preferred. Most
doctors came up with this A. However, when they asked people to diagnose
the issues with these things they were - then I think the best one was F or G
where they were able to identify the most issues or see the most problems
with a patient. So even though they thought that this one was the easiest
one to read - the colourful one - admittedly they were used to those colours,
et cetera. It's not always the case.
The reason I'm saying this is it really does say that colour can be a tricky
issue and that really it does need some expertise and, in this case, it was
actually through some research. In the slides there's a reference to this
paper that talks about this. It's quite an interesting paper. Just on the topic
of colour, here are some examples from the bureau once again. This one is
showing rainfall. It's using a gradated scale, so darker means more rainfall.
They've used the colour blue which makes sense because the more
saturated blue tends to show areas that have more saturation in terms of
rainfall.
This map is not showing how much rainfall but it's showing how variable the
rainfall is, in other words, how much it differs from year to year. So it
wouldn't have made sense to use blue here because some areas can be very
dry but at the same time have a lot of variability, or have very little
variability. Areas that may be very wet may have a very small variability
because they're wet all year round, just as areas that are dry all year round
have low variability. So they've chosen a different colour
for this one. This one is showing how much rainfall in this case fell in the
week of 23 January. This is using a scale that people who are looking at this
type of map are familiar with.
The white areas have had no rainfall or not been able to record it, and these
dark colours are the areas of the highest rainfall. Once again we see that
this scale is not linear, so there's a colour for between one and five
millimetres. There's a different colour for between 300 and 400 millimetres.
I think it's useful when looking at visualisations also to see examples of
maybe things that we can try and avoid. This is always the part of this
presentation that I feel uneasy about, but I think it's just worth having a look
at an example so we'll have a quick look at this one. So what this is talking
about is average household debt in America by this person who is a financial
data journalist.
It's how much debt you have. It's an infographic. So the first thing I looked
at when I saw this is we've got some sort of thing that looks like a
visualisation, and I tried to work out what it's telling us. I looked at it and I
thought, well, why are some people green and some people - is it the green
ones have less debt? No. All different sizes. They've - I realised that it
probably doesn't mean anything. It's just decoration, so we can move on.
So the next thing is the total owed by the average. We see credit cards are
16,000, mortgages are almost 10 times that amount, but the mortgages
aren't actually 10 times as long in the visualisation. 28,000 is a lot longer
than 16,000.
So there's clearly no clear scale - well, I should just say there's no clear scale
on this. Once again, we've got different colours but, once again, they seem
just for decoration. The other thing is I couldn't understand why any type of
debt is 134,000 while mortgages are 176,000. So it wasn't quite clear what
any type of debt meant. Also credit cards and auto loans were lumped
together with mortgages which are more of an asset and some people
differentiate between things like mortgages, which they classify as good
debt and things like auto loans which are classified as bad debt.
The next one is how much does debt cost you. This is probably one of the
better ones, but there's no - given that she's used comparative scales in the
previous ones I was surprised that there wasn't any comparative scale. I
think one thing I did notice here was that this figure from memory didn't
really add up. This was an interesting one, medical debt on the rise. There
were a few issues with this but one of the things we notice is if that's 63 per
cent then that one is about 37 per cent and yet that 37 per cent segment
actually looks a bit bigger than the 42 per cent segment.
Considering that halfway across would be 50 per cent I don't think that that
42 per cent is accurately reflected in the pie chart. I won't go into the
colours that have been chosen or talk much more about pie charts. A lot of
people have very strong opinions about how useful pie charts are. We now
come to debt broken down by age. In this one it actually looks as though
the colours may be meaningful because there are two red bars, two orange
bars, and two green bars, but once again it just seems that the colours were
arbitrarily chosen. That's all I'll say about that, but - except to say I do think
- have a look at examples and always look critically.
Look critically at your own work for things that can be improved. But also
when looking at other things think about, okay, is this a good visualisation?
Is it a bad one? When you see something that looks good what makes it
look good? When you see something that looks okay maybe think, how
could it be improved? What could this person have done to make the story
clearer? So what are some techniques that you can use when doing a
visualisation that will make it better for the people looking at it? One of the
first ones I talk about is natural mappings. What we're looking at here is
what's called a wind rose. What this is showing is wind in eight - not
quadrants but eight sectors - and how windy it is.
So this is Melbourne Airport that we're seeing here and we see that most of
the winds at Melbourne Airport are northerly. These are the averages taken
over a particular period. As we go out in this telescope it shows us stronger
and stronger winds. So, for example, we hardly ever have, let's say, gale
force winds in this south-westerly direction. There's very few easterlies at
Melbourne Airport. But the natural mapping is if it's facing upwards then
we can see straightaway it's a northerly wind. We've seen this graph before,
but the important thing is to highlight relevant information.
So if all five of these lines were the same colour it wouldn’t be quite clear
what the story's telling us, but it - given that this one is highlighted and the
others are muted we can see straightaway it - our focus shifts to this one.
The next thing, make comparisons clear. So what this is comparing is Arctic
ice. This is going back to 1879 and it's comparing the - as we're progressing
into the present. One of the things we see is it seems pretty clear that
there's less and less Arctic ice as we're coming into the present. By
overlaying those plots one on top of the other it makes it a lot clearer.
Going back to this graph we see once again by plotting all these different
attributes on the same set of vertical axes it makes those comparisons much
clearer. So, for example, when we're comparing highly likely to very good
chance we can see quite clearly how they compare. The next thing is in this
case it's probably exaggerated but make the scale clear. This is showing the
stations in Australia that record - it's showing basically the largest difference
between two days - so between the maximum temperature on day 1 and
day 2. So at these stations there was a 25 degree or 27 degree difference.
So one day the maximum temperature was 10 degrees and the next day 37
degrees, for example.
As we went further north there's less difference between successive days in
temperature in terms of their records. Yet another visualisation, this time of
space, and we've got a very different scale here. It's probably hard to read
on the slide, but that distance there is 100 million light years across. So a
light year is pretty big. 100 million light years is 100 million times as big.
Finally, colour should add meaning and not detract. We come back to this
slide, which is how much Australia has - or the warming trend in Australia
since 1970. Here clearly colour is enhancing the meaning of what we're
trying to say here.
Use conventions. If we look at this time series of temperature, at first look it
may seem that temperature is actually declining. This is just a dummy slide I
created for this presentation. What I've done here is these temperatures
are actually - if we look carefully at these numbers we see the numbers are
actually decreasing as we go from left to right. Normally when we read from
left to right we expect time to increase - in other words, get either closer to
the present or further into the future. By turning it around we've defied
that convention and then obviously made this a whole lot harder to read.
There's a lot of ways to display different dimensions, and I'll just - sorry. I'll
just go there and I'll just skip this for the moment. We'll go back to it if
we've got a bit of time. So here's another slide showing how we can plot
dimensions very differently. In this graph or in this visualisation what we've
done is this is rainfall in Africa but across a range of latitudes going
from 30 south to 30 north. So the Y axis is latitude. The X axis is the month
of the year. The actual colours depict the rainfall during those months. So
what we see here is in the southern latitudes we get rainfall around
December/January/February.
As we go north of 20 degrees north it's very dry and around about 10
degrees north they get mostly a winter rainfall. This way of plotting data is
known as a Hovmöller plot. These are called Chernoff faces and what this
does is allows us to plot multidimensional data by using faces. So Chernoff
said people, their brains are hardwired to really recognise faces quickly. So
what we can do is we've got about seven or eight different attributes we can
change. We can change the smile on their mouth. We can change the
length of their nose, the distance between their eyes, the amount by which
eyebrows are raised and so on.
So we've taken a dummy data set here comparing different universities,
different people across the universities, and then we've said, okay, we'll use,
for example, the eye colour to show how - where they are, [of data for
sharing] and maybe the length of the nose to show awareness of data
licensing, et cetera. So basically, it's a novel way of displaying data with a
high number of dimensions. As I keep saying, it's always good to break the
rules. Some people may be familiar with this image. It's called pale blue
dot. If you're not familiar it's a visualisation - well, I guess any image can be.
But what it's showing is over there there's a pale blue dot. This photograph
was taken by Voyager 1 from outer space - well, from space. That pale blue
dot there, almost single pixel down there, is Earth. So often we're told to
make the data we're displaying significant and obvious. In this case the
strength of this visualisation comes from how insignificant that tiny little dot
on that photograph is, how insignificant this huge planet that we live on is.
Da Vinci has said, simplicity is the ultimate sophistication.
I've got a few things in my slides. I'll just go back to the slide that I was
trying to find earlier. Which one did we - for some reason - what I'll do is I'll
rewind that. So - okay. So what we're going to see is how Australia's
temperatures changed for the 12 months ending December 1910. I'll just
maximise this. This is an animation. The colour shows the year and we see
as we're coming more and more to the present the colours spiralling
outward, representing warming. So I guess what makes this visualisation
effective is not only the animation but also the fact that we were able to
draw a line which shows about 100 years of data which typically would have
been a very long line but in this case by wrapping it around the inner circle
we were able to show it all in one compact way.
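The wrapping trick can be sketched as a change of coordinates: the month sets the angle and the temperature sets the distance from the centre, so each year is one lap and warmer years sit further out. This is a minimal illustration, not the code behind the animation; the function name `spiral_points`, the `base_radius` parameter and the two-year series are assumptions made up for the example.

```python
import math


def spiral_points(anomalies, base_radius=1.0):
    """Convert a monthly series into (x, y) points around a circle.

    Each month advances the angle by one twelfth of a turn, so every year
    completes one lap; the radius grows with the temperature anomaly.
    """
    points = []
    for i, anomaly in enumerate(anomalies):
        angle = 2 * math.pi * (i % 12) / 12
        radius = base_radius + anomaly
        points.append((radius * math.cos(angle), radius * math.sin(angle)))
    return points


# Invented series: two years, the second uniformly 0.5 degrees warmer.
series = [0.0] * 12 + [0.5] * 12
points = spiral_points(series)
first_lap, second_lap = points[:12], points[12:]
```

Joining the points in order and colouring each lap by its year gives the spiral: a hundred years of monthly values fit inside one small circle instead of one very long line.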
So finally, all visualisations are wrong. What do I mean? There's a famous
quote from George Box, the statistician, that said, all models are wrong. He
said, all models are wrong. The only question of interest is is the model
illuminating and useful. I've changed that to, all visualisations are wrong.
The question is is the visualisation illuminating, useful, and does it have
integrity? Thank you.
Gerry Ryder: Thank you so much, Martin, for that really valuable presentation that I'm
sure has given us all a lot of ideas and some things to look forward to in the
next webinar where we'll actually see some of the tools that you've used to
create these examples. We do have time for questions if we have anyone in
the audience that would like to ask Martin a question about anything he's
presented on today. Please do put it into the question pod and I'd happily
relay that and put Martin on the spot. So we've got a number of people
thanking you, Martin, for a really interesting talk. We have got one
question, Martin, from [Mark Mackay] who's asked if you could suggest any
textbooks or papers that he could share with students.
Martin Schweitzer: Yes, I do, quite a few. I've actually put them in the slides. So at the end of
the slides there's some references. I believe the slides are going to be made
available, Gerry.
Gerry Ryder: Yes. That's correct. We'll have both the slides up as well as the recording
up. So you can have a look at the slides separately to the recording.
Gerry Ryder: Another question, Martin, can you provide the name of the visualisation
with the faces. Somebody's obviously liked that one.
Martin Schweitzer: Chernoff faces, C-H-E-R-N-O, either V or F-F.
Gerry Ryder: So perhaps we might put them - Susannah, we might be able to pop that in
the question box for people to see, C-H-E-R-N-O-V or F-F. Someone's -
Richard's asked, Martin, you've used Jupyter Notebooks. He's pre-empting
the next webinar. What sort of other technologies do you normally use to
build visualisations? Another question related about open source software
for visualisations. So I know we'll cover that in the next webinar, but
perhaps a teaser today, Martin.
Martin Schweitzer: So definitely Jupyter Notebooks and Python. So the next webinar will focus
largely on Python. I also do a lot of work with web front-ends and
JavaScript. So if somebody's working with JavaScript there's a huge array of
visualisation tools but probably if one - if you don't mind a steep learning
curve and want to be able to do absolutely everything, D3.js is the go to one
and it's open source.
Gerry Ryder: Thank you, Martin. Someone wants you to - [Jacinta] wants you to look in a
crystal ball and asks, what do you see is the future direction of data
visualisation?
Martin Schweitzer: Wow. I think the - what's happening is we're getting to things with higher
and higher resolution. We're going to more dimensions so we've got the
three dimensional static flatwork. We move to two dimensional animation
with the web. One of the things that's becoming popular is virtual reality, so
people can put on some glasses and maybe see storms being - the data for
the storm being visualised but in their own surroundings. So what does it
feel if a rain - and that actually gets us on to the next one which is
augmented reality.
So I can look around at Monash University or let's say I could go down to St
Kilda Beach and see what it's going to look like maybe in 100 years with the
sea level rising two feet or 10 feet or something like that. So both exciting
and scary.
Gerry Ryder: As technology changes tend to be. We do have a couple more minutes if
there is any other final questions for Martin. So Lisa is interested in the
relationship between storytelling and data and the idea of integrity and
worries about collecting data to suit a story and there being a lack of rigour
and accountability. I guess that's a comment more than a question, but you
might like to respond to that, Martin.
Martin Schweitzer: I think it's a - integrity is always in the mind of the beholder, so that you
can't - data cannot have integrity. The people using and presenting the data
need to have integrity. They need to present the data with integrity. I
would say any tool that can be used for good can also be used for evil. So,
yes, people can create visualisations that try and push an agenda or push a
point, et cetera. Hopefully by being more critical of visualisations we can
actually see those ones where somebody is trying to push something which
isn't true.
That's why I also push for integrity in data that as soon as we show a
visualisation that, let's say, only shows 30 years of data where maybe, let's
say, temperatures have been decreasing immediately it puts a cloud over
everything that person is saying because why have they picked that one 30
year period where the temperature was dropping? So I think in the long run
it pays to be as honest as one can about data.
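The effect of picking a convenient period can be shown with a small sketch: fit a straight line to the whole record, then to every short window, and compare. The series below is invented (a steady rise plus a repeating wiggle), the five-step window is an arbitrary choice for the example rather than the 30 years mentioned above, and the names `slope` and `window_slopes` are hypothetical.

```python
def slope(values):
    """Least-squares slope of values against their index (units per step)."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


def window_slopes(values, width):
    """Slope of every consecutive window of the given width."""
    return [slope(values[i:i + width]) for i in range(len(values) - width + 1)]


# Invented series: steady rise of 0.01 per year plus a repeating wiggle.
wiggle = [0.0, 0.3, 0.5, 0.3, 0.0, -0.3, -0.5, -0.3]
temps = [0.01 * year + wiggle[year % len(wiggle)] for year in range(100)]

full_trend = slope(temps)                        # trend over the whole record
steepest_drop = min(window_slopes(temps, 5))     # the most "convenient" short window
```

Here `full_trend` is positive while `steepest_drop` is negative, so a chart showing only that one window would suggest cooling in a series that is rising overall.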
Gerry Ryder: A final question today, thanks, Martin. Is there a common standard for
colour coding for general use in data visualisation?
Martin Schweitzer: A very simple and short answer, no, absolutely not. However, there is a
website called ColorBrewer - actually, it's called ColorBrewer 2, so colour is
spelt the American way, and brewer like somebody who brews. I would
recommend anybody looking for a good set of colours to go there first.
[There are tools] for visualisation. [We'll] actually use the - so it was written
by a researcher called - her last name is Brewer and she's done a lot of
research into colour and how to use it well.
Gerry Ryder: Great. I'd like to thank now Martin for his presentation today and also
acknowledge Susannah who's been quietly sitting in the background
responding to your questions and making sure the webinar runs smoothly.
So thank you all today and have a great afternoon.
END OF TRANSCRIPT

  • 3. Page 3 of 19 identical in all four of them. The linear regression is exactly the same. So a statistician may be tempted to just say, well, these numbers are pretty much the same. However, as soon as we look at the visualisation - in other words see the values plotted - we see something quite different. So just one example of how seeing a visualisation is very different to looking at the raw data. Another example that I've taken is - we'll just go to some text and we will have a look at a file. This is the contents of a file. As you can see, it's probably not easy to interpret what's in the file. Most files when they're stored on a disc are just bunches of numbers. If I told you that these numbers represent RGB values arranged according to an XY grid, once again it may not be obvious what the numbers represent. However, if I do this and present them as an image - excuse me - suddenly we see, okay, we have an image. So as numbers the numbers meant - or as data the numbers meant absolutely nothing. However, as soon as we visualise it as an image it will make sense. So as Gerry mentioned, I've been interested in visualisation for a very long time. In fact, over 20 years. One of the first books I came across was by Edward Tufte and was one of the seminal works. At that time I think it wasn't really realised that it would become a seminal work. He wrote a book called The Visual Display of Quantitative Information. In it he says, excellence in statistical graphics consists of complex ideas communicated with clarity, precision and efficiency. For the rest of the presentation I'm going to try and expand some of these ideas. So he came up with a few principles. The first one is graphical displays should show the data. We'll go through these principles first and then we'll look at examples. 
It should induce the viewer to think about the substance rather than about methodology, avoid distorting what the data has to say, present many numbers in a small space, make large data sets coherent, and reveal the data at several levels of detail from a broad overview to the fine structure.
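The claim made earlier about Anscombe's quartet, that the four data sets share almost the same summary statistics, can be checked in a few lines of Python. This is a minimal sketch using only the standard library; the data values are Anscombe's published ones, while the function and variable names (`summarise`, `QUARTET`, `SUMMARIES`) are just illustrative choices.

```python
from statistics import mean, variance

# Anscombe's quartet. Sets I, II and III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarise(xs, ys):
    """Return means, sample variances, correlation and the least-squares line."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {
        "mean_x": mx,
        "mean_y": my,
        "var_x": variance(xs),
        "var_y": variance(ys),
        "corr": sxy / (sxx * syy) ** 0.5,
        "slope": slope,
        "intercept": my - slope * mx,
    }


SUMMARIES = {name: summarise(xs, ys) for name, (xs, ys) in QUARTET.items()}

for name, stats in SUMMARIES.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

The printed rows agree to about two decimal places, so the regression lines are near-identical rather than strictly identical; it is only when the points are plotted that the four sets look different.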
  • 4. Page 4 of 19 In this book he's got many fine examples; however, I've tried to find more modern examples and I've taken some of the examples from the work that I do. So the first one, show the data. What we're looking at here is a rainfall map of Australia and the government instituted a plan where they said they would give farmers concessionary loans if they were in a region that had suffered a one in 10 year rainfall deficiency or one in 20 year rainfall deficiency. So the map we're seeing here is a map where users can typically zoom in and out, but what we've done is to show only those areas where - that are affected or covered by this concessionary loan. So I guess one of the things is we could have shown a typical rainfall map, but ideally make this simple as possible and show only the data, so the pink and red areas are the areas that had been affected by either a one in 10 or one in 20 year rainfall deficiency. Next, induce the viewer to think about the substance rather than about the methodology. So what we're looking at here is in Kyoto, Japan cherry blossoms are a big thing. In Kyoto they've been recording the peak of the cherry blossom season since the year 800. So they have over 1000 years of data. What somebody's done is to plot all this data. What we see is that for about a century they pretty much peak between 10 and 20 April. However, since the early twentieth century they start blossoming earlier and earlier and a lot of people would say, well, this is a signal of climate change. However, what we wanted to show about this graph is that the person has plotted the actual data points using a little image of cherry blossoms which is quite cute. But they also noted in an article they wrote about it that initially they had plotted it with a cherry blossom with six petals until somebody pointed out that cherry blossoms only have five petals. 
The point about that is if people are thinking about how many petals the cherry blossoms have rather than about what the graph is saying maybe they should have thought more about the substance than the methodology. But nonetheless, I think with any of these rules often it's a good thing to
  • 5. Page 5 of 19 break a rule now and again because in this case, for example, I certainly remembered this graph long after I'd seen it because I remembered the issue with the cherry blossoms. The next one was avoid distorting the data and here we're going to do something exciting and that's do it live. So what I've done is we're now seeing what's known as Jupyter Notebook. I imagine a lot of people would be familiar with Jupyter Notebook. Jupyter Notebook allows us to run Python code and in the next webinar the whole webinar will be based around looking at our work in Jupyter Notebook; however, this is a small demo that I've got in this presentation. What we're looking at here is storage levels in the dams that are around Melbourne. So the first graph I'll pull up I'll just - so this is fantastic at work. What we see in this graph is it looks like the Thomson, Cardinia and Upper Yarra dams are really low and all the rest of them are almost full. So we may worry a bit about that. However, when we look at this graph we see that we started - the base of it was 60 per cent full. So Cardinia, for example, is - well, let's take Thomson. It's actually almost 65 per cent full so it's really not that bad. When we look at the graph plotted against - starting at zero we note as well it doesn't look that bad. We may also look at this and say, well, the other dams are all over 80 per cent so we've got nothing to worry about. However, not all these dams are the same size, so looking at only the percentage can be a bit misleading. So let's run this one. What we see here is that the amount of space in the Thomson Dam, there's probably not enough water in all of these smaller dams to even fill that gap that's in the Thomson Dam. So that's what we mean when we say avoid distorting the data. Try and make sure that we're telling a story with integrity. The next principle was to present many numbers in a small space. 
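The two effects in the dam example (an axis that starts at 60 per cent, and percentages that hide differences in size) can be put into numbers. The sketch below is not the notebook shown in the webinar: the reservoir names, capacities and fill levels are invented purely for illustration, and `bar_height_ratio`, `stored` and `spare` are hypothetical helper names.

```python
# Hypothetical reservoirs: (capacity in gigalitres, percentage full).
# These figures are made up for illustration; they are NOT real dam data.
RESERVOIRS = {
    "Large": (1000, 65),
    "Small A": (100, 90),
    "Small B": (50, 85),
    "Small C": (40, 95),
}


def bar_height_ratio(a, b, baseline):
    """How many times taller bar `a` is drawn than bar `b` when the axis starts at `baseline`."""
    return (a - baseline) / (b - baseline)


def stored(capacity, percent_full):
    """Volume of water currently held."""
    return capacity * percent_full / 100


def spare(capacity, percent_full):
    """Volume still needed to fill the reservoir."""
    return capacity * (100 - percent_full) / 100


# A 90% bar next to a 65% bar, drawn from 0 and then from 60.
ratio_from_zero = bar_height_ratio(90, 65, baseline=0)
ratio_from_sixty = bar_height_ratio(90, 65, baseline=60)

# Empty space in the large reservoir versus all the water in the small ones.
large_gap = spare(*RESERVOIRS["Large"])
small_water = sum(stored(*v) for name, v in RESERVOIRS.items() if name != "Large")

print(f"bar ratio from 0: {ratio_from_zero:.2f}, from 60: {ratio_from_sixty:.2f}")
print(f"gap in large: {large_gap} GL, water in small ones: {small_water} GL")
```

With these made-up numbers the truncated axis makes the fuller bar look six times taller instead of about 1.4 times, and the water in the three small reservoirs together would not fill the empty space in the large one.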
The map that we're looking at here is Australian rainfall deciles. So this is that - the areas that are in this bright red have received the least rainfall this December, they're in the lowest one per cent of December rainfalls.
  • 6. Page 6 of 19 These tiny dark blue patches are in the highest one per cent of rainfall that - this record goes back to 1910 so they take every year from 1910. We say present many numbers in a small space. So what we're looking at here is a grid, and they're roughly 640 by 800 grid cells. So each one is calculated and for each one there's 117 years of data. So what we're looking at is almost 36 million data points; however, we've condensed those 36 million data points into one, well, simple map. So I think this is a fantastic example of presenting many numbers in a small space. Sometimes, as I said, we want to break the rules and get something where we break the rules. This was the recent tropical cyclone. We've got a visualisation that shows the current position of the cyclone. This arguably is just one data point; however, it's a really important data point, particularly if you're living in the north of Western Australia and you want to know how close the cyclone is or whether it's got a chance. Also we can - by clicking on that one point we see a far more detailed image which then takes us into seeing the data at different levels. The next one was around making large data sets coherent. This is something that at the bureau we're very interested in. How do you communicate things like probability? When people hear almost certainly do they think that an event is more probable or less probable than if they hear highly likely or if they hear very good chance? So what they've done here is taken all these terms and presented them using a technique known as KDE on one graph. So we can very easily compare that, for example, if somebody says, chances are slight, that people think that there's actually slightly more chance of an event happening than if we, for example, say, it's highly unlikely, or if we say, there's almost no chance. So that covers off on Tufte. 
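The count of data points behind the decile map is simple multiplication, and it is worth checking against the figures as transcribed. In the sketch below, the 640 by 800 grid and the 117 years come from the transcript; the 640 by 480 grid is only a guess at a size that would give the quoted "almost 36 million" and is not stated anywhere in the talk.

```python
YEARS = 117  # years of data per grid cell, as stated in the talk


def data_points(columns, rows, years=YEARS):
    """Total number of values behind the map: one per grid cell per year."""
    return columns * rows * years


as_transcribed = data_points(640, 800)  # grid size as given in the transcript
alternative = data_points(640, 480)     # hypothetical grid size, an assumption

print(f"640 x 800 x {YEARS} = {as_transcribed:,}")
print(f"640 x 480 x {YEARS} = {alternative:,}")
```

Taken literally, the transcribed grid gives close to 60 million values, so either the grid size or the total may have been misheard or rounded. Either way, tens of millions of numbers are condensed into one map.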
The next few slides are some of my ideas and some of my experience in developing visualisations and some things that I feel are important. One of the most important things in any visualisation is that you actually have something interesting to say about the data. Whenever I see somebody saying, we've got this data, it looks pretty boring. Can we just create a
  • 7. Page 7 of 19 visualisation, well, that's when the hairs on my neck prickle a bit. So this is a famous video. It started off as a TED Talk by the Swedish Hans Rosling. [Video playing] Martin Schweitzer: Okay. I think people get the idea. Now, one of the things that strikes me about that video is talking about inequality, et cetera, and gave this TED Talk. At a similar time, Thomas Piketty, who was famous for his book on capitalism, also gave a TED Talk. I watched both talks. Both were equally impressive. I thought Piketty's was the more impressive. However, Rosling's - the one you've just seen - got 10 times as many views roughly as Piketty's, and I think the real reason it got so many views was because it had such a story here. It had such remarkable visualisation and graphics. So it certainly says that it's important. Obviously Rosling is a very - or was a very impressive storyteller and was just a very impressive presenter and so did it really well. Of course, not all of us have his talents; however, we can all do good or great visualisations. So here's a simpler graphic and this one shows the trend in maximum temperatures from 1970 to 2016. So wherever the graph is red the average maximum temperature has been increasing and wherever the graph is blue the maximum temperature has been decreasing over the years. I think this one tells quite an alarming story. Here's another visualisation and this one I've got three slides which show a progression of how we're trying to convey something. So in the first slide the person has just taken the data and they've put it - this is rainfall data. They've started at 1900 and showed how much rainfall up to years 2010. Now, there are two large influences on rainfall. One is the ENSO which is - often we hear that in a La Niña system or an El Niño system. The other one is what is marked as IOD which is Indian Ocean Dipole. Once again, these can be either positive or negative. 
So we've got two, four, six, seven different colours in the graph showing that when this rainfall fell what kind of system we were in. However, this doesn't
  • 8. Page 8 of 19 really tell a good story. If we look at it having been rearranged we see that the blue lines on the right when - all the years where we had a lot of rainfall all tended to be where we had a La Niña and a negative Indian Ocean Dipole. The red and brown on the left were during generally El Niño years. However, we can improve this as well because we've got seven different things. We have to keep looking at the colours, move forwards and backwards. So here's a graph where what we've done is we've plotted the IOD along the bottom going from negative to positive. We've plotted the ENSO along the left-hand side. So these numbers in the top right we can see had a strong ENSO signal, strong La Niña, and a positive IOD, while these numbers to the left had a - sorry. These are the La Niña and the negative IOD. We can see as it gets stronger how it affects the rainfall. Here's another graph which also tells quite an alarming story. This is the water supply in Cape Town and in 2013/14 we can see they typically get their rainfall in winter. So around about - from October onwards the dam levels start falling. Because for about the last five years they haven't been - there hasn't been good rain, they've continually been falling each year progressively. That's 2013, 2014, 2015, 2016, up till this year which is 2017/18. We see when I pulled up this graph it was between January and February and we were over there and they were projecting that around April/May/June Cape Town could run out of water. There were a few projections. One is if people use 600 megalitres a day of water, one with 500. One is if they were using 600 megalitres and they've started up desal plants so what would happen. All of them show pretty dire consequences. A visualisation like this really does tell a story. So the next principle is keep your graph as simple as possible. I've made a very quick 3D graph. 
I've just made a fictitious one, which is how many people attended at morning teas and maybe the person that attends the most morning teas at the end of the year gets a prize and the person who has attended the least gets a wooden spoon. So this was my first graph and I felt, well, this can always be improved. Whenever I see
  • 9. Page 9 of 19 a 3D graph if it's not displaying 3D data I'm a little bit disturbed. So I modified it so we've now got a 2D graph. However, the numbers are in the [box]. We probably don't need those grids and as many of them. We certainly don't need our dotted and solid line grids, so I cleaned that up a bit. So there's a simpler graph. However, when looking at that graph - and often I see graphs like this - the first question I ask is, what do those colours mean? Why are there different colours? Well, in this case the colours mean absolutely nothing, so I've got rid of the colours. The next thing is getting back to this idea maybe of telling a story. What am I trying to say? Well, really what I'm trying to do is find out who attended the most and least morning teas. So maybe by improving the graph, well, I've now put the least - I've ordered them from least to most and now it's quite obvious who's attended the least and who's attended the most. So is there anything else we can do to make this presentation simpler or to remove any unnecessary data et cetera? This is a trick question, but of course there is. Well, in this particular case I think we can just remove the graph altogether. I don't think that that visualisation has given us any more information than simply looking at a table of numbers. The table remains ordered. I get exactly that same information. So it's probably important to ask that question occasionally. Do we really need a graph for this data, or do we really need a visualisation for this data? I think Antoine de Saint-Exupery said it best when he said, perfection is achieved not when there's nothing more to add but when there's nothing left to take away. However - this was a however - Einstein was apparently famous for saying, make it as simple as possible but no simpler. So here's another example of a visualisation. This is called a skew-T log-P graph, and this is used by meteorologists every single day. Temperature is on these diagonals. 
The pressure is going along this way. The reason it's called log-P is because at the bottom you see the gap between 100 - 900 and 1000 is much smaller than the gap between 200 and
  • 10. Page 10 of 19 300. So even the scale appears to be changing. There are two different colour lines. Each of those lines has a meaning. The red line is what was recorded today and the blue yesterday. In case - well, I imagine most people aren't familiar with these graphs. So what this is actually plotting is at a lot of locations around the world they send up weather balloons or sondes. So this is plotting the temperature as the balloon is moving up through the atmosphere. So we can see that it's getting cooler, et cetera. The second line is the dew point. So we can see, for example, if the dew point crosses the temperature we're going to get precipitation or rainfall and so on. On the right-hand side we've got another particularly interesting thing being visualised here and these are called wind barbs. The direction of the wind barb shows the direction of the wind. So these ones pointing upwards show northerly wind. The number of feathers shows the speed of the wind. So the short ones are five knots. The long one is 10 knots. A long and a short is 15 knots and so on. I won't go too much into this. But the fact is that for a meteorologist, this is a really important graph. It's not as simple as a bar chart or a line graph, et cetera, but it's serving its purpose. That's the most important thing. A visualisation has to be fit for purpose. The next thing we'll look at is colour. I'm not going to go into colour in a lot of detail, the main reason being because you can spend hours talking about colour to really understand it thoroughly. I've got a few suggestions, but the most important one, I think, is for colour if it's important please try and find somebody who's an expert. There are lots of different factors to consider, things like colour blindness, common conventions, cultural differences and so on. This is just a very simple example. These are from images of blood travelling through an artery. 
This one - well, basically they showed these different images to a lot of doctors and asked which one they preferred. Most doctors came up with this A. However, when they asked people to diagnose the issues with these things they were - then I think the best one was F or G where they were able to identify the most issues or see the most problems
  • 11. Page 11 of 19 with a patient. So even though they thought that this one was the easiest one to read - the colourful one - admittedly they were used to those colours, et cetera. It's not always the case. The reason I'm saying this is it really does say that colour can be a tricky issue and that really it does need some expertise and, in this case, it was actually through some research. In the slides there's a reference to this paper that talks about this. It's quite an interesting paper. Just on the topic of colour, here are some examples from the bureau once again. This one is showing rainfall. It's using a gradated scale, so darker means more rainfall. They've used the colour blue which makes sense because the more saturated blue tends to show areas that have more saturation in terms of rainfall. This map is not showing how much rainfall but it's showing how variable the rainfall is, in other words, how much it differs from year to year. So it wouldn't have made sense to use blue here because some areas can be very dry but at the same time have a lot of variability of - or have very little variability. Areas that may be very wet may have a very small variability because they're wet all year round, just as areas that are wet all year around have low variability. So this one is showing - that's chosen a different colour for this one. This one is showing how much rainfall in this case fell in the week of 23 January. This is using a scale that people who are looking at this type of map are familiar with. The white areas have had no rainfall or not been able to record it, and these dark colours are the areas of the highest rainfall. Once again we see that this scale is not linear, so there's a colour for between one and five millimetres. There's a different colour for between 300 and 400 millimetres. I think that's useful when looking at visualisations also to see examples of maybe things that we can try and avoid. 
This is always the part of this presentation that I feel uneasy about, but I think it's just worth having a look at an example so we'll have a quick look at this one. So what this is talking
  • 12. Page 12 of 19 about is average household debt in America by this person who is a financial data journalist. It's how much debt you have. It's an infographic. So the first thing I looked at when I saw this is we've got some sort of thing that looks like a visualisation, and I tried to work out what it's telling us. I looked at it and I thought, well, why are some people green and some people - is it the green ones have less debt? No. All different sizes. They've - I realised that it probably doesn't mean anything. It's just decoration, so we can move on. So the next thing is the total owed by the average. We see credit cards are 16,000, mortgages are almost 10 times that amount, but the mortgages aren't actually 10 times as long in the visualisation. 28,000 is a lot longer than 16,000. So there's clearly no clear scale - well, I should just say there's no clear scale on this. Once again, we've got different colours but, once again, they seem just for decoration. The other thing is I couldn't understand why any type of debt is 134,000 while mortgages are 176,000. So it wasn't quite clear what any type of debt meant. Also credit cards and auto loans were lumped together with mortgages which are more of an asset and some people differentiate between things like mortgages, which they classify as good debt and things like auto loans which are classified as bad debt. The next one is how much does debt cost you. This is probably one of the better ones, but there's no - given that she's used comparative scales in the previous ones I was surprised that there wasn't any comparative scale. I think one thing I did notice here was that this figure from memory didn't really add up. This was an interesting one, medical debt on the rise. There were a few issues with this but one of the things we notice is if that's 63 per cent then that one is about 37 per cent and yet that 37 per cent segment actually looks a bit bigger than the 42 per cent segment. 
Considering that halfway across would be 50 per cent I don't think that that 42 per cent is accurately reflected in the pie chart. I won't go into the colours that have been chosen or talk much more about pie charts. A lot of
  • 13. Page 13 of 19 people have very strong opinions about how useful pie charts are. We now come to debt broken down by age. In this one it actually looks as though the colours may be meaningful because there are two red bars, two orange bars, and two green bars, but once again it just seems that the colours were arbitrarily chosen. That's all I'll say about that, but - except to say I do think - have a look at examples and always look critically. Look critically at your own work at things that can be improved. But also when looking at other things think about, okay, is this a good visualisation? Is it a bad one? When you see something that looks good what makes it look good? When you see something that looks okay maybe think, how could it be improved? What could this person have done to make the story clearer? So what are some techniques that you can use when doing a visualisation that will make it better for the people looking at it? One of the first ones I talk about is natural mappings. What we're looking at here is what's called a wind rose. What this is showing is wind in eight - not quadrants but eight sectors - and how windy it is. So this is Melbourne Airport that we're seeing here and we see that most of the winds at Melbourne Airport are northerly. These are the averages taken over a particular period. As we go out in this telescope it shows us stronger and stronger winds. So, for example, we hardly ever have, let's say, gale force winds in this south-westerly direction. There are very few easterlies at Melbourne Airport. But the natural mapping is if it's facing upwards then we can see straightaway it's a northerly wind. We've seen this graph before, but the important thing is to highlight relevant information. So if all five of these lines were the same colour it wouldn’t be quite clear what the story's telling us, but it - given that this one is highlighted and the others are muted we can see straightaway it - our focus shifts to this one. 
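The wind rose described above rests on a simple binning step: each observation is assigned to one of eight direction sectors and to a speed band, and the counts decide how far each arm extends. The sketch below shows one way that step might look. The observations, the speed thresholds and the names `sector`, `speed_band` and `OBSERVATIONS` are all hypothetical and are not taken from the Bureau's data or code.

```python
from bisect import bisect_right
from collections import Counter

SECTORS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]
SPEED_EDGES = [10, 20, 30]  # hypothetical band edges in knots
SPEED_LABELS = ["0-10 kn", "10-20 kn", "20-30 kn", "30+ kn"]


def sector(direction_deg):
    """Map the direction the wind blows FROM (degrees clockwise from north)
    to one of eight 45-degree sectors centred on the compass points."""
    index = int((direction_deg % 360 + 22.5) // 45) % 8
    return SECTORS[index]


def speed_band(speed_knots):
    """Label the speed band that a wind speed falls into."""
    return SPEED_LABELS[bisect_right(SPEED_EDGES, speed_knots)]


# Invented (direction in degrees, speed in knots) observations.
OBSERVATIONS = [(350, 12), (10, 25), (5, 8), (180, 15), (225, 5), (90, 3)]

by_sector = Counter(sector(d) for d, _ in OBSERVATIONS)
by_sector_and_band = Counter((sector(d), speed_band(s)) for d, s in OBSERVATIONS)

print(by_sector.most_common())
print(by_sector_and_band)
```

Drawing the "N" counts as the arm that points up the page is what gives the natural mapping: a reader sees the longest arm pointing north and reads it as mostly northerly wind without consulting a legend.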
The next thing, make comparisons clear. So what this is comparing is arctic ice. This is going back to 1879 and it's comparing the - as we're progressing into the present. One of the things we see is it seems pretty clear that
  • 14. Page 14 of 19 there's less and less arctic ice as we're coming into the present. By overlaying those plots one on top of the other it makes it a lot clearer. Going back to this graph we see once again by plotting all these different attributes on the same set of vertical axes it makes those comparisons much clearer. So, for example, when we're comparing highly likely to very good chance we can see quite clearly how they compare. The next thing is in this case it's probably exaggerated but make the scale clear. This is showing the stations in Australia that record - it's showing basically the largest difference between two days - so between the maximum temperature on day 1 and day 2. So at these stations there was a 25 degree or 27 degree difference. So one day the maximum temperature was 10 degrees and the next day 37 degrees, for example. As we went further north there's less difference between successive days in temperature in terms of their records. Yet another visualisation, this time of space, and we've got a very different scale here. It's probably hard to read on the slide, but that distance there is 100 million light years across. So a light year is pretty big. 100 million light years is 100 million times as big. Finally, colour should add meaning and not detract. We come back to this slide, which is how much Australia has - or the warming trend in Australia since 1970. Here clearly colour is enhancing the meaning of what we're trying to say here. Use conventions. If we look at this time series of temperature, at first look it may seem that temperature is actually declining. This is just a dummy slide I created for this presentation. What I've done here is these temperatures are actually - if we look carefully at these numbers we see the numbers are actually decreasing as we go from left to right. Normally when we read from left to right we expect time to increase - in other words, get either closer to the present or further into the future. 
By turning it around we've defied that convention and then obviously made this a whole lot harder to read. There's a lot of ways to display different dimensions, and I'll just - sorry. I'll just go there and I'll just skip this for the moment. We'll go back to it if
  • 15. Page 15 of 19 we've got a bit of time. So here's another slide showing how we can plot dimensions very differently. In this graph or in this visualisation what we've done is this is rainfall in Africa but across a range of latitudes going from 30 south to 30 north. So the Y axis is latitude. The X axis is the month of the year. The actual colours depict the rainfall during those months. So what we see here is in the southern latitudes we get rainfall around December/January/February. As we go north of 20 degrees north it's very dry and around about 10 degrees north they get mostly a winter rainfall. This way of plotting data is known as a Hovmöller plot. These are called Chernoff faces and what this does is allows us to plot multidimensional data by using faces. So Chernoff said people, their brains are hardwired to really recognise faces quickly. So what we can do is we've got about seven or eight different attributes we can change. We can change the smile on their mouth. We can change the length of their nose, the distance between their eyes, the amount by which eyebrows are raised and so on. So we've taken a dummy data set here comparing different universities, different people across the universities, and then we've said, okay, we'll use, for example, the eye colour to show how - where they are, [of data for sharing] and maybe the length of the nose to show awareness of data licensing, et cetera. So basically, it's a novel way of displaying data with a high number of dimensions. As I keep saying, it's always good to break the rules. Some people may be familiar with this image. It's called Pale Blue Dot. If you're not familiar it's a visualisation - well, I guess any image can be. But what it's showing is over there there's a pale blue dot. This photograph was taken by Voyager 1 from outer space - well, from space. That pale blue dot there, almost single pixel down there, is Earth. 
So often we're told to make the data we're displaying significant and obvious. In this case the strength of this visualisation comes from how insignificant that tiny little dot on that photograph is, how insignificant this huge planet that we live on is. Da Vinci has said, simplicity is the ultimate sophistication.
I've got a few things in my slides. I'll just go back to the slide that I was trying to find earlier. Which one did we - for some reason - what I'll do is I'll rewind that. So - okay. So what we're going to see is how Australia's temperatures changed for the 12 months ending December 1910. I'll just maximise this. This is an animation. The colour shows the year and we see as we're coming more and more to the present the colours spiralling outward, representing warming. So I guess what makes this visualisation effective is not only the animation but also the fact that we were able to draw a line which shows about 100 years of data which typically would have been a very long line but in this case by wrapping it around the inner circle we were able to show it all in one compact way.
So finally, all visualisations are wrong. What do I mean? There's a famous quote from George Box, the statistician, that said, all models are wrong. He said, all models are wrong. The only question of interest is is the model illuminating and useful. I've changed that to, all visualisations are wrong. The question is is the visualisation illuminating, useful, and does it have integrity? Thank you.
Gerry Ryder: Thank you so much, Martin, for that really valuable presentation that I'm sure has given us all a lot of ideas and some things to look forward to in the next webinar where we'll actually see some of the tools that you've used to create these examples. We do have time for questions if we have anyone in the audience that would like to ask Martin a question about anything he's presented on today. Please do put it into the question pod and I'd happily relay that and put Martin on the spot. So we've got a number of people thanking you, Martin, for a really interesting talk. We have got one question, Martin, from [Mark Mackay] who's asked if you could suggest any textbooks or papers that he could share with students.
Martin Schweitzer: Yes, I do, quite a few. I've actually put them in the slides. So at the end of the slides there's some references. I believe the slides are going to be made available, Gerry.
Gerry Ryder: Yes. That's correct. We'll have both the slides up as well as the recording up. So you can have a look at the slides separately to the recording.
Gerry Ryder: Another question, Martin, can you provide the name of the visualisation with the faces. Somebody's obviously liked that one.
Martin Schweitzer: Chernoff faces, C-H-E-R-N-O, either V or F-F.
Gerry Ryder: So perhaps we might put them - Susannah, we might be able to pop that in the question box for people to see, C-H-E-R-N-O-V or F-F. Someone's - Richard's asked, Martin, you've used Jupyter Notebooks. He's pre-empting the next webinar. What sort of other technologies do you normally use to build visualisations? Another question related about open source software for visualisations. So I know we'll cover that in the next webinar, but perhaps a teaser today, Martin.
Martin Schweitzer: So definitely Jupyter Notebooks and Python. So the next webinar will focus largely on Python. I also do a lot of work with web front-ends and JavaScript. So if somebody's working with JavaScript there's a huge array of visualisation tools but probably if one - if you don't mind a steep learning curve and want to be able to do absolutely everything, D3.js is the go to one and it's open source.
Gerry Ryder: Thank you, Martin. Someone wants you to - [Jacinta] wants you to look in a crystal ball and asks, what do you see is the future direction of data visualisation?
Martin Schweitzer: Wow. I think the - what's happening is we're getting to things with higher and higher resolution. We're going to more dimensions so we've got the three dimensional static flatwork. We move to two dimensional animation with the web. One of the things that's becoming popular is virtual reality, so people can put on some glasses and maybe see storms being - the data for the storm being visualised but in their own surroundings. So what does it feel if a rain - and that actually gets us on to the next one which is augmented reality. So I can look around at Monash University or let's say I could go down to St Kilda Beach and see what it's going to look like maybe in 100 years with the sea level rising two feet or 10 feet or something like that. So both exciting and scary.
Gerry Ryder: As technology changes tend to be. We do have a couple more minutes if there is any other final questions for Martin. So Lisa is interested in the relationship between storytelling and data and the idea of integrity and worries about collecting data to suit a story and there being a lack of rigour and accountability. I guess that's a comment more than a question, but you might like to respond to that, Martin.
Martin Schweitzer: I think it's a - integrity is always in the mind of the beholder, so that you can't - data cannot have integrity. The people using and presenting the data need to have integrity. They need to present the data with integrity. I would say any tool that can be used for good can also be used for evil. So, yes, people can create visualisations that try and push an agenda or push a point, et cetera. Hopefully by being more critical of visualisations we can actually see those ones where somebody is trying to push something which isn't true. That's why I also push for integrity in data that as soon as we show a visualisation that, let's say, only shows 30 years of data where maybe, let's say, temperatures have been decreasing immediately it puts a cloud over everything that person is saying because why have they picked that one 30 year period where the temperature was dropping? So I think in the long run it pays to be as honest as one can about data.
Gerry Ryder: A final question today, thanks, Martin. Is there a common standard for colour coding for general use in data visualisation?
Martin Schweitzer: A very simple and short answer, no, absolutely not. However, there is a website called ColorBrewer - actually, it's called ColorBrewer 2, so colour is spelt the American way, and brewer like somebody who brews. I would recommend anybody looking for a good set of colours to go there first. [There are tools] for visualisation. [We'll] actually use the - so it was written by a researcher called - her last name is Brewer and she's done a lot of research into colour and how to use it well.
Gerry Ryder: Great. I'd like to thank now Martin for his presentation today and also acknowledge Susannah who's been quietly sitting in the background responding to your questions and making sure the webinar runs smoothly. So thank you all today and have a great afternoon.
END OF TRANSCRIPT