How Humans
See Data
John Rauser
@jrauser
November 2016
visualization
is
communication
how to make better visualizations
help humans solve analytical
problems quickly and accurately
with visualization
Part I: Why visualize data at all?
x = 1.972, y = 1.236

x      y      x      y
1.972  1.236  0.111  0.542
1.112  1.994  0.902  0.005
0.000  1.009  0.598  0.085
0.665  1.942  1.613  1.790
0.235  0.356  1.298  1.955
0.247  1.658  0.651  1.937
1.275  1.961  1.949  1.316
0.702  0.045  0.099  0.567
1.760  0.350  0.862  0.010
1.691  0.277  0.027  0.768
1.628  1.778  0.706  1.956
1.957  1.290  1.042  1.999
pre-attentive processing
A graph is an encoding
of the data.
n x y n x y
1 1.972 1.236 13 0.111 0.542
2 1.112 1.994 14 0.902 0.005
3 0.000 1.009 15 0.598 0.085
4 0.665 1.942 16 1.613 1.790
5 0.235 0.356 17 1.298 1.955
6 0.247 1.658 18 0.651 1.937
7 1.275 1.961 19 1.949 1.316
8 0.702 0.045 20 0.099 0.567
9 1.760 0.350 21 0.862 0.010
10 1.691 0.277 22 0.027 0.768
11 1.628 1.778 23 0.706 1.956
12 1.957 1.290 24 1.042 1.999
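To make the point about encoding concrete, here is a minimal sketch that draws the 24 pairs from the table as a character-grid scatterplot and measures each point's distance from (1, 1). The center (1, 1) and the helper names `ascii_scatter` and `radii` are assumptions introduced for this illustration; the slides do not state how the data were generated.

```python
import math

# The 24 (x, y) pairs from the table above.
POINTS = [
    (1.972, 1.236), (1.112, 1.994), (0.000, 1.009), (0.665, 1.942),
    (0.235, 0.356), (0.247, 1.658), (1.275, 1.961), (0.702, 0.045),
    (1.760, 0.350), (1.691, 0.277), (1.628, 1.778), (1.957, 1.290),
    (0.111, 0.542), (0.902, 0.005), (0.598, 0.085), (1.613, 1.790),
    (1.298, 1.955), (0.651, 1.937), (1.949, 1.316), (0.099, 0.567),
    (0.862, 0.010), (0.027, 0.768), (0.706, 1.956), (1.042, 1.999),
]


def ascii_scatter(points, cols=41, rows=21, lo=0.0, hi=2.0):
    """Render points on a character grid covering [lo, hi] on both axes."""
    grid = [[" "] * cols for _ in range(rows)]
    for x, y in points:
        col = round((x - lo) / (hi - lo) * (cols - 1))
        row = round((hi - y) / (hi - lo) * (rows - 1))  # flip so y grows upward
        grid[row][col] = "o"
    return "\n".join("".join(line).rstrip() for line in grid)


def radii(points, center=(1.0, 1.0)):
    """Distance of each point from an assumed center (an assumption, see above)."""
    cx, cy = center
    return [math.hypot(x - cx, y - cy) for x, y in points]


if __name__ == "__main__":
    print(ascii_scatter(POINTS))
    r = radii(POINTS)
    print(f"min radius {min(r):.3f}, max radius {max(r):.3f}")
```

The pattern that is hard to see in the table of numbers becomes visible at a glance once the same values are encoded as positions.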
Good visualizations optimize
for the human visual system.
How does the human
visual system work?
How does the human visual
system decode a graph?
Cleveland’s three visual
operations of pattern perception:
1. Detection
2. Assembly
3. Estimation
Part II: estimation
Three levels of estimation
a. discrimination X=Y X!=Y
b. ranking X>Y X<Y
c. ratioing X / Y = ?
At the heart of quantitative
reasoning is a single question:
Compared to what?
- Tufte, Envisioning Information
the most
important
thing
The most important measurement should exploit
the highest ranked encoding possible.
• Position along a common scale
• Position on identical but nonaligned scales
• Length
• Angle or Slope
• Area
• Volume or Density or Color saturation
• Color hue
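As a rough sketch of how the ranking above might be applied when choosing between candidate encodings, the snippet below picks the highest-ranked option. The `ENCODING_RANK` list and the `best_encoding` helper are hypothetical names introduced here; they are not taken from the talk's own code.

```python
# Encodings in the ranked order listed above, most accurately decoded first.
ENCODING_RANK = [
    "position along a common scale",
    "position on identical but nonaligned scales",
    "length",
    "angle or slope",
    "area",
    "volume, density, or color saturation",
    "color hue",
]


def best_encoding(candidates):
    """Return the candidate that appears earliest in ENCODING_RANK.

    Raises ValueError if a candidate is not in the ranking.
    """
    return min(candidates, key=ENCODING_RANK.index)


if __name__ == "__main__":
    # Choosing how to show the most important measurement:
    print(best_encoding(["area", "color hue", "length"]))
```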
“The first rule of color:
do not talk about color!”
- Tamara Munzner
luminance
saturation
hue
Observation: Alphabetical is
almost never the correct ordering
of a categorical variable.
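One common alternative to alphabetical order is to sort categories by the value being shown. The sketch below compares the two orderings with text bars; the `totals` data are invented for illustration, and `ordered_by_value` and `text_bars` are hypothetical helper names.

```python
# Hypothetical category totals, invented for illustration only.
totals = {"Cherry": 12, "Apple": 30, "Banana": 7, "Date": 21}


def ordered_by_value(data):
    """Category names sorted from largest value to smallest."""
    return sorted(data, key=data.get, reverse=True)


def text_bars(data, order):
    """One text bar per category, drawn in the given order."""
    width = max(len(name) for name in order)
    return "\n".join(f"{name:<{width}} {'#' * data[name]}" for name in order)


if __name__ == "__main__":
    print(text_bars(totals, sorted(totals)))            # alphabetical order
    print()
    print(text_bars(totals, ordered_by_value(totals)))  # ordered by value
```

With value ordering, ranking the categories requires no searching: the order of the bars is the ranking.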
Observation: Stacked
anything is nearly always
a mistake.
Stacking makes the reader
decode lengths, not position
on a common scale.
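The effect of stacking on baselines can be sketched numerically. In the example below, only the first series starts at the same height in both bars; every other segment floats, so the reader has to compare lengths. The data and the helper names `stacked_segments` and `shared_baselines` are assumptions made for this illustration.

```python
from itertools import accumulate


def stacked_segments(values):
    """(baseline, top) of each segment when values are stacked in order."""
    tops = list(accumulate(values))
    baselines = [0] + tops[:-1]
    return list(zip(baselines, tops))


def shared_baselines(bar_a, bar_b):
    """For each series, True if its segment starts at the same height in both bars."""
    pairs = zip(stacked_segments(bar_a), stacked_segments(bar_b))
    return [a[0] == b[0] for a, b in pairs]


# Hypothetical data: three series stacked in two bars.
bar_a = [10, 20, 15]
bar_b = [25, 12, 15]

if __name__ == "__main__":
    print(stacked_segments(bar_a))
    print(stacked_segments(bar_b))
    print(shared_baselines(bar_a, bar_b))
```

The third series has the same value (15) in both bars, yet its segments sit at different heights, which makes that equality hard to see.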
Observation: Pie charts are
ALWAYS a mistake.
Piecharts are the information visualization
equivalent of a roofing hammer to the
frontal lobe. They have no place in the world
of grownups, and occupy the same semiotic
space as short pants, a runny nose, and
chocolate smeared on one’s face. They are
as professional as a pair of assless chaps.
http://blog.codahale.com/2006/04/29/google-analytics-the-goggles-they-do-nothing/
Tables are preferable to graphics for many small
data sets. A table is nearly always better than a
dumb pie chart; the only thing worse than a pie
chart is several of them, for then the viewer is
asked to compare quantities located in spatial
disarray both within and between pies… Given
their low data-density and failure to order
numbers along a visual dimension, pie charts
should never be used.
-Edward Tufte, The Visual Display of Quantitative Information
Who do you think did a better job in tonight’s debate?

                    Clinton  Trump
Among Democrats       99%      1%
Among Republicans     53%     47%
[Figure: category labels listing countries in alphabetical order, Afghanistan through Cameroon]
All good pie charts are jokes.
Observation: Comparison is trivial
on a common scale.
the dashboard metaphor is
fundamentally flawed
Observation: Scatterplots
show relationships directly.
Observation: Growth charts
usually aren’t.
If growth (slope) is
important, plot it directly.
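One way to read this advice: a chart of a running total shows level, not growth, so the growth rate has to be computed and plotted as its own series. The sketch below derives period-over-period growth from a series of levels. The `users` figures are invented, and `growth_rates` is a hypothetical helper that assumes every level is nonzero.

```python
def growth_rates(series):
    """Period-over-period relative growth; one value fewer than the input.

    Assumes every value in the series is nonzero.
    """
    return [(b - a) / a for a, b in zip(series, series[1:])]


# Hypothetical user counts per period, invented for illustration.
users = [100, 150, 210, 273]

if __name__ == "__main__":
    for period, rate in enumerate(growth_rates(users), start=1):
        print(f"period {period}: {rate:.0%}")
```

Here the level rises every period while the growth rate falls, which is easy to miss when only the level is plotted.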
Part III: assembly
Gestalt Psychology
reification
emergence
Prägnanz
Law of Closure
Law of Continuity
Observation: Good plots
leverage the law of continuity
to assist with assembly.
Law of Similarity
Law of Proximity
Observation: Dodged bar
charts are a bad idea.
Part IV: detection
Excel’s defaults are pretty bad
[Figure: default Excel chart; y-axis ticks from 0 to 200,000 in steps of 20,000, x-axis categories 1 through 6]
Observation: Detection isn’t
as trivial as it seems.
“Above all else, show the data.”
-Tufte
Part V: other useful results
Weber’s law: The “Just Noticeable
Difference” is proportional to the
size of the initial stimulus.
[Figure: stimulus pairs 10 vs. 20 and 100 vs. 110; a difference of 12 units]
Observation: Weber’s Law is
why gridlines are useful
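A small numeric sketch of why a nearby gridline helps under Weber's law: the same absolute difference is a much larger fraction of the stimulus when it is measured from a close reference line. The gridline position of 90 and the helper names are assumptions chosen for this illustration.

```python
def relative_difference(a, b):
    """Difference between two stimuli as a fraction of the smaller one."""
    small, large = sorted((a, b))
    if small <= 0:
        raise ValueError("stimuli must be positive")
    return (large - small) / small


def above_gridline(value, gridline):
    """Length the reader judges when measuring from a gridline instead of zero."""
    return value - gridline


if __name__ == "__main__":
    # Comparing 100 and 110 from a zero baseline.
    print(relative_difference(100, 110))
    # Same values judged from a hypothetical gridline at 90: 10 versus 20.
    print(relative_difference(above_gridline(100, 90), above_gridline(110, 90)))
```

The absolute difference is 10 in both cases, but relative to the gridline it is the whole size of the smaller stimulus rather than a tenth of it.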
“Erase non-data ink.”
-Tufte
“Erase non-data ink,
within reason.”
-Tufte
“Erase non-data ink that interferes
with detection or doesn’t assist
assembly and estimation.”
-Rauser
You are best at detecting variation
in slope near 45 degrees.
banking to 45
Observation: Banking to 45
best shows variation in slope
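Banking to 45 can be approximated by choosing the plot's aspect ratio so that the median absolute slope of the line segments is 1 on the page. The sketch below implements that median-absolute-slope heuristic, which is one of several banking methods; the slides do not say which method was used, and `bank_to_45` is a hypothetical helper name.

```python
import statistics


def bank_to_45(xs, ys):
    """Aspect ratio (height / width) giving a median on-page slope of 1.

    A data slope s appears on the page as s * (height / width) * x_range / y_range,
    so setting the median of that quantity to 1 gives the formula below.
    """
    x_range = max(xs) - min(xs)
    y_range = max(ys) - min(ys)
    slopes = [
        abs((y1 - y0) / (x1 - x0))
        for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:])
        if x1 != x0
    ]
    if not slopes or x_range == 0:
        raise ValueError("need at least two points with distinct x values")
    median_slope = statistics.median(slopes)
    if median_slope == 0:
        raise ValueError("median slope is zero; banking is undefined")
    return y_range / (x_range * median_slope)


if __name__ == "__main__":
    # A zigzag with data slopes of +1 and -1 over a 4-by-1 data range.
    print(bank_to_45([0, 1, 2, 3, 4], [0, 1, 0, 1, 0]))
```

For the zigzag example the result says the plot should be a quarter as tall as it is wide, so each segment is drawn at 45 degrees.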
Q: Should I include 0 on my scale?
A: It depends.
Relying on the pre-attentive
perception of size or intensity?
Yes, otherwise you will mislead.
Using position? It’s up to you.
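To see why a size encoding misleads without zero, the sketch below computes the ratio of two drawn bar lengths for different axis starting points. The values 90 and 100 and the axis start of 80 are invented for illustration; `drawn_ratio` is a hypothetical helper.

```python
def drawn_ratio(a, b, axis_start=0.0):
    """Ratio of drawn bar lengths for values a and b when the axis starts at axis_start."""
    if a <= axis_start or b <= axis_start:
        raise ValueError("both values must lie above the axis start")
    return (b - axis_start) / (a - axis_start)


if __name__ == "__main__":
    print(drawn_ratio(90, 100))       # axis includes zero
    print(drawn_ratio(90, 100, 80))   # axis truncated at 80
```

With the axis truncated at 80, the second bar is drawn twice as long as the first even though the value is only about 11% larger. A position encoding such as a dot plot does not invite that length comparison, which is why the choice is left open there.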
“Above all else, show the data.”
-Tufte
“Above all else, show
the variation in the data.”
-Rauser (via Tufte)
R/ggplot2 code for every plot in this
presentation is available at http://goo.gl/xH5PLV
The rendered document is at
http://rpubs.com/jrauser/hhsd_notes
This presentation is at
http://goo.gl/VKxxya
I will tweet these links as @jrauser
coda
visualization
is
communication
art
is
communication
visualization
is
art
why does it make you
feel that way?
visualization has as much to
learn from art as from science
end