Graduating with a BA from UCD in 1995, Colman emigrated to America to pursue a career that combined creativity, commerce and computers. Heading west to California, Colman worked for 11 years in Hollywood's visual effects (VFX) industry. During this time he worked mainly at The Walt Disney Co. and also as a. In 2006, Colman returned home to Ireland to undertake a . A short time after the conclusion of the course, while starting up his own , Colman was invited back to DIT as a part-time lecturer. In 2011, Colman was offered a PhD Fellowship at modeling and simulating the relationship between innovation and profit. This full-time study is under the direction of Prof. Petra Ahrweiler, Director UCD Innovation Research Unit and Professor of Technology and Innovation Management, Smurfit School of Business. In 2012, Colman designed and delivered the first iteration of a new Visualisation module as part of DIT's .
Details of Colman's research activities can be found at .
-Dublinked-
Drawing from a new module at DIT, Colman's presentation at Dublinked will be an introduction to the domain of visualisation and a demonstration of powerful yet "do-able" data visualisations. The ethos of the presentation is for people who have little or no visualisation experience but have an aptitude and appetite for using technical tools to surface meaning from data. The tools used will be R, R Studio and Inkscape.
2. Visualisation
MSc Data Analytics
http://www.dit.ie/postgrad/programmes/dt285dt286mscincomputingdataanalytics/
2012-05-24 2
3. Agenda
1) Background to Data Visualization*
2) Resources
3) Classification of Visualization
4) The Design Process
5) Demonstration
*Disclaimer (and apologies to some), I use the American spelling “visualization”
4. Take-away Points
1) Open to all
» new domain with many facets
2) Professional-level output is achievable
» practice a few programming and graphic design techniques
3) It's (only) a means to an end
» should affect behaviour
5. Background
Data Visualisation
(very briefly)
17. Texts (1 of 2)
► R in a Nutshell: A Desktop Quick Reference - Adler, Joseph
► Excel 2007 Dashboards & Reports For Dummies - Alexander, Michael
► Ways of Seeing: Based on the BBC Television Series - Berger, John S.
► Semiology of Graphics: Diagrams, Networks, Maps - Bertin, Jacques
► Statistics in a Nutshell: A Desktop Quick Reference - Boslaugh, Watters
► The Jelly Effect: How to Make Your Communication Stick - Bounds, Andy
► Gamestorming: A Playbook for Innovators, Rulebreakers, and Changemakers - Brown, Sunni
► Sketching User Experiences: Getting the Design Right and the Right Design - Buxton, Bill
► Readings in Information Visualization: Using Vision to Think - Card, Mackinlay and Shneiderman
► The Elements of Graphing Data - Cleveland, William S.
► Visualizing Data - Cleveland, William S.
► Now You See It - Davidson, Cathy N.
► slide:ology: The Art and Science of Creating Great Presentations - Duarte, Nancy
► Art: The Whole Story - Farthing, Stephen
► Information Dashboard Design: The Effective Visual Communication of Data - Few, Stephen
► Now You See It: Simple Visualization Techniques for Quantitative Analysis - Few, Stephen
► Show Me the Numbers: Designing Tables and Graphs to Enlighten - Few, Stephen
► Freelance Design in Practice - Fishel, Cathy
► Art of Plain Talk - Flesch, Rudolf
► The Art of Looking Sideways - Fletcher, Alan
► Graphic Artist's Guild Handbook of Pricing and Ethical Guidelines - Graphic Artists Guild
► Made to Stick: Why Some Ideas Survive and Others Die - Heath, Chip and Dan
► Switch: How to Change Things When Change Is Hard - Heath, Chip and Dan
► Data Analysis with Open Source Tools - Janert, Philipp K.
► We Feel Fine: An Almanac of Human Emotion - Kamvar, Sep
► Turning Numbers into Knowledge: Mastering the Art of Problem Solving - Koomey, Jon
► Elements of Graph Design - Kosslyn, Stephen M.
Andy Kirk, http://www.visualisingdata.com
18. Texts (2 of 2)
► Graph Design for the Eye and Mind - Kosslyn, Stephen M.
► Don't Make Me Think: A Common Sense Approach to Web Usability, 2nd Edition - Krug, Steve
► Universal Principles of Design, Revised and Updated - Lidwell, Holden and Butler
► Visual Complexity: Mapping Patterns of Information - Lima, Manuel
► The Power of the 2 x 2 Matrix: Using 2 x 2 Thinking to Solve Business Problems and Make Better Decisions - Lowy, Alex
► How Maps Work: Representation, Visualization, and Design - MacEachren, Alan M.
► The Laws of Simplicity (Simplicity: Design, Technology, Business, Life) - Maeda, John
► Visual Language for Designers: Principles for Creating Graphics that People Understand - Malamed, Connie
► Understanding Comics: The Invisible Art - McCloud, Scott
► The Chicago Guide to Writing about Numbers (Chicago Guides to Writing, Editing, and Publishing) - Miller, Jane E.
► How to make an IMPACT - Moon, Jon
► Designing Visual Interfaces: Communication Oriented Techniques - Mullet, Kevin
► The Designful Company: How to build a culture of nonstop innovation - Neumeier, Marty
► Emotional Design: Why We Love (or Hate) Everyday Things - Norman, Donald A.
► The Design of Everyday Things - Norman, Donald A.
► Playfair's Commercial and Political Atlas and Statistical Breviary - Playfair, William
► Presentation Zen Design: Simple Design Principles and Techniques to Enhance Your Presentations - Reynolds, Garr
► Presentation Zen: Simple Ideas on Presentation Design and Delivery - Reynolds, Garr
► The Back of the Napkin (Expanded Edition): Solving Problems and Selling Ideas with Pictures - Roam, Dan
► Unfolding the Napkin: The Hands-On Method for Solving Complex Problems with Simple Pictures - Roam, Dan
► Creating More Effective Graphs - Robbins, Naomi B.
► The Craft of Information Visualization: Readings and Reflections - Shneiderman, Ben
► The Visual Display of Quantitative Information - Tufte, Edward R.
► Envisioning Information - Tufte, Edward R.
► Beautiful Evidence - Tufte, Edward R.
► Graphic Discovery: A Trout in the Milk and Other Visual Adventures - Wainer, Howard
► Visual Thinking: for Design - Ware, Colin
► The Grammar of Graphics - Wilkinson, Leland
► Non-Designer's Design Book (3rd Edition) - Williams, Robin
► Glut: Mastering Information Through the Ages - Wright, Alex
Andy Kirk, http://www.visualisingdata.com
22. Google's Charting and Visualisation Tools
► Google Docs https://docs.google.com/?pli=1#home
► Google Fusion Tables http://www.google.com/fusiontables/Home?pli=1
► Google Chart API http://code.google.com/apis/chart/
► Google Visualization API http://code.google.com/apis/visualization/documentation/gallery.html
► Google Motion Chart & Public Data Explorer http://www.google.com/publicdata/home
► Google Insights for Search http://www.google.com/insights/search/#
► Google Zeitgeist http://www.google.com/intl/en/press/zeitgeist2010/
► Google Ngram Viewer http://ngrams.googlelabs.com/
► Google Analytics http://www.google.com/intl/en_uk/analytics/
► Google.org Philanthropy http://www.google.org/#one
► Google Wonder Wheel http://www.google.com/landing/searchtips/engineers.html
► GraphViz http://code.google.com/apis/chart/docs/gallery/graphviz.html
► Choosel http://code.google.com/p/choosel/
► Data Appeal http://dataappeal.com/
Andy Kirk, http://www.visualisingdata.com
25. Combination of Many Disciplines
Given the complexity of data, insights from diverse fields are required to provide meaningful solutions:
► Statistics
► Graphic Design
► Data Mining
► Computer Science
► Data/Info Visualisation
(Ben Fry – “Visualizing Data”)
2011/12 25
26. Pick an area of interest/define your requirements, then drill down...
28. “Designing Data Visualizations”
Designing Data Visualizations
Intentional Communication from Data to Display
Noah Iliinsky and Julie Steele
Publisher: O'Reilly Media (September 29, 2011)
ISBN-10: 1449312284
29. “Visualize This”
Visualize This
The Flowing Data Guide to Design, Visualization and Statistics
Nathan Yau
Publisher: Wiley (July 20, 2011)
ISBN-10: 0470944889
44. Classifications of Visualizations
1 Complexity
2 Infographics vs. Data Viz
3 Exploration vs. Explanation
4 Informative vs. Persuasive vs. Visual Art
45. (Data Visualisations) (Infographics)
Figure 1-2. The difference between infographics and data visualization may be loosely determined by the method of generation, the quantity of data represented, and the degree of aesthetic treatment applied.
46. Infographics
Infographics is a useful term for referring to a visual representation of data that is:
» manually drawn (and therefore a custom treatment of the information)
» specific to the data at hand (and therefore non-trivial to recreate with different data)
» aesthetically rich (strong visual content meant to draw the eye and hold interest)
» relatively data-poor (because each piece of information must be manually encoded)
51. Data Visualization
The terms data visualization and information visualization refer to any visual
representation of data that is:
» algorithmically drawn (may have custom touches but is largely rendered with
the help of computerized methods);
» easy to regenerate with different data (the same form may be re-purposed to
represent different datasets with similar dimensions or characteristics);
» often aesthetically barren (data is not decorated); and
» relatively data-rich (large volumes of data are welcome and viable, in contrast
to infographics)
55. Exploration vs Explanation
Exploratory visualization:
► The dataset
► The mind of the designer
Explanatory visualization:
► The mind of the designer
► The mind of the reader
[Figure: a block of raw numbers with callouts (1), (2) and (3)]
56. "Holy Trinity"
Designer-Reader-Data
► Informative: the reader-data relationship is dominant
► Persuasive: the designer-reader relationship is dominant
► Visual Art: the designer-data relationship is dominant
Figure 1-4. The nature of the visualization depends on which relationship (between two of the three components) is dominant.
61. Visual Art
Nora Ligorano and Marshall Reese designed a project that converts Twitter streams into a woven fiber-optic tapestry (http://ligoranoreese.net/fiber-optic-tapestry)
63. Agenda
1) Background to Data Visualization*
2) Resources
3) Classification of Visualisation
4) The Design Process
5) Demonstration
66. Reconcile through single process...
► Must reconcile the various elements
through a single process
► The process begins with:
» a set of numbers
» a question
67. Visualization Goals - Technical
1) Highlight data features in order of
their importance
2) Reveal patterns
3) Simultaneously show features across
multiple dimensions
» e.g. time, quantity & geography
68. Visualization Goals - People
► The goal of your visualization will be informed by:
» Your own goals and motivations
» The needs of your reader
• need for specific information
• to change the reader’s opinions or behaviour
69. Data Visualization Process
-7 Stages-
acquire → parse → filter → mine → represent → refine → interact
► Iteration & combination
» demonstrates how later decisions can affect earlier stages
70. Data Process – 7 Stages
1) Acquire: Obtain the data (file, disk, over network)
2) Parse: Provide some structure for the data's meaning, and order it into categories
3) Filter: Remove all but the data of interest
4) Mine: Apply methods from statistics or data mining as a way to discern patterns or place the data in mathematical context
5) Represent: Choose a basic visual model, such as a bar graph, list or tree
6) Refine: Improve the basic representation to make it clearer and more visually engaging
7) Interact: Add methods for manipulating the data or controlling what features are visible
(may not need every step in every project)
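The first five stages can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the inline sample data, the `threshold` parameter and the function names are made up for the example, and "represent" is reduced to a text bar chart. Refine and interact are left out.

```python
import csv
import io
import statistics

# Hypothetical sample data, inlined so the sketch runs without a file or network.
RAW = """county,rate
Autauga,8.1
Baldwin,8.2
Barbour,11.6
Bullock,15.0
"""

def acquire():
    # 1) Acquire: obtain the data (here from a string instead of a file or URL).
    return io.StringIO(RAW)

def parse(handle):
    # 2) Parse: give the raw text structure (a name and a numeric rate).
    return [(row["county"], float(row["rate"])) for row in csv.DictReader(handle)]

def filter_rows(rows, threshold):
    # 3) Filter: keep only the data of interest.
    return [(name, rate) for name, rate in rows if rate >= threshold]

def mine(rows):
    # 4) Mine: place the data in a mathematical context (here, the mean).
    return statistics.mean(rate for _, rate in rows)

def represent(rows):
    # 5) Represent: choose a basic visual model (a text bar chart).
    return [f"{name:<8} {'#' * round(rate)}" for name, rate in rows]

rows = parse(acquire())
high = filter_rows(rows, threshold=10.0)
average = mine(rows)
chart = represent(high)
print("\n".join(chart))
```

Because each stage is a separate function, a later decision (say, a different chart) can send you back to change an earlier stage (say, the filter) without rewriting the rest.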
71. Represent
► Rule #1 - function then form
► The visual design elements should enhance and enable the function
► The key to a successful visualization is making good design choices
» elegance, simplicity, efficiency
73. Agenda
1) Background to Data Visualization*
2) Resources
3) Classification of Visualisation
4) The Design Process
5) Demonstration
74. Demonstration
(walk-through followed by demo)
75. “Visualize This”
Visualize This
The Flowing Data Guide to Design, Visualization and Statistics
Nathan Yau
Publisher: Wiley (July 20, 2011)
ISBN-10: 0470944889
76. R Project
http://www.r-project.org/
78. The R Script
► A file in the R format
► Allows you to save your scripting work
► File (or Ctrl+Shift+N)
» New
• R Script
► Hit “Run” (or Ctrl + Enter) after each
command
81. Installing packages
Package installation in R Studio
► Option 1 (R or R Studio)
» Type the following commands into the console or an R script:
» install.packages("packagename")
» library(packagename)
► Option 2 (R Studio)
» Use the GUI as shown on the right, then activate the package
85. Process
(roughly)
cmd (run colorize_svg.py, or double-click to run) → colorize_svg.py (uses BS & Python) → Beautiful Soup (data crunched) → counties.svg (writes to a new file)
86. Chapter 8: Visualizing Spatial Relationships
► What to Look For
► Specific Locations
» Just Points
• Map with Dots
• Map with Lines
» Scaled Points
• Map with Bubbles
► Regions
» Color by Data
• Map Counties
• Map Countries
88. Map with Dots
(New R script file)
► R, although limited in mapping functionality, makes placing dots on a map easy
► The maps package does most of the work
» install via Package Installer or console
► Next step: load the data. Use the Costco locations that you just geocoded, or load them directly from the URL
library(maps)
costcos <- read.csv("http://book.flowingdata.com/ch08/geocode/costcos-geocoded.csv", sep=",")
90. Mapping – first layer
► When you create your maps, it’s useful to think of them as layers (regardless of the
software in use).
► The bottom layer is usually the base map that shows geographical boundaries, and then
you place data layers on top of that.
► In this case the bottom layer is a map of the United States, and the second layer is
Costco locations
map(database="state")
Figure 8-2: Plain map of the United States
91. Mapping – second layer
► The second layer, the Costco locations, is then mapped with the symbols() function.
symbols(costcos$Longitude, costcos$Latitude,
circles=rep(1, length(costcos$Longitude)), inches=0.05, add=TRUE)
Figure 8-3: Map of Costco locations
92. Change colours
► Change the colors of both the map and the circles so that the locations stand out and
boundary lines sit in the background
map(database="state", col="#cccccc")
symbols(costcos$Longitude, costcos$Latitude, bg="#e2373f", fg="#ffffff",
lwd=0.5, circles=rep(1, length(costcos$Longitude)),
inches=0.05, add=TRUE)
Figure 8-4: Using color with mapped locations
93. Result?
► Not bad for a few lines of code. Costco has clearly focused on opening locations on the
coasts with clusters in southern and northern California, northwest Washington, and in
the northeast of the country.
Figure 8-4: Using color with mapped locations
95. Alaska & Hawaii
► Alaska and Hawaii are in the “world” database, so you need to map the entire world
map(database="world", col="#cccccc")
symbols(costcos$Longitude, costcos$Latitude, bg="#e2373f", fg="#ffffff",
lwd=0.3, circles=rep(1, length(costcos$Longitude)),
inches=0.03, add=TRUE)
Figure 8-5: World map of Costco locations
96. State specific
► Say you want to only map Costco locations for a few states. You can do that with the region argument.
map(database="state", region=c("California", "Nevada", "Oregon", "Washington"), col="#cccccc")
symbols(costcos$Longitude, costcos$Latitude, bg="#e2373f", fg="#ffffff",
lwd=0.5, circles=rep(1, length(costcos$Longitude)), inches=0.05, add=TRUE)
► Some dots are not in any of those states
» easy to remove in Inkscape
Figure 8-6: Costco locations in selected states
99. Map with Lines
(New R script file)
► Draw the lines by simply plugging the two columns into lines(). Also specify color (col) and line width (lwd).
lines(faketrace$longitude, faketrace$latitude, col="#bb4cd4", lwd=2)
► Now also add dots, exactly like you just did with the Costco locations
symbols(faketrace$longitude, faketrace$latitude, lwd=1,
bg="#bb4cd4", fg="#ffffff", circles=rep(1,
length(faketrace$longitude)), inches=0.05, add=TRUE)
Figure 8-7: Drawing a location trace
101. Drawing Connections
► It could be interesting to draw lines from one location to all the others
map(database="world", col="#cccccc")
# Connect location 8 to locations 1..(n-1); run the whole loop as a block
for (i in 1:(length(faketrace$longitude) - 1)) {
  lngs <- c(faketrace$longitude[8], faketrace$longitude[i])
  lats <- c(faketrace$latitude[8], faketrace$latitude[i])
  lines(lngs, lats, col="#bb4cd4", lwd=2)
}
► This isn’t very informative, but maybe you can find a good use for it
► The point here is that you can draw a map and then use R’s other graphics functions to draw whatever you want using latitude and longitude coordinates.
Figure 8-8: Drawing worldwide connections
103. Figure 8-10: Rates more clearly explained for a wider audience
104. Scaled Points
► Usually, you don’t just have a location
» you also have other values, e.g.
• sales volume
• city population
► Use the principle of the bubble plot and apply it to a map
105. (New R script file)
► The code is almost the same as when you mapped Costco locations, but remember you just passed a vector of ones for circle size in the symbols() function. Instead, we use the sqrt() of the rates to indicate size.
fertility <- read.csv("http://book.flowingdata.com/ch08/points/adol-fertility.csv")
map('world', fill=FALSE, col="#cccccc")
symbols(fertility$longitude, fertility$latitude,
circles=sqrt(fertility$ad_fert_rate), add=TRUE,
inches=0.15, bg="#93ceef", fg="#ffffff")
Figure 8-9: Adolescent fertility rate worldwide
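Why the square root? A circle's area grows with the square of its radius, so using the raw rate as the radius would exaggerate large values. A small Python check of the arithmetic follows; the two rates are made-up numbers chosen only so the ratios are easy to read.

```python
import math

# Hypothetical rates; the second is four times the first.
rates = [25.0, 100.0]

# Radius proportional to sqrt(rate), mirroring circles=sqrt(...) in symbols().
radii = [math.sqrt(r) for r in rates]
areas = [math.pi * radius ** 2 for radius in radii]

# With sqrt scaling, the ratio of the areas matches the ratio of the data.
area_ratio = areas[1] / areas[0]

# Using the raw rate as the radius would inflate the area ratio instead.
naive_areas = [math.pi * r ** 2 for r in rates]
naive_ratio = naive_areas[1] / naive_areas[0]

print(area_ratio, naive_ratio)
```

Readers tend to compare bubbles by area, which is the reason for sizing by the square root rather than by the value itself.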
108. Regions
► Mapping points can take you only so far because they represent only single locations.
► Large scale data is usually aggregated over whole counties, states, countries, and continents
► Use Python and SVG to generate map
» Python - to process the data
» SVG - for the map
http://www.nevron.com/Gallery.DiagramFor.NET.Maps.ChoroplethMaps.aspx
109. Color By Data
► Choropleth maps are the most common way to map regional data
► Based on some metric, regions are colored following a color scale that you define
Figure 8-11: Choropleth map framework
110. Using colours
► When you have your color scheme, you have two more things to do:
» Scale - decide how the colors you picked match up to the data range
» Location - assign colors to each region based on your choice
http://gismapcatalog.blogspot.com/2010/07/standardized-choropleth-map.html
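The two steps, scale and location, might look like this in Python. It is a sketch only: the four-shade scheme, the break points and the region names are assumptions chosen for illustration, and `bisect` is just one convenient way to look up a class.

```python
from bisect import bisect_right

# Assumed color scheme (light to dark) and assumed break points.
colors = ["#f2f0f7", "#cbc9e2", "#9e9ac8", "#6a51a3"]
breaks = [5.0, 10.0, 15.0]  # Scale: 3 break points split the range into 4 classes

def color_for(value):
    # bisect_right counts how many break points are <= value,
    # which is the index of the color class.
    return colors[bisect_right(breaks, value)]

# Location: assign a color to each (hypothetical) region.
regions = {"A": 3.2, "B": 10.0, "C": 18.4}
region_colors = {name: color_for(value) for name, value in regions.items()}
print(region_colors)
```

Keeping the break points in one list means the scale can be changed without touching the code that colors the regions.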
116. Get data
► The U.S. Bureau of Labor Statistics provides county-level unemployment data every month
► Download the data at http://book.flowingdata.com/ch08/regions/unemployment-aug2010.txt
► There are six columns:
1) a code specific to the Bureau of Labor Statistics
2) and 3) the state and county FIPS codes, which together form a unique id specifying the county
4) the county name
5) the month the rate is an estimate of
6) the estimated percentage of people in the county who are unemployed
► For the purposes of this example, you are only interested in the county id (FIPS) and the rate
117. US Unemployment figures (BLS)
LAUS_CODE,STATE_FIPS,COUNTY_FIPS,COUNTY,MONTH,RATE
CN010010,01,001,"Autauga County, AL",Aug10(p),8.1
PA011000,01,003,"Baldwin County, AL",Aug10(p),8.2
CN010050,01,005,"Barbour County, AL",Aug10(p),11.6
CN010070,01,007,"Bibb County, AL",Aug10(p),10.1
CN010090,01,009,"Blount County, AL",Aug10(p),8.3
CN010110,01,011,"Bullock County, AL",Aug10(p),15.0
CN010130,01,013,"Butler County, AL",Aug10(p),12.2
PA010250,01,015,"Calhoun County, AL",Aug10(p),9.1
CN010170,01,017,"Chambers County, AL",Aug10(p),13.6
CN010190,01,019,"Cherokee County, AL",Aug10(p),8.8
CN010210,01,021,"Chilton County, AL",Aug10(p),9.4
CN010230,01,023,"Choctaw County, AL",Aug10(p),11.1
CN010250,01,025,"Clarke County, AL",Aug10(p),15.8
CN010270,01,027,"Clay County, AL",Aug10(p),13.3
CN010290,01,029,"Cleburne County, AL",Aug10(p),8.4
CN010310,01,031,"Coffee County, AL",Aug10(p),7.3
PA010900,01,033,"Colbert County, AL",Aug10(p),9.2
CN010350,01,035,"Conecuh County, AL",Aug10(p),15.4
CN010370,01,037,"Coosa County, AL",Aug10(p),12.2
118. Get map
► Blank map from Wikimedia Commons: http://commons.wikimedia.org/wiki/File:USA_Counties_with_FIPS_and_names.svg
► Download the SVG file and save it as counties.svg, in the same directory where you saved the unemployment data
119. Download the SVG file
http://commons.wikimedia.org/wiki/File:USA_Counties_with_FIPS_and_names.svg
120. SVG map file
► SVG (scalable vector graphics) is an XML file
► It’s text with tags, and you can edit it in a text editor like you would an HTML file
► The browser or image viewer reads the XML, and the XML tells the browser what to show,
such as the colors to use and shapes to draw.
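To see that an SVG really is just XML, you can parse one with Python's standard library. The snippet below is a sketch: the two-path SVG is a hand-written stand-in for counties.svg, and it uses `xml.etree.ElementTree` rather than the Beautiful Soup approach the book follows.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written SVG standing in for counties.svg (assumption: the real
# file is far larger, but is built from the same kind of <path> elements).
SVG_TEXT = """<svg xmlns="http://www.w3.org/2000/svg" width="20" height="10">
  <path id="01001" style="fill:#d0d0d0" d="M0 0 L10 0 L10 10 Z"/>
  <path id="01003" style="fill:#d0d0d0" d="M10 0 L20 0 L20 10 Z"/>
</svg>"""

SVG_NS = "{http://www.w3.org/2000/svg}"

root = ET.fromstring(SVG_TEXT)
# Every shape is an element with attributes, much like a tag in HTML.
paths = root.findall(f"{SVG_NS}path")
path_ids = [p.get("id") for p in paths]
print(root.tag, path_ids)
```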
122. SVG - colour of each county
► Change the fill color of each county to match the corresponding unemployment rate
<path
style="font-size:12px;fill:#d0d0d0;fill-rule:nonzero;stroke:#000000;stroke-opacity:1;stroke-width:0.1;stroke-miterlimit:4;stroke-dasharray:none;stroke-linecap:butt;marker-start:none;stroke-linejoin:bevel"
► There are more than 3,000 counties, so use Beautiful Soup, which makes parsing XML and HTML easy
123. Load the elements
(create a small script/program)
colorize_svg.py
► Open a blank file in the same directory
as your SVG map and unemployment
data
► Save it as colorize_svg.py
► Follow instructions from book to
construct the script
124. Connect data & map
► The challenge is to somehow link the unemployment data to the county map
► The linkage = the FIPS codes (Federal Information Processing Standard)
Unemployment rates ↔ FIPS codes ↔ Blank map
126. Connect data & SVG (map)
► Each path in the SVG file has a unique id
» combined FIPS state and county FIPS code:
id="01001"
inkscape:label="Autauga, AL"
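The link between data and map can be sketched as a lookup by path id. This is not the book's script: it uses the standard library (`xml.etree.ElementTree` and `re`) instead of Beautiful Soup, the two-county SVG is a simplified stand-in, and the two-color rule in `color_for` is a hypothetical placeholder for a real scale.

```python
import re
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)  # keep plain <svg>/<path> tags on output

# Stand-in for counties.svg with two counties (hypothetical, simplified).
SVG_TEXT = f"""<svg xmlns="{SVG_NS}">
  <path id="01001" style="font-size:12px;fill:#d0d0d0;stroke:#000000" d="M0 0 Z"/>
  <path id="01005" style="font-size:12px;fill:#d0d0d0;stroke:#000000" d="M1 1 Z"/>
</svg>"""

# Rates keyed by state FIPS + county FIPS, as in the BLS rows shown earlier.
unemployment = {"01" + "001": 8.1, "01" + "005": 11.6}

def color_for(rate):
    # Assumed two-color rule, just to make the link visible.
    return "#6a51a3" if rate > 10.0 else "#f2f0f7"

root = ET.fromstring(SVG_TEXT)
for path in root.iter(f"{{{SVG_NS}}}path"):
    rate = unemployment.get(path.get("id"))
    if rate is None:
        continue  # no data for this shape: leave it untouched
    # Swap only the fill declaration inside the style attribute.
    new_style = re.sub(r"fill:#[0-9a-fA-F]{6}", "fill:" + color_for(rate), path.get("style"))
    path.set("style", new_style)

colored_svg = ET.tostring(root, encoding="unicode")
print(colored_svg)
```

The pattern `fill:#...` deliberately leaves `fill-rule` and the stroke settings alone, so only the county's fill color changes.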
127. Run the Python script
$ python colorize_svg.py > colored_map.svg
128. Possible code problem...
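(In book... vs. Finished script...)
import csv

# Assumption: the data file from slide 116 is saved next to this script under this name
reader = csv.reader(open("unemployment-aug2010.txt", "r"), delimiter=",")

unemployment = {}
rates_only = []  # To calculate quartiles
min_value = 100; max_value = 0; past_header = False
for row in reader:
    if not past_header:
        past_header = True
        continue
    try:
        full_fips = row[1] + row[2]
        rate = float(row[5].strip())
        unemployment[full_fips] = rate
        rates_only.append(rate)
    except (IndexError, ValueError):
        pass  # skip malformed rows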
http://book.flowingdata.com/ch08/regions/colorize_svg.py.txt
129. Figure 8-18: Choropleth map showing unemployment rates
► Open your new choropleth map in a modern browser such as Firefox, Safari, or Chrome
or in Inkscape to see the fruits of your labor
131. Define thresholds by quartiles
► Another common way to define thresholds is by quartiles
» This means that a quarter of the counties have rates below 6.9 percent, another quarter between 6.9 and 8.7, another between 8.7 and 10.8, and the last quarter greater than 10.8 percent
# Quartile scale
if rate > 10.8:
    color_class = 3
elif rate > 8.7:
    color_class = 2
elif rate > 6.9:
    color_class = 1
else:
    color_class = 0
132. Define thresholds by quartiles
► Use four colors, each representing a quarter of the regions
» one shade per quarter
colors = ["#f2f0f7", "#cbc9e2", "#9e9ac8", "#6a51a3"]
133. Quartiles for re-use
► Instead of hard-coding the values 6.9, 8.7, and 10.8 in your code, you can replace those
values with q1, q2, and q3, respectively.
» The advantage of calculating the values programmatically is that you can reuse the
code with a different dataset just by changing the CSV file
# Quartiles
rates_only.sort()
q1_index = int( 0.25 * len(rates_only) )
q1 = rates_only[q1_index]
q2_index = int( 0.5 * len(rates_only) )
q2 = rates_only[q2_index]
q3_index = int( 0.75 * len(rates_only) )
q3 = rates_only[q3_index]
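One way to package this for reuse is to wrap the quartile calculation and the class lookup in functions. The sketch below keeps the same index-based method; the function names are assumptions, and the sample list is the first eight rates from the BLS rows shown earlier.

```python
def quartile_thresholds(values):
    # Same index-based method as the snippet above, wrapped for reuse.
    ordered = sorted(values)
    return tuple(ordered[int(fraction * len(ordered))] for fraction in (0.25, 0.5, 0.75))

def color_class(rate, q1, q2, q3):
    # Mirrors the if/elif chain, but with computed thresholds.
    if rate > q3:
        return 3
    elif rate > q2:
        return 2
    elif rate > q1:
        return 1
    return 0

# Swapping in another list of values needs no code changes.
rates = [8.1, 8.2, 11.6, 10.1, 8.3, 15.0, 12.2, 9.1]
q1, q2, q3 = quartile_thresholds(rates)
classes = [color_class(r, q1, q2, q3) for r in rates]
print(q1, q2, q3, classes)
```

Note that with this simple index method a value equal to a threshold falls into the lower class, so the four groups are only roughly equal in size.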
134. Modify the script
(or create a new one)
colorize_svg.py
► Follow instructions in book to construct
the next script example
► Minor alterations
136. Customise and reuse
► You can edit the SVG file in Inkscape,
change border colors and sizes, and add
annotation to make it a complete
graphic for a larger audience (hint: It
still needs a legend) and that fits with
the theme of your project.
► The code is reusable - you can apply it
to other datasets that use the FIPS
code.
138. Summary
1) Background to Data Visualization
2) Resources
3) Classification of Visualization
4) The Design Process
5) Demonstration
139. Take-away Points
1) Open to all
» new domain with many facets
2) Professional-level output is achievable
» practice a few programming and graphic design techniques
3) It's (only) a means to an end
» should affect behaviour