This document discusses different types of visualizations that can be used for information exploration interfaces. It describes tag clouds, bar charts, histograms, pie charts and scatter plots for visualizing facets. It also covers lists, grids and icons for representing items and matrix charts for showing aggregate views and correlations between features. The roles of proportional, logarithmic and class-based scaling techniques are explained.
COMO CAMPUS
Information visualization
for exploratory interfaces
Luigi Spagnolo
luigi.spagnolo@polimi.it
Information and Communication Quality
Why visualization? | 1
!! Highlighting/analysing/learning relationships in
data
!! Telling stories with data
!! Informative and persuasive
!! Fostering learning
!! Goal: provide a simplified and emotional view of a
domain
!! Representation accuracy is secondary w.r.t.
communicative impact
Why visualization? | 2
!! Visual data analysis
!! Mainly informative
!! Goal: provide a tool for visual data mining
!! High precision is essential
Why visualization? | 2
!! More in general…
!! Understanding the domain and its vocabulary
!! The features of information items in it
!! How relevant these features are (within the specific context of exploration → query)
!! How these features are correlated
Informative vs Persuasive vs Visual art
!! Informative
!! The reader/viewer looks at the
data to acquire knowledge
!! Persuasive
!! The designer uses data
essentially to convey an intended
message to the reader
!! Visual art
!! The designer “plays” with data
just for the sake of art and
aesthetics
http://www.thepaltrysapien.com/2011/06/art-in-the-age-of-interface/visual-art-derived-from-altanta-airway-traffic-coded-by-type/
Static vs dynamic | 1
!! Infographics: static and carefully designed to
convey the intended message(s) and “tell a story
with data”
!! Often manually drawn (e.g. with software like Adobe
Illustrator)
!! Aesthetics is fundamental for emotional impact.
!! Tailored to the specific data (and therefore nontrivial
to recreate with different data).
!! Relatively data-poor (because each piece of
information must be manually encoded).
Static vs dynamic | 2
!! Exploration: the message(s) emerge from interaction
!! Also in a way that is not preplanned by designers → users discover “which story the data tell them”
!! Designed to be adaptive to different datasets (or updates
of the same dataset) and different user queries
!! Simpler visualization: must be rendered at runtime
!! Complexity is in the interaction and in the amount of data
shown
!! High responsiveness is fundamental to support effective
interaction (rich internet applications required)
A brief recall | 1
!! Facet
!! A property describing item features
!! Facet vocabulary
!! Possible values for the property
!! Facet widget
!! Feedback: monitoring the distribution of items w.r.t. the terms of a given property
!! Selection: adding (or replacing) terms (or disjunctions of multiple options, or negated concepts) combined in AND
A brief recall | 2
!! Canvas
!! Visualization (and analysis) of query results
!! Possibly according to one or more dimensions (facets) at different levels of granularity
Contents of the lecture
!! Visualizing facets
!! For facet widgets
!! Univariate (monodimensional) visualizations
!! Visualizing the set of results
!! Lists, indexes and alike
!! Multivariate (multidimensional) visualizations → show correlations between features
!! Network visualizations → show relationships between items
Let’s see an example…
!! PoliCultura Portal
!! http://hoc.elet.polimi.it/PoliculturaPortal/
!! A prototype interface for exploring 1001stories narratives created by students within the PoliCultura contest for Italian schools
!! Several facets: school level, discipline, words extracted from abstracts, etc.
!! See also: http://www.museumsandtheweb.com/mw2012/papers/policultura_portal_17000_students_tell_their_s
Visualizing facets
The elements of visualization | 1
!! A range of possible values for a property → facet vocabulary
!! Strings, but also numbers, dates, complex concepts, etc.
!! Possibly arranged into a hierarchy
competition : 'Winner' ⊂ competition : 'Finalist' ⊂ competition : 'Competition'
!! Possibly sorted according to a natural order
schoolLevel : 'primary' < schoolLevel : 'junior' < schoolLevel : 'senior'
The elements of visualization | 2
!! A measure of relevance for each term $t \in T_F$ with respect to the context of the current query $q \in Q$
$$\mu : T_F \times Q \to \mathbb{R}$$
!! Possible measures
Count: $\mu(t,q) = \mathrm{ext}(t \text{ and } q)$
Precision: $\mu(t,q) = \dfrac{\mathrm{ext}(t \text{ and } q)}{\mathrm{ext}(q)}$
Recall: $\mu(t,q) = \dfrac{\mathrm{ext}(t \text{ and } q)}{\mathrm{ext}(t)}$
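The three measures can be sketched in a few lines of Python. This is only an illustration: it assumes that ext(·) is the set of items matching an expression, that each measure uses the number of items in that set, and that terms and queries are plain predicates. The function name and the sample stories are hypothetical.

```python
def relevance_measures(items, term, query):
    """Return (count, precision, recall) of a facet term for a query.

    `term` and `query` are predicates (item -> bool); this representation
    is an assumption made for the sake of the example.
    """
    ext_q = [it for it in items if query(it)]    # ext(q)
    ext_t = [it for it in items if term(it)]     # ext(t)
    ext_tq = [it for it in ext_q if term(it)]    # ext(t and q)
    count = len(ext_tq)
    precision = count / len(ext_q) if ext_q else 0.0
    recall = count / len(ext_t) if ext_t else 0.0
    return count, precision, recall


# Hypothetical sample data: four stories with two facets each.
stories = [
    {"schoolLevel": "primary", "discipline": "history"},
    {"schoolLevel": "primary", "discipline": "science"},
    {"schoolLevel": "junior", "discipline": "history"},
    {"schoolLevel": "senior", "discipline": "history"},
]

count, precision, recall = relevance_measures(
    stories,
    term=lambda it: it["schoolLevel"] == "primary",
    query=lambda it: it["discipline"] == "history",
)
print(count, precision, recall)
```

Here one of the three history stories is from a primary school (precision 1/3), and one of the two primary-school stories is about history (recall 1/2).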
The elements of visualization | 3
!! A visualization strategy
!! Mapping the measure of the term relevance to a graphical aspect
!! E.g. length, width, area, color, angle, font size, etc.
!! E.g. for a bar chart:
$$\sigma(t,q) = \frac{\mu(t,q)}{\mu_{max,q}} \cdot unit$$
where $unit$ is the unit of length (e.g. in pixels) and $\mu_{max,q}$ is the max relevance value for query $q$
Tag cloud
!! The relevance of a term is represented by the font size
!! Proportional scaling
!! The size of a term is directly proportional to its relevance measure
!! Logarithmic scaling
!! The size of a tag is proportional to the logarithm of its relevance
!! Class-based sizing
!! The size of a term can assume only some values
!! Each possible size corresponds to a range of relevance measure
values
!! Each tag takes the size corresponding to the range in which its
relevance measure falls
Calculating the size of each tag
!! Proportional scaling
!! The size of a tag is directly proportional to its magnitude
!! Logarithmic scaling
!! The size of a tag is proportional to the logarithm of its magnitude
!! Class-based sizing
!! The size of a tag can assume only some values
!! Each possible size corresponds to a range of tag magnitudes
!! Each tag takes the size corresponding to the range in which its
magnitude falls
Proportional scaling
!! The minimum and maximum tag size desired (given as parameters by the designer): $\sigma_{min}$, $\sigma_{max}$
!! Finding the minimum (non-zero) and maximum term relevance:
$$\mu_{max,q} = \max_{t_i \in T} \mu(t_i,q) \qquad \mu_{min,q} = \min_{t_i \in T} \mu(t_i,q) \qquad \text{with } \mu(t_i,q) > 0$$
!! The formula, for $\mu(t_i,q) > 0$:
$$\sigma(t_i,q) = \sigma_{min} + \left(\mu(t_i,q) - \mu_{min,q}\right) \cdot \frac{\Delta\sigma}{\Delta\mu_q}$$
where
$$\Delta\sigma = \sigma_{max} - \sigma_{min} \qquad \Delta\mu_q = \mu_{max,q} - \mu_{min,q}$$
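A minimal Python sketch of proportional scaling, under stated assumptions: relevance values are given as a dictionary of term counts, terms with zero relevance are simply left out of the cloud, and the degenerate case where all terms have the same relevance falls back to the minimum size. The function name, the fallback and the sample counts are illustrative choices, not part of the slides.

```python
def proportional_sizes(relevance, size_min, size_max):
    """Map each term's relevance to a size by linear interpolation."""
    # Only terms with non-zero relevance take part in the scaling.
    positive = {t: m for t, m in relevance.items() if m > 0}
    if not positive:
        return {}
    mu_min = min(positive.values())
    mu_max = max(positive.values())
    d_size = size_max - size_min
    d_mu = mu_max - mu_min
    if d_mu == 0:
        # Assumption: with a single relevance value, use the minimum size.
        return {t: float(size_min) for t in positive}
    return {
        t: size_min + (m - mu_min) * d_size / d_mu
        for t, m in positive.items()
    }


# Hypothetical term counts for a facet.
counts = {"history": 100, "science": 55, "art": 10, "music": 0}
sizes = proportional_sizes(counts, size_min=10, size_max=40)
print(sizes)
```

With these counts, "art" gets the minimum size (10), "history" the maximum (40), and "science", which lies halfway between the two counts, gets 25.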
Logarithmic scaling
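!! Exactly the same formula, but with the logarithm (usually with base 10) of the term relevance:
$$\hat{\mu}(t_i,q) = \log\left(\mu(t_i,q)\right)$$
$$\hat{\mu}_{max,q} = \max_{t_i \in T} \hat{\mu}(t_i,q) \qquad \hat{\mu}_{min,q} = \min_{t_i \in T} \hat{\mu}(t_i,q) \qquad \text{with } \mu(t_i,q) > 0$$
$$\sigma(t_i) = \sigma_{min} + \left(\hat{\mu}(t_i,q) - \hat{\mu}_{min,q}\right) \cdot \frac{\Delta\sigma}{\Delta\hat{\mu}}$$
where
$$\Delta\sigma = \sigma_{max} - \sigma_{min} \qquad \Delta\hat{\mu} = \hat{\mu}_{max,q} - \hat{\mu}_{min,q}$$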
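The logarithmic variant can be sketched in the same way. This is an illustration only: the function name and the sample counts are hypothetical, base 10 is used as the slide suggests, and the fallback for identical relevance values is an assumption.

```python
import math


def logarithmic_sizes(relevance, size_min, size_max):
    """Same interpolation as proportional scaling, applied to log10(relevance)."""
    # The logarithm is only defined for terms with non-zero relevance.
    logs = {t: math.log10(m) for t, m in relevance.items() if m > 0}
    if not logs:
        return {}
    log_min = min(logs.values())
    log_max = max(logs.values())
    d_size = size_max - size_min
    d_log = log_max - log_min
    if d_log == 0:
        # Assumption: with a single relevance value, use the minimum size.
        return {t: float(size_min) for t in logs}
    return {
        t: size_min + (v - log_min) * d_size / d_log
        for t, v in logs.items()
    }


# Hypothetical counts spanning three orders of magnitude.
counts = {"history": 1000, "science": 100, "art": 10}
sizes = logarithmic_sizes(counts, size_min=10, size_max=40)
print(sizes)
```

With these counts "science" lands in the middle of the size range (25), whereas purely proportional scaling would place it at about 12.7, close to the smallest tag.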
Class-based scaling | 1
!! Again: $\sigma_{min}$, $\sigma_{max}$ given; $\mu_{max,q}$, $\mu_{min,q}$ calculated:
$$\mu_{max,q} = \max_{t_i \in T} \mu(t_i,q) \qquad \mu_{min,q} = \min_{t_i \in T} \mu(t_i,q) \qquad \text{with } \mu(t_i,q) > 0$$
!! The desired number of sizes $N$ (usually between 3 and 20)
!! An ordered set of sizes $\Sigma = \{\sigma_0, \sigma_1, \ldots, \sigma_N\}$
!! And the corresponding ranges $P = \{\rho_0, \rho_1, \ldots, \rho_N\}$
!! A mapping $\{(\sigma_0, \rho_0), (\sigma_1, \rho_1), \ldots, (\sigma_k, \rho_k), \ldots\}$ based on the same index $k$
Class-based scaling | 2
!! Each range is $\rho_k = [l_k, h_k]$
!! Lower bound: $l_k = \mu_{min,q} + (k - 1) \cdot \dfrac{\mu_{max,q} - \mu_{min,q}}{N}$
!! Higher bound: $h_k = \mu_{min,q} + k \cdot \dfrac{\mu_{max,q} - \mu_{min,q}}{N}$
An analogy…
!! A list of students
!! Each student has a mark between 18 and 30
!! $Student_i = \langle name_i, mark_i \rangle$
!! We want to display the list such that the size of each name depends on how high the mark is
!! We create 3 ranges: $\mu_{max} - \mu_{min} = 30 - 18 = 12$
!! 18-22, 22-26, 26-30
Class-based scaling | 3
!! Simple way
!! We compute ranges in advance
!! For each term, we check the range in which it falls
!! Smarter way
!! We just keep a mapping between the index $k$ and the corresponding size $\sigma_k$
!! We just determine
$$k = \mathrm{floor}\left(N \cdot \frac{\mu(t_i,q) - \mu_{min,q}}{\mu_{max,q} - \mu_{min,q}}\right)$$
An analogy | 2
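!! E.g. for mark 21:
$$k = \mathrm{floor}\left(3 \cdot \frac{21 - 18}{30 - 18}\right) = \mathrm{floor}\left(3 \cdot \frac{3}{12}\right) = \mathrm{floor}(0.75) = 0$$
!! E.g. for 25:
$$k = \mathrm{floor}\left(3 \cdot \frac{25 - 18}{30 - 18}\right) = \mathrm{floor}\left(3 \cdot \frac{7}{12}\right) = \mathrm{floor}(1.75) = 1$$
!! E.g. for 29:
$$k = \mathrm{floor}\left(3 \cdot \frac{29 - 18}{30 - 18}\right) = \mathrm{floor}\left(3 \cdot \frac{11}{12}\right) = \mathrm{floor}(2.75) = 2$$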
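The “smarter way” applied to the student analogy can be sketched as follows. The student names and font sizes are hypothetical. The clamping of the top mark into the last class is an assumption added here, because the floor formula alone returns k = N when the value equals the maximum.

```python
import math


def size_class(mu, mu_min, mu_max, n_classes):
    """Class index k = floor(N * (mu - mu_min) / (mu_max - mu_min))."""
    k = math.floor(n_classes * (mu - mu_min) / (mu_max - mu_min))
    # Assumption: mu == mu_max would give k == N, so clamp it into the top class.
    return min(k, n_classes - 1)


# Hypothetical font sizes for the classes k = 0, 1, 2.
font_sizes = [12, 18, 24]

# Hypothetical students with marks between 18 and 30.
marks = {"Anna": 21, "Bruno": 25, "Carla": 29, "Dario": 30}

classes = {name: size_class(m, 18, 30, 3) for name, m in marks.items()}
name_sizes = {name: font_sizes[k] for name, k in classes.items()}
print(classes)
print(name_sizes)
```

No list of ranges is stored: the index is computed directly from the mark and then used to look up the size.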
Which scaling function? | 1
!! Power law: in many cases the number of terms having a given relevance measure value (e.g. count) is proportional to a power of that value
!! Few tags have very high frequency
!! Many tags have low frequency
!! With proportional scaling: few tags are huge, many tags are very small
!! Logarithmic scaling “adjusts” power-law distributions by “turning” them into linear ones
!! Smoother difference between tags
Which scaling function? | 2
!! Class-based sizing: like a step (piecewise) function
!! Proportional scaling using sizes in pixels:
!! Since each tag size must be an integer
!! Proportional scaling is like class scaling with $N = \mu_{max} - \mu_{min}$
!! Class-based sizing can have a logarithmic scaling too
Tag clouds: pros and cons
!! A tag cloud shows a “simplified” representation of
the distribution of terms according to the facet
!! Advantage: conveys basic facts very immediately (e.g. which concepts are more relevant)
!! Disadvantage: the quantities involved cannot be analysed in more detail
Other visualizations | 1
!! Bar charts and histograms
!! Length of the bar proportional to the term relevance measure
!! Possibly logarithmic scaling applied
!! Allow for a more faithful representation → it is possible to compare the relative length of bars
!! Less immediate and less “eye candy”
Other visualizations | 2
!! Displaying the fraction of a feature with respect to
the whole range
!! Stacked bars
!! Pie chart
!! Lengths are easier to compare than angles
!! But a pie chart may be more “immediate” and engaging to convey a message
Pie charts and 3D views
!! 3D views may distort the values too much
The role of colors
!! Different colors can be used to represent categorical values
!! If you want to convey numerical ordering between terms, choose different shades rather than different colors
!! E.g. states by population
!! less populated → lighter shade
!! more populated → darker shade
The role of colors: bad example
The role of colors: better solution
Visualizing items
Lists, indexes and alike | 1
!! Allow for access to specific items
!! Items are represented as a “preview”
!! E.g. thumbnail, snippet, etc.
!! Some salient features are chosen by the designer
and/or the user
!! Sorting/grouping of items can be allowed
Lists, indexes and alike | 2
Lists, indexes and alike | 3
Tabular/grid view
Lists, indexes and alike | 4
Icons rather than text can help at-a-glance understanding
Scatter plot
!! Classical statistical
diagram
!! Shows correlations between
a feature on the x axis and
a feature on the y axis
!! For quantitative data
!! Good impact only for “expert” users
Aggregate views | 1
!! Focus on features shared by items and their correlation
!! Access to specific items is secondary
!! Items are grouped and aggregated according
to two or more dimensions (at a certain level of
granularity)
!! Aggregation measures: count, average, min,
max, etc.
Aggregate views: matrix chart | 1
!! Two facets: one for rows and one for columns
!! Each data point (pair of terms) is represented as a circle (or other shape), where
$$(t_x, t_y) \in T_X \times T_Y$$
$T_X$: facet vocabulary for rows; $T_Y$: facet vocabulary for columns
!! The size of the shape represents the number of items “belonging” to the data point:
$$\mu(t_x, t_y, q) = \mathrm{ext}(q \text{ and } t_x \text{ and } t_y)$$
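The data behind a matrix chart is a count per pair of terms. A minimal sketch, assuming items are dictionaries with one value per facet and the query is a predicate; the function name and the sample stories are hypothetical.

```python
from collections import Counter


def matrix_counts(items, row_facet, col_facet, query=lambda it: True):
    """Count the items matching the query for each (row term, column term) pair."""
    return Counter(
        (it[row_facet], it[col_facet]) for it in items if query(it)
    )


# Hypothetical sample data.
stories = [
    {"schoolLevel": "primary", "discipline": "history"},
    {"schoolLevel": "primary", "discipline": "history"},
    {"schoolLevel": "junior", "discipline": "science"},
    {"schoolLevel": "senior", "discipline": "history"},
]

cells = matrix_counts(stories, "schoolLevel", "discipline")
for (row, col), n in sorted(cells.items()):
    print(row, col, n)
```

Each count can then be turned into a circle size with any of the scaling functions discussed for tag clouds; pairs that never occur have count 0 and need no shape.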
Aggregate views: matrix chart
Aggregate views: matrix + pie chart
Aggregate views: mosaic plot | 1
!! A “mix” between stacked bars/columns and
matrix plot
!! Width and height of rectangles represent two
different features
!! The area of the rectangle shows how many items
“belong” to the data point
!! More than two dimensions are possible with additional splits (but the plot becomes less clear)
Aggregate views: mosaic plot | 2
!! Songs by:
!! Theme (rows)
!! Decade (columns)
Aggregate views: mosaic plot | 3
!! Passengers by:
!! Gender (1st horizontal split)
!! Survived vs. deceased (2nd horizontal split + color)
!! Travel class (1st vertical split)
!! Age (2nd vertical split)
!! What can you learn from that?
Aggregate views: mosaic plot | 4
!! All male crew members died!
!! Richest (1st class) women and children survived
!! Poorest (2nd class) mostly died
Pixel grid plot
!! Between a list/index and a mosaic plot…
!! Each “pixel” or tile represents an item
!! One dimension is represented by color
!! A second dimension may be represented by the tile shape
!! Aggregation is “at a glance”
Network graphs | 1
!! Show relations between items as a graph where:
!! Nodes are items
!! Edges are shared features
!! The edge can be “weighted” depending on how much a pair of items has in common
!! Weights may be represented by edge length, thickness and/or by spatial distance
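One simple way to obtain such weighted edges is to count the features that two items share. The sketch below assumes each item is described by a set of feature values; the weighting rule, the function name and the sample items are illustrative assumptions.

```python
from itertools import combinations


def shared_feature_edges(items):
    """Return {(a, b): weight}, where weight is the number of shared features."""
    edges = {}
    # Sort by item name so that each pair is produced once, in a stable order.
    for (a, feats_a), (b, feats_b) in combinations(sorted(items.items()), 2):
        weight = len(feats_a & feats_b)
        if weight > 0:
            edges[(a, b)] = weight
    return edges


# Hypothetical items, each with a set of feature values.
items = {
    "story1": {"history", "primary", "rome"},
    "story2": {"history", "junior", "rome"},
    "story3": {"science", "junior"},
}

edges = shared_feature_edges(items)
print(edges)
```

The resulting weights can drive edge thickness, or the distance between nodes in a force-directed layout.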
Network graphs | 2
!! Nicoletta Di Blas's co-authors on Microsoft Academic Search
!! http://academic.research.microsoft.com/VisualExplorer#686102
Visualizing geography
Geographical information
!! Thematic maps visually represent one or more features on
a geographical area
!! Digital, interactive thematic maps
!! Users can zoom and/or adjust visualization in some way
!! Users can filter items
!! More features at once: multivariate thematic map
!! Different signs (shapes, colors, icons) can be used for
showing more characteristics on the same map
!! Avoid mixing shapes, colors and icons together: the result
may be very messy!
Dot map
!! Simplest thematic map
!! One placemark = one item at its
exact location (like in Google
Maps), or
!! One sign = k items in that area
!! Different signs (shapes, colors,
icons) can be used for showing
more characteristics on the same
map
!! May be messy if many items are concentrated in a small area
!! Especially at low levels of zoom
!! Especially for multivariate dot maps
Dot map: nice interactive example
!! http://www.lemonde.fr/election-presidentielle-2012/visuel/2012/04/23/rapports-de-force-entre-les-candidats_1688324_1471069.html
Graduated symbol map | 1
!! Also called Proportional symbol map
!! The map is divided into areas
#! (e.g. administrative areas)
!! One sign for each area (single
feature)
!! One sign for each of N features in
each area (multivariate)
!! The size of the sign changes
according to the number of items
with feature X on area Y
!! Proportional, linear or class-based scaling
!! The multivariate version tends to be messy if you display too many values at once
Graduated symbol map | 2
!! Advantages
!! Statistical distribution on a certain area clearly shown
!! (With respect to dot map) overlapping of signs avoided
!! Disadvantages
!! Multivariate version tends to be messy if you display too many values at once (e.g. facets with many distinct values)
!! The scaling should be carefully chosen to avoid signs that are too large or too small
Pie chart map | 1
!! Similar to multivariate graduated symbol map
!! The map is divided into areas
#! (e.g. administrative areas)
!! One circle (pie) for each area
!! Each pie is cut into slices
!! The size of the slice is proportional to the number of items with feature X on area Y
Pie chart map | 2
!! With respect to multivariate graduated symbol map…
!! Advantages
!! Less messy when you have to show a lot of features at once
!! Disadvantages
!! Features with low frequency are less visible
!! Analogously we could have histogram chart maps
Choropleth map | 1
!! Using colors, shades or patterns
!! The map is still divided into
areas
!! Each area is colored/patterned/
shaded according to the feature
to show
!! High communicative strength, but…
Choropleth map | 2
!! A single area may be colored/shaded/patterned according to mutually exclusive values
!! E.g. regions that are governed by left vs. right parties
!! Single-valued facets only
Choropleth map | 3
!! The gradient of shade/color may be proportional to the frequency of a single feature
!! E.g. number of earthquakes, population
!! To show more features at once you should overlap colors or patterns: too messy
!! You need a map for each facet value
Visualizing time
Timeline
!! Shows distribution of items in time
!! Duration can be represented by a bar length
!! Callout for item preview
!! Two or more “resolutions” (unit of time) → detail vs. overall view
Stacked area chart
!! Shows evolution of multiple (numeric) features over time
!! Each feature is represented by a colored area
!! Features are stacked
!! The summation of features represents the whole
Streamgraph
!! Evolution of trends (themes of discussion) over time:
!! stacked area + tag cloud
Designing exploratory applications
!! Elicit requirements
!! Decide relevant features
!! Design effective visualization
Designing explorations
Requirements: users and stakeholders
!! Identify users and their goals
!! Expert vs novice
!! Ontologies used
!! Overall understanding vs detailed analysis
!! Identify stakeholders' goals towards users
!! Identify related scenarios of usage
Requirements: data and application
!! Constraints on the type and quantity of information the
designer can rely on
!! Number of items and features to handle
!! Already existing sources (e.g. for information mash-ups)
!! Efforts required for editing and classification
#! E.g. classifying ancient artifacts is quite difficult because experts disagree!
!! Technical and application constraints
!! Data formats and kind of devices
!! Software architecture, responsiveness, latency (for web
applications)
!! Time-to-market
Identifying relevant features
!! With respect to each kind of user, identify
!! The information items they are interested in
!! The relevant properties the user may be actually interested
in
!! If necessary, map existing data description to the required
facets
#! E.g. if you already have “birth-date” and you need “age”, you have to compute it
#! E.g. convert currencies, units of measure, etc.
!! Also possibly map different classifications for different users
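As an illustration of mapping existing data to a required facet, the “birth-date” to “age” example might look like the sketch below. The reference date is passed in explicitly, and the age classes in `age_facet` are hypothetical terms chosen only for the example.

```python
from datetime import date


def age_on(birth_date, today):
    """Age in completed years on the given reference date."""
    years = today.year - birth_date.year
    # Subtract one year if the birthday has not yet occurred this year.
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def age_facet(age):
    """Map an age to a facet term (hypothetical classes)."""
    if age < 18:
        return "under 18"
    if age < 65:
        return "18-64"
    return "65+"


age = age_on(date(1990, 6, 15), today=date(2012, 6, 14))
print(age, age_facet(age))
```

Grouping the derived value into a few classes keeps the facet vocabulary small enough for a facet widget.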
Design visualization
!! Evaluate each facet and consider
!! Showing relative vs absolute relevance of terms
!! Precise representation vs “at a glance” understanding
and “emotive” impact
!! Depending on: user interests, number and type of
terms to display, distribution of properties
!! Do the same for canvases
!! Building a fast prototype with a sample of realistic data may be very helpful
!! It helps you understand what the data “actually looks like”
Project B: design and prototyping of an exploratory interface
The project
!! Choose a topic of your interest
!! Find information, create and organize a
collection of information items
!! Design the application:
!! The features (facets) used for the exploration
!! The visualization of results
!! Delivery material: report + prototype
Topics
!! Suggested domains/information items
!! Arts, cinema, literature, music → artworks, novels, movies, artists involved in the field…
!! Cultural heritage → e.g. monuments, cities of interest, museums, etc.
!! Science and technology → discoveries, inventions, famous scientists, animals, plants, etc.
!! Something connected with your study interests (e.g. thesis → must be discussed)
!! Every topic must be agreed with us
The work | 1
!! Create a collection of 80 (or more) information
items
!! For each one, write down an abstract of
approximately 100 words
!! Classify the items according to at least 5 relevant
facets
!! Design a proper visualization for the facets and the
results (at least 2 different canvases)
!! “Special” works can be discussed (e.g.
implementing a different kind of visualization)
The work | 2
!! Implement a prototype:
!! A Simile Exhibit/Solr application or similar tools
!! A sequence of realistic mock-ups showing
features
!! Write a report:
!! 5-10 pages
!! Describe the application and a scenario of usage
The project
!! Choose a topic of your interest
!! Find information, create and organize a collection
of information items
!! Design the application:
!! The features (facets) used for the exploration
!! The visualization of results
!! Delivery material: report + prototype (one week
before the exam)
So now what?
!! Start deciding the groups (max two people)
!! Communicate to us:
!! The team members
!! The choice between Project A (1001stories narrative) vs. Project B (exploratory app)
!! The topic of your work (to be approved in both cases)
!! Each group should open a thread on the BEEP website forum (category “PROJECT GROUPS”)
!! Write the title of the post as follows:
[Project X] Surname01 - Surname02
where X = A or B, e.g.
[Project A] Di Blas – Spagnolo
[Project B] Smith – Rossi
!! All further communications and delivery will be on that thread