1. Visual Analytics for the Digital Humanities:
Combining Analytics and Visualization for Gaining
Insights into Linguistic Data
Daniel A. Keim
Data Analysis and Information
Visualization Group
University of Konstanz, Germany
Herrenhausen Conference, Hannover, Germany
December 5, 2013
2. Visual Analytics
"Computers are incredibly fast,
accurate, and stupid; humans are
incredibly slow, inaccurate, and
brilliant; together they are powerful
beyond imagination."
attributed to Albert Einstein
Visual Analytics
Tight Integration of Visual and Automatic Data Analysis Methods
for Information Exploration and Scalable Decision Support
Visual Data Exploration
Visualization
Data
Knowledge
Models
Automated Data Analysis
Feedback loop
4. Why Visualization for the Digital Humanities?
• Automated techniques not sufficient
– Data ambiguous and incomplete
– Complex relationship
– Semantic gap
– Limited Accuracy
• Human Interaction is central for
– Exploration of Data
– Generation of Hypotheses
– Interpretation of Results
– Steering of the Analysis
Outline
• Visual Analytics
– Motivation and Definition
– Visualization for the e-Humanities
• Visual Analytics Examples
– Literature Analysis
– Language Analysis
– Political Analysis
• Perspectives
5. Authorship Attribution
Books of Mark Twain
Books of Jack London
Authorship Attribution
Average or development over the text?
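The question on this slide can be illustrated with a minimal sketch that computes one stylistic feature once for the whole text (the average) and again over sliding windows (the development). The feature used here, the rate of a few function words, and the names `FUNCTION_WORDS`, `function_word_rate`, and `feature_profile` are illustrative assumptions, not the features used in the talk.

```python
import re

# Illustrative assumption: a tiny function-word list stands in for a real feature set.
FUNCTION_WORDS = {"the", "and", "of", "to", "a", "in", "that", "it"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def function_word_rate(tokens):
    """Share of tokens that are function words (0.0 for an empty list)."""
    if not tokens:
        return 0.0
    return sum(t in FUNCTION_WORDS for t in tokens) / len(tokens)


def feature_profile(text, window=1000, step=500):
    """Return (average over the whole text, list of per-window values)."""
    tokens = tokenize(text)
    overall = function_word_rate(tokens)
    last_start = max(len(tokens) - window, 0)
    windows = [
        function_word_rate(tokens[i:i + window])
        for i in range(0, last_start + 1, step)
    ]
    return overall, windows
```

A single average collapses the book into one number, while the window series can be plotted along the text so that shifts in style become visible.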
8. Age Suitability Analysis
Features
– Character Detection
– Topic Detection
– Emotion Detection
– Story Complexity
– Book Features
– Readability
Characters (Part of Harry Potter)
Characters (Part of Stephen King’s “It”)
Characters are, for example,
(1) named entities, (2) often agents of verbs,
(3) usually not found after prepositions indicating a location
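The three cues for characters can be sketched as a rough heuristic. This is a toy approximation under stated assumptions, not the detector from the talk: capitalization stands in for named-entity recognition, a small hand-picked verb list (`AGENT_VERBS`) stands in for parsing agents of verbs, and `LOCATION_PREPOSITIONS` is an illustrative list.

```python
import re
from collections import Counter

# Illustrative assumptions: both word lists are hypothetical and far from complete.
LOCATION_PREPOSITIONS = {"in", "at", "to", "from", "near", "into", "towards"}
AGENT_VERBS = {"said", "saw", "went", "asked", "looked"}


def character_candidates(text, min_mentions=2):
    """Count capitalized tokens that look like characters rather than places."""
    counts = Counter()
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        tokens = re.findall(r"[A-Za-z']+", sentence)
        for i, tok in enumerate(tokens):
            if not tok[0].isupper():
                continue  # cue 1 (proxy): only capitalized tokens are candidates
            if i > 0 and tokens[i - 1].lower() in LOCATION_PREPOSITIONS:
                continue  # cue 3: skip names that follow a location preposition
            if i == 0:
                # Sentence-initial capitals are uninformative, so require
                # cue 2 (proxy): the token is directly followed by an agent verb.
                if len(tokens) < 2 or tokens[1].lower() not in AGENT_VERBS:
                    continue
            counts[tok] += 1
    return {name: n for name, n in counts.items() if n >= min_mentions}


sample = ("Harry said hello. Then Harry went to London. "
          "In London it rained. Ron saw Harry near London.")
candidates = character_candidates(sample)
```

In the sample, "London" is always preceded by a location preposition and is dropped, while "Ron" is seen only once and falls below `min_mentions`. A realistic system would replace both proxies with a named-entity recognizer and a syntactic parser.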
Age Suitability Analysis
9. Outline
• Visual Analytics
– Motivation and Definition
– Visualization for the e-Humanities
• Visual Analytics Examples
– Literature Analysis
– Language Analysis
– Political Analysis
• Perspectives
Cross-Language Analysis
10. Cross-Language Analysis
Languages from Papua New Guinea with leaves showing features
ordered to maximize (left) and minimize (right) the pairwise leaf similarity
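Ordering the leaves of a tree to maximize or minimize the similarity of neighboring leaves can be sketched by brute force for small trees: each internal node may swap its two children, and every resulting leaf order is scored. The nested-tuple tree encoding, the feature vectors, and the similarity measure (number of matching feature values) are assumptions for illustration; the slide does not state which measure or algorithm was used, and exhaustive search only scales to a handful of leaves.

```python
from itertools import product


def orderings(tree):
    """Yield every leaf order reachable by swapping children of internal nodes."""
    if not isinstance(tree, tuple):
        yield [tree]
        return
    left, right = tree
    for l, r in product(list(orderings(left)), list(orderings(right))):
        yield l + r
        yield r + l


def adjacent_similarity(order, features):
    """Sum of matching feature values between neighboring leaves."""
    def sim(a, b):
        return sum(x == y for x, y in zip(features[a], features[b]))
    return sum(sim(a, b) for a, b in zip(order, order[1:]))


def best_order(tree, features, maximize=True):
    """Pick the leaf order with the highest (or lowest) adjacent similarity."""
    pick = max if maximize else min
    return pick(orderings(tree), key=lambda o: adjacent_similarity(o, features))


# Hypothetical toy data: four languages with three binary features each.
tree = (("A", "B"), ("C", "D"))
features = {"A": (1, 1, 0), "B": (0, 0, 0), "C": (1, 1, 1), "D": (0, 0, 1)}
similar_order = best_order(tree, features, maximize=True)
dissimilar_order = best_order(tree, features, maximize=False)
```

The tree topology is the same in both results; only the left-to-right arrangement of the leaves changes, which is what makes feature patterns easier or harder to read across neighboring leaves.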
Cross-Language Analysis
11. Vowel Harmony: Cross-linguistic Comparison
of Complex Language Features
“two-level” Vowel Harmony
i and u avoid each other
“one-level” Vowel Harmony
syllable reduplication
Vowel succession patterns in 42 languages (automatically
sorted by significance) [2]
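Vowel succession patterns of this kind can be approximated by counting which vowel follows which within a word and comparing the observed counts with what independence would predict. The sketch below is an assumption-laden illustration: the vowel inventory `VOWELS` and the observed/expected ratio are stand-ins, since the slide does not say which significance statistic is used.

```python
from collections import Counter

VOWELS = "aeiou"  # illustrative assumption: a five-vowel inventory


def vowel_successions(words):
    """Count successive vowel pairs inside each distinct word (type)."""
    pairs = Counter()
    for word in set(words):
        vowels = [c for c in word.lower() if c in VOWELS]
        pairs.update(zip(vowels, vowels[1:]))
    return pairs


def observed_over_expected(pairs):
    """Ratio > 1: pair is over-represented; ratio < 1: the vowels avoid each other."""
    total = sum(pairs.values())
    first, second = Counter(), Counter()
    for (a, b), n in pairs.items():
        first[a] += n
        second[b] += n
    return {
        (a, b): n * total / (first[a] * second[b])
        for (a, b), n in pairs.items()
    }


# Hypothetical word list, chosen only to make the counts easy to follow.
pairs = vowel_successions(["kitabi", "kutubu", "balama"])
ratios = observed_over_expected(pairs)
```

Pairs that never occur, such as i followed by u in this toy list, are simply absent from `pairs`; in a visualization they would show up as empty cells of the vowel-by-vowel matrix.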
Vowel Harmony: Cross-linguistic Comparison
of Complex Language Features
Comparing Swedish and Norwegian: vowel transitions
according to their position within words,
based on at least 50 Bible types.
Vowel transitions according to their position within
words. Only transitions based on at least
200 Bible types are plotted (interactive filter).
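The threshold filter described here can be sketched as follows: each vowel transition is recorded together with its position in the word, and only transitions supported by enough distinct word types are kept. Defining "position" as the index of the transition within the word's vowel sequence, the vowel set, and the function name `transitions_by_position` are assumptions made for this illustration.

```python
from collections import defaultdict

# Illustrative assumption: a vowel inventory covering Swedish and Norwegian letters.
VOWELS = set("aeiouyåäöæø")


def transitions_by_position(word_types, min_types=50):
    """Map (position, (vowel, next_vowel)) to its number of supporting types."""
    support = defaultdict(set)
    for word in set(word_types):
        vowels = [c for c in word.lower() if c in VOWELS]
        for pos, pair in enumerate(zip(vowels, vowels[1:]), start=1):
            support[(pos, pair)].add(word)
    # Keep only transitions backed by at least `min_types` distinct word types.
    return {key: len(words) for key, words in support.items()
            if len(words) >= min_types}


# Hypothetical toy input with a low threshold so the filter is visible.
kept = transitions_by_position(["flickor", "himmel", "riket", "gamla"], min_types=2)
```

Raising `min_types` from 50 to 200 corresponds to moving the interactive filter: fewer transitions remain, but each one rests on more evidence.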
18. Voronoi Treemaps [10] in NYT
http://www.nytimes.com/interactive/2008/05/03/business/20080403_SPENDING_GRAPHIC.html?_r=0
19. Visualization in the Digital Humanities
• Visualization is central to allow humans and computers
to cooperate effectively
– allow the computer to process large data
– allow the human to understand and interact with large data
• Interactive Visualization is central for
– Exploration of Data
– Interpretation of Results
– Generation of Hypotheses
– Steering of the Analysis
20. Thank you for your attention.
Questions?
“Anyone who claims to know all the answers
doesn't really know very much.”
Apostle Paul in 1 Cor. 8:2
infovis.uni-konstanz.de