Data Visualization for Literary Analysis
Bringing Medieval Occitan to Life: Visualization Analytics
July 6, 2017
AIEO, Albi 2017
Olga Scrivner and Sandra Kübler
Sections: Introduction, Visualization Methods, ITMS, Medieval Corpus, Conclusion
Digital Humanities - Transformation
The “epic transformation of archives” - shifting from print to
digital archival form (Folsom, 2007)
Digital Humanities
“As our collective knowledge continues to be digitized and
stored (...) it becomes more difficult to find and discover
what we are looking for.” (Blei, 2012)
Visual Analytics in Literature
“The science of analytical reasoning facilitated by
visual interactive interfaces”
(Thomas and Cook, 2005)
Close Reading
Concept: Micro-analysis (Jockers, 2013). Close textual analysis of individual texts to “unveil words, verbal images, elements of style, sentences, argument patterns” (Jasinski, 2001)
Methods: Color coding, marginal comments, underlining
Tools: Poem Viewer, PRISM, Juxta, eMargin
Close Reading Visualization: eMargin and Juxta
Distant Reading
Concept: Macro-analysis (Jockers, 2013). “the construction of abstract models” (Jasinski, 2001)
Methods: Tag clouds, heat maps, clusters, topics, network graphs
Tools (GUI): Voyant, PaperMachine
Tools (TUI): Mallet, Meta, R and Python packages
Visualization Methods in Literature
Computer-assisted methods for text analysis can “offer new
and unexpected insights and knowledge to the literary
scholar” (Oelke et al., 2012)
Word clouds to analyze a novel (Vuillemot et al., 2009)
Social network graphs of characters in Greek tragedies
(Rydberg-Cox, 2011)
Literary fingerprint and summaries (Oelke et al., 2012)
Tracking emotion and sentiment in fairy tales
(Mohammad, 2012)
Topic Modeling
Discovering the underlying themes of text collections (Blei, 2012)
Technological and Methodological Obstacles
Many tools require some programming skills (Mallet,
Meta, R and Python libraries)
GUI tools are limited to certain formats and functions
(Voyant, PaperMachine)
Lack of active control by users
Our Goals - Interactive Text Mining Suite
A user-friendly interactive tool for quantitative and
visualization analysis
Designed for linguistic and literary analysis
Incorporation of annotated corpora in macro-analysis
Background
1. R - a free programming language for statistical
computing and graphics
2. RStudio - Integrated Development Environment: a
source code editor, an executor and a debugger
3. Shiny App - a web application framework for R
ITMS - Interactive Text Mining Suite
Platform-independent, user-friendly and interactive
State-of-the-art statistical and graphical tools (R
libraries)
http://www.interactivetextminingsuite.com
Multi-Functional
1. Import from txt, pdf and rdf files and from the Google Books API
2. Metadata extraction
3. Interactive data pre-processing
4. Dynamic visualization
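The pre-processing step in this list can be illustrated with a short Python sketch. ITMS itself is built on R and Shiny, so this is not its implementation: the function name `preprocess`, its switches and the tiny stopword list are hypothetical and stand in for the kind of options a user might toggle interactively.

```python
import re

# Hypothetical, deliberately tiny stopword list (illustration only).
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in"}

def preprocess(text, lowercase=True, remove_punctuation=True, stopwords=STOPWORDS):
    """Return the tokens of `text` after the selected cleaning steps."""
    if lowercase:
        text = text.lower()
    if remove_punctuation:
        # Replace every character that is neither a word character nor whitespace.
        text = re.sub(r"[^\w\s]", " ", text)
    tokens = text.split()
    return [t for t in tokens if t not in stopwords]

tokens = preprocess("The lady and the lord went to Bourbon.")
print(tokens)
```

Each keyword argument corresponds to one switch, so the effect of a single cleaning step can be inspected by turning it off and comparing the token lists.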
Case Study - Medieval Occitan
Occitan (Provençal) constitutes an important element of the
literary, linguistic, and cultural heritage in the history of
Romance languages
Interactive online database and linguistically annotated
corpus (Scrivner et al., 2014)
http://www.oldoccitancorpus.org
Comparative Analysis: Original and Translation
Lexical level
Grammatical level (part-of-speech)
Stylistic level (sentence length, punctuation)
Document level (cluster, topic analysis)
Bigrams
Bigrams are pairs of consecutive words observed in a text. Their
frequencies are used in genre analysis, text classification and the
study of discourse features
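Counting bigrams takes only a few lines of Python. This is a minimal sketch with an invented sample sentence; the helper name `bigrams` is hypothetical, and the tokenization (lowercasing, word characters only) is an assumption, not the setting used for the slides.

```python
import re
from collections import Counter

def bigrams(text):
    """Count pairs of consecutive words in `text`."""
    words = re.findall(r"\w+", text.lower())
    # zip pairs each word with its successor; pairs may span sentence boundaries.
    return Counter(zip(words, words[1:]))

counts = bigrams("He went home. She asked why he went.")
print(counts.most_common(3))
```

Because punctuation is discarded before pairing, a bigram such as ("home", "she") crosses a sentence boundary. Splitting into sentences first would avoid that, at the cost of a more careful tokenizer.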
Bigram - Lexical Comparison: HE / SHE
[Figure: "Women asked while men went". Words paired with 'he' and 'she'. Horizontal axis: relative appearance after 'she' compared to 'he' (0.25x, 0.5x, Same, 2x, 4x); each word is marked as more 'she' or more 'he'. Words plotted: went, left, found, had, could, took, were, did, wanted, would, gave, was, said, will, does, saw, knew, might, should, can, has, is, who, asked.]
She - asked and modal verbs: could, might, can
He - action verbs: went, left, found
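A comparison of this kind can be computed as a log ratio of how often a word follows each pronoun. The sketch below is one common way to do it, not necessarily the computation behind the figure: the add-one smoothing, the function names and the sample text are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

def following_words(text, target):
    """Count the words that immediately follow `target`."""
    words = re.findall(r"\w+", text.lower())
    return Counter(b for a, b in zip(words, words[1:]) if a == target)

def log_ratios(text, first="she", second="he"):
    """log2 of (rate after `first`) / (rate after `second`), add-one smoothed."""
    c1, c2 = following_words(text, first), following_words(text, second)
    n1, n2 = sum(c1.values()), sum(c2.values())
    ratios = {}
    for word in set(c1) | set(c2):
        rate1 = (c1[word] + 1) / (n1 + 1)  # smoothing keeps unseen words finite
        rate2 = (c2[word] + 1) / (n2 + 1)
        ratios[word] = math.log2(rate1 / rate2)
    return ratios

sample = "She asked him. He went away. She asked again. He went home. He left."
ratios = log_ratios(sample)
for word, value in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{word:>6} {value:+.2f}")
```

A positive value means the word is relatively more frequent after 'she', a negative value after 'he'; a value of 1 corresponds to the "2x" tick on an axis like the one in the figure, and -1 to "0.5x".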
Bigram - Archimbaut / Flamenca
[Figure: Words paired with 'Archambaut' and 'Flamenca'. Horizontal axis: log ratio (0.5x, Same, 2x); each word is marked as more 'Flamenca' or more 'Archambaut'. Words plotted: distressed, jealous, lady, lord, order, left, leave, marguerite, dear, giving, head, knew, troubled, replied, asked, found, gave, heard, heart, lay.]
Bigram - Archimbaut / Flamenca
[Figure: Words paired with 'Archimbaut(z)' and 'Flamenca'. Horizontal axis: relative appearance after 'Flamenca' compared to 'Archimbaut(z)' (0.25x, 0.5x, Same, 2x); each word is marked as more 'Flamenca' or more 'Archimbaut'. Words plotted: estet, cor, demanda, venc, dis.]
Grammatical Level - Part-of-Speech
[Figure, two panels: Occitan corpus; English translation.]
Stylistic Similarities - Sentence Length
[Figure, two panels: Occitan Corpus; English Translation.]
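Sentence length is straightforward to measure once sentences are delimited. The sketch below assumes a naive split on periods, question marks and exclamation marks; the helper name `sentence_lengths` and the sample text are hypothetical, and a real corpus would need a more careful sentence segmenter.

```python
import re
from statistics import mean

def sentence_lengths(text):
    """Return the number of words in each sentence (naive split on . ! ?)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(re.findall(r"\w+", s)) for s in sentences]

lengths = sentence_lengths("He went. She asked why he left! Nobody knew.")
print(lengths, mean(lengths))
```

Running the same function over an original and its translation gives two length distributions that can then be plotted side by side.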
Stylistic Comparison - Punctuation
[Figure, two panels: Occitan Corpus; English Translation.]
Color key: question marks and exclamation marks - red; quotation marks, hyphens and parentheses - green;
semicolons, colons, commas and periods - blue
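The three punctuation groups in the color key can be tallied with a small counter. This is a sketch under assumptions: the group labels, the exact character sets (for example, which quotation-mark characters to include) and the sample sentence are illustrative choices, not taken from ITMS.

```python
from collections import Counter

# Character sets for the three groups named in the color key (assumed membership).
GROUPS = {
    "question/exclamation": "?!",
    "quotation/hyphen/parenthesis": "\"“”-()",
    "semicolon/colon/comma/period": ";:,.",
}

def punctuation_profile(text):
    """Count how many punctuation marks of each group occur in `text`."""
    counts = Counter()
    for ch in text:
        for group, marks in GROUPS.items():
            if ch in marks:
                counts[group] += 1
    return counts

profile = punctuation_profile('"Why?" he asked; she left (quietly).')
print(dict(profile))
```

Keeping the groups in one dictionary makes it easy to regroup the marks or to add marks used in a particular edition.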
Document Level - Network Analysis
Network - “resembling a net (...) to capture the notion of
elements in a system and their interconnectedness”
(Kolaczyk, 2009)
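The slide does not state what the nodes and edges of the network are. As one possible reading, the sketch below treats character names as nodes and links two names whenever they occur in the same sentence; the function name `cooccurrence_edges` and the three sample sentences are invented for illustration.

```python
import re
from collections import Counter
from itertools import combinations

def cooccurrence_edges(sentences, names):
    """Weighted edges between names that appear in the same sentence."""
    edges = Counter()
    for sentence in sentences:
        words = set(re.findall(r"\w+", sentence.lower()))
        present = sorted(n for n in names if n in words)  # sorted: stable edge keys
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges

# Invented toy sentences, not quotations from the romance.
sentences = [
    "Flamenca spoke to Guillem.",
    "Archambaut watched Flamenca.",
    "Guillem greeted Flamenca and Archambaut.",
]
edges = cooccurrence_edges(sentences, {"flamenca", "guillem", "archambaut"})
for (a, b), weight in sorted(edges.items()):
    print(a, b, weight)
```

The resulting weighted edge list is the usual input for a graph-drawing library, where edge weight can be mapped to line thickness.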
Document Level - Cluster Analysis
Cluster analysis - groups documents into subgroups. These
subgroups “are coherent internally, but clearly different from
each other”
(Manning, 2009)
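A very small instance of document clustering is sketched below: documents become word-count vectors, similarity is the cosine between vectors, and any two documents whose similarity reaches a threshold end up in the same group (single linkage cut at that threshold). The threshold of 0.7, the function names and the three sample documents are assumptions for illustration; the slides do not say which clustering method ITMS uses.

```python
import math
import re
from collections import Counter

def vectorize(text):
    """Bag-of-words vector: word -> count."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(u, v):
    """Cosine similarity between two count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0

def cluster(docs, threshold=0.7):
    """Single-linkage grouping: merge clusters linked by a pair with similarity >= threshold."""
    vectors = [vectorize(d) for d in docs]
    labels = list(range(len(docs)))  # every document starts in its own cluster
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                old, new = labels[j], labels[i]
                labels = [new if label == old else label for label in labels]
    return labels

docs = [
    "the lady loved the knight",
    "the knight loved the lady dearly",
    "the jealous lord locked the tower",
]
labels = cluster(docs)
print(labels)
```

Documents that share a label form one subgroup. Frequent function words such as "the" inflate every similarity, which is one reason stopword removal usually precedes clustering.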
Document Level - Topic Analysis
Text collections - “represented as random mixtures over
latent topics, where each topic is characterized by a
distribution over words”
(Blei, 2003)
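The quotation describes latent Dirichlet allocation. Written out in its commonly used smoothed form, the generative story is as follows; the symbols $\alpha$ and $\beta$ (Dirichlet hyperparameters), $\theta_d$, $\varphi_k$, $z_{d,n}$ and $w_{d,n}$ are conventional notation assumed here, not taken from the slides.

```latex
\begin{align*}
\theta_d &\sim \mathrm{Dirichlet}(\alpha)
  && \text{topic mixture of document } d \\
\varphi_k &\sim \mathrm{Dirichlet}(\beta)
  && \text{word distribution of topic } k \\
z_{d,n} \mid \theta_d &\sim \mathrm{Multinomial}(\theta_d)
  && \text{topic assigned to the } n\text{-th word of } d \\
w_{d,n} \mid z_{d,n} &\sim \mathrm{Multinomial}(\varphi_{z_{d,n}})
  && \text{the observed word}
\end{align*}
```

Topic analysis inverts this story: given only the observed words $w_{d,n}$, it estimates the mixtures $\theta_d$ and the topic-word distributions $\varphi_k$.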
Conclusion
1. There is a need for text mining tools designed for
linguists and literary scholars
2. Interactive user-friendly applications bridge the gap
between data visualization and digital humanities
3. The Shiny framework can be incorporated into any digital
corpus to exhibit, search or visualize written collections
ITMS
Browser and Smart Phone
Questions, comments
https://languagevariationsuite.wordpress.com/
References
Mohammad, Saif. 2011. From Once Upon a Time to Happily Ever After:
Tracking Emotions in Novels and Fairy Tales. In Proceedings of the ACL
Workshop on Language Technology for Cultural Heritage, Social Sciences, and
Humanities (LaTeCH). Portland, OR.
Moretti, Franco. 2005. Graphs, maps, trees: abstract models for a literary
history. R.R. Donnelley & Sons.
Oelke, Daniela, Dimitrios Kokkinakis and Mats Malm. 2012. Advanced Visual
Analytics Methods for Literature Analysis. In Proceedings of the 6th EACL
Workshop, 35-44.
Rydberg-Cox, Jeff. 2011. Social Networks and the Language of Greek
Tragedy. Journal of the Chicago Colloquium on Digital Humanities and
Computer Science. 1(3): 1-11.
Thomas, James and Kristin Cook. 2005. Illuminating the Path: the Research
and Development Agenda for Visual Analytics. National Visualization and
Analytics Center.
Vuillemot, Romain, Tanya Clement, Catherine Plaisant and Amit Kumar.
2009. What’s Being Said Near “Martha”? Exploring Name Entities in Literary Text
Collections. In Proceedings of the IEEE Symposium. Atlantic City, New Jersey.
107-114.