Digital Humanities is a term that elicits both excitement and scorn in scholarly circles, and there is still a great deal of discussion as to whether it is a field of inquiry, a set of research methods, or simply a new perspective on arts and humanities research. This workshop will provide a brief survey of how the evolving theory and practice of using contemporary technology and technology-assisted research methods are impacting scholarship in the arts and humanities.
2. Contact Information
• All Presentation Materials: https://goo.gl/QgAwcv
Presentation Resources
Kevin Comerford, Associate Professor
Director of Digital Initiatives & Scholarly Communication
Director of IT Services
UNM University Libraries
kevco@unm.edu
4. Definitions
• “Digital Humanities is born of the encounter between
traditional humanities and computational methods.” –
Anne Burdick
• “Digital humanities is work at the intersection of digital
technology and humanities disciplines.” – Johanna Drucker
• “Digital Humanities is less a unified field than an array of
convergent practices that explore a universe in which print
is no longer the primary medium in which knowledge is
produced and disseminated.” – Dig. Humanities Manifesto
5. Definitions
• Our key problem is this: The humanities is a discipline
that values subtlety, nuance, conflicting ideas, and even
paradox. When you’re working with computers, on the
other hand, you have to format information precisely
and rigidly. So how do you use a computer to do
humanities work? Should we stick to word-processing, or
is there a way to take advantage of newer tools, like
digital maps and data visualization, for humanities work?
– Miriam Posner
7. Manifestos
• Themes of Diversity: Diversity studies over the past 30
years contribute to a humanities that is no longer the
province of “old white men”
• Focus on Openness: Open Access, Open Source, Open
Annotation. Advocates for change in copyright laws and
traditions
• Research is project-based. DH projects and research
results are published online, rather than in print.
• Emphasizes teamwork and co-authorship: scholars with
many talents needed for DH research
8. Manifestos
• Digital Humanities means iterative scholarship, mobilized
collaborations, and networks of research.
• Our emblem is a digital photograph of a hammer
(manual making) superimposed over a folded page (the
2d text that now unfolds in three dimensions).
• Digital Humanists recognize curation as a central feature
of the future of the Humanities disciplines.
• Yes, there is something utopian at the core of digital
humanities: The open, the unfixed, the contingent, the
infinite, the expansive, the no place.
9. Source Material
• Digital Humanities research utilizes a variety of
source material, both visual and textual
• Published texts – the most notable DH projects have
performed analysis of traditional humanities texts
• Digital texts – Websites, blogs, emails, etc.
• Primary source material – archives and special collections
content
• Visual collections, both physical (i.e., to be digitized) and
digital
11. Popular DH Working Methods
• Online Writing & Blogging / Open Annotation
• Text Analysis & Data Mining
• Digital Exhibitions
• Digital Mapping/GIS
• Data Visualization
• Linked Data
• Photogrammetry, Virtual Worlds, 3-D Modeling
13. Common Features
“All digital projects have certain structural features in
common. Some are built on “platforms” using software that
has either been designed specifically from within the digital
humanities community (such as Omeka), or has been
repurposed to serve (WordPress, Drupal), or has been
custom-built. We talk about the “back end” and “front end”
of digital projects, the workings under the hood (files on
servers, in browsers, databases, search engines, processing
programs, and networks) and the user experience. Because
all display of digital information on screen is specified in
HTML, hyper-text markup language, all digital projects are
produced in HTML as their final format.” – Johanna Drucker
17. Text Analysis Projects
"Vocabulary changes in
Agatha Christie's
mysteries as an
indication of dementia:
A case study," by Ian
Lancashire and Graeme
Hirst, 2009.
https://www.theguardian.com/books/2009/apr/03/agatha-christie-alzheimers-research
19. Text Analysis
“Textual analysis is a way for researchers to gather
information about how other human beings make sense of
the world. It is a methodology and a data-gathering process
for those researchers who want to understand the ways in
which members of various cultures and subcultures make
sense of who they are, and of how they fit into the world in
which they live. Textual analysis is useful for researchers
working in cultural studies, media studies, in mass
communication, and perhaps even in sociology and
philosophy.” – Alan McKee
25. Text Analysis Tools
VOYANT Hands-On Demo
1. Open your browser to: https://voyant-tools.org/
2. Click Open – select a corpus of documents from the list
3. Click Open
4. Re-open your browser to: https://voyant-tools.org/
5. In a second tab, open: https://goo.gl/QgAwcv
6. In the Data folder, double-click the “Public ETD Data” file to open it.
7. Press <control>-a and <control>-c to select and copy the data.
8. Return to the Voyant Tools page, click inside the Add Texts box, and press <control>-v to paste the data into the field.
9. Click the Reveal button and note the results.
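At its core, the terms table and word cloud that Voyant reveals are built from word-frequency counts over the pasted corpus. A minimal pure-Python sketch of that statistic (function and variable names are my own, not Voyant's):

```python
from collections import Counter
import re

def term_frequencies(text, stopwords=frozenset(), top_n=10):
    """Count word occurrences in a text, the basic statistic behind
    Voyant's Cirrus word cloud and Terms table."""
    # Lowercase and split on anything that is not a letter or apostrophe
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in stopwords)
    return counts.most_common(top_n)

sample = "The thesis examines archives; archives preserve the past."
print(term_frequencies(sample, stopwords={"the"}))
```

Voyant applies a built-in stopword list in the same spirit as the `stopwords` parameter here, which is why common function words do not dominate its displays.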
27. Quick and Incomprehensible
Mallet Demo
1. Install & configure Mallet from http://mallet.cs.umass.edu
2. Import documents to be examined:
• bin\mallet import-dir --input C:\Mallet\data --output topic-input.mallet --keep-sequence --remove-stopwords
3. Tell Mallet to train a model of 20 topics on the imported records:
• bin\mallet train-topics --input topic-input.mallet --num-topics 20 --output-state topic-state.gz --output-topic-keys tutorial_keys.txt --output-doc-topics tutorial_composition.txt
28. Quick and Incomprehensible
Mallet Demo: Starting a Topic Model
4. Examine output results (text file):
0 0.25 thesis studies masters in-depth transient develop negotiating live stories conducted poetry discuss multiple struggles limited extended academic language childcare encounter
1 0.25 party tea political religion making mobilization representing form strategies intensive lens mtp peripherally collective stimulate difficult degrees economically culturally broad
2 0.25 sociology culture literature examine individual government previous movements result utilizes title workplace phases transcribed fitting synonymous thinness code content american
3 0.25 rhetorical actors community layers posthuman criticism moves meanings dimensions higher-level teleaction categories human connection heteronormative ant generally positions strategy single
4 0.25 subject dissertation type lcsh abstract author political level keywords department title analysis provide work public doctoral research make families populations
5 0.25 embodied investments reproductive theory employment education including influences economies stability changing behavior importance structural economic results context females tradeoff parental
6 0.25 capital dominican birth opportunities republic access factors offspring's educational income invest decision differences due investment children reproduction onset spacing control
7 0.25 blind dignified activists labor debates argentina argued employment men theories believed sighted drove
5. Adjust the topic commands to focus on specific words and phrases, then repeat them.
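The topic-keys file shown above can also be read programmatically rather than by eye. A minimal sketch, assuming MALLET's tab-separated --output-topic-keys format (topic number, weight, then the space-separated key words); the function name is my own:

```python
def parse_topic_keys(lines):
    """Parse MALLET's --output-topic-keys file: each line is
    topic-number <TAB> weight <TAB> space-separated key words."""
    topics = []
    for line in lines:
        topic_id, weight, words = line.strip().split("\t", 2)
        topics.append((int(topic_id), float(weight), words.split()))
    return topics

sample = [
    "0\t0.25\tthesis studies masters in-depth transient",
    "1\t0.25\tparty tea political religion making",
]
for tid, weight, words in parse_topic_keys(sample):
    print(tid, weight, words[:3])
```

From a structure like this it is straightforward to sort topics by weight or search the key-word lists for terms of interest before re-running the training commands.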
29. Selected Mapping & GIS Tools
• Google Maps (https://www.google.com/maps) Free
• HistoryPin (http://www.historypin.org) Free
• QGIS (http://www.qgis.org) Free
• ArcGIS (https://www.arcgis.com) $ Personal Edition
(note: includes access to StoryMap)
30. Mapping & GIS Tools
Esri StoryMap
• The Blessing Way – Map of the Novel by Hillerman
https://goo.gl/M8ULph
31. Mapping & GIS Tools
Google My Maps Demo
https://www.google.com/maps/d/
36. Contact Information
• All Presentation Materials: https://goo.gl/QgAwcv
Presentation Resources
Kevin Comerford, Associate Professor
Director of Digital Initiatives & Scholarly Communication
Director of IT Services
UNM University Libraries
kevco@unm.edu
Presenter notes
Hi everyone, the Digital Humanities is an exciting area of growth in humanities research and scholarship. The field is relatively new, still evolving, and growing quickly both in terms of sophistication and breadth of tools, methods and subject focus. Over the next hour and a half we’ll look at some of the underlying theory and concepts that have motivated the development of this field, we’ll also look at actual digital humanities projects, and we’ll examine the software tools that are popularly used in digital humanities research.
This presentation is available on Slideshare, and I have also put an electronic resource guide and copies of some of the software tools we’ll demonstrate on Dropbox at the link shown.
There is a great deal of online discussion and informal publishing about the digital humanities. On the resource list for this presentation there is an article from the online college website that lists 20 valuable blogs about digital humanities research – these are definitely worth examining.
Let’s first talk about some of the leading figures in the digital humanities, and start with some basic definitions of the field. At its foundation, digital humanities is an extension of traditional humanities research. For those just dipping their toes into the field, the work of Anne Burdick, from the Art Center College of Design, and of Johanna Drucker, from UCLA, is an excellent place to start. Both of these scholars have contributed to the defining literature of the digital humanities, and they have taught extensively about the field. Both have open course syllabi published online that you can use to further your understanding of digital humanities principles and practice.
Miriam Posner is another UCLA professor who has taught extensively about Digital Humanities. As her quote implies, digital humanities work spans a broad scope of activities. On one end of the spectrum, digital humanities work can be as straightforward as digitizing content and publishing it online so it is accessible to scholars. On the far end of the spectrum the field utilizes sophisticated analytical techniques to detect similarities and differences in writing style, image content, or geographic distribution.
While the “revolutionary” flavor of digital humanities has dulled a bit over time, in the early 2000’s there was quite a bit of scholarly discussion about how computational methods would combine with new ideas about humanities research, and in particular how these methods could break open perceived barriers in traditional humanities scholarship. New ideas about open access scholarly publishing contributed to defining a more expansive, and collaborative philosophy of how humanities research should be conducted in the twenty-first century.
A cadre of interested scholars came together to create a digital humanities manifesto in 2008. The manifesto outlines aspirational ideals, such as “openness,” but also provides a descriptive framework for how new techniques in data analysis would change traditional notions of research. In particular, the manifesto recognizes that in order to use sophisticated computational tools, humanists would need to partner collaboratively with statisticians, quantitative analysts, computer programmers, and web designers to develop the products of digital research. Thus digital humanities would be seen as project-based and collaborative – more of a group or team effort than the work of individual scholars.
While some of the manifesto content can be seen as enthusiastic hyperbole, it does provide an intriguing foundation from which we can examine our attitudes toward and methods of performing humanities research.
The subject of digital humanities research includes both traditional and new media/social media source material. Frequently, digital humanities research combines “mainstream” intellectual theory or textual analysis methods, such as Hermeneutics, Marxism, Semiotics, etc. in tandem with digitization of content and contemporary data analysis or visualization methods to provide new insights into traditional materials. We’ll examine a couple of these research projects in a minute.
Many digital humanities projects incorporate digitization of physical materials into the research process. In fact, digitization often accounts for a significant portion of digital humanities research grants – this is where libraries who hold primary source materials can collaborate with digital humanities scholars, as library funding for digitization of collections is frequently available through a variety of sources.
Here’s an example of a digital humanities project that I developed here at UNM, using our own library special collections. The Tony Hillerman Portal was a massive effort to digitize all of the manuscripts and papers of Southwest novelist Tony Hillerman. The portal as a website provides access to his papers as a digital collection, but also includes linked data that references cultural, historical and geographical terms and concepts the author used in his work. The site is also a platform for a seminal digital mapping project that we’ll discuss a little later. On the right is a page from the archival finding aid for Hillerman’s papers, in the center is a screenshot of the Hillerman portal website that shows how the finding aid has been enhanced with descriptive analysis of each manuscript and digital image snapshots of manuscript pages. On the left is an example of the digitized manuscript for the novel Blessing Way, which can be viewed entirely online.
This list shows a variety of popular digital humanities working methods. We’ll only have time to discuss the first four today.
Digital Humanities projects are published on the web, either through standalone websites, content management systems, or other types of digital repositories. Given the project-based nature of digital humanities projects
This is a copy of Lancashire & Hirst’s text analysis study of Agatha Christie novels.
Historypin is a web platform where people can come together to share and celebrate local history. It consists of a shared archive, a mutually supportive community, and a collaborative approach to engagement with local history. 65,000 individuals and community groups, as well as 2,500 libraries, archives, museums, and schools in 2,600 cities, are using Historypin. At present, the New Mexico Department of Cultural Affairs is participating in a HistoryPin initiative called “Memories of Migration” – a project that combines geographic mapping with oral histories from families who immigrated to the United States.
The Bomb Sight project maps the London WW2 bomb census between 7/10/1940 and 06/06/1941. Previously available only by viewing in the Reading Room at The National Archives, Bomb Sight makes the maps available to citizen researchers, academics, and students, who can explore where the bombs fell and discover memories and photographs from the period. The project has scanned the original 1940s bomb census maps, geo-referenced them, and digitally captured the geographical locations of all the falling bombs recorded on the original maps.
The interactive web-mapping application allows users to explore and search for different bomb locations across London. Click on individual bombs and find out information relating to the neighbouring area by reviewing contextual images and memories from the Blitz.
The list here shows a variety of text analysis-capable tools. Wordle is a free tool that most of us have experience using. We’ll take a look at VOYANT in just a moment – VOYANT is free and quite capable for small projects. For large-scale or sophisticated text analysis, MALLET is a good free solution, though it comes with a command-line interface and a steep learning curve. Other text analysis packages that are very good, but not free, include Atlas.ti, JMP by SAS, and NVivo.
It’s also worth mentioning that web search engine optimization (SEO) tools can also be used for certain types of text analysis. Many of these tools are free.
Voyant is an outgrowth of the TaPor project which was a collaboration between several Canadian universities. The developers were Stefan Sinclair of McGill University and Geoffrey Rockwell, of the University of Alberta. Voyant is suitable for small projects for which basic analytic data, like term occurrence, is the desired outcome. Voyant provides several data visualization tools.
MALLET is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text. Mallet excels at large scale text processing, and has been used in several numeric optimization projects, where text is converted into numeric data to more easily determine word, phrase and topic clusters.
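The step of converting text into numeric data can be illustrated with a toy bag-of-words vectorizer in plain Python. This is a deliberate simplification of what MALLET does internally, and the names are illustrative rather than MALLET's own:

```python
def bag_of_words(docs):
    """Convert documents to numeric vectors (term counts over a shared
    vocabulary) -- the basic text-to-numbers step topic modelers perform."""
    # Build a sorted vocabulary across all documents
    vocab = sorted({w for doc in docs for w in doc.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for doc in docs:
        v = [0] * len(vocab)
        for w in doc.lower().split():
            v[index[w]] += 1
        vectors.append(v)
    return vocab, vectors

vocab, vecs = bag_of_words(["the cat sat", "the dog sat"])
print(vocab)   # ['cat', 'dog', 'sat', 'the']
print(vecs)    # [[1, 0, 1, 1], [0, 1, 1, 1]]
```

Once documents are in this numeric form, clustering and topic-modeling algorithms can compare them mathematically to find word, phrase, and topic clusters.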
This demo is meant to quickly show how Mallet works at the command line. Step 2 above imports 10 documents from a directory folder into MALLET, removes stopwords, and saves the result as a MALLET data file. Step 3 begins the training process: the command tells MALLET to train a model of 20 topics on the 10 test documents. The results are saved to text files.
When we open the results file, the first number of the string is the topic number, and the second number shows the weight of importance of the topic word cluster. From here, one would begin to weight specific words or phrases to focus on actual topics. The demo shows that while Mallet is kind of arcane and clunky, it has the potential to yield quite sophisticated results. Comparable software tools cost thousands of dollars per seat.
We are all familiar with the amazing features of online mapping tools like Google Maps and Google Earth. Each of these products can be customized to provide a rich user experience.
Omeka (http://omeka.org)
A project of the Roy Rosenzweig Center for History and New Media at George Mason University, Omeka is a free, flexible, and open source web-publishing platform for the display of library, museum, archives, and scholarly collections and exhibitions. Omeka’s Showcase includes projects powered by Omeka.
Neatline (http://neatline.org/)
A suite of add-on tools for Omeka, Neatline allows scholars, students, and curators to tell stories with maps and timelines.
Other Digital Exhibition Applications include:
Scalar (http://scalar.usc.edu/scalar/)
A free, open source authoring and publishing platform that’s designed to make it easy for authors to write long-form, born-digital scholarship online. Scalar enables users to assemble media from multiple sources and juxtapose them with their own writing in a variety of ways, with minimal technical expertise required.
DH Press (http://digitalinnovation.unc.edu/projects/dhpress/)
DH Press is a flexible, repurposable, extensible digital humanities toolkit designed for non-technical users. It enables administrative users to mashup and visualize a variety of digitized humanities-related material, including historical maps, images, manuscripts, and multimedia content. DH Press can be used to create a range of digital projects, from virtual walking tours and interactive exhibits, to classroom teaching tools and community repositories
Viewshare (http://viewshare.org/)
A free platform for generating and customizing views (interactive maps, timelines, facets, tag clouds) that allow users to experience your digital collections.