Analyzing AV Sources as Data - Responsible Data Science lecture

Humanities scholars work with unstructured data: information about human culture stored in books, archival records, audiovisual sources and other carriers of information. Traditionally, the data from these various sources were extracted and processed in the mind of the scholar. With the growing availability of these data in digital form, the task of extracting and combining information from various datasets is increasingly mediated by computational tools. Supporting scholars in working with digital data requires a high level of transparency: scholars want to know exactly where the data originate, how they have been processed and manipulated, and what this means for their results and interpretation. In this lecture I will discuss our experiences in designing the CLARIAH research infrastructure for media studies research, focusing on the requirements regarding the transparency of data and tools.

  1. Analyzing AV Sources as Data: Experiences Gained with Designing the CLARIAH Media Suite. Julia Noordegraaf, University of Amsterdam
  2. Introduction
      • Data transparency is a prerequisite for broad take-up of DH
      • Learning by doing:
          – CREATE program (www.create.humanities.uva.nl)
          – CLARIAH infrastructure (http://mediasuite.clariah.nl/)
      • Examples:
          – metadata criticism
          – data and tool criticism
  3. What does CLARIAH do? Build a virtual research environment for conducting humanities research with digital data and tools
  4. Context
      • Part of the National Roadmap Large-scale Research Infrastructure
      • Dutch contribution to the European research infrastructures for the Humanities and Social Sciences (CLARIN and DARIAH)
  5. Focus
      • Selective focus on three types of data & fields:
          ▪ Textual data – Linguistics
          ▪ Structured data – Socio-economic History
          ▪ Audiovisual data – Media Studies
      • Aim to make collections and tools available for other disciplines in the Humanities and Social Sciences
  6. Following the research process
  7. Data in the Media Suite: overview of available collections, including collection descriptions (CKAN)
  8. Collection Inspector Tool
  9. Step 1: Selecting a collection
  10. Step 2: Inspecting available metadata fields
  11. Step 3: Selecting date & analysis fields
  12. Collection Inspector results
  13. EYE metadata
  14. Hoyt, Eric, Kit Hughes, Derek Long, Anthony Tran, and Kevin Ponto. 2014. “Scaled Entity Search: A Method for Media Historiography and Response to Critiques of Big Humanities Data Research.” In 2014 IEEE International Conference on Big Data (Big Data), 51–59.
  15. Advancing tool criticism: Spinque Desk
  16. Conclusions
      • More data = more transparency needed
      • Black-boxing versus improving digital literacy
      • Solutions:
          – aggregate views on collections and metadata quality
          – tools for data criticism
          – tools for tool criticism
  17. Launch MSv3: April 2018
      • http://mediasuite.clariah.nl/
      • Julia Noordegraaf, j.j.noordegraaf@uva.nl
      • http://www.clariah.nl/werkpakketten/focusgebieden/media-studies
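
The Collection Inspector steps on slides 9 to 12 (select a collection, inspect its metadata fields, choose a date field and an analysis field, view the results) suggest a per-year completeness check. The slides do not show how the tool computes its results, so the following is only a minimal sketch under assumptions: the function name, the record layout, and the sample records are hypothetical and are not taken from the Media Suite.

```python
from collections import defaultdict


def field_completeness_by_year(records, date_field, analysis_field):
    """Return {year: share of that year's records with a non-empty analysis_field}.

    Hypothetical helper for illustration; not part of the CLARIAH Media Suite.
    Assumes each record is a dict and dates start with a four-digit year.
    """
    totals = defaultdict(int)
    filled = defaultdict(int)
    for record in records:
        date_value = record.get(date_field)
        if not date_value:
            # Records without a date cannot be placed on the timeline.
            continue
        year = int(str(date_value)[:4])
        totals[year] += 1
        # Treat missing, empty-string, and empty-list values as "not filled in".
        if record.get(analysis_field) not in (None, "", []):
            filled[year] += 1
    return {year: filled[year] / totals[year] for year in sorted(totals)}


# Invented sample records, standing in for one collection's metadata.
records = [
    {"date": "1930-05-01", "title": "Film A", "director": "X"},
    {"date": "1930-11-12", "title": "Film B", "director": ""},
    {"date": "1931-02-03", "title": "Film C"},
    {"title": "Film D", "director": "Y"},
]

completeness = field_completeness_by_year(records, "date", "director")
for year, share in completeness.items():
    print(year, f"{share:.0%}")
```

An aggregate view of this kind shows where a metadata field is sparsely filled, so that an apparent trend over time can be checked against gaps in the underlying records before it is interpreted.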