Timo Honkela, A Seminar on Digital Publishing and Research. 3 Dec 2015
Timo Honkela
3 Dec 2015
Research interests in
text and metadata mining
of literature
timo.honkela@helsinki.fi
A Seminar on Digital Publishing and Research
Inspirational context:
classical literature
The presentation handles mostly general methods.
We can discuss how and to which extent these
can be used in this context.
Humanities and “compunities”
● Understanding and analysis of cultural artefacts
such as novels or poems requires human
experience of the world and dedication to the
relevant context
● Computers are useful as tireless “distant reading
tools” that can, for example, count instances of
linguistic expressions, their contextual relations
and their relation to given categories, or parse
the structure of writings at different levels of
abstraction
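As a minimal sketch of what such counting can look like, the Python fragment below counts word instances and the words that appear near a chosen target word. The sample sentence, the function names and the window size of two words are assumptions made for illustration only.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def count_contexts(tokens, target, window=2):
    """Count the words found within `window` positions of each occurrence of `target`."""
    contexts = Counter()
    for i, tok in enumerate(tokens):
        if tok == target:
            start = max(0, i - window)
            neighbours = tokens[start:i] + tokens[i + 1:i + 1 + window]
            contexts.update(neighbours)
    return contexts

# Invented sample text, not taken from any corpus.
sample = "The king rode to the forest. The queen waited for the king in the castle."
tokens = tokenize(sample)
word_counts = Counter(tokens)                  # instances of each word
king_contexts = count_contexts(tokens, "king")  # contextual relations of one word

print(word_counts.most_common(3))
print(king_contexts.most_common(3))
```

The same counts can be collected per chapter, per author or per category, which is where the comparison with given categories comes in.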
Digital humanities
● A research area called Digital Humanities usually
(1) takes its research questions from the
humanities and social sciences, (2) works with data
represented in digital form, and (3) addresses the
research questions using quantitative computational
analysis methods (statistics, machine learning) on the
data along with qualitative research methods
● For the computational analysis, large collections of
data (“big data”) can provide certain benefits, but
small data sets can also be analyzed in a similar manner
Text and data mining
● Data mining refers to the computer-based
analysis of data collections in order to find
interesting or useful patterns, relations or
structures in the data
● Data mining is often applied to numerical data,
but structured data can also be used
● Text mining refers to data mining applied
specifically to texts
Text mining
● Analysis of text documents at different levels
of abstraction
– Word segments (morphology)
– Lexicon (words, terms, phrases, names, etc.)
– Syntax
– Semantics
– Pragmatics (computationally challenging)
● Context!
Classical example: Learning meaning from context:
Maps of words in Grimm fairy tales
Honkela, Pulkki & Kohonen 1995
Automated learning of word relations
using self-organizing map on text context data
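The idea of learning word relations from context can be sketched in a few lines of Python. This is a simplified illustration and not the setup of the 1995 study: the toy text, the 3 x 3 grid, the learning-rate and radius schedules and all function names are assumptions. Each word is described by counts of its immediate neighbours, and a small self-organizing map is trained on those vectors so that words used in similar contexts tend to land on nearby map units.

```python
import math
import random

def context_vectors(tokens, vocab):
    """Describe each vocabulary word by counts of its immediate left and right neighbours."""
    index = {w: i for i, w in enumerate(vocab)}
    vectors = {w: [0.0] * len(vocab) for w in vocab}
    for i, tok in enumerate(tokens):
        for j in (i - 1, i + 1):
            if 0 <= j < len(tokens):
                vectors[tok][index[tokens[j]]] += 1.0
    # Scale to unit length so that frequent words do not dominate.
    for w, vec in vectors.items():
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors[w] = [x / norm for x in vec]
    return vectors

def best_unit(weights, vec):
    """Return the grid position whose weight vector is closest to `vec`."""
    return min(weights,
               key=lambda pos: sum((a - b) ** 2 for a, b in zip(weights[pos], vec)))

def train_som(data, rows=3, cols=3, epochs=200, seed=0):
    """Train a small rectangular SOM with a Gaussian neighbourhood."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs                # shrinks towards 0 over training
        rate = 0.5 * frac                          # learning rate
        radius = 0.5 + max(rows, cols) / 2 * frac  # neighbourhood radius
        for vec in rng.sample(data, len(data)):    # present samples in random order
            br, bc = best_unit(weights, vec)
            for (r, c), w in weights.items():
                dist2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-dist2 / (2 * radius ** 2))
                for k in range(dim):
                    w[k] += rate * h * (vec[k] - w[k])
    return weights

# Invented toy text; a real study would use a full corpus.
text = ("the king loves the queen . the queen loves the king . "
        "the wolf eats the goat . the goat fears the wolf .")
tokens = text.split()
vocab = sorted(set(tokens))
vectors = context_vectors(tokens, vocab)
weights = train_som([vectors[w] for w in vocab])
word_map = {w: best_unit(weights, vectors[w]) for w in vocab}

for word, position in sorted(word_map.items(), key=lambda item: item[1]):
    print(position, word)
```

With so little text the resulting layout is only suggestive; the point is the pipeline from raw text to context vectors to map positions.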
Map of Finnish Science
Chemistry
Physics and
engineering
Biosciences
Medicine
Culture and
society
A fully automated process from terminology extraction (Likey) to
semantic space construction (SOM) without any manually constructed resources.
Further opportunities
● Time series analysis (historical developments)
– Ordering a collection of texts
– Analysis of narrative structures
● Social Network Analysis
● Sentiment analysis
● Named Entity Recognition
(people, places, organisations)
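One of these opportunities, time series analysis, can be sketched as follows: order a dated collection of texts and follow the relative frequency of a term over time. The documents, years and the function name below are invented for illustration, and whitespace tokenization is a deliberate simplification.

```python
from collections import Counter

def relative_frequency_by_year(documents, term):
    """Return {year: share of tokens equal to `term`}, ordered by year."""
    hits = Counter()
    totals = Counter()
    for year, text in documents:
        tokens = text.lower().split()
        totals[year] += len(tokens)
        hits[year] += tokens.count(term)
    return {year: hits[year] / totals[year] for year in sorted(totals) if totals[year]}

# Invented (year, text) pairs standing in for a dated collection.
documents = [
    (1870, "the railway reached the town"),
    (1850, "the road to the town was long"),
    (1870, "a new railway and a new station"),
    (1900, "the railway and the telephone"),
]
series = relative_frequency_by_year(documents, "railway")

for year, share in series.items():
    print(year, round(share, 3))
```

Using relative rather than absolute counts keeps years with many texts from dominating the picture.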
Analysis of metadata
● Study of the overall structures in a collection
● Important sources include author, year of
publication, place, etc.
● Can be analyzed in itself or in combination
with the full text data
● Study of the quality of the metadata
– Variation
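Variation in metadata can be studied, for example, by grouping author strings that probably refer to the same person. The sketch below is one possible heuristic, not a method from the presentation: it reduces each name to a "surname, first initial" key. The records and function names are invented, and the key can merge different authors who share a surname and an initial, so the groups are candidates for manual review rather than final answers.

```python
import re
import unicodedata
from collections import defaultdict

def normalize_name(name):
    """Reduce an author string to a comparison key of the form 'surname, initial'."""
    name = unicodedata.normalize("NFKD", name)
    name = "".join(ch for ch in name if not unicodedata.combining(ch))  # strip diacritics
    name = re.sub(r"[^\w\s,]", "", name).lower().strip()  # drop punctuation except commas
    if "," in name:
        surname, given = [part.strip() for part in name.split(",", 1)]
    else:
        parts = name.split()
        if not parts:
            return ""
        surname, given = parts[-1], " ".join(parts[:-1])
    return f"{surname}, {given[:1]}"

def group_variants(records):
    """Return the normalized keys that are written in more than one way."""
    groups = defaultdict(set)
    for record in records:
        groups[normalize_name(record["author"])].add(record["author"])
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}

# Invented catalogue records for illustration.
records = [
    {"author": "Kivi, Aleksis", "year": 1870},
    {"author": "Aleksis Kivi", "year": 1870},
    {"author": "A. Kivi", "year": 1864},
    {"author": "Canth, Minna", "year": 1885},
]
variants = group_variants(records)
print(variants)
```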
Hybrid methods:
qualitative + quantitative
● Qualitative interpretation of quantitative
analysis results
● Quantitative analysis of qualitative
interpretations
● Parallel analysis with qualitative methods (e.g.
grounded theory) and quantitative methods
(e.g. unsupervised learning)
● Quantitative analysis of human subjective and
contextual understanding and expression
Grounded Intersubjective
Concept Analysis
● A method developed to model how language is
understood in context and with some degree
of individuality
● Computational approaches often assume a
shared epistemology; here we are interested
in the differences in human interpretation
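The interest in differences between interpreters can be illustrated with a toy data structure. The sketch below is not the GICA procedure itself; it only assumes that each subject's reading of one word is recorded separately for a set of contexts, and then measures how far apart the subjects are. The subject labels, context names, scores and function names are all invented.

```python
import math
from itertools import combinations

# ratings[subject][context]: how strongly the subject associates the word
# "health" with the context, on an assumed scale from 0 to 1.
ratings = {
    "reader_a": {"exercise": 0.9, "insurance": 0.2, "nutrition": 0.8},
    "reader_b": {"exercise": 0.8, "insurance": 0.3, "nutrition": 0.9},
    "reader_c": {"exercise": 0.1, "insurance": 0.9, "nutrition": 0.2},
}

def distance(a, b):
    """Euclidean distance between two subjects over the contexts both have rated."""
    shared = sorted(set(a) & set(b))
    return math.sqrt(sum((a[c] - b[c]) ** 2 for c in shared))

def pairwise_distances(ratings):
    """Return {(subject, subject): distance} for every pair of subjects."""
    return {(s, t): distance(ratings[s], ratings[t])
            for s, t in combinations(sorted(ratings), 2)}

distances = pairwise_distances(ratings)
closest = min(distances, key=distances.get)

for pair, value in sorted(distances.items(), key=lambda item: item[1]):
    print(pair, round(value, 3))
```

Keeping the subject dimension in the data, instead of averaging it away, is what makes the differences in interpretation visible at all.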
GICA analysis of the word health
in State of the Union Addresses
Thank you very much!

2024.06.01 Introducing a competency framework for languag learning materials ...
 
Embracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic ImperativeEmbracing GenAI - A Strategic Imperative
Embracing GenAI - A Strategic Imperative
 
How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...
 
Acetabularia Information For Class 9 .docx
Acetabularia Information For Class 9  .docxAcetabularia Information For Class 9  .docx
Acetabularia Information For Class 9 .docx
 
The Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptxThe Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptx
 
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdfUnit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdf
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
 
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
 
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
 

Timo Honkela: Research interests in text and metadata mining of literature

  • 1. Timo Honkela, A Seminar on Digital Publishing and Research. 3 Dec 2015
    Research interests in text and metadata mining of literature
    timo.honkela@helsinki.fi
  • 2. Inspirational context: classical literature
    The presentation mostly covers general methods. We can discuss how and to what extent they can be used in this context.
  • 4. Humanities and “compunities”
    ● Understanding and analysing cultural artefacts such as novels or poems requires human experience of the world and dedication to the relevant context
    ● Computers are useful as tireless “distant reading tools” that can, for example, count instances of linguistic expressions, their contextual relations and their relation to given categories, or parse the structure of writings at different levels of abstraction
  • 5. Digital humanities
    ● A research area called Digital Humanities usually combines (1) research questions stemming from the humanities and social sciences, (2) their data represented in digital form, and (3) quantitative computational analysis methods (statistics, machine learning) applied to the data to address those questions, along with qualitative research methods
    ● For the computational analysis, large collections of data (“big data”) can provide certain benefits, but small data sets can also be analyzed in a similar manner
  • 6. Text and data mining
    ● Data mining refers to the computer-based analysis of data collections in order to find interesting or useful patterns, relations or structures in the data
    ● Data mining is often applied to numerical data, but structured data can also be used
    ● Text mining refers to data mining applied specifically to texts
  • 7. Text mining
    ● Analysis of text documents at different levels of abstraction
      – Word segments (morphology)
      – Lexicon (words, terms, phrases, names, etc.)
      – Syntax
      – Semantics
      – Pragmatics (computationally challenging)
    ● Context!
  • 8. Classical example: Learning meaning from context
    Maps of words in Grimm fairy tales (Honkela, Pulkki & Kohonen 1995)
    Automated learning of word relations using a self-organizing map on text context data
  • 9. Map of Finnish Science
    Labels on the map: Chemistry; Physics and engineering; Biosciences; Medicine; Culture and society
    A fully automated process from terminology extraction (Likey) to semantic space construction (SOM) without any manually constructed resources.
  • 10. Further opportunities
    ● Time series analysis (historical developments)
      – Ordering a collection of texts
      – Analysis of narrative structures
    ● Social Network Analysis
    ● Sentiment analysis
    ● Named Entity Recognition (people, places, organisations)
  • 11. Analysis of metadata
    ● Study of the overall structures in a collection
    ● Important sources include author, year of publication, place, etc.
    ● Can be analyzed on its own or in combination with the full text data
    ● Study of the quality of the metadata
      – Variation
  • 12. Hybrid methods: qualitative + quantitative
    ● Qualitative interpretation of quantitative analysis results
    ● Quantitative analysis of qualitative interpretations
    ● Parallel analysis with qualitative methods (e.g. grounded theory) and quantitative methods (e.g. unsupervised learning)
    ● Quantitative analysis of human subjective and contextual understanding and expression
  • 13. Grounded Intersubjective Concept Analysis
    ● A method developed to model how language is understood in context and with some degree of individuality
    ● Computational approaches often assume a shared epistemology; here we are interested in the differences in human interpretation
  • 14. GICA analysis of the word “health” in State of the Union Addresses
  • 15. Thank you very much!
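
The slides mention counting instances of linguistic expressions and their contextual relations, and learning word relations from text context data. The following is a minimal sketch of that general idea, not the method used in the presentation: it counts word frequencies, builds a context-count vector for each word from its neighbours, and compares words by cosine similarity. Vectors of this kind could in principle serve as input to a self-organizing map, but no SOM is trained here. The toy corpus, the function names and the `window` parameter are assumptions made for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def context_vectors(tokens, window=1):
    """Count, for every word, the words occurring within `window` positions of it."""
    vectors = defaultdict(Counter)
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                vectors[word][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy corpus, invented for illustration only (hypothetical data).
text = "the king rode home . the queen rode home . the frog sat still ."
tokens = tokenize(text)
word_counts = Counter(tokens)
vectors = context_vectors(tokens, window=1)

print(word_counts.most_common(3))
print(cosine(vectors["king"], vectors["queen"]))
print(cosine(vectors["king"], vectors["sat"]))
```

In this toy corpus "king" and "queen" occur between the same neighbours, so their context vectors point in the same direction, while "king" and "sat" share no neighbours. On real texts a wider window and some weighting of raw counts would probably be needed before the similarities become meaningful.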