Thesaurus-Based Indexing of Research
Data in the Social Sciences
Opportunities and Difficulties
of Internationalization Efforts
Katrin Baum, Dipl.-Bibl.
Dr. Andreas Oskar Kempf, M.A. (LIS)
GESIS – Leibniz-Institute for the Social Sciences
Cologne, May 28 – 31 May │ Baum, Kempf │ IASSIST 2013 │ Thesaurus-Based Indexing of Research Data
Contents
1. Current Trends and Demands in Describing and Cataloguing Research
Data
2. Subject Indexing of Research Data in the Social Sciences – Present
Situation in Europe
3. Thesauri in Subject Indexing
4. Recommended Indexing Model
5. Retrieval Model
6. Practical Aspects
1. Current Trends and Demands
in Describing and Cataloguing Research Data
Increasing internationalization and standardization efforts:
- to enable and facilitate data exchange
- to enable and facilitate integrated retrieval across distributed information systems
In the social sciences:
- DDI (e.g. metadata specification, controlled vocabularies)
- Commonly used systems for subject indexing (e.g. ELSST, CESSDA Topic Classification)
- …
2. Subject Indexing of Research Data in the Social
Sciences – Present Situation in Europe (1/5)
CESSDA (Council of European Social Science Data Archives):
- Members = data archives and other organisations all across Europe which archive and provide social science data for secondary use
- Provides access to 25,000 data collections, plus 1,000 new data collections every year
- Develops and maintains the European Language Social Science Thesaurus (ELSST) and the CESSDA Topic Classification
- CESSDA catalogue: allows searching the data collections of member organisations, e.g. by topic or by keyword
2. Subject Indexing of Research Data in the Social
Sciences in Europe – Present Situation (2/5)
2. Subject Indexing of Research Data in the Social
Sciences in Europe – Present Situation (3/5)
European Language Social Science Thesaurus (ELSST):
- Multilingual thesaurus for the social sciences (translated into English, Danish, Finnish, French, German, Greek, Norwegian, Spanish and Swedish)
- Based on the HASSET Thesaurus of UKDA
- Further developed by CESSDA members
- Planned: annual release of new version (latest version: 3/2013)
- Contains about 3,300 internationally applicable concepts extracted from HASSET
- Allows for local extensions of concepts
- Used for subject indexing of research data by CESSDA members
2. Subject Indexing of Research Data in the Social
Sciences in Europe – Present Situation (4/5)
2. Subject Indexing of Research Data in the Social Sciences
in Europe – Present Situation (5/5)
But:
- No coherent indexing practice throughout the participating archives due to a lack of a binding indexing policy
- Limited representation of fine-grained national / local issues (e.g. historical, juridical, religious and political aspects, forms of national organizations, educational system, collection-specific aspects …)
- Retrieval limited to internationally applicable concepts
3. Thesauri in Subject Indexing (1/3)
Some general findings on thesauri:
- Scope and content of each thesaurus are tightly connected to a specific collection => scope and content of thesauri of the same domain can differ
- Different levels of abstraction / specificity
- Different perspectives / classification aspects can lead to different semantic relations
3.1 Thesauri in Subject Indexing - Internationally
usable Thesauri (2/3)
An internationally usable thesaurus has to:
- represent concepts that exist in any language
- display these concepts in a hierarchical / semantic structure that fits all languages
- be free of any bias
- be multilingual
But:
- Fine-grained local issues cannot be displayed
- Retrieval limited to internationally applicable concepts
3.2 Thesauri in Subject Indexing - Local Thesauri (3/3)
Exclusive use of a local indexing system:
- Represents scope of local collection
- Respects local aspects
- Allows for more precise indexing
- Easier to maintain
- Monolingual or multilingual access to local collection
But:
- No access to dispersed collections that are indexed with different terminological resources
4. Recommended Indexing Model (1/3)
= Aggregate of local thesauri with common, internationally applicable core concepts
Core:
- Contains concepts that exist in any language
- Hierarchical structure fits all languages
- Free of bias
- Concepts that are already part of the local systems can be mapped to concepts of core system
- Concepts that are still missing in local systems can be added
4. Recommended Indexing Model (2/3)
[Diagram: ELSST (CESSDA Catalogue), TheSoz (GESIS) and HASSET (UKDA)]
Universal Core Indexing System:
- contains central concepts which exist in any language (e.g. SECONDARY SCHOOLS)
- contains central concepts which already exist in local indexing systems (e.g. WEITERFÜHRENDE SCHULEN)
Local Indexing System:
- contains local specificities (e.g. GYMNASIUM)
- contains collection-specific concepts (e.g. NORDRHEIN-WESTFALEN)
4. Recommended Indexing Model (3/3)
Thesaurus Cross-Concordances
Linkage between International Core and Local Indexing System
ELSST (D, DK, E, FIN, F, GB, GR, N, S) │ Relation │ TheSoz (D, GB, F)
SECONDARY SCHOOLS │ = │ WEITERFÜHRENDE SCHULE / SECONDARY SCHOOL / ÉCOLE SECONDAIRE
SECONDARY SCHOOLS │ > │ GYMNASIUM / SECONDARY SCHOOL (Gymnasium) / GYMNASE
SECONDARY SCHOOLS │ > │ REALSCHULE / INTERMEDIATE SCHOOL / ÉCOLE SECONDAIRE PRATIQUE
SECONDARY SCHOOLS │ > │ HAUPTSCHULE / SECONDARY MODERN SCHOOL / ÉCOLE SECONDAIRE OBLIGATOIRE
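As an illustration only, the cross-concordance on this slide could be held as a list of (core concept, relation, local concept) triples. The sketch below is in Python; the names `CONCORDANCE` and `local_terms` are hypothetical and not part of ELSST, TheSoz or any CESSDA software. The relation symbols "=" (equivalent) and ">" (core concept is broader than the local concept) are taken from the slide.

```python
# Hypothetical in-memory cross-concordance: (ELSST core concept, relation, TheSoz concept).
# "=" marks an equivalent concept, ">" marks a narrower local concept.
CONCORDANCE = [
    ("SECONDARY SCHOOLS", "=", "WEITERFÜHRENDE SCHULE"),
    ("SECONDARY SCHOOLS", ">", "GYMNASIUM"),
    ("SECONDARY SCHOOLS", ">", "REALSCHULE"),
    ("SECONDARY SCHOOLS", ">", "HAUPTSCHULE"),
]


def local_terms(core_concept, relation=None):
    """Return local concepts linked to a core concept, optionally filtered by relation."""
    return [
        local
        for core, rel, local in CONCORDANCE
        if core == core_concept and (relation is None or rel == relation)
    ]


equivalents = local_terms("SECONDARY SCHOOLS", "=")
narrower = local_terms("SECONDARY SCHOOLS", ">")
```

Keeping the relation as an explicit field means data already indexed with the local system stays untouched; only the mapping table has to be maintained.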
5. Retrieval Model
User queries: "schools", "Schulen", "écoles", "colegios", "koulut", "skole", "ΣΧΟΛΕΙΑ", "skola", "skoler"
Integrated Retrieval System (e.g. CESSDA Catalogue)
International Indexing System: ELSST
Preferred Term: SCHOOLS
Narrower Terms:
- SECONDARY SCHOOLS
- WEITERFÜHRENDE SCHULE
- …
=
Local Indexing System: TheSoz
- SECONDARY SCHOOLS
- WEITERFÜHRENDE SCHULE
Narrower Terms:
- SECONDARY SCHOOL (GYMNASIUM): GYMNASIUM
- INTERMEDIATE SCHOOL: REALSCHULE
- SECONDARY MODERN SCHOOL: HAUPTSCHULE
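The retrieval flow on this slide can be sketched roughly as follows: a query term in any language resolves to the core preferred term, which is then expanded through the core hierarchy and the concordance into the local system. This is a simplified Python sketch under assumptions; the dictionaries and the function `expand_query` are hypothetical and do not describe how the CESSDA Catalogue is implemented.

```python
# Hypothetical label index: query labels from the slide, all resolving to one core preferred term.
LABELS = {
    label.casefold(): "SCHOOLS"
    for label in ["schools", "Schulen", "écoles", "colegios", "koulut",
                  "skole", "ΣΧΟΛΕΙΑ", "skola", "skoler"]
}

# Assumed hierarchy and concordance, reduced to the example terms on the slide.
CORE_NARROWER = {"SCHOOLS": ["SECONDARY SCHOOLS"]}
LOCAL_EQUIVALENT = {"SECONDARY SCHOOLS": ["WEITERFÜHRENDE SCHULE"]}
LOCAL_NARROWER = {"WEITERFÜHRENDE SCHULE": ["GYMNASIUM", "REALSCHULE", "HAUPTSCHULE"]}


def expand_query(query):
    """Expand a user query into core and local index terms; unknown queries yield []."""
    preferred = LABELS.get(query.strip().casefold())
    if preferred is None:
        return []
    terms = [preferred]
    for core in CORE_NARROWER.get(preferred, []):
        terms.append(core)
        for local in LOCAL_EQUIVALENT.get(core, []):
            terms.append(local)
            # Local narrower terms carry the fine-grained national concepts.
            terms.extend(LOCAL_NARROWER.get(local, []))
    return terms


expanded = expand_query("Schulen")
```

With this kind of expansion the user only needs to know one vocabulary, while records indexed with local terms such as GYMNASIUM remain reachable.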
6. Practical Aspects
- Need for binding indexing guidelines for core terms
- Data already indexed with local system remain useful
- User only needs to know one thesaurus
- Local system represents local collection
- Indexing with local system guarantees a more precise indexing and respects local aspects
- Local systems are easier to maintain
Thank you
for your attention.
Contact
Katrin Baum
GESIS – Leibniz-Institute for the Social Sciences
katrin.baum@gesis.org
Dr. Andreas Oskar Kempf
GESIS – Leibniz-Institute for the Social Sciences
andreas.kempf@gesis.org
www.gesis.org
Evaluating the top large language models.pdf
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 

Baum, Kempf: Thesaurus based indexing

  • 1. Thesaurus-Based Indexing of Research Data in the Social Sciences: Opportunities and Difficulties of Internationalization Efforts
    Katrin Baum, Dipl.-Bibl.
    Dr. Andreas Oskar Kempf, M.A. (LIS)
    GESIS – Leibniz-Institute for the Social Sciences
    Cologne, May 28 – 31 May │ Baum, Kempf │ IASSIST 2013 │ Thesaurus-Based Indexing of Research Data
  • 2. Contents
      1. Current Trends and Demands in Describing and Cataloguing Research Data
      2. Subject Indexing of Research Data in the Social Sciences – Present Situation in Europe
      3. Thesauri in Subject Indexing
      4. Recommended Indexing Model
      5. Retrieval Model
      6. Practical Aspects
  • 3. 1. Current Trends and Demands in Describing and Cataloguing Research Data
    Increasing internationalization and standardization efforts:
      - to enable and facilitate data exchange
      - to enable and facilitate integrated retrieval across distributed information systems
    In the social sciences:
      - DDI (e.g. metadata specification, controlled vocabularies)
      - Commonly used systems for subject indexing (e.g. ELSST, CESSDA Topic Classification)
      - …
  • 4. 2. Subject Indexing of Research Data in the Social Sciences – Present Situation in Europe (1/5)
    CESSDA (Council of European Social Science Data Archives):
      - Members = data archives and other organisations all across Europe which archive and provide social science data for secondary use
      - Provides access to 25,000 data collections + 1,000 data collections every year
      - Development and maintenance of European Language Social Science Thesaurus (ELSST) and CESSDA Topic Classification
      - CESSDA catalogue: allows search in data collections of member organisations, e.g. search by topic or search by keyword
  • 5. 2. Subject Indexing of Research Data in the Social Sciences in Europe – Present Situation (2/5)
  • 6. 2. Subject Indexing of Research Data in the Social Sciences in Europe – Present Situation (3/5)
    European Language Social Science Thesaurus (ELSST):
      - Multilingual thesaurus for the social sciences (translated into English, Danish, Finnish, French, German, Greek, Norwegian, Spanish and Swedish)
      - Based on the HASSET Thesaurus of UKDA
      - Further developed by CESSDA members
      - Planned: annual release of new version (latest version: 3/2013)
      - Contains about 3,300 internationally applicable concepts extracted from HASSET
      - Allows for local extensions of concepts
      - Used for subject indexing of research data by CESSDA members
  • 7. 2. Subject Indexing of Research Data in the Social Sciences in Europe – Present Situation (4/5)
  • 8. 2. Subject Indexing of Research Data in the Social Sciences in Europe – Present Situation (5/5)
    But:
      - No coherent indexing practice throughout the participating archives due to a lack of a binding indexing policy
      - Limited representation of fine-grained national / local issues (e.g. historical, juridical, religious and political aspects, forms of national organizations, educational system, collection-specific aspects …)
      - Retrieval limited to internationally applicable concepts
  • 9. 3. Thesauri in Subject Indexing (1/3)
    Some general findings on thesauri:
      - Scope and content of each thesaurus is tightly connected to a specific collection => scope and content of thesauri of the same domain can differ
      - Different levels of abstraction / specificity
      - Different perspectives / classification aspects can lead to different semantic relations
  • 10. 3.1 Thesauri in Subject Indexing - Internationally usable Thesauri (2/3)
    Internationally usable thesaurus has to:
      - represent concepts that exist in any language
      - display these concepts in a hierarchical / semantic structure that fits all languages
      - be free of any bias
      - be multilingual
    But:
      - Fine-grained local issues cannot be displayed
      - Retrieval limited to internationally applicable concepts
  • 11. 3.2 Thesauri in Subject Indexing - Local Thesauri (3/3)
    Exclusive use of a local indexing system:
      - Represents scope of local collection
      - Respects local aspects
      - Allows for more precise indexing
      - Easier to maintain
      - Monolingual or multilingual access to local collection
    But:
      - No access to dispersed collections that are indexed with different terminological resources
  • 12. 4. Recommended Indexing Model (1/3)
    = Aggregate of local thesauri with common, internationally applicable core concepts
    Core:
      - Contains concepts that exist in any language
      - Hierarchical structure fits all languages
      - Free of bias
      - Concepts that are already part of the local systems can be mapped to concepts of core system
      - Concepts that are still missing in local systems can be added
  • 13. 4. Recommended Indexing Model (2/3)
    [Diagram] Thesauri shown: ELSST (CESSDA CATALOGUE), TheSoz (GESIS), HASSET (UKDA)
    Universal Core Indexing System:
      - contains central concepts which exist in any language (e.g. SECONDARY SCHOOLS)
      - contains central concepts which already exist in local indexing systems (e.g. WEITERFÜHRENDE SCHULEN)
    Local Indexing System:
      - contains local specificities (e.g. GYMNASIUM)
      - contains collection-specific concepts (e.g. NORDRHEIN-WESTFALEN)
  • 14. 4. Recommended Indexing Model (3/3)
    Thesaurus Cross-Concordances: Linkage between International Core and Local Indexing System
    ELSST (D, DK, E, FIN, F, GB, GR, N, S) | Relation | TheSoz (D, GB, F)
    SECONDARY SCHOOLS | = | WEITERFÜHRENDE SCHULE / SECONDARY SCHOOL / ÉCOLE SECONDAIRE
    SECONDARY SCHOOLS | > | GYMNASIUM / SECONDARY SCHOOL (Gymnasium) / GYMNASE
    SECONDARY SCHOOLS | > | REALSCHULE / INTERMEDIATE SCHOOL / ÉCOLE SECONDAIRE PRATIQUE
    SECONDARY SCHOOLS | > | HAUPTSCHULE / SECONDARY MODERN SCHOOL / ÉCOLE SECONDAIRE OBLIGATOIRE
  • 15. 5. Retrieval Model
    Query terms: „schools“, „Schulen“, „écoles“, „colegios“, „koulut“, „skole“, „ΣΧΟΛΕΙΑ“, „skola“, „skoler“
    Integrated Retrieval System (e.g. CESSDA Catalogue)
    International Indexing System (ELSST):
      - Preferred Term: SCHOOLS
      - Narrower Terms: SECONDARY SCHOOLS, …
    Relation: ELSST SECONDARY SCHOOLS = TheSoz WEITERFÜHRENDE SCHULE
    Local Indexing System (TheSoz), Narrower Terms of WEITERFÜHRENDE SCHULE:
      - SECONDARY SCHOOL (GYMNASIUM) - GYMNASIUM
      - INTERMEDIATE SCHOOL - REALSCHULE
      - SECONDARY MODERN SCHOOL - HAUPTSCHULE
  • 16. 6. Practical Aspects
      - Need for binding indexing guidelines for core terms
      - Data already indexed with local system remain useful
      - User only needs to know one thesaurus
      - Local system represents local collection
      - Indexing with local system guarantees a more precise indexing and respects local aspects
      - Local systems are easier to maintain
  • 17. Thank you for your attention.
    Contact:
      - Katrin Baum, GESIS – Leibniz-Institute for the Social Sciences, katrin.baum@gesis.org
      - Dr. Andreas Oskar Kempf, GESIS – Leibniz-Institute for the Social Sciences, andreas.kempf@gesis.org
    www.gesis.org
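
Illustrative sketch (not part of the original slides): the cross-concordances of slide 14 and the retrieval model of slide 15 could be modelled roughly as below. This is a minimal sketch under stated assumptions, not the CESSDA or GESIS implementation. The concept labels and relation symbols come from the slides; the names `Concordance`, `ENTRY_TERMS`, `CORE_NARROWER` and `expand_query` are hypothetical, and the entry vocabulary covers only a subset of the query terms shown on slide 15.

```python
from __future__ import annotations

from dataclasses import dataclass

# Relation symbols as printed on slide 14.
EQUIVALENT = "="  # core concept and local concept match
BROADER = ">"     # core concept is broader than the local concept


@dataclass(frozen=True)
class Concordance:
    """One cross-concordance row (hypothetical structure for illustration)."""
    core: str      # ELSST core concept
    relation: str  # EQUIVALENT or BROADER
    local: str     # TheSoz local concept


# Rows taken from the slide 14 table (German TheSoz labels only).
CONCORDANCES = [
    Concordance("SECONDARY SCHOOLS", EQUIVALENT, "WEITERFÜHRENDE SCHULE"),
    Concordance("SECONDARY SCHOOLS", BROADER, "GYMNASIUM"),
    Concordance("SECONDARY SCHOOLS", BROADER, "REALSCHULE"),
    Concordance("SECONDARY SCHOOLS", BROADER, "HAUPTSCHULE"),
]

# Core hierarchy from slide 15: preferred term -> narrower core terms.
CORE_NARROWER = {"SCHOOLS": ["SECONDARY SCHOOLS"]}

# Assumed entry vocabulary: lower-cased user query -> core preferred term.
ENTRY_TERMS = {
    "schools": "SCHOOLS",
    "schulen": "SCHOOLS",
    "écoles": "SCHOOLS",
    "colegios": "SCHOOLS",
    "koulut": "SCHOOLS",
}


def expand_query(query: str) -> list[str]:
    """Expand a user query into core concepts plus linked local concepts."""
    preferred = ENTRY_TERMS.get(query.strip().lower())
    if preferred is None:
        return []
    expanded: list[str] = []
    pending = [preferred]
    while pending:
        concept = pending.pop(0)
        if concept in expanded:
            continue
        expanded.append(concept)
        # Walk down the core hierarchy.
        pending.extend(CORE_NARROWER.get(concept, []))
        # Follow the concordances into the local indexing system.
        for row in CONCORDANCES:
            if row.core == concept and row.local not in expanded:
                expanded.append(row.local)
    return expanded


if __name__ == "__main__":
    print(expand_query("Schulen"))
```

In this sketch a query in any covered language resolves to the core preferred term first, so the user only needs one entry point, while records indexed with local concepts such as GYMNASIUM remain retrievable through the concordance rows.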