Thesaurus-Based Indexing of Research
Data in the Social Sciences
Opportunities and Difficulties
of Internationalization Efforts
Katrin Baum, Dipl.-Bibl.
Dr. Andreas Oskar Kempf, M.A. (LIS)
GESIS – Leibniz-Institute for the Social Sciences
Cologne, May 28 – 31 May │ Baum, Kempf │ IASSIST 2013 │ Thesaurus-Based Indexing of Research Data
Contents
1. Current Trends and Demands in Describing and Cataloguing Research
Data
2. Subject Indexing of Research Data in the Social Sciences – Present
Situation in Europe
3. Thesauri in Subject Indexing
4. Recommended Indexing Model
5. Retrieval Model
6. Practical Aspects
1. Current Trends and Demands
in Describing and Cataloguing Research Data
Increasing internationalization and standardization efforts:
• to enable and facilitate data exchange
• to enable and facilitate integrated retrieval across distributed information systems
In the social sciences:
• DDI (e.g. metadata specification, controlled vocabularies)
• Commonly used systems for subject indexing (e.g. ELSST, CESSDA Topic Classification)
• …
2. Subject Indexing of Research Data in the Social
Sciences – Present Situation in Europe (1/5)
CESSDA (Council of European Social Science Data Archives):
• Members = data archives and other organisations all across Europe which archive and provide social science data for secondary use
• Provides access to 25,000 data collections, plus 1,000 new data collections every year
• Development and maintenance of the European Language Social Science Thesaurus (ELSST) and the CESSDA Topic Classification
• CESSDA catalogue: allows search in data collections of member organisations, e.g. search by topic or search by keyword
2. Subject Indexing of Research Data in the Social
Sciences in Europe – Present Situation (2/5)
2. Subject Indexing of Research Data in the Social
Sciences in Europe – Present Situation (3/5)
European Language Social Science Thesaurus (ELSST):
• Multilingual thesaurus for the social sciences (translated into English, Danish, Finnish, French, German, Greek, Norwegian, Spanish and Swedish)
• Based on the HASSET Thesaurus of UKDA
• Further developed by CESSDA members
• Planned: annual release of new version (latest version: 3/2013)
• Contains about 3,300 internationally applicable concepts extracted from HASSET
• Allows for local extensions of concepts
• Used for subject indexing of research data by CESSDA members
2. Subject Indexing of Research Data in the Social
Sciences in Europe – Present Situation (4/5)
2. Subject Indexing of Research Data in the Social Sciences
in Europe – Present Situation (5/5)
But:
• No coherent indexing practice throughout the participating archives due to a lack of a binding indexing policy
• Limited representation of fine-grained national / local issues (e.g. historical, juridical, religious and political aspects, forms of national organizations, educational system, collection-specific aspects …)
• Retrieval limited to internationally applicable concepts
3. Thesauri in Subject Indexing (1/3)
Some general findings on thesauri:
• Scope and content of each thesaurus are tightly connected to a specific collection => scope and content of thesauri of the same domain can differ
• Different levels of abstraction / specificity
• Different perspectives / classification aspects can lead to different semantic relations
3.1 Thesauri in Subject Indexing - Internationally
usable Thesauri (2/3)
An internationally usable thesaurus has to:
• represent concepts that exist in any language
• display these concepts in a hierarchical / semantic structure that fits all languages
• be free of any bias
• be multilingual
But:
• Fine-grained local issues cannot be displayed
• Retrieval limited to internationally applicable concepts
3.2 Thesauri in Subject Indexing - Local Thesauri (3/3)
Exclusive use of a local indexing system:
• Represents scope of local collection
• Respects local aspects
• Allows for more precise indexing
• Easier to maintain
• Monolingual or multilingual access to local collection
But:
• No access to dispersed collections that are indexed with different terminological resources
4. Recommended Indexing Model (1/3)
= Aggregate of local thesauri with common, internationally applicable core concepts
Core:
• Contains concepts that exist in any language
• Hierarchical structure fits all languages
• Free of bias
• Concepts that are already part of the local systems can be mapped to concepts of the core system
• Concepts that are still missing in local systems can be added
4. Recommended Indexing Model (2/3)
Universal Core Indexing System:
• contains central concepts which exist in any language (e.g. SECONDARY SCHOOLS)
• contains central concepts which already exist in local indexing systems (e.g. WEITERFÜHRENDE SCHULEN)
Local Indexing System:
• contains local specificities (e.g. GYMNASIUM)
• contains collection-specific concepts (e.g. NORDRHEIN-WESTFALEN)
Thesauri shown in the diagram: ELSST (CESSDA Catalogue), TheSoz (GESIS), HASSET (UKDA)
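As a rough illustration of the two-layer model, the following Python sketch represents one core concept and three local concepts that either map onto it, specialise it, or stand alone. The class and field names (`Concept`, `maps_to`, `broader`) and the three category labels are hypothetical and not taken from ELSST, TheSoz or HASSET; treating NORDRHEIN-WESTFALEN as a local concept without a core link is likewise an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Concept:
    """One thesaurus concept (hypothetical structure, for illustration only)."""
    label: str
    system: str                      # "core" or the name of a local thesaurus
    maps_to: Optional[str] = None    # label of an equivalent core concept
    broader: Optional[str] = None    # label of a core concept it specialises


# Core layer: concepts assumed to exist in any language.
core = {"SECONDARY SCHOOLS": Concept("SECONDARY SCHOOLS", "core")}

# Local layer: the example terms from the slide.
local = [
    Concept("WEITERFÜHRENDE SCHULEN", "TheSoz", maps_to="SECONDARY SCHOOLS"),
    Concept("GYMNASIUM", "TheSoz", broader="SECONDARY SCHOOLS"),
    Concept("NORDRHEIN-WESTFALEN", "TheSoz"),  # assumed to have no core link
]


def core_anchor(concept: Concept) -> Optional[str]:
    """Return the core concept a local concept is linked to, if any."""
    anchor = concept.maps_to or concept.broader
    return anchor if anchor in core else None


def classify(concept: Concept) -> str:
    """Describe how a local concept relates to the core layer."""
    if concept.maps_to in core:
        return "equivalent to core"
    if concept.broader in core:
        return "local specificity"
    return "collection-specific"
```

Keeping the link on the local concept, rather than on the core, reflects the idea that local systems can be extended without touching the shared core.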
4. Recommended Indexing Model (3/3)
Thesaurus Cross-Concordances

ELSST (D, DK, E, FIN, F, GB, GR, N, S) | Relation | TheSoz (D, GB, F)
SECONDARY SCHOOLS                      | =        | WEITERFÜHRENDE SCHULE / SECONDARY SCHOOL / ÉCOLE SECONDAIRE
SECONDARY SCHOOLS                      | >        | GYMNASIUM / SECONDARY SCHOOL (Gymnasium) / GYMNASE
SECONDARY SCHOOLS                      | >        | REALSCHULE / INTERMEDIATE SCHOOL / ÉCOLE SECONDAIRE PRATIQUE
SECONDARY SCHOOLS                      | >        | HAUPTSCHULE / SECONDARY MODERN SCHOOL / ÉCOLE SECONDAIRE OBLIGATOIRE

Linkage between International Core and Local Indexing System
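The cross-concordance rows can be held in a small lookup structure. The sketch below is a minimal illustration, not the format used by GESIS or CESSDA: the names `Relation`, `Mapping`, `thesoz_terms` and `elsst_term_for` are hypothetical, only the German TheSoz labels are included, and reading ">" as "the ELSST term is broader than the TheSoz term" is an assumption about the slide's notation.

```python
from enum import Enum
from typing import List, NamedTuple, Optional


class Relation(Enum):
    EQUIVALENT = "="   # both terms denote the same concept
    BROADER = ">"      # assumed reading: ELSST term is broader than TheSoz term


class Mapping(NamedTuple):
    elsst: str
    relation: Relation
    thesoz: str


# Rows from the slide, German TheSoz labels only.
CONCORDANCE = [
    Mapping("SECONDARY SCHOOLS", Relation.EQUIVALENT, "WEITERFÜHRENDE SCHULE"),
    Mapping("SECONDARY SCHOOLS", Relation.BROADER, "GYMNASIUM"),
    Mapping("SECONDARY SCHOOLS", Relation.BROADER, "REALSCHULE"),
    Mapping("SECONDARY SCHOOLS", Relation.BROADER, "HAUPTSCHULE"),
]


def thesoz_terms(elsst_term: str, relation: Relation) -> List[str]:
    """TheSoz terms linked to an ELSST term by the given relation."""
    return [m.thesoz for m in CONCORDANCE
            if m.elsst == elsst_term and m.relation is relation]


def elsst_term_for(thesoz_term: str) -> Optional[str]:
    """ELSST term that a TheSoz term is linked to, or None if unmapped."""
    for m in CONCORDANCE:
        if m.thesoz == thesoz_term:
            return m.elsst
    return None
```

Storing the relation type with each row lets a retrieval system decide whether to treat a local term as a synonym or as a narrower hit.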
5. Retrieval Model
Search terms:
"schools", "Schulen", "écoles", "colegios", "koulut", "skole", "ΣΧΟΛΕΙΑ", "skola", "skoler"
Integrated Retrieval System (e.g. CESSDA Catalogue)
International Indexing System: ELSST
Preferred Term: SCHOOLS
Narrower Terms:
- SECONDARY SCHOOLS
- WEITERFÜHRENDE SCHULE
- …
=
Local Indexing System: TheSoz
- SECONDARY SCHOOLS
- WEITERFÜHRENDE SCHULE
Narrower Terms:
> SECONDARY SCHOOL (GYMNASIUM)
- GYMNASIUM
> INTERMEDIATE SCHOOL
- REALSCHULE
> SECONDARY MODERN SCHOOL
- HAUPTSCHULE
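The retrieval flow on this slide can be approximated in a few lines of Python: a search term in any language resolves to the preferred term, which is then expanded through narrower terms and through links into the local system. This is only a sketch under stated assumptions. The dictionaries, the function names `expand` and `search`, the study identifiers and the index term LABOUR MARKET are all hypothetical, and exact string matching stands in for whatever matching a real catalogue would use.

```python
from typing import Dict, List, Set

# Search terms from the slide, all assumed to resolve to one preferred term.
ENTRY_TERMS: Dict[str, str] = {
    term: "SCHOOLS"
    for term in ["schools", "Schulen", "écoles", "colegios", "koulut",
                 "skole", "ΣΧΟΛΕΙΑ", "skola", "skoler"]
}

# Narrower terms in the international system (only the one named on the slide).
NARROWER: Dict[str, List[str]] = {"SCHOOLS": ["SECONDARY SCHOOLS"]}

# Links from international concepts into the local system (German labels).
LOCAL_LINKS: Dict[str, List[str]] = {
    "SECONDARY SCHOOLS": ["WEITERFÜHRENDE SCHULE", "GYMNASIUM",
                          "REALSCHULE", "HAUPTSCHULE"],
}

# Hypothetical catalogue: study id -> index terms assigned by an archive.
RECORDS: Dict[str, Set[str]] = {
    "study-A": {"GYMNASIUM"},            # indexed with a local term
    "study-B": {"SECONDARY SCHOOLS"},    # indexed with a core term
    "study-C": {"LABOUR MARKET"},        # unrelated, made-up term
}


def expand(query: str) -> Set[str]:
    """Expand a search term to the preferred term and everything below it."""
    preferred = ENTRY_TERMS.get(query)
    if preferred is None:
        return set()
    terms = {preferred}
    stack = [preferred]
    while stack:
        current = stack.pop()
        for child in NARROWER.get(current, []) + LOCAL_LINKS.get(current, []):
            if child not in terms:
                terms.add(child)
                stack.append(child)
    return terms


def search(query: str) -> List[str]:
    """Return ids of records indexed with any term in the expansion."""
    terms = expand(query)
    return sorted(rid for rid, index in RECORDS.items() if index & terms)
```

With this setup, a Finnish query such as `search("koulut")` finds both the study indexed with the core term and the one indexed only with the local term GYMNASIUM, which is the effect the model aims for.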
6. Practical Aspects
• Need for binding indexing guidelines for core terms
• Data already indexed with local system remain useful
• User only needs to know one thesaurus
• Local system represents local collection
• Indexing with local system guarantees a more precise indexing and respects local aspects
• Local systems are easier to maintain
Thank you
for your attention.
Contact
Katrin Baum
GESIS – Leibniz-Institute for the Social Sciences
katrin.baum@gesis.org
Dr. Andreas Oskar Kempf
GESIS – Leibniz-Institute for the Social Sciences
andreas.kempf@gesis.org
www.gesis.org