EVA/Minerva 2016
Integration and Retrieval of
Heterogeneous Archival Metadata
CONNECTING
COLLECTIONS
Kepa J. Rodriguez – Archives Yad Vashem
09/11/2016
Outline
●
Data integration in the first phase of the project
●
Our current integration approach
●
Retrieval of data using controlled vocabularies
●
Development of the EHRI controlled vocabularies
Data integration in the first phase of the project
●
Holding institutions delivered data in very different formats:
●
XML, text files, CSV, JSON, etc...
●
Ingestion into the portal was made case by case
●
We interpreted each institution's data model and mapped it to our model
●
Sometimes without help from the institution
●
Lots of data entered by hand
●
Process not sustainable: it cannot be repeated
●
No automatic updates are possible
●
If an institution updates content, data has to be updated by hand
●
Other problems: infrastructure, persistent identifiers, etc.
Proposal for the second phase of the project
● Data conversion
● Data publication and synchronization
● Data ingestion
Data conversion
●
Conversion tool: different data formats into EAD:
●
XML, JSON, CSV...
●
Generic transformation
●
Useful for a significant number of institutions
●
Reusable functions, such as mappings from specific fields of their export
format into EAD
●
Utilities to configure specific transformations
●
Validation of the output:
●
Machine validation: XML schema languages
●
Schematron, RELAX NG (RNG)
●
Human validation: HTML preview including mark-up
for validation errors
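A minimal sketch of what a generic transformation with a reusable field mapping could look like, assuming a CSV export. The column names (ref_code, title, extent) and the helper names are hypothetical and not taken from the EHRI conversion tool; validation of the output is not covered here.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical mapping from an institution's export columns to EAD <did> children.
FIELD_TO_EAD = {
    "ref_code": "unitid",
    "title": "unittitle",
    "extent": "physdesc",
}

def row_to_ead(row, level="file"):
    """Build a minimal <archdesc> element from one exported record."""
    archdesc = ET.Element("archdesc", {"level": level})
    did = ET.SubElement(archdesc, "did")
    for column, tag in FIELD_TO_EAD.items():
        value = (row.get(column) or "").strip()
        if value:  # skip empty cells instead of emitting empty elements
            ET.SubElement(did, tag).text = value
    return archdesc

def convert_csv(text):
    """Convert every row of a CSV export into an EAD fragment string."""
    reader = csv.DictReader(io.StringIO(text))
    return [ET.tostring(row_to_ead(row), encoding="unicode") for row in reader]

sample = "ref_code,title,extent\nM.49.E,Testimonies of Holocaust Survivors,6845 files\n"
fragments = convert_csv(sample)
print(fragments[0])
```

Keeping the mapping in a plain dictionary means an institution-specific transformation only needs a different table, not different code.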
EAD File sample (1)
<archdesc level="subgrp">
<did>
<unitid>M.49.E</unitid>
<unittitle encodinganalog="3.1.2">Testimonies of Holocaust Survivors collected by the
Central Jewish Historical Commission in Poland, 1944-1947</unittitle>
<physdesc encodinganalog="3.1.5">6845 files</physdesc>
<langmaterial>
<language langcode="deu" encodinganalog="3.4.3">German</language>
<language langcode="pol" encodinganalog="3.4.3">Polish</language>
<language langcode="yid" encodinganalog="3.4.3">Yiddish</language>
</langmaterial>
<repository>
<corpname>ארכיון יד ושם / Yad Vashem Archives</corpname>
</repository>
</did>
<scopecontent encodinganalog="3.3.1">
<p>The collection consists of approximately 7,200 testimonies collected by the
Centralna Żydowska Komisja Historyczna (Central Jewish Historical Committee) in
Poland during its active years, 1944-1947.
…..
as well as testimonies from survivors who fought in partisan units and survivors who
were in hiding.</p>
</scopecontent>
…....
EAD File sample (2)
…...
<originalsloc encodinganalog="3.5.1">
<p>ZYDOWSKI INSTYTUT HISTORYCZNY - ZIH, WARSZAWA, POLAND</p>
</originalsloc>
…...
<controlaccess>
<geogname>Poland</geogname>
<geogname>Warsaw</geogname>
</controlaccess>
<controlaccess>
<subject>Persecution of Jews</subject>
<subject>Testimonies, Biographies</subject>
<subject>Holocaust survivors</subject>
</controlaccess>
<controlaccess>
<corpname>Centralna Żydowska Komisja Historyczna</corpname>
</controlaccess>
</archdesc>
Data publication and synchronization
●
We plan to use two data publication protocols:
●
OAI-PMH: one of the first protocols for publication of data
●
Publication of data in different formats: Dublin Core (default), EAD,
etc.
●
OAI-PMH servers are not easy to implement or maintain for small
archives
●
But we want to implement a client for institutions that already use it
●
ResourceSync: a new protocol
●
Based on SiteMaps
●
Data can be published on the web page of the institution
●
Higher security
●
Use sitemaps to expose changes and updates
●
Only modified and new data will be transferred to the portal
●
Both are standard protocols of the Open Archives Initiative
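A rough sketch of how a harvester could read a sitemap-based change list and pick out only what changed since the last synchronization. The XML below is a simplified, made-up example in the style of a ResourceSync change list; the URLs, dates, and function name are assumptions, and the exact element layout should be checked against the ResourceSync specification.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}
STAMP = "%Y-%m-%dT%H:%M:%SZ"  # assumes UTC timestamps in exactly this form

# Made-up change list; URLs and dates are illustrative only.
CHANGE_LIST = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="changelist" from="2016-11-01T00:00:00Z"/>
  <url>
    <loc>https://archive.example.org/ead/M.49.E.xml</loc>
    <lastmod>2016-11-03T09:00:00Z</lastmod>
    <rs:md change="updated"/>
  </url>
  <url>
    <loc>https://archive.example.org/ead/O.3.xml</loc>
    <lastmod>2016-11-01T12:00:00Z</lastmod>
    <rs:md change="created"/>
  </url>
  <url>
    <loc>https://archive.example.org/ead/old.xml</loc>
    <lastmod>2016-11-04T08:30:00Z</lastmod>
    <rs:md change="deleted"/>
  </url>
</urlset>"""

def changes_since(change_list_xml, last_sync):
    """Return (urls_to_fetch, urls_to_delete) changed after last_sync."""
    cutoff = datetime.strptime(last_sync, STAMP)
    root = ET.fromstring(change_list_xml)
    to_fetch, to_delete = [], []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        md = url.find("rs:md", NS)
        if loc is None or lastmod is None or md is None:
            continue  # incomplete entry: skipped in this sketch
        if datetime.strptime(lastmod, STAMP) <= cutoff:
            continue  # already seen during the previous synchronization
        if md.get("change") == "deleted":
            to_delete.append(loc)
        else:  # "created" or "updated"
            to_fetch.append(loc)
    return to_fetch, to_delete

to_fetch, to_delete = changes_since(CHANGE_LIST, "2016-11-02T00:00:00Z")
print(to_fetch, to_delete)
```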
Data ingestion
●
After data is ingested into the portal, it will receive a
permanent URL:
●
Formal protocol is in progress
●
Necessary to publish our data in the Linked Open Data cloud
●
Updates: data will be overwritten
●
But the portal keeps the user generated data
●
But... is it enough for the user just to have all
information in a single infrastructure?
Data retrieval
●
The user needs to be able to retrieve information related to
selected topics, places, people, organizations, creators...
●
Regardless of which institution holds it
●
Regardless of the language in which the metadata is written
EHRI controlled vocabularies
●
EHRI Thesaurus
●
Concepts: hierarchy of concepts formalized in SKOS
●
A first set translated into 10 languages
●
Made by historians and content specialists
●
Authority lists:
●
Named entities or instances of the concepts
●
Proposed by historians and specialists: not really useful for indexing
and retrieval of data
●
During import many were added by hand to address the needs of the real
data
●
Domain specific authorities: Ghettos, Camps, Administrative Districts
●
Vocabularies created for applications in the portal:
●
Two research guides
●
Linked to the EHRI Thesaurus
Problems of the first approach of the project
●
A vocabulary built with knowledge about the Shoah can be
helpful to represent the history, but not necessarily the
documentation:
●
The compilation of an encyclopedia and the implementation of an
engine for cataloguing and retrieval are two very different things
and require different strategies and kinds of expertise.
●
The vocabularies should make it possible to retrieve the data that
actually exists:
●
Vocabularies should be able to describe the data, not only the
content, e.g. types of documents, physical format of the data...
●
A strategy to extend the datasets when new data brings new
needs has to be implemented.
The reality of the data
●
Different institutions use different systems to assign
keywords (or no system)
●
Keywords can have different relevance in different systems
●
In a National Archive “holocaust” can be a relevant keyword, but it
is not relevant for the EHRI portal.
●
The same keyword can have different meanings in different
knowledge bases
●
e.g. “labor” in one set of imported data corresponds to “forced
labor”, in another set to “trade unions”
●
Relevant information is often given as free text:
●
Natural Language Processing is needed to extract this
information, but within the project we can do this only at an
experimental level.
EHRI's data driven approach (1)
●
Extraction of access points from the EAD files during import
<controlaccess>
<geogname>Poland</geogname>
<geogname>Warsaw</geogname>
</controlaccess>
<controlaccess>
<subject>Persecution of Jews</subject>
<subject>Testimonies, Biographies</subject>
<subject>Holocaust survivors</subject>
</controlaccess>
<controlaccess>
<corpname>Centralna Żydowska Komisja Historyczna</corpname>
</controlaccess>
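The extraction step can be sketched in a few lines, assuming EAD without an XML namespace as in the sample above. The function name and the list of tags treated as access points are illustrative choices, not the portal's actual import code.

```python
import xml.etree.ElementTree as ET

# Fragment taken from the sample above, wrapped in <archdesc> so it parses alone.
EAD_FRAGMENT = """<archdesc level="subgrp">
  <controlaccess>
    <geogname>Poland</geogname>
    <geogname>Warsaw</geogname>
  </controlaccess>
  <controlaccess>
    <subject>Persecution of Jews</subject>
    <subject>Testimonies, Biographies</subject>
    <subject>Holocaust survivors</subject>
  </controlaccess>
  <controlaccess>
    <corpname>Centralna Żydowska Komisja Historyczna</corpname>
  </controlaccess>
</archdesc>"""

# Tags treated as access points in this sketch (an assumption).
ACCESS_POINT_TAGS = ("geogname", "subject", "corpname", "persname")

def extract_access_points(ead_xml):
    """Collect the text of access-point elements, grouped by tag name."""
    root = ET.fromstring(ead_xml)
    points = {tag: [] for tag in ACCESS_POINT_TAGS}
    for block in root.iter("controlaccess"):
        for child in block:
            if child.tag in points and child.text:
                points[child.tag].append(child.text.strip())
    return points

points = extract_access_points(EAD_FRAGMENT)
print(points)
```

Files that declare an EAD namespace would need namespace-qualified tag names instead of the bare ones used here.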
EHRI's data driven approach (2)
●
Persons, corporate bodies:
●
Check whether we have corresponding authority files
●
If we have: link the description unit with the corresponding authority
file
●
If we don't have: create a new authority file
●
Priority of EHRI: creators of archival collections
●
Places:
●
Link the places with the geographical database GeoNames
●
Problematic for historical places; some of them will be added as an extra
vocabulary.
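The "link if it exists, otherwise create" rule for persons and corporate bodies could be sketched as below. The class, the identifier scheme, and the very simple name normalization are all assumptions for illustration; real matching of names would need far more care, and the GeoNames lookup for places is not shown.

```python
import unicodedata

def normalize_name(name):
    """Case-fold and collapse whitespace so trivially different spellings match."""
    return " ".join(unicodedata.normalize("NFC", name).casefold().split())

class AuthorityRegistry:
    """In-memory stand-in for a store of authority files."""

    def __init__(self):
        self._by_key = {}
        self._next_id = 1

    def link(self, name, kind):
        """Return (authority_id, created) for an extracted access point."""
        key = (kind, normalize_name(name))
        if key in self._by_key:
            return self._by_key[key], False  # existing authority file: just link
        authority_id = f"auth-{self._next_id:06d}"  # hypothetical id scheme
        self._next_id += 1
        self._by_key[key] = authority_id
        return authority_id, True  # no match: a new authority file is created

registry = AuthorityRegistry()
first = registry.link("Centralna Żydowska Komisja Historyczna", "corpname")
again = registry.link("centralna  żydowska komisja historyczna", "corpname")
other = registry.link("Yad Vashem Archives", "corpname")
print(first, again, other)
```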
EHRI's data driven approach (3)
●
Concepts/terms: the most complicated case
●
Archives used very different strategies for concepts:
●
Some institutions compose terms using different rules
(or no rules)
●
Subject: “Jews--Persecution--France” (data of USHMM)
●
EHRI has an atomic approach
●
Subject: “Persecution of Jews”
●
Place: “France”
●
Steps to process concepts/terms:
●
Terms are normalized and de-duplicated
●
If there are equivalent terms in the thesaurus, we establish a link
●
If there are no equivalent terms, the concept goes on to further
analysis
●
If necessary, a board of experts will consider accommodating a new
concept in our concept hierarchy.
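The processing steps above can be sketched as follows. The mini-thesaurus, its concept ids, and the small place list are made up for the example. Splitting a compound heading only separates place parts from subject parts; turning "Jews" plus "Persecution" into the atomic concept "Persecution of Jews" would still need a mapping rule or human review.

```python
# Hypothetical mini-thesaurus: normalized label -> concept id (ids are made up).
THESAURUS = {
    "persecution of jews": "term-0001",
    "holocaust survivors": "term-0002",
}
# Stand-in for a gazetteer lookup such as GeoNames.
KNOWN_PLACES = {"france", "poland"}

def normalize(term):
    """Case-fold and collapse whitespace."""
    return " ".join(term.casefold().split())

def split_compound(heading):
    """Split a '--' delimited heading into (subject parts, place parts)."""
    parts = [p.strip() for p in heading.split("--") if p.strip()]
    subjects = [p for p in parts if normalize(p) not in KNOWN_PLACES]
    places = [p for p in parts if normalize(p) in KNOWN_PLACES]
    return subjects, places

def process_terms(raw_terms):
    """Return (linked, needs_review) after normalizing and de-duplicating."""
    linked, needs_review, seen = {}, [], set()
    for raw in raw_terms:
        key = normalize(raw)
        if not key or key in seen:
            continue  # empty or duplicate term
        seen.add(key)
        if key in THESAURUS:
            linked[raw.strip()] = THESAURUS[key]  # equivalent term: link it
        else:
            needs_review.append(raw.strip())  # goes on to further analysis
    return linked, needs_review

linked, needs_review = process_terms(
    ["Persecution of Jews", "persecution  of jews",
     "Holocaust survivors", "Testimonies, Biographies"]
)
subjects, places = split_compound("Jews--Persecution--France")
print(linked, needs_review, subjects, places)
```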
Ghettos and Concentration Camps
●
We are evaluating whether to start a Wikidata project for ghettos and
concentration camps
●
Strategy:
●
Extract information from the current thesaurus and alternative
sources
●
Encyclopedic knowledge
●
Data from project partners
●
Integration of all this data in the Wikidata platform
●
Enrichment with help of the community
●
Multilingual labels and non-controversial information
●
Finally, the data in Wikidata and in the portal should be
synchronized
NIOD Institute for War, Holocaust and Genocide Studies (NL)
CEGESOMA Centre for Historical Research and Documentation on War and Contemporary Society (BE)
Jewish Museum in Prague (CZ)
Center for Holocaust Studies at the Institute for Contemporary History in Munich (DE)
YAD VASHEM The Holocaust Martyrs’ and Heroes’ Remembrance Authority (IL)
United States Holocaust Memorial Museum (USA)
Bundesarchiv (DE)
The Wiener Library Institute for the Study of the Holocaust & Genocide (UK)
Holocaust Documentation Centre (SK)
Polish Center for Holocaust Research (PL)
The Jewish Museum of Greece (GR)
Jewish Historical Institute (PL)
King’s College London (UK)
Ontotext AD (BG)
Elie Wiesel National Institute for the Study of Holocaust in Romania (RO)
DANS Data Archiving and Networked Services (NL)
Shoah Memorial, Museum, Center for Contemporary Jewish Documentation (FR)
ITS International Tracing Service (DE)
Hungarian Jewish Archives (HU)
INRIA Institute for Research in Computer Science and Automation (FR)
Vilna Gaon State Jewish Museum (LT)
VWI Vienna Wiesenthal Institute for Holocaust Studies (AT)
Foundation Jewish Contemporary Documentation Center (IT)

Introduction to Nonprofit Accounting: The BasicsTechSoup
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104misteraugie
 
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-IIFood Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-IIShubhangi Sonawane
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxDenish Jangid
 

Dernier (20)

Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
General Principles of Intellectual Property: Concepts of Intellectual Proper...
General Principles of Intellectual Property: Concepts of Intellectual  Proper...General Principles of Intellectual Property: Concepts of Intellectual  Proper...
General Principles of Intellectual Property: Concepts of Intellectual Proper...
 
ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning ExhibitSociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Role Of Transgenic Animal In Target Validation-1.pptx
Role Of Transgenic Animal In Target Validation-1.pptxRole Of Transgenic Animal In Target Validation-1.pptx
Role Of Transgenic Animal In Target Validation-1.pptx
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-IIFood Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 

F2 kepa rodriguez_ehri_integration_retrieva_minerva_2016

  • 1. EVA/Minerva 2016 Integration and Retrieval of Heterogeneous Archival Metadata CONNECTING COLLECTIONS Kepa J. Rodriguez – Archives Yad Vashem 09/11/2016
  • 2. Outline ● Data integration in the first phase of the project ● Our current integration approach ● Retrieval of data using controlled vocabularies ● Development of the EHRI controlled vocabularies
  • 3. Data integration in the first phase of the project ● Holding institutions delivered data in very different formats: ● XML, text files, CSV, JSON, etc. ● Ingestion into the portal was done case by case ● We interpreted each institution's data model and mapped it to our model ● Sometimes without help from the institution ● Lots of data entered by hand ● The process is not sustainable; it cannot be repeated ● No automatic updates are possible ● If an institution updates content, the data has to be updated by hand ● Other problems: infrastructure, persistent identifiers, etc.
  • 4. Proposal for the second phase of the project ● Data conversion ● Data publication and synchronization ● Data ingestion
  • 5. Data conversion ● Conversion tool: different data formats into EAD: ● XML, JSON, CSV... ● Generic transformation ● Useful for a significant number of institutions ● Reusable functions, such as mappings from specific fields of their export format into EAD ● Utilities to configure specific transformations ● Validation of the output: ● Machine validation: XML validation protocols ● Schematron, RNG ● Human validation: HTML preview including mark-up for validation errors
  • 6. EAD File sample (1) <archdesc level="subgrp"> <did> <unitid>M.49.E</unitid> <unittitle encodinganalog="3.1.2">Testimonies of Holocaust Survivors collected by the Central Jewish Historical Commission in Poland, 1944-1947</unittitle> <physdesc encodinganalog="3.1.5">6845 files</physdesc> <langmaterial> <language langcode="deu" encodinganalog="3.4.3">German</language> <language langcode="pol" encodinganalog="3.4.3">Polish</language> <language langcode="yid" encodinganalog="3.4.3">Yiddish</language> </langmaterial> <repository> <corpname>ארכיון יד ושם / Yad Vashem Archives</corpname> </repository> </did> <scopecontent encodinganalog="3.3.1"> <p>The collection consists of approximately 7,200 testimonies collected by the Centralna Żydowska Komisja Historyczna (Central Jewish Historical Committee) in Poland during its active years, 1944-1947. ….. as well as testimonies from survivors who fought in partisan units and survivors who were in hiding.</p> </scopecontent> …....
  • 7. EAD File sample (2) …... <originalsloc encodinganalog="3.5.1"> <p>ZYDOWSKI INSTYTUT HISTORYCZNY - ZIH, WARSZAWA, POLAND</p> </originalsloc> …... <controlaccess> <geogname>Poland</geogname> <geogname>Warsaw</geogname> </controlaccess> <controlaccess> <subject>Persecution of Jews</subject> <subject>Testimonies, Biographies</subject> <subject>Holocaust survivors</subject> </controlaccess> <controlaccess> <corpname>Centralna Żydowska Komisja Historyczna</corpname> </controlaccess> </archdesc>
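The conversion step on slide 5 (a reusable mapping from an institution's export fields into EAD, followed by validation) might look roughly like the sketch below. It is only an illustration: the CSV column names, the `FIELD_MAP` table, and the `missing_fields` check are assumptions, and the check is a simple stand-in for the Schematron/RNG validation named on the slide.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical mapping from CSV column names to EAD <did> child elements.
FIELD_MAP = {
    "identifier": "unitid",
    "title": "unittitle",
    "extent": "physdesc",
}


def row_to_archdesc(row, level="subgrp"):
    """Build a minimal <archdesc> element from one CSV row (a dict)."""
    archdesc = ET.Element("archdesc", {"level": level})
    did = ET.SubElement(archdesc, "did")
    for column, tag in FIELD_MAP.items():
        value = (row.get(column) or "").strip()
        if value:
            ET.SubElement(did, tag).text = value
    return archdesc


def missing_fields(archdesc, required=("unitid", "unittitle")):
    """Return the required <did> children that are absent.

    This is a stand-in for real schema validation, not a replacement for it.
    """
    did = archdesc.find("did")
    return [tag for tag in required if did.find(tag) is None]


# Sample export row, shortened from the EAD sample on slide 6.
SAMPLE_CSV = (
    "identifier,title,extent\n"
    "M.49.E,Testimonies of Holocaust Survivors,6845 files\n"
)

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
element = row_to_archdesc(rows[0])
print(ET.tostring(element, encoding="unicode"))
print(missing_fields(element))
```

Keeping the field mapping in a plain table is one way to make it reusable: another institution's export would only need a different `FIELD_MAP`.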
  • 8. Data publication and synchronization ● We plan to use two data publication protocols: ● OAI-PMH: one of the first protocols for publication of data ● Publication of data in different formats: Dublin Core (default), EAD, etc. ● OAI-PMH servers are not easy for small archives to implement and maintain ● But we want to implement a client for institutions that already use it ● ResourceSync: a new protocol ● Based on Sitemaps ● Data can be published on the web page of the institution ● Higher security ● Uses sitemaps to expose changes and updates ● Only modified and new data will be transferred to the portal ● Both are standard protocols of the Open Archives Initiative
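To illustrate how a ResourceSync change list exposes only new and modified data, here is a minimal sketch of a client-side parser. The change list document and its `example.org` URLs are invented for the example; the namespaces and the `change` attribute on `rs:md` follow the ResourceSync specification as far as I am aware, but this is not the portal's actual client.

```python
import xml.etree.ElementTree as ET

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "rs": "http://www.openarchives.org/rs/terms/",
}

# Hypothetical change list published by a holding institution.
CHANGE_LIST = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:rs="http://www.openarchives.org/rs/terms/">
  <rs:md capability="changelist" from="2016-11-01T00:00:00Z"/>
  <url>
    <loc>http://example.org/ead/collection-a.xml</loc>
    <lastmod>2016-11-02T10:00:00Z</lastmod>
    <rs:md change="updated"/>
  </url>
  <url>
    <loc>http://example.org/ead/collection-b.xml</loc>
    <lastmod>2016-11-03T09:30:00Z</lastmod>
    <rs:md change="created"/>
  </url>
  <url>
    <loc>http://example.org/ead/collection-c.xml</loc>
    <lastmod>2016-11-04T08:15:00Z</lastmod>
    <rs:md change="deleted"/>
  </url>
</urlset>
"""


def changed_resources(change_list_xml):
    """Return (url, change) pairs for every entry in a change list."""
    root = ET.fromstring(change_list_xml)
    changes = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", namespaces=NS)
        md = url.find("rs:md", NS)
        change = md.get("change") if md is not None else None
        changes.append((loc, change))
    return changes


changes = changed_resources(CHANGE_LIST)
# Only created or updated files need to be downloaded again.
to_fetch = [loc for loc, change in changes if change in ("created", "updated")]
to_delete = [loc for loc, change in changes if change == "deleted"]
print(to_fetch)
print(to_delete)
```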
  • 9. Data ingestion ● After data is ingested into the portal, it will receive a permanent URL: ● Formal protocol is in progress ● Necessary to publish our data in the Linked Open Data cloud ● Updates: data will be overwritten ● But the portal keeps the user generated data ● But... is it enough for the user just to have all information in a single infrastructure?
  • 10. Data retrieval ● The user needs to be able to retrieve information related to selected topics, places, people, organizations, creators... ● Regardless of which institution holds it ● Regardless of the language in which the metadata is written
  • 11. EHRI controlled vocabularies ● EHRI Thesaurus ● Concepts: hierarchy of concepts formalized in SKOS ● A first set translated into 10 languages ● Made by historians and content specialists ● Authority lists: ● Named entities or instances of the concepts ● Proposed by historians and specialists: not really useful for indexing and retrieval of data ● During import many were added by hand to address the needs of the real data ● Domain-specific authorities: Ghettos, Camps, Administrative Districts ● Vocabularies created for applications in the portal: ● Two research guides ● Linked to the EHRI Thesaurus
  • 12. Problems of the first approach of the project ● A vocabulary built with knowledge about the Shoah can be helpful to represent the history, but not necessarily the documentation: ● The compilation of an encyclopedia and the implementation of an engine for cataloguing and retrieval are two very different things and require different strategies and kinds of expertise. ● The vocabularies should be able to retrieve the data that really exists: ● Vocabularies should be able to describe the data, not only the content, e.g. types of documents, physical format of the data... ● A strategy for extending the datasets when new data brings new needs has to be implemented.
  • 13. The reality of the data ● Different institutions use different systems to assign keywords (or no system) ● Keywords can have different relevance in different systems ● In a National Archive “holocaust” can be a relevant keyword, but it is not relevant for the EHRI portal. ● The same keyword can have different meanings in different knowledge bases ● e.g. “labor” in one set of imported data corresponds to “forced labor”, in another set to “trade unions” ● Relevant information is often given as free text: ● Natural Language Processing is needed to extract this information, but within the project we can do this only at an experimental level.
  • 14. EHRI's data driven approach (1) ● Extraction of access points of the EAD files during import <controlaccess> <geogname>Poland</geogname> <geogname>Warsaw</geogname> </controlaccess> <controlaccess> <subject>Persecution of Jews</subject> <subject>Testimonies, Biographies</subject> <subject>Holocaust survivors</subject> </controlaccess> <controlaccess> <corpname>Centralna Żydowska Komisja Historyczna</corpname> </controlaccess>
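The extraction of access points shown on slide 14 can be sketched with the Python standard library. The fragment below reuses the `<controlaccess>` groups from the slide; the function name and the list of tags in `ACCESS_POINT_TAGS` are assumptions made for illustration, not the portal's import code.

```python
import xml.etree.ElementTree as ET

# Fragment taken from the EAD sample on the slides, wrapped in <archdesc>.
EAD_FRAGMENT = """<archdesc level="subgrp">
  <controlaccess>
    <geogname>Poland</geogname>
    <geogname>Warsaw</geogname>
  </controlaccess>
  <controlaccess>
    <subject>Persecution of Jews</subject>
    <subject>Testimonies, Biographies</subject>
    <subject>Holocaust survivors</subject>
  </controlaccess>
  <controlaccess>
    <corpname>Centralna Żydowska Komisja Historyczna</corpname>
  </controlaccess>
</archdesc>"""

# Access point elements collected by this sketch (an illustrative subset).
ACCESS_POINT_TAGS = ("geogname", "subject", "corpname", "persname")


def extract_access_points(ead_xml):
    """Group the text of access point elements by element name."""
    root = ET.fromstring(ead_xml)
    points = {tag: [] for tag in ACCESS_POINT_TAGS}
    for group in root.iter("controlaccess"):
        for child in group:
            if child.tag in points and child.text:
                points[child.tag].append(child.text.strip())
    return points


points = extract_access_points(EAD_FRAGMENT)
for tag, values in points.items():
    print(tag, values)
```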
  • 15. EHRI's data driven approach (2) ● Persons, corporate bodies: ● Check whether we have corresponding authority files ● If we have: link the description unit with the corresponding authority file ● If we don't have: create a new authority file ● Priority of EHRI: creators of archival collections ● Places: ● Link the places with the geographical database GeoNames ● Problematic for historical places; some of them will be added as extra vocabulary.
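The "link if an authority file exists, otherwise create one" rule for persons and corporate bodies can be sketched as a small in-memory index. Everything here is hypothetical: the `AuthorityFile` and `AuthorityIndex` classes, the `auth-0001` identifier scheme, and the second unit identifier `UNIT-2` are invented for the example, and matching on a case-folded name is a deliberately naive assumption.

```python
from dataclasses import dataclass, field


@dataclass
class AuthorityFile:
    identifier: str
    name: str
    linked_units: list = field(default_factory=list)


class AuthorityIndex:
    """In-memory stand-in for a store of authority files."""

    def __init__(self):
        self._by_key = {}

    @staticmethod
    def _key(name):
        # Collapse whitespace and ignore case when comparing names.
        return " ".join(name.split()).casefold()

    def link(self, name, unit_id):
        """Link a description unit to an authority file, creating it if needed.

        Returns the authority file and a flag telling whether it was created.
        """
        key = self._key(name)
        created = key not in self._by_key
        if created:
            identifier = f"auth-{len(self._by_key) + 1:04d}"
            self._by_key[key] = AuthorityFile(identifier, name.strip())
        record = self._by_key[key]
        if unit_id not in record.linked_units:
            record.linked_units.append(unit_id)
        return record, created


index = AuthorityIndex()
first, first_created = index.link("Centralna Żydowska Komisja Historyczna", "M.49.E")
# "UNIT-2" is a made-up identifier for a second description unit.
second, second_created = index.link("centralna żydowska komisja historyczna ", "UNIT-2")
print(first.identifier, first.linked_units, first_created, second_created)
```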
  • 16. EHRI's data driven approach (3) ● Concepts/terms: the most complicated case ● Archives used very different strategies for concepts: ● Some institutions compose terms using different rules (or no rule) ● Subject: “Jews--Persecution--France” (data of USHMM) ● EHRI has an atomic approach ● Subject: “Persecution of Jews” ● Place: “France” ● Steps to process concepts/terms: ● Terms are normalized and de-duplicated ● If there are equivalent terms in the thesaurus we establish a link ● If there are no equivalent terms the concept goes to further analysis ● If necessary a board of experts will consider accommodating a new concept in our concept hierarchy.
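The processing steps on slide 16 (split composed headings, normalize, de-duplicate, link to the thesaurus, otherwise pass on for further analysis) can be sketched as follows. The three lookup tables are tiny hypothetical stand-ins for the thesaurus, a gazetteer, and hand-written rewrite rules; the mapping from "Jews--Persecution" to "Persecution of Jews" is supplied by such a rule here, not derived automatically.

```python
import unicodedata

# Hypothetical lookup tables; real ones would come from the thesaurus
# and from a place database.
THESAURUS = {
    "persecution of jews": "Persecution of Jews",
    "holocaust survivors": "Holocaust survivors",
}
PLACES = {"france": "France", "poland": "Poland"}
# Hypothetical rewrite rules for composed headings.
COMPOSED = {("jews", "persecution"): "persecution of jews"}


def normalize(term):
    """Unicode-normalize, collapse whitespace, and case-fold a term."""
    text = unicodedata.normalize("NFKC", term)
    return " ".join(text.split()).casefold()


def deduplicate(terms):
    """Keep the first spelling of each term, comparing normalized forms."""
    seen = {}
    for term in terms:
        seen.setdefault(normalize(term), term)
    return list(seen.values())


def process_subject(raw):
    """Split a composed heading into atomic subjects and places."""
    parts = [normalize(p) for p in raw.split("--") if p.strip()]
    places = [PLACES[p] for p in parts if p in PLACES]
    rest = tuple(p for p in parts if p not in PLACES)
    candidates = [COMPOSED[rest]] if rest in COMPOSED else list(rest)
    subjects = [THESAURUS[c] for c in candidates if c in THESAURUS]
    # Terms without a thesaurus equivalent go to further analysis.
    unresolved = [c for c in candidates if c not in THESAURUS]
    return {"subjects": subjects, "places": places, "unresolved": unresolved}


composed = process_subject("Jews--Persecution--France")
unknown = process_subject("Labor")
unique = deduplicate(["Persecution of Jews", "persecution of  Jews", "Labor"])
print(composed)
print(unknown)
print(unique)
```

An ambiguous keyword such as "labor" ends up in `unresolved` rather than being linked automatically, which matches the slide's point that such terms need further analysis.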
  • 17. Ghettos and Concentration Camps ● We are evaluating whether to start a Wikidata project for ghettos and concentration camps ● Strategy: ● Extract information from the current thesaurus and alternative sources ● Encyclopedic knowledge ● Data from project partners ● Integration of all this data in the Wikidata platform ● Enrichment with the help of the community ● Multilingual labels and non-controversial information ● Finally, the data in Wikidata and in the portal should be synchronized
  • 18. NIOD Institute for War, Holocaust and Genocide Studies (NL)   CEGESOMA Centre for Historical Research and Documentation on War and Contemporary Society (BE)   Jewish Museum in Prague (CZ)   Center for Holocaust Studies at the Institute for Contemporary History in Munich (DE)   YAD VASHEM The Holocaust Martyrs’ and Heroes’ Remembrance Authority (IL) United States Holocaust Memorial Museum (USA) Bundesarchiv (DE)   The Wiener Library Institute for the Study of the Holocaust & Genocide (UK) Holocaust Documentation Centre (SK) Polish Center for Holocaust Research (PL)   The Jewish Museum of Greece (GR) Jewish Historical Institute (PL) King’s College London (UK)   Ontotext AD (BG)   Elie Wiesel National Institute for the Study of Holocaust in Romania (RO)   DANS Data Archiving and Networked Services (NL)   Shoah Memorial, Museum, Center for Contemporary Jewish Documentation (FR)   ITS International Tracing Service (DE)   Hungarian Jewish Archives (HU)   INRIA Institute for Research in Computer Science and Automation (FR)   Vilna Gaon State Jewish Museum (LT)   VWI Vienna Wiesenthal Institute for Holocaust Studies (AT) Foundation Jewish Contemporary Documentation Center (IT) CONNECTING KNOWLEDGE
  • 19. CONNECTING COLLECTIONS Integration and Retrieval of Heterogeneous Archival Metadata 09/11/2016