During the Second World War, some 1,300 illegal newspapers were issued by the Dutch resistance.
Right after the war, as many of these newspapers as possible were physically preserved by Dutch memory institutions. They were described in formal library catalogues, which were digitized and brought online in the 1990s. In 2010 the national collection of underground newspapers – some 200,000 pages – was digitized as full text in Delpher, the national aggregator for historical full-texts.
Although online metadata and full-texts now existed for these publications, the third pillar, "context", was still missing, making it hard for people to understand the historical background of the newspapers.
We are currently running a project to tackle this contextual problem. We started by extracting contextual entries from a hard-copy standard work on the Dutch illegal press and combined these with data from the library catalogue and Delpher in a central LOD triple store.
We then created links between historically related newspapers and used Named Entity Recognition to find persons, organisations and places related to the newspapers. We further enriched the data semantically using DBpedia.
Next, using an article template to ensure uniformity and consistency, we generated 1,300 Wikipedia article stubs from the database.
Finally, we sought collaboration with the Dutch Wikipedia volunteer community to extend these stubs into full encyclopedic articles.
In this way we can give every newspaper its own Wikipedia article, making these WW2 materials much more visible to the Dutch public, over 80% of whom use Wikipedia.
At the same time the triple store can serve as a source for alternative applications, like data visualizations. This will enable us to visualize connections and networks between underground newspapers, as they developed over time between 1940 and 1945.
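As a rough illustration of how such a network over time could be derived from linked data, the following minimal sketch filters relations by year. It does not reflect the project's actual triple store or vocabulary: the predicates `ex:startYear` and `ex:relatedTo`, the `paper:` identifiers and all values are invented placeholders.

```python
from collections import defaultdict

# Hypothetical triples (subject, predicate, object).
# Identifiers, predicates and years are placeholders, not project data.
TRIPLES = [
    ("paper:A", "ex:startYear", 1940),
    ("paper:B", "ex:startYear", 1942),
    ("paper:C", "ex:startYear", 1944),
    ("paper:A", "ex:relatedTo", "paper:B"),
    ("paper:B", "ex:relatedTo", "paper:C"),
]


def network_up_to(triples, year):
    """Return an undirected adjacency map of papers that had started by `year`."""
    start = {s: o for s, p, o in triples if p == "ex:startYear"}
    active = {paper for paper, first in start.items() if first <= year}
    graph = defaultdict(set)
    for s, p, o in triples:
        # Keep a relation only when both newspapers already existed.
        if p == "ex:relatedTo" and s in active and o in active:
            graph[s].add(o)
            graph[o].add(s)
    return {paper: sorted(neighbours) for paper, neighbours in graph.items()}


if __name__ == "__main__":
    for year in (1940, 1942, 1944):
        print(year, network_up_to(TRIPLES, year))
```

Calling `network_up_to` once per year gives a series of snapshots that a visualization tool could render as a growing network.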
---------------
Presentation during the DCH (Digital Cultural Heritage) conference, 30th Aug - 1st Sept 2017, Staatsbibliothek Berlin, Germany - http://dch2017.net
Using LOD to crowdsource Dutch WW2 underground newspapers on Wikipedia - DCH, 30-08-2017, Berlin, Germany
1. Using LOD to crowdsource Dutch WW2
underground newspapers on Wikipedia
Olaf Janssen, National Library of the Netherlands
Digital Cultural Heritage, Berlin, 30-08-2017
olaf.janssen@kb.nl - @ookgezellig - slideshare.net/OlafJanssenNL
6. After the war 1,300 newspaper titles
were collected & preserved at the NIOD …
https://commons.wikimedia.org/wiki/File:Verzetskrant_in_archiefdozen_bij_het_NIOD.jpg – CC-BY-SA - OlafJanssen
The national Institute for
War, Holocaust
and Genocide Studies
in Amsterdam
10. In Delpher you can read and word-search
these newspapers…
12. But say, I want to know more about this newspaper
• What sort of illegal newspaper was it?
• What is the history of this newspaper?
• Who wrote it?
• Where was this newspaper printed?
• How was it distributed?
• Were there any relations with other underground newspapers or
resistance groups?
• Etc…
You can’t answer these
questions from Delpher
14. Big drawback of Delpher:
No contextual information
about WW2 underground newspapers
https://thejungleisneutral.files.wordpress.com/2013/11/lost.jpg
15. Where would many people go to find
contextual information about historic newspapers?
Probably Wikipedia (via Google)
36. IDs of related students’
newspapers
This newspaper Other newspapers
37. We OCRed this book into PDF
+ put it online under CC-BY-SA
http://www.niod.nl/nl/de-ondergrondse-pers-1940-1945 (PDF)
38. We OCRed this book into PDF
(CC-BY-SA)
http://www.niod.nl/nl/de-ondergrondse-pers-1940-1945 (PDF)
Available online (PDF, flat file)
Open license (CC-BY-SA)
Convert PDF into structured database.
Link titles to places, persons, other titles
Link titles to KB-catalogue (metadata) and
Delpher (full-text)
Link titles, persons and places to external
sources
43. Convert PDF into structured database.
Link titles to places, persons, other titles
Link titles to library catalogue (metadata)
and Delpher (full-text)
Link titles, persons and places to external
sources
LOD & database expert
Gerard Kuys
48. Summer 2016
This LOD database is unique in the Netherlands.
First time data about underground newspapers was
systematically collected and linked online!
https://www.pinterest.com/freethewronged/world-war-ii/
50. We have: LOD database
Using an article template we generated
1,300 uniform and interlinked Wikipedia stubs
https://c1.staticflickr.com/9/8281/7699231918_11a7356c38_b.jpg
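A minimal sketch of template-based stub generation, assuming each newspaper is available as a simple record. The field names (`title`, `place`, `related`), the sentence wording and the `render_stub` helper are hypothetical and only illustrate the idea, not the actual template used in the project.

```python
from string import Template

# Hypothetical article template; the wording is a placeholder.
STUB_TEMPLATE = Template(
    "'''$title''' was an underground newspaper published in $place "
    "during the Second World War.$related\n"
)


def render_stub(record):
    """Render one wikitext stub from a newspaper record (a plain dict)."""
    related = record.get("related", [])
    related_text = ""
    if related:
        # Wiki links make the generated stubs point at each other.
        links = ", ".join(f"[[{name}]]" for name in related)
        related_text = f" It was related to {links}."
    return STUB_TEMPLATE.substitute(
        title=record["title"],
        place=record["place"],
        related=related_text,
    )


if __name__ == "__main__":
    example = {
        "title": "Example Paper",      # placeholder values, not real data
        "place": "Exampletown",
        "related": ["Other Paper"],
    }
    print(render_stub(example))
```

Because every stub comes from the same template, the articles stay uniform, and the wiki links between related titles provide the interlinking.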
54. This bit
was added manually
to expand stub into full article
Crowdsourcing by
Dutch Wikipedia community
https://nl.wikipedia.org/wiki/De_Geus_onder_studenten_(verzetsblad)
55. Wikipedia volunteers are expanding the
1,300 stubs…
gradually creating more and more full articles.
By Sebastiaan ter Burg [CC BY 2.0 (http://creativecommons.org/licenses/by/2.0)], via Wikimedia Commons