SlideShare une entreprise Scribd logo
1  sur  11
Europeana Newspapers
in a Nutshell
Clemens Neudecker (@cneudecker)
Staatsbibliothek zu Berlin –
Preußischer Kulturbesitz
Introduction/Background
• Europeana Newspapers EU Project
(2012 – 2015)
• http://www.europeana-newspapers.eu/
• Main objectives:
– Collect metadata for digitised newspapers in EU
– Perform OCR (text recognition) and OLR (article
separation) on the digitised newspapers
– Develop a common portal for search & discovery
– Establish standards and best practices for
(historical) newspaper digitisation
Collection Stats
• Covers newspapers from 1618 – 2016
• 12 EU national and/or research libraries
• >1.000 newspaper titles, ca. 3.3m issues
• 40 languages, 4 alphabets
• Metadata for approx. 20m pages
• 12m pages fully searchable by keyword
(OCR errors – your mileage may vary…)
• Data (scans & OCR) public domain,
metadata CC0 licensed
TEL Historic Newspaper Browser
Keyword
search
TEL Historic Newspaper Browser
Various
facets
TEL Historic Newspaper Browser
Browse by
newspaper
title list
TEL Historic Newspaper Browser
Browse by
date of issue
Collaboration with Researchers
• Oceanic Exchanges (Digging Into Data
Transatlantic Platform, 2017-2019)
• impresso (Swiss National Science Fund,
2017-2020)
• NewsEye (EU H2020, 2018 – 2020)
• CLARIN (EU DSI, ongoing)
• Interviews with researchers
• Numerous research groups throughout EU
(though mainly DACH)
Outlook
• Relaunch of Europeana Newspapers with redeveloped
search and browse interface integrated directly with
the Europeana Portal as a thematic collection
(July/August 2018)
• Support of IIIF API for the online presentation
and aggregation of newspapers in Europeana
in the longer term, this will open up the possibility to
bookmark, annotate, transcribe or correct and connect
disparate newspaper sources directly in your web browser
• Named Entity Recognition for newspapers
Thank you for your
attention!
Clemens Neudecker (@cneudecker)
Staatsbibliothek zu Berlin –
Preußischer Kulturbesitz

Contenu connexe

Tendances

Historical newspapers in the context of Digital Library of Slovenia
Historical newspapers in the context of Digital Library of SloveniaHistorical newspapers in the context of Digital Library of Slovenia
Historical newspapers in the context of Digital Library of SloveniaEuropeana Newspapers
 
“Archäologische Informationen” and Open Journal Systems. Chances and Possibil...
“Archäologische Informationen” and Open Journal Systems. Chances and Possibil...“Archäologische Informationen” and Open Journal Systems. Chances and Possibil...
“Archäologische Informationen” and Open Journal Systems. Chances and Possibil...ariadnenetwork
 
LIBER DH Working Group Workshop: Digital Humanities Activities at Göttingen S...
LIBER DH Working Group Workshop: Digital Humanities Activities at Göttingen S...LIBER DH Working Group Workshop: Digital Humanities Activities at Göttingen S...
LIBER DH Working Group Workshop: Digital Humanities Activities at Göttingen S...LIBER Europe
 
Dunning seedi-2013-130517083015-phpapp02
Dunning seedi-2013-130517083015-phpapp02Dunning seedi-2013-130517083015-phpapp02
Dunning seedi-2013-130517083015-phpapp02The European Library
 
France: ARIADNE - Success stories from partners and the research community
France: ARIADNE - Success stories from partners and the research communityFrance: ARIADNE - Success stories from partners and the research community
France: ARIADNE - Success stories from partners and the research communityariadnenetwork
 
Consolidating Openness : Developing Rijksmuseum Research Services
Consolidating Openness : Developing Rijksmuseum Research ServicesConsolidating Openness : Developing Rijksmuseum Research Services
Consolidating Openness : Developing Rijksmuseum Research ServicesSaskia Scheltjens
 
Text and Data Mining at the Royal Library in the Netherlands
Text and Data Mining at the Royal Library in the NetherlandsText and Data Mining at the Royal Library in the Netherlands
Text and Data Mining at the Royal Library in the Netherlandsopenminted_eu
 
Alastair Dunning, The successes of the Europeana Libraries project, The Europ...
Alastair Dunning, The successes of the Europeana Libraries project, The Europ...Alastair Dunning, The successes of the Europeana Libraries project, The Europ...
Alastair Dunning, The successes of the Europeana Libraries project, The Europ...The European Library
 
You’ve Digitised Your Collection. What Next ?
You’ve Digitised Your Collection. What Next ?You’ve Digitised Your Collection. What Next ?
You’ve Digitised Your Collection. What Next ?The European Library
 
2017 IIIF Conference - The Vatican - SACHA
2017 IIIF Conference - The Vatican - SACHA2017 IIIF Conference - The Vatican - SACHA
2017 IIIF Conference - The Vatican - SACHAGeorg Petz
 
Hungarian National Digital Archive and Hungarian participation in Europeana
Hungarian National Digital Archive and Hungarian participation in EuropeanaHungarian National Digital Archive and Hungarian participation in Europeana
Hungarian National Digital Archive and Hungarian participation in EuropeanaEuropeanaLocal Project
 
Europeana Newspapers - Data, Tools & Future Plans
 Europeana Newspapers - Data, Tools & Future Plans  Europeana Newspapers - Data, Tools & Future Plans
Europeana Newspapers - Data, Tools & Future Plans cneudecker
 
Estermann wikidata performing-arts-20181109
Estermann wikidata performing-arts-20181109Estermann wikidata performing-arts-20181109
Estermann wikidata performing-arts-20181109Beat Estermann
 
Austrian Books Online
Austrian Books OnlineAustrian Books Online
Austrian Books OnlineMax Kaiser
 
Czech Republic: ARIADNE - Success stories from partners and the research comm...
Czech Republic: ARIADNE - Success stories from partners and the research comm...Czech Republic: ARIADNE - Success stories from partners and the research comm...
Czech Republic: ARIADNE - Success stories from partners and the research comm...ariadnenetwork
 
Welcome and introduction to the ARIADNE project
Welcome and introduction to the ARIADNE projectWelcome and introduction to the ARIADNE project
Welcome and introduction to the ARIADNE projectariadnenetwork
 
Corpus Protocols IFLA Geneva August 2014 by Neil Smyth and Stella Wisdom
Corpus Protocols IFLA Geneva August 2014 by Neil Smyth and Stella WisdomCorpus Protocols IFLA Geneva August 2014 by Neil Smyth and Stella Wisdom
Corpus Protocols IFLA Geneva August 2014 by Neil Smyth and Stella WisdomStella Wisdom
 

Tendances (20)

Historical newspapers in the context of Digital Library of Slovenia
Historical newspapers in the context of Digital Library of SloveniaHistorical newspapers in the context of Digital Library of Slovenia
Historical newspapers in the context of Digital Library of Slovenia
 
“Archäologische Informationen” and Open Journal Systems. Chances and Possibil...
“Archäologische Informationen” and Open Journal Systems. Chances and Possibil...“Archäologische Informationen” and Open Journal Systems. Chances and Possibil...
“Archäologische Informationen” and Open Journal Systems. Chances and Possibil...
 
LIBER DH Working Group Workshop: Digital Humanities Activities at Göttingen S...
LIBER DH Working Group Workshop: Digital Humanities Activities at Göttingen S...LIBER DH Working Group Workshop: Digital Humanities Activities at Göttingen S...
LIBER DH Working Group Workshop: Digital Humanities Activities at Göttingen S...
 
About company
About companyAbout company
About company
 
Dunning seedi-2013-130517083015-phpapp02
Dunning seedi-2013-130517083015-phpapp02Dunning seedi-2013-130517083015-phpapp02
Dunning seedi-2013-130517083015-phpapp02
 
France: ARIADNE - Success stories from partners and the research community
France: ARIADNE - Success stories from partners and the research communityFrance: ARIADNE - Success stories from partners and the research community
France: ARIADNE - Success stories from partners and the research community
 
Consolidating Openness : Developing Rijksmuseum Research Services
Consolidating Openness : Developing Rijksmuseum Research ServicesConsolidating Openness : Developing Rijksmuseum Research Services
Consolidating Openness : Developing Rijksmuseum Research Services
 
Text and Data Mining at the Royal Library in the Netherlands
Text and Data Mining at the Royal Library in the NetherlandsText and Data Mining at the Royal Library in the Netherlands
Text and Data Mining at the Royal Library in the Netherlands
 
Alastair Dunning, The successes of the Europeana Libraries project, The Europ...
Alastair Dunning, The successes of the Europeana Libraries project, The Europ...Alastair Dunning, The successes of the Europeana Libraries project, The Europ...
Alastair Dunning, The successes of the Europeana Libraries project, The Europ...
 
You've Digitised. What Next ?
You've Digitised. What Next ?You've Digitised. What Next ?
You've Digitised. What Next ?
 
You’ve Digitised Your Collection. What Next ?
You’ve Digitised Your Collection. What Next ?You’ve Digitised Your Collection. What Next ?
You’ve Digitised Your Collection. What Next ?
 
2017 IIIF Conference - The Vatican - SACHA
2017 IIIF Conference - The Vatican - SACHA2017 IIIF Conference - The Vatican - SACHA
2017 IIIF Conference - The Vatican - SACHA
 
Hungarian National Digital Archive and Hungarian participation in Europeana
Hungarian National Digital Archive and Hungarian participation in EuropeanaHungarian National Digital Archive and Hungarian participation in Europeana
Hungarian National Digital Archive and Hungarian participation in Europeana
 
Europeana Newspapers - Data, Tools & Future Plans
 Europeana Newspapers - Data, Tools & Future Plans  Europeana Newspapers - Data, Tools & Future Plans
Europeana Newspapers - Data, Tools & Future Plans
 
Estermann wikidata performing-arts-20181109
Estermann wikidata performing-arts-20181109Estermann wikidata performing-arts-20181109
Estermann wikidata performing-arts-20181109
 
Austrian Books Online
Austrian Books OnlineAustrian Books Online
Austrian Books Online
 
Europeana in a Research Context
Europeana in a Research ContextEuropeana in a Research Context
Europeana in a Research Context
 
Czech Republic: ARIADNE - Success stories from partners and the research comm...
Czech Republic: ARIADNE - Success stories from partners and the research comm...Czech Republic: ARIADNE - Success stories from partners and the research comm...
Czech Republic: ARIADNE - Success stories from partners and the research comm...
 
Welcome and introduction to the ARIADNE project
Welcome and introduction to the ARIADNE projectWelcome and introduction to the ARIADNE project
Welcome and introduction to the ARIADNE project
 
Corpus Protocols IFLA Geneva August 2014 by Neil Smyth and Stella Wisdom
Corpus Protocols IFLA Geneva August 2014 by Neil Smyth and Stella WisdomCorpus Protocols IFLA Geneva August 2014 by Neil Smyth and Stella Wisdom
Corpus Protocols IFLA Geneva August 2014 by Neil Smyth and Stella Wisdom
 

Similaire à Europeana Newspapers in a Nutshell

The European(a) Newspapers Project
The European(a) Newspapers ProjectThe European(a) Newspapers Project
The European(a) Newspapers ProjectEuropeana Newspapers
 
LIBER, Europeana and the Europeana Newspapers Project
LIBER, Europeana and the Europeana Newspapers ProjectLIBER, Europeana and the Europeana Newspapers Project
LIBER, Europeana and the Europeana Newspapers ProjectEuropeana Newspapers
 
LIBER, Europeana and the Europeana Newspapers Project
LIBER, Europeana and the Europeana Newspapers ProjectLIBER, Europeana and the Europeana Newspapers Project
LIBER, Europeana and the Europeana Newspapers ProjectLIBER Europe
 
The Europeana Newspapers Presentation - Cyberspace 2012
The Europeana Newspapers Presentation - Cyberspace 2012The Europeana Newspapers Presentation - Cyberspace 2012
The Europeana Newspapers Presentation - Cyberspace 2012Europeana Newspapers
 
The Europeana Newspapers Project at IMPACT Final Event
The Europeana Newspapers Project at IMPACT Final EventThe Europeana Newspapers Project at IMPACT Final Event
The Europeana Newspapers Project at IMPACT Final EventEuropeana Newspapers
 
IMPACT Final Event 26-06-2012 - Use of IMPACT tools in the Europeana Newspap...
IMPACT Final Event 26-06-2012  - Use of IMPACT tools in the Europeana Newspap...IMPACT Final Event 26-06-2012  - Use of IMPACT tools in the Europeana Newspap...
IMPACT Final Event 26-06-2012 - Use of IMPACT tools in the Europeana Newspap...IMPACT Centre of Competence
 
GI2012 pekarek-liber
GI2012 pekarek-liberGI2012 pekarek-liber
GI2012 pekarek-liberIGN Vorstand
 
Facilitating Access and Reuse of Research Materials: the Case of The European...
Facilitating Access and Reuse of Research Materials: the Case of The European...Facilitating Access and Reuse of Research Materials: the Case of The European...
Facilitating Access and Reuse of Research Materials: the Case of The European...The European Library
 
Digitisation and Digital Humanities - what is the role of Libraries?
Digitisation and Digital Humanities - what is the role of Libraries?Digitisation and Digital Humanities - what is the role of Libraries?
Digitisation and Digital Humanities - what is the role of Libraries?cneudecker
 
Europeana Newspapers (Project Details and Aggregation Workflow)
Europeana Newspapers (Project Details and Aggregation Workflow)Europeana Newspapers (Project Details and Aggregation Workflow)
Europeana Newspapers (Project Details and Aggregation Workflow)The European Library
 
Des nouvelles d’Europeana
Des nouvelles d’EuropeanaDes nouvelles d’Europeana
Des nouvelles d’EuropeanaDouglas McCarthy
 
ENP Belgrade Workshop Project Overview
ENP Belgrade Workshop Project OverviewENP Belgrade Workshop Project Overview
ENP Belgrade Workshop Project OverviewEuropeana Newspapers
 
Why join Europeana?
Why join Europeana?Why join Europeana?
Why join Europeana?Europeana
 
Representation and Absence in Digital Resources: The Case of Europeana Newspa...
Representation and Absence in Digital Resources: The Case of Europeana Newspa...Representation and Absence in Digital Resources: The Case of Europeana Newspa...
Representation and Absence in Digital Resources: The Case of Europeana Newspa...TU Delft, Netherlands
 

Similaire à Europeana Newspapers in a Nutshell (20)

The European(a) Newspapers Project
The European(a) Newspapers ProjectThe European(a) Newspapers Project
The European(a) Newspapers Project
 
The Europeana Newspapers Project
The Europeana Newspapers ProjectThe Europeana Newspapers Project
The Europeana Newspapers Project
 
LIBER, Europeana and the Europeana Newspapers Project
LIBER, Europeana and the Europeana Newspapers ProjectLIBER, Europeana and the Europeana Newspapers Project
LIBER, Europeana and the Europeana Newspapers Project
 
LIBER, Europeana and the Europeana Newspapers Project
LIBER, Europeana and the Europeana Newspapers ProjectLIBER, Europeana and the Europeana Newspapers Project
LIBER, Europeana and the Europeana Newspapers Project
 
The Europeana Newspapers Presentation - Cyberspace 2012
The Europeana Newspapers Presentation - Cyberspace 2012The Europeana Newspapers Presentation - Cyberspace 2012
The Europeana Newspapers Presentation - Cyberspace 2012
 
The Europeana Newspapers Project at IMPACT Final Event
The Europeana Newspapers Project at IMPACT Final EventThe Europeana Newspapers Project at IMPACT Final Event
The Europeana Newspapers Project at IMPACT Final Event
 
IMPACT Final Event 26-06-2012 - Use of IMPACT tools in the Europeana Newspap...
IMPACT Final Event 26-06-2012  - Use of IMPACT tools in the Europeana Newspap...IMPACT Final Event 26-06-2012  - Use of IMPACT tools in the Europeana Newspap...
IMPACT Final Event 26-06-2012 - Use of IMPACT tools in the Europeana Newspap...
 
Museums and Europeana
Museums and EuropeanaMuseums and Europeana
Museums and Europeana
 
GI2012 pekarek-liber
GI2012 pekarek-liberGI2012 pekarek-liber
GI2012 pekarek-liber
 
Facilitating Access and Reuse of Research Materials: the Case of The European...
Facilitating Access and Reuse of Research Materials: the Case of The European...Facilitating Access and Reuse of Research Materials: the Case of The European...
Facilitating Access and Reuse of Research Materials: the Case of The European...
 
Webarchiv - Curatorial approaches, topic collections and cooperation with the...
Webarchiv - Curatorial approaches, topic collections and cooperation with the...Webarchiv - Curatorial approaches, topic collections and cooperation with the...
Webarchiv - Curatorial approaches, topic collections and cooperation with the...
 
Digitisation and Digital Humanities - what is the role of Libraries?
Digitisation and Digital Humanities - what is the role of Libraries?Digitisation and Digital Humanities - what is the role of Libraries?
Digitisation and Digital Humanities - what is the role of Libraries?
 
Europeana Newspapers (Project Details and Aggregation Workflow)
Europeana Newspapers (Project Details and Aggregation Workflow)Europeana Newspapers (Project Details and Aggregation Workflow)
Europeana Newspapers (Project Details and Aggregation Workflow)
 
Des nouvelles d’Europeana
Des nouvelles d’EuropeanaDes nouvelles d’Europeana
Des nouvelles d’Europeana
 
Europeana Newspapers -
Europeana Newspapers - Europeana Newspapers -
Europeana Newspapers -
 
ENP Belgrade Workshop Project Overview
ENP Belgrade Workshop Project OverviewENP Belgrade Workshop Project Overview
ENP Belgrade Workshop Project Overview
 
Europeana datainaction nov2012
Europeana datainaction nov2012Europeana datainaction nov2012
Europeana datainaction nov2012
 
Data Mining Newspapers Metadata
Data Mining Newspapers MetadataData Mining Newspapers Metadata
Data Mining Newspapers Metadata
 
Why join Europeana?
Why join Europeana?Why join Europeana?
Why join Europeana?
 
Representation and Absence in Digital Resources: The Case of Europeana Newspa...
Representation and Absence in Digital Resources: The Case of Europeana Newspa...Representation and Absence in Digital Resources: The Case of Europeana Newspa...
Representation and Absence in Digital Resources: The Case of Europeana Newspa...
 

Plus de cneudecker

EuropeanaTech x AI: Qurator.ai @ Berlin State Library
EuropeanaTech x AI: Qurator.ai @ Berlin State LibraryEuropeanaTech x AI: Qurator.ai @ Berlin State Library
EuropeanaTech x AI: Qurator.ai @ Berlin State Librarycneudecker
 
ALTO, PAGE & Co. Formate für Volltexte
ALTO, PAGE & Co. Formate für VolltexteALTO, PAGE & Co. Formate für Volltexte
ALTO, PAGE & Co. Formate für Volltextecneudecker
 
OCR und Strukturerkennung für Zeitungen
OCR und Strukturerkennung für ZeitungenOCR und Strukturerkennung für Zeitungen
OCR und Strukturerkennung für Zeitungencneudecker
 
Multimodal Perspectives for Digitised Historical Newspapers
Multimodal Perspectives for Digitised Historical NewspapersMultimodal Perspectives for Digitised Historical Newspapers
Multimodal Perspectives for Digitised Historical Newspaperscneudecker
 
OCR und Strukturerkennung: Herausforderungen und Ansätze für die Zeitungsdigi...
OCR und Strukturerkennung: Herausforderungen und Ansätze für die Zeitungsdigi...OCR und Strukturerkennung: Herausforderungen und Ansätze für die Zeitungsdigi...
OCR und Strukturerkennung: Herausforderungen und Ansätze für die Zeitungsdigi...cneudecker
 
AI for digitized cultural heritage
AI for digitized cultural heritageAI for digitized cultural heritage
AI for digitized cultural heritagecneudecker
 
Kuratieren mit künstlicher Intelligenz
Kuratieren mit künstlicher IntelligenzKuratieren mit künstlicher Intelligenz
Kuratieren mit künstlicher Intelligenzcneudecker
 
Überblick zum DFG-Projekt OCR-D
Überblick zum DFG-Projekt OCR-DÜberblick zum DFG-Projekt OCR-D
Überblick zum DFG-Projekt OCR-Dcneudecker
 
The many uses of digitized newspapers
The many uses of digitized newspapersThe many uses of digitized newspapers
The many uses of digitized newspaperscneudecker
 
Digitalisate kuratieren mit KI - von unstrukturierten Daten zu strukturierten...
Digitalisate kuratieren mit KI - von unstrukturierten Daten zu strukturierten...Digitalisate kuratieren mit KI - von unstrukturierten Daten zu strukturierten...
Digitalisate kuratieren mit KI - von unstrukturierten Daten zu strukturierten...cneudecker
 
Von der Zeitungsdigitalisierung zu historischen Netzwerken - Methoden und Her...
Von der Zeitungsdigitalisierung zu historischen Netzwerken - Methoden und Her...Von der Zeitungsdigitalisierung zu historischen Netzwerken - Methoden und Her...
Von der Zeitungsdigitalisierung zu historischen Netzwerken - Methoden und Her...cneudecker
 
OCR-D: An end-to-end open source OCR framework for historical printed documents
OCR-D: An end-to-end open source OCR framework for historical printed documentsOCR-D: An end-to-end open source OCR framework for historical printed documents
OCR-D: An end-to-end open source OCR framework for historical printed documentscneudecker
 
Text and Data Mining
Text and Data MiningText and Data Mining
Text and Data Miningcneudecker
 
Formate für Volltexte
Formate für VolltexteFormate für Volltexte
Formate für Volltextecneudecker
 
Reise durch Europeana Collections in 11 Minuten
Reise durch Europeana Collections in 11 MinutenReise durch Europeana Collections in 11 Minuten
Reise durch Europeana Collections in 11 Minutencneudecker
 
lab.sbb.berlin
lab.sbb.berlinlab.sbb.berlin
lab.sbb.berlincneudecker
 
Named Entity Recognition for Europeana Newspapers
Named Entity Recognition for Europeana NewspapersNamed Entity Recognition for Europeana Newspapers
Named Entity Recognition for Europeana Newspaperscneudecker
 
Active archives @SBB
Active archives @SBBActive archives @SBB
Active archives @SBBcneudecker
 
Coding da Vinci Berlin 2017 - Europeana Newspapers
Coding da Vinci Berlin 2017 - Europeana NewspapersCoding da Vinci Berlin 2017 - Europeana Newspapers
Coding da Vinci Berlin 2017 - Europeana Newspaperscneudecker
 
Coding da Vinci Berlin 2017 - Europeana Collections 1914-1918
Coding da Vinci Berlin 2017 - Europeana Collections 1914-1918Coding da Vinci Berlin 2017 - Europeana Collections 1914-1918
Coding da Vinci Berlin 2017 - Europeana Collections 1914-1918cneudecker
 

Plus de cneudecker (20)

EuropeanaTech x AI: Qurator.ai @ Berlin State Library
EuropeanaTech x AI: Qurator.ai @ Berlin State LibraryEuropeanaTech x AI: Qurator.ai @ Berlin State Library
EuropeanaTech x AI: Qurator.ai @ Berlin State Library
 
ALTO, PAGE & Co. Formate für Volltexte
ALTO, PAGE & Co. Formate für VolltexteALTO, PAGE & Co. Formate für Volltexte
ALTO, PAGE & Co. Formate für Volltexte
 
OCR und Strukturerkennung für Zeitungen
OCR und Strukturerkennung für ZeitungenOCR und Strukturerkennung für Zeitungen
OCR und Strukturerkennung für Zeitungen
 
Multimodal Perspectives for Digitised Historical Newspapers
Multimodal Perspectives for Digitised Historical NewspapersMultimodal Perspectives for Digitised Historical Newspapers
Multimodal Perspectives for Digitised Historical Newspapers
 
OCR und Strukturerkennung: Herausforderungen und Ansätze für die Zeitungsdigi...
OCR und Strukturerkennung: Herausforderungen und Ansätze für die Zeitungsdigi...OCR und Strukturerkennung: Herausforderungen und Ansätze für die Zeitungsdigi...
OCR und Strukturerkennung: Herausforderungen und Ansätze für die Zeitungsdigi...
 
AI for digitized cultural heritage
AI for digitized cultural heritageAI for digitized cultural heritage
AI for digitized cultural heritage
 
Kuratieren mit künstlicher Intelligenz
Kuratieren mit künstlicher IntelligenzKuratieren mit künstlicher Intelligenz
Kuratieren mit künstlicher Intelligenz
 
Überblick zum DFG-Projekt OCR-D
Überblick zum DFG-Projekt OCR-DÜberblick zum DFG-Projekt OCR-D
Überblick zum DFG-Projekt OCR-D
 
The many uses of digitized newspapers
The many uses of digitized newspapersThe many uses of digitized newspapers
The many uses of digitized newspapers
 
Digitalisate kuratieren mit KI - von unstrukturierten Daten zu strukturierten...
Digitalisate kuratieren mit KI - von unstrukturierten Daten zu strukturierten...Digitalisate kuratieren mit KI - von unstrukturierten Daten zu strukturierten...
Digitalisate kuratieren mit KI - von unstrukturierten Daten zu strukturierten...
 
Von der Zeitungsdigitalisierung zu historischen Netzwerken - Methoden und Her...
Von der Zeitungsdigitalisierung zu historischen Netzwerken - Methoden und Her...Von der Zeitungsdigitalisierung zu historischen Netzwerken - Methoden und Her...
Von der Zeitungsdigitalisierung zu historischen Netzwerken - Methoden und Her...
 
OCR-D: An end-to-end open source OCR framework for historical printed documents
OCR-D: An end-to-end open source OCR framework for historical printed documentsOCR-D: An end-to-end open source OCR framework for historical printed documents
OCR-D: An end-to-end open source OCR framework for historical printed documents
 
Text and Data Mining
Text and Data MiningText and Data Mining
Text and Data Mining
 
Formate für Volltexte
Formate für VolltexteFormate für Volltexte
Formate für Volltexte
 
Reise durch Europeana Collections in 11 Minuten
Reise durch Europeana Collections in 11 MinutenReise durch Europeana Collections in 11 Minuten
Reise durch Europeana Collections in 11 Minuten
 
lab.sbb.berlin
lab.sbb.berlinlab.sbb.berlin
lab.sbb.berlin
 
Named Entity Recognition for Europeana Newspapers
Named Entity Recognition for Europeana NewspapersNamed Entity Recognition for Europeana Newspapers
Named Entity Recognition for Europeana Newspapers
 
Active archives @SBB
Active archives @SBBActive archives @SBB
Active archives @SBB
 
Coding da Vinci Berlin 2017 - Europeana Newspapers
Coding da Vinci Berlin 2017 - Europeana NewspapersCoding da Vinci Berlin 2017 - Europeana Newspapers
Coding da Vinci Berlin 2017 - Europeana Newspapers
 
Coding da Vinci Berlin 2017 - Europeana Collections 1914-1918
Coding da Vinci Berlin 2017 - Europeana Collections 1914-1918Coding da Vinci Berlin 2017 - Europeana Collections 1914-1918
Coding da Vinci Berlin 2017 - Europeana Collections 1914-1918
 

Dernier

Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 

Dernier (20)

Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

Europeana Newspapers in a Nutshell

  • 1. Europeana Newspapers in a Nutshell Clemens Neudecker (@cneudecker) Staatsbibliothek zu Berlin – Preußischer Kulturbesitz
  • 2. Introduction/Background • Europeana Newspapers EU Project (2012 – 2015) • http://www.europeana-newspapers.eu/ • Main objectives: – Collect metadata for digitised newspapers in EU – Perform OCR (text recognition) and OLR (article separation) on the digitised newspapers – Develop a common portal for search & discovery – Establish standards and best practices for (historical) newspaper digitisation
  • 3. Collection Stats • Covers newspapers from 1618 – 2016 • 12 EU national and/or research libraries • >1.000 newspaper titles, ca. 3.3m issues • 40 languages, 4 alphabets • Metadata for approx. 20m pages • 12m pages fully searchable by keyword (OCR errors – your mileage may vary…) • Data (scans & OCR) public domain, metadata CC0 licensed
  • 4. TEL Historic Newspaper Browser Keyword search
  • 5. TEL Historic Newspaper Browser Various facets
  • 6. TEL Historic Newspaper Browser Browse by newspaper title list
  • 7. TEL Historic Newspaper Browser Browse by date of issue
  • 8. Collaboration with Researchers • Oceanic Exchanges (Digging Into Data Transatlantic Platform, 2017-2019) • impresso (Swiss National Science Fund, 2017-2020) • NewsEye (EU H2020, 2018 – 2020) • CLARIN (EU DSI, ongoing) • Interviews with researchers • Numerous research groups throughout EU (though mainly DACH)
  • 9. Outlook • Relaunch of Europeana Newspapers with redeveloped search and browse interface integrated directly with the Europeana Portal as a thematic collection (July/August 2018) • Support of IIIF API for the online presentation and aggregation of newspapers in Europeana in the longer term, this will open up the possibility to bookmark, annotate, transcribe or correct and connect disparate newspaper sources directly in your web browser • Named Entity Recognition for newspapers
  • 10.
  • 11. Thank you for your attention! Clemens Neudecker (@cneudecker) Staatsbibliothek zu Berlin – Preußischer Kulturbesitz