eBooks on Demand and FEP
Andreas Parschalk, University of Innsbruck (UIBK), Library
andreas.parschalk@uibk.ac.at
Overview

EOD – the service
    Overview
    Libraries workflow
    End-users view
EOD and the Functional Extension Parser
    The Functional Extension Parser (FEP)
    Integration into the workflow
    Current status
EOD – the service

What is EOD?
    Network of libraries
    Digitisation on demand for copyright-free
      books
    Started in 2006, co-funded by the EC in the
      eTEN programme
    Delivering digitised books since 2007
EOD – the service

EOD button: digitising this book on request
Library: scans & transfers images
Incorporation into Digital Library & Europeana
Who is currently offering the service?
    More than 30 libraries in 12 countries
EOD libraries
    Austria      University Libraries of Innsbruck, Graz and Vienna (2x),
                 Vienna City Library
    Germany      Bavarian State Library (Munich), University Libraries of
                 Regensburg, Greifswald, Berlin (Humboldt University),
                 Saxon State Library (Dresden), STABI Berlin
    Denmark      Royal Library
    Estonia      National Library, University Library of Tartu
    France       Academic health library (Paris)
    Hungary      National Széchényi Library of Hungary, Library of the
                 Hungarian Academy of Science
    Portugal     National Library
    Slovakia     University Library of Bratislava, Slovak Academy of Sciences
    Slovenia     National and University Library
    Sweden       University Library of Umeå, National Library of Sweden
    Switzerland  National Library of Switzerland, Library at Guisanplatz
EOD – the service

What is being digitised
     Only public domain books, according to the laws
       and regulations of each library's country
     Aim: "Full informational capture"
          Whole books cover to cover
          Virtually counted blank pages
          Supplements (maps, tables, …) that form
           an integral part of the document
EOD: The Libraries' point of view

 Central services used by libraries
   Web application for the administration of orders
    and generation of eBooks
   Automation of communication (automated e-mails
    to end-users, tracking page with status update)
   OCR (optical character recognition) services:
    Antiqua and Gothic typefaces
   NEW: Structural Analysis (FEP)
   Delivery of CD-ROMs (optional)
   Preprint preparation for reprint orders (optional)
   Reprint creation and delivery
   Central management of credit card payments
Carried out locally at library sites
     Scanning and uploading of material
     Handling orders in Order Data Manager
     Uploading to local digital repositories
     Long term storage
EOD: The Libraries' point of view


Workflow for the libraries
    Order arrives
    Order the book in the library
    Check the order details (can the book be digitised?
     correct the automatically fetched metadata)
    Scan book cover to cover
    Upload the images
    Start eBook generation
    Check results and finish the order
EOD: The Libraries' point of view


eBook generation
     Configuring settings
          Resolution and JPEG quality
          With or without OCR
          OCR settings (language, font type)
          Deskew, despeckle
     Start eBook generation
     Create EOD cover pages
Alternatively, generate the eBook locally
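
The generation settings listed above can be pictured as a small configuration record. The following is only a minimal sketch: the class name, field names, default values and allowed ranges are assumptions made for illustration and are not taken from the EOD web application.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class EbookSettings:
    """Hypothetical container for the options named on the slide."""
    resolution_dpi: int = 300        # assumed default, not an EOD value
    jpeg_quality: int = 80           # assumed 1-100 scale
    ocr_enabled: bool = True         # "with or without OCR"
    ocr_language: str = "de"         # assumed language code format
    ocr_font_type: str = "gothic"    # "antiqua" or "gothic"
    deskew: bool = True
    despeckle: bool = True

    def validate(self) -> None:
        """Reject values outside the ranges assumed in this sketch."""
        if self.resolution_dpi <= 0:
            raise ValueError("resolution_dpi must be positive")
        if not 1 <= self.jpeg_quality <= 100:
            raise ValueError("jpeg_quality must be between 1 and 100")
        if self.ocr_font_type not in ("antiqua", "gothic"):
            raise ValueError("ocr_font_type must be 'antiqua' or 'gothic'")


settings = EbookSettings(ocr_language="la", ocr_font_type="antiqua")
settings.validate()
print(asdict(settings))
```

Keeping the record immutable and validating it before a job starts is one way to catch a mistyped setting before a long OCR run begins.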
EOD: The Libraries' point of view


The library can download the OCR output
 as zipped single-page XML files and as RTF
     Use in local repository (e.g. full text search)
     Digitisation for the visually impaired
     Possible full text correction
     Conversion to other formats (e.g. ePub)
Until now: no structural information, although
 METS/ALTO output has been requested
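
As an illustration of the full text search use case, the sketch below reads a zip archive of per-page XML files and looks up a term. It assumes that the recognised text is stored as element content; the element names `page` and `line`, the file names and the function names are invented for the demo and do not describe the real EOD OCR schema.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def page_texts(zip_bytes: bytes) -> dict[str, str]:
    """Map each XML member name to its whitespace-normalised text content."""
    texts = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in sorted(archive.namelist()):
            if not name.lower().endswith(".xml"):
                continue
            root = ET.fromstring(archive.read(name))
            # itertext() collects element text only, not attribute values.
            texts[name] = " ".join("".join(root.itertext()).split())
    return texts


def search(texts: dict[str, str], term: str) -> list[str]:
    """Return the names of pages whose text contains the term (case-insensitive)."""
    needle = term.lower()
    return [name for name, text in texts.items() if needle in text.lower()]


# Build a tiny in-memory archive so the sketch runs without real EOD data.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr("0001.xml", "<page><line>Erstes Kapitel</line></page>")
    demo.writestr("0002.xml", "<page><line>Zweites Kapitel</line></page>")

texts = page_texts(buffer.getvalue())
print(search(texts, "zweites"))  # prints ['0002.xml']
```

If an OCR format keeps the text in attributes instead of element content, the extraction line would have to read those attributes instead.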
The end-users' point of view


Find the record of the book in the catalogue
Click the EOD button
Fill out the order form
Delivery time of 1-2 weeks, depending on the
  library
Pay online
Download and use
The end-users' point of view

The catalogue situation: diverse and dispersed
   OPACs
   Digitised card catalogues
   Union catalogues
The EOD search engine
   In addition to the EOD button in the libraries'
     catalogues
   http://search.books2ebooks.eu
   3 million records of digitisable and digitised items
   19 EOD libraries have already integrated their records
eBooks on Demand and the
Functional Extension Parser
EOD and the FEP

Motivation
    Improve output for libraries
          Structural information
          METS/ALTO
    Improve output for end-users
          Enhanced PDF with clickable TOC
EOD and the FEP

Prerequisites
    XML output of the OCR for the complete document
    Images of the scanned document
    Coordinates in the OCR XML must correspond to the coordinates in
      the images (deskew the images beforehand)
    Quality of the scans and OCR as good as possible
FEP works with the XML output of EOD eBook generation
Automatically extracts structural information about the document
    Page numbers
    Table of Contents
Offers a web interface to manually correct and enhance the result
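
The coordinate prerequisite can be checked roughly before structural analysis starts. The sketch below flags OCR bounding boxes that do not fit inside the image they are supposed to describe. The `Box` type, the function name and the sample values are hypothetical. Such a test only catches gross mismatches, for example OCR coordinates taken from a differently sized image, and says nothing about small skew errors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical OCR bounding box in pixel coordinates (origin top-left)."""
    left: int
    top: int
    right: int
    bottom: int


def boxes_outside_image(boxes, width, height):
    """Return the boxes that are degenerate or do not fit in a width x height image."""
    return [
        b for b in boxes
        if b.left < 0 or b.top < 0
        or b.right > width or b.bottom > height
        or b.left >= b.right or b.top >= b.bottom
    ]


boxes = [Box(100, 120, 900, 160), Box(100, 3400, 2600, 3460)]
bad = boxes_outside_image(boxes, width=2480, height=3508)
print(bad)  # only the second box exceeds the image width
```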
EOD and the FEP

Integration of FEP into EOD workflow
    Regular EOD eBook generation
    Operators decide whether FEP is possible/useful
         Scan quality
         OCR quality
         Structure of the book
    Start automatic recognition
    Check/correct/modify results in the FEP
      web interface
EOD and the FEP

The operator finds books with automatically
 recognized structure in the FEP web interface
 and can then enhance or correct the recognized
 print space, pagination and TOC (optionally
 also the logical structure)
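
To show why automatically recognised pagination still benefits from manual correction, here is a simple majority-vote heuristic. It is not FEP's algorithm; the function name and the sample data are invented. Given the page numbers that OCR read on some images, it assumes one continuous numbering run and fills in the rest from the most frequent offset between image index and printed number.

```python
from collections import Counter


def infer_pagination(detected: dict[int, int], image_count: int) -> dict[int, int]:
    """Fill in printed page numbers using the most frequent offset.

    detected maps a 0-based image index to the page number read by OCR.
    Images that would receive a number below 1 are left unnumbered.
    """
    if not detected:
        return {}
    offsets = Counter(number - index for index, number in detected.items())
    offset, _ = offsets.most_common(1)[0]
    return {
        index: index + offset
        for index in range(image_count)
        if index + offset >= 1
    }


# Image 6 was misread as page 8; the majority offset (-3) overrides it.
detected = {4: 1, 5: 2, 6: 8, 8: 5}
print(infer_pagination(detected, image_count=10))
# prints {4: 1, 5: 2, 6: 3, 7: 4, 8: 5, 9: 6}
```

A single offset cannot represent Roman-numbered front matter, unnumbered plates or a second numbering run, which are the cases where an operator would step in.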
EOD and the FEP

After all correction steps are done, two outputs are available
     METS/ALTO files
     Enhanced PDF
If the results are OK
     Replace the regular PDF with the enhanced PDF
      by uploading it to the Order Data Manager (ODM) via FTP
     End-users download the enhanced PDF as
      usual through their EOD tracking page
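
For readers unfamiliar with ALTO, the sketch below builds a minimal ALTO-style page with word coordinates. It uses the ALTO v2 namespace and the `Layout`, `Page`, `PrintSpace`, `TextBlock`, `TextLine` and `String` elements of that standard, but it omits the description section and several attributes, so it is not schema-valid and does not reproduce FEP's actual output. The function name, IDs and sample words are assumptions.

```python
import xml.etree.ElementTree as ET

ALTO_NS = "http://www.loc.gov/standards/alto/ns-v2#"


def alto_page(words, width, height):
    """Build a minimal ALTO-style page from (text, x, y, w, h) word tuples."""
    ET.register_namespace("", ALTO_NS)

    def tag(name):
        return f"{{{ALTO_NS}}}{name}"

    root = ET.Element(tag("alto"))
    layout = ET.SubElement(root, tag("Layout"))
    page = ET.SubElement(layout, tag("Page"), ID="P1",
                         WIDTH=str(width), HEIGHT=str(height))
    space = ET.SubElement(page, tag("PrintSpace"))
    block = ET.SubElement(space, tag("TextBlock"), ID="TB1")
    line = ET.SubElement(block, tag("TextLine"))
    for text, x, y, w, h in words:
        # Each word carries its text and its position on the page image.
        ET.SubElement(line, tag("String"), CONTENT=text,
                      HPOS=str(x), VPOS=str(y), WIDTH=str(w), HEIGHT=str(h))
    return root


words = [("Erstes", 300, 400, 180, 40), ("Kapitel", 500, 400, 200, 40)]
root = alto_page(words, width=2480, height=3508)
print(ET.tostring(root, encoding="unicode"))
```

Because every `String` element stores a position, a viewer can highlight search hits on the page image, which plain text output cannot support.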
EOD and the FEP

Current status
     Interface between Order Data Manager and FEP core
       implemented, workflow adapted
     Internal testing phase finished
     Online and offline workshops held to familiarise EOD
       operators with the FEP correction web interface
     Ready for production environment
     Beta testing and feedback period with 10
       selected EOD network libraries until end of
       July
Thank you for your attention!
Andreas.Parschalk@uibk.ac.at

IMPACT Final Event 26-06-2012 - The Functional Extension Parser (FEP) and Ebooks On Demand (EOD) by Andreas Parschalk (University of Innsbruck)
