OCRFeeder

Converting printed documents into
digital formats

Joaquim Rocha
jrocha@igalia.com

Berlin, May 2011
What is it?

Document Analysis and Optical
   Character Recognition
        for GNOME


                   Joaquim Rocha (Igalia) · OCRFeeder · LinuxTag 2011
Why?

 Paper has a number of problems

No applications for GNU/Linux that
          do a fair job

Paper problems:
   Security




          CC Photo by: http://www.flickr.com/photos/badwsky/
Paper problems:
 Preservation




   CC Photo by: http://www.flickr.com/photos/98469445@N00/
Paper problems:
Data processing




     CC Photo by: http://www.flickr.com/photos/hugovk/
Paper problems:
   Ecology




        CC Photo by: http://www.flickr.com/photos/pranavsingh/
No fair conversion apps for
          GNU/Linux

apart from OCR engines, but...



OCR != Document Conversion

    (it only deals with chars)
 (does not consider the layout)
(does not distinguish contents)


What's needed is

  Document Analysis and
      Recognition

(conversion of documents to an
        electronic format)
   (first projects in the 80s)
Where were we at?

   * Some closed solutions
* Only for proprietary systems
        * Various prices
   * Still... debatable results

How




So many layouts...




          CC Photo by: http://www.flickr.com/photos/uber-tuber/
Layouts vary with the type of
            document

What works for detecting one won't
         work for others


OCRFeeder focuses on contents,
       not on layouts!




Key concept:

   If a document image can be
divided into windows of 1 (content)
         or 0 (not content),
then it is possible to group all the
    1s and outline the contents
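
The key concept can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not OCRFeeder's actual code: the image is a list of rows of 0/1 pixels, a window counts as content when it holds at least one ink pixel, and the function names `window_grid` and `content_boxes` are hypothetical.

```python
from collections import deque


def window_grid(image, size):
    """Split a binary image (rows of 0/1 pixels) into size x size windows.

    Assumption: a window is 1 (content) if it holds at least one ink pixel.
    """
    rows = (len(image) + size - 1) // size
    cols = (len(image[0]) + size - 1) // size
    grid = [[0] * cols for _ in range(rows)]
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if pixel:
                grid[y // size][x // size] = 1
    return grid


def content_boxes(grid, size):
    """Group adjacent 1-windows and outline each group.

    Returns one pixel bounding box per group as (left, top, right, bottom).
    """
    seen = set()
    boxes = []
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if not value or (r, c) in seen:
                continue
            # Breadth-first search over 4-connected content windows.
            queue = deque([(r, c)])
            seen.add((r, c))
            top, left, bottom, right = r, c, r, c
            while queue:
                cr, cc = queue.popleft()
                top, bottom = min(top, cr), max(bottom, cr)
                left, right = min(left, cc), max(right, cc)
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    inside = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
                    if inside and grid[nr][nc] and (nr, nc) not in seen:
                        seen.add((nr, nc))
                        queue.append((nr, nc))
            boxes.append((left * size, top * size, (right + 1) * size, (bottom + 1) * size))
    return boxes
```

Each returned box outlines one content area that could then be handed to an OCR engine or kept as an image.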

Recognition:

System-wide OCR engines are used

 Engines are configured from the
        GUI or XML files
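
As a rough illustration of configuring an engine from an XML file, the sketch below parses a small engine description and builds the command line to run. The XML layout, the `$IMAGE` and `$OUTPUT` placeholders, the `example-ocr` executable and the function names are all assumptions made for this example, not OCRFeeder's documented format.

```python
import shlex
import xml.etree.ElementTree as ET

# Hypothetical engine description; element names are illustrative only.
SAMPLE_CONFIG = """
<engine>
    <name>ExampleEngine</name>
    <command>example-ocr</command>
    <arguments>$IMAGE -o $OUTPUT</arguments>
</engine>
"""


def load_engine(xml_text):
    """Read the engine name, executable and argument template from XML."""
    root = ET.fromstring(xml_text.strip())
    return {
        "name": root.findtext("name", default="").strip(),
        "command": root.findtext("command", default="").strip(),
        "arguments": root.findtext("arguments", default="").strip(),
    }


def build_command(engine, image_path, output_path):
    """Fill the placeholders and return an argument list for subprocess.run."""
    args = []
    # Split first, substitute second, so paths with spaces stay one argument.
    for token in shlex.split(engine["arguments"]):
        args.append(token.replace("$IMAGE", image_path).replace("$OUTPUT", output_path))
    return [engine["command"]] + args
```

The resulting list could be passed to `subprocess.run` to call the system-wide engine on one content area.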

The best-known free OCR engines are
     detected and configured
          automatically:

          * Tesseract
            * GOCR
           * OCRAD
          * Cuneiform
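
One simple way to detect installed engines is to look for their executables on the PATH. The sketch below assumes the usual executable names (`tesseract`, `gocr`, `ocrad`, `cuneiform`); how OCRFeeder itself performs the detection is not described in these slides, and `detect_engines` is a hypothetical name.

```python
import shutil

# Engine name -> executable name usually installed by distributions (assumed).
KNOWN_ENGINES = {
    "Tesseract": "tesseract",
    "GOCR": "gocr",
    "OCRAD": "ocrad",
    "Cuneiform": "cuneiform",
}


def detect_engines(which=shutil.which):
    """Return {engine name: executable path} for every engine found.

    The lookup function is a parameter so it can be replaced in tests.
    """
    found = {}
    for name, executable in KNOWN_ENGINES.items():
        path = which(executable)
        if path:
            found[name] = path
    return found
```

Calling `detect_engines()` with no arguments checks the current system.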
Export formats:

       ODT
      HTML
     Plain text


User interaction:

   Users can edit everything
and review the algorithm's results

So the UI can work in attended and
        unattended modes
The CLI only works in unattended
              mode
Demo time!




Other features:

    * PDF import
* Unpaper preprocessor
   * Font style editing
   * Image deskewing
 * OCR results cleaning
* Project saving/loading
A11y:

* OCRFeeder is a very useful tool
    for visually impaired users
 * Last year, the main target of its
development was to improve a11y

Future:

    * Integrate OCRopus as an
   alternative analysis backend
* More export formats: hOCR,
             PDF, etc.
* Make OCR engine management
               easier
Webpage:
http://live.gnome.org/OCRFeeder

git:
http://git.gnome.org/ocrfeeder

Bugzilla:
http://bugzilla.gnome.org
product: OCRFeeder

Manual in German:

http://wiki.ubuntuusers.de/OCRFeeder




Thank you!

