OCR using Tesseract

This presentation explains how an OCR engine performs character recognition and demonstrates the workings of Tesseract, the OCR engine sponsored by Google.

Published in: Engineering

  1. Real time OCR using Tesseract
     12BCE094 SHOBHIT CHITTORA
  2. Brief History of Tesseract
     • Open source OCR engine sponsored by Google since 2006.
     • One of the most accurate open source OCR engines currently available.
     • Originally developed by HP between 1985 and 1994.
     • Much of it is written in C and C++.
  3. TessOCR Architecture
  4. Adaptive Thresholding is Essential
  5. Baselines are rarely perfectly straight
  6. Spaces between words are tricky too
     • Italics, digits, and punctuation all create special-case, font-dependent spacing.
     • Fully justified text in narrow columns can have vastly varying spacing on different lines.
  7. Tesseract Word Recognizer
  8. Outline Approximation
     • Polygonal approximation is a double-edged sword.
     • Noise and some pertinent information are both lost.
  9. Why is it called Tesseract?
     • Elements of the polygonal approximation, clustered within a character/font combination.
     • x, y position, direction, and length (as a multiple of feature length).
  10. Character Classifier (Features and Matching)
     • The static classifier uses outline fragments as features. Broken characters are easily recognized by a small-to-large matching process in the classifier. (This is slow.)
     • The adaptive classifier uses the same technique!
  11. Classifier as Histogram of Gradients
     • Quantize the character area.
     • Compute gradients within each region.
     • Histograms of gradients map to a fixed-dimension feature vector.
  12. Character Segmentation
     • Segmentation graphs
  13. Rating and Certainty
     • Rating = Distance * Outline length
       ○ Total rating over a word (or line if you prefer) is normalized.
       ○ Transcriptions of different lengths are fairly comparable.
     • Certainty = -20 * Distance
       ○ Measures the absolute classification confidence.
       ○ Serves as a surrogate for log probability and is used to decide what needs more work.
  14. Tesseract Training
  15. Implementation using Tess-two (Tesseract port for Android)
     • The Tess-two library is an open source port of the Tesseract engine for Android.
     • Only the most basic and popular functionalities are ported.
     • Things such as deep neural nets are not ported.
     • A lot of tweaking is required to produce the desired results.
  16. DEMO
  17. Implementing Real Time OCR and challenges
     • Image processing on memory-limited devices is difficult.
     • Clock speeds are limited for processing huge matrices.
     • The camera SurfaceHolder runs on the main UI thread, while preprocessing and OCR run on user threads.
     • Huge Bitmaps must be maintained for preprocessing and sent to multiple threads.
     • Garbage collection of important preprocessed data must be avoided.
  18. Thank You
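
The three steps on the "Classifier as Histogram of Gradients" slide (quantize the character area, compute gradients, map histograms to a fixed-dimension vector) can be illustrated with a minimal sketch. This is a generic histogram-of-gradients computation, not Tesseract's own code: the function name `hog_features`, the 2x2 cell grid, the 4 orientation bins, and the central-difference gradient are all assumptions chosen for illustration.

```python
import math


def hog_features(image, cells=2, bins=4):
    """Return a fixed-length histogram-of-gradients vector.

    image: list of rows of grayscale values (hypothetical input format).
    The result always has cells * cells * bins entries, whatever the image size.
    """
    height, width = len(image), len(image[0])
    features = [0.0] * (cells * cells * bins)
    for y in range(height):
        for x in range(width):
            # Central differences, clamped at the image borders.
            gx = image[y][min(x + 1, width - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, height - 1)][x] - image[max(y - 1, 0)][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue
            # Unsigned gradient orientation folded into [0, pi).
            angle = math.atan2(gy, gx) % math.pi
            bin_index = min(int(angle / math.pi * bins), bins - 1)
            # Quantize the character area into a cells x cells grid.
            cell_y = min(y * cells // height, cells - 1)
            cell_x = min(x * cells // width, cells - 1)
            features[(cell_y * cells + cell_x) * bins + bin_index] += magnitude
    return features


# A 4x4 patch with a single vertical edge down the middle.
patch = [[0, 0, 1, 1] for _ in range(4)]
vector = hog_features(patch)
print(len(vector))  # 16 entries: 2 * 2 cells, 4 bins each
```

Because the vector length depends only on `cells` and `bins`, characters of different pixel sizes map to feature vectors of the same dimension, which is what makes them comparable in a classifier.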
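
The two formulas on the "Rating and Certainty" slide can be worked through with a small sketch. Only `Rating = Distance * Outline length` and `Certainty = -20 * Distance` come from the slides; the `CharMatch` type, the function names, and the choice to sum per-character ratings into a word total are assumptions made for illustration, not Tesseract's actual API.

```python
from dataclasses import dataclass


@dataclass
class CharMatch:
    """Hypothetical container for one classified character."""
    distance: float        # classifier distance (smaller means a better match)
    outline_length: float  # length of the character outline


def rating(match: CharMatch) -> float:
    # Rating = Distance * Outline length (from the slide).
    return match.distance * match.outline_length


def certainty(match: CharMatch) -> float:
    # Certainty = -20 * Distance (from the slide).
    return -20.0 * match.distance


def word_rating(matches: list[CharMatch]) -> float:
    # Assumption: the word total is the sum of the per-character ratings.
    return sum(rating(m) for m in matches)


good = CharMatch(distance=0.25, outline_length=40.0)
poor = CharMatch(distance=0.5, outline_length=20.0)
print(rating(good), certainty(good))  # 10.0 -5.0
print(rating(poor), certainty(poor))  # 10.0 -10.0
print(word_rating([good, poor]))      # 20.0
```

In this example both characters contribute the same rating, because rating is weighted by outline length, while certainty depends on the distance alone and so singles out the second character as the weaker classification.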