Building a Naive OCR System

Kaur Alasoo and Jaana Metsamaa
Data Mining (2009 fall)

28 January 2010 http://kaur.pri.ee/projects/wordocr/

VISION

1. Take a snapshot of a page.
2. Extract one word from the picture.
3. Look up the word on Google, in a dictionary, etc.

REALITY
• To increase the likelihood of success, we limited ourselves to:

• One font from one book.

• Only lowercase characters.

• Pictures taken with one phone.

GETTING THE DATA

44 pictures, 242 words

WORKFLOW

0. Take a picture
1. Convert to grayscale
2. Apply thresholding
3. Cut horizontally
4. Cut vertically
5. Resize to 8x16 px
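Steps 1 to 5 of the workflow could be sketched roughly as below. This is a minimal illustration using NumPy only, not the code used in the project: the luminance weights, the fixed threshold of 128, the projection-based cutting, the nearest-neighbour resize, and all function names are assumptions, as is reading "8x16" as 8 pixels wide by 16 pixels tall.

```python
import numpy as np

def to_grayscale(rgb):
    """Weighted sum of the R, G, B channels (common luminance weights, assumed)."""
    return rgb[..., :3] @ np.array([0.299, 0.587, 0.114])

def threshold(gray, level=128):
    """Binarize: True where a pixel is dark enough to count as ink (level is assumed)."""
    return gray < level

def runs(mask):
    """Return (start, stop) pairs for each run of True values in a 1-D boolean mask."""
    padded = np.concatenate(([False], mask, [False]))
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    return list(zip(changes[::2], changes[1::2]))

def segment(ink):
    """Cut horizontally into text lines, then cut each line vertically into boxes."""
    boxes = []
    for top, bottom in runs(ink.any(axis=1)):       # rows that contain ink
        line = ink[top:bottom]
        for left, right in runs(line.any(axis=0)):  # columns that contain ink
            boxes.append((int(top), int(bottom), int(left), int(right)))
    return boxes

def resize_nearest(img, height=16, width=8):
    """Nearest-neighbour resize to a fixed height x width grid."""
    rows = np.arange(height) * img.shape[0] // height
    cols = np.arange(width) * img.shape[1] // width
    return img[np.ix_(rows, cols)]
```

Cutting on empty rows and columns only works when characters do not touch or overlap, which is one reason some fonts are hard to separate.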

FROM PICTURE TO FEATURE VECTOR

255 255 255 0 0 155 255 255
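Turning a resized character image into a feature vector can be illustrated as follows. This is a sketch under assumptions: the slides do not state the pixel order, so row-by-row flattening is a guess, and the sample glyph is made up so that its first eight pixels match the values shown above.

```python
import numpy as np

def to_feature_vector(glyph):
    """Flatten a 2-D grayscale glyph into a 1-D vector, row by row (order assumed)."""
    return np.asarray(glyph, dtype=np.uint8).reshape(-1)

# Hypothetical 8x16 px glyph (16 rows, 8 columns): white background,
# two black pixels and one gray pixel in the first row.
glyph = np.full((16, 8), 255, dtype=np.uint8)
glyph[0, 3:5] = 0
glyph[0, 5] = 155

vector = to_feature_vector(glyph)
print(vector.shape)  # one entry per pixel
print(vector[:8])    # the first row of the glyph
```

With an 8x16 px image this gives 128 features per character.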

TRAINING THE CLASSIFIER

We used 1343 characters to train an SVM with an RBF kernel.

RESULTS

• 10-fold cross-validation accuracy with our method: 98.86%

• Character classification accuracy using Tesseract OCR: 91.47%
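Training an SVM with an RBF kernel and estimating its accuracy with 10-fold cross-validation might look like the sketch below. It assumes scikit-learn, which the slides do not name, and it runs on synthetic 128-pixel "characters" generated here for illustration, so the printed accuracy says nothing about the results above. The values of `C` and `gamma` are library defaults, not the project's settings.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_classes, per_class, n_features = 3, 40, 128

# Synthetic stand-in data: one random black/white prototype per class,
# plus Gaussian pixel noise for every sample.
prototypes = rng.integers(0, 2, size=(n_classes, n_features)) * 255
noise = rng.normal(0, 20, size=(n_classes * per_class, n_features))
X = np.clip(np.repeat(prototypes, per_class, axis=0) + noise, 0, 255) / 255.0
y = np.repeat(np.arange(n_classes), per_class)

clf = SVC(kernel="rbf", C=1.0, gamma="scale")

# 10-fold cross-validation: train on 9 folds, test on the held-out fold, repeat.
scores = cross_val_score(clf, X, y, cv=10)
print(f"mean accuracy over {len(scores)} folds: {scores.mean():.4f}")

# Fit on all data to obtain the final classifier.
clf.fit(X, y)
```

Scaling pixel values to the range 0 to 1 is a design choice made for this sketch; RBF kernels are sensitive to feature scale.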

SOME LETTERS ARE EXTREMELY RARE

[Bar chart: letter frequencies in English vs. our data (vertical axis 0 to 15), letters ordered e t a o i n s h r d l c u m w f g y p b v k j x q z]
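A comparison like the one in the chart starts from per-letter frequencies of the labelled text. The sketch below shows one way to compute them with the standard library; the function names and the 0.5% cutoff for "rare" are hypothetical choices, not taken from the project.

```python
from collections import Counter
import string

def letter_frequencies(text):
    """Return the percentage of each lowercase letter a-z among the letters in text."""
    counts = Counter(ch for ch in text.lower() if ch in string.ascii_lowercase)
    total = sum(counts.values())
    return {
        letter: (100.0 * counts[letter] / total) if total else 0.0
        for letter in string.ascii_lowercase
    }

def rare_letters(freqs, cutoff=0.5):
    """List letters whose frequency (in percent) falls below the cutoff."""
    return sorted(letter for letter, pct in freqs.items() if pct < cutoff)
```

Letters that land in the rare list have few or no training examples, so a classifier trained on such data has little to learn from for them.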

NOT ALL FONTS CAN BE EASILY
SEPARATED INTO CHARACTERS

WHAT DID WE LEARN?

• Installing libraries is the most difficult thing.

• Do not overly restrict yourself.

• Do not build your own OCR system unless it's absolutely necessary.

• The letters are the easy part.

http://kaur.pri.ee/projects/wordocr/
