2. INTRO
1. Ingenious piece of software.
2. Involves the mechanical/electronic
conversion of scanned images of
typewritten/printed text into machine-
encoded/computer-readable text.
• 3. Heavily used in the
industry.
3. INTRO ii
• Common method of digitizing printed texts
• Subtle software which is as highly overlooked as it is simple.
• Numerous applications and uses – editing, scanning,
searching, comparison, compact storage and many more!
• OCR is a field of research in pattern recognition, artificial
intelligence and computer vision.
4. Problem Statement
Ever since Charles Babbage invented the computer back in the early 19th
century, Computer machines have held man's imagination for numerous reasons - the
primary being what all is this collection of nuts, bolts and wires capable of doing.
Character Recognition is one such concept which has beheld mankind’s attention. There
can be no greater testimony to the same than the fact that people were already working on
this idea - a few decades before John McCarthy even coined the term "Artificial
Intelligence".
Today, especially, Character Recognition plays a very important part of our daily lives as
they are incorporated so subtly that we even forget their presence. Some examples are
their implementation in Microsoft Word, Adobe Acrobat and even Pen computing.
Optical Character Recognition (OCR) is the mechanical or electronic conversion of scanned
or photoed images of typewritten or printed text into machine-encoded/computer-
readable text. This text can then be used in numerous ways - ranging from assisting the
visually impaired (text-to-speech), extracting information from the image, pen computing
and so on. Optical Character Recognition (OCR) is a result of cross-linking various avenues
of technology like Machine Learning, Artificial Intelligence and Neural Networks. We
propose to develop a system based on mathematical algorithms and principles which
involve all the aforementioned technologies. That being said, Optical Character Recognition
(OCR) also depends on a few other factors : the quality of the image taken, the orientation
of and the dialect being used. Our paper aims to address the aforementioned
problems, which enables its application in numerous new fields as well as the obvious &
established aspects of our surroundings.
5. Tech Jargon - I
• Pre-processing
Used to improve the successful
recognition of the image (include De-
skew, Layout analysis, Despeckle)
• Character/glyph recognition
• Post-processing
• Application specific optimization
Tweaking the system to better deal
with specific or different inputs.
6. Tech Jargon - II
Segmentation
Includes two important phases:
1) Obtaining training samples
2) Recognizing new images after
training
Feature Extraction
Feature of the character are extracted
and hence are compared with the glyph
Classification
After the extraction, neural network is
trained using the training data
7. Our Current Progress
• We started with the Neural Networks / Machine Learning
aspect of the project.
• We have implemented Univariate / Multivariate
Linear/Regularized Linear Regression, Gradient Descent for
Multiple Variables and Logistic/ Regularized Logistic
Regression.
• Currently, we are studying & working on the
implementation of Neural Nets using Forward Propogation.
• We plan on tackling character segmentation and feature
extraction next.
8. Technology to be used
• We are using the following technology
platforms :
– GNU Octave
To develop and test the OCR software.
– 5MP HD camera (720p @ 30fps)
To take images for detection
In 1914, Emanuel Goldberg developed a machine that read characters and converted them into standard telegraph code. Around the same time, Edmund Fournied'Albe developed the Otophone, a handheld scanner that when moved across a printed page, produced tones that corresponded to specific letters or characters.