ocr with N N

OCR with Neural Network Made By: Marwa Fadhel Jassim Karam Samir Khalid

Introduction Optical character recognition, usually abbreviated to OCR, is the mechanical or electronic translation of scanned images of handwritten, typewritten or printed text into machine-encoded text. It is widely used to convert books and documents into electronic files, to computerize a record-keeping system in an office, or to publish the text on a website.

programmed with images of each character, and worked on one font at a time. "Intelligent" systems with a high degree of recognition accuracy for most fonts are now common.Some systems are capable ofreproducing formatted output that closely approximates the original scanned page including images, columns and other non-textual components.

OCR: picture of text -> text ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

OCR Step 2: Identify each letter ,[object Object]

Identifying letters is hard ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Approaches ,[object Object],[object Object],[object Object],[object Object]

What is Neural Networks? ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]

Firing neurons excite others Firing threshold

...which in turn excite others Firing threshold

Inputs can be weighted Firing threshold 0.7 0.4

Neurons can suppress others Firing threshold 0.7 0.4 -0.5

And they can have a starting bias Firing threshold Bias 0.3

(So they're basically logic gates) 0.5 0.5 1 1 -1 Bias: 1 AND OR NOT

Neurons arranged in layers ,[object Object],[object Object],[object Object],[object Object],In Hid Out

Simple letter identification network ,[object Object],[object Object],[object Object],A B C D E F G H I J K L M ...

Training Method The most popular and simple approach to OCR problem is based on feed forward neural network with back propagation learning. The main idea is that we should first prepare a training set and then train a neural network to recognize patterns from the training set. In the training step we teach the network to respond with desired output for a specified input. For this purpose each training sample is represented by two components: possible input and the desired network's output for the input. After the training step is done, we can give an arbitrary input to the network and the network will form an output, from which we can resolve a pattern type presented to the network.

Let's assume that we want to train a network to recognize 26 capital letters represented as images of 5x6 pixels, something like this one: One of the most obvious ways to convert an image to an input part of a training sample is to create a vector of size 30 (for our case), containing "1" in all positions corresponding to the letter pixel and "0" in all positions corresponding to the background pixels. But, in many neural network training tasks, it's preferred to represent training patterns in so called "bipolar" way, placing into input vector "0.5" instead of "1" and "-0.5" instead of "0". Such sort of pattern coding will lead to a greater learning performance improvement.

our training sample should look something like this: For each possible input we need to create a desired network's output to complete the training samples. For OCR task it's very common to code each pattern as a vector of size 26 (because we have 26 different letters), placing into the vector "0.5" for positions corresponding to the patterns type number and "-0.5" for all other positions

So, a desired output vector for letter "K“ will look something like this: After having such training samples for all letters, we can start to train our network. But, the last question is about the network's structure. For the above task we can use one layer of neural network, which will have 30 inputs corresponding to the size of input vector and 26 neurons in the layer corresponding to the size of the output vector.

The OCR software breaks the image into sub-images, each containing a single character. The sub-images are then translated from an image format into a binary format, where each 0 and 1 represents an individual pixel of the sub-image. The binary data is then fed into a neural network that has been trained to make the association between the character image data and a numeric value that corresponds to the character. The output from the neural network is then translated into ASCII text and saved as a file. Another Method

ocr with N N

Recommandé

Recommandé

Contenu connexe

Tendances

Tendances (20)

En vedette

En vedette (20)

Similaire à ocr with N N

Similaire à ocr with N N (20)

Dernier

Dernier (20)

ocr with N N