OCR System
Presented By:-
Vijay Apurva (9910103462),
4th year, CSE
Guided By:-
Mr. Ankur Kulhari
ABSTRACT:-
The current capacity to translate paper documents quickly and
accurately into machine-readable form using optical character
recognition technology increases the opportunities for document
searching and storing, as well as for automated document
processing. Translating large collections of image-based
electronic documents into structured electronic documents quickly is
still a problem. The large number of processing units available
in Grid environments, together with free optical character recognition
tools, can be exploited to produce a fast translation.
CONTENTS :-
 What is OCR?
 When and Why OCR?
 Existing System.
 Proposed System.
 Architecture of OCR.
 Algorithms of OCR.
 Modules of OCR.
 Design of OCR.
 Design of Screen shots for OCR.
 Conclusion.
WHAT IS OCR? :-
OCR stands for Optical Character Recognition. It is
a system that allows us to scan printed, typewritten or
handwritten text (numerals, letters or symbols) and convert
the scanned image into a computer-processable format, either
plain text or a word-processor document.
 The converted documents can later be edited, used or reused
in other documents.
WHEN AND WHY OCR? :-
 OCR is used when recreating a paper document in
electronic form by hand would take more time.
 The converted text files take less space than the original
image files and can be indexed. OCR is therefore an
advantage for users who have to convert large
amounts of paperwork into electronic form.
EXISTING SYSTEM:-
There is a growing demand from
users to convert printed documents into electronic
documents in order to keep their data secure. The
basic OCR system was invented to convert the data available on
paper into computer-processable documents, so that the
documents can be edited and reused.
PROPOSED SYSTEM:-
Our proposed system is OCR ON A GRID
INFRASTRUCTURE, a character recognition system that
supports recognition of the characters of multiple languages. This
feature is what we call grid infrastructure, and it eliminates the
problem of heterogeneous character recognition. In this context,
grid infrastructure means the infrastructure that supports a
specific set of languages. Thus OCR on a grid infrastructure is
multilingual.
ARCHITECTURE :-
 The architecture of the optical character recognition system on a
grid infrastructure consists of three main components. They are:-
 Scanner
 OCR Hardware or Software
 Output Interface
[Figure: block diagram of the OCR system. Scanner (document, illuminator, detector) -> document image -> OCR hardware or software (document analysis, character recognition, contextual processing) -> recognition results -> Output Interface -> to application user.]
TYPES OF TRAINING:-
There are two major types of training that can be used to
train a neural network system. They are:-
 Supervised Training
 Unsupervised Training
FLOWCHART FOR UNSUPERVISED LEARNING:-
KOHONEN NETWORK:-
The Kohonen network is presented with data, but the correct
output that corresponds to that data is not specified. Using the
Kohonen network this data can be classified into groups.
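The grouping behaviour described above can be sketched as winner-take-all training: each input is normalized, the output neuron with the largest dot product wins, and only the winner's weights move toward the input. This is a minimal sketch, not the deck's implementation. The function names, the additive update, the explicit `init_weights` argument and the default epoch count are assumptions; the learning rate of 0.3 and the names `winner` and `normalize` echo the class diagram.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def winner(weights, sample):
    """Index of the output neuron whose weights best match the sample."""
    scores = [sum(w * x for w, x in zip(row, sample)) for row in weights]
    return scores.index(max(scores))


def train_kohonen(samples, init_weights, learn_rate=0.3, epochs=20):
    """Unsupervised training: no target outputs are given, only inputs."""
    weights = [normalize(row) for row in init_weights]
    data = [normalize(s) for s in samples]
    for _ in range(epochs):
        for x in data:
            k = winner(weights, x)
            # Move only the winning neuron toward the sample, then renormalize.
            moved = [w + learn_rate * (xi - w) for w, xi in zip(weights[k], x)]
            weights[k] = normalize(moved)
    return weights
```

After training, calling `winner(weights, normalize(sample))` assigns a sample to a group. The group numbers carry no meaning of their own until someone labels them.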
FLOWCHART FOR KOHONEN TRAINING:-
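ALGORITHMS OF OCR:-
TRAINING ALGORITHM:-
One of the most common learning algorithms is called Hebb's
Rule. This rule was developed to assist with unsupervised training.
 Hebb's rule is expressed as:
ΔWij = µ ai aj
where µ is the learning rate and ai, aj are the activations of the two
connected neurons. An error factor (d - a), the difference between the
desired and the actual output, belongs to the supervised delta rule
rather than to Hebb's rule.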
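As an illustration, the textbook form of the Hebbian update changes each weight by the learning rate times the product of the two activations it connects, with no error term. The sketch below assumes a square weight matrix over one set of neurons; the function name and the default learning rate are illustrative.

```python
def hebb_update(weights, activations, learn_rate=0.3):
    """Return new weights after one Hebbian step.

    Each weight changes by learn_rate * a[i] * a[j], so connections between
    neurons that are active together are strengthened.
    """
    n = len(activations)
    return [
        [weights[i][j] + learn_rate * activations[i] * activations[j]
         for j in range(n)]
        for i in range(n)
    ]
```

Because no desired output appears in the update, the rule can be applied without labelled training data.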
MODULES :-
The Modules that were identified in the Optical Character
Recognition system are as follows:-
 Document Processing
 Neural network System Training
 Document Recognition
 Document Editing and
 Document Searching
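The deck does not describe how the Document Searching module works. One plausible approach, sketched here purely as an assumption, is an inverted index over the recognized text, which is what makes converted documents searchable in a way the original images are not. The function names and the document-id keys are hypothetical.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

For example, after `index = build_index({1: "scanned invoice", 2: "scanned letter"})`, the call `search(index, "scanned letter")` returns only document 2.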
DESIGN OF OCR :-
The design of our OCR system can be best explained
with the following diagram:-
[Figure: design diagram showing the operations Scan, Store, Recognize, Editing and Searching, together with the documents and users and the database.]
OVERALL USECASE DIAGRAM:-
[Figure: use case diagram.
Actors: end-user, end-user1, end-user2, administrator.
Use cases: scan documents, store documents, Document processing, Document recognition, Document editing, Document modification, Document deletion, Trains the system (administrator).
Two <<includes>> relationships involve Document processing.]
OVERALL CLASS DIAGRAM:-
Document
  Attributes: docid : integer, docname : String, docsize : integer, doctype : String
  Operations: getDocumentDetails(), scanDocument(), covertToImage(), storeImage()
Editor
  Operations: cut(), copy(), paste(), new(), open(), find()
HelpFrame
HEntry
  Operations: hLineClear(), vLineClear(), findBounds()
TrainingSet
  Attributes: inputCount : int, outputcount : int, trainingSetCount : int
  Operations: setInputCount(), setOutputCount(), setTrainingSetCount(), setClassify()
MainScreen
  Operations: editor(), helpFrame(), printedFrame(), handWrittenFrame()
Entry
  Attributes: recog : int, downSampleLeft : int, downSampleRight : int, downSampleTop : int, downSampleBottom : int
  Operations: hLineClear(), hLineClearWithin(), vLineClear(), vLineClearWithin()
PrintedFrame
  Operations: open_action(), train_action(), topen_action(), recogniseAll_action()
KohenNetwork
  Attributes: LearnMethod = 1 : int, LearnRate = 0.3 : double, quitError : double
  Operations: copyWeights(), clearWeights(), winner(), normalizeInput()
Associations: the diagram marks several one-to-many (1 to 1..*) and many-to-many (1..* to 1..*) associations between these classes; their endpoints are not recoverable from the extracted text.
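The names hLineClear(), vLineClear(), findBounds() and the downSample* attributes suggest that a character image is cropped to its inked region and then reduced to a small fixed grid before being fed to the network. The deck does not show that code, so the following is only a sketch under that assumption. The image format (a list of rows of 0/1 pixels), the Python function names and the "any inked pixel" cell rule are all illustrative choices.

```python
def h_line_clear(image, y):
    """True if row y contains no inked pixel."""
    return not any(image[y])


def v_line_clear(image, x):
    """True if column x contains no inked pixel."""
    return not any(row[x] for row in image)


def find_bounds(image):
    """Return (left, top, right, bottom), inclusive, or None if blank."""
    inked_rows = [y for y in range(len(image)) if not h_line_clear(image, y)]
    if not inked_rows:
        return None
    inked_cols = [x for x in range(len(image[0])) if not v_line_clear(image, x)]
    return inked_cols[0], inked_rows[0], inked_cols[-1], inked_rows[-1]


def downsample(image, cols, rows):
    """Crop to the inked region and reduce it to a rows x cols grid of 0/1."""
    bounds = find_bounds(image)
    if bounds is None:
        return [[0] * cols for _ in range(rows)]
    left, top, right, bottom = bounds
    cell_w = (right - left + 1) / cols
    cell_h = (bottom - top + 1) / rows
    grid = []
    for r in range(rows):
        y0 = top + int(r * cell_h)
        y1 = max(y0 + 1, top + int((r + 1) * cell_h))
        line = []
        for c in range(cols):
            x0 = left + int(c * cell_w)
            x1 = max(x0 + 1, left + int((c + 1) * cell_w))
            # A grid cell is set if any pixel inside it is inked.
            inked = any(image[y][x] for y in range(y0, y1) for x in range(x0, x1))
            line.append(1 if inked else 0)
        grid.append(line)
    return grid
```

Cropping first makes the result independent of where the character sits in the scanned area, and the fixed grid size gives the network a constant number of inputs.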
DESIGN OF SCREEN SHOTS FOR OCR:-
 Main Screen
 Hand Written Recognition Screen
 Scanned Document Recognition Screen
 Training Screen
 Recognition Screen
 Editor Screen
The screenshots that describe the operations carried out by our
system are as follows :-
CONCLUSION:-
The Grid infrastructure used in the implementation of the
Optical Character Recognition system can be used to
speed up the translation of image-based documents into structured
documents that are easy to discover, search and process.
The automated entry of data by OCR is one of the most
attractive labor-reducing technologies.
The system recognizes new font characters
easily and quickly.
We can edit the information in the documents more
conveniently and reuse the edited information as and
when required.
Extending the software beyond editing and searching is a
topic for future work.
• Training and recognition speeds can
be increased further by
making the system more user-friendly.
• Many applications exist where it
would be desirable to read
handwritten entries. Reading
handwriting is a very difficult task
given the variety that exists
in ordinary penmanship. However,
progress is being made.