Sovann EN
5th-Year Engineering Student
Dept. Computer Science & Communication
Institute of Technology of Cambodia
Phnom Penh, Cambodia
Khmer OCR System
Nature of research work
• A collaboration with Mr. Kruy Vanna, PhD student at Kameyama Laboratory, GITS, Waseda University
• The objective is to produce a reliable Khmer OCR system that is independent of font and size
Outline
• Overview of OCR
• Training data
• Pre-processing and segmentation
• Feature extraction and recognition process
• Post-processing
What is OCR?
• Optical Character Recognition (OCR) is the mechanical or electronic translation of scanned images of handwritten, typewritten or printed text into machine-encoded text
http://en.wikipedia.org/wiki/Optical_character_recognition
• [Figure placeholder: a document, a scan process, the OCR system and the output text]
Its applications…
• List some applications
Overview of OCR System
[Diagram: block diagram of the OCR system with two paths, a training process and a recognition process. Blocks shown: Training data, Input pattern, Pre-processing, Features extraction, Features selection/reduction, Recognition system (Knowledge), Character Reordering, Recognition result]
Training Data
• To cover more fonts and sizes, it is necessary to have more training samples in different fonts
→ Train the computer to recognize that each of them is ១
Pre-processing and Segmentation
• Pre-processing aims to produce data on which the OCR system can operate accurately
• The main objectives of pre-processing are:
• Binarization
• Particle removal
Binarization (Thresholding)
• Image binarization (thresholding) refers to the conversion of a gray-scale image into a binary image.
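A minimal sketch of thresholding, for illustration only. It assumes 8-bit gray levels (0 to 255), treats dark pixels as ink, and uses a fixed threshold of 128; the function name `binarize` and the threshold value are assumptions, not taken from the slides.

```python
def binarize(gray, threshold=128):
    """Return a binary image: 1 for ink (dark pixels), 0 for background.

    `gray` is a list of rows of gray levels in 0..255.
    The fixed threshold is an assumption made for this sketch.
    """
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray]


# Tiny hypothetical gray-scale patch: bright paper with a dark stroke.
gray = [
    [250, 240, 30, 245],
    [248, 20, 25, 251],
    [252, 247, 35, 249],
]
binary = binarize(gray)
```

In practice the threshold would usually be chosen per document rather than fixed, but the conversion step itself stays the same.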
Particle removal
• Salt-and-pepper noise consists of small unwanted dots, usually produced by either the scanner or the source document itself.
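One simple way to remove such dots can be sketched as follows. This is an assumed approach, not necessarily the one used in this work: it deletes only single ink pixels that have no ink pixel among their eight neighbours. The function name `remove_isolated_pixels` is hypothetical.

```python
def remove_isolated_pixels(binary):
    """Clear ink pixels (1) that have no ink pixel among their 8 neighbours."""
    height, width = len(binary), len(binary[0])
    cleaned = [row[:] for row in binary]  # work on a copy
    for y in range(height):
        for x in range(width):
            if not binary[y][x]:
                continue
            neighbours = sum(
                binary[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if neighbours == 0:
                cleaned[y][x] = 0
    return cleaned


# Hypothetical binary patch: a 2x2 stroke plus two stray dots.
noisy = [
    [1, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 1],
]
cleaned = remove_isolated_pixels(noisy)
```

Larger specks would need a size threshold on connected components instead of this single-pixel test.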
Segmentation
• Segmentation aims to isolate each component to be recognized by the system.
• The process separates the text of a page into lines, then separates each line into vertical components, and finally produces each independent symbol.
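The first two steps above (page to lines, line to vertical components) can be sketched with projection profiles. The slides do not say which method is used, so projection profiles are an assumption here, and the names `split_runs`, `segment_lines` and `segment_columns` are hypothetical.

```python
def split_runs(profile):
    """Return (start, end) pairs of consecutive non-zero entries, end exclusive."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def segment_lines(binary):
    """Split a page into text lines using the count of ink pixels per row."""
    return split_runs([sum(row) for row in binary])


def segment_columns(binary, top, bottom):
    """Split one text line (rows top..bottom-1) into vertical components."""
    width = len(binary[0])
    profile = [sum(binary[y][x] for y in range(top, bottom)) for x in range(width)]
    return split_runs(profile)


# Hypothetical page with two text lines.
page = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
lines = segment_lines(page)
components = [segment_columns(page, top, bottom) for top, bottom in lines]
```

Each vertical component may still contain several stacked symbols, which is why a final step is needed to produce independent symbols.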
Example using CCA
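Assuming CCA stands for connected-component analysis, the idea can be sketched as a breadth-first labelling of touching ink pixels. The 8-connectivity choice and the names `connected_components` and `bounding_box` are assumptions made for this illustration.

```python
from collections import deque


def connected_components(binary):
    """Group ink pixels (1) into 8-connected components, in scan order."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    components = []
    for y in range(height):
        for x in range(width):
            if not binary[y][x] or seen[y][x]:
                continue
            queue = deque([(y, x)])
            seen[y][x] = True
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                # Visit the 8 neighbours (the centre is already marked as seen).
                for ny in range(max(0, cy - 1), min(height, cy + 2)):
                    for nx in range(max(0, cx - 1), min(width, cx + 2)):
                        if binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def bounding_box(pixels):
    """Return (top, left, bottom, right) of a component, inclusive."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    return min(ys), min(xs), max(ys), max(xs)


# Hypothetical binary patch containing two separate symbols.
image = [
    [1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 1, 1],
]
boxes = [bounding_box(c) for c in connected_components(image)]
```

Each bounding box can then be cropped and passed on to feature extraction as one candidate symbol.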
Feature Extraction & Recognition
• In the feature extraction stage, each character is represented as a feature vector, which becomes its identity.
• The major goal of feature extraction is to extract a set of features that maximizes the recognition rate with the smallest number of elements.
Recognition Process
• The Generic Fourier Descriptor (GFD) is derived by applying a two-dimensional Fourier transform to a polar-raster sampled shape image.
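A simplified sketch of this idea follows. The slides state only that a 2D Fourier transform is applied to a polar-raster sampled shape, so everything else here is an assumption: the centroid as origin, nearest-neighbour sampling, the grid sizes, the number of frequencies kept, and the normalization by the zero-frequency term. The function name `gfd` is hypothetical.

```python
import cmath
import math


def gfd(image, radial_freqs=3, angular_freqs=4, n_radii=8, n_angles=16):
    """Return a GFD-style feature vector for a binary image (1 = ink)."""
    height, width = len(image), len(image[0])
    ink = [(x, y) for y in range(height) for x in range(width) if image[y][x]]
    if not ink:
        raise ValueError("image contains no ink pixels")
    # Use the shape centroid as the origin of the polar grid.
    cx = sum(x for x, _ in ink) / len(ink)
    cy = sum(y for _, y in ink) / len(ink)
    max_r = max(math.hypot(x - cx, y - cy) for x, y in ink) or 1.0

    # Polar-raster sampling: rows are radii, columns are angles.
    polar = []
    for ri in range(n_radii):
        r = max_r * ri / n_radii
        row = []
        for ti in range(n_angles):
            theta = 2 * math.pi * ti / n_angles
            x = int(round(cx + r * math.cos(theta)))
            y = int(round(cy + r * math.sin(theta)))
            row.append(image[y][x] if 0 <= x < width and 0 <= y < height else 0)
        polar.append(row)

    def polar_fourier(rho, phi):
        """2D discrete Fourier coefficient of the polar raster."""
        total = 0j
        for ri in range(n_radii):
            for ti in range(n_angles):
                if polar[ri][ti]:
                    angle = 2 * math.pi * (ri * rho / n_radii + ti * phi / n_angles)
                    total += cmath.exp(-1j * angle)
        return total

    dc = abs(polar_fourier(0, 0))
    if dc == 0:
        raise ValueError("no ink pixels were sampled")
    features = []
    for rho in range(radial_freqs):
        for phi in range(angular_freqs):
            if rho == 0 and phi == 0:
                features.append(dc / (n_radii * n_angles))  # share of ink samples
            else:
                features.append(abs(polar_fourier(rho, phi)) / dc)
    return features


# Hypothetical 9x9 image containing a filled 5x5 square.
square = [[1 if 2 <= x <= 6 and 2 <= y <= 6 else 0 for x in range(9)]
          for y in range(9)]
features = gfd(square)
```

Only magnitudes are kept, so a rotation of the shape, which becomes a circular shift along the angular axis of the polar raster, has little effect on the vector.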
Generic Fourier Descriptor
[Figure: images of the character ក and the corresponding GFD feature vector]
Recognition Process
[Figure: an unknown input image (???) compared against a training image]
Recognition Process
• The similarity between two shapes is measured by the city-block distance between their feature vectors.
• The lower the value, the more similar the shapes are.
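The matching rule above can be sketched as a nearest-neighbour search under the city-block (L1) distance. The feature values and the names `city_block` and `recognize` are hypothetical and serve only as an illustration.

```python
def city_block(u, v):
    """City-block (L1) distance between two feature vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(a - b) for a, b in zip(u, v))


def recognize(features, training_set):
    """Return the label of the training vector closest to `features`.

    `training_set` is a list of (label, feature_vector) pairs.
    """
    return min(training_set, key=lambda item: city_block(features, item[1]))[0]


# Made-up feature vectors for two Khmer characters.
training_set = [
    ("ក", [0.9, 0.1, 0.4]),
    ("ខ", [0.2, 0.7, 0.5]),
]
label = recognize([0.8, 0.2, 0.4], training_set)
```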
Post-processing: Reordering
[Figure: Segmentation → Recognized word → Reordering]
Experimental Result
[Table: Precision, Recall and F-Measure (values not recoverable)]
• The test was conducted on a document of … symbols at a resolution of 300 dpi.
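For reference, the three measures are usually computed from counts of correct and incorrect decisions, as sketched below. The counts in the example are made up for illustration and are not results from this work; the function name `precision_recall_f` is hypothetical.

```python
def precision_recall_f(true_pos, false_pos, false_neg):
    """Return (precision, recall, F-measure) from raw counts."""
    predicted = true_pos + false_pos  # symbols the system reported
    actual = true_pos + false_neg     # symbols really present
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Illustrative counts only: 8 correct, 2 wrong, 2 missed.
precision, recall, f_measure = precision_recall_f(8, 2, 2)
```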
Khmer OCR Using Generic Fourier Descriptor
Thank you for your attention!
