Arabic Handwritten Text Recognition and Writer Identification

A seminar on a Ph.D. thesis that presents a proposed system for recognizing Arabic handwritten text and identifying its writer. The seminar describes each step of the system in detail and reports the results obtained.

  1. Arabic Handwritten Text Recognition and Writer Identification. Prepared by: Mustafa Salam Kadhm. Supervisor: Asst. Prof. Dr. Alia K. Abdul Hassan. Ministry of Higher Education & Scientific Research, University of Technology, Department of Computer Science, 2017.
  2. Contents: (1) Problem Statement, (2) Aim of Thesis, (3) Proposed System, (4) Experiments and Results, (5) Conclusions
  3. Problem Statement
     • Most governments and organizations hold handwritten documents that need to be made editable and searchable.
     • Arabic handwritten text recognition is more complex than recognition of other handwritten languages because Arabic script is cursive in nature.
     • Existing recognition systems, which depend on character segmentation, achieve poor accuracy.
     • The recognition results of existing systems are not authenticated.
     • Arabic handwritten databases are not readily available.
  4. Aim of Thesis
     • Develop an accurate Arabic handwritten text recognition system based on multi-scale feature extraction methods and an SVM classifier.
     • Employ the proposed system in a security application by identifying the writer of the input handwritten text.
     • Develop an Arabic handwritten database of color and gray handwritten images that supports character, word, and text recognition systems and can be used for security applications.
  5. The Proposed System Architecture [figure: pipeline from input to output]
  6. The Proposed System Architecture [figure: pipeline from input to output]
  7. Image Acquisition [figure: gray and color input images]
  8. Segmentation: a Sobel filter is applied to the input image, then dilation and filling are applied, rectangles are drawn around the resulting objects, the obtained rectangles are drawn on the original image, and the handwritten sub-images are extracted.
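
The segmentation steps on slide 8 can be sketched roughly as follows. This is a minimal illustration, not the thesis implementation: it assumes the Sobel edge map has already been binarized into a list of 0/1 rows, it uses a 3x3 dilation and 8-connected components, and it omits the filling step. The function names `dilate`, `bounding_boxes`, and `crop` are hypothetical.

```python
from collections import deque

def dilate(img):
    """3x3 binary dilation: every set pixel also sets its 8 neighbours."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w:
                            out[ny][nx] = 1
    return out

def bounding_boxes(img):
    """Inclusive (top, left, bottom, right) box of each 8-connected component."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if img[y][x] and not seen[y][x]:
                top, left, bottom, right = y, x, y, x
                queue = deque([(y, x)])
                seen[y][x] = True
                while queue:  # breadth-first flood fill of one component
                    cy, cx = queue.popleft()
                    top, bottom = min(top, cy), max(bottom, cy)
                    left, right = min(left, cx), max(right, cx)
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = cy + dy, cx + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and img[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                queue.append((ny, nx))
                boxes.append((top, left, bottom, right))
    return boxes

def crop(img, box):
    """Cut the sub-image covered by an inclusive bounding box."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in img[top:bottom + 1]]
```

Dilating before labelling merges nearby strokes and dots into one object, so each box tends to cover a whole sub-word rather than a single stroke. The boxes are then used to crop the original image.
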
  9. Preprocessing (Image Thresholding) [figure: gray input images and binary output images for the AHDB, IESK-arDB, and proposed databases]
  10. Preprocessing (Noise Removal) [figure: two examples with noise removed]
  11. Preprocessing [figure: Image 1, Image 2, and Image 3 with edge detection]
  12. Preprocessing (Image Normalization) [figure: Image 1, Image 2, and Image 3 normalized to 128 x 128]
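
Slide 12 states only the target size of 128 x 128, not the resampling method. As a sketch under that caveat, nearest-neighbour resampling (an assumption, chosen because it keeps a binary image binary) could look like this. The function name `resize_nearest` is hypothetical.

```python
def resize_nearest(img, out_h=128, out_w=128):
    """Resize a 2D list of pixels to out_h x out_w by nearest-neighbour sampling."""
    in_h, in_w = len(img), len(img[0])
    return [
        # Map each output pixel back to the source pixel that covers it.
        [img[(y * in_h) // out_h][(x * in_w) // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]
```
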
  13. Features Base1 Construction: structural features, extracted from Image 1
  14. Features Base1 Construction: statistical features, extracted from Image 2 divided into Block 1, Block 2, Block 3, and Block 4
  15. Features Base1 Construction: Discrete Cosine Transform (DCT) features, extracted from Image 2 by taking the first 10 coefficients in zigzag order
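
The DCT step on slide 15 (transform the image, keep the first 10 coefficients in zigzag order) can be sketched as below. The orthonormal DCT-II scaling, the square input, and the names `dct2`, `zigzag_indices`, and `dct_features` are assumptions; the direct formula is written for clarity and is far too slow for a full 128 x 128 image, where a fast transform would be used instead.

```python
import math

def dct2(block):
    """Orthonormal 2D DCT-II of a square block (direct O(n^4) form)."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = 0.0
            for y in range(n):
                for x in range(n):
                    s += (block[y][x]
                          * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * x + 1) * v * math.pi / (2 * n)))
            out[u][v] = alpha(u) * alpha(v) * s
    return out

def zigzag_indices(n):
    """(row, col) pairs of an n x n matrix in JPEG-style zigzag order."""
    order = []
    for s in range(2 * n - 1):
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals run bottom-left to top-right, odd ones the other way.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order

def dct_features(block, k=10):
    """First k DCT coefficients of the block in zigzag order."""
    coeffs = dct2(block)
    return [coeffs[r][c] for r, c in zigzag_indices(len(block))[:k]]
```

Zigzag ordering visits low-frequency coefficients first, so the first 10 values summarize the coarse shape of the image.
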
  16. Features Base1 Construction: Modified Histogram of Oriented Gradient (MHOG1) features, extracted from Image 3 using the X-axis and Y-axis gradients and their magnitude and orientation
  17. Features Base1 Construction: Modified Histogram of Oriented Gradient (MHOG1) features [figure]
  18. Features Base1 Construction: Modified Histogram of Oriented Gradient (MHOG1) features, interpolation votes of gradient orientation
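
Slides 16 to 18 name the ingredients of MHOG1 (X and Y gradients, magnitude, orientation, interpolated votes) without giving the modifications. The sketch below therefore shows the standard HOG voting scheme, not the thesis's modified version. Central-difference gradients, 9 unsigned bins over 0 to 180 degrees, a single cell, and the function names are all assumptions.

```python
import math

def gradients(img):
    """Central-difference gradients; border pixels are left at zero."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = img[y][x + 1] - img[y][x - 1]
            gy = img[y + 1][x] - img[y - 1][x]
            mag[y][x] = math.hypot(gx, gy)
            # Unsigned orientation in degrees, folded into [0, 180].
            ang[y][x] = math.degrees(math.atan2(gy, gx)) % 180.0
    return mag, ang

def orientation_histogram(mag, ang, bins=9):
    """Histogram where each pixel splits its magnitude between the two nearest bins."""
    width = 180.0 / bins
    hist = [0.0] * bins
    for mrow, arow in zip(mag, ang):
        for m, a in zip(mrow, arow):
            if m == 0.0:
                continue
            pos = a / width - 0.5   # position measured from the bin centres
            lo = math.floor(pos)
            frac = pos - lo
            hist[lo % bins] += m * (1.0 - frac)   # vote for the lower bin
            hist[(lo + 1) % bins] += m * frac     # vote for the upper bin
    return hist
```

With interpolated votes, a gradient whose orientation lies between two bin centres contributes to both bins in proportion to its distance from each, which makes the histogram less sensitive to small rotations of a stroke.
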
  19. Features Base1 Construction: feature vector
  20. Classification
  21. Post-processing: proposed Arabic lexicon

      Arabic Text   Unicode
      الله          [1575, 1604, 1604, 1607]
      عبد           [1593, 1576, 1583]
      العام         [1575, 1604, 1593, 1575, 1605]
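
The lexicon on slide 21 pairs each recognized word with its decimal Unicode code points, so post-processing amounts to a table lookup followed by decoding. A minimal sketch, in which the integer class labels 0 to 2 and the names `LEXICON` and `decode` are hypothetical, while the code points are the ones listed on the slide:

```python
# Hypothetical class labels mapped to the code points listed on slide 21.
LEXICON = {
    0: [1575, 1604, 1604, 1607],
    1: [1593, 1576, 1583],
    2: [1575, 1604, 1593, 1575, 1605],
}

def decode(label):
    """Turn a predicted class label into editable Arabic text."""
    return "".join(chr(cp) for cp in LEXICON[label])
```

Storing code points rather than glyph images is what makes the classifier output editable and searchable text.
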
  22. Features Base2 Construction (module 2): Modified Histogram of Oriented Gradient (MHOG2) features
  23. Shape Features, extracted from Image 4
  24. Classification (module 2)
  25. Post-processing (module 2): proposed writers' lexicon

      Class Label   Writer Name
      [1]           Writer1
      [2]           Writer2
      ...           ...
      [n]           Writer(n)
  26. Proposed Arabic Handwritten Database [figure]
  27. Proposed Arabic Handwritten Database [figure]
  28. Proposed Arabic Handwritten Database [figure]
  29. Proposed Arabic Handwritten Database [figure]
  30. Experiments and Results: Segmentation

      Database   Correct Segm.   Under Segm.   Over Segm.   Misplaced Segm.
      AHDB       89%             3%            6%           2%
      Proposed   92%             4%            2%           2%

      Testing set = 50, true positives = 46
      Accuracy = (46 / 50) x 100 = 92%
  31. Evaluation of the AHTRS System (module 1): Preprocessing (image thresholding) [figure]
  32. Evaluation of the AHTRS System (module 1): Preprocessing (noise removal) [figure]
  33. Evaluation of the AHTRS System (module 1): Preprocessing (noise removal) [figure]
  34. Evaluation of the AHTRS System (module 1): Preprocessing (black space elimination and image normalization)

      Black space elimination (BSE):

      System                               Accuracy
      AHTRS system without BSE algorithm   93%
      AHTRS system + BSE algorithm         96.317%

      Image normalization:

      Image Size   Accuracy
      32 x 32      94%
      64 x 64      94.8%
      64 x 128     95.22%
      128 x 64     95%
      128 x 128    96.317%

  35. Evaluation of the AHTRS System (module 1): Features Extraction (MHOG1)

      Edge Detection Filter   Accuracy
      HOG filter              89.2%
      Sobel                   89%
      Canny                   87%
      Roberts                 90.1%
      Proposed                92.70%

      Approach               Accuracy
      Un-overlapped blocks   88.5%
      Overlapped blocks      92.70%

  36. Evaluation of the AHTRS System (module 1): Features Extraction (DCT)

      Blocks       Accuracy
      1 block      67.92%
      4x4 blocks   60%
      6x6 blocks   61.22%
      8x8 blocks   64.7%

      Ordering Technique   Accuracy
      Sequential           66.7%
      Zig-zag              67.92%

      Method   Extraction Time
      DCT      1.6
      FCT      0.8

  37. Evaluation of the AHTRS System (module 1): Features Extraction and Features Normalization

      Features                   Accuracy
      DCT                        67.92%
      MHOG1                      92.70%
      Statistical + Structural   70.88%
      All features               96.317%

      Features     Classification Time
      Without FN   4.5
      With FN      0.9
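
Slide 37 reports that features normalization (FN) shortens classification time but does not describe the FN algorithm. Purely as an illustration of what such a step commonly looks like before SVM training, the sketch below rescales every feature column to the range 0 to 1 (min-max scaling). This is an assumed stand-in, not the thesis's FN algorithm, and the function names are hypothetical.

```python
def fit_min_max(rows):
    """Per-column minimum and maximum of a list of feature vectors."""
    cols = list(zip(*rows))
    return [min(c) for c in cols], [max(c) for c in cols]

def transform(rows, mins, maxs):
    """Scale each feature to [0, 1]; constant columns map to 0.0."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, mins, maxs)]
        for row in rows
    ]
```

The minimum and maximum are computed on the training set only and then reused for test vectors, so that both are scaled the same way.
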
  38. Evaluation of the AHTRS System (module 1): Classification [figure]
  39. Evaluation of the AHTRS System (module 1): Classification

      Database    Kernel       Accuracy
      AHDB        linear       92%
      AHDB        polynomial   96.317%
      AHDB        RBF          93.1%
      IESK-arDB   linear       76%
      IESK-arDB   polynomial   82%
      IESK-arDB   RBF          78.66%
      Proposed    linear       96.2%
      Proposed    polynomial   98%
      Proposed    RBF          97%

      Testing set = 1365, true positives = 1314
      Accuracy = (1314 / 1365) x 100

  40. Evaluation of the AHTRS System (module 2): Features Extraction

      Features        Accuracy
      MHOG2           95.9%
      Shape           93%
      MHOG2 + Shape   100%

  41. Evaluation of the AHTRS System (module 2): Classification

      Approach         Kernel       Accuracy
      Sub-word level   linear       80%
      Sub-word level   polynomial   85%
      Sub-word level   RBF          81.9%
      Text level       linear       98%
      Text level       polynomial   100%
      Text level       RBF          98.6%

  42. Evaluation of the AHTRS System: Classification (modules 1 and 2)

      Module   Classifier   Accuracy
      1        KNN          93%
      1        SVM          98%
      1        ANN          94%
      2        KNN          95%
      2        SVM          100%
      2        ANN          98%
  43. Conclusions
      1. The proposed system relies on segmenting the handwriting into sub-images, an approach that is simple, practical, and efficient and that achieves higher accuracy than systems that depend on character segmentation.
      2. The steps of the proposed preprocessing stage produce binary, thinned, and cropped images without noise, which increases the system accuracy. In addition, choosing an appropriate edge detector and image normalization size improves the outcomes of the system.
  44. Conclusions (cont.)
      3. The use of MHOG1 and MHOG2 in the proposed system is the main successful part of this thesis and leads to better recognition and identification accuracy. Furthermore, the results show the strength of the proposed edge detection filter for MHOG1 over the other filters.
      4. The other proposed features (DCT, statistical, and shape features) make the system more accurate.
      5. The features normalization (FN) algorithm reduces the training and classification time, and therefore the overall processing time of the system.
  45. Conclusions (cont.)
      6. Using the one-vs-all approach with the polynomial kernel of the Support Vector Machine (SVM) classification algorithm yields more robust recognition and identification results than the other approaches, kernels, and classifiers.
      7. The proposed system achieved better accuracy on three different Arabic handwritten databases than all previous works.
      8. The proposed handwritten text database gives better accuracy than the other handwritten databases, and it can also be used for writer identification. In addition, the database can be used for character and word recognition.
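
Conclusion 6 combines a polynomial kernel with one-vs-all decisions. The sketch below shows only the prediction side of that idea: each class has its own binary SVM, and the class with the largest decision value wins. Training is omitted, and the kernel defaults (`degree=3`, `gamma=1.0`, `coef0=1.0`), the model layout, and the function names are assumptions rather than values taken from the thesis.

```python
def poly_kernel(x, z, degree=3, gamma=1.0, coef0=1.0):
    """Polynomial kernel K(x, z) = (gamma * <x, z> + coef0) ** degree."""
    return (gamma * sum(a * b for a, b in zip(x, z)) + coef0) ** degree

def decision_value(x, support_vectors, dual_coefs, bias, **kernel_args):
    """Signed score of one binary SVM: sum_i coef_i * K(sv_i, x) + bias."""
    return sum(
        c * poly_kernel(sv, x, **kernel_args)
        for sv, c in zip(support_vectors, dual_coefs)
    ) + bias

def predict_one_vs_all(x, models, **kernel_args):
    """models maps a class label to (support_vectors, dual_coefs, bias)."""
    return max(
        models,
        key=lambda label: decision_value(x, *models[label], **kernel_args),
    )
```

For the word recognizer the labels would be lexicon entries, and for the writer identifier they would be writer names. The same decision rule serves both modules.
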
  46. List of Publications
      Journals:
      1. Mustafa S., Alia K., “ACRS: Arabic Character Recognition System Based on Multi Features Extraction Methods”, International Journal of Scientific and Engineering Research, vol. 6, issue 10, pp. 656-661, 2015.
      2. Alia K., Mustafa S., “Handwriting Word Recognition Based on SVM Classifier”, International Journal of Advanced Computer Science & Applications, vol. 1, issue 6, pp. 64-68, 2015.
      3. Mustafa S., Alia K., “Handwriting Word Recognition Based on Neural Networks”, International Journal of Applied Engineering Research, vol. 10, issue 22, pp. 43120-43124, 2015.
      4. Alia K., Mustafa S., “An Efficient Image Thresholding Method for Arabic Handwriting Recognition System”, Engineering and Technology Journal, vol. 34, issue 1, pp. 26-34, 2016.
      5. Alia K., Mustafa S., “An Efficient Preprocessing Framework for Arabic Handwriting Recognition System”, Diyala Journal For Pure Sciences, vol. 12, issue 3, pp. 147-163, 2016.
      6. Alia K., Mustafa S., “Arabic Handwriting Text Recognition Based on Efficient Segmentation, DCT and HOG Features”, International Journal of Multimedia and Ubiquitous Engineering, vol. 11, issue 10, pp. 83-92, 2016.
      Conferences:
      1. Alia K., Mustafa S., “AHCR: Arabic Handwriting Character Recognition System Using Multi-scale Features, SVM And KNN Classifiers”, 2nd Global Conference on Contemporary Issues in Education, pp. 46, 2015.
      2. Alia K., Mustafa S., “Arabic Handwriting Text Recognition Based on EOD and HOG Features”, SAI Intelligent Systems Conference (IntelliSys), 2016. (Accepted)
  47. Thank you
