SlideShare une entreprise Scribd logo
1  sur  6
Télécharger pour lire hors ligne
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 07 | July 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1939
Handwritten Text Recognition and Translation with Audio
Patibandla Vishnu Deepak1, Vishnu Vardhan2, Yogesh Gautam3, Jayashree4
1VIII Semester, Dept. of CSE, BNMIT
2VIII Semester, Dept. of CSE BNMIT
3Asst. Professor, Dept of CSE, BNMIT, Karnataka, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - This project attempts to classify any unique
handwritten text in order to convert the text material to a
digital format. To complete this objective, we used two
different methods: direct word classification and separating
each character. Using a Convolutional Neural Network (CNN)
with different architectures, we train a model that can
accurately categorize words for the former. To create boxes
for each particular character, we employ a special LSTM with
convolution. Where LSTM is Long Short-Term Memory. These
separated characters are then forwarded into a CNN for
categorizing, and each particular word is subsequently
reassembled using the classification and separation findings.
Key Words: Optical character recognition, Convolutional
Networks, Text detection and recognition, translation,audio
speech.
1.INTRODUCTION
Despite the prevalence of electronic writing devices, many
people still choose taking notes on paper and a pen. Though,
there are certainadvantages therearespecific disadvantages
to writing text by hand and are provided below. It's tough to
efficiently keep and access physical documents, as well as to
efficiently search, sort them from a huge pile of documents.
Sharing these documents can also stressful.
Consequently, lots of valuable information is misplaced or
never analysed since these documents are never converted
to digital format. We chose to address this issue in our
research because we believe that the far better ease of
managing digital text compared to written language will
enable people to more easily access, search, share, and
analyse their records while still allowing them to use their
preferred writing method. The purpose of this project is to
delve deeper into the task of classifyinghandwrittentext and
converting it to a digital representation. Handwritten text is
a broad phrase, and we intended to reduce the scope of the
project by defining handwritten writing specifically for our
needs. We took on the task of identifying the visual of any
handwritten word, whether it was in cursive or block script,
for this research. This work can be combined with
algorithms that separate words from lines in a given line
image, which can then be used with algorithmsthatseparate
lines from lines in a given image of a whole handwritten
page. With these classes added, our project could be in the
form of a distributable file used by the end user and would
be a fully functional template that would prompt the user to
take a picture of the notes page, allowing users solve the
problem of converting handwritten documents to digital
format. Note that while some extra layers need to be added
to our model to create a fully functional deliverable product
to end users, we believe the categorization of the problem is
interesting. the most delicious and the hardest, that's why
we chose to approach it. instead of line-to-word, document-
to-line, etc.
Because CNNs function better on raw input pixels data
rather than specific features or sections of an particular
image, we can address this problem with complete word
pictures. We expect the improvement by extracting
characters from each particular word image and then
classifying eachwordcharacterindividuallytore-constructa
whole word, based on our findingsusing entirewordimages.
In summary, our models in both ways take in an
image/picture of the word and thenoutputthe word'sname.
2. RELATED WORK
2.1 Early Scanners
The first motivationfortheclassificationofhandwrittentexts
was the digit classification of mail. Early postal readers, such
as Jacob Rabinow's, used integrated scanningandprocessing
devices to recognize fixed-spaced fonts. Allum et al. He
refined this by developing a sophisticated scanner that
allowed more variation in the way text was written and put
the data into a barcode that was printed directly on the
letter.in text input and that also encoded the information
onto a bar-code that was printed directly on the letter.
2.2 To the Digital Age
In 1974, Ray Kurzweil developed the first known piece of
OCR software that could identify any typeface. In this
software, the matrix technique was used more frequently
(pattern matching). In essence, this would determine which
character it most closely resembled by comparing the
bitmaps of the template character with the read character.
The drawback of this programme was thatitwassensitiveto
variations in font size and writing style. To improve
templating, OCR software switched from using templates to
feature extraction. Software would look for each character's
geometric moments, zoning, and projection histograms.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 07 | July 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1940
2.3 Machine Learning
The Lecun concentrated on utilizinggradient-basedlearning
approaches with multi-module ML models, which served as
the forerunner for the first end-to-end modern deep
learning models.
Adoption of a Hidden Markov Model for OCR was the next
big advancement in attaining high OCR accuracy. This
method uses each letter as a specific state, and then
considering the context of the character when choosing the
next hidden variable. In comparison to feature extraction
technique and the Naive Bayes approach, this resulted in
higher accuracy. The manual extraction elements remained
the biggest drawback, since they required prior language
expertise and were not especially robust to thediversityand
complexity of handwriting.
Ng and colleagues used a sliding window to apply CNNs to
the issue of identifying text that is observed inthereal world
(signs, written, etc.) within an image.Aslidingwindowscans
the picture to see if any characters might be present. Each
character was classified using a CNN with two of each
convolutional layers, and averagepoolinglayerslastlya fully
connected layer.
Scan, and Read: End-to-End MDLSTM Handwritten
Paragraph Recognition One of the most renowned
publications for the task of handwritten text detection is
Attention. An LSTM layer was utilised to encode the raw
image data to a feature map for each scanning direction.
After then, the model would use attention to give particular
feature maps more priority than others. The attention map
was created, then it was given to thedecoder, whichused the
visual summary and state to anticipate the character. This
approach was distinctive in that it combined the
segmentation and classification processes into a single
model rather than separating them. Due to the lack of a
language model for producing character and word
sequences, this approach has one drawback. Without
considering the context of the invented word, it solely relies
on each character's aesthetic categorization.
We also found that using a project from CS 231N in the past
helped us with our job. Yan uses the Faster R- CNN model to
categorize and identify specific characters within a word..
This method initially determines whether an object exists
within the image's bounds by dragging a window across it.
The bounded image is then classified to the character that
corresponds to it. Yan also uses edit distance, which lets
users to make changes to the separated word to see if
another separated word (for example, xoo versus zoo) is
more likely to be correct.
3. DATA
The IAM Handwriting Dataset was our primary source for
handwriting recognition training. The handwritten text in
this dataset totals 5500 phrases and 11500 words,anditwas
written on more than 1500 forms (a form is a sheet of paper
with lines of text) by more than 600 authors.Thewordswere
segmented and meticulouslychecked before being delivered
with all relevant formlabel metadata in associated XML files.
The source material was taken from the Lancaster-
Oslo/Bergen (LOB) corpus, which comprises over 1 million
words of entire English sentences. There are another 1,066
forms in the database, written by roughly 400 distinct
authors.
Fig -1: a form from the IAM Handwriting dataset as an
example. The dataset's word pictures were taken from
such formats.
Because of its breadth, depth, and quality, this database is
often used as the foundation for many handwriting
recognition tasks, this is the reason for choosing the IAM
Handwriting Dataset as the foundation of our training,
validation, and test data for our models. Lastly, large
datasets—even those that contain multiple pre-trained
models—are essential for deep learning, and this dataset,
which contained over 100K word instances, satisfied those
requirements. Deep learning models need at best 105 -106
training examples to work properly, regardless of transfer
learning.
4. PRE-PROCESSING STAGE
Pre-processing is a sequence of procedures applied to the
captured image. In essence, it improves the depiction of the
image to make it segmentation-ready.Alotofprocedures are
carried out on an image as part of its pre-processingasgiven
below.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 07 | July 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1941
4.1 Noise Removal
This pre-processing stage raises the qualityoftherawimage
and increases the data relevance. This stageisalsoknownas
pixel level or low-level processing. Skew correction of an
image is carried out during the pre-processing stage to
accurately display the article text lines, threshold to convert
a colour or grayscale image into a binary image, and noise
reduction to reduce superfluous data, asillustratedinFigure
2.
Dust, spots, dots, and lines are all examples of noise
that can contaminate scanned documents and significantly
alter recognition outcomes. A scannedarticleimageneedsto
be cleaned of any noise in order to make it suitable for
further processing. To enhance the image that a machine or
person is viewing, image enhancement techniques are used.
Noise reductionusessmoothingandnonlinearprocesseslike
morphological operations.
Fig -2: Results from the pre-processing phase, including
noise removal and skew correction.
4.2 Skew Correction
If the text line in scanned document images is not
horizontally aligned after scanning, skew correction is
employed to fix it. On the document or line level, the skew
correction can be implemented. A global skew correction is
implemented on the document level during the pre-
processing stage.
4.3 Slant Correction
To adjust the writing style's inclination, slant correction is
used. The writings are altered from their slant to their
upright state by means of a shear transformation.Numerous
directions are affected by a shear transformation. The pixel
values of the same image that have been vertically movedby
distances d and -d are added to the converted image data for
each direction, as illustrated in Figure 3.
Fig -3: Steps in Slant correction.
4.4 Line Segmentation
An image with an alphabetic sequence is broken down into
smaller images that represent individual characters during
this segmentation stage of pre-processing. By giving each
alphabet a number via a labelling procedure, the pre-
processed input image is separated into distinct alphabets.
The labelling indicates how many characters aretherein the
image. For the categorization and identification stage, every
single character is uniformly reconstructed into 90*60
pixels.
5. METHODOLOGY
5.1 Training Level
Following SoftMax activation, we trained eachofourmodels
on a cross-entropy loss function using the ADAM Optimizer.
With the exception of changing the last completely
connected output layer to map to the number of classes
(word/character vocabulary), we kept the identical
architectures for the Visual Geometry Group (VGG) and
Residual Networks (RESNET) models. We started from
scratch while training the word and character classifier. We
reasoned that the enormousSoftMaxlayeroutputsize(more
than 10,000 words in our training vocabulary and well over
100,000 words in only the English language) and the
difficulty of fine-grained word picture identification were
impacting our performance for word-level classification.We
came to the conclusion that character-level classification
might be a more effective strategy because the SoftMax's
computational complexity can be greatly reduced by using a
character vocabulary that is significantlylesscomplexthana
word vocabulary (the characters A-Za-z0-9 only have 62
distinct symbols). In addition, due to the constrained
character set, character identification in an image is an
easier challenge than word recognition in an image. The
separation of word specific images into their component
character images will be the first significant obstacle we
would have to overcome in order to evaluate this strategy.
The second major difficulty would be identifying word
breaks in images and connecting consecutively identified
letters in between these word breaks to make words. In this
section, we'll focus on the first of these problems. We used
the brand-new CNN/LSTM engine powered by Tesseract4.0
neural networks to carry out this task. This model is set up
as a text line recognizer that can automatically identifymore
than 100 languages. It was originally developed byHPand is
now maintained by Google. The Tesseract model had been
trained on roughly 400000 text lines spread among about
4500 fonts for Latin-based languages, including English. On
our IAM handwriting dataset, we then adjusted this
pretrained model's parameters. The original word input
images were split into their hypothetical component
character images after the model was tweaked, and these
output segmented character images were fed into our
character-level categorization system.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 07 | July 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1942
5.2 Each Word-Level Classification
We started using VGG-19 to train our Word-level
classification model. First off, considering the quantity of
parameters it needs, we discovered that this model is
difficult to train. Initially, we tried using the whole
vocabulary, but we discovered that the run time was too
slow. We limited the vocabulary to 50 terms that were
chosen at random and showed up at least 20 times in our
dataset in order to get results. Because some of the chosen
words, like "the," contained hundreds of training examples,
training was still quite slow. In order to have enough data to
benchmark outcomes with a better trade-offonthe runtime,
we limited each word to 20 occurrences. Our VGG-19 model
was subsequently trimmed, and training with numerous
parameters was started. Our learning rate was too high
during our first runs, which led to low end training and
validation accuracy. We increased the learning rate of our
Adam Optimizer, which achieved training accuracy of 28%
and validation accuracy of 22%.
Graph -1: When we first started training, Our learning rate
was set too quickly, which resulted in inconsistent accuracy.
We looked for alternative architectures after realizing that
VGG-19 was fairly slow due to the quantity of parameters it
needs. We discovered CNN-benchmarks that suggested
RESNETi-18 could deliver comparable results at
substantially faster speeds. Furthermore, we also thought
that our model's ability to train the residual layer of the
RESNET architecture would improve our accuracy,
especially with deeper networks, as our model would be
modified or taught to achieve optimality for the residual
layer.
Graph -2: We found that our loss was steadily decreasing
after modifying our learning rate and model input.
We have discovered that limiting the number of epochsused
when experimenting with various parameters is a good
method. We trained our model on two epochs,comparedthe
results, and then reran it on more epochs with a subset of
parameters that produced the best results and those whose
loss/accuracy graphs looked the most promising because
our training and validation processes took a long time and
we wanted to find the best parameters. In terms of training,
this method undoubtedly saved us a ton of time, but it also
had drawbacks. Some parameters, for example, might have
taken longer to converge but eventually attained their best
value in the end. If their graphs did not appear sufficiently
promising and their findings weresubparwhencomparedto
the other parameters, we might have overlooked such
parameters. Due to a lack of resources, this presented a
hurdle for us, but we made an effort to overcome it by taking
into account the parameter graphs in addition to the raw
data they supplied.
Because only one class dominated the weight updates at the
beginning of our training efforts and at the end, all of the
instances were assigned tothatclass,thevalidationaccuracy
was initially practically random. We discovered that the
problem was in fact with the weights and that the updates
were too large, saturating weightvaluesatthebeginningand
inhibiting learning after printing the weights and their
changes for two samples.
Table -1: Results of Classification at the Word and
Character Levels.
Architecture Training Accuracy Validation
Accuracy
Test
Accuracy
VGG-19 28% 22% 20%
RESNET-18 31% 23% 22%
RESNET-34 35% 27% 25%
Char-Level 38% 33% 31%
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 07 | July 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1943
RESNET-34 was implemented to deepen the network in
response to our RESNET-18 results, which we used in
conjunction with our model. The accuracy we achieved
during training was 35%, while accuracy during validation
was 27%. In order to avoid having parameter updates occur
too frequently and slow down our model, we decided to
utilize a reasonably large mini-batch size of 50. A bigger
mini-batch size would collect too much data all at once to
indicate the gradient's direction. Due to the need for semi-
regular updates and the avoidance of the gradient's value
being distorted, this was crucial for our model.
Qualitatively, the findings of word classification are
encouraging; they demonstrate that, despite data
constraints, we can still obtain respectable accuracy for a
highly complex issue. We are aware, however, that larger
vocabulary sizes will make it more difficult to gather the
essential data and will drastically impair accuracy,
undermining the capacity to effectivelyconverthandwritten
text to a digital format. This motivated us to start looking
into the character segmentation technique.
Fig -4: When we first started training, Our learning rate
was set too quickly, which resulted in inconsistent
accuracy.
5.3 Character Classification Level
We then move on to letter segmentation and word
reconstruction by classifying each character separately to
improve the direct word classification results.TheTesseract
LSTM model with convolution was firstly downloaded and
pre-trained on the aforesaid English datasets before it being
refined onto our dataset. We then had to modify each of our
input data so that it included not only input image, but also
bounding box labels of each particular character in that
image and the closest approximation in the four top left,
bottom top left, top right, and top left of each personality
character, which we extracted from the XML data.
Given the pertinent data, the Tesseractcontainsprogramsto
automatically construct this dataset. Following that, we
created a character vocabulary made up of single-digit
numerals, uppercase, and lowercase letters. Adam, the
optimization technique weultimatelyusedinourfinal model
was also used, but we used a different kind of loss that was
more appropriate for the issue: CTC loss. In a nutshell, CTC
(Connectionist Temporal Classification) is a technique/loss
function created by Graves et al. for teaching recurrent
neural networks to label output sequences (labels for
characters in our example)fromunsegmentedinputdata the
input word picture in our case. RNNs with CTC loss have
been shown to be more efficient than more conventional
methods like CRFs (Conditional Random Fields) and HMMs
for tasks like labelling speech signal data with word-level
transcriptions because of their automated learning process,
need for only an input/output data representation as
opposed to substantial hand-engineeredfeatures,andability
to more generally capture context over time and space in
their hidden state. CTC can separate and label each of the
unstructured input data available by assessing network
outputs as a probability distribution over all possible label
sequences, conditioned on a given input sequence” created
by considering segmentation of the input sequence given a
certain minimum segment size and starting training of the
network from there. We provided these final segmented
character photos to our model after honing and completing
our segmentation model, as previously described. With the
same model designs for classification and even with the
possibility of segmentation error, we discovered that
character-level classification was more successful,asseen in
the accuracy graphforcharactervs.word-level classification.
These findings corroborated our hypotheses that the
performance was improved by the model's initial feature
representation problem, which for characters was
significantly less in scope than for words, and its final
labelling problem. Due to a lack of data that was enough for
the extent of our issue and flaws in the segmentation model,
we believed that our model did not perform any better.
Fig -5: For each word, we can segment each character
using Tesseract LSTM. For example, we can extract the
start and end coordinates of each character in the word
"MOVE".
Due to the breakdown in the borders between some cursive
letters, our model, like most segmentation models, has
difficulty segmenting cursive characters. Infact,becauseour
segmentation model andcharacterclassificationmodel were
trained independently, we were unabletopreciselypinpoint
whether our classificationmodel orsegmentationmodel was
to blame for the inaccuracy. Last but not least, writers'
handwriting varies greatly, making it difficult to distinguish
between all of the many ways that a word or character is
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 09 Issue: 07 | July 2022 www.irjet.net p-ISSN: 2395-0072
© 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1944
written. We reasoned that, as is frequently the case with
deep learning challenges, even additional data (over the
course of millions more words) would have assisted in
teaching our model a more generalized feature
representation for the issue, which would have improved its
performance.
3. CONCLUSIONS
Deep learning has gained popularity in recent years and has
started to be used in a variety of fields, including facial
recognition, target identification, and optical character
recognition (OCR). Target detection is consistently used by
researchers. Networks of various types, from the Faster and
RCNN OCR area, used to find texts. in comparison to text
recognition in the classical sense by embedded. With
consistent noise immunity, deep learning is more accurate
and durability. It has the capacity to fend off effects like
alterations in the backdrop. The deep learning end-to-end
network is more precise when applied to text detection,
when compared to customary picture processing and
cutting. Machine learning enables character classification.
Apart from the text detection and conversion it into digital
format this application also provides with translating the
text into other languages mainly focusing on English and
Hindi.
ACKNOWLEDGEMENT
Vishnu Deepak Patibandla, Vishnu Vardhan and Yogesh
Gautam are grateful to Mrs. Jayashree, Assistant Professor,
BNMIT for her constant support and technical guidanceover
the course this project. Also,VishnuDeepak,VishnuVardhan
and Yogesh Gautam would like express our immense
gratitude towards Mrs. Chayadevi, M L HOD at BNMIT for
her constant support.
REFERENCES
[1] A Novel Approach to Classify English Handwritten
Character Using Convolutional Neural Networks by
Rupesh Sharma, Prashant Sharma, Naresh Purohit,
International Journal ofAdvancedComputerScienceand
Applications, 21-23 Mar. 2021, West Yorkshire,
Malayasia.
[2] Handwritten character recognition using Convolutional
neural networks by Verdha Chaudhary, Mahak Bansal
and Guarav Raj, IEEE Seventh International Conference
on Intelligent Systems Design and Applications, 09 Dec.
2020, Rio de Janeiro, Brazil.
[3] CNN Based CommonApproachtohandwrittencharacter
recognition of multiple scripts by Durjoy Sen Maitra,
Ujjwal Bhattacharya and Swapan K. Paru, IEEE
Proceedings of 2nd International Conference on
Document Analysis and Recognition, 20-22 Oct. 2020,
Tsukuba, Japan.
[4] Handwritten Character Recognition from Images using
CNN-ECOC by Mayur Bhargab Bora, Dinthisrang
Daimary, Khwairakpam Amitab, Debdatta Kandar.,
International Conference on Innovative Practices in
Technology and Management, 17-19 Feb. 2020, Noida,
India.
[5] Handwritten character recognition using CNN by S.
Anandh kishan, J. Clinton David, P. Sharon Femi, IEEE
Transactions on Pattern Analysis and Machine
Intelligence, 05 May. 2020, Melbourne, Australia.
[6] Transfer learning using CNN for handwritten
Devanagari Character Recognition by Nagender Aneja
and Sandhya Aneja.,International Journal ofEngineering
and Computer Science, 06-09 Feb. 2020, Delhi, India.
[7] HandwrittenCharacterRecognitionofDevanagariScript
by Breijeshwar Dessai Amit Patel, IEEE Proceedings of
2nd International ConferenceonDocumentAnalysisand
Recognition, 20-22 Oct. 2019, Tokyo, Japan.
[8] An End-to-End System for Hindi Online Handwriting
Recognition, bySoumik Bhattacharya,DurjoySenMaitra,
Ujjwal Bhattacharya and Swapan K. Parui, IEEE third
International Conference, 2016.
[9] Automatic image-to-Text-to-Voice Conversion for
Interactively Locating Objects in Home Environments,
by Nikolas Bourbk, 20th IEEE Interactive conference on
Tools with ArtificialIntelligence,2008,1082-3409/08.
[10] Review on conversion of image to text as well as speech
using edge detection and image segmentation by
Mrunmayee patil, Ramesh kagalkar, international
journal of science and research (IJSR),2319-
7064,2012,3.358.

Contenu connexe

Similaire à Handwritten Text Recognition and Translation with Audio

IMAGE TO TEXT TO SPEECH CONVERSION USING MACHINE LEARNING
IMAGE TO TEXT TO SPEECH CONVERSION USING MACHINE LEARNINGIMAGE TO TEXT TO SPEECH CONVERSION USING MACHINE LEARNING
IMAGE TO TEXT TO SPEECH CONVERSION USING MACHINE LEARNINGIRJET Journal
 
Devanagari Character Recognition using Image Processing & Machine Learning
Devanagari Character Recognition using Image Processing & Machine LearningDevanagari Character Recognition using Image Processing & Machine Learning
Devanagari Character Recognition using Image Processing & Machine LearningIRJET Journal
 
A Deep Learning Approach to Recognize Cursive Handwriting
A Deep Learning Approach to Recognize Cursive HandwritingA Deep Learning Approach to Recognize Cursive Handwriting
A Deep Learning Approach to Recognize Cursive HandwritingIRJET Journal
 
IRJET- Text Extraction from Text Based Image using Android
IRJET- Text Extraction from Text Based Image using AndroidIRJET- Text Extraction from Text Based Image using Android
IRJET- Text Extraction from Text Based Image using AndroidIRJET Journal
 
IRJET- Offline Transcription using AI
IRJET-  	  Offline Transcription using AIIRJET-  	  Offline Transcription using AI
IRJET- Offline Transcription using AIIRJET Journal
 
IRJET- Neural Network based Script Recognition using Wavelet Features: An App...
IRJET- Neural Network based Script Recognition using Wavelet Features: An App...IRJET- Neural Network based Script Recognition using Wavelet Features: An App...
IRJET- Neural Network based Script Recognition using Wavelet Features: An App...IRJET Journal
 
Design and Description of Feature Extraction Algorithm for Old English Font
Design and Description of Feature Extraction Algorithm for Old English FontDesign and Description of Feature Extraction Algorithm for Old English Font
Design and Description of Feature Extraction Algorithm for Old English FontIRJET Journal
 
SIGN LANGUAGE INTERFACE SYSTEM FOR HEARING IMPAIRED PEOPLE
SIGN LANGUAGE INTERFACE SYSTEM FOR HEARING IMPAIRED PEOPLESIGN LANGUAGE INTERFACE SYSTEM FOR HEARING IMPAIRED PEOPLE
SIGN LANGUAGE INTERFACE SYSTEM FOR HEARING IMPAIRED PEOPLEIRJET Journal
 
IRJET- Extension to Visual Information Narrator using Neural Network
IRJET- Extension to Visual Information Narrator using Neural NetworkIRJET- Extension to Visual Information Narrator using Neural Network
IRJET- Extension to Visual Information Narrator using Neural NetworkIRJET Journal
 
Image Based Tool for Level 1 and Level 2 Autistic People
Image Based Tool for Level 1 and Level 2 Autistic PeopleImage Based Tool for Level 1 and Level 2 Autistic People
Image Based Tool for Level 1 and Level 2 Autistic PeopleIRJET Journal
 
IRJET- On-Screen Translator using NLP and Text Detection
IRJET- On-Screen Translator using NLP and Text DetectionIRJET- On-Screen Translator using NLP and Text Detection
IRJET- On-Screen Translator using NLP and Text DetectionIRJET Journal
 
IRJET- Handwritten Character Recognition using Artificial Neural Network
IRJET- Handwritten Character Recognition using Artificial Neural NetworkIRJET- Handwritten Character Recognition using Artificial Neural Network
IRJET- Handwritten Character Recognition using Artificial Neural NetworkIRJET Journal
 
Product Label Reading System for visually challenged people
Product Label Reading System for visually challenged peopleProduct Label Reading System for visually challenged people
Product Label Reading System for visually challenged peopleIRJET Journal
 
2017 IEEE Projects 2017 For Cse ( Trichy, Chennai )
2017 IEEE Projects 2017 For Cse ( Trichy, Chennai )2017 IEEE Projects 2017 For Cse ( Trichy, Chennai )
2017 IEEE Projects 2017 For Cse ( Trichy, Chennai )SBGC
 
Deep Learning in Text Recognition and Text Detection : A Review
Deep Learning in Text Recognition and Text Detection : A ReviewDeep Learning in Text Recognition and Text Detection : A Review
Deep Learning in Text Recognition and Text Detection : A ReviewIRJET Journal
 
A Graph-based Web Image Annotation for Large Scale Image Retrieval
A Graph-based Web Image Annotation for Large Scale Image RetrievalA Graph-based Web Image Annotation for Large Scale Image Retrieval
A Graph-based Web Image Annotation for Large Scale Image RetrievalIRJET Journal
 
IRJET- Detection and Recognition of Hypertexts in Imagery using Text Reco...
IRJET-  	  Detection and Recognition of Hypertexts in Imagery using Text Reco...IRJET-  	  Detection and Recognition of Hypertexts in Imagery using Text Reco...
IRJET- Detection and Recognition of Hypertexts in Imagery using Text Reco...IRJET Journal
 
Named Entity Recognition (NER) Using Automatic Summarization of Resumes
Named Entity Recognition (NER) Using Automatic Summarization of ResumesNamed Entity Recognition (NER) Using Automatic Summarization of Resumes
Named Entity Recognition (NER) Using Automatic Summarization of ResumesIRJET Journal
 
IRJET- Text Line Detection in Camera Caputerd Images using Matlab GUI
IRJET- Text Line Detection in Camera Caputerd Images using Matlab GUIIRJET- Text Line Detection in Camera Caputerd Images using Matlab GUI
IRJET- Text Line Detection in Camera Caputerd Images using Matlab GUIIRJET Journal
 
Real Time Sign Language Detection
Real Time Sign Language DetectionReal Time Sign Language Detection
Real Time Sign Language DetectionIRJET Journal
 

Similaire à Handwritten Text Recognition and Translation with Audio (20)

IMAGE TO TEXT TO SPEECH CONVERSION USING MACHINE LEARNING
IMAGE TO TEXT TO SPEECH CONVERSION USING MACHINE LEARNINGIMAGE TO TEXT TO SPEECH CONVERSION USING MACHINE LEARNING
IMAGE TO TEXT TO SPEECH CONVERSION USING MACHINE LEARNING
 
Devanagari Character Recognition using Image Processing & Machine Learning
Devanagari Character Recognition using Image Processing & Machine LearningDevanagari Character Recognition using Image Processing & Machine Learning
Devanagari Character Recognition using Image Processing & Machine Learning
 
A Deep Learning Approach to Recognize Cursive Handwriting
A Deep Learning Approach to Recognize Cursive HandwritingA Deep Learning Approach to Recognize Cursive Handwriting
A Deep Learning Approach to Recognize Cursive Handwriting
 
IRJET- Text Extraction from Text Based Image using Android
IRJET- Text Extraction from Text Based Image using AndroidIRJET- Text Extraction from Text Based Image using Android
IRJET- Text Extraction from Text Based Image using Android
 
IRJET- Offline Transcription using AI
IRJET-  	  Offline Transcription using AIIRJET-  	  Offline Transcription using AI
IRJET- Offline Transcription using AI
 
IRJET- Neural Network based Script Recognition using Wavelet Features: An App...
IRJET- Neural Network based Script Recognition using Wavelet Features: An App...IRJET- Neural Network based Script Recognition using Wavelet Features: An App...
IRJET- Neural Network based Script Recognition using Wavelet Features: An App...
 
Design and Description of Feature Extraction Algorithm for Old English Font
Design and Description of Feature Extraction Algorithm for Old English FontDesign and Description of Feature Extraction Algorithm for Old English Font
Design and Description of Feature Extraction Algorithm for Old English Font
 
SIGN LANGUAGE INTERFACE SYSTEM FOR HEARING IMPAIRED PEOPLE
SIGN LANGUAGE INTERFACE SYSTEM FOR HEARING IMPAIRED PEOPLESIGN LANGUAGE INTERFACE SYSTEM FOR HEARING IMPAIRED PEOPLE
SIGN LANGUAGE INTERFACE SYSTEM FOR HEARING IMPAIRED PEOPLE
 
IRJET- Extension to Visual Information Narrator using Neural Network
IRJET- Extension to Visual Information Narrator using Neural NetworkIRJET- Extension to Visual Information Narrator using Neural Network
IRJET- Extension to Visual Information Narrator using Neural Network
 
Image Based Tool for Level 1 and Level 2 Autistic People
Image Based Tool for Level 1 and Level 2 Autistic PeopleImage Based Tool for Level 1 and Level 2 Autistic People
Image Based Tool for Level 1 and Level 2 Autistic People
 
IRJET- On-Screen Translator using NLP and Text Detection
IRJET- On-Screen Translator using NLP and Text DetectionIRJET- On-Screen Translator using NLP and Text Detection
IRJET- On-Screen Translator using NLP and Text Detection
 
IRJET- Handwritten Character Recognition using Artificial Neural Network
IRJET- Handwritten Character Recognition using Artificial Neural NetworkIRJET- Handwritten Character Recognition using Artificial Neural Network
IRJET- Handwritten Character Recognition using Artificial Neural Network
 
Product Label Reading System for visually challenged people
Product Label Reading System for visually challenged peopleProduct Label Reading System for visually challenged people
Product Label Reading System for visually challenged people
 
2017 IEEE Projects 2017 For Cse ( Trichy, Chennai )
2017 IEEE Projects 2017 For Cse ( Trichy, Chennai )2017 IEEE Projects 2017 For Cse ( Trichy, Chennai )
2017 IEEE Projects 2017 For Cse ( Trichy, Chennai )
 
Deep Learning in Text Recognition and Text Detection : A Review
Deep Learning in Text Recognition and Text Detection : A ReviewDeep Learning in Text Recognition and Text Detection : A Review
Deep Learning in Text Recognition and Text Detection : A Review
 
A Graph-based Web Image Annotation for Large Scale Image Retrieval
A Graph-based Web Image Annotation for Large Scale Image RetrievalA Graph-based Web Image Annotation for Large Scale Image Retrieval
A Graph-based Web Image Annotation for Large Scale Image Retrieval
 
IRJET- Detection and Recognition of Hypertexts in Imagery using Text Reco...
IRJET-  	  Detection and Recognition of Hypertexts in Imagery using Text Reco...IRJET-  	  Detection and Recognition of Hypertexts in Imagery using Text Reco...
IRJET- Detection and Recognition of Hypertexts in Imagery using Text Reco...
 
Named Entity Recognition (NER) Using Automatic Summarization of Resumes
Named Entity Recognition (NER) Using Automatic Summarization of ResumesNamed Entity Recognition (NER) Using Automatic Summarization of Resumes
Named Entity Recognition (NER) Using Automatic Summarization of Resumes
 
IRJET- Text Line Detection in Camera Caputerd Images using Matlab GUI
IRJET- Text Line Detection in Camera Caputerd Images using Matlab GUIIRJET- Text Line Detection in Camera Caputerd Images using Matlab GUI
IRJET- Text Line Detection in Camera Caputerd Images using Matlab GUI
 
Real Time Sign Language Detection
Real Time Sign Language DetectionReal Time Sign Language Detection
Real Time Sign Language Detection
 

Plus de IRJET Journal

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...IRJET Journal
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTUREIRJET Journal
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...IRJET Journal
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsIRJET Journal
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...IRJET Journal
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...IRJET Journal
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...IRJET Journal
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...IRJET Journal
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASIRJET Journal
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...IRJET Journal
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProIRJET Journal
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...IRJET Journal
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemIRJET Journal
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesIRJET Journal
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web applicationIRJET Journal
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...IRJET Journal
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.IRJET Journal
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...IRJET Journal
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignIRJET Journal
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...IRJET Journal
 

Plus de IRJET Journal (20)

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web application
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
 

Dernier

Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxpranjaldaimarysona
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSRajkumarAkumalla
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Dr.Costas Sachpazis
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...Call Girls in Nagpur High Profile
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college projectTonystark477637
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Christo Ananth
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130Suhani Kapoor
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)simmis5
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130Suhani Kapoor
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...ranjana rawat
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...roncy bisnoi
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordAsst.prof M.Gokilavani
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSISrknatarajan
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSSIVASHANKAR N
 

Dernier (20)

Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICSHARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
 
result management system report for college project
result management system report for college projectresult management system report for college project
result management system report for college project
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and RoutesRoadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
 
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete RecordCCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 

Handwritten Text Recognition and Translation with Audio

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 07 | July 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1939 Handwritten Text Recognition and Translation with Audio Patibandla Vishnu Deepak1, Vishnu Vardhan2, Yogesh Gautam3, Jayashree4 1VIII Semester, Dept. of CSE, BNMIT 2VIII Semester, Dept. of CSE BNMIT 3Asst. Professor, Dept of CSE, BNMIT, Karnataka, India ---------------------------------------------------------------------***--------------------------------------------------------------------- Abstract - This project attempts to classify any unique handwritten text in order to convert the text material to a digital format. To complete this objective, we used two different methods: direct word classification and separating each character. Using a Convolutional Neural Network (CNN) with different architectures, we train a model that can accurately categorize words for the former. To create boxes for each particular character, we employ a special LSTM with convolution. Where LSTM is Long Short-Term Memory. These separated characters are then forwarded into a CNN for categorizing, and each particular word is subsequently reassembled using the classification and separation findings. Key Words: Optical character recognition, Convolutional Networks, Text detection and recognition, translation,audio speech. 1.INTRODUCTION Despite the prevalence of electronic writing devices, many people still choose taking notes on paper and a pen. Though, there are certainadvantages therearespecific disadvantages to writing text by hand and are provided below. It's tough to efficiently keep and access physical documents, as well as to efficiently search, sort them from a huge pile of documents. Sharing these documents can also stressful. Consequently, lots of valuable information is misplaced or never analysed since these documents are never converted to digital format. We chose to address this issue in our research because we believe that the far better ease of managing digital text compared to written language will enable people to more easily access, search, share, and analyse their records while still allowing them to use their preferred writing method. The purpose of this project is to delve deeper into the task of classifyinghandwrittentext and converting it to a digital representation. Handwritten text is a broad phrase, and we intended to reduce the scope of the project by defining handwritten writing specifically for our needs. We took on the task of identifying the visual of any handwritten word, whether it was in cursive or block script, for this research. This work can be combined with algorithms that separate words from lines in a given line image, which can then be used with algorithmsthatseparate lines from lines in a given image of a whole handwritten page. With these classes added, our project could be in the form of a distributable file used by the end user and would be a fully functional template that would prompt the user to take a picture of the notes page, allowing users solve the problem of converting handwritten documents to digital format. Note that while some extra layers need to be added to our model to create a fully functional deliverable product to end users, we believe the categorization of the problem is interesting. the most delicious and the hardest, that's why we chose to approach it. instead of line-to-word, document- to-line, etc. Because CNNs function better on raw input pixels data rather than specific features or sections of an particular image, we can address this problem with complete word pictures. We expect the improvement by extracting characters from each particular word image and then classifying eachwordcharacterindividuallytore-constructa whole word, based on our findingsusing entirewordimages. In summary, our models in both ways take in an image/picture of the word and thenoutputthe word'sname. 2. RELATED WORK 2.1 Early Scanners The first motivationfortheclassificationofhandwrittentexts was the digit classification of mail. Early postal readers, such as Jacob Rabinow's, used integrated scanningandprocessing devices to recognize fixed-spaced fonts. Allum et al. He refined this by developing a sophisticated scanner that allowed more variation in the way text was written and put the data into a barcode that was printed directly on the letter.in text input and that also encoded the information onto a bar-code that was printed directly on the letter. 2.2 To the Digital Age In 1974, Ray Kurzweil developed the first known piece of OCR software that could identify any typeface. In this software, the matrix technique was used more frequently (pattern matching). In essence, this would determine which character it most closely resembled by comparing the bitmaps of the template character with the read character. The drawback of this programme was thatitwassensitiveto variations in font size and writing style. To improve templating, OCR software switched from using templates to feature extraction. Software would look for each character's geometric moments, zoning, and projection histograms.
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 07 | July 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1940 2.3 Machine Learning The Lecun concentrated on utilizinggradient-basedlearning approaches with multi-module ML models, which served as the forerunner for the first end-to-end modern deep learning models. Adoption of a Hidden Markov Model for OCR was the next big advancement in attaining high OCR accuracy. This method uses each letter as a specific state, and then considering the context of the character when choosing the next hidden variable. In comparison to feature extraction technique and the Naive Bayes approach, this resulted in higher accuracy. The manual extraction elements remained the biggest drawback, since they required prior language expertise and were not especially robust to thediversityand complexity of handwriting. Ng and colleagues used a sliding window to apply CNNs to the issue of identifying text that is observed inthereal world (signs, written, etc.) within an image.Aslidingwindowscans the picture to see if any characters might be present. Each character was classified using a CNN with two of each convolutional layers, and averagepoolinglayerslastlya fully connected layer. Scan, and Read: End-to-End MDLSTM Handwritten Paragraph Recognition One of the most renowned publications for the task of handwritten text detection is Attention. An LSTM layer was utilised to encode the raw image data to a feature map for each scanning direction. After then, the model would use attention to give particular feature maps more priority than others. The attention map was created, then it was given to thedecoder, whichused the visual summary and state to anticipate the character. This approach was distinctive in that it combined the segmentation and classification processes into a single model rather than separating them. Due to the lack of a language model for producing character and word sequences, this approach has one drawback. Without considering the context of the invented word, it solely relies on each character's aesthetic categorization. We also found that using a project from CS 231N in the past helped us with our job. Yan uses the Faster R- CNN model to categorize and identify specific characters within a word.. This method initially determines whether an object exists within the image's bounds by dragging a window across it. The bounded image is then classified to the character that corresponds to it. Yan also uses edit distance, which lets users to make changes to the separated word to see if another separated word (for example, xoo versus zoo) is more likely to be correct. 3. DATA The IAM Handwriting Dataset was our primary source for handwriting recognition training. The handwritten text in this dataset totals 5500 phrases and 11500 words,anditwas written on more than 1500 forms (a form is a sheet of paper with lines of text) by more than 600 authors.Thewordswere segmented and meticulouslychecked before being delivered with all relevant formlabel metadata in associated XML files. The source material was taken from the Lancaster- Oslo/Bergen (LOB) corpus, which comprises over 1 million words of entire English sentences. There are another 1,066 forms in the database, written by roughly 400 distinct authors. Fig -1: a form from the IAM Handwriting dataset as an example. The dataset's word pictures were taken from such formats. Because of its breadth, depth, and quality, this database is often used as the foundation for many handwriting recognition tasks, this is the reason for choosing the IAM Handwriting Dataset as the foundation of our training, validation, and test data for our models. Lastly, large datasets—even those that contain multiple pre-trained models—are essential for deep learning, and this dataset, which contained over 100K word instances, satisfied those requirements. Deep learning models need at best 105 -106 training examples to work properly, regardless of transfer learning. 4. PRE-PROCESSING STAGE Pre-processing is a sequence of procedures applied to the captured image. In essence, it improves the depiction of the image to make it segmentation-ready.Alotofprocedures are carried out on an image as part of its pre-processingasgiven below.
  • 3. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 07 | July 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1941 4.1 Noise Removal This pre-processing stage raises the qualityoftherawimage and increases the data relevance. This stageisalsoknownas pixel level or low-level processing. Skew correction of an image is carried out during the pre-processing stage to accurately display the article text lines, threshold to convert a colour or grayscale image into a binary image, and noise reduction to reduce superfluous data, asillustratedinFigure 2. Dust, spots, dots, and lines are all examples of noise that can contaminate scanned documents and significantly alter recognition outcomes. A scannedarticleimageneedsto be cleaned of any noise in order to make it suitable for further processing. To enhance the image that a machine or person is viewing, image enhancement techniques are used. Noise reductionusessmoothingandnonlinearprocesseslike morphological operations. Fig -2: Results from the pre-processing phase, including noise removal and skew correction. 4.2 Skew Correction If the text line in scanned document images is not horizontally aligned after scanning, skew correction is employed to fix it. On the document or line level, the skew correction can be implemented. A global skew correction is implemented on the document level during the pre- processing stage. 4.3 Slant Correction To adjust the writing style's inclination, slant correction is used. The writings are altered from their slant to their upright state by means of a shear transformation.Numerous directions are affected by a shear transformation. The pixel values of the same image that have been vertically movedby distances d and -d are added to the converted image data for each direction, as illustrated in Figure 3. Fig -3: Steps in Slant correction. 4.4 Line Segmentation An image with an alphabetic sequence is broken down into smaller images that represent individual characters during this segmentation stage of pre-processing. By giving each alphabet a number via a labelling procedure, the pre- processed input image is separated into distinct alphabets. The labelling indicates how many characters aretherein the image. For the categorization and identification stage, every single character is uniformly reconstructed into 90*60 pixels. 5. METHODOLOGY 5.1 Training Level Following SoftMax activation, we trained eachofourmodels on a cross-entropy loss function using the ADAM Optimizer. With the exception of changing the last completely connected output layer to map to the number of classes (word/character vocabulary), we kept the identical architectures for the Visual Geometry Group (VGG) and Residual Networks (RESNET) models. We started from scratch while training the word and character classifier. We reasoned that the enormousSoftMaxlayeroutputsize(more than 10,000 words in our training vocabulary and well over 100,000 words in only the English language) and the difficulty of fine-grained word picture identification were impacting our performance for word-level classification.We came to the conclusion that character-level classification might be a more effective strategy because the SoftMax's computational complexity can be greatly reduced by using a character vocabulary that is significantlylesscomplexthana word vocabulary (the characters A-Za-z0-9 only have 62 distinct symbols). In addition, due to the constrained character set, character identification in an image is an easier challenge than word recognition in an image. The separation of word specific images into their component character images will be the first significant obstacle we would have to overcome in order to evaluate this strategy. The second major difficulty would be identifying word breaks in images and connecting consecutively identified letters in between these word breaks to make words. In this section, we'll focus on the first of these problems. We used the brand-new CNN/LSTM engine powered by Tesseract4.0 neural networks to carry out this task. This model is set up as a text line recognizer that can automatically identifymore than 100 languages. It was originally developed byHPand is now maintained by Google. The Tesseract model had been trained on roughly 400000 text lines spread among about 4500 fonts for Latin-based languages, including English. On our IAM handwriting dataset, we then adjusted this pretrained model's parameters. The original word input images were split into their hypothetical component character images after the model was tweaked, and these output segmented character images were fed into our character-level categorization system.
  • 4. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 07 | July 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1942 5.2 Each Word-Level Classification We started using VGG-19 to train our Word-level classification model. First off, considering the quantity of parameters it needs, we discovered that this model is difficult to train. Initially, we tried using the whole vocabulary, but we discovered that the run time was too slow. We limited the vocabulary to 50 terms that were chosen at random and showed up at least 20 times in our dataset in order to get results. Because some of the chosen words, like "the," contained hundreds of training examples, training was still quite slow. In order to have enough data to benchmark outcomes with a better trade-offonthe runtime, we limited each word to 20 occurrences. Our VGG-19 model was subsequently trimmed, and training with numerous parameters was started. Our learning rate was too high during our first runs, which led to low end training and validation accuracy. We increased the learning rate of our Adam Optimizer, which achieved training accuracy of 28% and validation accuracy of 22%. Graph -1: When we first started training, Our learning rate was set too quickly, which resulted in inconsistent accuracy. We looked for alternative architectures after realizing that VGG-19 was fairly slow due to the quantity of parameters it needs. We discovered CNN-benchmarks that suggested RESNETi-18 could deliver comparable results at substantially faster speeds. Furthermore, we also thought that our model's ability to train the residual layer of the RESNET architecture would improve our accuracy, especially with deeper networks, as our model would be modified or taught to achieve optimality for the residual layer. Graph -2: We found that our loss was steadily decreasing after modifying our learning rate and model input. We have discovered that limiting the number of epochsused when experimenting with various parameters is a good method. We trained our model on two epochs,comparedthe results, and then reran it on more epochs with a subset of parameters that produced the best results and those whose loss/accuracy graphs looked the most promising because our training and validation processes took a long time and we wanted to find the best parameters. In terms of training, this method undoubtedly saved us a ton of time, but it also had drawbacks. Some parameters, for example, might have taken longer to converge but eventually attained their best value in the end. If their graphs did not appear sufficiently promising and their findings weresubparwhencomparedto the other parameters, we might have overlooked such parameters. Due to a lack of resources, this presented a hurdle for us, but we made an effort to overcome it by taking into account the parameter graphs in addition to the raw data they supplied. Because only one class dominated the weight updates at the beginning of our training efforts and at the end, all of the instances were assigned tothatclass,thevalidationaccuracy was initially practically random. We discovered that the problem was in fact with the weights and that the updates were too large, saturating weightvaluesatthebeginningand inhibiting learning after printing the weights and their changes for two samples. Table -1: Results of Classification at the Word and Character Levels. Architecture Training Accuracy Validation Accuracy Test Accuracy VGG-19 28% 22% 20% RESNET-18 31% 23% 22% RESNET-34 35% 27% 25% Char-Level 38% 33% 31%
  • 5. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 07 | July 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1943 RESNET-34 was implemented to deepen the network in response to our RESNET-18 results, which we used in conjunction with our model. The accuracy we achieved during training was 35%, while accuracy during validation was 27%. In order to avoid having parameter updates occur too frequently and slow down our model, we decided to utilize a reasonably large mini-batch size of 50. A bigger mini-batch size would collect too much data all at once to indicate the gradient's direction. Due to the need for semi- regular updates and the avoidance of the gradient's value being distorted, this was crucial for our model. Qualitatively, the findings of word classification are encouraging; they demonstrate that, despite data constraints, we can still obtain respectable accuracy for a highly complex issue. We are aware, however, that larger vocabulary sizes will make it more difficult to gather the essential data and will drastically impair accuracy, undermining the capacity to effectivelyconverthandwritten text to a digital format. This motivated us to start looking into the character segmentation technique. Fig -4: When we first started training, Our learning rate was set too quickly, which resulted in inconsistent accuracy. 5.3 Character Classification Level We then move on to letter segmentation and word reconstruction by classifying each character separately to improve the direct word classification results.TheTesseract LSTM model with convolution was firstly downloaded and pre-trained on the aforesaid English datasets before it being refined onto our dataset. We then had to modify each of our input data so that it included not only input image, but also bounding box labels of each particular character in that image and the closest approximation in the four top left, bottom top left, top right, and top left of each personality character, which we extracted from the XML data. Given the pertinent data, the Tesseractcontainsprogramsto automatically construct this dataset. Following that, we created a character vocabulary made up of single-digit numerals, uppercase, and lowercase letters. Adam, the optimization technique weultimatelyusedinourfinal model was also used, but we used a different kind of loss that was more appropriate for the issue: CTC loss. In a nutshell, CTC (Connectionist Temporal Classification) is a technique/loss function created by Graves et al. for teaching recurrent neural networks to label output sequences (labels for characters in our example)fromunsegmentedinputdata the input word picture in our case. RNNs with CTC loss have been shown to be more efficient than more conventional methods like CRFs (Conditional Random Fields) and HMMs for tasks like labelling speech signal data with word-level transcriptions because of their automated learning process, need for only an input/output data representation as opposed to substantial hand-engineeredfeatures,andability to more generally capture context over time and space in their hidden state. CTC can separate and label each of the unstructured input data available by assessing network outputs as a probability distribution over all possible label sequences, conditioned on a given input sequence” created by considering segmentation of the input sequence given a certain minimum segment size and starting training of the network from there. We provided these final segmented character photos to our model after honing and completing our segmentation model, as previously described. With the same model designs for classification and even with the possibility of segmentation error, we discovered that character-level classification was more successful,asseen in the accuracy graphforcharactervs.word-level classification. These findings corroborated our hypotheses that the performance was improved by the model's initial feature representation problem, which for characters was significantly less in scope than for words, and its final labelling problem. Due to a lack of data that was enough for the extent of our issue and flaws in the segmentation model, we believed that our model did not perform any better. Fig -5: For each word, we can segment each character using Tesseract LSTM. For example, we can extract the start and end coordinates of each character in the word "MOVE". Due to the breakdown in the borders between some cursive letters, our model, like most segmentation models, has difficulty segmenting cursive characters. Infact,becauseour segmentation model andcharacterclassificationmodel were trained independently, we were unabletopreciselypinpoint whether our classificationmodel orsegmentationmodel was to blame for the inaccuracy. Last but not least, writers' handwriting varies greatly, making it difficult to distinguish between all of the many ways that a word or character is
  • 6. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 07 | July 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 1944 written. We reasoned that, as is frequently the case with deep learning challenges, even additional data (over the course of millions more words) would have assisted in teaching our model a more generalized feature representation for the issue, which would have improved its performance. 3. CONCLUSIONS Deep learning has gained popularity in recent years and has started to be used in a variety of fields, including facial recognition, target identification, and optical character recognition (OCR). Target detection is consistently used by researchers. Networks of various types, from the Faster and RCNN OCR area, used to find texts. in comparison to text recognition in the classical sense by embedded. With consistent noise immunity, deep learning is more accurate and durability. It has the capacity to fend off effects like alterations in the backdrop. The deep learning end-to-end network is more precise when applied to text detection, when compared to customary picture processing and cutting. Machine learning enables character classification. Apart from the text detection and conversion it into digital format this application also provides with translating the text into other languages mainly focusing on English and Hindi. ACKNOWLEDGEMENT Vishnu Deepak Patibandla, Vishnu Vardhan and Yogesh Gautam are grateful to Mrs. Jayashree, Assistant Professor, BNMIT for her constant support and technical guidanceover the course this project. Also,VishnuDeepak,VishnuVardhan and Yogesh Gautam would like express our immense gratitude towards Mrs. Chayadevi, M L HOD at BNMIT for her constant support. REFERENCES [1] A Novel Approach to Classify English Handwritten Character Using Convolutional Neural Networks by Rupesh Sharma, Prashant Sharma, Naresh Purohit, International Journal ofAdvancedComputerScienceand Applications, 21-23 Mar. 2021, West Yorkshire, Malayasia. [2] Handwritten character recognition using Convolutional neural networks by Verdha Chaudhary, Mahak Bansal and Guarav Raj, IEEE Seventh International Conference on Intelligent Systems Design and Applications, 09 Dec. 2020, Rio de Janeiro, Brazil. [3] CNN Based CommonApproachtohandwrittencharacter recognition of multiple scripts by Durjoy Sen Maitra, Ujjwal Bhattacharya and Swapan K. Paru, IEEE Proceedings of 2nd International Conference on Document Analysis and Recognition, 20-22 Oct. 2020, Tsukuba, Japan. [4] Handwritten Character Recognition from Images using CNN-ECOC by Mayur Bhargab Bora, Dinthisrang Daimary, Khwairakpam Amitab, Debdatta Kandar., International Conference on Innovative Practices in Technology and Management, 17-19 Feb. 2020, Noida, India. [5] Handwritten character recognition using CNN by S. Anandh kishan, J. Clinton David, P. Sharon Femi, IEEE Transactions on Pattern Analysis and Machine Intelligence, 05 May. 2020, Melbourne, Australia. [6] Transfer learning using CNN for handwritten Devanagari Character Recognition by Nagender Aneja and Sandhya Aneja.,International Journal ofEngineering and Computer Science, 06-09 Feb. 2020, Delhi, India. [7] HandwrittenCharacterRecognitionofDevanagariScript by Breijeshwar Dessai Amit Patel, IEEE Proceedings of 2nd International ConferenceonDocumentAnalysisand Recognition, 20-22 Oct. 2019, Tokyo, Japan. [8] An End-to-End System for Hindi Online Handwriting Recognition, bySoumik Bhattacharya,DurjoySenMaitra, Ujjwal Bhattacharya and Swapan K. Parui, IEEE third International Conference, 2016. [9] Automatic image-to-Text-to-Voice Conversion for Interactively Locating Objects in Home Environments, by Nikolas Bourbk, 20th IEEE Interactive conference on Tools with ArtificialIntelligence,2008,1082-3409/08. [10] Review on conversion of image to text as well as speech using edge detection and image segmentation by Mrunmayee patil, Ramesh kagalkar, international journal of science and research (IJSR),2319- 7064,2012,3.358.