Unlocking the Handwritten Content in Document Images
Venu Govindaraju
Handwritten Documents
[Diagram labels: Scanner, Storage, OCR, Noisy Text, Query, Relevance; document types: Forms, Letters, Notes; example: Newton Kinematics Notes]
Outline
Challenge of Handwriting
Input Output 20187 + 2246 Handwriting Recognition
Postal Context (138 million records)
LDR accuracy (%) by lexicon size:
Lex     Top 1   Top 2
10      96.5    98.7
100     89.2    94.1
1000    75.3    86.3
Paradigms
Lexicon Driven OCR (LDR)
Lexicon Free OCR (LFR)
[Diagram labels: Context, Ranked Lexicon, Segmentation, Recognition, Post-processing]
Lexicon Free (LFR)
Find the best path in the graph from segment 1 to 8.
[Figure: segmentation graph with candidate characters and confidences on the edges: i[.8], l[.8]; u[.5], v[.2]; w[.6], m[.3]; w[.7]; i[.7]; u[.3]; m[.2]; m[.1]; r[.4]; d[.8]; o[.5]]
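The best-path search can be sketched as below. This is a minimal illustration, not the recognizer from the talk: the edge list is hypothetical (the slide's graph topology did not survive extraction), and scoring a path by the product of its character confidences is an assumption.

```python
import math

def best_path(edges, start, end):
    """Return (score, text) for the highest-scoring path from start to end.

    edges: list of (from_point, to_point, char, confidence) with
    from_point < to_point, so integer order is a topological order.
    The path score is the product of confidences (an assumed rule).
    """
    best = {start: (0.0, "")}  # point -> (log score, decoded text)
    for point in range(start, end):
        if point not in best:
            continue  # this segmentation point is unreachable
        log_score, text = best[point]
        for u, v, ch, conf in edges:
            if u != point:
                continue
            candidate = (log_score + math.log(conf), text + ch)
            if v not in best or candidate[0] > best[v][0]:
                best[v] = candidate
    if end not in best:
        return None
    log_score, text = best[end]
    return math.exp(log_score), text

# Hypothetical edges: (from, to, character, confidence).
EDGES = [
    (1, 2, "i", 0.8), (2, 3, "u", 0.5), (1, 3, "w", 0.7),
    (3, 5, "o", 0.5), (3, 6, "m", 0.1), (5, 6, "r", 0.4),
    (6, 8, "d", 0.8),
]

score, text = best_path(EDGES, 1, 8)
print(text, round(score, 3))
```

Because no lexicon constrains the search, nothing stops the best path from spelling a non-word when a wrong character scores higher than the right one.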
Lexicon Driven (LDR)
Find the best way of accounting for the characters ‘w’, ‘o’, ‘r’, ‘d’ by consuming all segments 1 to 8.
Distance between the first character ‘w’ of the lexicon entry ‘word’ and the image between:
- segments 1 and 4 is 5.0
- segments 1 and 3 is 7.2
- segments 1 and 2 is 7.6
[Figure: matching lattice over segment points 1 to 9 with character distances: w[7.6] w[7.2] r[3.8] w[5.0] w[8.6] o[7.6] r[6.3] d[4.9] w[5.0] o[6.6] o[6.0] o[7.2] o[10.6] d[6.5] d[4.4] r[7.5] r[6.4] o[7.8] r[8.6] o[8.7] r[7.4] r[7.6] o[8.3] o[7.7] r[5.8] o[6.1]]
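The matching step can be sketched as a small dynamic program. This is an illustration under stated assumptions, not the talk's implementation: `match_word`, the `max_span` limit, and the distance table are hypothetical. Only the three ‘w’ distances come from the slide; the other entries are made up so the example runs.

```python
def match_word(word, n_points, dist, max_span=4):
    """Minimum total distance of matching `word` against segment points
    1..n_points, with every segment consumed by exactly one character.

    dist maps (char, start_point, end_point) to a distance; a missing
    entry means that character cannot cover that span.
    """
    inf = float("inf")
    # cost[k][p]: best cost of matching the first k characters up to point p
    cost = [[inf] * (n_points + 1) for _ in range(len(word) + 1)]
    cost[0][1] = 0.0
    for k, ch in enumerate(word, start=1):
        for end in range(2, n_points + 1):
            for start in range(max(1, end - max_span), end):
                d = dist.get((ch, start, end))
                if d is not None and cost[k - 1][start] + d < cost[k][end]:
                    cost[k][end] = cost[k - 1][start] + d
    return cost[len(word)][n_points]

DIST = {
    ("w", 1, 4): 5.0, ("w", 1, 3): 7.2, ("w", 1, 2): 7.6,  # from the slide
    ("o", 4, 6): 6.0, ("o", 3, 6): 7.6,                    # hypothetical
    ("r", 6, 7): 3.8, ("d", 7, 9): 4.4,                    # hypothetical
}

print(match_word("word", 9, DIST))
```

Running the same function over every lexicon entry and sorting by ascending distance would give the ranked lexicon.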
Grapheme Models (LFR)
Writer Specific Modeling, Holistic Features
Feature fields: grapheme, pos, orientation, angle
Examples: Down cusp (3.0, -90°), Up loop, Down arc
Interactive Models (LDR)
1-way activation [McClelland and Rumelhart 1981]; 2-way interaction
[Figure: three-level network of Features, Letters (A, T, N), and Words (ABLE, TRIP, TRAP)]
Interactive Models (LDR): Phrase Level
Features: T-crossings, loops, ascenders, descenders, length
[Figure: interactive model with 2-way interaction between image features and lexicon]
Lexicon 1: West Central Street, West Main Street, Sunset Avenue
Lexicon 2: West Central Street, East Central Street, Sunset Avenue
Lexicon 3: West Central Street, West Central Avenue, Sunset Avenue
Interactive Models: Character Recognition
Gradient (4) and Moment (5) Features: 0 1 0 1 1 1 0 0 1
[Park and Govindaraju, IEEE CVPR 2000]
Active Recognition
Results
10-class digit recognition: 25656 training and 12242 test samples (Postal + NIST)
               Active Model   Neural Net   KNN
Top 1 (%)      95.7           96.4         95.7
Temp           612            976          3,777
Msec           1.45           11.5         384
Training hrs   1              24           1

Lex size     LDR %    GM %
10           96.86    96.56
100          91.36    89.12
1000         79.58    75.38
(Top 50)     98.00    98.40
20000        62.43    58.14
(Top 100)    93.59    93.39
Fusion   Identification Task Verification Task LDR LFR
Fusion of Recognizers (Type III)
        Amherst   Buffalo   …
LDR     5.6       7.4       …
LFR     .52       .81       …
Identification task: Amherst, Buffalo, …
Verification task: (5.6, .52, Amherst): Accept or Reject
Question: if we find optimal and , is it necessarily ?
Traditional Fusion Rules
Likelihood Ratio: Verification Tasks
Minimum risk criteria: optimal decision boundaries coincide with the contours of the likelihood ratio function.
Metaclassification with NN, SVM, etc. is also possible [Prabhakar, Jain 02] [Nandkumar, Jain, Das 08]
[Figure: genuine and impostor scores plotted against recognizer score 1 and recognizer score 2]
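A likelihood-ratio decision rule for the verification task can be sketched as follows. This is only an illustration: the training scores are invented, and each class density is modeled as a product of per-recognizer Gaussians. That independence between recognizers is a simplifying assumption of this sketch, not a claim from the talk.

```python
from statistics import NormalDist

def fit(samples):
    """Fit one Gaussian per recognizer score (independence assumed)."""
    return [NormalDist.from_samples(dim) for dim in zip(*samples)]

def likelihood_ratio(scores, genuine, impostor):
    """p(scores | genuine) / p(scores | impostor) under the fitted models."""
    ratio = 1.0
    for s, g, i in zip(scores, genuine, impostor):
        ratio *= g.pdf(s) / i.pdf(s)
    return ratio

def verify(scores, genuine, impostor, threshold=1.0):
    """Accept when the likelihood ratio reaches the threshold."""
    return likelihood_ratio(scores, genuine, impostor) >= threshold

# Invented (recognizer score 1, recognizer score 2) training pairs.
GENUINE = fit([(0.8, 0.9), (0.7, 0.8), (0.9, 0.85)])
IMPOSTOR = fit([(0.2, 0.3), (0.3, 0.2), (0.1, 0.4)])

print(verify((0.8, 0.85), GENUINE, IMPOSTOR))  # score pair near the genuine cluster
print(verify((0.2, 0.3), GENUINE, IMPOSTOR))   # score pair near the impostor cluster
```

Moving `threshold` trades false accepts against false rejects. Each threshold value selects one contour of the ratio as the decision boundary.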
Optimal Combination Functions
Identification task results (top choice correct rate):
LFR is correct       54.8%
LDR is correct       77.2%
Both are correct     48.9%
Either is correct    83.0%
Likelihood Ratio     69.8%
Weighted Sum         81.6%
Verification task results: [Figure: ROC curves]
Independence of Scores
In a single trial:
        Amherst   Buffalo   …
LDR     5.6       7.4       …
LFR     .52       .81       …
Independence of Scores
In a single trial:
[Figure: score matrix of Recognizer 1 … Recognizer M against Lexicon 1 … Lexicon i … Lexicon N, annotated "Independent?", "Dependent", "Dependent"]
Tulyakov & Govindaraju, TIFS 2009
Optimal Combination?
Correlated Scores: dependent on input signal
                 Count    Rate
Set size         6147
LFR              3366     54.8%
LDR              4744     77.2%
Both correct     3005     48.9%
Either correct   5105     83.0%
LR               4293     69.8%
Weighted sum     5015     81.6%

        2nd choice   3rd choice   4th choice   Mean
LFR     .4359        .4755        .4771        .1145
LDR     .7885        .7825        .7673        .5685
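The percentages can be checked against the counts. The sketch below reads the seven counts on the slide as the set size followed by the number of correct top choices for each column, in order. That reading is an interpretation, though the arithmetic supports it.

```python
# Counts read off the slide: set size, then correct top choices per column.
SET_SIZE = 6147
CORRECT = {
    "LFR": 3366,
    "LDR": 4744,
    "Both correct": 3005,
    "Either correct": 5105,
    "LR": 4293,
    "Weighted sum": 5015,
}

def correct_rate(count, total=SET_SIZE):
    """Top-choice correct rate in percent, rounded to one decimal."""
    return round(100.0 * count / total, 1)

RATES = {name: correct_rate(count) for name, count in CORRECT.items()}
for name, rate in RATES.items():
    print(f"{name:15s} {rate:5.1f}%")
```

Read this way, the likelihood-ratio combination scores below LDR alone, and the weighted sum stays under the "either correct" rate.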
Optimal Trainable Combination Function
Minimizing misclassification cost: Classify as rather than
Assume that scores assigned to different classes are independent:
Tulyakov & Govindaraju IJPRAI 2009
Combination Methods: Identification Tasks
No! Traditional training mixes the genuine and impostor scores from different trials.
[Figure: three scatter plots of genuine and impostor scores; axes: recognizer score 1, recognizer score 2]
Combination Methods: Identification Tasks
Model training MUST process scores from one identification trial as a single training sample.
[Figure: three scatter plots of genuine and impostor scores; axes: recognizer score 1, recognizer score 2]
Iterative Methods
Best Impostor Function
Correct rate (%):
             Likelihood Ratio   Weighted sum   Best Impostor Likelihood Ratio   Logistic Sum   Neural Network
LFR & LDR    69.84              81.58          80.07                            81.43          81.67
li & C       97.24              97.23          97.01                            97.34          97.39
li & G       95.90              95.47          95.99                            96.17          96.29
Outline
Search for Handwritten Documents
Lexicon       Good Quality      Historical      Medical
              10K      1K       10K     1K      4K
Top 1 (%)     57       67       12      28      20
Top 3 (%)     69       72       22      44      27
Top 10 (%)    74       75       32      72      42
Search Engine Handwritten Forms
Search Engine for Medical Forms
Topic Categorization: Lexicon Reduction
[Diagram labels: Handwritten Medical Documents; Lex Free, Large Lexicon > 5K; ICR Features; Topic Categorization; Select Reduced Lexicon ~2.5K; Lex Driven]
~33% word recognition rate (10 points gain)
ICR Features Index
Topic Features
Category: DIGESTIVE-SYSTEM
FQ    CHSN    PHRASE
30    0.72    PAIN INCIDENT
5     0.31    PAIN TRANSPORTED
42    0.54    PAIN CHEST
52    0.81    STOMACH PAIN
9     0.25    HOME PAIN
6     0.43    VOMITING ILLNESS
Topic Categorization (Chu-Carroll et al., 1999)
Results
C: complete lexicon; R: reduced lexicon; A: category given; S: features synthetic; T: truth present
              CLT to RLT   CL to RL   CLT to ALT   CLT to SLT
HR            7.48%        7.42%      17.58%       7.42%
Error Rate    10.78%       10.88%     24.53%       10.21%
Outline
Urgent Issue of our Times
Threat: ‘If it’s not in Google, it doesn’t exist!’ (Baird 2003)
What is possible today?
Document Enhancement [Shi, Setlur, and Govindaraju 2008]
Transcript-Mapping
1787 Thomas Jefferson letter and its transcript
[Figure: letter image and transcript]
What is not possible today?
 
Crosslingual Retrieval
[Diagram labels: Multilingual Document Corpus; Retrieved Documents; English, Hindi, Sanskrit; translations of “strength”]
Search Handwritten Documents
Image-based: use image-based features
OCR-based: use OCR recognition results
Query rendered
Image Based Methods (Rath 07 IJDAR)
Poor performance in multiple-writer scenarios
Search Handwritten Documents
Image-based: use image-based features
OCR-based: use OCR recognition results
Indexing Retrieval Handwriting  Recognition
Vector IR Model (TF-IDF) [Baeza-Yates99]
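The slide's bullet text was lost in extraction, so the following is a generic sketch of a TF-IDF vector model rather than the formulation shown in the talk. The weighting `tf * log(N / df)`, the cosine ranking, and the toy documents are all assumptions.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return (vectors, idf) for tokenized docs, weighting tf * log(N / df)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n / count) for term, count in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(doc).items()}
        for doc in docs
    ]
    return vectors, idf

def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Toy corpus (hypothetical), already tokenized.
DOCS = [["chest", "pain"], ["stomach", "pain"], ["home", "visit"]]
VECTORS, IDF = tfidf_vectors(DOCS)

query = {"chest": IDF["chest"]}
ranking = sorted(range(len(DOCS)), key=lambda i: cosine(query, VECTORS[i]), reverse=True)
print(ranking[0])
```

With recognized handwriting, the term counts fed into such a model are themselves uncertain.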
Modifications to VM
Estimation: word images
0.02 0.01 0.2 0.01 0.01 … Doc d_j
[Rath 04, Howe 05]
Estimating Term Frequency
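One plausible reading, offered here as an assumption rather than the talk's stated method, is that a term's frequency in a document is estimated as the sum over the document's word images of the recognizer's probability that the image is that term. The sketch below uses the five probabilities visible on the preceding slide; the term name and the dictionary layout are hypothetical.

```python
def expected_term_frequency(word_image_posteriors, term):
    """Sum, over a document's word images, of the probability assigned to term.

    word_image_posteriors: one dict per word image, mapping candidate
    terms to recognition probabilities (a hypothetical layout).
    """
    return sum(p.get(term, 0.0) for p in word_image_posteriors)

# Probabilities for one query term across five word images of a document.
DOC_POSTERIORS = [
    {"pain": 0.02}, {"pain": 0.01}, {"pain": 0.2}, {"pain": 0.01}, {"pain": 0.01},
]

print(expected_term_frequency(DOC_POSTERIORS, "pain"))
```

A fractional estimate like this can stand in for the integer term count in a TF-IDF weight, so a document still contributes when no single word image is recognized with high confidence.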
Estimating Segmentation
d > D
3 hypotheses
Word Recognition

Contenu connexe

Tendances

Loss less DNA Solidity Using Huffman and Arithmetic Coding
Loss less DNA Solidity Using Huffman and Arithmetic CodingLoss less DNA Solidity Using Huffman and Arithmetic Coding
Loss less DNA Solidity Using Huffman and Arithmetic CodingIJERA Editor
 
Elements of Text Mining Part - I
Elements of Text Mining Part - IElements of Text Mining Part - I
Elements of Text Mining Part - IJaganadh Gopinadhan
 
강화학습을 자연어 처리에 이용할 수 있을까? (보상의 희소성 문제와 그 방안)
강화학습을 자연어 처리에 이용할 수 있을까? (보상의 희소성 문제와 그 방안)강화학습을 자연어 처리에 이용할 수 있을까? (보상의 희소성 문제와 그 방안)
강화학습을 자연어 처리에 이용할 수 있을까? (보상의 희소성 문제와 그 방안)NAVER Engineering
 
Nov 04 MS1
Nov 04 MS1Nov 04 MS1
Nov 04 MS1Samimvez
 
A first look at tf idf-pdx data science meetup
A first look at tf idf-pdx data science meetupA first look at tf idf-pdx data science meetup
A first look at tf idf-pdx data science meetupDan Sullivan, Ph.D.
 

Tendances (6)

Loss less DNA Solidity Using Huffman and Arithmetic Coding
Loss less DNA Solidity Using Huffman and Arithmetic CodingLoss less DNA Solidity Using Huffman and Arithmetic Coding
Loss less DNA Solidity Using Huffman and Arithmetic Coding
 
Elements of Text Mining Part - I
Elements of Text Mining Part - IElements of Text Mining Part - I
Elements of Text Mining Part - I
 
Data mining techniques
Data mining techniquesData mining techniques
Data mining techniques
 
강화학습을 자연어 처리에 이용할 수 있을까? (보상의 희소성 문제와 그 방안)
강화학습을 자연어 처리에 이용할 수 있을까? (보상의 희소성 문제와 그 방안)강화학습을 자연어 처리에 이용할 수 있을까? (보상의 희소성 문제와 그 방안)
강화학습을 자연어 처리에 이용할 수 있을까? (보상의 희소성 문제와 그 방안)
 
Nov 04 MS1
Nov 04 MS1Nov 04 MS1
Nov 04 MS1
 
A first look at tf idf-pdx data science meetup
A first look at tf idf-pdx data science meetupA first look at tf idf-pdx data science meetup
A first look at tf idf-pdx data science meetup
 

Similaire à Trivandrum

Globecom - MENS 2011 - Characterizing Signature Sets for Testing DPI Systems
Globecom - MENS 2011 - Characterizing Signature Sets for Testing DPI SystemsGlobecom - MENS 2011 - Characterizing Signature Sets for Testing DPI Systems
Globecom - MENS 2011 - Characterizing Signature Sets for Testing DPI SystemsStenio Fernandes
 
Keynote: Machine Learning for Design Automation at DAC 2018
Keynote:  Machine Learning for Design Automation at DAC 2018Keynote:  Machine Learning for Design Automation at DAC 2018
Keynote: Machine Learning for Design Automation at DAC 2018Manish Pandey
 
Towards better software quality assurance by providing intelligent support
Towards better software quality assurance by providing intelligent supportTowards better software quality assurance by providing intelligent support
Towards better software quality assurance by providing intelligent supportConcordia University
 
NIPS2007: structured prediction
NIPS2007: structured predictionNIPS2007: structured prediction
NIPS2007: structured predictionzukun
 
A tale of experiments on bug prediction
A tale of experiments on bug predictionA tale of experiments on bug prediction
A tale of experiments on bug predictionMartin Pinzger
 
Alexander Sirenko - Query expansion for Question Answering
Alexander Sirenko - Query expansion for Question AnsweringAlexander Sirenko - Query expansion for Question Answering
Alexander Sirenko - Query expansion for Question AnsweringAlexander Sirenko
 
A Validation of Object-Oriented Design Metrics as Quality Indicators
A Validation of Object-Oriented Design Metrics as Quality IndicatorsA Validation of Object-Oriented Design Metrics as Quality Indicators
A Validation of Object-Oriented Design Metrics as Quality Indicatorsvie_dels
 
Not Only Statements: The Role of Textual Analysis in Software Quality
Not Only Statements: The Role of Textual Analysis in Software QualityNot Only Statements: The Role of Textual Analysis in Software Quality
Not Only Statements: The Role of Textual Analysis in Software QualityRocco Oliveto
 
A preliminary study on using code smells to improve bug localization
A preliminary study on using code smells to improve bug localizationA preliminary study on using code smells to improve bug localization
A preliminary study on using code smells to improve bug localizationkrws
 
Analyzing Changes in Software Systems From ChangeDistiller to FMDiff
Analyzing Changes in Software Systems From ChangeDistiller to FMDiffAnalyzing Changes in Software Systems From ChangeDistiller to FMDiff
Analyzing Changes in Software Systems From ChangeDistiller to FMDiffMartin Pinzger
 
A Tale of Experiments on Bug Prediction
A Tale of Experiments on Bug PredictionA Tale of Experiments on Bug Prediction
A Tale of Experiments on Bug PredictionMartin Pinzger
 
Medical Simulation Standards: What can we learn from the DoD?
Medical Simulation Standards: What can we learn from the DoD?Medical Simulation Standards: What can we learn from the DoD?
Medical Simulation Standards: What can we learn from the DoD?Roger Smith
 
Using IR methods for labeling source code artifacts: Is it worthwhile?
Using IR methods for labeling source code artifacts: Is it worthwhile?Using IR methods for labeling source code artifacts: Is it worthwhile?
Using IR methods for labeling source code artifacts: Is it worthwhile?Sebastiano Panichella
 
Datech2014 - Session 4 - Construction of Text Digitization System for Nôm His...
Datech2014 - Session 4 - Construction of Text Digitization System for Nôm His...Datech2014 - Session 4 - Construction of Text Digitization System for Nôm His...
Datech2014 - Session 4 - Construction of Text Digitization System for Nôm His...IMPACT Centre of Competence
 
Real Time Human Posture Detection with Multiple Depth Sensors
Real Time Human Posture Detection with Multiple Depth SensorsReal Time Human Posture Detection with Multiple Depth Sensors
Real Time Human Posture Detection with Multiple Depth SensorsWassim Filali
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 

Similaire à Trivandrum (20)

Csmr10c.ppt
Csmr10c.pptCsmr10c.ppt
Csmr10c.ppt
 
Globecom - MENS 2011 - Characterizing Signature Sets for Testing DPI Systems
Globecom - MENS 2011 - Characterizing Signature Sets for Testing DPI SystemsGlobecom - MENS 2011 - Characterizing Signature Sets for Testing DPI Systems
Globecom - MENS 2011 - Characterizing Signature Sets for Testing DPI Systems
 
Keynote: Machine Learning for Design Automation at DAC 2018
Keynote:  Machine Learning for Design Automation at DAC 2018Keynote:  Machine Learning for Design Automation at DAC 2018
Keynote: Machine Learning for Design Automation at DAC 2018
 
Towards better software quality assurance by providing intelligent support
Towards better software quality assurance by providing intelligent supportTowards better software quality assurance by providing intelligent support
Towards better software quality assurance by providing intelligent support
 
NIPS2007: structured prediction
NIPS2007: structured predictionNIPS2007: structured prediction
NIPS2007: structured prediction
 
Wcre12b.ppt
Wcre12b.pptWcre12b.ppt
Wcre12b.ppt
 
Wcre12b.ppt
Wcre12b.pptWcre12b.ppt
Wcre12b.ppt
 
A tale of experiments on bug prediction
A tale of experiments on bug predictionA tale of experiments on bug prediction
A tale of experiments on bug prediction
 
Alexander Sirenko - Query expansion for Question Answering
Alexander Sirenko - Query expansion for Question AnsweringAlexander Sirenko - Query expansion for Question Answering
Alexander Sirenko - Query expansion for Question Answering
 
A Validation of Object-Oriented Design Metrics as Quality Indicators
A Validation of Object-Oriented Design Metrics as Quality IndicatorsA Validation of Object-Oriented Design Metrics as Quality Indicators
A Validation of Object-Oriented Design Metrics as Quality Indicators
 
Not Only Statements: The Role of Textual Analysis in Software Quality
Not Only Statements: The Role of Textual Analysis in Software QualityNot Only Statements: The Role of Textual Analysis in Software Quality
Not Only Statements: The Role of Textual Analysis in Software Quality
 
A preliminary study on using code smells to improve bug localization
A preliminary study on using code smells to improve bug localizationA preliminary study on using code smells to improve bug localization
A preliminary study on using code smells to improve bug localization
 
Analyzing Changes in Software Systems From ChangeDistiller to FMDiff
Analyzing Changes in Software Systems From ChangeDistiller to FMDiffAnalyzing Changes in Software Systems From ChangeDistiller to FMDiff
Analyzing Changes in Software Systems From ChangeDistiller to FMDiff
 
A Tale of Experiments on Bug Prediction
A Tale of Experiments on Bug PredictionA Tale of Experiments on Bug Prediction
A Tale of Experiments on Bug Prediction
 
Medical Simulation Standards: What can we learn from the DoD?
Medical Simulation Standards: What can we learn from the DoD?Medical Simulation Standards: What can we learn from the DoD?
Medical Simulation Standards: What can we learn from the DoD?
 
Using IR methods for labeling source code artifacts: Is it worthwhile?
Using IR methods for labeling source code artifacts: Is it worthwhile?Using IR methods for labeling source code artifacts: Is it worthwhile?
Using IR methods for labeling source code artifacts: Is it worthwhile?
 
Rui Meng - 2017 - Deep Keyphrase Generation
Rui Meng - 2017 - Deep Keyphrase GenerationRui Meng - 2017 - Deep Keyphrase Generation
Rui Meng - 2017 - Deep Keyphrase Generation
 
Datech2014 - Session 4 - Construction of Text Digitization System for Nôm His...
Datech2014 - Session 4 - Construction of Text Digitization System for Nôm His...Datech2014 - Session 4 - Construction of Text Digitization System for Nôm His...
Datech2014 - Session 4 - Construction of Text Digitization System for Nôm His...
 
Real Time Human Posture Detection with Multiple Depth Sensors
Real Time Human Posture Detection with Multiple Depth SensorsReal Time Human Posture Detection with Multiple Depth Sensors
Real Time Human Posture Detection with Multiple Depth Sensors
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 

Dernier

Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin ClassesCeline George
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...christianmathematics
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxheathfieldcps1
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhikauryashika82
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104misteraugie
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfPoh-Sun Goh
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...Nguyen Thanh Tu Collection
 
ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701bronxfugly43
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsMebane Rash
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxAreebaZafar22
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfJayanti Pande
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docxPoojaSen20
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.pptRamjanShidvankar
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxRamakrishna Reddy Bijjam
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 

Dernier (20)

Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in DelhiRussian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
Russian Escort Service in Delhi 11k Hotel Foreigner Russian Call Girls in Delhi
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docx
 
Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 

Trivandrum

  • 1. Unlocking the Handwritten Content in Document Images Venu Govindaraju [email_address]
  • 2. Handwritten Documents Relevance Scanner Storage OCR Noisy Text Newton Kinematics Notes Query Forms Letters Notes
  • 3.
  • 5. Input Output 20187 + 2246 Handwriting Recognition
  • 6.
  • 7. Paradigms Lexicon Driven OCR LDR Lexicon Free OCR LFR Context Ranked Lexicon Segmentation Recognition Post-processing
  • 8.
  • 9. Lexicon Driven (LDR) Find the best way of accounting for characters ‘w’, ‘o’, ‘r’, ‘d’ buy consuming all segments 1 to 8 Distance between lexicon entry ‘word’ first character ‘w’ and the image between: - segments 1 and 4 is 5.0 - segments 1 and 3 is 7.2 - segments 1 and 2 is 7.6 w[7.6] w[7.2] r[3.8] w[5.0] w[8.6] o[7.6]r[6.3] d[4.9] w[5.0] o[6.6] o[6.0] o[7.2] o[10.6] d[6.5] d[4.4] r[7.5] r[6.4] o[7.8]r[8.6] o[8.7]r[7.4] r[7.6] o[8.3] o[7.7]r[5.8] 1 2 3 4 5 6 7 8 9 o[6.1]
  • 10. Grapheme Models (LFR) Writer Specific Modeling Holistic Features grapheme pos orientation angle Down cusp 3.0 -90 o Up loop Down arc
  • 11.
  • 12. Interactive Models (LDR) Phrase Level T-crossings, loops, ascenders, descenders, length West Central Street West Main Street Sunset Avenue West Central Street East Central Street Sunset Avenue West Central Street West Central Avenue Sunset Avenue Lexicon 1 Lexicon 2 Lexicon 3 Interactive Model features image 2-way interaction
  • 13.
  • 15. Results 10 class digit recognition 25656 training and 12242 test (Postal +NIST) Active Model Neural Net KNN Top 1% 95.7 % 96.4% 95.7% Temp 612 976 3,777 Msec 1.45 11.5 384 Training hrs 1 24 1 Lex size LDR % GM % 10 96.86 96.56 100 91.36 89.12 1000 79.58 75.38 (Top 50) 98.00 98.40 20000 62.43 58.14 (Top 100) 93.59 93.39
  • 16. Fusion Identification Task Verification Task LDR LFR
  • 17. Fusion of Recognizers Type III LDR 5.6 7.4 … LFR .52 .81 … Identification task: Amherst Buffalo … Verification task: 5.6 .52 Amherst Question: if we find optimal and , is it necessarily ? Accept Reject
  • 18.
  • 19.
  • 20.
  • 21. Independence of Scores In a single trial Amherst 5.6 7.4 … Buffalo .52 .81 … LDR LFR … … . … .
  • 22. Lexicon1 Lexicon i Lexicon N Independence of Scores In a single trial Recognizer 1 Recognizer M Tulyakov & Govindaraju, TIFS 2009 Independent? Dependent Dependent
  • 23. Optimal Combination ? Correlated Scores Dependent on input signal Set size LFR LDR Both correct Either correct LR Weighted sum 54.8% 77.2% 48.9% 83.0% 69.8% 81.6% 6147 3366 4744 3005 5105 4293 5015 2 nd choice 3 rd choice 4 th choice Mean LFR .4359 .4755 .4771 .1145 LDR .7885 .7825 .7673 .5685
  • 24. Optimal Trainable Combination Function Minimizing misclassification cost: Classify as rather than Assume that scores assigned to different classes are independent : Tulyakov & Govindaraju IJPRAI 2009
  • 25. Combination Methods Identification Tasks No! Traditional Training mixes the genuine and imposter scores from different trials. Recognizer score 2 Recognizer score 1 Impostor Genuine Recognizer score 2 Recognizer score 1 Impostor Genuine Recognizer Score 2 Recognizer score 1
  • 26. Combination Methods Identification Tasks Model Training MUST process scores from one identification trial as a single training sample . BRecognizer score 2 Recognizer score 1 Impostor Genuine Rexcognizer score 2 Recognizer score 1 Impostor Genuine Recognizer score 2 Biometric score 1
  • 27.
  • 28.
  • 29.
  • 30.
  • 31.
  • 32. Topic Categorization Lexicon Reduction Lex Free Large Lexicon > 5K Handwritten Medical Documents ICR Features ~33% word Recognition rate (10 points gain) Topic Categorization Select Reduced Lexicon ~2.5K Lex Driven
  • 34. DIGESTIVE-SYSTEM FQ CHSN PHRASE 30 0.72 PAIN INCIDENT 5 0.31 PAIN TRANSPORTED 42 0.54 PAIN CHEST 52 0.81 STOMACH PAIN 9 0.25 HOME PAIN 6 0.43 VOMITING ILLNESS Topic Features
  • 35. (Chu-Carroll, et al., 1999) Topic Categorization
  • 36. Results C: complete lexicon R: reduced lexicon A: category given S: features synthetic T: truth present CLT to RLT CL to RL CLT to ALT CLT to SLT HR  7.48%  7.42%  17.58%  7.42% Error Rate  10.78%  10.88%  24.53%  10.21%
  • 37.
  • 38.
  • 39.
  • 40. Document Enhancement [Shi, Setlur, and Govindaraju 2008]
  • 41. Transcript-Mapping 1787 Thomas Jefferson letter and its transcript Image Transcript + +
  • 42. What is not possible today?
  • 44. Crosslingual Retrieval Multilingual Document Corpus Retrieved Documents English Hindi Sanskrit Translations of “strength”
  • 45. SEARCH Handwritten Documents. Image-based: use image-based features. OCR-based: use OCR recognition results. Query rendered.
  • 46. Image Based Methods (Rath 07 IJDAR) Poor performance in multiple writer scenarios
  • 47. SEARCH Handwritten Documents. Image-based: use image-based features. OCR-based: use OCR recognition results.
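
The weighted-sum combination listed on slide 23 can be illustrated with a minimal sketch. It assumes that each recognizer returns a similarity score in [0, 1] for every lexicon entry (higher is better) and that a single weight has been chosen beforehand; the function name, the weight, and the toy scores are hypothetical and are not taken from the talk.

```python
def combine_weighted_sum(lfr_scores, ldr_scores, weight=0.5):
    """Rank lexicon entries by a weighted sum of two recognizers' scores.

    lfr_scores, ldr_scores: dicts mapping a lexicon word to a similarity
    score (assumed to be in [0, 1], higher is better).
    weight: share given to the lexicon-free score (hypothetical value).
    """
    combined = {}
    # Only words scored by both recognizers can be combined.
    for word in lfr_scores.keys() & ldr_scores.keys():
        combined[word] = (weight * lfr_scores[word]
                          + (1.0 - weight) * ldr_scores[word])
    # Best candidate first.
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Toy scores for one word image (made up for illustration only).
lfr = {"word": 0.40, "ward": 0.55, "wood": 0.30}
ldr = {"word": 0.90, "ward": 0.60, "wood": 0.50}

ranking = combine_weighted_sum(lfr, ldr, weight=0.4)
top_word = ranking[0][0]
```

In this toy case the lexicon-free scores alone would rank "ward" first, while the combined ranking puts "word" first. A trainable combination, as slide 26 stresses, would fit its parameters on whole identification trials rather than on pooled individual scores.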

Editor's notes

  1. ½ min Good Afternoon: I am Venu Govindaraju, Professor at the University at Buffalo. The title of my talk today is “Paradigms in Handwriting Recognition”. This will be in the context of “English” language and the Roman alphabet. The idea is to see if some of the techniques that have proved successful in English are also applicable to Arabic or Chinese. This will be an overview style presentation: describing paradigms, applications, and accuracy figures.
  2. In the postal application, we are able to operate with an average lexicon size of 30. When we do not have collateral information, how does one reduce the lexicon size?
  3. 1 min The problem of handwriting recognition has typically been defined as follows. The inputs are a bitmap image of the word to be recognized AND a lexicon of possible choices. The lexicon usually captures the context of the application at hand. When the lexicon is not provided by the application, it is assumed to be the entire English dictionary, or at least the words in common usage. In such cases, the lexicon can contain tens of thousands of words. The output is a ranked list of the lexical choices, often with an associated confidence score for each choice. In this talk, we make two assumptions. The first is that we are dealing with single words or short phrases of a few words. There has been a considerable body of work on recognition of entire sentences. An early paper on the topic was published by Kim, Govindaraju and Srihari in IJDAR 1997. Since then, several papers have been published on the topic, most notably from Prof. Suen's group at Concordia and Prof. Bunke's group in Switzerland. The second assumption is that we are dealing with offline handwriting recognition.
  4. We are looking at the narrative text in medical forms, and we are using medical dictionaries. The techniques scale to other applications as well. We want to develop a search engine for such medical forms, where a health official could search the forms by querying with medical terms. We demonstrated the method of keyword spotting at the demo session yesterday. We will now describe an alternate method that attempts full transcription, which is expected to be errorful, and see if search engines are still viable. The handwriting is sloppy: it is written in ambulances and other emergency scenarios. Abbreviations are freely used. Documents are carbon copies, and binarization itself is a challenge; we presented this work at DAS 06. Lexicon-free recognition can pick up only a few characters in each word with reasonable confidence. For lexicon-driven recognition, the lexicons will be larger than 5K, for which the accuracy is in the 20s. What should we do?
  5. One problem with using cohesive phrases alone is that during the recognition phase we do not know the words. Therefore, we extract terms from these cohesive phrases and use them to model the category with which they are associated. This is the basis for the hypothesis. For example [read slide]
  6. The pseudo-category vector is then attached to the matrix of category column vectors.
  7. Some more detail concerning the impact of ruled-line removal on word recognition: we extracted all the test word images from lined pages and measured the top-choice recognition performance. Here are the numbers. Total word images in the test set: 848, from a total of 274 pages. Of these, the number of word images from pages with ruled lines: 460, from 146 lined pages. The ratio of words and pages with ruled lines in the 34 PAW data set: 460/848 = 54.25% (words), 146/274 = 53.28% (pages). Top-1 recognition performance on words from lined pages: earlier 318/460 = 69.13%, now 349/460 = 75.87%. Ruled-line removal therefore improves top-1 word recognition by 6.74 percentage points (evaluated on words from lined pages). The overall top-1 improvement is 4.13 percentage points (evaluated on the full test set, including word images from both lined and non-lined pages, which we had reported earlier). Also, the PAW recognizer is a straightforward implementation using a k-nearest neighbor classifier. The features used are CUBS Gradient, Structure and Concavity features. The classifier is a very simple implementation that can be improved; its purpose was to test the effectiveness of our features.
  8. Digital libraries like the George Washington Papers collection at the Library of Congress consist of approximately 152,000 handwritten document images and associated transcripts. The Newton Project aims to make all of Newton's writings available online. Aligning the transcription with the handwritten text in these libraries would make it possible to automatically generate an immense database of word images, which in turn can be used as truth data by word recognizers to create transcriptions for the remaining scanned documents. The tedious process of manually dragging a box around each word in an image and keying in the annotations could thus be avoided. In forensic document evaluation, capturing characteristics specific to a writer is of paramount importance in both writer identification and writer verification. Thus, if a mapping algorithm correctly maps word images to lexicon words during preprocessing, the accuracy of writer recognition would improve remarkably. For existing scanned images, the alignment makes it possible to build interfaces where the transcript text can be browsed alongside the manuscript.
  9. Existing keyword spotting approaches can be classified into two categories: (a) image based and (b) OCR based. (a) In image-feature-based indexing approaches, after preprocessing of document images and word segmentation, feature vectors are extracted from word images and stored in a database. When a user provides a query word, the similarity between the query and each word image in the database is computed, and word images are returned in decreasing order of similarity. (b) In OCR-based approaches, the indices are built from OCR scores such as posterior probabilities, or from feature vector observational likelihoods (probability densities) converted from distances returned by the word recognizer.
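
The image-based indexing described in the keyword spotting note can be sketched as follows. This is only an illustration under stated assumptions: the feature vectors are taken as already extracted, cosine similarity is used as a stand-in for whatever similarity measure a real system would apply, and the identifiers (`rank_word_images`, the image IDs, the toy vectors) are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank_word_images(query_vector, index):
    """Return (image_id, similarity) pairs in decreasing order of similarity.

    index: dict mapping a word-image ID to its stored feature vector.
    """
    scored = [(image_id, cosine_similarity(query_vector, vector))
              for image_id, vector in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy database of word-image feature vectors (made up for illustration).
index = {
    "page1_word3": [1.0, 0.0, 0.0],
    "page2_word7": [0.0, 1.0, 0.0],
    "page5_word1": [0.8, 0.6, 0.0],
}
query = [1.0, 0.0, 0.0]  # features of the rendered query word (assumed)

results = rank_word_images(query, index)
```

Here `results` lists the word images from most to least similar to the query. An OCR-based index would instead store recognizer scores per word image and rank by those.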