Trivandrum

Unlocking the Handwritten Content in Document Images Venu Govindaraju [email_address]

Handwritten Documents
[Figure: Forms, Letters, and Notes pass through Scanner, Storage, and OCR to give Noisy Text; a Query such as Newton Kinematics Notes is matched by Relevance]

Outline

Input Output 20187 + 2246 Handwriting Recognition

Postal Context (138 mil records)
LDR
Lex size   Top 1   Top 2
10         96.5    98.7
100        89.2    94.1
1000       75.3    86.3

Paradigms
Lexicon Driven OCR (LDR)
Lexicon Free OCR (LFR)
Context, Ranked Lexicon, Segmentation, Recognition, Post-processing

Lexicon Free (LFR)
Character hypotheses with confidences: i[.8], l[.8], u[.5], v[.2], w[.6], m[.3], w[.7], i[.7], u[.3], m[.2], m[.1], r[.4], d[.8], o[.5]
Find the best path in the graph from segment 1 to 8
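The best-path search can be sketched as dynamic programming over segmentation points. This is only an illustration: the slide lists character confidences but not which segments each hypothesis spans, so the edge spans below are hypothetical, and scoring a path by the product of its confidences is an assumption rather than the method used in the talk.

```python
from collections import defaultdict

def best_path(edges, start, end):
    """Return (score, text) of the highest-scoring path from start to end.

    edges: list of (from_point, to_point, char, confidence) with from < to.
    A path score is the product of its edge confidences (assumed rule).
    """
    outgoing = defaultdict(list)
    for u, v, ch, conf in edges:
        outgoing[u].append((v, ch, conf))

    # best[p] = (best score of any path start -> p, decoded text so far)
    best = {start: (1.0, "")}
    # Points are numbered left to right, so increasing order is topological.
    for u in range(start, end):
        if u not in best:
            continue
        score, text = best[u]
        for v, ch, conf in outgoing[u]:
            candidate = (score * conf, text + ch)
            if v not in best or candidate[0] > best[v][0]:
                best[v] = candidate
    return best.get(end)

# Hypothetical spans; confidences are taken from the slide where possible.
edges = [
    (1, 2, "i", 0.8), (1, 3, "u", 0.5), (1, 4, "w", 0.7),
    (2, 4, "v", 0.2), (3, 4, "i", 0.7),
    (4, 5, "o", 0.5),
    (5, 6, "r", 0.4), (5, 7, "m", 0.2),
    (6, 8, "d", 0.8), (7, 8, "l", 0.8),
]
score, text = best_path(edges, 1, 8)
print(text, round(score, 3))
```

With these made-up spans the search prefers the single wide hypothesis w[.7] over the narrower i, u, and v alternatives for the first segments.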

Lexicon Driven (LDR)
Find the best way of accounting for the characters ‘w’, ‘o’, ‘r’, ‘d’ by consuming all segments 1 to 8.
Distance between the first character ‘w’ of lexicon entry ‘word’ and the image between:
- segments 1 and 4 is 5.0
- segments 1 and 3 is 7.2
- segments 1 and 2 is 7.6
[Figure: segmentation graph over points 1 to 9, with edges labeled by character distances such as w[5.0], o[6.0], r[3.8], d[4.4]]
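A minimal sketch of this matching as dynamic programming, assuming a distance function `dist(ch, i, j)` between a character and the image between segment points i and j. The three ‘w’ distances come from the slide; every other table entry, the fallback distance of 20.0, and the limit of four segments per character are hypothetical values chosen for illustration.

```python
import math

def match_word(word, n_points, dist, max_span=4):
    """Match `word` against segment points 1..n_points, consuming all segments.

    Returns (total_distance, boundaries), where boundaries[k] and
    boundaries[k + 1] delimit the k-th character, or (inf, None) if no
    assignment exists.
    """
    inf = math.inf
    n_chars = len(word)
    # cost[k][j]: least distance of matching the first k characters
    # to the image between point 1 and point j.
    cost = [[inf] * (n_points + 1) for _ in range(n_chars + 1)]
    back = [[None] * (n_points + 1) for _ in range(n_chars + 1)]
    cost[0][1] = 0.0
    for k, ch in enumerate(word, start=1):
        for j in range(2, n_points + 1):
            for i in range(max(1, j - max_span), j):
                if cost[k - 1][i] == inf:
                    continue
                c = cost[k - 1][i] + dist(ch, i, j)
                if c < cost[k][j]:
                    cost[k][j] = c
                    back[k][j] = i
    total = cost[n_chars][n_points]
    if total == inf:
        return inf, None
    # Walk the back-pointers from the last point to recover the boundaries.
    bounds, j = [n_points], n_points
    for k in range(n_chars, 0, -1):
        j = back[k][j]
        bounds.append(j)
    return total, bounds[::-1]

# 'w' entries are from the slide; the rest are hypothetical.
table = {
    ("w", 1, 2): 7.6, ("w", 1, 3): 7.2, ("w", 1, 4): 5.0,
    ("o", 4, 5): 7.2, ("o", 4, 6): 6.0,
    ("r", 5, 6): 6.3, ("r", 6, 7): 3.8,
    ("d", 6, 8): 4.9, ("d", 7, 8): 4.4,
}
total, bounds = match_word("word", 8, lambda ch, i, j: table.get((ch, i, j), 20.0))
print(round(total, 1), bounds)
```

Running the same routine for every lexicon entry and sorting by total distance would give a ranked lexicon.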

Grapheme Models (LFR)
Writer Specific Modeling
Holistic Features: grapheme, pos, orientation, angle (e.g. Down cusp, 3.0, -90°; Up loop; Down arc)

Interactive Models (LDR)
Levels: Words (ABLE, TRIP, TRAP), Letters (A, T, N), Features
1-way activation [McClelland and Rumelhart 1981]; 2-way interaction

Interactive Models (LDR): Phrase Level
Features: T-crossings, loops, ascenders, descenders, length
Lexicon 1: West Central Street, West Main Street, Sunset Avenue
Lexicon 2: West Central Street, East Central Street, Sunset Avenue
Lexicon 3: West Central Street, West Central Avenue, Sunset Avenue
Interactive Model: features, image, 2-way interaction

Interactive Models: Character Recognition
Gradient (4) and Moment (5) Features: 0 1 0 1 1 1 0 0 1 [Park and Govindaraju, IEEE CVPR 2000]

Results
10 class digit recognition: 25656 training and 12242 test (Postal + NIST)
               Active Model   Neural Net   KNN
Top 1 (%)      95.7           96.4         95.7
Temp           612            976          3,777
Msec           1.45           11.5         384
Training hrs   1              24           1

Lex size    LDR %   GM %
10          96.86   96.56
100         91.36   89.12
1000        79.58   75.38
(Top 50)    98.00   98.40
20000       62.43   58.14
(Top 100)   93.59   93.39

Fusion Identification Task Verification Task LDR LFR

Fusion of Recognizers (Type III)
        Amherst   Buffalo   …
LDR     5.6       7.4       …
LFR     .52       .81       …
Identification task: Amherst, Buffalo, …
Verification task: (5.6, .52) for Amherst: Accept or Reject
Question: if we find optimal … and …, is it necessarily …?

Traditional Fusion Rules

Likelihood Ratio: Verification Tasks
Minimum risk criteria: optimal decision boundaries coincide with the contours of the likelihood ratio function.
Metaclassification with NN, SVM, etc. is also possible [Prabhakar, Jain 02] [Nandkumar, Jain, Das 08].
[Figure: genuine and impostor scores plotted as recognizer score 1 versus recognizer score 2]
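As a rough illustration of a likelihood ratio decision, the sketch below takes the ratio of the genuine score density to the impostor score density and accepts when it exceeds a threshold. The density model is an assumption made for brevity: one independent Gaussian per recognizer score, fitted with Python's `statistics.NormalDist`. The training scores and the threshold of 1.0 are hypothetical, and the scores are treated as similarities (higher is better).

```python
from statistics import NormalDist

def fit_class(samples):
    """Fit one Gaussian per recognizer score (independence is an assumption)."""
    return [NormalDist.from_samples(column) for column in zip(*samples)]

def likelihood_ratio(scores, genuine_model, impostor_model):
    """p(scores | genuine) / p(scores | impostor) under the fitted Gaussians."""
    ratio = 1.0
    for s, gen, imp in zip(scores, genuine_model, impostor_model):
        ratio *= gen.pdf(s) / imp.pdf(s)
    return ratio

def verify(scores, genuine_model, impostor_model, threshold=1.0):
    """Accept when the likelihood ratio exceeds the threshold."""
    return likelihood_ratio(scores, genuine_model, impostor_model) > threshold

# Hypothetical (score 1, score 2) training pairs.
genuine = [(0.8, 0.9), (0.7, 0.8), (0.9, 0.7), (0.6, 0.85)]
impostor = [(0.2, 0.3), (0.3, 0.1), (0.1, 0.25), (0.35, 0.4)]
gen_model, imp_model = fit_class(genuine), fit_class(impostor)

print(verify((0.75, 0.8), gen_model, imp_model))  # scores near the genuine cluster
print(verify((0.2, 0.2), gen_model, imp_model))   # scores near the impostor cluster
```

Raising or lowering the threshold trades false accepts against false rejects, which is how an ROC curve for the verification task would be traced.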

Optimal Combination Functions
Identification Task Results (top choice correct rate):
LFR is correct: 54.8%
LDR is correct: 77.2%
Both are correct: 48.9%
Either is correct: 83.0%
Likelihood Ratio: 69.8%
Weighted Sum: 81.6%
Verification Task Results: ROC

Independence of Scores
In a single trial:
        Amherst   Buffalo   …
LDR     5.6       7.4       …
LFR     .52       .81       …

Independence of Scores
In a single trial: Recognizer 1 … Recognizer M score Lexicon 1 … Lexicon i … Lexicon N
Independent? Dependent, Dependent
Tulyakov & Govindaraju, TIFS 2009

Optimal Combination?
Correlated Scores; Dependent on input signal
                 Count   Rate
Set size         6147
LFR              3366    54.8%
LDR              4744    77.2%
Both correct     3005    48.9%
Either correct   5105    83.0%
LR               4293    69.8%
Weighted sum     5015    81.6%

       2nd choice   3rd choice   4th choice   Mean
LFR    .4359        .4755        .4771        .1145
LDR    .7885        .7825        .7673        .5685

Optimal Trainable Combination Function
Minimizing misclassification cost: classify as … rather than …
Assume that scores assigned to different classes are independent: …
Tulyakov & Govindaraju IJPRAI 2009

Combination Methods: Identification Tasks
No! Traditional training mixes the genuine and impostor scores from different trials.
[Figure: three scatter plots of genuine and impostor scores, recognizer score 1 versus recognizer score 2]

Combination Methods: Identification Tasks
Model training MUST process scores from one identification trial as a single training sample.
[Figure: three scatter plots of genuine and impostor scores, recognizer score 1 versus recognizer score 2]
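One way to read this requirement is that the training objective is evaluated per trial: a trial counts as correct only if the genuine class outranks every impostor in that same trial. The sketch below tunes a weighted sum on that basis. The grid search, the similarity-style scores, and the toy trials are all assumptions for illustration, not the training procedure from the talk.

```python
def top_choice_correct(trials, w):
    """Count trials whose genuine class gets the highest combined score.

    trials: list of (scores, genuine_index), where scores holds one
    (score_1, score_2) pair per class in that identification trial.
    """
    correct = 0
    for scores, genuine_index in trials:
        combined = [w * s1 + (1 - w) * s2 for s1, s2 in scores]
        top = max(range(len(combined)), key=combined.__getitem__)
        if top == genuine_index:
            correct += 1
    return correct

def train_weight(trials, steps=100):
    """Grid-search the weight that maximizes the per-trial top-choice count."""
    grid = (k / steps for k in range(steps + 1))
    return max(grid, key=lambda w: top_choice_correct(trials, w))

# Hypothetical trials: recognizer 2 is reliable, recognizer 1 is misleading.
trials = [
    ([(0.9, 0.2), (0.5, 0.8), (0.1, 0.1)], 1),
    ([(0.6, 0.9), (0.7, 0.3)], 0),
    ([(0.4, 0.7), (0.8, 0.6), (0.2, 0.1)], 0),
]
w = train_weight(trials)
print(w, top_choice_correct(trials, w))
```

Because each trial is scored as a unit, genuine and impostor scores from different trials are never pooled into one cloud.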

Iterative Methods
Best Impostor Function
            Likelihood Ratio   Weighted Sum   Best Impostor Likelihood Ratio   Logistic Sum   Neural Network
LFR & LDR   69.84              81.58          80.07                            81.43          81.67
li & C      97.24              97.23          97.01                            97.34          97.39
li & G      95.90              95.47          95.99                            96.17          96.29

Outline

Search for Handwritten Documents
             Good Quality     Historical      Medical
Lexicon      10K     1K       10K     1K      4K
Top 1 (%)    57      67       12      28      20
Top 3 (%)    69      72       22      44      27
Top 10 (%)   74      75       32      72      42

Search Engine: Handwritten Forms

Search Engine for Medical Forms

Topic Categorization: Lexicon Reduction
Handwritten Medical Documents
ICR: Lex Free, Large Lexicon > 5K
Features
Topic Categorization
Select Reduced Lexicon ~2.5K
Lex Driven
~33% word recognition rate (10 points gain)
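One reading of this pipeline is that a first noisy recognition pass supplies words, phrase features over those words pick a topic, and the topic selects a smaller lexicon for a second, lexicon-driven pass. The sketch below covers only the categorization and lexicon selection steps. The scoring rule (sum of the weights of phrases whose words were both observed), the second topic, and the lexicon contents are hypothetical.

```python
def categorize(noisy_words, topic_features):
    """Return the topic whose phrase features best match the noisy words.

    topic_features: {topic: {(word_a, word_b): weight}}. A phrase counts
    as present when both of its words appear in the noisy transcript.
    """
    observed = set(noisy_words)

    def score(topic):
        return sum(
            weight
            for phrase, weight in topic_features[topic].items()
            if all(word in observed for word in phrase)
        )

    return max(topic_features, key=score)

def reduced_lexicon(topic, topic_lexicons):
    """Look up the smaller lexicon to hand to the lexicon-driven recognizer."""
    return topic_lexicons[topic]

# Phrase weights for DIGESTIVE-SYSTEM echo the topic features slide;
# the second topic and both lexicons are made up for this example.
topic_features = {
    "DIGESTIVE-SYSTEM": {("STOMACH", "PAIN"): 0.81, ("VOMITING", "ILLNESS"): 0.43},
    "HYPOTHETICAL-TOPIC": {("CHEST", "TIGHTNESS"): 0.60},
}
topic_lexicons = {
    "DIGESTIVE-SYSTEM": ["STOMACH", "PAIN", "VOMITING", "NAUSEA"],
    "HYPOTHETICAL-TOPIC": ["CHEST", "TIGHTNESS", "BREATH"],
}

noisy_words = ["STOMACH", "PAIN", "VOMLTING"]  # the last word is misrecognized
topic = categorize(noisy_words, topic_features)
print(topic, reduced_lexicon(topic, topic_lexicons))
```

The point of the design is that topic selection tolerates individual word errors, since only some phrases need to survive the noisy first pass.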

Topic Features
Category: DIGESTIVE-SYSTEM
FQ   CHSN   PHRASE
30   0.72   PAIN INCIDENT
5    0.31   PAIN TRANSPORTED
42   0.54   PAIN CHEST
52   0.81   STOMACH PAIN
9    0.25   HOME PAIN
6    0.43   VOMITING ILLNESS

(Chu-Carroll, et al., 1999) Topic Categorization

Results
C: complete lexicon; R: reduced lexicon; A: category given; S: features synthetic; T: truth present
             CLT to RLT   CL to RL   CLT to ALT   CLT to SLT
HR           7.48%        7.42%      17.58%       7.42%
Error Rate   10.78%       10.88%     24.53%       10.21%

Urgent Issue of our Times
Threat: ‘If it’s not in Google, it doesn’t exist!’ Baird 2003

What is possible today?

Document Enhancement [Shi, Setlur, and Govindaraju 2008]

Transcript-Mapping: 1787 Thomas Jefferson letter (image) and its transcript

Crosslingual Retrieval Multilingual Document Corpus Retrieved Documents English Hindi Sanskrit Translations of “strength”

SEARCH Handwritten Documents
Image-Based: use image-based features
OCR-Based: use OCR recognition results
Query rendered

Image Based Methods (Rath 07 IJDAR) Poor performance in multiple writer scenarios

SEARCH Handwritten Documents
Image-Based: use image-based features
OCR-Based: use OCR recognition results

Indexing Retrieval Handwriting Recognition

Vector IR Model (TF-IDF) [Baeza-Yates99]
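The bullet text of this slide did not survive, so the sketch below shows only the textbook form of the vector model: raw term frequency times log inverse document frequency, with documents ranked by cosine similarity to the query. The exact weighting used in the talk is not known, and the three toy documents are hypothetical.

```python
import math
from collections import Counter

def build_index(docs):
    """Return (tf-idf vector per document, idf table) for tokenized docs."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n_docs / df) for term, df in doc_freq.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(doc).items()}
        for doc in docs
    ]
    return vectors, idf

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def search(query, vectors, idf):
    """Rank document indices by similarity to the tokenized query."""
    q = {term: tf * idf.get(term, 0.0) for term, tf in Counter(query).items()}
    scores = [cosine(q, vec) for vec in vectors]
    ranking = sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)
    return ranking, scores

docs = [
    ["chest", "pain", "patient"],
    ["stomach", "pain", "vomiting"],
    ["patient", "transported", "home"],
]
ranking, scores = search(["stomach", "pain"], *build_index(docs))
print(ranking)
```

For handwritten documents the term counts would come from noisy recognition output rather than clean text.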

Modifications to VM

Estimation: word images 0.02 0.01 0.2 0.01 0.01 … Doc d_j [Rath 04, Howe 05]

Estimating Segmentation
d > D
3 hypotheses

Word Recognition
