Text detection in product images
10/26/2013
Naoki Chiba, Lead Scientist

Rakuten Institute of Technology
Rakuten Inc.
http://rit.rakuten.co.jp/
Product images
Sales pitches in images

Applications:
• Content retrieval/filtering
• Recognition
• Translation
2
RIT Text Detector

Far more accurate

Works like magic

3
Outline

1 Text detection overview
2 Current methods
3 RIT’s approach

4
Academic Research
• Natural scene OCR ≠ traditional scanned OCR
– Camera captured
– Illumination variations
– Perspective distortion
– Short text

[Figure: examples of digital-born text and natural-scene text]
Source: ICDAR Text locating competition
6
Product Images - Two Purposes
Text’s role is different

1. Sales pitches
2. Product list

7
Product list
Sales pitch (Merchant’s names, Price, Shipping)

8
“Now Printing” images
Show that an image is unavailable, but...

Not updated

9
Text detection for product images

More accurate
Much Faster

10
Current methods

1. Texture based (Classifier-based)
2. Region based (Connected components)
3. Hybrids

12
1. Texture-based method
• Special texture
• Scan
• Classifier (SVM, AdaBoost, or neural network)

Problems:
• Scale/Rotation variant

• High computation
13
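The scan step can be pictured as a sliding window evaluated at several window sizes. The sketch below is only an illustration, not the method of any particular detector: the window sizes, the stride, and the names `sliding_windows` and `count_windows` are assumptions made for this example. It counts how many patches a classifier would have to score, which is where the high computation cost comes from.

```python
def sliding_windows(width, height, win, stride):
    """Yield (x, y, win, win) boxes covering a width x height image."""
    for y in range(0, height - win + 1, stride):
        for x in range(0, width - win + 1, stride):
            yield (x, y, win, win)


def count_windows(width, height, win_sizes, stride):
    """Count the patches a classifier would score over all window sizes."""
    return sum(
        sum(1 for _ in sliding_windows(width, height, win, stride))
        for win in win_sizes
    )


if __name__ == "__main__":
    # Hypothetical 100x100 image, two window sizes, stride of 10 pixels.
    print(count_windows(100, 100, win_sizes=[20, 40], stride=10))
```

Every extra window size adds another full pass over the image, and each window needs one classifier evaluation, so the cost grows quickly with image size and the number of scales.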
2. Region-based method
• Local features (edges or color clustering)
• Connected component analysis
• Text lines and word separation
[Figure: output of stroke width transform]
Problem:
• False candidates

14
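As a rough illustration of the connected component step, the sketch below labels 4-connected foreground regions in a small binary mask and returns their bounding boxes. It assumes the mask has already been produced by some edge or color step; the function names and the toy mask are made up for this example and are not taken from the talk.

```python
from collections import deque


def connected_components(mask):
    """Label 4-connected foreground regions in a binary grid.

    mask is a list of rows of 0/1 values; returns a list of components,
    each a list of (row, col) pixels.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def bounding_box(pixels):
    """Return (top, left, bottom, right) of a component, inclusive."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    return (min(ys), min(xs), max(ys), max(xs))


if __name__ == "__main__":
    mask = [
        [1, 1, 0, 0, 1],
        [1, 0, 0, 0, 1],
        [0, 0, 1, 0, 0],
    ]
    for component in connected_components(mask):
        print(bounding_box(component), len(component))
```

Every component becomes a candidate, including the isolated single pixel in the toy mask, which hints at why this family of methods tends to produce false candidates that need later filtering.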
3. Hybrid method

Classifier
SVM
Random Forest
AdaBoost

Region based
Edge (Stroke Width Transform)
Color clustering

15
Problems

1. Character/word annotation
Time-consuming task

2. Transparent text
Hard to detect

16
Problem 1: Character/word annotation
Time consuming for many images

17
Problem 2: Transparent text

• Weak edges (difficult to detect)

18
RIT’s Approach

1. Character/word annotation
Time-consuming task

Text image classifier using image-wise annotation
2. Transparent text
Hard to detect

Transparent text detection and background recovery

20
1. Text image classifier using image-wise annotation

• Text image detection (not char/word)
– Image-wise annotation (less time)
– Clustering detected regions (measure text likeliness)

21
Image-wise Annotation

Character-wise: draw rectangles around text (example: 送料無料, “free shipping”)

Image-wise: classify each image as text/non-text

22
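Clustering detected regions
[Figure: detected regions plotted in feature space (axes f1, f2) and grouped into clusters C1 to C5, each with a cluster center (x). Markers distinguish regions in text images from regions in non-text images. Example values: P(C1) = 3/4, P(C4) = 0/3.]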
23
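The per-cluster values in the figure (P(C1) = 3/4, P(C4) = 0/3) read as the share of a cluster's regions that come from text images. A minimal sketch of that counting follows, under stated assumptions: the cluster centers are taken as given (for instance from k-means, which the slides do not name), regions are assigned to the nearest center by squared Euclidean distance, and all function names and sample data are hypothetical.

```python
from collections import Counter


def nearest_center(point, centers):
    """Return the index of the cluster center closest to point."""
    return min(
        range(len(centers)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centers[i])),
    )


def cluster_text_likeliness(regions, centers):
    """Estimate P(C) per cluster as the share of regions from text images.

    regions is a list of (feature_vector, from_text_image) pairs.
    Clusters that receive no regions are omitted from the result.
    """
    totals = Counter()
    text_counts = Counter()
    for features, from_text_image in regions:
        k = nearest_center(features, centers)
        totals[k] += 1
        if from_text_image:
            text_counts[k] += 1
    return {k: text_counts[k] / totals[k] for k in totals}


if __name__ == "__main__":
    centers = [(0.0, 0.0), (10.0, 10.0)]  # assumed, e.g. from k-means
    regions = [
        ((0.5, 0.2), True), ((0.1, 0.9), True),
        ((1.0, 0.3), True), ((0.4, 0.4), False),
        ((9.5, 10.1), False), ((10.2, 9.8), False), ((9.9, 9.9), False),
    ]
    print(cluster_text_likeliness(regions, centers))
```

Only an image-level label (text image or non-text image) is needed for each region, which is what makes image-wise annotation cheaper than drawing character or word rectangles.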
Comparison
Better than a typical method
[Figure: bar chart of accuracy (axis 0% to 90%), Current vs. Proposed]

• 500 Rakuten images
• Compared with a traditional region-based method

24
2. Transparent text detection and background recovery
• Edge detection with adaptive threshold
– Image content analysis

• Background recovery
– Text color/opacity estimation

26
Edge detection with adaptive thresholds

• Less noise
• Weak edges are better preserved
27
Texture strength
Measuring image complexity
Image patches:
Direction and energy: eigenvectors and eigenvalues [1]

Texture strength:
[1] Xiang Zhu and Peyman Milanfar, “Automatic parameter selection for denoising algorithms using a no-reference measure of image content,” IEEE Transactions on Image Processing, pp. 3116–3132, 2010.
28
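The slide's formulas did not survive in this transcript, so the following is only a sketch of how the eigenvalues of a patch's gradient structure matrix could be computed. It is not the texture strength definition used in the talk. Forward-difference gradients, the unnormalized 2x2 matrix, and the function name are assumptions chosen to keep the example dependency-free.

```python
import math


def gradient_eigenvalues(patch):
    """Eigenvalues (largest first) of a patch's 2x2 gradient structure matrix.

    patch is a list of rows of grey values. Gradients are forward
    differences, an assumption made to keep the sketch dependency-free.
    """
    gxx = gxy = gyy = 0.0
    rows, cols = len(patch), len(patch[0])
    for y in range(rows - 1):
        for x in range(cols - 1):
            gx = patch[y][x + 1] - patch[y][x]
            gy = patch[y + 1][x] - patch[y][x]
            gxx += gx * gx
            gxy += gx * gy
            gyy += gy * gy
    # Closed-form eigenvalues of the symmetric matrix [[gxx, gxy], [gxy, gyy]].
    mean = (gxx + gyy) / 2.0
    spread = math.sqrt(((gxx - gyy) / 2.0) ** 2 + gxy ** 2)
    # Clamp to zero to absorb tiny negative values from rounding.
    return mean + spread, max(mean - spread, 0.0)


if __name__ == "__main__":
    flat = [[5, 5, 5, 5] for _ in range(4)]
    edge = [[0, 0, 10, 10] for _ in range(4)]
    print(gradient_eigenvalues(flat))
    print(gradient_eigenvalues(edge))
```

A flat patch gives two zero eigenvalues, while a patch with a single straight edge gives one large and one zero eigenvalue. That is the sense in which the eigenvalues carry energy, and the matching eigenvectors would carry direction.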
Proposed text detection
1. Texture based (Classifier based)
SVM/Random Forest/AdaBoost

2. Region based (Connected components)
Edge/Color Clustering

3. Hybrids
Region (Edge Stroke Width)
+
Texture (AdaBoost)

29
System flow

[Flow diagram: Input image → Adaptive edge detection → Stroke width transform and connected components → Components analysis → Detected text]
30
Detection result

(a) constant threshold

(b) proposed
31
System flow

[Flow diagram: Input image → Adaptive edge detection → Stroke width transform and connected components → Components analysis → Detected text → Background recovery]
32
Transparent Text
I = O(1 − r) + rT
I: observed pixel value
O: original pixel value
r: opacity
T: text color

• 2 unknowns (r, T)
• ≥ 2 equations
• Least-squares solution

33
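Because I = O(1 − r) + rT is linear in O, the two unknowns can be fitted with an ordinary straight-line least-squares fit: I = aO + b with a = 1 − r and b = rT. The sketch below makes several assumptions not stated on the slide: a single grey channel, original values O known for a few sample pixels (for example, estimated from nearby text-free background), and illustrative function names.

```python
def estimate_opacity_and_color(originals, observed):
    """Fit I = O(1 - r) + rT by ordinary least squares.

    The model is linear in O: I = a*O + b with a = 1 - r and b = r*T.
    Returns (r, T). Needs at least two samples with distinct O values.
    """
    n = len(originals)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_o = sum(originals) / n
    mean_i = sum(observed) / n
    var_o = sum((o - mean_o) ** 2 for o in originals)
    if var_o == 0:
        raise ValueError("need at least two distinct O values")
    cov = sum((o - mean_o) * (i - mean_i) for o, i in zip(originals, observed))
    a = cov / var_o
    b = mean_i - a * mean_o
    r = 1.0 - a
    if r == 0:
        raise ValueError("opacity is zero; text color is undetermined")
    return r, b / r


def recover_background(observed_value, r, text_color):
    """Invert the blend to get O from I (undefined for fully opaque text)."""
    return (observed_value - r * text_color) / (1.0 - r)


if __name__ == "__main__":
    # Synthetic samples generated with r = 0.4 and T = 200.
    originals = [10, 50, 120, 240]
    observed = [0.6 * o + 80 for o in originals]
    r, t = estimate_opacity_and_color(originals, observed)
    print(r, t, recover_background(152, r, t))
```

With more than two samples the system is overdetermined, so the least-squares fit averages out noise. Once r and T are known, the blend can be inverted at every text pixel to recover the background.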
Extraction result

(a) original

(b) recovered

34
Comparison with InPainting

[Figure panels: Original, InPainting, Rakuten]

Magic
Patented!

35
Details: ACPR 2013

Thank you!

36

DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????blackmambaettijean
 

[RakutenTechConf2013] [C4-1] Text detection in product images

Editor's notes

  1. Hello, my name is Naoki Chiba. Today I am going to talk about text detection in product images.
  2. These are examples of product images, which contain sales pitches such as price, store name and shipping information. Applications of text detection include content retrieval and filtering, character recognition, and translation of text into other languages for international sales.
  3. Here is the outline of today’s talk. After an overview of text detection, I am going to review current methods. Then I will talk about Rakuten’s approach.
  4. In academia, text detection has been an active area of research for a long time. It started from traditional scanned OCR, in which documents are scanned with a flatbed scanner. As a matter of fact, current text detection is different. Because of the popularity of imaging devices such as mobile cameras, images may contain illumination variations and perspective distortion, and the text is shorter than before. Text images can be categorized into two types: digital-born text, which was inserted by an editor, and natural-scene text, which is receiving a lot of attention in academia.
  5. Product images have two different purposes. The first is to show sales pitches. The second is to show a product list that represents product variations. Depending on the purpose, the role of the text is different.
  6. This is an example of a product list. If the image contains store-specific information such as the merchant’s name, price or shipping information, that may be undesirable.
  7. Another example is what we call “Now printing” images. We use this type of image when product images are not available even though the product has been released or we are taking pre-orders. These images are supposed to be updated when the product photo becomes available, but we need to detect them first. The problem here is that they are provided by our merchants, not by Rakuten, because of the online marketplace model. We do not know beforehand what images they are going to use.
  8. In summary, product images mix natural-scene text with digital-born text, which makes the text difficult to detect.
  9. Next, I am going to show some current methods in academia to detect text in images.
  10. Current methods can be categorized into three types: texture-based, region-based and hybrids of the two.
  11. The texture-based method treats text as a special texture and finds it by scanning a window across the image. It then classifies each window with a classifier such as a Support Vector Machine, AdaBoost or a neural network. But it has two problems. First, it is scale and rotation variant. Second, the computational cost is high.
  12. The second method is called the region-based method. It examines local features, either edges or color clusters, followed by connected component analysis, text line grouping and word separation. But the problem is that it may produce a lot of false candidates.
  13. Therefore, the third type is the hybrid method, which is getting a lot of attention these days. It starts from a region-based method, using either edges or color clustering. It then confirms whether the detected regions are text or not with a classifier trained by machine learning techniques.
  14. Still, there are some problems. We would like to solve the following two. One is character/word annotation, which is a time-consuming task, especially when we have a lot of data. The other is transparent text, which is hard to detect.
  15. For example, character annotation means placing rectangles on top of text characters by hand. If an image contains a lot of characters, annotation by a human operator is very time-consuming, especially when we have a lot of images.
  16. Another problem we would like to solve is transparent text, which is difficult to detect, because the edges are weak. But once we detect them, there is a possibility to recover the background behind the text.
  17. So, to solve these problems, I would like to show what RIT, Rakuten Institute of Technology, is doing.
  18. To avoid character/word annotation, we built a text image classifier by using only image-wise annotation, which is much more efficient. We are also working on transparent text detection and background recovery. I am going to show the details of the two.
  19. Our text image detection is based on image-wise annotation, which takes much less time than character or word annotation. By clustering detected regions with a machine learning technique, we can get a measure of text likeliness.
  20. When each detected region can be represented by image features f1 and f2, we cluster the regions by those features. Based on image-wise annotation, we can estimate a probability of being text for each cluster. For example, red dots show regions that appeared in text images and blue dots show regions that appeared in non-text images. Because cluster C4 has regions that appeared only in non-text images, it is unlikely to be text.
  21. We measured the performance against a typical previous method. Ours was significantly better: accuracy increased by around 20%.
  22. Another problem we are solving is transparent text and background recovery.
  23. We propose adaptive edge detection based on analyzing image content. To recover the background, we estimate the text color and the opacity, that is, how transparent the text is.
  24. These are examples of detected edges. Compared with traditional edge detectors such as Sobel or Canny, ours gives better results.
  25. Let me introduce how we do our detection. We measure image complexity as texture strength by analyzing image content. We can measure it by eigenspace analysis. Based on the texture strength, we can set edge detection thresholds adaptively.
  26. To detect text, we use a hybrid method. It combines a region-based approach, using edges with the stroke width transform, with a machine learning technique.
  27. Here is a system flow. After adaptive edge detection, we work on component analysis and detect text. Once we detect text, we can recover background.
  28. Here are examples. Our system was able to detect transparent text.
  29. Here is a system flow. After adaptive edge detection, we work on component analysis and detect text. Once we detect text, we can recover background.
  30. Transparent text can be represented by this formula. Observed pixel values, I, are a mixture of background values, O, and text color, T. The mixing ratio is determined by the opacity, gamma. Assuming that text color and opacity are uniform within the text, we can solve for these parameters by a least squares method when we have two or more sets of data, because the number of unknown parameters is two.
  31. This is an example of recovered image.
  32. We also compared with a previous method called InPainting, which tries to fill the hole left by the text using the surrounding pixel pattern. Although InPainting cannot recover the original content, in this case a small hole, ours was able to recover it.
  33. Thank you for your attention. The details will be presented at the Asian Conference on Pattern Recognition next month.
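
Note 20 describes estimating, for each cluster of detected regions, a probability of being text from image-wise annotation alone. The Python sketch below is one simple way to read that idea, not the method used in the talk: it assumes every region inherits the text/non-text label of the image it came from, and it takes the per-cluster probability to be the fraction of regions that came from text images. The function name, the input layout and the sample data are hypothetical.

```python
from collections import defaultdict


def cluster_text_probability(regions):
    """Estimate a text probability per cluster from image-level labels.

    regions: iterable of (cluster_id, image_is_text) pairs, one per detected
    region. image_is_text is the label of the whole image (assumption: no
    character- or word-level annotation is available).
    Returns {cluster_id: fraction of its regions that came from text images}.
    """
    total = defaultdict(int)
    from_text = defaultdict(int)
    for cluster_id, image_is_text in regions:
        total[cluster_id] += 1
        if image_is_text:
            from_text[cluster_id] += 1
    return {c: from_text[c] / total[c] for c in total}


# Hypothetical data: C1 is dominated by regions from text images,
# C4 contains only regions from non-text images.
regions = [
    ("C1", True), ("C1", True), ("C1", False),
    ("C4", False), ("C4", False),
]
probs = cluster_text_probability(regions)
```

With this sample data, cluster C4 gets probability 0, which matches the observation in note 20 that a cluster whose regions appear only in non-text images is unlikely to be text.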
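
Note 30 refers to a formula on the slide that is not reproduced in the notes. The Python sketch below assumes the standard alpha-blending form, I = gamma * T + (1 - gamma) * O, which is consistent with the description but is an assumption rather than the confirmed formula. It also assumes a single color channel and that background values O are known for the sample pixels; how those samples are obtained is not stated in the notes. Under these assumptions I is linear in O, so an ordinary least squares line fit yields the two unknowns. All function names are hypothetical.

```python
def fit_text_color_and_opacity(observed, background):
    """Least squares fit of I = gamma * T + (1 - gamma) * O (assumed model).

    observed: observed pixel values I inside the text.
    background: matching known background values O (assumption).
    Returns (text_color T, opacity gamma), both assumed uniform.
    """
    n = len(observed)
    if n < 2 or n != len(background):
        raise ValueError("need two or more matching (I, O) samples")
    mean_o = sum(background) / n
    mean_i = sum(observed) / n
    var_o = sum((o - mean_o) ** 2 for o in background)
    if var_o == 0:
        raise ValueError("background samples must not all be equal")
    cov = sum((o - mean_o) * (i - mean_i)
              for o, i in zip(background, observed))
    slope = cov / var_o                    # equals 1 - gamma
    intercept = mean_i - slope * mean_o    # equals gamma * T
    gamma = 1.0 - slope
    if gamma == 0:
        raise ValueError("opacity is zero, text color is undetermined")
    return intercept / gamma, gamma


def recover_background(observed_value, text_color, gamma):
    """Invert the assumed model to get the background value O."""
    if gamma >= 1.0:
        raise ValueError("fully opaque text hides the background")
    return (observed_value - gamma * text_color) / (1.0 - gamma)


# Synthetic check: text color 200 and opacity 0.4 over four backgrounds.
background = [10.0, 50.0, 120.0, 240.0]
observed = [0.4 * 200.0 + 0.6 * o for o in background]
text_color, gamma = fit_text_color_and_opacity(observed, background)
restored = recover_background(observed[2], text_color, gamma)
```

The fit needs at least two samples with different background values, which corresponds to the remark in note 30 that two or more sets of data are required because there are two unknown parameters.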