Separating compound figures in journal articles to allow for subfigure classification


Institute of Information Systems (HES-SO)

Journal images represent an important part of the knowledge stored in the medical literature. Figure classification has received much attention because information about image types can be used in a variety of contexts to focus image search and to filter out unwanted information or "noise", for example non-clinical images. A major problem in figure classification is that many figures in the biomedical literature are compound figures that often contain more than a single figure type. Some journals separate compound figures into several parts, but many do not, so separation currently has to be done manually. In this work, a technique for compound figure separation is proposed and implemented, based on the systematic detection and analysis of uniform space gaps. The method is evaluated on a dataset of journal figures from the open access literature that was created for the ImageCLEF 2012 benchmark and contains about 3000 compound figures. Automatic tools can easily reach a relatively high accuracy in separating compound figures. To further increase accuracy, efforts are needed to improve the detection process and to avoid over-separation through more powerful analysis strategies. The tools described in this article have also been tested on a database of approximately 150,000 compound figures from the biomedical literature, making these images available as separate figures for further image analysis and allowing important information to be filtered from them.


Institute of Information Systems

Separating compound figures in journal articles to allow for subfigure classification

Ajad Chhatkuli
Antonio Foncubierta-Rodríguez
Dimitrios Markonis
Henning Müller

Motivation

• Figures in biomedical journals contain a lot of information
• Content-based image retrieval (CBIR) has been proposed for accessing medical literature
• Modality classification
• Improves accessibility
• Allows result filtering
• But 50% of figures are compound or multipanel

Aim

• Develop a system that separates compound figures in the biomedical literature
• Visual-information only
• Textual information is discarded
• Modality-independent
• One method for many image types, rather than many methods for few image types
• Tunable according to the dataset
• Large-scale tested
• Approximately 250 open access journals

Compound figure examples

Methods. Dataset

• 2982 manually classified figures from the ImageCLEF 2012 dataset
• Ground truth:
• Image subclass: 2x1, 1x2,
• Position of separators

Methods. Overview

• The problem is split into two parts
• Find subfigure separator candidates
• Preprocessing if required
• Analyze candidates
• Remove false positives
• Rule-based decisions

Methods. Separator detection

• Based on minimum pixel projection for figures separated by white space
• Horizontal, then vertical detection
• Order inverted by rotating the image, depending on aspect ratio
• Recursive
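
The projection idea on this slide can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `find_gaps`, the `white` threshold and the `min_width` parameter are assumptions chosen for illustration, and the image is a plain list of grayscale rows so that no imaging library is needed.

```python
def find_gaps(image, axis, white=250, min_width=2):
    """Return (start, end) index ranges of uniform white bands.

    image: list of rows, each a list of grayscale values in 0..255.
    axis=0 scans rows (horizontal bands), axis=1 scans columns (vertical bands).
    The thresholds are illustrative assumptions, not values from the slides.
    """
    lines = image if axis == 0 else list(zip(*image))
    # Minimum projection: a line belongs to a gap only if its darkest pixel is white.
    is_gap = [min(line) >= white for line in lines]

    gaps, start = [], None
    for i, flag in enumerate(is_gap + [False]):  # trailing sentinel closes an open run
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_width:
                gaps.append((start, i - 1))
            start = None
    return gaps


# Two dark panels separated by a two-pixel white column band.
figure = [[0, 0, 0, 255, 255, 0, 0, 0] for _ in range(4)]
print(find_gaps(figure, axis=1))  # vertical separator candidates
print(find_gaps(figure, axis=0))  # horizontal separator candidates
```

A recursive variant would cut the image at each accepted gap and call the same function on every resulting part, switching the axis between calls.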

Methods. Separator detection

• Rule-based processing
• Progressive truncation to remove labels if no separators are found
• Text removal based on connected components if no separators are found
• Complement image for black-space separations
• Standard deviation image for subtle separations
• Binarization of non-graph figures:
• Less than 40% of the image is white or almost white
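
Two of the preprocessing rules above can be sketched as follows. This is a hedged illustration only: the 40% limit comes from the slide, while the helper names and the `white` and `threshold` values are assumptions, and the binarization shown is a simple fixed threshold rather than whatever method the authors used.

```python
def white_fraction(image, white=240):
    """Share of pixels that are white or almost white (the cut-off 240 is an assumption)."""
    pixels = [p for row in image for p in row]
    return sum(p >= white for p in pixels) / len(pixels)


def complement(image):
    """Invert grayscale values so that black separators become white ones."""
    return [[255 - p for p in row] for row in image]


def binarize(image, threshold=128):
    """Fixed-threshold binarization (the threshold is an illustrative assumption)."""
    return [[255 if p >= threshold else 0 for p in row] for row in image]


def preprocess(image, max_white=0.40):
    """Binarize figures treated as non-graph, i.e. less than 40% white pixels."""
    if white_fraction(image) < max_white:
        return binarize(image)
    return image


photo_like = [[10, 200], [30, 250]]      # 25% almost-white pixels
graph_like = [[255, 255], [255, 0]]      # 75% almost-white pixels
print(preprocess(photo_like))            # binarized
print(preprocess(graph_like))            # returned unchanged
print(complement([[0, 255]]))            # black and white swapped
```

After complementing, a black band can be searched for with the same white-gap detection used for white-space separated figures.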

Methods. Separator analysis

• Classification problem
• True/false separator
• Features used:
• Closeness to border, division ratio, standard deviation, text removal analysis, histogram, gap comparison
• Classifiers:
• SVM
• Rule-based classifier
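
A weighted rule-based decision over a few of the listed features could look like the sketch below. The feature definitions, weights and threshold are all hypothetical; the slides name the features but do not define them, so this only shows the general shape of scoring a separator candidate as true or false.

```python
def separator_features(gap, length):
    """Compute illustrative features for a gap (start, end) on an axis of size `length`.

    These definitions are assumptions made for this sketch.
    """
    start, end = gap
    centre = (start + end) / 2
    return {
        # Distance of the gap from the nearest image border, relative to the axis length.
        "border_distance": min(start, length - 1 - end) / length,
        # How evenly the gap divides the image: close to 1 for a cut near the middle.
        "division_ratio": min(centre, length - centre) / max(centre, length - centre),
        # Relative width of the gap.
        "gap_width": (end - start + 1) / length,
    }


def is_true_separator(features, weights=None, threshold=0.5):
    """Weighted rule score; the weights and threshold are hypothetical and tunable per dataset."""
    weights = weights or {"border_distance": 1.0, "division_ratio": 0.5, "gap_width": 2.0}
    score = sum(weights[name] * value for name, value in features.items())
    return score >= threshold


central = separator_features((45, 54), 100)   # wide gap near the middle
margin = separator_features((0, 3), 100)      # narrow gap touching the border
print(is_true_separator(central), is_true_separator(margin))
```

With the same feature dictionary, the values could instead be passed as a vector to a trained classifier such as an SVM.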

Results

Successful examples

Unsuccessful examples

Not horizontal/vertical
No separation gap

Conclusions and future work

• Good results for a wide range of images
• Using purely visual information
• Separation problem: detection and analysis
• Rule weights can be fine-tuned according to dataset
• What would be the impact of a larger training set?
• What would be the impact on existing modality classification accuracy?


Thanks for your attention!

More information at http://medgift.hevs.ch

Ajad Chhatkuli, Dimitrios Markonis, Antonio Foncubierta-Rodríguez, Fabrice Meriaudeau
and Henning Müller, Separating compound figures in journal articles to allow for subfigure
classification, in: SPIE, Medical Imaging, Orlando, FL, USA, 2013
