PISA Production, Indexing and Search of Audio-visual Material


This document summarizes several computer vision and audio analysis techniques demonstrated by the PISA group, including shot cut detection, scene segmentation, video reuse detection, face detection, and face recognition. The face detection method uses skin color segmentation and shape analysis to find face candidates with 92% accuracy. Face recognition is performed using a 3D morphable model fitted to 2D images.


PISA
Production, Indexing and Search
of Audio-visual Material
Image Processing
Tinne Tuytelaars, IBBT – PSI – K.U.Leuven

Computer Assisted Analysis

• Intelligent analysis = reverse engineering

• Shot cut detection demo
• Scene segmentation demo
• Video reuse detection demo
• Face detection
• Face recognition demo
• Audio classification demo


Shot cut detection

= Split the video stream into atomic units, each corresponding to a
continuously moving camera
• Distinguish between abrupt and smooth shot cuts
• Experimented with different methods
  - Using color histograms
  - Using affine motion compensation
  - Using motion estimation within the compressed domain
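
The color-histogram variant listed above can be sketched roughly as follows. This is a minimal illustration, not the PISA implementation: the function names, the bin count, and the threshold of 0.5 are all assumptions, and it only targets abrupt cuts.

```python
import numpy as np

def color_histogram(frame, bins=8):
    """Normalized joint RGB histogram of an H x W x 3 uint8 frame."""
    pixels = frame.reshape(-1, 3).astype(np.float64)
    hist, _ = np.histogramdd(pixels, bins=(bins, bins, bins),
                             range=((0, 256),) * 3)
    return hist.ravel() / pixels.shape[0]

def detect_abrupt_cuts(frames, threshold=0.5):
    """Return indices i where a cut is assumed between frame i-1 and frame i.

    `threshold` is a hypothetical value that would need tuning per archive.
    """
    cuts = []
    prev = color_histogram(frames[0])
    for i in range(1, len(frames)):
        cur = color_histogram(frames[i])
        # Half the L1 distance lies in [0, 1]:
        # 0 for identical histograms, 1 for disjoint ones.
        distance = 0.5 * np.abs(cur - prev).sum()
        if distance > threshold:
            cuts.append(i)
        prev = cur
    return cuts
```

A frame-to-frame comparison like this tends to miss smooth transitions such as fades and dissolves, where consecutive histograms change only slightly; those would need a comparison over a longer window or one of the motion-based methods.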


Video reuse detection

The same video material is often reused
• Can be detected automatically
• Robust to post-processing
• Efficient
• Based on spatio-temporal local features and locality-sensitive
hashing
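
To illustrate the hashing step, here is a minimal sketch of locality-sensitive hashing with random hyperplanes. The slides do not say which LSH family was used, so this choice is an assumption, as are the class and function names. Extraction of the spatio-temporal local features is not shown; descriptors are taken to be plain NumPy vectors.

```python
import numpy as np
from collections import Counter, defaultdict

class HyperplaneLSH:
    """Hash table keyed by the sign pattern of random projections."""

    def __init__(self, dim, n_bits=16, seed=0):
        rng = np.random.default_rng(seed)
        self.planes = rng.standard_normal((n_bits, dim))
        self.table = defaultdict(list)

    def _key(self, vec):
        # Nearby descriptors fall on the same side of most hyperplanes,
        # so they are likely to share a key.
        return ((self.planes @ vec) > 0).tobytes()

    def add(self, vec, label):
        self.table[self._key(vec)].append(label)

    def query(self, vec):
        return self.table.get(self._key(vec), [])

def vote_for_source(index, query_descriptors):
    """Count, per indexed clip, how many bucket matches the query produces."""
    votes = Counter()
    for vec in query_descriptors:
        votes.update(index.query(vec))
    return votes
```

Each lookup touches a single bucket instead of scanning every stored descriptor, which is where the efficiency comes from. Robustness to post-processing depends on the descriptors changing little under it; a practical system would typically use several hash tables to reduce missed matches.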


Face Detection

• Face candidate selection:
  - Candidate regions have skin color → region-based skin
    segmentation
  - Personalized chrominance skin boundary

• Verification based on cues:
  - Shape of ellipse
  - Ellipse-filling percentage
  - Gray-tone smoothness
  - Corners of the facial features
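
Two of the steps above, chrominance-based skin segmentation and the ellipse-filling cue, can be sketched as follows. This is an illustration under stated assumptions rather than the presented method: the fixed Cb/Cr ranges are commonly quoted generic defaults standing in for the personalized boundary, the ellipse is simply inscribed in the region's bounding box, and all function names are hypothetical.

```python
import numpy as np

def rgb_to_cbcr(image):
    """Cb and Cr planes of an H x W x 3 uint8 RGB image (BT.601, full range)."""
    rgb = image.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return cb, cr

def skin_mask(image, cb_range=(77, 127), cr_range=(133, 173)):
    """Boolean mask of pixels whose chrominance falls in the assumed skin box."""
    cb, cr = rgb_to_cbcr(image)
    return ((cb >= cb_range[0]) & (cb <= cb_range[1]) &
            (cr >= cr_range[0]) & (cr <= cr_range[1]))

def ellipse_fill_ratio(mask):
    """Fraction of the bounding-box ellipse that is covered by mask pixels."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return 0.0
    cy, cx = (ys.min() + ys.max()) / 2.0, (xs.min() + xs.max()) / 2.0
    ry = (ys.max() - ys.min() + 1) / 2.0
    rx = (xs.max() - xs.min() + 1) / 2.0
    yy, xx = np.mgrid[:mask.shape[0], :mask.shape[1]]
    inside = ((yy - cy) / ry) ** 2 + ((xx - cx) / rx) ** 2 <= 1.0
    return float((mask & inside).sum() / inside.sum())
```

A candidate whose fill ratio is low is unlikely to be a face. In practice the mask would first be split into connected regions and each region tested separately, together with the other cues.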


Face detection results

• 92% good face detections
  - Comparable to the state-of-the-art learning-based face detector of Viola & Jones, which obtains 93%
  - Adapts to lighting conditions and to individual facial appearance


Face recognition

• Based on a 3D morphable model
• 3D model is fitted to 2D image
• Shape and texture parameters used as face descriptor
• Robust to
  - viewpoint changes,
  - illumination changes,
  - partial occlusions.
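
For reference, the usual linear formulation of a 3D morphable model is given below. The slides do not state which variant was used, so the symbols here are assumptions: mean shape and texture are written with bars, s_i and t_i are basis vectors learned from 3D face scans, and ρ collects the pose and illumination parameters.

```latex
% Shape and texture as linear combinations of learned basis vectors
S(\alpha) = \bar{S} + \sum_{i=1}^{m} \alpha_i \, s_i ,
\qquad
T(\beta) = \bar{T} + \sum_{i=1}^{n} \beta_i \, t_i

% Fitting to a 2D image: minimize the difference between the input
% image and the rendered model over \alpha, \beta and \rho
E(\alpha, \beta, \rho) = \sum_{x, y}
  \left\lVert I_{\text{input}}(x, y) - I_{\text{model}}(x, y;\, \alpha, \beta, \rho) \right\rVert^{2}
```

Under this formulation the fitted coefficients (α, β) serve as the face descriptor. Because pose and illumination are absorbed by ρ, the descriptor is largely independent of viewpoint and lighting.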


Future work

• Include facial expressions in face recognition
• Multi-modal scene segmentation
• Feedback loop from Trouvaille
• Object or scene recognition
• Thesaurus-based speech recognition



