Visual Attention: Detecting Saliency on Images. Vicente Ordonez, Department of Computer Science, State University of New York, Stony Brook, NY 11790
I will be working mainly on the following paper: "Learning to Detect a Salient Object," T. Liu, J. Sun, N. Zheng, X. Tang, H. Shum (Xi'an Jiaotong University and Microsoft Research Asia), CVPR 2007. http://research.microsoft.com/en-us/um/people/jiansun/papers/SalientDetection_CVPR07.pdf
What is Saliency? What is Visual Attention? “Everyone knows what attention is...” —William James, 1890
This is a problem of… Arbitrary object detection? Background / Foreground segmentation? Modeling Visual Attention?
The Method
Features:
- Multiscale contrast (Done!)
- Center-surround histogram (Done!)
- Color spatial distribution (Done!)
Supervised learning using Conditional Random Fields to determine the parameters that combine the features above. (Done!)
[I will use a labeled dataset of 5000 images provided by Microsoft Research Asia!]
Multiscale Contrast Function
- Generate the Gaussian pyramid for the input image.
- For each level in the pyramid: apply Gaussian blurring, then resample.
- I'm using a 6-level Gaussian pyramid for each RGB channel.
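The blur-then-resample loop can be sketched as follows. This is a minimal, dependency-free illustration on a single channel stored as nested lists, not the implementation behind these slides: the 5-tap binomial kernel, the border clamping, and treating the input image as level 1 are all assumptions, and the function names are hypothetical.

```python
# 5-tap binomial kernel, a common approximation of a Gaussian (an assumption;
# the slides do not say which kernel or sigma was used).
KERNEL = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]


def blur_1d(values):
    """Blur one row or column, clamping indices at the borders."""
    n = len(values)
    radius = len(KERNEL) // 2
    return [
        sum(w * values[min(max(i + k - radius, 0), n - 1)]
            for k, w in enumerate(KERNEL))
        for i in range(n)
    ]


def gaussian_blur(image):
    """Separable blur: rows first, then columns."""
    rows = [blur_1d(row) for row in image]
    cols = [blur_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def downsample(image):
    """Keep every second pixel in each direction."""
    return [row[::2] for row in image[::2]]


def gaussian_pyramid(image, levels=6):
    """Return [input, level 2, ...]; each level is half the size of the previous one."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(downsample(gaussian_blur(pyramid[-1])))
    return pyramid
```

For an RGB image, the same function would be called once per channel. A 64x64 input yields levels of side 64, 32, 16, 8, 4, and 2.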
What a Gaussian pyramid looks like (figure from David Forsyth)
Multiscale Contrast Function: Generate a contrast map for each level of the pyramid, then sum all of the results to produce the final multiscale contrast map. These two steps are described by the multiscale contrast formula.
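For reference, the multiscale contrast feature in Liu et al. has, as far as I recall it, the following form. The symbols are the paper's rather than anything on these slides, so check them against the paper before relying on them: $I^l$ is the image at pyramid level $l$, $L$ is the number of levels (6 here), and $N(x)$ is a small window around pixel $x$ (9x9 in the paper, if memory serves).

```latex
f_c(x, I) = \sum_{l=1}^{L} \sum_{x' \in N(x)} \left\| I^{l}(x) - I^{l}(x') \right\|^{2}
```

The inner sum is the contrast map at one level; the outer sum adds the levels together. The result is then normalized to the range $[0, 1]$.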
[Figure: input image, with contrast maps at pyramid levels 1, 4, and 6]
[Figure: multiscale contrast map output]
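The per-level contrast maps and their sum can be sketched like this. It is an illustration under assumptions, not the code that produced the figures: it works on one grayscale channel stored as nested lists, uses nearest-neighbour resizing to bring every level back to the size of the first one, and defaults to a 3x3 window (`radius=1`) to keep it fast. All function names are hypothetical.

```python
def contrast_map(level, radius=1):
    """Sum of squared differences between each pixel and its window neighbours."""
    h, w = len(level), len(level[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += (level[y][x] - level[ny][nx]) ** 2
            out[y][x] = total
    return out


def upsample_nearest(image, height, width):
    """Resize to (height, width) by nearest-neighbour lookup (an assumed choice)."""
    h, w = len(image), len(image[0])
    return [[image[y * h // height][x * w // width] for x in range(width)]
            for y in range(height)]


def multiscale_contrast(pyramid, radius=1):
    """Sum the per-level contrast maps at the size of the first level,
    then scale the result to [0, 1]."""
    height, width = len(pyramid[0]), len(pyramid[0][0])
    total = [[0.0] * width for _ in range(height)]
    for level in pyramid:
        resized = upsample_nearest(contrast_map(level, radius), height, width)
        for y in range(height):
            for x in range(width):
                total[y][x] += resized[y][x]
    peak = max(max(row) for row in total)
    if peak == 0:
        return total  # perfectly flat image: no contrast anywhere
    return [[v / peak for v in row] for row in total]
```

`pyramid` is a list of 2D lists ordered from the finest level to the coarsest. A flat image gives an all-zero map, and any other image gives a map whose largest value is 1.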
Center Surround Histogram Feature
- For each possible rectangle with a reasonable size and aspect ratio:
- Create a surrounding rectangle and calculate the histograms of the rectangle and of the surrounding area.
- Pick and record the rectangle that maximizes the chi-square distance between the two histograms, and record that chi-square distance as well.
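The search described above can be sketched as follows. This is a brute-force illustration under assumptions: pixels are assumed to be already quantized into histogram bin indices (`bins_image`), the surround is a ring of fixed width `margin` around the rectangle (the paper sizes the surround differently, as I recall), and the candidate rectangles are supplied by the caller. All names are hypothetical.

```python
def normalize(hist):
    """Scale a histogram so its entries sum to 1 (left unchanged if empty)."""
    total = sum(hist)
    return [v / total for v in hist] if total else list(hist)


def chi_square(hist_a, hist_b):
    """Chi-square distance between two histograms after normalization."""
    a, b = normalize(hist_a), normalize(hist_b)
    return 0.5 * sum((p - q) ** 2 / (p + q) for p, q in zip(a, b) if p + q > 0)


def region_histogram(bins_image, top, left, bottom, right, n_bins):
    """Histogram of bin indices in rows top..bottom-1 and columns left..right-1."""
    hist = [0] * n_bins
    for row in bins_image[top:bottom]:
        for b in row[left:right]:
            hist[b] += 1
    return hist


def center_surround_distance(bins_image, rect, margin, n_bins):
    """Chi-square distance between a rectangle and the ring around it."""
    top, left, bottom, right = rect
    h, w = len(bins_image), len(bins_image[0])
    outer = (max(0, top - margin), max(0, left - margin),
             min(h, bottom + margin), min(w, right + margin))
    inner_hist = region_histogram(bins_image, top, left, bottom, right, n_bins)
    outer_hist = region_histogram(bins_image, *outer, n_bins)
    # The ring is the outer rectangle minus the inner one.
    ring_hist = [o - i for o, i in zip(outer_hist, inner_hist)]
    return chi_square(inner_hist, ring_hist)


def best_rectangle(bins_image, candidates, margin, n_bins):
    """Return (distance, rect) for the candidate with the largest distance."""
    return max((center_surround_distance(bins_image, r, margin, n_bins), r)
               for r in candidates)
```

Rectangles are `(top, left, bottom, right)` tuples with exclusive bottom and right edges. Counting every histogram from scratch like this is the expensive part of the method.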
Center Surround Histogram Feature: The algorithm as described above is computationally expensive, so it needs a technique called the integral histogram, which allows fast calculation of the histogram of any rectangular region of an image. The technique was introduced in "Integral Histogram: A Fast Way to Extract Histograms in Cartesian Spaces" by Fatih Porikli (Mitsubishi Electric Research Lab), CVPR 2005.
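The idea behind the integral histogram can be sketched in a few lines. This is a simplified illustration of the general technique, not Porikli's code or the implementation used for these slides; it assumes the pixels are already quantized to bin indices, and the function names are hypothetical.

```python
def integral_histogram(bins_image, n_bins):
    """ih[y][x][b] = number of pixels with bin b in rows < y and columns < x."""
    h, w = len(bins_image), len(bins_image[0])
    ih = [[[0] * n_bins for _ in range(w + 1)] for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            cell = ih[y + 1][x + 1]
            up, left, diag = ih[y][x + 1], ih[y + 1][x], ih[y][x]
            for b in range(n_bins):
                # Inclusion-exclusion over the three already-filled neighbours.
                cell[b] = up[b] + left[b] - diag[b]
            cell[bins_image[y][x]] += 1
    return ih


def rect_histogram(ih, top, left, bottom, right):
    """Histogram of rows top..bottom-1 and columns left..right-1 from four lookups."""
    return [
        d - b - c + a
        for a, b, c, d in zip(ih[top][left], ih[top][right],
                              ih[bottom][left], ih[bottom][right])
    ]
```

After a single pass over the image, the histogram of any rectangle costs four lookups per bin regardless of the rectangle's size, which is what makes the exhaustive rectangle search affordable.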
Center Surround Histogram Feature: Use the map of chi-square distances and the map of the most salient rectangle per pixel to generate the center-surround histogram feature using the following formula:
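For reference, the center-surround histogram feature in Liu et al. has, as far as I recall it, the form below; the notation is the paper's and should be checked against it. $R^{*}(x')$ is the most distinct rectangle centered at pixel $x'$, $R_S^{*}(x')$ is its surround, $R^{i}$ denotes the $i$-th histogram bin, and $w_{xx'}$ is a Gaussian falloff whose variance $\sigma_{x'}^{2}$ is tied to the size of the rectangle.

```latex
\chi^{2}(R, R_S) = \frac{1}{2} \sum_{i} \frac{\left( R^{i} - R_S^{i} \right)^{2}}{R^{i} + R_S^{i}}

f_h(x, I) \propto \sum_{\{ x' \,\mid\, x \in R^{*}(x') \}} w_{xx'} \, \chi^{2}\!\left( R^{*}(x'), R_S^{*}(x') \right),
\qquad
w_{xx'} = \exp\!\left( -0.5 \, \sigma_{x'}^{-2} \, \| x - x' \|^{2} \right)
```

Each pixel therefore accumulates the chi-square distances of every recorded rectangle that covers it, weighted by how close the pixel is to that rectangle's center.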
Center Surround Histogram Results: my implementation (15.2 sec, size = 245x384) compared with the results reported in the paper
Center Surround Histogram Results: my implementation (13.6 sec, size = 247x346) compared with the results reported in the paper
Center Surround Histogram Results: my implementation (10.2 sec, size = 248x277)
More Results
Color Spatial Distribution
Color Spatial Distribution: Make an initial clustering of the colors in the image using k-means, then refine the clusters with a Gaussian mixture model whose parameters are estimated with the EM algorithm. I am using 5 clusters (5 colors) per image. The results look similar to those presented in the paper, with an execution time of around 17 seconds per image.
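The initial k-means step can be sketched as below. It covers only the first stage: the Gaussian mixture refinement with EM is not shown. The deterministic initialization (evenly spaced pixels) and the fixed iteration count are assumptions made to keep the example reproducible, and the function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(pixels, centers):
    """Index of the nearest center for every pixel."""
    return [min(range(len(centers)), key=lambda c: sq_dist(p, centers[c]))
            for p in pixels]


def kmeans(pixels, k=5, iterations=20):
    """Cluster RGB tuples; returns (centers, labels)."""
    # Assumed initialization: k pixels evenly spaced through the list.
    step = max(1, len(pixels) // k)
    centers = [pixels[min(i * step, len(pixels) - 1)] for i in range(k)]
    for _ in range(iterations):
        labels = assign(pixels, centers)
        for c in range(k):
            members = [p for p, label in zip(pixels, labels) if label == c]
            if members:  # leave an empty cluster's center where it is
                centers[c] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return centers, assign(pixels, centers)
```

`pixels` is a flat list of `(r, g, b)` tuples. The resulting centers and labels would then seed the mixture model, one component per cluster.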
Color Spatial Distribution: Calculate the variance of the horizontal positions of the pixels in each cluster, and then do the same for the vertical positions. Sum the two variances and use this value to give more weight to clusters with less spatial variance. Penalize clusters whose pixels lie mostly away from the center of the image.
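The variance weighting can be sketched as follows. This is an illustration under assumptions: `resp[y][x][c]` is taken to be the probability that pixel `(x, y)` belongs to cluster `c` (for example, from the mixture model), the summed variances are rescaled to `[0, 1]` across clusters, and the center-distance penalty from the last step is left out for brevity. The function name is hypothetical.

```python
def color_spatial_distribution(resp):
    """Weight each pixel by how spatially compact its color clusters are."""
    h, w, k = len(resp), len(resp[0]), len(resp[0][0])
    coords = [(x, y) for y in range(h) for x in range(w)]
    variance = {}  # cluster index -> horizontal + vertical variance
    for c in range(k):
        mass = sum(resp[y][x][c] for x, y in coords)
        if mass == 0:
            continue  # empty cluster: contributes nothing to the map
        mean_x = sum(resp[y][x][c] * x for x, y in coords) / mass
        mean_y = sum(resp[y][x][c] * y for x, y in coords) / mass
        var_x = sum(resp[y][x][c] * (x - mean_x) ** 2 for x, y in coords) / mass
        var_y = sum(resp[y][x][c] * (y - mean_y) ** 2 for x, y in coords) / mass
        variance[c] = var_x + var_y
    lo, hi = min(variance.values()), max(variance.values())
    span = hi - lo
    # Compact clusters (low variance) get weights near 1, spread-out ones near 0.
    weight = {c: 1.0 - ((v - lo) / span if span else 0.0)
              for c, v in variance.items()}
    return [[sum(resp[y][x][c] * weight.get(c, 0.0) for c in range(k))
             for x in range(w)]
            for y in range(h)]
```

A color that appears in one tight blob scores high, while a color scattered across the whole image, such as a background, scores low.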
[Figures: color spatial distribution results]
Combine Features Together
Conditional Random Field Training and Inference: "Accelerated Training of Conditional Random Fields with Stochastic Meta-Descent," S. Vishwanathan, N. Schraudolph, M. Schmidt, K. Murphy, ICML '06 (International Conference on Machine Learning). I did the training using the toolbox that accompanies the above paper: http://people.cs.ubc.ca/~murphyk/Software/CRF/crf.html
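For context, the quantity the CRF learns is the set of weights $\lambda_k$ that combine the feature maps. As far as I recall the formulation in Liu et al. (the notation is the paper's and should be checked against it), each pixel $x$ gets a binary label $a_x$ (1 for salient, 0 for background), and the energy of a labeling $A$ is:

```latex
E(A \mid I) = \sum_{x} \sum_{k=1}^{K} \lambda_k \, F_k(a_x, I) + \sum_{x, x'} S(a_x, a_{x'}, I),
\qquad
F_k(a_x, I) =
\begin{cases}
f_k(x, I) & a_x = 0 \\
1 - f_k(x, I) & a_x = 1
\end{cases}
```

Here $f_k$ are the $K = 3$ normalized feature maps (multiscale contrast, center-surround histogram, color spatial distribution), and $S$ is a pairwise term that penalizes neighbouring pixels of similar color for taking different labels. Training estimates the $\lambda_k$ from the labeled images; inference finds the labeling with low energy.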
Mask outputs using CRF inference [Figures: for each example, panels show the input, multiscale contrast map, center-surround histogram, and color spatial variance, followed by the input, combined features, and ground truth]

Visual Saliency: Learning to Detect Salient Objects

  • 1. Visual Attention: Detecting Saliency on Images Vicente Ordonez Department of Computer Science State University of New York Stony Brook, NY 11790
  • 2. I will be working mainly on the following paper: "Learning to Detect a Salient Object", T. Liu, J. Sun, N. Zheng, X. Tang, H. Shum (Xi'an Jiaotong University and Microsoft Research Asia), CVPR 2007. http://research.microsoft.com/en-us/um/people/jiansun/papers/SalientDetection_CVPR07.pdf
  • 3. What is Saliency? What is Visual Attention? “Everyone knows what attention is...” —William James, 1890
  • 4. This is a problem of… Arbitrary object detection? Background / Foreground segmentation? Modeling Visual Attention?
  • 5. The Method. Features: multiscale contrast (Done!); center-surround histogram (Done!); color spatial distribution (Done!). Supervised learning using Conditional Random Fields to determine the parameters that combine the features above (Done!). [I will use a labeled dataset of 5000 images provided by Microsoft Research Asia.]
  • 6. Multiscale Contrast Function: Generate the Gaussian pyramid for the input image. For each level in the pyramid, apply Gaussian blurring and then resample. I'm using a 6-level Gaussian pyramid for each RGB channel.
  • 7. What a Gaussian pyramid looks like (figure from David Forsyth).
  • 8. Multiscale Contrast Function: Generate a contrast map for each level of the pyramid. Sum all of the results to produce the final multiscale contrast map. These two steps are described by the formula shown on the slide.
  • 11. Contrast maps (figure panels: original image; contrast map at level 1; contrast map at level 4; contrast map at level 6).
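The multiscale contrast steps from the slides above can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the author's implementation: it takes a single-channel image as a list of rows (the slides build the pyramid per RGB channel), uses a 5-tap binomial kernel as a stand-in for the Gaussian, a 3x3 neighbourhood by default, and nearest-neighbour upsampling before summing. All function names are hypothetical.

```python
def blur_and_downsample(img):
    """One pyramid step: separable 5-tap binomial blur, then keep every 2nd pixel."""
    kernel = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]
    h, w = len(img), len(img[0])

    def clamp(i, n):
        # Replicate border pixels when the kernel hangs over the edge.
        return min(max(i, 0), n - 1)

    horiz = [[sum(k * img[y][clamp(x + d - 2, w)] for d, k in enumerate(kernel))
              for x in range(w)] for y in range(h)]
    blurred = [[sum(k * horiz[clamp(y + d - 2, h)][x] for d, k in enumerate(kernel))
                for x in range(w)] for y in range(h)]
    return [row[::2] for row in blurred[::2]]


def local_contrast(img, radius=1):
    """Sum of squared differences between each pixel and its window neighbours."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                    out[y][x] += (img[y][x] - img[ny][nx]) ** 2
    return out


def multiscale_contrast(img, levels=6, radius=1):
    """Sum the per-level contrast maps at full resolution and scale to [0, 1]."""
    h, w = len(img), len(img[0])
    total = [[0.0] * w for _ in range(h)]
    level = img
    for _ in range(levels):
        contrast = local_contrast(level, radius)
        lh, lw = len(level), len(level[0])
        # Nearest-neighbour upsampling of this level back to the input size.
        for y in range(h):
            for x in range(w):
                total[y][x] += contrast[y * lh // h][x * lw // w]
        level = blur_and_downsample(level)
    peak = max(max(row) for row in total)
    return [[v / peak if peak > 0 else 0.0 for v in row] for row in total]
```

On a dark image with a small bright square, the map should respond most strongly around the square's boundary and stay near zero in flat regions.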
  • 13. Center Surround Histogram Feature
  • 14. For each possible rectangle with a reasonable size and aspect ratio
  • 15. Create a surrounding rectangle and calculate the histogram of the rectangle and the surrounding area.
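  • 16. Pick and record the rectangle that maximizes the Chi-Square distance between the two histograms calculated above, and also record the Chi-Square distance.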
  • 17. Center Surround Histogram Feature: The algorithm as described so far is computationally expensive, so it requires a technique called the integral histogram, which allows fast calculation of the histogram of any rectangular region of an image. The technique was introduced in "Integral Histogram: A Fast Way to Extract Histograms in Cartesian Spaces" by Fatih Porikli (Mitsubishi Electric Research Lab), CVPR 2005.
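A minimal sketch of the integral histogram idea, assuming each pixel has already been quantised to a histogram bin index. It builds a cumulative per-bin table once, after which the histogram of any rectangle takes four lookups per bin, and then compares a center rectangle with its surrounding ring using a chi-square distance. The function names and the `margin` parameter are hypothetical, and the surround here is simply a fixed-width border rather than whatever sizing the paper or the author's code uses.

```python
def integral_histogram(bins, n_bins):
    """bins[y][x] is the bin index of pixel (x, y).
    ih[y][x][b] counts pixels of bin b in rows < y and columns < x."""
    h, w = len(bins), len(bins[0])
    ih = [[[0] * n_bins for _ in range(w + 1)] for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            cell = ih[y + 1][x + 1]
            for b in range(n_bins):
                cell[b] = ih[y][x + 1][b] + ih[y + 1][x][b] - ih[y][x][b]
            cell[bins[y][x]] += 1
    return ih


def rect_histogram(ih, top, left, bottom, right):
    """Histogram of rows top..bottom-1 and columns left..right-1."""
    return [ih[bottom][right][b] - ih[top][right][b]
            - ih[bottom][left][b] + ih[top][left][b]
            for b in range(len(ih[0][0]))]


def chi_square(h1, h2):
    """Chi-square distance between two histograms, each normalised to sum to 1."""
    s1, s2 = sum(h1) or 1, sum(h2) or 1
    dist = 0.0
    for a, b in zip(h1, h2):
        a, b = a / s1, b / s2
        if a + b > 0:
            dist += (a - b) ** 2 / (a + b)
    return 0.5 * dist


def center_surround_distance(ih, top, left, bottom, right, margin):
    """Distance between a rectangle and the ring of width `margin` around it."""
    h, w = len(ih) - 1, len(ih[0]) - 1
    center = rect_histogram(ih, top, left, bottom, right)
    outer = rect_histogram(ih, max(0, top - margin), max(0, left - margin),
                           min(h, bottom + margin), min(w, right + margin))
    surround = [o - c for o, c in zip(outer, center)]
    return chi_square(center, surround)
```

Searching over rectangles then amounts to calling `center_surround_distance` for each candidate and keeping the one with the largest distance; the table is built only once per image.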
  • 18. Center Surround Histogram Feature: Use the map of Chi-Square distances and the map of the most salient rectangle per pixel to generate the center-surround histogram feature, using the formula shown on the slide.
  • 19. Center Surround Histogram Results (figure): my implementation (15.2 sec, image size 245x384) next to the results reported in the paper.
  • 20. Center Surround Histogram Results (figure): my implementation (13.6 sec, image size 247x346) next to the results reported in the paper.
  • 21. Center Surround Histogram Results (figure): my implementation (10.2 sec, image size 248x277).
  • 34. Color Spatial Distribution: Make an initial clustering of the colors in the image using k-means. Refine the clusters further with a Gaussian Mixture Model, whose parameters are estimated with the EM algorithm. I am using 5 clusters (5 colors) per image. The results look similar to those presented in the paper, with an execution time of around 17 seconds per image.
  • 35. Color Spatial Distribution: For each cluster, calculate the variance of the horizontal positions of its pixels, and then the same for the vertical positions. Sum the two variances and use this value to give more weight to clusters with less spatial variance. Penalize clusters that have most of their pixels far from the center of the image.
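The weighting described in the two slides above can be sketched roughly as follows. This is an illustration under assumptions, not the author's code: the k-means and GMM steps are taken as already done, so the input is a per-pixel list of cluster membership probabilities; variances and center distances are rescaled to [0, 1] across clusters; and the two penalties are combined by multiplication. The function name and the input layout are hypothetical.

```python
def color_spatial_distribution(resp):
    """resp[y][x][c] = probability that pixel (x, y) belongs to colour cluster c.
    Returns a map that favours compact clusters located near the image center."""
    h, w, k = len(resp), len(resp[0]), len(resp[0][0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    pixels = [(y, x) for y in range(h) for x in range(w)]
    variance, center_dist = [], []
    for c in range(k):
        mass = sum(resp[y][x][c] for y, x in pixels) or 1.0
        mean_x = sum(resp[y][x][c] * x for y, x in pixels) / mass
        mean_y = sum(resp[y][x][c] * y for y, x in pixels) / mass
        # Horizontal plus vertical variance of the cluster's pixel positions.
        variance.append(sum(resp[y][x][c] * ((x - mean_x) ** 2 + (y - mean_y) ** 2)
                            for y, x in pixels) / mass)
        # Average distance of the cluster's pixels from the image center.
        center_dist.append(sum(resp[y][x][c] * ((x - cx) ** 2 + (y - cy) ** 2) ** 0.5
                               for y, x in pixels) / mass)

    def normalise(values):
        lo, hi = min(values), max(values)
        return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]

    v, d = normalise(variance), normalise(center_dist)
    return [[sum(resp[y][x][c] * (1 - v[c]) * (1 - d[c]) for c in range(k))
             for x in range(w)] for y in range(h)]
```

With two clusters, one forming a compact central blob and one forming the surrounding border, the blob's pixels receive the highest weight and the border's pixels the lowest.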
  • 45. Conditional Random Field Training and Inference: "Accelerated Training of Conditional Random Fields with Stochastic Meta-Descent", S. Vishwanathan, N. Schraudolph, M. Schmidt, K. Murphy, ICML'06 (Intl Conf on Machine Learning). I did the training using the toolbox that accompanies this paper: http://people.cs.ubc.ca/~murphyk/Software/CRF/crf.html
  • 46. Mask outputs using CRF inference (figure panels: input, multiscale contrast map, center-surround histogram, color spatial variance, combined features, ground truth).
  • 47. Mask outputs using CRF inference (same figure panels as slide 46).
  • 48. Mask outputs using CRF inference (same figure panels as slide 46).
  • 49. Mask outputs using CRF inference (same figure panels as slide 46).
  • 50. Precision / Recall obtained
  • 51. Some Conclusions: The visual feature results of the original research paper have been replicated to a considerable extent. The Conditional Random Field framework used in this project performed well for this task. The center-surround histogram map was the feature that gave the highest precision. Computing each individual feature takes on the order of several seconds.

Editor's notes

  1. Not so good result
  2. Good result
  3. Not so good result