Unsupervised Learning of Object Landmarks through Conditional Image Generation

•Télécharger en tant que PPTX, PDF•

0 j'aime•99 vues

哲

哲东郑

Bingwen Hu

Technologie

Unsupervised Learning of Object Landmarks
through Conditional Image Generation
Tomas Jakab1∗ Ankush Gupta1∗ Hakan Bilen2 Andrea Vedaldi1
1 Visual Geometry Group, University of Oxford
2 School of Informatics, University of Edinburgh
Advances in Neural Information Processing Systems (NeurIPS) 2018
Bingwen hu
2019-01-20

Goal
Learn semantically meaningful landmarks without any manual annotations.
It automatically learns from images or videos and works across different datasets of faces, humans,
and 3D objects.
Why to learn landmarks?
Low dimensional object representation
Interpretable
Why unsupervised?
Reduce dependency on expensive manual annotations
Leverage vast amount of videos available online

Architecture
Source image
Target image
appearance
encoding
unsupervised keypoint extraction
image
reconstruction
heatmap for each keypoint

(1) Heatmaps bottleneck
Then, each heatmap is replaced with Gaussian-like function centred at u*k with
a small fixed standard deviation

it provides a differentiable and distributed representation of the location of
landmarks.
 it restricts the information from the target image to spatial locations only

(2) Generator network using a perceptual loss
Where Γ(x) is an off-the-shelf pre-trained neural network, for
example VGG-19. Γl denotes the output of the l-th sub-network
 The perceptual loss compares a set of the activations extracted from multiple
layers of a deep network for both the reference and the generated images,
instead of the only raw pixel values.

Model details
• Landmark detection network: ingests the image x' to produce K
landmark heatmaps y'
It is composed of sequential blocks consisting of two convolutional.
The spatial size of the final output, outputting the heatmaps, is set to 16×16.
These K feature channels are then used to render 16×16×K 2D-Gaussian
maps y' (with σ = 0:1)
• Image generation network: input the image x and the landmarks
y' = Φ(x'), reconstructe x'
First, the image x is encoded as a feature tensor Z
Next, the features z and the landmarks y' are stacked to gether and fed to a
regressor that reconstructs the target frame x'.

Experiments——Learning human body landmarks

Experiments——Learning 3D object landmarks

Experiments——Disentangling appearance and geometry

Unsupervised Learning of Object Landmarks through Conditional Image Generation

Contenu connexe

Tendances

Deep Learning for Computer Vision: Saliency Prediction (UPC 2016)Universitat Politècnica de Catalunya

Deep Generative Models - Kevin McGuinness - UPC Barcelona 2018Universitat Politècnica de Catalunya

ICRA 2015 interactive presentationSunando Sengupta

PCL (Point Cloud Library)University of Oklahoma

Convolutional Neural Networks on Graphs with Fast Localized Spectral FilteringSOYEON KIM

Data Challenges with 3D Computer VisionMartin Scholl

Understanding neural radiance fieldsVarun Bhaseen

Visual cryptographykiranlohakare2

Beginning direct3d gameprogramming01_thehistoryofdirect3dgraphics_20160407_ji...JinTaek Seo

Visual Hull Construction from Semitransparent Coloured Silhouettes ijcga

Introduction to 3D Computer Vision and Differentiable RenderingPreferred Networks

6 texture mapping computer graphicscairo university

Find nuclei in images with U-netDing Li

Objects as points (CenterNet) review [CDM]Dongmin Choi

Introduction to object detectionAmar Jindal

T01022103108IOSR Journals

Visual hull construction from semitransparent coloured silhouettesijcga

Fast Object Recognition from 3D Depth Data with Extreme Learning MachineSoma Boubou

3D Generalization Lenses (IV 2008)Matthias Trapp

Tendances (20)

Deep Learning for Computer Vision: Saliency Prediction (UPC 2016)

Deep Generative Models - Kevin McGuinness - UPC Barcelona 2018

ICRA 2015 interactive presentation

PCL (Point Cloud Library)

Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering

Data Challenges with 3D Computer Vision

Understanding neural radiance fields

Visual cryptography

Beginning direct3d gameprogramming01_thehistoryofdirect3dgraphics_20160407_ji...

Visual Hull Construction from Semitransparent Coloured Silhouettes

Introduction to 3D Computer Vision and Differentiable Rendering

6 texture mapping computer graphics

Find nuclei in images with U-net

Objects as points (CenterNet) review [CDM]

Introduction to object detection

T01022103108

Visual hull construction from semitransparent coloured silhouettes

Fast Object Recognition from 3D Depth Data with Extreme Learning Machine

3D Generalization Lenses (IV 2008)

Similaire à Unsupervised Learning of Object Landmarks through Conditional Image Generation

Lecture1Mobeen Mustafa

[論文紹介] BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Ne...Seiya Ito

Scratch to Supercomputers: Bottoms-up Build of Large-scale Computational Lens...inside-BigData.com

MLIP - Chapter 6 - Generation, Super-Resolution, Style transferCharles Deledalle

Paper id 252014130IJRAT

Final PosterElizabeth Koshelev

Cj36511514IJERA Editor

Deferred Pixel Shading on the PLAYSTATION®3Slide_N

Attention Models (D3L6 2017 UPC Deep Learning for Computer Vision)Universitat Politècnica de Catalunya

Mnist reportRaghunandanJairam

Optimal nonlocal means algorithm for denoising ultrasound imageAlexander Decker

11.optimal nonlocal means algorithm for denoising ultrasound imageAlexander Decker

mvitelli_ee367_final_reportMatt Vitelli

Mnist report pptRaghunandanJairam

Random Valued Impulse Noise Removal in Colour Images using Adaptive Threshold...IDES Editor

Deep 3D Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2018Universitat Politècnica de Catalunya

Convolutional Neural Network (CNN)of Deep Learningalihassaah1994

Module 1.pptxMattupallipardhu

Biometric simulator for visually impaired (1)Rahul Bhagat

UNetEliyaLaialy (2).pptxNoorUlHaq47

Similaire à Unsupervised Learning of Object Landmarks through Conditional Image Generation (20)

Lecture1

[論文紹介] BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Ne...

Scratch to Supercomputers: Bottoms-up Build of Large-scale Computational Lens...

MLIP - Chapter 6 - Generation, Super-Resolution, Style transfer

Paper id 252014130

Final Poster

Cj36511514

Deferred Pixel Shading on the PLAYSTATION®3

Attention Models (D3L6 2017 UPC Deep Learning for Computer Vision)

Mnist report

Optimal nonlocal means algorithm for denoising ultrasound image

11.optimal nonlocal means algorithm for denoising ultrasound image

mvitelli_ee367_final_report

Mnist report ppt

Random Valued Impulse Noise Removal in Colour Images using Adaptive Threshold...

Deep 3D Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2018

Convolutional Neural Network (CNN)of Deep Learning

Module 1.pptx

Biometric simulator for visually impaired (1)

UNetEliyaLaialy (2).pptx

Plus de 哲东郑

Deep learning for person re-identification哲东郑

Cross-domain complementary learning with synthetic data for multi-person part...哲东郑

Step zhedong哲东郑

Visual saliency哲东郑

Image Synthesis From Reconfigurable Layout and Style哲东郑

Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval哲东郑

Weijian image retrieval哲东郑

Scops self supervised co-part segmentation哲东郑

Video object detection哲东郑

Center nets哲东郑

C2 ae open set recognition哲东郑

Sota semantic segmentation哲东郑

Deep randomized embedding哲东郑

Semantic Image Synthesis with Spatially-Adaptive Normalization哲东郑

Instance level facial attributes transfer with geometry-aware flow哲东郑

Learning to adapt structured output space for semantic哲东郑

Graph based global reasoning networks 哲东郑

Style gan哲东郑

Vi2vi哲东郑

Variational Discriminator Bottleneck哲东郑

Plus de 哲东郑 (20)

Deep learning for person re-identification

Cross-domain complementary learning with synthetic data for multi-person part...

Step zhedong

Visual saliency

Image Synthesis From Reconfigurable Layout and Style

Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval

Weijian image retrieval

Scops self supervised co-part segmentation

Video object detection

Center nets

C2 ae open set recognition

Sota semantic segmentation

Deep randomized embedding

Semantic Image Synthesis with Spatially-Adaptive Normalization

Instance level facial attributes transfer with geometry-aware flow

Learning to adapt structured output space for semantic

Graph based global reasoning networks

Style gan

Vi2vi

Variational Discriminator Bottleneck

Dernier

Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub

Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10

TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc

Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services

Exploring Multimodal Embeddings with MilvusZilliz

Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra

Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub

Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez

Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea

presentation ICT roal in 21st century educationjfdjdjcjdnsjd

Why Teams call analytics are critical to your entire businesspanagenda

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@

Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021

"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz

Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood

DBX First Quarter 2024 Investor PresentationDropbox

EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot

Platformless Horizons for Digital AdaptabilityWSO2

Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays

Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays

Dernier (20)

Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...

Connector Corner: Accelerate revenue generation using UiPath API-centric busi...

TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery

Vector Search -An Introduction in Oracle Database 23ai.pptx

Exploring Multimodal Embeddings with Milvus

Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving

Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf

Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood

Finding Java's Hidden Performance Traps @ DevoxxUK 2024

presentation ICT roal in 21st century education

Why Teams call analytics are critical to your entire business

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...

Six Myths about Ontologies: The Basics of Formal Ontology

"I see eyes in my soup": How Delivery Hero implemented the safety system for ...

Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...

DBX First Quarter 2024 Investor Presentation

EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER

Platformless Horizons for Digital Adaptability

Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe

Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...

Unsupervised Learning of Object Landmarks through Conditional Image Generation

1. Unsupervised Learning of Object Landmarks through Conditional Image Generation Tomas Jakab1∗ Ankush Gupta1∗ Hakan Bilen2 Andrea Vedaldi1 1 Visual Geometry Group, University of Oxford 2 School of Informatics, University of Edinburgh Advances in Neural Information Processing Systems (NeurIPS) 2018 Bingwen hu 2019-01-20

2. Goal Learn semantically meaningful landmarks without any manual annotations. It automatically learns from images or videos and works across different datasets of faces, humans, and 3D objects. Why to learn landmarks? Low dimensional object representation Interpretable Why unsupervised? Reduce dependency on expensive manual annotations Leverage vast amount of videos available online

3. Architecture Source image Target image appearance encoding unsupervised keypoint extraction image reconstruction heatmap for each keypoint

4. Method

5. (1) Heatmaps bottleneck Then, each heatmap is replaced with Gaussian-like function centred at u*k with a small fixed standard deviation

6. it provides a differentiable and distributed representation of the location of landmarks.  it restricts the information from the target image to spatial locations only

7. (2) Generator network using a perceptual loss Where Γ(x) is an off-the-shelf pre-trained neural network, for example VGG-19. Γl denotes the output of the l-th sub-network  The perceptual loss compares a set of the activations extracted from multiple layers of a deep network for both the reference and the generated images, instead of the only raw pixel values.

8. Model details • Landmark detection network: ingests the image x' to produce K landmark heatmaps y' It is composed of sequential blocks consisting of two convolutional. The spatial size of the final output, outputting the heatmaps, is set to 16×16. These K feature channels are then used to render 16×16×K 2D-Gaussian maps y' (with σ = 0:1) • Image generation network: input the image x and the landmarks y' = Φ(x'), reconstructe x' First, the image x is encoded as a feature tensor Z Next, the features z and the landmarks y' are stacked to gether and fed to a regressor that reconstructs the target frame x'.

10.  Experiments

11. Experiments——Learning facial landmarks

12. Experiments——Learning human body landmarks

13. Experiments——Learning 3D object landmarks

14. Experiments——Disentangling appearance and geometry

Unsupervised Learning of Object Landmarks through Conditional Image Generation

Recommandé

Recommandé

Contenu connexe

Tendances

Tendances (20)

Similaire à Unsupervised Learning of Object Landmarks through Conditional Image Generation

Similaire à Unsupervised Learning of Object Landmarks through Conditional Image Generation (20)

Plus de 哲东郑

Plus de 哲东郑 (20)

Dernier

Dernier (20)

Unsupervised Learning of Object Landmarks through Conditional Image Generation

Unsupervised Learning of Object Landmarks through Conditional Image Generation

Recommandé

Recommandé

Contenu connexe

Tendances

Tendances (20)

Similaire à Unsupervised Learning of Object Landmarks through Conditional Image Generation

Similaire à Unsupervised Learning of Object Landmarks through Conditional Image Generation (20)

Plus de 哲东 郑

Plus de 哲东 郑 (20)

Dernier

Dernier (20)

Unsupervised Learning of Object Landmarks through Conditional Image Generation

Plus de 哲东郑

Plus de 哲东郑 (20)