SlideShare une entreprise Scribd logo
1  sur  32
Télécharger pour lire hors ligne
Image Synthesis From
Reconfigurable Layout and Style
Weijian Deng
Outline
1. Introduction to Image Synthesis from Layout
a. Task
b. Background knowledge
2. Solution: A layout- and style-based architecture (LostGAN)
3. Experiment
1. Introduction to Image Synthesis from Layout
Image Generation from Layout
Layout
Image
Image to Image Translation
Image to Image Translation
Image to Image Translation
Pixel-pixel GAN
CVPR 2017
Image Generation from Reconfigurable Layout and Style
Reconfigurablity
1. Image style reconfiguration
2. Object style reconfiguration
3. Layout reconfiguration
Image Generation from Reconfigurable Layout and Style
Reconfigurablity
1. Image style reconfiguration
Image Generation from Reconfigurable Layout and Style
Reconfigurablity
2. Object style reconfiguration
Change the style at instance-level.
Image Generation from Reconfigurable Layout and Style
Reconfigurablity
3. Layout reconfiguration
 Change the location of bbox
 Add/ delete bbox
2. Solution: A layout- and style-based
architecture (LostGAN)
Image Generation from Reconfigurable Layout and Style
Reconfigurablity
1. Image style reconfiguration
2. Object style reconfiguration
3. Layout reconfiguration
StyleGAN
A layout- and style-based architecture (LostGAN)
A layout- and style-based architecture (LostGAN)
A layout- and style-based architecture (LostGAN)
1. Label Embedding
2. Object instance-specific projection
3. Mask prediction
4. ISLA γ and β computation
Embedding matrix
× +
γ
β
A layout- and style-based architecture (LostGAN)
1. Label Embedding
Embedding matrix
A layout- and style-based architecture (LostGAN)
1. Label Embedding
2. Object instance-specific projection
Embedding matrix
A layout- and style-based architecture (LostGAN)
1. Label Embedding
2. Object instance-specific projection
3. Mask prediction
Embedding matrix
A layout- and style-based architecture (LostGAN)
1. Label Embedding
2. Object instance-specific projection
3. Mask prediction
4. ISLA γ and β computation
Embedding matrix
A layout- and style-based architecture (LostGAN)
A layout- and style-based architecture (LostGAN)
Experiment
Experiment
Experiment
Train resnet-101 on cropped objects from generated images (generate
five samples for each layout) and evaluate on objects from real images
Experiment
Image Generation from Reconfigurable Layout and Style
Reconfigurablity
1. Image style reconfiguration
2. Object style reconfiguration
3. Layout reconfiguration
Thanks!
Semantic Image Synthesis with Spatially-Adaptive Normalization
Zhujunyan CVPR 2019
Semantic Image Synthesis with Spatially-Adaptive Normalization
Zhujunyan CVPR 2019

Contenu connexe

Plus de 哲东 郑

Deep learning for person re-identification
Deep learning for person re-identificationDeep learning for person re-identification
Deep learning for person re-identification哲东 郑
 
Cross-domain complementary learning with synthetic data for multi-person part...
Cross-domain complementary learning with synthetic data for multi-person part...Cross-domain complementary learning with synthetic data for multi-person part...
Cross-domain complementary learning with synthetic data for multi-person part...哲东 郑
 
Visual saliency
Visual saliencyVisual saliency
Visual saliency哲东 郑
 
Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval
Polysemous Visual-Semantic Embedding for Cross-Modal RetrievalPolysemous Visual-Semantic Embedding for Cross-Modal Retrieval
Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval哲东 郑
 
Weijian image retrieval
Weijian image retrievalWeijian image retrieval
Weijian image retrieval哲东 郑
 
Scops self supervised co-part segmentation
Scops self supervised co-part segmentationScops self supervised co-part segmentation
Scops self supervised co-part segmentation哲东 郑
 
Video object detection
Video object detectionVideo object detection
Video object detection哲东 郑
 
C2 ae open set recognition
C2 ae open set recognitionC2 ae open set recognition
C2 ae open set recognition哲东 郑
 
Sota semantic segmentation
Sota semantic segmentationSota semantic segmentation
Sota semantic segmentation哲东 郑
 
Deep randomized embedding
Deep randomized embeddingDeep randomized embedding
Deep randomized embedding哲东 郑
 
Semantic Image Synthesis with Spatially-Adaptive Normalization
Semantic Image Synthesis with Spatially-Adaptive NormalizationSemantic Image Synthesis with Spatially-Adaptive Normalization
Semantic Image Synthesis with Spatially-Adaptive Normalization哲东 郑
 
Instance level facial attributes transfer with geometry-aware flow
Instance level facial attributes transfer with geometry-aware flowInstance level facial attributes transfer with geometry-aware flow
Instance level facial attributes transfer with geometry-aware flow哲东 郑
 
Learning to adapt structured output space for semantic
Learning to adapt structured output space for semanticLearning to adapt structured output space for semantic
Learning to adapt structured output space for semantic哲东 郑
 
Unsupervised Learning of Object Landmarks through Conditional Image Generation
Unsupervised Learning of Object Landmarks through Conditional Image GenerationUnsupervised Learning of Object Landmarks through Conditional Image Generation
Unsupervised Learning of Object Landmarks through Conditional Image Generation哲东 郑
 
Graph based global reasoning networks
Graph based global reasoning networks Graph based global reasoning networks
Graph based global reasoning networks 哲东 郑
 
Variational Discriminator Bottleneck
Variational Discriminator BottleneckVariational Discriminator Bottleneck
Variational Discriminator Bottleneck哲东 郑
 

Plus de 哲东 郑 (20)

Deep learning for person re-identification
Deep learning for person re-identificationDeep learning for person re-identification
Deep learning for person re-identification
 
Cross-domain complementary learning with synthetic data for multi-person part...
Cross-domain complementary learning with synthetic data for multi-person part...Cross-domain complementary learning with synthetic data for multi-person part...
Cross-domain complementary learning with synthetic data for multi-person part...
 
Step zhedong
Step zhedongStep zhedong
Step zhedong
 
Visual saliency
Visual saliencyVisual saliency
Visual saliency
 
Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval
Polysemous Visual-Semantic Embedding for Cross-Modal RetrievalPolysemous Visual-Semantic Embedding for Cross-Modal Retrieval
Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval
 
Weijian image retrieval
Weijian image retrievalWeijian image retrieval
Weijian image retrieval
 
Scops self supervised co-part segmentation
Scops self supervised co-part segmentationScops self supervised co-part segmentation
Scops self supervised co-part segmentation
 
Video object detection
Video object detectionVideo object detection
Video object detection
 
Center nets
Center netsCenter nets
Center nets
 
C2 ae open set recognition
C2 ae open set recognitionC2 ae open set recognition
C2 ae open set recognition
 
Sota semantic segmentation
Sota semantic segmentationSota semantic segmentation
Sota semantic segmentation
 
Deep randomized embedding
Deep randomized embeddingDeep randomized embedding
Deep randomized embedding
 
Semantic Image Synthesis with Spatially-Adaptive Normalization
Semantic Image Synthesis with Spatially-Adaptive NormalizationSemantic Image Synthesis with Spatially-Adaptive Normalization
Semantic Image Synthesis with Spatially-Adaptive Normalization
 
Instance level facial attributes transfer with geometry-aware flow
Instance level facial attributes transfer with geometry-aware flowInstance level facial attributes transfer with geometry-aware flow
Instance level facial attributes transfer with geometry-aware flow
 
Learning to adapt structured output space for semantic
Learning to adapt structured output space for semanticLearning to adapt structured output space for semantic
Learning to adapt structured output space for semantic
 
Unsupervised Learning of Object Landmarks through Conditional Image Generation
Unsupervised Learning of Object Landmarks through Conditional Image GenerationUnsupervised Learning of Object Landmarks through Conditional Image Generation
Unsupervised Learning of Object Landmarks through Conditional Image Generation
 
Graph based global reasoning networks
Graph based global reasoning networks Graph based global reasoning networks
Graph based global reasoning networks
 
Style gan
Style ganStyle gan
Style gan
 
Vi2vi
Vi2viVi2vi
Vi2vi
 
Variational Discriminator Bottleneck
Variational Discriminator BottleneckVariational Discriminator Bottleneck
Variational Discriminator Bottleneck
 

Dernier

Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Zilliz
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)Samir Dash
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontologyjohnbeverley2021
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 

Dernier (20)

Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 

Image Synthesis From Reconfigurable Layout and Style