9. MASK
Channel Masks
• Channels are predicted in sequential order: R -> G -> B
• Masks are applied to the input-to-state convolutions
• Two types of masks:
• Mask A: used only in the first layer; channels are not connected to themselves
• Mask B: used in all subsequent layers; channels are connected to themselves
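The two mask types above can be sketched as kernel masks for a masked convolution. This is a minimal NumPy illustration (function names and the channel-grouping scheme are my own, not from the slides): spatial positions after the centre pixel are always blocked, and at the centre pixel Mask A only allows connections from strictly earlier channel groups (R -> G -> B), while Mask B also allows a group to see itself.

```python
import numpy as np

def build_mask(mask_type, out_channels, in_channels, k):
    """Build a channel mask for a k x k convolution kernel.

    mask_type 'A': first layer, a channel is NOT connected to itself
                   (R sees nothing at the centre, G sees R, B sees R and G).
    mask_type 'B': later layers, a channel IS connected to itself
                   (R sees R, G sees R and G, B sees R, G and B).
    Channels are assumed to be grouped in R -> G -> B order.
    """
    mask = np.ones((out_channels, in_channels, k, k), dtype=np.float32)
    c = k // 2
    # Block all positions to the right of and below the centre pixel.
    mask[:, :, c, c + 1:] = 0.0
    mask[:, :, c + 1:, :] = 0.0

    def group(ch, n):
        # Map a channel index to its RGB group: 0 (R), 1 (G) or 2 (B).
        return ch * 3 // n

    # At the centre pixel, restrict channel-to-channel connections.
    for o in range(out_channels):
        for i in range(in_channels):
            if mask_type == 'A':
                allowed = group(i, in_channels) < group(o, out_channels)
            else:  # 'B'
                allowed = group(i, in_channels) <= group(o, out_channels)
            if not allowed:
                mask[o, i, c, c] = 0.0
    return mask
```

In a real implementation the mask is multiplied element-wise into the convolution weights before each forward pass, so gradients for masked-out connections stay zero.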
35. SUMMARY
• A deep neural network that sequentially predicts the pixels in an image along the two spatial dimensions
• Proposes three novel architectures to achieve this goal: PixelRNN (Row LSTM, Diagonal BiLSTM) and PixelCNN (a fully convolutional network)
• Combined with efficient convolution preprocessing, both PixelCNN and PixelRNN use this "product of conditionals" factorization to great effect
• The model provides a tractable p(x), with the best log-likelihood scores among models in its family
• Generates images of good quality and diversity
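The "product of conditionals" factorization in the summary is p(x) = ∏ᵢ p(xᵢ | x₁, …, xᵢ₋₁) over pixels in raster-scan order. The toy sketch below shows why this gives a tractable likelihood; the conditional here is a hypothetical fixed logistic rule of my own (PixelRNN/PixelCNN instead learn each conditional with a deep network), but the chain-rule bookkeeping is the same.

```python
import numpy as np

def conditional_prob_one(prev_pixels):
    """Hypothetical stand-in for p(x_i = 1 | x_<i):
    more 1s among earlier pixels -> higher probability of a 1."""
    return 1.0 / (1.0 + np.exp(-(np.sum(prev_pixels) - 1.0)))

def log_likelihood(image):
    """log p(x) = sum_i log p(x_i | x_<i) over a binary image,
    with pixels visited in raster-scan (row-by-row) order."""
    x = image.flatten()
    logp = 0.0
    for i, xi in enumerate(x):
        p1 = conditional_prob_one(x[:i])
        logp += np.log(p1 if xi == 1 else 1.0 - p1)
    return logp
```

Because every factor is a valid conditional distribution, the product is automatically a normalized joint distribution, which is exactly what makes the exact log-likelihood scores reported for these models possible.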
36. REFERENCES
• Slides: Hugo Larochelle, Google Brain: Autoregressive Generative Models with Deep Learning
• Slides: http://slazebni.cs.illinois.edu/spring17/lec13_advanced.pdf
• Slides: https://www.slideshare.net/neouyghur/pixel-recurrent-neural-networks-73970786
• Code: https://github.com/igul222/pixel_rnn/blob/master/pixel_rnn.py
• Related paper: "Generating Images with Recurrent Adversarial Networks"
• Related paper: "WaveNet: A Generative Model for Raw Audio"