Introduction to capsule networks

•

2 j'aime•285 vues

My presentation for Kharkiv AI club about capsule networks. Introduction to capsule networks theory, basics. Links, references, explanations of capsules and routing

Technologie

KHARKIV AI CLUB
Andrii Babii, Ph.D. - Kharkiv National University of Radio Electronics
apratster@gmail.com
Introduction to capsule networks

2
Before we start
Hinton’s articles:
•Dynamic Routing Between Capsules - Sabour, S., Frosst, N. and Hinton, G.E. (2017)
•Matrix capsules with EM routing - Hinton, G. E., Sabour, S. and Frosst, N. (2018)
•Optimizing Neural Networks that Generate Images - Tijmen Tieleman's disseration
•Transforming Auto-encoders - Hinton, G. E., Krizhevsky, A. and Wang, S. D. (2011)
•A parallel computation that assigns canonical object-based frames of reference. -
Hinton, G.E. (1981)
•Shape representation in parallel systems - Hinton, G.E. (1981)
Glossary (by Sebastian Kwiatkowski)
http://www.aisummary.com/blog/capsule-networks-glossary/
Implementation
https://github.com/Sarasra/models/tree/master/research/capsules
For article “Dynamic Routing Between Capsules”
and more implementations links and info:
https://github.com/sekwiatkowski/awesome-capsule-networks

3
Equivariance and invariance
Maximum value m
Rotate !
Maximum value m will be invariant to rotation, it is same.
Coordinates of the point with maximum value m, will vary "equally" with the distortion.
Coordinates of maximum will be equivariant

4
Convolutional network
Convolution
https://hcordeirodotcom.files.wordpress.com/2012/02/atividade-2-kernel-convolution.jpg

5
Convolutional network
Feature map
http://www.cs.toronto.edu/~ranzato/research/projects.html

6
Convolutional network
Pooling layer
http://cs231n.github.io/convolutional-networks/

7
What wrong with convolutional networks?
Geoffrey Hinton talk "What is wrong with
convolutional neural nets ?"
https://www.youtube.com/watch?v=rTawFwUvnLE
“Feature extraction levels are interleaved with subsampling layers that pool the
outputs of nearby feature detectors of the same type”
Slide:
-It is a bad fit to the psychology of shape perception: It does not explain why we
assign intrinsic coordinate frames to objects and why they have such huge effects
- It solves the wrong problem: We want equivariance, not invariance. Disentangling
rather than discarding.
- It fails to use the underlying linear structure: It does not make use of the natural
linear manifold that perfectly handles the largest source of variance in images
-Pooling is a poor way to do dynamic routing: We need to route each part of the
input to the neurons that know how to deal with it. Finding the best routing is
equivalent to parsing the image

8
How it can be improved ?
Capsules + Dynamic Routing
What we want to store:
Is this feature detected? (Probability)
Attributes of feature: pose (position, size, orientation), deformation, velocity, albedo,
hue, texture, etc.
From scalar values to vectors!

9
Capsule
https://github.com/naturomics/CapsNet-Tensorflow/blob/master/imgs/capsuleVSneuron.png

10
Squash
Squash function:
Dynamic Routing Between Capsules - Sabour, S., Frosst, N. and Hinton, G.E. (2017)

11
Squash
Squash function:
Dynamic Routing Between Capsules - Sabour, S., Frosst, N. and Hinton, G.E. (2017)

12
Routing
Dynamic Routing Between Capsules - Sabour, S., Frosst, N. and Hinton, G.E. (2017)

15
How it can be solved ?
Architecture of simple Capsule Net for MNIST:
Dynamic Routing Between Capsules - Sabour, S., Frosst, N. and Hinton, G.E. (2017)

16
Examples
Dynamic Routing Between Capsules - Sabour, S., Frosst, N. and Hinton, G.E. (2017)
(MNIST)
Fashion MNIST
https://github.com/XifengGuo/CapsNet-Fashion-MNIST
Capsule Deep Neural Network for Recognition of Historical Graffiti Handwriting
N Gordienko, Y Kochura, V Taran, G Peng
Food detection, face detection and lot more:

+ Good preliminary results (MNIST).
+ Requires less training data.
+ Works good with overlapping objects.
+ Can detect partially visible objects.
+ Results are interpretable, components hierarchy can be mapped.
+ Equivariance
- No known yet accuracy on large data sources CIFAR10? Accuracy seems low.
- Really slow training time (so far).
- Non linear squashing reflect the probability nature not so good as we want.
- Cosine of an angle for measuring the agreement?
Advantages and disadvantages

Contenu connexe

Tendances

Density based clusteringYaswanthHariKumarVud

Capsule networksJaehyeon Park

Knowledge Representation, Inference and ReasoningSagacious IT Solution

Feature Extractionskylian

Uncertainty in Deep LearningRoberto Pereira Silveira

Coco datasetChenafter1990

[PR12] Capsule Networks - Jaejun YooJaeJun Yoo

DBSCAN (2014_11_25 06_21_12 UTC)Cory Cook

CNN Machine learning DeepLearningAbhishek Sharma

Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain RatioMarina Santini

data generalization and summarization janani thirupathi

Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...Simplilearn

CNN Algorithmgeorgejustymirobi1

Ch 7 Knowledge Representation.pdfKrishnaMadala1

Spider Monkey Optimization AlgorithmAhmed Fouad Ali

Introduction to Knowledge Graphs and Semantic AISemantic Web Company

Graph neural network #2-2 (heterogeneous graph transformer)seungwoo kim

(Presentation)NLP Pretraining models based on deeplearning -BERT, GPT, and BARThyunyoung Lee

DBSCAN : A Clustering AlgorithmPınar Yahşi

Fuzzy SetEhsan Hamzei

Tendances (20)

Density based clustering

Capsule networks

Knowledge Representation, Inference and Reasoning

Feature Extraction

Uncertainty in Deep Learning

Coco dataset

[PR12] Capsule Networks - Jaejun Yoo

DBSCAN (2014_11_25 06_21_12 UTC)

CNN Machine learning DeepLearning

Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio

data generalization and summarization

Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...

CNN Algorithm

Ch 7 Knowledge Representation.pdf

Spider Monkey Optimization Algorithm

Introduction to Knowledge Graphs and Semantic AI

Graph neural network #2-2 (heterogeneous graph transformer)

(Presentation)NLP Pretraining models based on deeplearning -BERT, GPT, and BART

DBSCAN : A Clustering Algorithm

Fuzzy Set

Similaire à Introduction to capsule networks

Theoretical Neuroscience and Deep Learning TheoryMLReview

SCT courseNeuroPoly

Modeling perceptual similarity and shift invariance in deep networksNAVER Engineering

UTS workshop talkLei Wang

EWIC talk - 07 June, 2018University of Huddersfield

DLD_WeightSharing_SlideKang-Ho Lee

Capsules Neural Network: BasicsTahmina Zebin

Rsc educational bellec_parcellationPierre Bellec

Bat-Cluster: A Bat Algorithm-based Automated Graph Clustering Approach IJECEIAES

Ei25822827IJERA Editor

Kefed introduction 12-06-10-0043Gully Burns

3D Reconstruction of Peripheral Nerves Based on Calcium Chloride Enhanced Mic...ANALYTICAL AND QUANTITATIVE CYTOPATHOLOGY AND HISTOPATHOLOGY

Network Biology: A paradigm for modeling biological complex systemsGanesh Bagler

Fcv bio cv_cottrellzukun

Content-based Image RetrievalUniversity of Zurich

CARLsim 3: Concepts, Tools, and ApplicationsMichael Beyeler

Kefed introduction 12-05-10-2224Gully Burns

Why Neurons have thousands of synapses? A model of sequence memory in the brainNumenta

Individual functional atlasing of the human brain with multitask fMRI data: l...Ana Luísa Pinho

Similaire à Introduction to capsule networks (20)

Theoretical Neuroscience and Deep Learning Theory

SCT course

Modeling perceptual similarity and shift invariance in deep networks

UTS workshop talk

EWIC talk - 07 June, 2018

DLD_WeightSharing_Slide

Capsules Neural Network: Basics

Rsc educational bellec_parcellation

Bat-Cluster: A Bat Algorithm-based Automated Graph Clustering Approach

Ei25822827

Kefed introduction 12-06-10-0043

3D Reconstruction of Peripheral Nerves Based on Calcium Chloride Enhanced Mic...

Network Biology: A paradigm for modeling biological complex systems

Fcv bio cv_cottrell

Content-based Image Retrieval

CARLsim 3: Concepts, Tools, and Applications

Kefed introduction 12-05-10-2224

Why Neurons have thousands of synapses? A model of sequence memory in the brain

Individual functional atlasing of the human brain with multitask fMRI data: l...

Dernier

FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh

My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar

Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC

IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge

Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia

How to convert PDF to text with Nanonetsnaman860154

Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent

How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes

Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK

The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge

Understanding the Laravel MVC ArchitecturePixlogix Infotech

Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung

Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix

Scaling API-first – The story of a global engineering organizationRadu Cotescu

Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun

A Domino Admins Adventures (Engage 2024)Gabriella Davis

04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG

Dernier (20)

FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi

My Hashitalk Indonesia April 2024 Presentation

Breaking the Kubernetes Kill Chain: Host Path Mount

IAC 2024 - IA Fast Track to Search Focused AI Solutions

Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...

How to convert PDF to text with Nanonets

Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...

Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...

How to Troubleshoot Apps for the Modern Connected Worker

Unblocking The Main Thread Solving ANRs and Frozen Frames

The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf

Understanding the Laravel MVC Architecture

Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...

Swan(sea) Song – personal research during my six years at Swansea ... and bey...

Scaling API-first – The story of a global engineering organization

Data Cloud, More than a CDP by Matt Robison

A Domino Admins Adventures (Engage 2024)

04-2024-HHUG-Sales-and-Marketing-Alignment.pptx

Introduction to capsule networks

1. KHARKIV AI CLUB Andrii Babii, Ph.D. - Kharkiv National University of Radio Electronics apratster@gmail.com Introduction to capsule networks

2. 2 Before we start Hinton’s articles: •Dynamic Routing Between Capsules - Sabour, S., Frosst, N. and Hinton, G.E. (2017) •Matrix capsules with EM routing - Hinton, G. E., Sabour, S. and Frosst, N. (2018) •Optimizing Neural Networks that Generate Images - Tijmen Tieleman's disseration •Transforming Auto-encoders - Hinton, G. E., Krizhevsky, A. and Wang, S. D. (2011) •A parallel computation that assigns canonical object-based frames of reference. - Hinton, G.E. (1981) •Shape representation in parallel systems - Hinton, G.E. (1981) Glossary (by Sebastian Kwiatkowski) http://www.aisummary.com/blog/capsule-networks-glossary/ Implementation https://github.com/Sarasra/models/tree/master/research/capsules For article “Dynamic Routing Between Capsules” and more implementations links and info: https://github.com/sekwiatkowski/awesome-capsule-networks

3. 3 Equivariance and invariance Maximum value m Rotate ! Maximum value m will be invariant to rotation, it is same. Coordinates of the point with maximum value m, will vary "equally" with the distortion. Coordinates of maximum will be equivariant

4. 4 Convolutional network Convolution https://hcordeirodotcom.files.wordpress.com/2012/02/atividade-2-kernel-convolution.jpg

5. 5 Convolutional network Feature map http://www.cs.toronto.edu/~ranzato/research/projects.html

6. 6 Convolutional network Pooling layer http://cs231n.github.io/convolutional-networks/

7. 7 What wrong with convolutional networks? Geoffrey Hinton talk "What is wrong with convolutional neural nets ?" https://www.youtube.com/watch?v=rTawFwUvnLE “Feature extraction levels are interleaved with subsampling layers that pool the outputs of nearby feature detectors of the same type” Slide: -It is a bad fit to the psychology of shape perception: It does not explain why we assign intrinsic coordinate frames to objects and why they have such huge effects - It solves the wrong problem: We want equivariance, not invariance. Disentangling rather than discarding. - It fails to use the underlying linear structure: It does not make use of the natural linear manifold that perfectly handles the largest source of variance in images -Pooling is a poor way to do dynamic routing: We need to route each part of the input to the neurons that know how to deal with it. Finding the best routing is equivalent to parsing the image

8. 8 How it can be improved ? Capsules + Dynamic Routing What we want to store: Is this feature detected? (Probability) Attributes of feature: pose (position, size, orientation), deformation, velocity, albedo, hue, texture, etc. From scalar values to vectors!

9. 9 Capsule https://github.com/naturomics/CapsNet-Tensorflow/blob/master/imgs/capsuleVSneuron.png

10. 10 Squash Squash function: Dynamic Routing Between Capsules - Sabour, S., Frosst, N. and Hinton, G.E. (2017)

11. 11 Squash Squash function: Dynamic Routing Between Capsules - Sabour, S., Frosst, N. and Hinton, G.E. (2017)

12. 12 Routing Dynamic Routing Between Capsules - Sabour, S., Frosst, N. and Hinton, G.E. (2017)

13. 13 Routing

14. 14 Routing ? Agree from both capsules

15. 15 How it can be solved ? Architecture of simple Capsule Net for MNIST: Dynamic Routing Between Capsules - Sabour, S., Frosst, N. and Hinton, G.E. (2017)

16. 16 Examples Dynamic Routing Between Capsules - Sabour, S., Frosst, N. and Hinton, G.E. (2017) (MNIST) Fashion MNIST https://github.com/XifengGuo/CapsNet-Fashion-MNIST Capsule Deep Neural Network for Recognition of Historical Graffiti Handwriting N Gordienko, Y Kochura, V Taran, G Peng Food detection, face detection and lot more:

17. + Good preliminary results (MNIST). + Requires less training data. + Works good with overlapping objects. + Can detect partially visible objects. + Results are interpretable, components hierarchy can be mapped. + Equivariance - No known yet accuracy on large data sources CIFAR10? Accuracy seems low. - Really slow training time (so far). - Non linear squashing reflect the probability nature not so good as we want. - Cosine of an angle for measuring the agreement? Advantages and disadvantages

18. 18 Questions, Discussion…

Introduction to capsule networks

Recommandé

Recommandé

Contenu connexe

Tendances

Tendances (20)

Similaire à Introduction to capsule networks

Similaire à Introduction to capsule networks (20)

Dernier

Dernier (20)

Introduction to capsule networks