SlideShare une entreprise Scribd logo
1  sur  18
Télécharger pour lire hors ligne
KHARKIV AI CLUB
Andrii Babii, Ph.D. - Kharkiv National University of Radio Electronics
apratster@gmail.com
Introduction to capsule networks
2
Before we start
Hinton’s articles:
•Dynamic Routing Between Capsules - Sabour, S., Frosst, N. and Hinton, G.E. (2017)
•Matrix capsules with EM routing - Hinton, G. E., Sabour, S. and Frosst, N. (2018)
•Optimizing Neural Networks that Generate Images - Tijmen Tieleman's disseration
•Transforming Auto-encoders - Hinton, G. E., Krizhevsky, A. and Wang, S. D. (2011)
•A parallel computation that assigns canonical object-based frames of reference. -
Hinton, G.E. (1981)
•Shape representation in parallel systems - Hinton, G.E. (1981)
Glossary (by Sebastian Kwiatkowski)
http://www.aisummary.com/blog/capsule-networks-glossary/
Implementation
https://github.com/Sarasra/models/tree/master/research/capsules
For article “Dynamic Routing Between Capsules”
and more implementations links and info:
https://github.com/sekwiatkowski/awesome-capsule-networks
3
Equivariance and invariance
Maximum value m
Rotate !
Maximum value m will be invariant to rotation, it is same.
Coordinates of the point with maximum value m, will vary "equally" with the distortion.
Coordinates of maximum will be equivariant
4
Convolutional network
Convolution
https://hcordeirodotcom.files.wordpress.com/2012/02/atividade-2-kernel-convolution.jpg
5
Convolutional network
Feature map
http://www.cs.toronto.edu/~ranzato/research/projects.html
6
Convolutional network
Pooling layer
http://cs231n.github.io/convolutional-networks/
7
What wrong with convolutional networks?
Geoffrey Hinton talk "What is wrong with
convolutional neural nets ?"
https://www.youtube.com/watch?v=rTawFwUvnLE
“Feature extraction levels are interleaved with subsampling layers that pool the
outputs of nearby feature detectors of the same type”
Slide:
-It is a bad fit to the psychology of shape perception: It does not explain why we
assign intrinsic coordinate frames to objects and why they have such huge effects
- It solves the wrong problem: We want equivariance, not invariance. Disentangling
rather than discarding.
- It fails to use the underlying linear structure: It does not make use of the natural
linear manifold that perfectly handles the largest source of variance in images
-Pooling is a poor way to do dynamic routing: We need to route each part of the
input to the neurons that know how to deal with it. Finding the best routing is
equivalent to parsing the image
8
How it can be improved ?
Capsules + Dynamic Routing
What we want to store:
Is this feature detected? (Probability)
Attributes of feature: pose (position, size, orientation), deformation, velocity, albedo,
hue, texture, etc.
From scalar values to vectors!
9
Capsule
https://github.com/naturomics/CapsNet-Tensorflow/blob/master/imgs/capsuleVSneuron.png
10
Squash
Squash function:
Dynamic Routing Between Capsules - Sabour, S., Frosst, N. and Hinton, G.E. (2017)
11
Squash
Squash function:
Dynamic Routing Between Capsules - Sabour, S., Frosst, N. and Hinton, G.E. (2017)
12
Routing
Dynamic Routing Between Capsules - Sabour, S., Frosst, N. and Hinton, G.E. (2017)
13
Routing
14
Routing
?
Agree from both capsules
15
How it can be solved ?
Architecture of simple Capsule Net for MNIST:
Dynamic Routing Between Capsules - Sabour, S., Frosst, N. and Hinton, G.E. (2017)
16
Examples
Dynamic Routing Between Capsules - Sabour, S., Frosst, N. and Hinton, G.E. (2017)
(MNIST)
Fashion MNIST
https://github.com/XifengGuo/CapsNet-Fashion-MNIST
Capsule Deep Neural Network for Recognition of Historical Graffiti Handwriting
N Gordienko, Y Kochura, V Taran, G Peng
Food detection, face detection and lot more:
+ Good preliminary results (MNIST).
+ Requires less training data.
+ Works good with overlapping objects.
+ Can detect partially visible objects.
+ Results are interpretable, components hierarchy can be mapped.
+ Equivariance
- No known yet accuracy on large data sources CIFAR10? Accuracy seems low.
- Really slow training time (so far).
- Non linear squashing reflect the probability nature not so good as we want.
- Cosine of an angle for measuring the agreement?
Advantages and disadvantages
18
Questions, Discussion…

Contenu connexe

Tendances

Knowledge Representation, Inference and Reasoning
Knowledge Representation, Inference and ReasoningKnowledge Representation, Inference and Reasoning
Knowledge Representation, Inference and ReasoningSagacious IT Solution
 
Feature Extraction
Feature ExtractionFeature Extraction
Feature Extractionskylian
 
[PR12] Capsule Networks - Jaejun Yoo
[PR12] Capsule Networks - Jaejun Yoo[PR12] Capsule Networks - Jaejun Yoo
[PR12] Capsule Networks - Jaejun YooJaeJun Yoo
 
DBSCAN (2014_11_25 06_21_12 UTC)
DBSCAN (2014_11_25 06_21_12 UTC)DBSCAN (2014_11_25 06_21_12 UTC)
DBSCAN (2014_11_25 06_21_12 UTC)Cory Cook
 
CNN Machine learning DeepLearning
CNN Machine learning DeepLearningCNN Machine learning DeepLearning
CNN Machine learning DeepLearningAbhishek Sharma
 
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain RatioLecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain RatioMarina Santini
 
data generalization and summarization
data generalization and summarization data generalization and summarization
data generalization and summarization janani thirupathi
 
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...Simplilearn
 
Ch 7 Knowledge Representation.pdf
Ch 7 Knowledge Representation.pdfCh 7 Knowledge Representation.pdf
Ch 7 Knowledge Representation.pdfKrishnaMadala1
 
Spider Monkey Optimization Algorithm
Spider Monkey Optimization AlgorithmSpider Monkey Optimization Algorithm
Spider Monkey Optimization AlgorithmAhmed Fouad Ali
 
Introduction to Knowledge Graphs and Semantic AI
Introduction to Knowledge Graphs and Semantic AIIntroduction to Knowledge Graphs and Semantic AI
Introduction to Knowledge Graphs and Semantic AISemantic Web Company
 
Graph neural network #2-2 (heterogeneous graph transformer)
Graph neural network #2-2 (heterogeneous graph transformer)Graph neural network #2-2 (heterogeneous graph transformer)
Graph neural network #2-2 (heterogeneous graph transformer)seungwoo kim
 
(Presentation)NLP Pretraining models based on deeplearning -BERT, GPT, and BART
(Presentation)NLP Pretraining models based on deeplearning -BERT, GPT, and BART(Presentation)NLP Pretraining models based on deeplearning -BERT, GPT, and BART
(Presentation)NLP Pretraining models based on deeplearning -BERT, GPT, and BARThyunyoung Lee
 
DBSCAN : A Clustering Algorithm
DBSCAN : A Clustering AlgorithmDBSCAN : A Clustering Algorithm
DBSCAN : A Clustering AlgorithmPınar Yahşi
 

Tendances (20)

Density based clustering
Density based clusteringDensity based clustering
Density based clustering
 
Capsule networks
Capsule networksCapsule networks
Capsule networks
 
Knowledge Representation, Inference and Reasoning
Knowledge Representation, Inference and ReasoningKnowledge Representation, Inference and Reasoning
Knowledge Representation, Inference and Reasoning
 
Feature Extraction
Feature ExtractionFeature Extraction
Feature Extraction
 
Uncertainty in Deep Learning
Uncertainty in Deep LearningUncertainty in Deep Learning
Uncertainty in Deep Learning
 
Coco dataset
Coco datasetCoco dataset
Coco dataset
 
[PR12] Capsule Networks - Jaejun Yoo
[PR12] Capsule Networks - Jaejun Yoo[PR12] Capsule Networks - Jaejun Yoo
[PR12] Capsule Networks - Jaejun Yoo
 
DBSCAN (2014_11_25 06_21_12 UTC)
DBSCAN (2014_11_25 06_21_12 UTC)DBSCAN (2014_11_25 06_21_12 UTC)
DBSCAN (2014_11_25 06_21_12 UTC)
 
CNN Machine learning DeepLearning
CNN Machine learning DeepLearningCNN Machine learning DeepLearning
CNN Machine learning DeepLearning
 
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain RatioLecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
Lecture 4 Decision Trees (2): Entropy, Information Gain, Gain Ratio
 
data generalization and summarization
data generalization and summarization data generalization and summarization
data generalization and summarization
 
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
Recurrent Neural Network (RNN) | RNN LSTM Tutorial | Deep Learning Course | S...
 
CNN Algorithm
CNN AlgorithmCNN Algorithm
CNN Algorithm
 
Ch 7 Knowledge Representation.pdf
Ch 7 Knowledge Representation.pdfCh 7 Knowledge Representation.pdf
Ch 7 Knowledge Representation.pdf
 
Spider Monkey Optimization Algorithm
Spider Monkey Optimization AlgorithmSpider Monkey Optimization Algorithm
Spider Monkey Optimization Algorithm
 
Introduction to Knowledge Graphs and Semantic AI
Introduction to Knowledge Graphs and Semantic AIIntroduction to Knowledge Graphs and Semantic AI
Introduction to Knowledge Graphs and Semantic AI
 
Graph neural network #2-2 (heterogeneous graph transformer)
Graph neural network #2-2 (heterogeneous graph transformer)Graph neural network #2-2 (heterogeneous graph transformer)
Graph neural network #2-2 (heterogeneous graph transformer)
 
(Presentation)NLP Pretraining models based on deeplearning -BERT, GPT, and BART
(Presentation)NLP Pretraining models based on deeplearning -BERT, GPT, and BART(Presentation)NLP Pretraining models based on deeplearning -BERT, GPT, and BART
(Presentation)NLP Pretraining models based on deeplearning -BERT, GPT, and BART
 
DBSCAN : A Clustering Algorithm
DBSCAN : A Clustering AlgorithmDBSCAN : A Clustering Algorithm
DBSCAN : A Clustering Algorithm
 
Fuzzy Set
Fuzzy SetFuzzy Set
Fuzzy Set
 

Similaire à Introduction to capsule networks

Theoretical Neuroscience and Deep Learning Theory
Theoretical Neuroscience and Deep Learning TheoryTheoretical Neuroscience and Deep Learning Theory
Theoretical Neuroscience and Deep Learning TheoryMLReview
 
Modeling perceptual similarity and shift invariance in deep networks
Modeling perceptual similarity and shift invariance in deep networksModeling perceptual similarity and shift invariance in deep networks
Modeling perceptual similarity and shift invariance in deep networksNAVER Engineering
 
UTS workshop talk
UTS workshop talkUTS workshop talk
UTS workshop talkLei Wang
 
DLD_WeightSharing_Slide
DLD_WeightSharing_SlideDLD_WeightSharing_Slide
DLD_WeightSharing_SlideKang-Ho Lee
 
Capsules Neural Network: Basics
Capsules Neural Network: BasicsCapsules Neural Network: Basics
Capsules Neural Network: BasicsTahmina Zebin
 
Rsc educational bellec_parcellation
Rsc educational bellec_parcellationRsc educational bellec_parcellation
Rsc educational bellec_parcellationPierre Bellec
 
Bat-Cluster: A Bat Algorithm-based Automated Graph Clustering Approach
Bat-Cluster: A Bat Algorithm-based Automated Graph Clustering Approach Bat-Cluster: A Bat Algorithm-based Automated Graph Clustering Approach
Bat-Cluster: A Bat Algorithm-based Automated Graph Clustering Approach IJECEIAES
 
Kefed introduction 12-06-10-0043
Kefed introduction 12-06-10-0043Kefed introduction 12-06-10-0043
Kefed introduction 12-06-10-0043Gully Burns
 
Network Biology: A paradigm for modeling biological complex systems
Network Biology: A paradigm for modeling biological complex systemsNetwork Biology: A paradigm for modeling biological complex systems
Network Biology: A paradigm for modeling biological complex systemsGanesh Bagler
 
Fcv bio cv_cottrell
Fcv bio cv_cottrellFcv bio cv_cottrell
Fcv bio cv_cottrellzukun
 
Fcv bio cv_cottrell
Fcv bio cv_cottrellFcv bio cv_cottrell
Fcv bio cv_cottrellzukun
 
CARLsim 3: Concepts, Tools, and Applications
CARLsim 3: Concepts, Tools, and ApplicationsCARLsim 3: Concepts, Tools, and Applications
CARLsim 3: Concepts, Tools, and ApplicationsMichael Beyeler
 
Kefed introduction 12-05-10-2224
Kefed introduction 12-05-10-2224Kefed introduction 12-05-10-2224
Kefed introduction 12-05-10-2224Gully Burns
 
Why Neurons have thousands of synapses? A model of sequence memory in the brain
Why Neurons have thousands of synapses? A model of sequence memory in the brainWhy Neurons have thousands of synapses? A model of sequence memory in the brain
Why Neurons have thousands of synapses? A model of sequence memory in the brainNumenta
 
Individual functional atlasing of the human brain with multitask fMRI data: l...
Individual functional atlasing of the human brain with multitask fMRI data: l...Individual functional atlasing of the human brain with multitask fMRI data: l...
Individual functional atlasing of the human brain with multitask fMRI data: l...Ana Luísa Pinho
 

Similaire à Introduction to capsule networks (20)

Theoretical Neuroscience and Deep Learning Theory
Theoretical Neuroscience and Deep Learning TheoryTheoretical Neuroscience and Deep Learning Theory
Theoretical Neuroscience and Deep Learning Theory
 
SCT course
SCT courseSCT course
SCT course
 
Modeling perceptual similarity and shift invariance in deep networks
Modeling perceptual similarity and shift invariance in deep networksModeling perceptual similarity and shift invariance in deep networks
Modeling perceptual similarity and shift invariance in deep networks
 
UTS workshop talk
UTS workshop talkUTS workshop talk
UTS workshop talk
 
EWIC talk - 07 June, 2018
EWIC talk - 07 June, 2018EWIC talk - 07 June, 2018
EWIC talk - 07 June, 2018
 
DLD_WeightSharing_Slide
DLD_WeightSharing_SlideDLD_WeightSharing_Slide
DLD_WeightSharing_Slide
 
Capsules Neural Network: Basics
Capsules Neural Network: BasicsCapsules Neural Network: Basics
Capsules Neural Network: Basics
 
Rsc educational bellec_parcellation
Rsc educational bellec_parcellationRsc educational bellec_parcellation
Rsc educational bellec_parcellation
 
Bat-Cluster: A Bat Algorithm-based Automated Graph Clustering Approach
Bat-Cluster: A Bat Algorithm-based Automated Graph Clustering Approach Bat-Cluster: A Bat Algorithm-based Automated Graph Clustering Approach
Bat-Cluster: A Bat Algorithm-based Automated Graph Clustering Approach
 
Ei25822827
Ei25822827Ei25822827
Ei25822827
 
Kefed introduction 12-06-10-0043
Kefed introduction 12-06-10-0043Kefed introduction 12-06-10-0043
Kefed introduction 12-06-10-0043
 
3D Reconstruction of Peripheral Nerves Based on Calcium Chloride Enhanced Mic...
3D Reconstruction of Peripheral Nerves Based on Calcium Chloride Enhanced Mic...3D Reconstruction of Peripheral Nerves Based on Calcium Chloride Enhanced Mic...
3D Reconstruction of Peripheral Nerves Based on Calcium Chloride Enhanced Mic...
 
Network Biology: A paradigm for modeling biological complex systems
Network Biology: A paradigm for modeling biological complex systemsNetwork Biology: A paradigm for modeling biological complex systems
Network Biology: A paradigm for modeling biological complex systems
 
Fcv bio cv_cottrell
Fcv bio cv_cottrellFcv bio cv_cottrell
Fcv bio cv_cottrell
 
Fcv bio cv_cottrell
Fcv bio cv_cottrellFcv bio cv_cottrell
Fcv bio cv_cottrell
 
Content-based Image Retrieval
Content-based Image RetrievalContent-based Image Retrieval
Content-based Image Retrieval
 
CARLsim 3: Concepts, Tools, and Applications
CARLsim 3: Concepts, Tools, and ApplicationsCARLsim 3: Concepts, Tools, and Applications
CARLsim 3: Concepts, Tools, and Applications
 
Kefed introduction 12-05-10-2224
Kefed introduction 12-05-10-2224Kefed introduction 12-05-10-2224
Kefed introduction 12-05-10-2224
 
Why Neurons have thousands of synapses? A model of sequence memory in the brain
Why Neurons have thousands of synapses? A model of sequence memory in the brainWhy Neurons have thousands of synapses? A model of sequence memory in the brain
Why Neurons have thousands of synapses? A model of sequence memory in the brain
 
Individual functional atlasing of the human brain with multitask fMRI data: l...
Individual functional atlasing of the human brain with multitask fMRI data: l...Individual functional atlasing of the human brain with multitask fMRI data: l...
Individual functional atlasing of the human brain with multitask fMRI data: l...
 

Dernier

FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 

Dernier (20)

FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 

Introduction to capsule networks

  • 1. KHARKIV AI CLUB Andrii Babii, Ph.D. - Kharkiv National University of Radio Electronics apratster@gmail.com Introduction to capsule networks
  • 2. 2 Before we start Hinton’s articles: •Dynamic Routing Between Capsules - Sabour, S., Frosst, N. and Hinton, G.E. (2017) •Matrix capsules with EM routing - Hinton, G. E., Sabour, S. and Frosst, N. (2018) •Optimizing Neural Networks that Generate Images - Tijmen Tieleman's disseration •Transforming Auto-encoders - Hinton, G. E., Krizhevsky, A. and Wang, S. D. (2011) •A parallel computation that assigns canonical object-based frames of reference. - Hinton, G.E. (1981) •Shape representation in parallel systems - Hinton, G.E. (1981) Glossary (by Sebastian Kwiatkowski) http://www.aisummary.com/blog/capsule-networks-glossary/ Implementation https://github.com/Sarasra/models/tree/master/research/capsules For article “Dynamic Routing Between Capsules” and more implementations links and info: https://github.com/sekwiatkowski/awesome-capsule-networks
  • 3. 3 Equivariance and invariance Maximum value m Rotate ! Maximum value m will be invariant to rotation, it is same. Coordinates of the point with maximum value m, will vary "equally" with the distortion. Coordinates of maximum will be equivariant
  • 7. 7 What wrong with convolutional networks? Geoffrey Hinton talk "What is wrong with convolutional neural nets ?" https://www.youtube.com/watch?v=rTawFwUvnLE “Feature extraction levels are interleaved with subsampling layers that pool the outputs of nearby feature detectors of the same type” Slide: -It is a bad fit to the psychology of shape perception: It does not explain why we assign intrinsic coordinate frames to objects and why they have such huge effects - It solves the wrong problem: We want equivariance, not invariance. Disentangling rather than discarding. - It fails to use the underlying linear structure: It does not make use of the natural linear manifold that perfectly handles the largest source of variance in images -Pooling is a poor way to do dynamic routing: We need to route each part of the input to the neurons that know how to deal with it. Finding the best routing is equivalent to parsing the image
  • 8. 8 How it can be improved ? Capsules + Dynamic Routing What we want to store: Is this feature detected? (Probability) Attributes of feature: pose (position, size, orientation), deformation, velocity, albedo, hue, texture, etc. From scalar values to vectors!
  • 10. 10 Squash Squash function: Dynamic Routing Between Capsules - Sabour, S., Frosst, N. and Hinton, G.E. (2017)
  • 11. 11 Squash Squash function: Dynamic Routing Between Capsules - Sabour, S., Frosst, N. and Hinton, G.E. (2017)
  • 12. 12 Routing Dynamic Routing Between Capsules - Sabour, S., Frosst, N. and Hinton, G.E. (2017)
  • 15. 15 How it can be solved ? Architecture of simple Capsule Net for MNIST: Dynamic Routing Between Capsules - Sabour, S., Frosst, N. and Hinton, G.E. (2017)
  • 16. 16 Examples Dynamic Routing Between Capsules - Sabour, S., Frosst, N. and Hinton, G.E. (2017) (MNIST) Fashion MNIST https://github.com/XifengGuo/CapsNet-Fashion-MNIST Capsule Deep Neural Network for Recognition of Historical Graffiti Handwriting N Gordienko, Y Kochura, V Taran, G Peng Food detection, face detection and lot more:
  • 17. + Good preliminary results (MNIST). + Requires less training data. + Works good with overlapping objects. + Can detect partially visible objects. + Results are interpretable, components hierarchy can be mapped. + Equivariance - No known yet accuracy on large data sources CIFAR10? Accuracy seems low. - Really slow training time (so far). - Non linear squashing reflect the probability nature not so good as we want. - Cosine of an angle for measuring the agreement? Advantages and disadvantages