Introducing sparsity in artificial neural networks: a sensitivity-based approach
ENZO TARTAGLIONE
POSTDOC AT UNIVERSITY OF TORINO
Università degli Studi di Torino
Computer Science Dept.
EIDOS group
Deep networks
• High number of hidden layers
• More complex classification tasks (ImageNet)
• Use of convolutional layers, pooling layers, very large fully-connected layers
• Very high number of parameters (hundreds of millions and even more…)
• Is it possible to boost the performance, making the ANN robust to noise?
STATE OF THE ART
Size of ANN models vs generalization
Approaches to reduce the size of an ANN
Quantization [Zhou et al., 2016] [Han et al., 2015-1]
Modify the architecture [Howard et al., 2017]
Regularize and prune to achieve sparsity
Why sparse networks?
Less memory required.
Fewer computational resources.
Deployability on embedded devices.
Typical architectures are overparametrized!
Some existing pruning strategies…
Design a proxy L0 regularizer [Louizos et al., 2018]
Greedy thresholding after L2+dropout strategy [Han et al., 2015-2]
Grouping for convolutional features [Lebedev and Lempitsky, 2016] [Hadifar et al., 2020]
Dropout-based approaches [Srivastava et al., 2014]
Lasso-based regularizers [Scardapane et al., 2017]
…
When is a parameter necessary?
[Figure: network diagram from INPUT to OUTPUT 𝑦, with a parameter 𝑤]
Changing w changes the output of the network… we don’t want to modify it!
Changing w does not change the output of the network… we are free to change it!
Forward Propagation
PUBLISHED
Tartaglione, E., Lepsoy, S., Fiandrotti, A., Francini, G. (2018). Learning sparse neural networks via sensitivity-driven
regularization. In Advances in Neural Information Processing Systems (NeurIPS 2018)
PRE-TRAINED MODEL
Definition of sensitivity
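Small perturbation of w:
$$\Delta y_k \approx \Delta w_i \frac{\partial y_k}{\partial w_i}$$
Sensitivity:
$$S(\mathbf{y}, w_i) = \sum_{k=1}^{C} \alpha_k \left| \frac{\partial y_k}{\partial w_i} \right|$$
where $C$ is the size of the output and $\alpha_k$ a weight scalar factor.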
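The sensitivity definition can be illustrated numerically. The sketch below is only an illustration, not the authors' implementation: it assumes the magnitude of each partial derivative is what gets summed, estimates the derivatives by central finite differences rather than back-propagation, and uses hypothetical names (`sensitivity`, `toy_forward`) and a made-up two-output toy model in which one parameter is deliberately unused.

```python
import numpy as np

def sensitivity(forward, w, alpha, eps=1e-6):
    """Estimate S(y, w_i) = sum_k alpha_k * |dy_k/dw_i| by central differences."""
    w = np.asarray(w, dtype=float)
    S = np.zeros_like(w)
    for i in range(w.size):
        step = np.zeros_like(w)
        step[i] = eps
        # Finite-difference estimate of the vector dy/dw_i (one entry per output k).
        dy_dwi = (forward(w + step) - forward(w - step)) / (2.0 * eps)
        S[i] = np.sum(alpha * np.abs(dy_dwi))
    return S

x = np.array([1.0, -2.0])  # hypothetical fixed input

def toy_forward(w):
    # Two outputs (C = 2); w[2] is deliberately unused, so it cannot affect y.
    return np.array([w[0] * x[0], w[1] * x[1]])

alpha = np.array([0.5, 0.5])  # assumed output weights alpha_k
S = sensitivity(toy_forward, [0.3, -0.7, 1.5], alpha)
print(S)  # the unused parameter w[2] gets zero sensitivity
```

The parameter that never influences the output ends up with zero sensitivity, which is the case where "we are free to change it".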
Towards the definition of the update term
We need an insensitivity parameter:
$$\bar{S}(\mathbf{y}, w_i) = 1 - S(\mathbf{y}, w_i)$$
To guarantee that this quantity is never negative… trivial choice:
$$\bar{S}_b(\mathbf{y}, w_i) = \max\left[0, \bar{S}(\mathbf{y}, w_i)\right]$$
Sensitivity-based regularization
Weight update proposed:
$$w_i^{t} := w_i^{t-1} - \eta \frac{\partial L}{\partial w_i^{t-1}} - \lambda\, w_i^{t-1}\, \bar{S}_b\left(\mathbf{y}, w_i^{t-1}\right)$$
where
• $\bar{S}_b$ is a function which states whether the parameter is relevant or not to the computation of the output $\mathbf{y}$ of the network.
• $\mathbf{y}$ is the output of the network.
• $w_i$ is the parameter.
• $L$ is a generic loss function.
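As a rough sketch of the proposed update, assuming the loss gradient and the sensitivities have already been computed elsewhere: the function name `sensitivity_update` and the values of `lr` (playing the role of η) and `lam` (playing the role of λ) are hypothetical choices for illustration only.

```python
import numpy as np

def sensitivity_update(w, grad_L, S, lr=0.1, lam=0.1):
    """One step: w - lr * dL/dw - lam * w * max(0, 1 - S)."""
    # Bounded insensitivity: 1 for parameters with no effect on the output,
    # 0 for parameters whose sensitivity is 1 or larger.
    S_b = np.maximum(0.0, 1.0 - S)
    return w - lr * grad_L - lam * w * S_b

w = np.array([1.0, -2.0, 0.5, 3.0])
grad_L = np.zeros_like(w)            # zero loss gradient isolates the regularizer
S = np.array([0.0, 1.0, 0.5, 2.5])   # made-up sensitivities
w_new = sensitivity_update(w, grad_L, S)
print(w_new)
```

With a zero loss gradient, only insensitive parameters shrink toward zero; parameters with sensitivity of at least 1 are left untouched.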
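Which function are we minimizing?
We need to solve the integral
$$R = \int w \cdot \bar{S}_b(\mathbf{y}, w)\, dw$$
Math, math, math…
$$R = \Theta\left[\bar{S}(\mathbf{y}, w)\right] \frac{w^2}{2} \left[ 1 - \sum_{k=1}^{C} \alpha_k\, \mathrm{sign}\left(\frac{\partial y_k}{\partial w}\right) \sum_{n=1}^{\infty} (-1)^{n+1} \frac{\partial^n y_k}{\partial w^n} \frac{w^{n-1}}{(n+1)!} \right]$$
valid for any architecture and any loss function, but for ReLU-activated networks…
$$R = \frac{w^2}{2} \bar{S}_b(\mathbf{y}, w)$$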
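The ReLU case can be checked directly from the integral. The following is a sketch under an assumption not spelled out on the slide: each output is piecewise linear in the single parameter $w$ (all other parameters held fixed), so the derivatives of order two and higher vanish almost everywhere and the bounded insensitivity is locally a constant $c$. The integration constant is taken to be zero.

```latex
\begin{aligned}
\bar{S}_b(\mathbf{y}, w) &= c
  \quad \text{(locally constant, since } \partial^n y_k / \partial w^n = 0 \text{ for } n \ge 2\text{)} \\
R &= \int w \, c \, dw = c\,\frac{w^2}{2} = \frac{w^2}{2}\,\bar{S}_b(\mathbf{y}, w) \\
\frac{\partial R}{\partial w} &= w\,\bar{S}_b(\mathbf{y}, w)
\end{aligned}
```

The last line is the quantity multiplied by λ in the weight update, so in this case the update can be read as gradient descent on the loss plus a penalty proportional to R.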
Thresholding
During training, due to numerical errors and asymptotic behaviors, it might happen that the value of a parameter never reaches exactly zero.
For this we introduce a simple thresholding mechanism: a parameter is pruned when
$$|w_i| < T$$
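A minimal sketch of such a thresholding step, assuming the comparison is made on the magnitude of each parameter; the helper name `prune` and the example values, including the threshold, are hypothetical.

```python
import numpy as np

def prune(w, T):
    """Zero out every parameter whose magnitude is below the threshold T."""
    w = np.asarray(w, dtype=float)
    mask = np.abs(w) >= T            # True = parameter is kept
    return np.where(mask, w, 0.0), mask

w = np.array([0.8, -0.004, 0.0009, -0.3])
pruned, mask = prune(w, T=0.01)
sparsity = 1.0 - mask.mean()         # fraction of parameters set to zero
print(pruned, sparsity)
```

Returning the mask alongside the pruned weights makes it easy to keep pruned parameters at zero in later updates.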
Overview on the technique
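• Forward Propagation
• Back-Propagation: $\partial L / \partial w$, $S(w)$
• Update: $w_i^{t} := w_i^{t-1} - \eta \frac{\partial L}{\partial w_i^{t-1}} - \lambda\, w_i^{t-1}\, \bar{S}_b\left(\mathbf{y}, w_i^{t-1}\right)$
• Pruning: at the end of the epoch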
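The four stages can be put together in a toy loop. This is a sketch under strong simplifying assumptions, not the published training procedure: a single linear output (so C = 1 and the sensitivity of weight j is the magnitude of input j), sensitivities averaged over the batch, one full-batch update per epoch, and made-up data and hyperparameters (`lr`, `lam`, `T`).

```python
import numpy as np

# Toy data: features 1 and 2 are always zero, so their weights never affect y.
X = np.array([[1.0, 0.0, 0.0],
              [-1.0, 0.0, 0.0]])
t = 2.0 * X[:, 0]                      # targets depend on feature 0 only
w = np.array([0.0, 0.1, -0.1])         # single linear output: y = X @ w
mask = np.ones_like(w, dtype=bool)     # True = parameter not yet pruned
lr, lam, T = 0.5, 0.5, 0.01            # hypothetical hyperparameters

for epoch in range(20):
    # Forward propagation
    y = X @ w
    # Back-propagation: gradient of the loss 0.5 * mean((y - t)^2)
    grad_L = X.T @ (y - t) / len(X)
    # Sensitivity for a linear output: |dy/dw_j| = |x_j|, averaged over the batch
    S = np.mean(np.abs(X), axis=0)
    S_b = np.maximum(0.0, 1.0 - S)
    # Update
    w = w - lr * grad_L - lam * w * S_b
    # Pruning at the end of the epoch; pruned parameters stay at zero
    mask &= np.abs(w) >= T
    w = np.where(mask, w, 0.0)

print(w, mask)
```

In this toy setting the weight that drives the output converges toward 2, while the two weights that cannot change the output decay and are pruned after a few epochs.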
Sensitivity-based regularization:
results on LeNet300-MNIST
Sensitivity-based regularization:
results on VGG16-ImageNet
References
[Zhou et al., 2016] Zhou, Shuchang, et al. "DoReFa-Net: Training low bitwidth convolutional neural networks with low bitwidth gradients." arXiv preprint arXiv:1606.06160 (2016).
[Han et al., 2015-1] Han, Song, Huizi Mao, and William J. Dally. "Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding." arXiv preprint arXiv:1510.00149 (2015).
[Howard et al., 2017] Howard, Andrew G., et al. "MobileNets: Efficient convolutional neural networks for mobile vision applications." arXiv preprint arXiv:1704.04861 (2017).
[Louizos et al., 2018] Louizos, C., M. Welling, and D. P. Kingma. "Learning sparse neural networks through L0 regularization." 6th International Conference on Learning Representations, ICLR 2018 - Conference Track Proceedings (2018).
[Han et al., 2015-2] Han, S., J. Pool, J. Tran, and W. Dally. "Learning both weights and connections for efficient neural network." Advances in Neural Information Processing Systems (2015): 1135–1143.
[Srivastava et al., 2014] Srivastava, N., G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov. "Dropout: a simple way to prevent neural networks from overfitting." The Journal of Machine Learning Research 15.1 (2014): 1929–1958.
[Lebedev and Lempitsky, 2016] Lebedev, V., and V. Lempitsky. "Fast ConvNets using group-wise brain damage." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016): 2554–2564.
[Scardapane et al., 2017] Scardapane, Simone, et al. "Group sparse regularization for deep neural networks." Neurocomputing 241 (2017): 81–89.
[Hadifar et al., 2020] Hadifar, Amir, et al. "Block-wise Dynamic Sparseness." arXiv preprint arXiv:2001.04686 (2020).

Learning Sparse Neural Networks via Sensitivity-Driven Regularization

  • 1. Introducing sparsity in artificial neural networks: a sensitivity-based approach. Enzo Tartaglione, postdoc at the University of Torino (Università degli Studi di Torino), Computer Science Dept., EIDOS group.
  • 2. State of the art: deep networks
      • High number of hidden layers
      • More complex classification tasks (ImageNet)
      • Use of convolutional layers, pooling layers, very large fully-connected layers
      • Very high number of parameters (hundreds of millions and even more…)
      • Is it possible to boost performance by making the ANN robust to noise?
  • 3. Size of ANN models vs generalization
  • 4. Approaches to reduce the size of an ANN: quantization [Zhou et al., 2016] [Han et al., 2015-1]; modifying the architecture [Howard et al., 2017]; regularizing and pruning to achieve sparsity.
  • 5. Why sparse networks? Typical architectures are overparametrized! Sparse networks bring: less memory required; fewer computational resources; deployability on embedded devices.
  • 6. Some existing pruning strategies…
      • Design a proxy L0 regularizer [Louizos et al., 2018]
      • Greedy thresholding after L2+dropout strategy [Han et al., 2015-2]
      • Grouping for convolutional features [Lebedev and Lempitsky, 2016] [Hadifar et al., 2020]
      • Dropout-based approaches [Srivastava et al., 2014]
      • Lasso-based regularizers [Scardapane et al., 2017]
      • …
  • 7. When is a parameter necessary? [Figure: a pre-trained model, with forward propagation from INPUT to OUTPUT y through a parameter w.] If changing w changes the output of the network, we don't want to modify it! If changing w does not change the output of the network, we are free to change it! Published: Tartaglione, E., Lepsoy, S., Fiandrotti, A., Francini, G. (2018). Learning sparse neural networks via sensitivity-driven regularization. In Advances in Neural Information Processing Systems (NeurIPS 2018).
  • 8. Definition of sensitivity. A small perturbation of $w_i$ changes the $k$-th output as $\Delta y_k \approx \Delta w_i \, \frac{\partial y_k}{\partial w_i}$. The sensitivity is $S(\mathbf{y}, w_i) = \sum_{k=1}^{C} \alpha_k \left| \frac{\partial y_k}{\partial w_i} \right|$, where $C$ is the size of the output and $\alpha_k$ is a scalar weighting factor.
  • 9. Towards the definition of the update term. [Diagram relating the value of $w$ and its importance to $S$ and $\bar{S}$.] We need an insensitivity parameter: $\bar{S}(\mathbf{y}, w_i) = 1 - S(\mathbf{y}, w_i)$. To guarantee that this quantity is never negative, the trivial choice is $\bar{S}_b(\mathbf{y}, w_i) = \max\left[0, \bar{S}(\mathbf{y}, w_i)\right]$.
  • 10. Sensitivity-based regularization. Weight update proposed: $w_i^{t} := w_i^{t-1} - \eta \frac{\partial L}{\partial w_i^{t-1}} - \lambda \, w_i^{t-1} \, \bar{S}_b(\mathbf{y}, w_i^{t-1})$, where
      • $\bar{S}_b$ is a function which states whether the parameter is relevant or not to the computation of the output $\mathbf{y}$ of the network;
      • $\mathbf{y}$ is the output of the network;
      • $w_i$ is the parameter;
      • $L$ is a generic loss function.
  • 11. Which function are we minimizing? We need to solve the integral $R = \int w \cdot \bar{S}_b(\mathbf{y}, w) \, dw$. Math, math, math… The result, $R = \Theta\left[\bar{S}(\mathbf{y}, w)\right] \frac{w^2}{2} \left[ 1 - \sum_{k=1}^{C} \alpha_k \, \mathrm{sign}\left(\frac{\partial y_k}{\partial w}\right) \sum_{n=1}^{\infty} (-1)^{n+1} \frac{\partial^n y_k}{\partial w^n} \frac{w^{n-1}}{(n+1)!} \right]$, is valid for any architecture and any loss function, but for ReLU-activated networks we get $R = \frac{w^2}{2} \bar{S}_b(\mathbf{y}, w)$.
  • 12. Thresholding. During training, due to numerical errors and asymptotic behavior, the value of a parameter might never reach exactly zero. For this reason we introduce a simple thresholding mechanism: a parameter is pruned when $|w_i| < T$.
  • 13. Overview of the technique. Forward Propagation → Back-Propagation (yielding $\frac{\partial L}{\partial w}$ and $S(w)$) → Update, $w_i^{t} := w_i^{t-1} - \eta \frac{\partial L}{\partial w_i^{t-1}} - \lambda \, w_i^{t-1} \, \bar{S}_b(\mathbf{y}, w_i^{t-1})$ → Pruning (at the end of the epoch).
  • 14. Sensitivity-based regularization: results on LeNet300-MNIST
  • 15. Sensitivity-based regularization: results on VGG16-ImageNet
  • 17. References
      [Zhou et al., 2016] S. Zhou et al., "DoReFa-Net: Training low bitwidth convolutional neural networks with low bitwidth gradients," arXiv preprint arXiv:1606.06160, 2016.
      [Han et al., 2015-1] S. Han, H. Mao, and W. J. Dally, "Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding," arXiv preprint arXiv:1510.00149, 2015.
      [Howard et al., 2017] A. G. Howard et al., "MobileNets: Efficient convolutional neural networks for mobile vision applications," arXiv preprint arXiv:1704.04861, 2017.
      [Louizos et al., 2018] C. Louizos, M. Welling, and D. P. Kingma, "Learning sparse neural networks through L0 regularization," 6th International Conference on Learning Representations, ICLR 2018, Conference Track Proceedings, 2018.
      [Han et al., 2015-2] S. Han, J. Pool, J. Tran, and W. Dally, "Learning both weights and connections for efficient neural network," in Advances in Neural Information Processing Systems, 2015, pp. 1135–1143.
      [Srivastava et al., 2014] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: a simple way to prevent neural networks from overfitting," The Journal of Machine Learning Research, vol. 15, no. 1, pp. 1929–1958, 2014.
  • 18. References (II)
      [Lebedev and Lempitsky, 2016] V. Lebedev and V. Lempitsky, "Fast ConvNets using group-wise brain damage," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2554–2564.
      [Scardapane et al., 2017] S. Scardapane et al., "Group sparse regularization for deep neural networks," Neurocomputing, vol. 241, pp. 81–89, 2017.
      [Hadifar et al., 2020] A. Hadifar et al., "Block-wise Dynamic Sparseness," arXiv preprint arXiv:2001.04686, 2020.
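
The sensitivity, the bounded insensitivity max(0, 1 − S), and the weight update from slides 8 to 10 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the model is a single linear layer y = W x (so the only non-zero derivative of y_k with respect to W[k][i] is x_i), the loss is a squared error, and the names `forward`, `sensitivity`, `insensitivity`, `sgd_step`, as well as the values of `alpha`, `eta` and `lam`, are hypothetical choices made for the example.

```python
# Sketch of a sensitivity-regularized update for ONE linear layer y = W x.
# Assumptions (not from the slides): linear model, squared-error loss,
# hypothetical function names and hyperparameter values.

def forward(W, x):
    """Output of the linear layer: y_k = sum_i W[k][i] * x[i]."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def sensitivity(W, x, alpha):
    """S(y, w_ki) = sum_k' alpha_k' * |dy_k'/dw_ki|.
    For y = W x only dy_k/dw_ki = x_i is non-zero, so S = alpha_k * |x_i|."""
    return [[alpha[k] * abs(xi) for xi in x] for k in range(len(W))]

def insensitivity(S):
    """Bounded insensitivity: max(0, 1 - S), element-wise."""
    return [[max(0.0, 1.0 - s) for s in row] for row in S]

def sgd_step(W, x, target, alpha, eta=0.1, lam=0.01):
    """w <- w - eta * dL/dw - lam * w * S_b, with L = 0.5 * sum_k (y_k - target_k)^2."""
    y = forward(W, x)
    S_b = insensitivity(sensitivity(W, x, alpha))
    new_W = []
    for k, row in enumerate(W):
        err = y[k] - target[k]  # dL/dy_k for the squared-error loss
        new_W.append([
            w - eta * err * xi - lam * w * S_b[k][i]  # dL/dw_ki = err * x_i
            for i, (w, xi) in enumerate(zip(row, x))
        ])
    return new_W

if __name__ == "__main__":
    W = [[0.5, -0.2], [0.1, 0.3]]
    x = [1.0, 0.0]          # the second input is zero, so weights on it do not affect y
    alpha = [0.5, 0.5]      # assumed uniform weighting over the C = 2 outputs
    print(sgd_step(W, x, target=forward(W, x), alpha=alpha, lam=0.1))
```

In this toy case the loss gradient is zero, so only the regularization term acts: the weights attached to the zero input have insensitivity 1 and shrink faster than the weights attached to the active input.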

Editor's notes

  1. Noise: are some parameters less relevant than others?
  2. Two questions: how do we estimate the change in the output, and where do we drive the unnecessary w?
  3. The threshold T is magnitude-based, as in all the literature out there.
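
The magnitude-based threshold mentioned in note 3 can be illustrated with a short Python sketch. It assumes weights stored as nested lists and a hand-picked threshold; the helper names `prune` and `sparsity` and the value of `T` are hypothetical and do not come from the slides.

```python
# Sketch of magnitude-based thresholding: zero every weight with |w| < T.
# Assumptions: weights are a list of rows; T is chosen by hand for the example.

def prune(weights, T):
    """Return a copy of `weights` where entries with magnitude below T are set to zero."""
    return [[0.0 if abs(w) < T else w for w in row] for row in weights]

def sparsity(weights):
    """Fraction of weights that are exactly zero."""
    total = sum(len(row) for row in weights)
    zeros = sum(1 for row in weights for w in row if w == 0.0)
    return zeros / total

if __name__ == "__main__":
    W = [[0.5, -0.001], [0.0001, -0.3]]
    W_pruned = prune(W, T=0.01)
    print(W_pruned, sparsity(W_pruned))
```

Applying such a step at the end of each epoch, as in the overview slide, removes the parameters that the regularizer has driven close to zero without their ever reaching it exactly.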