Neural Network Compression
Luigi Troiano
Dept. of Engineering
University of Sannio
This work is licensed under a Creative Commons Attribution 4.0 International License.
Today, AI is Data Center oriented
Computational power is required to train large models over massive data
[Figure: Data Center @Nvidia. Source: NVIDIA Annual Investor Day, March 19, 2019]
The future of AI is on the Edge
Edge Computing market will reach $34 billion by 2023, growing at 35% annually
Edge AI inference will grow from just 6% in 2017 to 43% in 2023
By 2022, 80% of smartphones shipped will have AI capabilities on the device itself, up from 10% in 2017
Anatomy of a Convolutional Neural Network
Umut Güçlü, Marcel A. J. van Gerven: Deep Neural Networks Reveal a Gradient in the Complexity of Neural Representations across the Ventral Stream, Journal of Neuroscience, 8 July 2015, 35 (27) 10005-10014
An example: GoogLeNet
[Figure: GoogLeNet architecture. Legend: convolution, pooling, softmax, concat/normalize]
Aims of Neural Network Compression
1. Lowering the bandwidth necessary for transferring a network
2. Reducing the space for keeping the network in memory
3. Making network inference more efficient
4. Shortening the time required for training
5. Reducing/eliminating multiple nodes for complex networks
6. Improving the network generalization
Approaches
■ Quantization
■ Pruning and sharing
■ Design Structure Matrix
■ Low-rank tensor decomposition
■ Tensor Factorization
■ Transforming Network Representation
Quantization
Reducing the number of bits required to represent each weight with minimal loss of accuracy
• Integer: 8-bit quantization of the parameters can result in significant speed-up
• Fixed-point: 16-bit fixed-point representation with stochastic rounding
• Non-uniform: K-means scalar quantization
• Binarization: 1-bit representation for each weight
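As an illustration of the integer case, here is a minimal sketch of uniform 8-bit quantization in NumPy. The affine scheme (a scale plus a minimum offset) and the function names `quantize_uint8` and `dequantize` are assumptions made for this example, not a method taken from the slides.

```python
import numpy as np

def quantize_uint8(w):
    """Map float weights to uint8 codes using a scale and a minimum offset."""
    w_min, w_max = float(w.min()), float(w.max())
    scale = (w_max - w_min) / 255.0
    if scale == 0.0:
        scale = 1.0  # constant tensor: any scale works, avoid dividing by zero
    codes = np.round((w - w_min) / scale).astype(np.uint8)
    return codes, scale, w_min

def dequantize(codes, scale, w_min):
    """Recover approximate float weights from the 8-bit codes."""
    return codes.astype(np.float32) * scale + w_min

rng = np.random.default_rng(0)
weights = rng.normal(size=(64, 64)).astype(np.float32)  # stand-in for a layer

codes, scale, w_min = quantize_uint8(weights)
restored = dequantize(codes, scale, w_min)
max_error = float(np.abs(weights - restored).max())

print("bytes before:", weights.nbytes, "bytes after:", codes.nbytes)
print("max reconstruction error:", max_error)
```

Each weight now takes one byte instead of four, and the reconstruction error of any weight is bounded by about half a quantization step.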
Pruning and Sharing
Reducing network complexity by removing or grouping redundant and non-informative weights
• Remove connections:
• Based on the magnitude of the weighting matrix
• Based on the Hessian of the loss function
• Prune weights in a pre-trained CNN model and then retrain the model
• Sharing the weights:
• Using low-cost hash (possibly LSH) function to group weights into buckets
• Adding an L0, L1 or L2 regularizer to the loss function to encourage weights to become zero (or negligibly small), then removing them
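Magnitude-based pruning can be sketched as follows. This is a minimal NumPy example under the assumption of a single global sparsity target; the function name `magnitude_prune` is hypothetical, and the retraining step that normally follows pruning is omitted.

```python
import numpy as np

def magnitude_prune(w, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(round(sparsity * w.size))  # number of weights to remove
    if k == 0:
        return w.copy(), np.ones(w.shape, dtype=bool)
    # The k-th smallest absolute value is the pruning threshold.
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    mask = np.abs(w) > threshold
    return w * mask, mask

rng = np.random.default_rng(0)
w = rng.normal(size=(100, 100))  # stand-in for a trained weight matrix

pruned, mask = magnitude_prune(w, sparsity=0.9)
print("kept weights:", int(mask.sum()), "of", w.size)
```

The boolean mask is returned alongside the pruned weights so that a later retraining pass could keep the removed connections at zero.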
Combining methods
S. Han, H. Mao, and W. J. Dally proposed:
1. Learning the connectivity via common network training
2. Pruning the small-weight connections
3. Quantization of the link weights using weight sharing
4. Applying Huffman coding to the quantized weights as well as the codebook
5. Retraining the network to learn the final weights for the remaining sparse connections
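The pruning and weight-sharing steps listed above can be sketched together. This is a simplified illustration, not the authors' implementation: the pruning threshold of 1.0, the 16-entry codebook, the linear centroid initialization, and the helper `kmeans_1d` are all assumptions, and training, retraining and Huffman coding are left out.

```python
import numpy as np

def kmeans_1d(values, k, iters=20):
    """Plain 1D k-means; returns the codebook and the index of each value."""
    centroids = np.linspace(values.min(), values.max(), k)  # linear init
    for _ in range(iters):
        assign = np.abs(values[:, None] - centroids[None, :]).argmin(axis=1)
        for j in range(k):
            members = values[assign == j]
            if members.size:  # leave empty clusters where they are
                centroids[j] = members.mean()
    assign = np.abs(values[:, None] - centroids[None, :]).argmin(axis=1)
    return centroids, assign

rng = np.random.default_rng(0)
w = rng.normal(size=(32, 32))  # stand-in for trained weights

# Prune the small-weight connections.
mask = np.abs(w) > 1.0
survivors = w[mask]

# Quantize by weight sharing: each survivor stores a 4-bit codebook index.
codebook, indices = kmeans_1d(survivors, k=16)

compressed = np.zeros_like(w)
compressed[mask] = codebook[indices]
print("distinct nonzero values:", len(np.unique(compressed[mask])))
```

After this step the layer is described by a sparse index map plus a 16-value codebook, which is the representation that entropy coding would then compress further.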
Design Structure Matrix (DSM)
Searching for relevant dependencies by rearranging the weights.
• Impose the structure from the beginning
• Circulant matrix
• Adaptive Fastfood Transform:
• R = SHGPHB, where S, G and B are random diagonal matrices, P is a random permutation matrix, and H is the Walsh-Hadamard matrix.
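The product R = SHGPHB can be assembled explicitly for a small size to see how few parameters it needs. In this sketch the diagonal entries are drawn from a normal distribution purely for illustration, n = 8 is an arbitrary power of two, and the helper `hadamard` is written for this example.

```python
import numpy as np

def hadamard(n):
    """Walsh-Hadamard matrix of order n (n must be a power of two)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.kron(H, np.array([[1.0, 1.0], [1.0, -1.0]]))
    return H

n = 8
rng = np.random.default_rng(0)

# Three diagonal matrices (illustrative random entries) and one permutation.
S, G, B = (np.diag(rng.normal(size=n)) for _ in range(3))
P = np.eye(n)[rng.permutation(n)]
H = hadamard(n)

R = S @ H @ G @ P @ H @ B  # dense n x n matrix, never stored in practice

# Applying the factors one at a time gives the same result as using R.
x = rng.normal(size=n)
y = S @ (H @ (G @ (P @ (H @ (B @ x)))))

print("dense parameters:", n * n, "structured parameters:", 3 * n)
```

Only the three diagonals (3n values) and the permutation need to be stored, compared with n² values for an unstructured matrix of the same size.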
Low-rank Tensor Decomposition
Assuming or producing a decomposition of the 4D weight tensor in order to reduce the memory footprint
• Learning separable 1D filters
• Compress 2D convolutional layers
• Canonical Polyadic (CP) decomposition
• Batch Normalization
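The idea behind separable 1D filters can be checked numerically: a rank-1 2D kernel is the outer product of two 1D filters, so one 2D pass equals two 1D passes. The sketch below uses the Sobel kernel as an assumed example and computes valid-mode correlation with NumPy's `sliding_window_view`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# A rank-1 3x3 kernel (Sobel) written as the outer product of two 1D filters.
col = np.array([1.0, 2.0, 1.0])
row = np.array([-1.0, 0.0, 1.0])
kernel = np.outer(col, row)

rng = np.random.default_rng(0)
image = rng.normal(size=(16, 16))  # stand-in for a feature map

# Full 2D correlation: 9 multiplications per output pixel.
windows = sliding_window_view(image, (3, 3))          # shape (14, 14, 3, 3)
full = np.einsum("yxij,ij->yx", windows, kernel)

# Two 1D passes: 3 + 3 multiplications per output pixel.
vertical = sliding_window_view(image, 3, axis=0) @ col      # shape (14, 16)
separable = sliding_window_view(vertical, 3, axis=1) @ row  # shape (14, 14)

print("same result:", np.allclose(full, separable))
```

For a k x k kernel the separable form stores 2k values instead of k², and the saving grows with the kernel size.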
Tensor Factorization
Approximating tensors by means of low-rank reduced tensors
• Singular Value Decomposition (SVD)
• Non-negative Matrix Factorization
• CUR
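A truncated SVD of a fully connected layer shows the parameter saving. The sketch below assumes a weight matrix that is exactly rank 32, built synthetically so the factorization is lossless; real layers are only approximately low rank and the truncation introduces some error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 256 x 128 weight matrix of rank 32 (an assumption for the example).
W = rng.normal(size=(256, 32)) @ rng.normal(size=(32, 128))

U, s, Vt = np.linalg.svd(W, full_matrices=False)

r = 32                      # retained rank
A = U[:, :r] * s[:r]        # 256 x r, singular values folded into the left factor
B = Vt[:r]                  # r x 128
W_hat = A @ B               # low-rank reconstruction

params_full = W.size
params_lowrank = A.size + B.size
print("parameters:", params_full, "->", params_lowrank)
```

The layer y = W x is replaced by two smaller layers, y = A (B x), which reduces both storage and multiply-accumulate operations whenever r is well below the matrix dimensions.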
Transforming Network Representation
• Transferred/Compact Convolutional Filters
• Knowledge Distillation
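Knowledge distillation trains a small student network to match the softened outputs of a larger teacher. The sketch below computes one common form of the distillation loss, the KL divergence between temperature-scaled softmax outputs; the temperature value, the T² scaling and the function names are assumptions for this example, and the usual additional term on the true labels is omitted.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax, computed in a numerically stable way."""
    z = logits / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """Mean KL(teacher || student) over the batch, scaled by T squared."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    kl = np.sum(p * (np.log(p) - np.log(q)), axis=-1)
    return float(np.mean(kl) * T**2)

rng = np.random.default_rng(0)
teacher_logits = rng.normal(size=(8, 10))  # batch of 8, 10 classes
student_logits = rng.normal(size=(8, 10))

loss_mismatch = distillation_loss(student_logits, teacher_logits)
loss_match = distillation_loss(teacher_logits, teacher_logits)
print("loss (random student):", loss_mismatch)
print("loss (student equals teacher):", loss_match)
```

A higher temperature flattens both distributions, so the student also learns from the relative probabilities the teacher assigns to the wrong classes.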
Conclusions
AI applications involve complex neural networks
Today, training (and also inference) is mainly performed in data centers
There is increasing interest in Edge AI
This requires transmitting and executing models with limited resources
Research is addressing these problems through multiple approaches
MPEG will play a relevant role in the standardization of compressed networks
Luigi Troiano
Artificial Intelligence, Data Science and Big Data
Dept. of Engineering
University of Sannio
Benevento, Italy
https://www.linkedin.com/in/luigitroiano/
THANK YOU

Best 20 SEO Techniques To Improve Website Visibility In SERP
Pixlogix Infotech
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
Postman
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
Jason Packer
 


2019-06-14:7 - Neural Network Compression

  • 1. Neural Network Compression. Luigi Troiano, Dept. of Engineering, University of Sannio
  • 2. This work is licensed under a Creative Commons Attribution 4.0 International License.
  • 3. Today, AI is Data Center oriented. Computational power is required to train large models over massive data.
  • 4. Data Center @Nvidia MARCH 19, 2019 NVIDIA Annual Investor Day
  • 5. The future of AI is on the Edge. Edge Computing market will reach $34 billion by 2023, growing at 35% annually. Edge AI inference will grow from just 6% in 2017 to 43% in 2023. By 2022, 80% of smartphones shipped will have AI capabilities on the device itself, up from 10% in 2017.
  • 6. Anatomy of a Convolutional Neural Network. Umut Güçlü, Marcel A. J. van Gerven: Deep Neural Networks Reveal a Gradient in the Complexity of Neural Representations across the Ventral Stream, Journal of Neuroscience 8 July 2015, 35 (27) 10005-10014
  • 9. Aims of Neural Network Compression 1. Lowering the bandwidth necessary for transferring a network 2. Reducing the space needed to keep the network in memory 3. Making network inference more efficient 4. Shortening the time required for training 5. Reducing/eliminating multiple nodes for complex networks 6. Improving network generalization
  • 10. Approaches ■ Quantization ■ Pruning and sharing ■ Design Structure Matrix ■ Low-rank tensor decomposition ■ Tensor Factorization ■ Transforming Network Representation
  • 11. Quantization Reducing the number of bits required to represent each weight with minimal loss of accuracy • Integer: 8-bit quantization of the parameters can result in significant speed-up • Fixed-point: 16-bit fixed-point representation with stochastic rounding • Non-uniform: K-means scalar quantization • Binarization: 1-bit representation for each weight
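The integer case on the quantization slide can be illustrated with a minimal sketch. It assumes a simple min/max affine mapping onto 256 levels, which is one common scheme and not necessarily the one the slides have in mind; the function names `quantize_uint8` and `dequantize` are hypothetical.

```python
def quantize_uint8(weights):
    """Affine 8-bit quantization of a list of floats (min/max scheme, an assumption)."""
    w_min, w_max = min(weights), max(weights)
    # Step between adjacent integer levels; fall back to 1.0 for a constant tensor
    scale = (w_max - w_min) / 255.0 or 1.0
    codes = [round((w - w_min) / scale) for w in weights]
    return codes, scale, w_min


def dequantize(codes, scale, w_min):
    """Map the integer codes back to approximate float weights."""
    return [c * scale + w_min for c in codes]


weights = [-0.82, -0.10, 0.0, 0.33, 0.91]
codes, scale, w_min = quantize_uint8(weights)
restored = dequantize(codes, scale, w_min)

# Rounding to the nearest level bounds the error by half a quantization step
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(codes, max_error <= scale / 2 + 1e-12)
```

Each weight is stored in 8 bits instead of 32, plus two floats (`scale`, `w_min`) per tensor.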
  • 12. Pruning and Sharing Reducing network complexity by removing or grouping redundant and non-informative weights • Remove connections: • Based on the magnitude of the weighting matrix • Based on the Hessian of the loss function • Prune weights in a pre-trained CNN model and then retrain the model • Sharing the weights: • Using low-cost hash (possibly LSH) function to group weights into buckets • Adding an L0, L1 or L2 regularizer to the loss function in order to encourage weights to become exactly zero, then removing them
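Magnitude-based removal of connections can be sketched as follows. This is a simplified illustration on a flat list of weights, assuming a fixed target sparsity; `magnitude_prune` is a hypothetical helper, and the retraining step mentioned on the slide is not shown.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_idx = set(order[:n_prune])
    return [0.0 if i in pruned_idx else w for i, w in enumerate(weights)]


weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
pruned = magnitude_prune(weights, sparsity=0.5)
print(pruned)  # [0.8, 0.0, 0.3, 0.0, -0.6, 0.0]
```

The surviving weights can then be stored in a sparse format and fine-tuned to recover accuracy.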
  • 13. Combining methods S. Han, H. Mao, and W. J. Dally proposed: 1. Learning the connectivity via common network training 2. Pruning the small-weight connections 3. Quantization of the link weights using weight sharing 4. Applying Huffman coding to the quantized weights as well as the codebook 5. Retraining the network to learn the final weights for the remaining sparse connections
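Steps 3 and 4 of the pipeline above (weight sharing, then Huffman coding) can be sketched in plain Python. This is an illustration under stated assumptions rather than the authors' implementation: the weights are taken to be the non-zero survivors of pruning, the 1D k-means uses a linear centroid initialisation, and the helper names `kmeans_1d` and `huffman_code_lengths` are hypothetical.

```python
import heapq
from collections import Counter


def nearest(value, centroids):
    """Index of the centroid closest to `value`."""
    return min(range(len(centroids)), key=lambda j: abs(value - centroids[j]))


def kmeans_1d(values, k, iters=20):
    """Cluster scalar weights into k shared values (the codebook)."""
    lo, hi = min(values), max(values)
    # Linear initialisation over the value range (an assumption of this sketch)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iters):
        assign = [nearest(v, centroids) for v in values]
        for j in range(k):
            members = [v for v, a in zip(values, assign) if a == j]
            if members:
                centroids[j] = sum(members) / len(members)
    return centroids, [nearest(v, centroids) for v in values]


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code over `symbols`."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): 1}
    # Heap entries: (count, unique tie-breaker, {symbol: depth so far})
    heap = [(n, i, {s: 0}) for i, (s, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, d1 = heapq.heappop(heap)
        n2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]


# Non-zero weights remaining after pruning (made-up values)
weights = [1.0, 1.0, 0.9, 1.1, 1.0, 0.95, -0.5, -0.45, 0.2]
codebook, indices = kmeans_1d(weights, k=3)
lengths = huffman_code_lengths(indices)

huffman_bits = sum(lengths[i] for i in indices)
fixed_bits = 2 * len(indices)  # 2 bits per index for a 3-entry codebook
print(huffman_bits, fixed_bits)  # 12 18
```

Frequent cluster indices get shorter codes, so the index stream shrinks below the fixed-width encoding; the codebook itself is small and stored once.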
  • 14. Design Structure Matrix (DSM) Searching for relevant dependencies by rearranging the weights. • Impose the structure from the beginning • Circulant matrix • Adaptive Fastfood Transform: • R = SHGPHB, where S, G and B are random diagonal matrices, P is a random permutation matrix, and H is the Walsh-Hadamard matrix.
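The circulant case gives a feel for why imposed structure saves memory and computation: an n x n circulant matrix is fully described by n values, and its product with a vector reduces to FFTs. The sketch below assumes NumPy is available and defines the matrix by its first column; `circulant_matvec` is a hypothetical helper name.

```python
import numpy as np


def circulant_matvec(c, x):
    """Multiply the circulant matrix with first column `c` by `x` in O(n log n)."""
    # A circulant product is a circular convolution, which the FFT diagonalises
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))


rng = np.random.default_rng(0)
n = 8
c = rng.standard_normal(n)  # n parameters instead of n * n
x = rng.standard_normal(n)

# Dense reference: column j is `c` cyclically shifted down by j positions
C = np.stack([np.roll(c, j) for j in range(n)], axis=1)
print(np.allclose(C @ x, circulant_matvec(c, x)))
```

The dense matrix `C` is built here only to check the result; a compressed layer would keep `c` alone.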
  • 15. Low-rank Tensor Decomposition Producing/assuming 4D-Tensor decomposition in order to reduce memory occupation • Learning separable 1D filters • Compress 2D convolutional layers • Canonical Polyadic (CP) decomposition • Batch Normalization
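The idea behind separable 1D filters can be shown on a single 2D kernel: when the kernel has rank 1, filtering with it is equivalent to a horizontal 1D pass followed by a vertical 1D pass. The sketch below assumes NumPy, uses a Sobel-like kernel chosen only for illustration, and relies on a naive hypothetical helper `correlate2d_valid` rather than a library convolution.

```python
import numpy as np


def correlate2d_valid(image, kernel):
    """Naive 2D cross-correlation without padding ('valid' mode)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


rng = np.random.default_rng(0)
image = rng.standard_normal((6, 6))

v = np.array([1.0, 2.0, 1.0])    # vertical 1D filter
h = np.array([-1.0, 0.0, 1.0])   # horizontal 1D filter
kernel = np.outer(v, h)          # rank-1 3x3 kernel: 9 weights

full = correlate2d_valid(image, kernel)
# Same result with two 1D passes: 3 + 3 weights instead of 9
rows = correlate2d_valid(image, h[np.newaxis, :])
separable = correlate2d_valid(rows, v[:, np.newaxis])
print(np.allclose(full, separable))
```

For a k x k kernel the separable form needs 2k weights and 2k multiplications per output instead of k squared; kernels that are not exactly rank 1 are approximated by a sum of a few such terms.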
  • 16. Tensor Factorization Approximating tensors by means of low-rank reduced tensors • Singular Value Decomposition (SVD) • Non-negative Matrix Factorization • CUR
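For the SVD case, a minimal sketch of replacing a weight matrix by two thin factors is given below. It assumes NumPy and a synthetic, nearly rank-4 matrix invented for the example; `low_rank_factors` is a hypothetical helper, and real layers are usually less compressible than this toy matrix.

```python
import numpy as np


def low_rank_factors(W, rank):
    """Return A (m x rank) and B (rank x n) such that A @ B approximates W."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]  # fold the singular values into the left factor
    B = Vt[:rank, :]
    return A, B


rng = np.random.default_rng(0)
# Synthetic 64 x 32 matrix: rank 4 plus a small amount of noise (an assumption)
W = rng.standard_normal((64, 4)) @ rng.standard_normal((4, 32))
W += 0.01 * rng.standard_normal(W.shape)

A, B = low_rank_factors(W, rank=4)
rel_error = np.linalg.norm(W - A @ B) / np.linalg.norm(W)

params_before = W.size          # 64 * 32 = 2048
params_after = A.size + B.size  # 64 * 4 + 4 * 32 = 384
print(params_before, params_after, rel_error)
```

A dense layer `y = W x` then becomes two smaller layers, `y = A (B x)`, with fewer parameters and fewer multiplications.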
  • 17. Transforming Network Representation • Transferred/Compact Convolutional Filters • Knowledge Distillation
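Knowledge distillation trains a small student network to reproduce the outputs of a large teacher. The slide does not specify a loss, so the sketch below assumes the commonly used soft-target formulation: a cross-entropy between temperature-softened teacher and student distributions. The logits and the function names are made up for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperatures give softer distributions."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between softened teacher and student distributions."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))


teacher = [5.0, 2.0, 0.5]
close_student = [4.5, 2.2, 0.4]  # roughly agrees with the teacher
far_student = [0.5, 2.0, 5.0]    # ranks the classes in the opposite order

loss_close = distillation_loss(close_student, teacher)
loss_far = distillation_loss(far_student, teacher)
print(loss_close < loss_far)  # True
```

In practice this term is typically combined with the ordinary loss on the true labels; the weighting between the two is a design choice not covered here.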
  • 18. Conclusions Applications of AI involve complex neural networks. Today, training (and also inference) is mainly performed at Data Centers. There is an increasing interest in Edge AI. This requires handling the transmission and execution of models over limited resources. Research is addressing these problems through multiple approaches. MPEG will play a relevant role in the standardization of compressed nets.
  • 22. Luigi Troiano Artificial Intelligence, Data Science and Big Data Dept. of Engineering University of Sannio Benevento, Italy https://www.linkedin.com/in/luigitroiano/ THANK YOU