Deep Learning: AI Breakthrough
Mohsen Fayyaz
Sensifai
Tehran University – 15 Dey 1395 (4 Jan 2017)
Video Processing and Deep Learning
What is Video?
• Batches of Frames
• Can we process video as batches of frames?
Motion cannot be inferred from a single frame
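A minimal sketch of why at least two frames are needed: the simplest motion cue is the difference between consecutive frames. The helper name `frame_difference` and the toy 3x3 frames are illustrative assumptions, not part of the original slides.

```python
def frame_difference(prev_frame, next_frame):
    """Per-pixel absolute difference between two equally sized grayscale frames."""
    return [
        [abs(b - a) for a, b in zip(row_prev, row_next)]
        for row_prev, row_next in zip(prev_frame, next_frame)
    ]

# Toy example: a bright pixel moves one step to the right.
frame_t0 = [[0, 0, 0],
            [1, 0, 0],
            [0, 0, 0]]
frame_t1 = [[0, 0, 0],
            [0, 1, 0],
            [0, 0, 0]]

motion = frame_difference(frame_t0, frame_t1)  # non-zero where something changed
static = frame_difference(frame_t0, frame_t0)  # all zeros: no motion signal
```

Either frame alone looks like "one bright pixel"; only the pair reveals that it moved.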
Why do we need video processing?
• Self-Driving Cars: Video Semantic Segmentation (Feature Space Optimization for Semantic Video Segmentation, Kundu et al., 2016)
• Robots: Action Recognition (Simonyan et al., 2014)
• Google, YouTube, Aparat: Video Tagging (DenseCap, Johnson et al., 2016; image captioning)
• Network Video Broadcasting: Frame Prediction (Patraucean et al., 2016)
From Images to Video
• Image → CNN → Extracted Features
• Video (Frames) → ? → Extracted Features
From Images to Video
• Frames → CNN → LSTM → Extracted Spatio-Temporal Features
Donahue et al., 2015
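The CNN + LSTM pipeline can be sketched in a few lines, under heavy simplifications: a stand-in "CNN" that reduces each frame to its mean intensity, and a scalar LSTM cell whose gates all share the same toy weights. The names `frame_feature`, `lstm_step`, and `encode_clip` and the weights are hypothetical; this is not the architecture from the cited paper, only the idea of per-frame features fed into a recurrent unit.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def frame_feature(frame):
    """Stand-in for a CNN: mean pixel intensity of one grayscale frame."""
    pixels = [p for row in frame for p in row]
    return sum(pixels) / len(pixels)

def lstm_step(x, h, c, w=1.0, u=1.0, b=0.0):
    """One step of a scalar LSTM cell; all gates share the toy weights w, u, b."""
    pre = w * x + u * h + b
    i = sigmoid(pre)        # input gate
    f = sigmoid(pre)        # forget gate
    o = sigmoid(pre)        # output gate
    g = math.tanh(pre)      # candidate cell update
    c_new = f * c + i * g
    h_new = o * math.tanh(c_new)
    return h_new, c_new

def encode_clip(frames):
    """Run the per-frame features through the LSTM and return the final hidden state."""
    h, c = 0.0, 0.0
    for frame in frames:
        h, c = lstm_step(frame_feature(frame), h, c)
    return h

dark = [[0.0, 0.0], [0.0, 0.0]]
bright = [[1.0, 1.0], [1.0, 1.0]]
h_brightening = encode_clip([dark, bright])
h_darkening = encode_clip([bright, dark])
```

The two clips contain the same frames in a different order, yet the final hidden states differ, which is the temporal sensitivity that a per-frame CNN alone does not provide.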
What if we want regional features?
From Images to Video - STFCN
• Frames → CNN → Convolutional LSTM → Extracted Regional Spatio-Temporal Features
Fayyaz et al., 2016
From Images to Video – C3D
• Frames → 3D CNN → Extracted Regional Spatio-Temporal Features
Tran et al., 2015
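To make the 3D CNN idea concrete, here is a minimal sketch of a single 3D convolution (implemented as cross-correlation, no padding, stride 1) over a clip indexed as `clip[t][y][x]`. The function name `conv3d_valid` and the 2x1x1 temporal-difference kernel are illustrative assumptions; a real 3D CNN learns many larger kernels.

```python
def conv3d_valid(clip, kernel):
    """Valid 3D cross-correlation of clip[t][y][x] with kernel[dt][dy][dx]."""
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                acc = 0.0
                # The kernel spans time as well as space.
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += clip[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out

# Hypothetical kernel: next frame minus current frame at each pixel.
temporal_kernel = [[[-1.0]], [[1.0]]]

moving_clip = [
    [[1, 0], [0, 0]],
    [[0, 1], [0, 0]],
    [[0, 1], [0, 0]],
]
static_clip = [[[1, 2], [3, 4]]] * 3

moving_response = conv3d_valid(moving_clip, temporal_kernel)
static_response = conv3d_valid(static_clip, temporal_kernel)
```

Because the kernel extends along the time axis, the output keeps a spatial layout (regional features) while responding to change over time; the static clip yields zeros everywhere.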
Now that we have the appropriate tools,
let’s look at some real-world applications
Video Semantic Segmentation - STFCN
Fayyaz et al., 2016
Video Semantic Segmentation – C3D
Tran et al., 2015
Action Recognition & Video Classification
Simonyan et al., 2014
Does video have visual data only?
Action Recognition & Video Classification
• Audio + Vision
Wu et al., 2015
Let’s briefly take a look at some state-of-the-art image-based networks
Extremely Deep Networks
Residual Networks
• Problem: Gradients vanish in back-propagation
• Solution: Let’s make a shortcut for them!
• $Y = H(X, W_H) \;\rightarrow\; Y = H(X, W_H) + X$
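A minimal sketch of the shortcut, assuming H is a single linear layer followed by ReLU (the slides do not specify H). The helper names `linear`, `plain_block`, and `residual_block` are hypothetical.

```python
def relu(v):
    return [max(0.0, x) for x in v]

def linear(W, v):
    """Matrix-vector product, W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def plain_block(x, W_H):
    """Y = H(X, W_H)"""
    return relu(linear(W_H, x))

def residual_block(x, W_H):
    """Y = H(X, W_H) + X"""
    h = plain_block(x, W_H)
    return [hi + xi for hi, xi in zip(h, x)]

x = [1.0, -2.0]
zero_weights = [[0.0, 0.0], [0.0, 0.0]]

plain_out = plain_block(x, zero_weights)        # the signal is lost
residual_out = residual_block(x, zero_weights)  # the input passes through the shortcut
```

Even when the learned transformation contributes nothing, the shortcut still carries the input forward, and in back-propagation the same identity path gives gradients a direct route to earlier layers.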
Extremely Deep Networks
Highway Networks
• Similar to ResNets
• The shortcuts are controlled by a learnable gate $T$, giving a better trade-off between the transformed output $H$ and the unchanged input $X$
• $Y = H(X, W_H) \cdot T(X, W_T) + X \cdot (1 - T(X, W_T))$
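The gating formula can be sketched as follows, assuming H is a linear layer with tanh, T is a linear layer with a sigmoid, and T has a bias term `b_T`. These choices and the name `highway_layer` are assumptions made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def linear(W, v):
    """Matrix-vector product, W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def highway_layer(x, W_H, W_T, b_T):
    """Y = H(X, W_H) * T(X, W_T) + X * (1 - T(X, W_T)), applied element-wise."""
    h = [math.tanh(v) for v in linear(W_H, x)]
    t = [sigmoid(v + b) for v, b in zip(linear(W_T, x), b_T)]
    return [hi * ti + xi * (1.0 - ti) for hi, ti, xi in zip(h, t, x)]

x = [0.5, -1.0]
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]

# Gate almost closed (T close to 0): the layer carries the input through.
y_carry = highway_layer(x, identity, zeros, [-50.0, -50.0])
# Gate almost open (T close to 1): the layer outputs the transformation H.
y_transform = highway_layer(x, identity, zeros, [50.0, 50.0])
```

With the gate near 0 the layer behaves like a pure shortcut; with the gate near 1 it behaves like an ordinary layer. Training chooses a point in between for each unit.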
Extremely Deep Networks
DenseNets
• If ResNets work by connecting just the previous layer, why not connect all of them?!
• $Y = F(X_n, X_{n-1}, \ldots, X_0)$
• Improvements in both the forward and backward passes
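A minimal sketch of dense connectivity: each layer receives the concatenation of all earlier outputs. The function name `dense_block`, the concatenation order, and the toy "layer" that simply sums its inputs are assumptions for illustration.

```python
def dense_block(x0, layers):
    """Apply layers so that each one sees the outputs of all previous ones."""
    features = [x0]
    for layer in layers:
        # Concatenate X_0 ... X_n into one input vector.
        concatenated = [v for f in features for v in f]
        features.append(layer(concatenated))
    return features

input_widths = []

def toy_layer(inputs):
    """Stand-in for a learned layer: records its input width and returns one sum feature."""
    input_widths.append(len(inputs))
    return [sum(inputs)]

features = dense_block([1.0, 2.0], [toy_layer, toy_layer, toy_layer])
```

The recorded input widths grow with depth, which shows every layer has direct access to all earlier features, and hence a short path for gradients in the backward pass.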
Now what if we use the idea of propagating
data and gradients between shallow and
deep layers in video-based networks?
Up to here, everything was supervised
But there is a lot of data across the
Internet with weak labels …
Let’s go through weakly supervised
methods
Weakly Supervised Learning
Weakly Supervised Learning with CNNs
• Multiple labeling
• Weak localization
• Data can be crawled from the Internet
• Can be adapted to video
Oquab et al., 2015
How about some Unsupervised methods …
Unsupervised Learning
Anticipating Visual Representations From Unlabeled Video
• Training on huge amounts of unlabeled video from across the Internet
• Training classifiers on the final output
Vondrick et al., 2016
Practical considerations
What Hardware do I use?
• NVIDIA GPU + SSD + HDD
• More info on:
http://www.DeepLearning.ir
What framework do I use?
Caffe
Torch
TensorFlow
Theano
Keras
Microsoft CNTK
Deeplearning4j
…
What framework do I use?
• Comparison of TensorFlow, Torch, and Theano
From Karpathy’s slides
Distributed Training:
To be covered in my next presentation
at Sharif University of Technology
on 22 Dey 1395 (11 Jan 2017)
From Karpathy’s slides
Thank You
Fayyaz@Sensifai.com

