CS380/CS580 Artificial Intelligence for Games
EFFICIENT BACKPROP
Yann LeCun
Léon Bottou
Genevieve B. Orr
Klaus-Robert Müller
Overview
A. Introduction
B. A Few Practical Tricks
C. Multiple Dimension Gradient
D. Optimization Methods
Making a neural network work is more of an art than a science.
Choices:
Number of nodes
Number of layers
Activation function
Learning rate
And so on
▐ Introduction
Trick 1: Stochastic versus Batch learning
[Figure: side-by-side comparison of stochastic and batch learning]
▐ A Few Practical Tricks
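The difference between the two update schedules can be sketched on a toy problem. This is only an illustration under stated assumptions: the one-parameter model y = w * x, the data set, the learning rate and all function names are hypothetical and do not come from the slides.

```python
import random

# Toy data for y = 3 * x (hypothetical; not from the slides).
data = [(x / 10.0, 3.0 * x / 10.0) for x in range(1, 11)]

def grad(w, x, y):
    """Gradient of 0.5 * (w*x - y)**2 with respect to w."""
    return (w * x - y) * x

def batch_epoch(w, lr):
    """Batch learning: average the gradient over the whole set, then step once."""
    g = sum(grad(w, x, y) for x, y in data) / len(data)
    return w - lr * g

def stochastic_epoch(w, lr, rng):
    """Stochastic learning: step after every single example, in random order."""
    for x, y in rng.sample(data, len(data)):
        w -= lr * grad(w, x, y)
    return w

rng = random.Random(0)
w_batch = w_sgd = 0.0
for _ in range(20):
    w_batch = batch_epoch(w_batch, lr=0.5)
    w_sgd = stochastic_epoch(w_sgd, lr=0.5, rng=rng)
```

In this toy run both estimates approach 3, and the stochastic version gets closer in the same number of passes because it makes ten updates per pass instead of one. That is a property of this example, not a general guarantee.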
Trick 2: Shuffling the Examples
▐ A Few Practical Tricks
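One common way to apply this trick is to reshuffle the training set before every pass, so that consecutive examples rarely come from the same class. The sketch below assumes that reading of the slide; the helper name `shuffled_epochs` and the class-sorted toy data are hypothetical.

```python
import random

def shuffled_epochs(examples, num_epochs, seed=0):
    """Yield the training set in a fresh random order for every epoch."""
    rng = random.Random(seed)
    for _ in range(num_epochs):
        order = list(examples)  # copy so the caller's list is left untouched
        rng.shuffle(order)
        yield order

# Hypothetical class-sorted data: all "a" examples first, then all "b".
examples = [("a", i) for i in range(4)] + [("b", i) for i in range(4)]
epochs = list(shuffled_epochs(examples, num_epochs=3))
```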
Trick 3: Normalizing the Inputs
▐ A Few Practical Tricks
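A minimal sketch of input normalization, assuming the slide means shifting each input variable to zero mean and scaling it to a common variance over the training set. The function name and the sample rows are hypothetical, and decorrelating the inputs is not shown.

```python
import math

def normalize_columns(rows):
    """Shift every input column to zero mean and scale it to unit variance."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

# Two inputs on very different scales (hypothetical values).
rows = [[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]]
normalized = normalize_columns(rows)
```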
Trick 4: The Sigmoid
▐ A Few Practical Tricks
Trick:
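The formula that followed "Trick:" on the slide was an image and is not in this transcript. The paper these slides summarize recommends a symmetric sigmoid, f(x) = 1.7159 tanh(2x/3); the sketch below assumes that this is what the slide showed. The function names are hypothetical.

```python
import math

def scaled_tanh(x):
    """Symmetric sigmoid f(x) = 1.7159 * tanh(2x / 3)."""
    return 1.7159 * math.tanh(2.0 * x / 3.0)

def scaled_tanh_derivative(x):
    """Derivative f'(x) = 1.7159 * (2/3) * (1 - tanh(2x/3)**2)."""
    t = math.tanh(2.0 * x / 3.0)
    return 1.7159 * (2.0 / 3.0) * (1.0 - t * t)

values = [scaled_tanh(x) for x in (-1.0, 0.0, 1.0)]
```

With these constants the function is odd and maps inputs of -1 and +1 to outputs very close to -1 and +1.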
Trick 5: Initializing the Weights
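Weights very large or very small → small gradients → slow learning
Weights should be chosen so that the sigmoid operates in its linear region
Advantages:
(1) Gradients will be large enough
(2) The network learns the linear part of the mapping first, which is easier
Trick: Initialize the weights from a zero-mean distribution with standard deviation m^(-1/2), where m = number of inputs to the unit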
▐ A Few Practical Tricks
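A minimal sketch of the initialization given in the paper these slides summarize: zero-mean weights with standard deviation m^(-1/2), where m is the number of inputs to a unit. The Gaussian distribution, the function name and the layer sizes are assumptions made for illustration.

```python
import math
import random

def init_weights(fan_in, fan_out, seed=0):
    """Draw a fan_out x fan_in weight matrix with mean 0 and std 1/sqrt(fan_in)."""
    rng = random.Random(seed)
    sigma = 1.0 / math.sqrt(fan_in)  # m^(-1/2), with m = fan_in
    return [[rng.gauss(0.0, sigma) for _ in range(fan_in)]
            for _ in range(fan_out)]

weights = init_weights(fan_in=100, fan_out=50)
```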
Trick 6: Choosing Learning Rates
Approach 1: Adjust the learning rate according to the behaviour of the weight vector.
Problem: Cannot be applied to stochastic or online learning, where the weight vector fluctuates all the time.
Approach 2: Maintain a separate learning rate for each element of the weight vector.
Calculate the 2nd derivative
Make sure that all weights converge at the same speed.
Trick: Learning rates should be proportional to the square root of the number of connections sharing that weight.
▐ A Few Practical Tricks
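Approach 2 can be sketched on a separable quadratic error, where the second derivative along each weight is known. This is an illustration under assumptions, not the slide's method: the curvatures in `h`, the damping term `mu` and the rule "rate = 1 / (second derivative + mu)" are chosen here to show how per-weight rates equalize convergence speed.

```python
# Hypothetical separable error E(w) = 0.5 * sum(h[i] * w[i]**2); h holds the
# second derivatives (curvatures) of E along each weight.
h = [10.0, 1.0, 0.1]

def descend(w, rates, steps):
    """Gradient descent where weight i uses its own learning rate rates[i]."""
    for _ in range(steps):
        w = [wi - r * hi * wi for wi, r, hi in zip(w, rates, h)]
    return w

w0 = [1.0, 1.0, 1.0]
mu = 0.01  # small damping so a near-zero curvature cannot blow up the rate

global_rate = [1.0 / max(h)] * len(h)       # one rate, limited by the steepest direction
per_weight = [1.0 / (hi + mu) for hi in h]  # one rate per weight

w_global = descend(w0, global_rate, steps=10)
w_scaled = descend(w0, per_weight, steps=10)
```

After ten steps the single global rate has barely moved the weight with the smallest curvature, while the per-weight rates bring all three weights close to the minimum.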
Learning rate affects the convergence
▐ Single Dimension Gradient
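The effect can be reproduced on a one-dimensional quadratic error, a sketch under assumptions: the error E(w) = 0.5 * H * w**2, the curvature H = 2 and the function name `run` are hypothetical. For such an error the optimal learning rate is the inverse of the second derivative, and rates above twice that value diverge.

```python
H = 2.0  # second derivative of the hypothetical error E(w) = 0.5 * H * w**2

def run(lr, w=1.0, steps=5):
    """Gradient descent on E, returning every weight value visited."""
    path = [w]
    for _ in range(steps):
        w = w - lr * H * w  # dE/dw = H * w
        path.append(w)
    return path

lr_opt = 1.0 / H
slow = run(0.5 * lr_opt)         # approaches the minimum without overshooting
exact = run(lr_opt)              # reaches the minimum in a single step
oscillating = run(1.5 * lr_opt)  # overshoots each step but still converges
diverging = run(2.5 * lr_opt)    # overshoots by more each step
```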
Taylor series:
▐ Single Dimension Gradient
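The expansion itself was an image on the slide. A reconstruction following the standard one-dimensional argument, assuming W_c denotes the current weight and W_min the minimum (symbol names are assumptions), would read as follows. The second line is exact when E is quadratic.

```latex
\begin{align}
E(W) &= E(W_c) + (W - W_c)\,\frac{dE(W_c)}{dW}
        + \frac{1}{2}(W - W_c)^2\,\frac{d^2E(W_c)}{dW^2} + \dots \\
\frac{dE(W)}{dW} &= \frac{dE(W_c)}{dW} + (W - W_c)\,\frac{d^2E(W_c)}{dW^2} \\
W_{\min} &= W_c - \left(\frac{d^2E(W_c)}{dW^2}\right)^{-1}\frac{dE(W_c)}{dW},
\qquad \eta_{\mathrm{opt}} = \left(\frac{d^2E(W_c)}{dW^2}\right)^{-1}
\end{align}
```

Setting the derivative to zero at W = W_min gives the last line: the step that lands on the minimum uses a learning rate equal to the inverse of the second derivative.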
▐ Multiple Dimension Gradient
Hessian: a measure of the curvature of the error E in multiple dimensions.
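For a quadratic error the Hessian's eigenvalues are the curvatures along its principal directions, and the largest one limits the learning rate of plain gradient descent. A minimal sketch for a two-weight problem, assuming a hypothetical Hessian [[4, 1], [1, 1]] and a hypothetical helper name:

```python
import math

def hessian_eigenvalues(a, b, c):
    """Eigenvalues of the symmetric 2x2 Hessian [[a, b], [b, c]], smallest first."""
    mean = 0.5 * (a + c)
    radius = math.sqrt((0.5 * (a - c)) ** 2 + b * b)
    return mean - radius, mean + radius

# Hypothetical quadratic error E(w) = 0.5 * w^T H w with H = [[4, 1], [1, 1]].
lam_min, lam_max = hessian_eigenvalues(4.0, 1.0, 1.0)
max_stable_lr = 2.0 / lam_max          # gradient descent diverges above this rate
condition_number = lam_max / lam_min   # large values mean elongated error contours
```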
▐ Second Order Optimization Methods
Newton Algorithm
The whitening transform, well known in signal processing, can convert
the ellipsoidal contours of the error surface into spherical ones
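A Newton update multiplies the gradient by the inverse Hessian, w_new = w - H^(-1) g, and on a quadratic error it reaches the minimum in one step. The sketch below assumes a two-weight quadratic; the Hessian `H`, the minimum `w_min` and the function names are hypothetical.

```python
def newton_step(w, grad, hess):
    """One Newton update w - H^{-1} g for a 2-D weight vector."""
    (a, b), (c, d) = hess
    det = a * d - b * c
    # Closed-form inverse of a 2x2 matrix applied to the gradient.
    step = [(d * grad[0] - b * grad[1]) / det,
            (-c * grad[0] + a * grad[1]) / det]
    return [w[0] - step[0], w[1] - step[1]]

H = [[4.0, 1.0], [1.0, 1.0]]  # hypothetical Hessian
w_min = [1.0, -2.0]           # hypothetical minimum of the quadratic

def gradient(w):
    """Gradient of E(w) = 0.5 * (w - w_min)^T H (w - w_min)."""
    d0, d1 = w[0] - w_min[0], w[1] - w_min[1]
    return [H[0][0] * d0 + H[0][1] * d1, H[1][0] * d0 + H[1][1] * d1]

w_start = [5.0, 5.0]
w_next = newton_step(w_start, gradient(w_start), H)
```

Inverting the Hessian costs far more than a gradient step once the number of weights grows, which is why the sketch is limited to two weights.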
▐ Second Order Optimization Methods
Conjugate Gradient
Minimizes the error along a line (line search).
(1) Does not use the Hessian explicitly
(2) It is an O(N) method
(3) Works only for batch training
(4) Along a conjugate direction, the gradient does not change
its direction, only its length
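A minimal sketch of conjugate gradient on a quadratic error, where the line minimum along each direction has a closed form. Assumptions: the matrix `A` and vector `b` define a hypothetical toy problem, and the Fletcher-Reeves formula is one of several choices for the direction update. On a general error function the closed-form step would be replaced by a line search, so the matrix would never be needed explicitly.

```python
def matvec(A, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def conjugate_gradient(A, b, w, steps):
    """Minimize E(w) = 0.5 * w^T A w - b^T w for symmetric positive-definite A."""
    r = [bi - ci for bi, ci in zip(b, matvec(A, w))]  # negative gradient
    d = list(r)                                       # first direction: steepest descent
    for _ in range(steps):
        rr = dot(r, r)
        if rr == 0.0:
            break                                     # already at the minimum
        Ad = matvec(A, d)
        alpha = rr / dot(d, Ad)                       # exact line minimum along d
        w = [wi + alpha * di for wi, di in zip(w, d)]
        r = [ri - alpha * adi for ri, adi in zip(r, Ad)]
        beta = dot(r, r) / rr                         # Fletcher-Reeves coefficient
        d = [ri + beta * di for ri, di in zip(r, d)]  # next conjugate direction
    return w

A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
w_cg = conjugate_gradient(A, b, [0.0, 0.0], steps=2)
```

For an N-dimensional quadratic the method needs at most N such steps, here two.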
Thank you
Questions/Comments
