GradientDescent.pptx

Dhiraj7023

Gradient Descent Optimizer in Linear Regression.

Education

Gradient Descent
• Gradient descent is an optimization algorithm.
• Purpose: to find the parameter values that minimize the cost function.
• MSE or SSE is the usual cost function for linear regression.
• The end goal is to find the best-fit line.
• Used in both machine learning (ML) and deep learning (DL).

Best Fit Line
Among all the possible lines, one line minimizes the total squared error.
That line is the best-fit line.

What is an error/residual in linear regression
• A residual measures how far a point lies vertically from the regression line.
• Simply put, it is the difference between the observed actual value and the predicted value.
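
A minimal sketch of the residual definition above. The data points and the candidate line (m = 2, c = 1) are hypothetical values chosen only for illustration; they do not come from the slides.

```python
# Hypothetical data and line parameters, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 8.0, 9.0]
m, c = 2.0, 1.0  # candidate line: y_pred = m * x + c


def residuals(xs, ys, m, c):
    """Return observed y minus predicted y for each data point."""
    return [y - (m * x + c) for x, y in zip(xs, ys)]


res = residuals(xs, ys, m, c)
print(res)  # only the third point lies off the line
```

A positive residual means the point lies above the line, a negative one means it lies below.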

What is a loss function in linear regression
• In statistics and machine learning, a loss function quantifies the loss caused by the error in a prediction. For a single data point, the squared-error loss is:
Loss function = (y_actual − y_pred)^2

Types of loss function
• Mean Absolute Error (MAE).
• Mean Absolute Percentage Error (MAPE).
• Mean Squared Error (MSE).
• Root Mean Squared Error (RMSE).
• Huber Loss.
• Log-Cosh Loss.
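
As a rough illustration of how some of the losses listed above differ, the sketch below computes MAE, MSE, RMSE, and Huber loss on the same predictions. The sample values and the Huber threshold `delta=1.0` are assumptions made for this example.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean Squared Error."""
    return sum((a - p) ** 2 for a, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Squared Error."""
    return math.sqrt(mse(y_true, y_pred))


def huber(y_true, y_pred, delta=1.0):
    """Huber loss: quadratic for small errors, linear for large ones."""
    total = 0.0
    for a, p in zip(y_true, y_pred):
        err = abs(a - p)
        if err <= delta:
            total += 0.5 * err ** 2
        else:
            total += delta * (err - 0.5 * delta)
    return total / len(y_true)


# Hypothetical values: two exact predictions, one small miss, one large miss.
y_true = [3.0, 5.0, 8.0, 9.0]
y_pred = [3.0, 5.0, 7.0, 12.0]

print("MAE  :", mae(y_true, y_pred))
print("MSE  :", mse(y_true, y_pred))
print("RMSE :", rmse(y_true, y_pred))
print("Huber:", huber(y_true, y_pred))
```

The large miss dominates MSE because it is squared, while MAE and Huber weight it linearly.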

What is a cost function in linear regression
• A cost function measures how well a machine learning model performs, typically by averaging the loss over all data points.
• Formula for cost function:
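
Assuming the cost is the MSE named earlier in the slides, and writing the line as y_pred = m * x + c, the cost can be sketched as a function of the two parameters. The function name `mse_cost` and the sample data are hypothetical.

```python
def mse_cost(xs, ys, m, c):
    """MSE cost of the line y = m * x + c over all data points (assumed form)."""
    n = len(xs)
    return sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys)) / n


# Hypothetical data lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

print(mse_cost(xs, ys, 2.0, 1.0))  # cost of the true line
print(mse_cost(xs, ys, 0.0, 0.0))  # cost of a poor starting guess
```

The cost is zero only for the line that passes through every point; any other choice of m and c gives a larger value.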

Relation Between Variance And Error or Residual
Equation of variance :
Equation of cost function :

Two types of variance
1. Stochastic variance (noise):
- Unexplained error

2. Deterministic variance:
- Explained error

Partial Differentiation
1. The process of finding the partial derivative of a function is called partial differentiation.
2. It gives the slope of a tangent line to the graph of the function along one variable while the other variables are held fixed.

Partial Differentiation
1. The partial derivative with respect to m or c tells how much the error increases or decreases for a small change in that parameter, in either direction.
2. The partial derivatives guide us to the point (m, c) that defines the best-fit line (BFL).
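
A sketch of these partial derivatives, assuming the MSE cost for the line y_pred = m * x + c. The analytic slopes are compared with a finite-difference estimate, which is the "small change in m or c" idea made literal. Function names and data are hypothetical.

```python
def mse_cost(xs, ys, m, c):
    """MSE cost of the line y = m * x + c (assumed cost function)."""
    n = len(xs)
    return sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys)) / n


def mse_gradients(xs, ys, m, c):
    """Analytic partial derivatives of the MSE cost with respect to m and c."""
    n = len(xs)
    errors = [y - (m * x + c) for x, y in zip(xs, ys)]
    d_m = (-2.0 / n) * sum(x * e for x, e in zip(xs, errors))
    d_c = (-2.0 / n) * sum(errors)
    return d_m, d_c


# Hypothetical data lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

d_m, d_c = mse_gradients(xs, ys, 0.0, 0.0)

# Nudge m (or c) slightly in both directions and measure the change in cost.
h = 1e-6
approx_d_m = (mse_cost(xs, ys, h, 0.0) - mse_cost(xs, ys, -h, 0.0)) / (2 * h)
approx_d_c = (mse_cost(xs, ys, 0.0, h) - mse_cost(xs, ys, 0.0, -h)) / (2 * h)

print(d_m, approx_d_m)
print(d_c, approx_d_c)
```

Both slopes are negative at (m, c) = (0, 0), so increasing either parameter lowers the cost from that starting point.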

Learning Rate
The learning rate controls the step size: how far the parameters move in the downhill direction of the gradient at each step of gradient descent.

Learning Rate
1. Too high a value makes the path unstable; too low a value makes convergence slow.
2. If the learning rate is zero, there is no movement.
• C(new) = C(old) − [η * (slope)]

Learning Rate
• Example:
C(old) = -10
Slope = -30
Without a learning rate:
C(new) = C(old) − slope
= -10 − (-30)
= 20
To shrink the step, scale the slope by the learning rate (η = 0.1):
C(new) = C(old) − (η * slope)
= -10 − (0.1 * -30)
= -7
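
The arithmetic above can be checked with a few lines of code. The helper name `update` is hypothetical; the numbers are the ones from the example, plus the zero-learning-rate case mentioned earlier.

```python
def update(c_old, slope, eta):
    """One gradient descent step: C(new) = C(old) - eta * slope."""
    return c_old - eta * slope


c_old, slope = -10.0, -30.0

no_rate = update(c_old, slope, eta=1.0)    # same as subtracting the raw slope
with_rate = update(c_old, slope, eta=0.1)  # step scaled down by the learning rate
zero_rate = update(c_old, slope, eta=0.0)  # no movement at all

print(no_rate, with_rate, zero_rate)
```

With the learning rate of 0.1 the parameter moves 3 units instead of 30, in the same direction.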

Gradient Descent in DL
Learning rate: 0.01, Data points: 300, Epochs: 170

Gradient Descent in DL
Learning rate: 0.02, Data points: 300, Epochs: 40
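
To see how the learning rate changes the number of epochs, here is a minimal sketch of gradient descent on a straight-line fit with 300 points. It assumes an MSE cost, noise-free synthetic data on y = 2x + 1, a start at m = c = 0, and a stopping rule based on the gradient size. All of these are assumptions, so the epoch counts it prints need not match the figures on the slides.

```python
def gradient_descent(xs, ys, eta, tol=1e-3, max_epochs=10_000):
    """Fit y = m * x + c by gradient descent on the MSE cost.

    Stops once both partial derivatives are smaller than tol in magnitude.
    Returns (m, c, epoch at which the stopping rule was met).
    """
    m, c = 0.0, 0.0
    n = len(xs)
    for epoch in range(1, max_epochs + 1):
        errors = [y - (m * x + c) for x, y in zip(xs, ys)]
        d_m = (-2.0 / n) * sum(x * e for x, e in zip(xs, errors))
        d_c = (-2.0 / n) * sum(errors)
        if max(abs(d_m), abs(d_c)) < tol:
            return m, c, epoch
        m -= eta * d_m  # m(new) = m(old) - eta * slope
        c -= eta * d_c  # c(new) = c(old) - eta * slope
    return m, c, max_epochs


# Hypothetical data: 300 evenly spaced points on the line y = 2x + 1.
xs = [-1.0 + 2.0 * i / 299 for i in range(300)]
ys = [2.0 * x + 1.0 for x in xs]

results = {eta: gradient_descent(xs, ys, eta) for eta in (0.01, 0.02)}
for eta, (m, c, epochs) in results.items():
    print(f"eta={eta}: m={m:.3f}, c={c:.3f}, epochs={epochs}")
```

On this well-behaved problem the larger learning rate reaches the stopping rule in fewer epochs; a much larger value would eventually make the updates overshoot and diverge.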
