3. LINEAR REGRESSION
ŷ = θ₀ + θ₁x₁ + θ₂x₂ + θ₃x₃ + ⋯ + θₙxₙ
• ŷ : predicted value
• n : number of features
• xᵢ : value of the i-th feature
• θⱼ : j-th model parameter (θ₀ is the bias term)
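The linear model above is just a weighted sum of the features plus a bias term. A minimal sketch, using made-up parameter and feature values chosen only for illustration:

```python
import numpy as np

# Hypothetical values: θ0 (bias) = 2.0, θ1 = 0.5, θ2 = -1.0.
# By convention x0 = 1 so the bias can be folded into the dot product θᵀx.
theta = np.array([2.0, 0.5, -1.0])
x = np.array([1.0, 3.0, 2.0])   # x0 = 1 (bias term), x1 = 3, x2 = 2

y_pred = theta @ x              # θ0 + θ1*x1 + θ2*x2 = 2.0 + 1.5 - 2.0
print(y_pred)                   # 1.5
```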
MSE(X, h_θ) = (1/m) Σᵢ₌₁ᵐ (θᵀx⁽ⁱ⁾ − y⁽ⁱ⁾)²
• Mean Squared Error is the cost function: the average of the squared
differences between the predicted values (θᵀx⁽ⁱ⁾) and the actual values (y⁽ⁱ⁾)
over the m training instances.
• The closer the predictions are to the actual values, the smaller the MSE.
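The MSE formula above can be written directly as a few lines of NumPy; the toy data below is assumed purely for illustration:

```python
import numpy as np

def mse(X, theta, y):
    """MSE(X, h_θ) = (1/m) * sum((θᵀx⁽ⁱ⁾ - y⁽ⁱ⁾)²)."""
    m = len(y)
    errors = X @ theta - y        # predicted minus actual, per instance
    return (errors ** 2).sum() / m

# Toy data (assumed): first column of ones is the bias feature x0.
X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
theta = np.array([0.0, 1.0])      # this θ predicts ŷ = x1
y = np.array([1.0, 2.0, 4.0])
print(mse(X, theta, y))           # errors 0, 0, -1 → (0 + 0 + 1)/3
```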
4. GRADIENT DESCENT AND ITS TYPES
• An iterative method to find a local optimum (maximum or minimum)
• Adjusts the parameters repeatedly to minimize the cost function
• The gradient points in the direction of greatest increase, so each step
moves in the opposite direction
• The size of each step is controlled by the learning rate
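The steps above can be sketched for linear regression with MSE as the cost function. The toy data, learning rate, and iteration count are assumptions for illustration:

```python
import numpy as np

# Toy data (assumed): y ≈ 4 + 3*x1 plus a little noise.
rng = np.random.default_rng(0)
m = 100
X = np.c_[np.ones(m), rng.uniform(0, 2, m)]   # bias column + one feature
y = 4 + 3 * X[:, 1] + rng.normal(0, 0.1, m)

eta = 0.1                                     # learning rate (assumed)
theta = np.zeros(2)
for _ in range(1000):
    # Gradient of MSE with respect to θ: (2/m) Xᵀ(Xθ − y)
    gradients = 2 / m * X.T @ (X @ theta - y)
    theta -= eta * gradients                  # step opposite the gradient
print(theta)                                  # ≈ [4, 3]
```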
Figure: Gradient Descent pitfalls (cost curve with a local minimum, a global minimum, and a plateau)
Figure: Learning Rate (cost curve descending from the starting point)
TYPES
• Batch Gradient Descent
• Mini-Batch Gradient Descent
• Stochastic Gradient Descent
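The three types differ only in how many training instances are used to compute each gradient step. A sketch under assumed toy data, where one helper covers all three cases via the batch size:

```python
import numpy as np

#   batch_size = m      -> Batch Gradient Descent (full training set)
#   1 < batch_size < m  -> Mini-Batch Gradient Descent
#   batch_size = 1      -> Stochastic Gradient Descent
rng = np.random.default_rng(1)
m = 200
X = np.c_[np.ones(m), rng.uniform(0, 2, m)]   # toy data (assumed)
y = 4 + 3 * X[:, 1] + rng.normal(0, 0.1, m)

def gd(batch_size, eta=0.1, epochs=500):
    theta = np.zeros(2)
    for _ in range(epochs):
        idx = rng.permutation(m)              # shuffle each epoch
        for start in range(0, m, batch_size):
            b = idx[start:start + batch_size]
            grad = 2 / len(b) * X[b].T @ (X[b] @ theta - y[b])
            theta -= eta * grad
    return theta

theta_batch = gd(batch_size=m)                # smooth, costly per step
theta_mini = gd(batch_size=32, epochs=100)    # compromise between the two
theta_sgd = gd(batch_size=1, epochs=50)       # noisy but cheap updates
print(theta_batch, theta_mini, theta_sgd)     # all near [4, 3]
```

Stochastic Gradient Descent never settles exactly at the minimum with a fixed learning rate; it bounces around it, which is why its estimate is the noisiest of the three.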
6. LEARNING CURVE
A learning curve shows how model performance changes as the training set
size varies: train the model several times on increasingly large subsets
of the training set, and record the error at each size.
Figure: Learning Curve
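The subset-and-retrain procedure above can be sketched as follows; the toy data and the split sizes are assumptions, and the model is fit in closed form with least squares rather than gradient descent, purely to keep the sketch short:

```python
import numpy as np

# Toy linear data (assumed): y ≈ 4 + 3*x1 with noticeable noise.
rng = np.random.default_rng(2)
m = 100
X = np.c_[np.ones(m), rng.uniform(0, 2, m)]
y = 4 + 3 * X[:, 1] + rng.normal(0, 0.5, m)

X_train, y_train = X[:80], y[:80]      # training set
X_val, y_val = X[80:], y[80:]          # held-out validation set

def mse(X, theta, y):
    return np.mean((X @ theta - y) ** 2)

# Train on growing subsets; print size, training MSE, validation MSE.
for size in (5, 10, 20, 40, 80):
    Xs, ys = X_train[:size], y_train[:size]
    theta, *_ = np.linalg.lstsq(Xs, ys, rcond=None)   # fit on the subset
    print(size, round(mse(Xs, theta, ys), 3),
          round(mse(X_val, theta, y_val), 3))
```

Plotting training and validation error against subset size gives the learning curve: validation error typically falls as more data is used, while training error rises toward the noise level.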