Machine learning models involve a bias-variance tradeoff: a model that is too simple tends to underfit (high bias), while a model that is too complex tends to overfit the training data (high variance). Bias measures how far model predictions are from the correct values on average, while variance captures how much predictions change across different training datasets. The ideal model has low bias and low variance, fitting the training data well while generalizing to new examples.
1. Machine Learning:
Bias and Variance Trade-off
Ajitkumar Shitole
Computer Engineering
International Institute of Information Technology, I²IT
www.isquareit.edu.in
2. International Institute of Information Technology, I²IT, P-14, Rajiv Gandhi Infotech Park, Hinjawadi Phase 1, Pune - 411 057
Phone - +91 20 22933441/2/3 | Website - www.isquareit.edu.in | Email - info@isquareit.edu.in
Bias and Variance
• Bias: the amount by which the Machine Learning (ML) model's predictions differ from the actual value of the target.
e = y_actual - y_pred
where e = bias error, y_actual = actual (target) output, and y_pred = predicted output.
• Variance: the amount by which the ML model's prediction would change if the model were estimated using different training datasets.
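Both definitions can be checked empirically by training the same model on many different training datasets and looking at its predictions for one fixed input. The following is a minimal sketch under made-up assumptions that are not from the slides: a sine-shaped target, Gaussian noise, 50 samples per training dataset, and two toy predictors (a global mean and a 1-nearest-neighbour rule). The bias follows the sign convention e = y_actual - y_pred used above.

```python
import math
import random
import statistics


def true_fn(x):
    # Assumed ground-truth relationship (hypothetical, for illustration only).
    return math.sin(2 * math.pi * x)


def make_training_set(rng, n=50, noise=0.3):
    # Draw one noisy training dataset of n points on [0, 1).
    xs = [rng.random() for _ in range(n)]
    ys = [true_fn(x) + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


def predict_mean(xs, ys, x0):
    # Very simple model: ignores x0 and always predicts the mean target.
    return statistics.fmean(ys)


def predict_1nn(xs, ys, x0):
    # Very flexible model: copies the target of the closest training point.
    nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x0))
    return ys[nearest]


def bias_and_variance(predict, x0=0.25, trials=500, seed=0):
    # Train on `trials` independent datasets and collect predictions at x0.
    rng = random.Random(seed)
    preds = []
    for _ in range(trials):
        xs, ys = make_training_set(rng)
        preds.append(predict(xs, ys, x0))
    bias = true_fn(x0) - statistics.fmean(preds)   # y_actual - average y_pred
    variance = statistics.pvariance(preds)         # spread across datasets
    return bias, variance


bias_mean, var_mean = bias_and_variance(predict_mean)
bias_1nn, var_1nn = bias_and_variance(predict_1nn)
print(f"global mean: bias={bias_mean:.3f}, variance={var_mean:.3f}")
print(f"1-NN:        bias={bias_1nn:.3f}, variance={var_1nn:.3f}")
```

In this setup the global-mean predictor shows the larger bias and the 1-nearest-neighbour predictor shows the larger variance; the exact numbers depend on the assumed target, noise level, and seed.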
Bias and Variance
• Suppose e_1, e_2, and e_3 are the bias errors of the model when it is trained on three different training datasets.
• Average Bias Error = b = (e_1 + e_2 + e_3) / 3
• Average Variance Error = [(e_1 - b)^2 + (e_2 - b)^2 + (e_3 - b)^2] / 3
• Total Error = Bias^2 + Variance, that is, (e_1^2 + e_2^2 + e_3^2) / 3 = b^2 + Average Variance Error
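The averages above can be worked through numerically. The sketch below uses three made-up error values (they are not from the slides) and takes the total error to mean the mean of the squared errors, which equals the square of the average bias error plus the average variance error.

```python
# Hypothetical bias errors from three training datasets (made-up values).
errors = [2.0, 4.0, 6.0]

# Average bias error: b = (e_1 + e_2 + e_3) / 3
b = sum(errors) / len(errors)

# Average variance error: mean squared deviation of each error from b.
variance = sum((e - b) ** 2 for e in errors) / len(errors)

# Mean of the squared errors, taken here as the total error.
mean_squared_error = sum(e ** 2 for e in errors) / len(errors)

print(f"average bias error b   = {b:.4f}")
print(f"average variance error = {variance:.4f}")
print(f"b^2 + variance         = {b ** 2 + variance:.4f}")
print(f"mean squared error     = {mean_squared_error:.4f}")
```

The last two printed values agree, which is the decomposition of the squared error into a squared-bias part and a variance part for these three errors.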
Occam’s Razor Principle
• Construct the simplest ML model that gives acceptable accuracy on the training datasets; do not complicate the model to the point that it overfits the training dataset.
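One way to apply this principle is to rank candidate models by complexity and keep the simplest one that meets an accuracy target. The sketch below is only an illustration: the `Candidate` type, the `simplest_acceptable` helper, the complexity scores, and the accuracy figures are all hypothetical and not from the slides.

```python
from typing import List, NamedTuple, Optional


class Candidate(NamedTuple):
    name: str
    complexity: int   # hypothetical score, e.g. number of parameters
    accuracy: float   # hypothetical measured accuracy, between 0 and 1


def simplest_acceptable(candidates: List[Candidate],
                        min_accuracy: float) -> Optional[Candidate]:
    # Keep only the models that meet the accuracy target...
    acceptable = [c for c in candidates if c.accuracy >= min_accuracy]
    if not acceptable:
        return None
    # ...and among those, prefer the least complex one.
    return min(acceptable, key=lambda c: c.complexity)


# Made-up candidates for illustration.
candidates = [
    Candidate("linear model", complexity=5, accuracy=0.81),
    Candidate("depth-3 tree", complexity=15, accuracy=0.90),
    Candidate("depth-20 tree", complexity=4000, accuracy=0.91),
]

chosen = simplest_acceptable(candidates, min_accuracy=0.88)
print(chosen.name if chosen else "no candidate is acceptable")
```

Here the deepest tree is slightly more accurate, but the depth-3 tree already meets the target, so it is the one selected.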
Underfitting and Overfitting
• Underfitting: An ML model with high bias pays very little attention to the training dataset, which leads to high error on both the training and the testing datasets.
High bias tends toward underfitting.
• Overfitting: A model with high variance pays too much attention to the training dataset and does not generalize to unseen data.
High variance tends toward overfitting.
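The two failure modes show up clearly when training error and testing error are compared. The sketch below is a minimal illustration under made-up assumptions that are not from the slides: a sine-shaped target with Gaussian noise, and a k-nearest-neighbour averager whose k is used as a complexity dial (k equal to the training set size reduces to a global mean, k = 1 memorises the training data).

```python
import math
import random
import statistics


def true_fn(x):
    # Assumed ground-truth relationship (hypothetical, for illustration only).
    return math.sin(2 * math.pi * x)


def sample(rng, n, noise=0.3):
    # Draw n noisy (x, y) points with x on [0, 1).
    xs = [rng.random() for _ in range(n)]
    ys = [true_fn(x) + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


def knn_predict(train_xs, train_ys, x0, k):
    # Average the targets of the k training points closest to x0.
    order = sorted(range(len(train_xs)), key=lambda i: abs(train_xs[i] - x0))
    return statistics.fmean(train_ys[i] for i in order[:k])


def mse(train, data, k):
    # Mean squared error of the k-NN model (fitted on `train`) over `data`.
    train_xs, train_ys = train
    xs, ys = data
    return statistics.fmean(
        (y - knn_predict(train_xs, train_ys, x, k)) ** 2
        for x, y in zip(xs, ys)
    )


rng = random.Random(0)
train = sample(rng, 100)
test = sample(rng, 500)

results = {}
for label, k in [("underfit, k=100", 100), ("balanced, k=7", 7), ("overfit, k=1", 1)]:
    results[k] = (mse(train, train, k), mse(train, test, k))
    print(f"{label:16s} train MSE={results[k][0]:.3f}  test MSE={results[k][1]:.3f}")
```

The k = 100 model has high error on both sets, which matches the underfitting description, while the k = 1 model has zero training error but a clearly non-zero testing error, which matches the overfitting description. The value k = 7 is an arbitrary middle setting chosen for the illustration.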
Underfitting and Overfitting
• Low Bias and Low Variance lead to the ideal ML model with acceptable performance.
• Linear Regression, Logistic Regression, and Linear Discriminant Analysis are typically High Bias ML algorithms.
• Decision Tree, Support Vector Machine, and K-Nearest Neighbor are typically High Variance ML algorithms.
Underfitting and Overfitting
Figure 1 shows that the overfit model passes through all the training samples, whereas the underfit model passes through only a few of them. The well-balanced model fits the samples with acceptable accuracy.
Figure 1. Model Complexity [1]
Bull’s Eye for Bias and Variance Tradeoff
Figure 2. Bias and Variance Tradeoff [2]
Bias and Variance Tradeoff
Figure 2 shows the Bull’s Eye diagram for the Bias and Variance tradeoff.
• High Bias and Low Variance lead to underfitting.
• Low Bias and High Variance lead to overfitting.
• Low Bias and Low Variance lead to the ideal (good) model.
References
[1] Bias–variance tradeoff – Wikipedia
[2] Bias–variance tradeoff – Wikipedia