R for Data Science Course:
Linear Regression
Slide template by: https://www.slidescarnival.com/
I am Ivo
I am here to teach you about R.
Hi!
Linear Regression
Intuition
Linear Regression
Simplicity
One of the simplest algorithms to develop and interpret. With Linear Regression we
want to find the line that best fits a set of points.
Continuous Variables
In Linear Regression, you want to predict a continuous variable: a variable that can,
in theory, take an infinite number of values. Some examples:
○ House prices;
○ The weight of a person;
○ The height of a person;
○ The return of a stock portfolio.
Example Exercise – House Prices
Figure: scatter plot "House Price vs. House Area". x-axis: House Area (in Square Meters), 0 to 160; y-axis: House Price, 0 € to 350,000 €.
How much would a new house cost if all we know is that it has an area of 122 square meters?
One idea is to draw a line that represents the relationship between the area and the price of a house, and then predict the price of the new house according to that line.
That idea is Linear Regression!
But how exactly do we find this best line? There are infinitely many lines we could draw.
Algebra Recap!
y = b + mx
A line can be characterized by the equation above, where b is the intercept, m is the slope and x is the value of the variable.
Definition
Linear regression models the relationship between a dependent variable (commonly called the y variable) and one or more independent variables (commonly called the x's). This relationship is modeled by a simple linear equation of the following form; notice how it can be multivariate:
$y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n$
y: the target variable
b_0: the bias / intercept
b_2: the coefficient for the 2nd variable (if it exists)
x_1, ..., x_n: the variable values
How do we model the relationship
for our “House Pricing” example?
House Pricing Example
House Price = Bias + b1 * Square Meters
For our example, where we want to predict the price of a house with 122 square meters, we know the following:
House Price = ❓ (this is our objective: finding the house price)
Bias = ❓
b1 = ❓
Square Meters = 122
How exactly do we learn the bias and b1 to get to our house price?
First Idea: Try Random Values
House Price = 10000 + 1000 * Square Meters
Bias = 10000, b1 = 1000
132,000 € = 10000 + 1000 * 122
These values don't seem a good fit at all, so let's try more. It doesn't make sense for a 122-square-meter house to cost approximately the same as a 70-square-meter one (assuming no other variables have an influence).
Second Idea: Try More Random Values
House Price = 10000 + 1500 * Square Meters
Bias = 10000, b1 = 1500
193,000 € = 10000 + 1500 * 122
We are getting closer! Let's raise b1 a bit more.
Third Idea: Try More Random Values
House Price = 10000 + 2000 * Square Meters
Bias = 10000, b1 = 2000
254,000 € = 10000 + 2000 * 122
That seems a good fit! During this trial-and-error approach, we also developed an equation!
Equation Generated:
House Price = 10000 + 2000 * Square Meters
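The trial-and-error steps can be reproduced in a few lines of R. This is only a minimal sketch: the function name `predict_price` and the vector `candidate_b1` are illustrative names that do not come from the slides.

```r
# Hypothetical helper: evaluates House Price = bias + b1 * square meters.
predict_price <- function(square_meters, bias, b1) {
  bias + b1 * square_meters
}

# The three slopes tried in the slides, all with the same bias of 10000.
candidate_b1 <- c(1000, 1500, 2000)

# Predicted price of the 122-square-meter house for each candidate slope.
predictions <- predict_price(122, bias = 10000, b1 = candidate_b1)
print(predictions)
# [1] 132000 193000 254000
```

Because R arithmetic is vectorized, a single call evaluates all three candidate lines at once.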
Linear Regression – Least Squares
Visually, we want to construct a line that passes as close as possible to the points.
Mathematically, we want the line that minimizes the difference between each point and the candidate line. This is generally called least squares regression!
A line that runs far from the points is awful, as the difference between each point and the line is huge.
A line that runs closer to the points is better, as the difference between each point and the line is lower.
A line that runs close to all the points seems really good, as the error between each point and the line is really low.
Cost Function
Our cost function (sometimes called the loss function) measures the difference between the value of our line and each point. In regression, the most common cost function is the mean squared error:
$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \tilde{y}_i)^2$
Each $y_i$ is the real house price.
Each $\tilde{y}_i$ is the value predicted by our line.
n is the number of houses in our sample.
The larger this cost function, the worse our line is at predicting the house prices!
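The mean squared error takes only one line of R. The sketch below assumes a hypothetical helper named `mse` and three made-up houses; none of these numbers come from the slides.

```r
# Mean squared error: the average of the squared differences between
# the real values (y) and the values predicted by the line (y_pred).
mse <- function(y, y_pred) {
  mean((y - y_pred)^2)
}

# Hypothetical sample of three houses (illustrative numbers only).
real_prices      <- c(110000, 150000, 250000)
predicted_prices <- c(100000, 160000, 250000)

# Errors are 10000, -10000 and 0, so the cost is (1e8 + 1e8 + 0) / 3.
cost <- mse(real_prices, predicted_prices)
print(cost)
```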
Why square the difference? So that positive and negative errors don't cancel each other out.
For a point with real price $y_1$ and predicted price $\tilde{y}_1$: contribution to the cost function = $(y_1 - \tilde{y}_1)^2$
For a point with real price $y_2$ and predicted price $\tilde{y}_2$: contribution to the cost function = $(y_2 - \tilde{y}_2)^2$
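A tiny numeric illustration of the cancellation argument, using two made-up errors (assumed values, chosen only to make the effect obvious):

```r
# Two hypothetical prediction errors (real price minus predicted price):
# one point lies above the line, the other the same distance below it.
errors <- c(20000, -20000)

mean_raw_error     <- mean(errors)    # positive and negative errors cancel
mean_squared_error <- mean(errors^2)  # squaring keeps both contributions

print(mean_raw_error)
print(mean_squared_error)
```

The raw average suggests a perfect line even though both predictions are off by 20,000 €, while the squared version reports a clearly non-zero cost.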
Each of our different lines will produce a different cost function value.
One candidate line produces a cost function value (mean squared error) of around 22,500,000,000.
A second candidate line produces a cost function value of around 4,900,000,000.
A third candidate line produces a cost function value of around 100,000,000.
Cost Function Plot
If we imagine that b0, the bias, is fixed, each value of b1 (also called the coefficient) will produce a different cost function value.
Figure: "Cost Function per Value of B1". x-axis: value of B1, the coefficient to multiply with House Area, 0 to 5000; y-axis: cost function value, 0 to 45,000,000,000.
The lowest point of this curve minimizes the error between the line and the points!
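A curve like this one can be sketched by evaluating the cost on a grid of b1 values. The house data below is hypothetical (it is not the data behind the slides), and `cost_for_b1` is an illustrative helper name.

```r
# Hypothetical houses (illustrative data, not the data behind the slides).
house_area  <- c(50, 70, 90, 110, 130, 150)
house_price <- c(112000, 148000, 192000, 228000, 272000, 308000)

# Mean squared error of the line bias + b1 * area, with the bias held fixed.
cost_for_b1 <- function(b1, bias = 10000) {
  mean((house_price - (bias + b1 * house_area))^2)
}

# Evaluate the cost on a grid of b1 values from 0 to 5000, as in the plot.
b1_grid <- seq(0, 5000, by = 100)
costs   <- sapply(b1_grid, cost_for_b1)

# The grid value with the lowest cost is the bottom of the curve.
best_b1 <- b1_grid[which.min(costs)]
print(best_b1)
# [1] 2000

# Optional: draw the curve in an interactive session.
# plot(b1_grid, costs, type = "l", xlab = "Value of b1", ylab = "Cost function value")
```

A grid search like this only works for one or two coefficients; with more variables the number of combinations grows too quickly, which is why a smarter search is needed.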
Gradient Descent
Many algorithm implementations perform this search using gradient descent: the weights are initialized randomly, so they may start anywhere on the cost curve, and are then adjusted step by step towards the point with the lowest cost.
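Below is a minimal sketch of gradient descent for b1 alone, with the bias held fixed as in the cost function plot. The data, the starting weight, the learning rate and the number of steps are all assumptions chosen for illustration; real implementations usually update every coefficient and pick these settings more carefully.

```r
# Hypothetical houses whose prices lie exactly on the line 10000 + 2000 * area.
house_area  <- c(50, 70, 90, 110, 130, 150)
house_price <- 10000 + 2000 * house_area

bias          <- 10000   # held fixed, as in the cost function plot
b1            <- 500     # arbitrary starting weight
learning_rate <- 1e-5    # small step, because areas are on the order of 100
n_steps       <- 200

for (step in seq_len(n_steps)) {
  predictions <- bias + b1 * house_area
  # Derivative of the mean squared error with respect to b1.
  gradient <- -2 * mean(house_area * (house_price - predictions))
  # Move b1 a small step against the gradient (downhill on the cost curve).
  b1 <- b1 - learning_rate * gradient
}

print(b1)
```

With these settings b1 moves from 500 towards 2000, the slope used to generate the data. A learning rate that is too large would make the updates overshoot and diverge instead.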
Closed Form Solution
For Linear Regression in particular, there is a formula, known as the closed-form solution, that outputs the best coefficients directly. This formula is only valid for Linear Regression:
$b = (X'X)^{-1} X'y$
$(X'X)^{-1}$: read as the inverse of the transpose of the feature matrix (the independent variables), X', multiplied by the same matrix, X.
$X'y$: read as the transpose of the feature matrix, X', multiplied by the target variable (the dependent variable), y.
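The closed-form solution can be checked against R's own estimate. This sketch uses the built-in mtcars data frame with a single feature (wt); the choice of feature and the variable names are illustrative assumptions.

```r
# Feature matrix: a column of 1s (for the intercept) plus the wt column.
X <- cbind(intercept = 1, wt = mtcars$wt)
y <- mtcars$mpg

# Closed-form solution: (X'X)^-1 X'y
closed_form <- solve(t(X) %*% X) %*% t(X) %*% y

# Coefficients estimated by lm for the same model, for comparison.
lm_coefficients <- coef(lm(mpg ~ wt, data = mtcars))

print(closed_form)
print(lm_coefficients)
```

Both approaches should agree up to numerical precision. Internally, lm relies on a QR decomposition rather than an explicit matrix inverse, which is numerically more stable.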
To retain
◉ Regression problems are problems where we want to predict a continuous variable.
◉ Linear Regression is one of the algorithms that solve that problem.
◉ Linear Regression finds the equation that best fits our points, minimizing the error between the line and the points.
◉ We can find the best coefficients for our line using either Gradient Descent or the Closed Form Solution.
How do we train linear regressions
in R?
lm function
◉ The lm function lets us train linear regressions really quickly in R:
◉ lm(y ~ x1 + x2 + ... + xn, data = dataframe)
◉ For our house prices example: lm(house_price ~ house_area, data = houses)
◉ You can read this as: y as a function of x1, x2, ... and xn.
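The house prices call can be run end to end as follows. The slides do not include the underlying data, so the `houses` data frame below is made up for illustration.

```r
# Hypothetical data frame (illustrative numbers, not the data behind the slides).
houses <- data.frame(
  house_area  = c(50, 70, 90, 110, 130, 150),
  house_price = c(112000, 148000, 192000, 228000, 272000, 308000)
)

# Train the linear regression: house_price as a function of house_area.
house_model <- lm(house_price ~ house_area, data = houses)
print(coef(house_model))

# Predict the price of a new house with 122 square meters.
new_house <- data.frame(house_area = 122)
predicted_price <- predict(house_model, newdata = new_house)
print(predicted_price)
```

The column name in `new_house` has to match the variable name used in the formula, otherwise predict cannot find the feature.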
New example! Using the mtcars data frame
A quick example using another dataset
The mtcars data frame is a built-in R data set that contains information about different car models.
Let's see if we can predict fuel consumption (mpg) as a function of cyl, disp, hp, drat and wt. These are all columns that characterize the car: number of cylinders, horsepower, etc.
You can check their descriptions by running ?mtcars in the R console.
Here is the call we will pass to R:
lm(mpg ~ cyl + disp + hp + drat + wt, data = mtcars)
Miles Per Gallon Prediction
The output is really cool: it gives us the coefficients (weights) for each variable.
Can you guess the equation that was generated with these coefficients❓
The equation generated:
MPG = 36.00836 - 1.10749 * cyl + 0.01236 * disp - 0.02402 * hp + 0.95221 * drat - 3.67329 * wt
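To see how the coefficients turn into an equation, the sketch below extracts them with coef and applies them by hand to one car. The variable names (`mpg_model`, `mpg_coefs`, `first_car`) are illustrative.

```r
mpg_model <- lm(mpg ~ cyl + disp + hp + drat + wt, data = mtcars)

# Named vector with the intercept and one coefficient per variable.
mpg_coefs <- coef(mpg_model)
print(round(mpg_coefs, 5))

# Apply the equation by hand to the first car in the data set ...
first_car <- mtcars[1, ]
manual_prediction <- mpg_coefs["(Intercept)"] +
  mpg_coefs["cyl"]  * first_car$cyl +
  mpg_coefs["disp"] * first_car$disp +
  mpg_coefs["hp"]   * first_car$hp +
  mpg_coefs["drat"] * first_car$drat +
  mpg_coefs["wt"]   * first_car$wt

# ... and compare with what predict() returns for the same car.
model_prediction <- predict(mpg_model, newdata = first_car)
print(c(manual = unname(manual_prediction), model = unname(model_prediction)))
```

The two numbers should match, which confirms that predict simply evaluates the generated equation.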
The cool thing is that the sign of each coefficient points towards the direction of the variable's influence (with the other variables held constant):
- The more cylinders the engine has, the fewer miles per gallon the car will get (negative weight).
- The more horsepower the engine has, the fewer miles per gallon the car will get (negative weight).
- The more displacement the engine has, the more miles per gallon the car will get (positive weight).
Are all variables relevant?
We can call summary to get more information about the regression. We'll explore more of the output of the summary command in the practical lectures:
mpg_prediction <- lm(mpg ~ cyl + disp + hp + drat + wt, data = mtcars)
summary(mpg_prediction)
The significance symbols shown next to each coefficient in the summary output point us towards the relevance of the variable: they refer to the significance level of a hypothesis test.
We want the residuals to be approximately normally distributed:
- Median near 0;
- Similar absolute values of max and min.
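The same information can be pulled out of the summary object programmatically. This is a small sketch; the names `model_summary`, `coefficient_table` and `p_values` are illustrative.

```r
mpg_prediction <- lm(mpg ~ cyl + disp + hp + drat + wt, data = mtcars)
model_summary  <- summary(mpg_prediction)

# Coefficient table: estimate, standard error, t value and p-value per variable.
coefficient_table <- coef(model_summary)
print(coefficient_table)

# p-values of the hypothesis test for each coefficient (smaller values mean
# more evidence that the coefficient differs from zero).
p_values <- coefficient_table[, "Pr(>|t|)"]
print(p_values)

# Five-number summary of the residuals: check the median and min/max symmetry.
print(quantile(residuals(mpg_prediction)))
```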
Assumptions - Linear Regression
◉ The target and the features must have a linear relationship;
◉ No highly correlated features (extremely hard in real-world scenarios);
◉ Independence of observations;
◉ Approximately normal residuals;
◉ Homoscedasticity: constant variance across the errors of the predictions.
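Some of these assumptions can be inspected quickly in R. The checks below are one possible starting point rather than a complete diagnostic procedure, and the choice of the Shapiro-Wilk test is an assumption of this sketch, not something taken from the slides.

```r
features <- mtcars[, c("cyl", "disp", "hp", "drat", "wt")]

# Pairwise correlations between features: values near 1 or -1 signal
# strongly correlated features.
feature_correlations <- cor(features)
print(round(feature_correlations, 2))

mpg_prediction  <- lm(mpg ~ cyl + disp + hp + drat + wt, data = mtcars)
model_residuals <- residuals(mpg_prediction)

# Shapiro-Wilk normality test on the residuals (one of several possible checks).
print(shapiro.test(model_residuals))

# Visual checks (uncomment in an interactive session):
# plot(fitted(mpg_prediction), model_residuals)  # look for a constant spread
# qqnorm(model_residuals); qqline(model_residuals)
```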
You can find me at:
LinkedIn Udemy Medium
Want to discover more?
Join my R courses on Udemy risk-free (30 day refund policy), where you will
have the chance to learn with practical exercises:
- R for Absolute Beginners
- R for Data Science
Credits
Special thanks to all the people who made and
released these awesome resources for free:
◉ Presentation template by SlidesCarnival

Contenu connexe

Similaire à R For Data Science - Linear Regression

Break Even Analysis
Break Even AnalysisBreak Even Analysis
Break Even AnalysisARIES TIBAR
 
Regression Presentation.pptx
Regression Presentation.pptxRegression Presentation.pptx
Regression Presentation.pptxMuhammadMuslim25
 
Final exam jeopardy
Final exam jeopardyFinal exam jeopardy
Final exam jeopardyBrad Opfer
 
regressionanalysis-110723130213-phpapp02.pdf
regressionanalysis-110723130213-phpapp02.pdfregressionanalysis-110723130213-phpapp02.pdf
regressionanalysis-110723130213-phpapp02.pdfAdikesavaperumal
 
Number System & Data Representation
Number System & Data RepresentationNumber System & Data Representation
Number System & Data RepresentationPhillip Glenn Libay
 
Plane-and-Solid-Geometry. introduction to proving
Plane-and-Solid-Geometry. introduction to provingPlane-and-Solid-Geometry. introduction to proving
Plane-and-Solid-Geometry. introduction to provingReyRoluna1
 
Network Analytics - Homework 3 - Msc Business Analytics - Imperial College Lo...
Network Analytics - Homework 3 - Msc Business Analytics - Imperial College Lo...Network Analytics - Homework 3 - Msc Business Analytics - Imperial College Lo...
Network Analytics - Homework 3 - Msc Business Analytics - Imperial College Lo...Jonathan Zimmermann
 
Simple lin regress_inference
Simple lin regress_inferenceSimple lin regress_inference
Simple lin regress_inferenceKemal İnciroğlu
 
1-4 Comparing/Ordering Whole #/Decimals
1-4 Comparing/Ordering Whole #/Decimals1-4 Comparing/Ordering Whole #/Decimals
1-4 Comparing/Ordering Whole #/DecimalsMel Anthony Pepito
 
1-4 Comparing/Ordering Whole #/Decimals
1-4 Comparing/Ordering Whole #/Decimals1-4 Comparing/Ordering Whole #/Decimals
1-4 Comparing/Ordering Whole #/DecimalsRudy Alfonso
 
Lesson 1 basic theory of information
Lesson 1   basic theory of informationLesson 1   basic theory of information
Lesson 1 basic theory of informationRoma Kimberly Erolin
 
Lesson 1 basic theory of information
Lesson 1   basic theory of informationLesson 1   basic theory of information
Lesson 1 basic theory of informationRoma Kimberly Erolin
 
1b-150720094704-lva1-app6892.pdf
1b-150720094704-lva1-app6892.pdf1b-150720094704-lva1-app6892.pdf
1b-150720094704-lva1-app6892.pdfDrBashirMSaad
 
Data Analysison Regression
Data Analysison RegressionData Analysison Regression
Data Analysison Regressionjamuga gitulho
 
2. Break Even Analysis, Systems of Linear Equations.pptx
2. Break Even Analysis, Systems of Linear Equations.pptx2. Break Even Analysis, Systems of Linear Equations.pptx
2. Break Even Analysis, Systems of Linear Equations.pptxRezoanulHaque8
 
CDS Fundamentals of digital communication system UNIT 1 AND 2.pdf
CDS Fundamentals of digital communication system UNIT 1 AND 2.pdfCDS Fundamentals of digital communication system UNIT 1 AND 2.pdf
CDS Fundamentals of digital communication system UNIT 1 AND 2.pdfshubhangisonawane6
 

Similaire à R For Data Science - Linear Regression (20)

Break Even Analysis
Break Even AnalysisBreak Even Analysis
Break Even Analysis
 
Regression Presentation.pptx
Regression Presentation.pptxRegression Presentation.pptx
Regression Presentation.pptx
 
Final exam jeopardy
Final exam jeopardyFinal exam jeopardy
Final exam jeopardy
 
Optimization
OptimizationOptimization
Optimization
 
regressionanalysis-110723130213-phpapp02.pdf
regressionanalysis-110723130213-phpapp02.pdfregressionanalysis-110723130213-phpapp02.pdf
regressionanalysis-110723130213-phpapp02.pdf
 
The Real Numbers
The Real NumbersThe Real Numbers
The Real Numbers
 
Chap11 simple regression
Chap11 simple regressionChap11 simple regression
Chap11 simple regression
 
Number System & Data Representation
Number System & Data RepresentationNumber System & Data Representation
Number System & Data Representation
 
Plane-and-Solid-Geometry. introduction to proving
Plane-and-Solid-Geometry. introduction to provingPlane-and-Solid-Geometry. introduction to proving
Plane-and-Solid-Geometry. introduction to proving
 
Network Analytics - Homework 3 - Msc Business Analytics - Imperial College Lo...
Network Analytics - Homework 3 - Msc Business Analytics - Imperial College Lo...Network Analytics - Homework 3 - Msc Business Analytics - Imperial College Lo...
Network Analytics - Homework 3 - Msc Business Analytics - Imperial College Lo...
 
Simple lin regress_inference
Simple lin regress_inferenceSimple lin regress_inference
Simple lin regress_inference
 
1-4 Comparing/Ordering Whole #/Decimals
1-4 Comparing/Ordering Whole #/Decimals1-4 Comparing/Ordering Whole #/Decimals
1-4 Comparing/Ordering Whole #/Decimals
 
1-4 Comparing/Ordering Whole #/Decimals
1-4 Comparing/Ordering Whole #/Decimals1-4 Comparing/Ordering Whole #/Decimals
1-4 Comparing/Ordering Whole #/Decimals
 
Lesson 1 basic theory of information
Lesson 1   basic theory of informationLesson 1   basic theory of information
Lesson 1 basic theory of information
 
Lesson 1 basic theory of information
Lesson 1   basic theory of informationLesson 1   basic theory of information
Lesson 1 basic theory of information
 
Business statistics homework help
Business statistics homework helpBusiness statistics homework help
Business statistics homework help
 
1b-150720094704-lva1-app6892.pdf
1b-150720094704-lva1-app6892.pdf1b-150720094704-lva1-app6892.pdf
1b-150720094704-lva1-app6892.pdf
 
Data Analysison Regression
Data Analysison RegressionData Analysison Regression
Data Analysison Regression
 
2. Break Even Analysis, Systems of Linear Equations.pptx
2. Break Even Analysis, Systems of Linear Equations.pptx2. Break Even Analysis, Systems of Linear Equations.pptx
2. Break Even Analysis, Systems of Linear Equations.pptx
 
CDS Fundamentals of digital communication system UNIT 1 AND 2.pdf
CDS Fundamentals of digital communication system UNIT 1 AND 2.pdfCDS Fundamentals of digital communication system UNIT 1 AND 2.pdf
CDS Fundamentals of digital communication system UNIT 1 AND 2.pdf
 

Dernier

Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...gajnagarg
 
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...gajnagarg
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...gajnagarg
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraGovindSinghDasila
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...amitlee9823
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...amitlee9823
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...amitlee9823
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...gajnagarg
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...amitlee9823
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...amitlee9823
 
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...gajnagarg
 
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...only4webmaster01
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...amitlee9823
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...SUHANI PANDEY
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...amitlee9823
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 

Dernier (20)

Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls Bellary Escorts ☎️9352988975 Two shot with one girl ...
 
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
Just Call Vip call girls Mysore Escorts ☎️9352988975 Two shot with one girl (...
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
Just Call Vip call girls roorkee Escorts ☎️9352988975 Two shot with one girl ...
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men  🔝malwa🔝   Escorts Ser...
➥🔝 7737669865 🔝▻ malwa Call-girls in Women Seeking Men 🔝malwa🔝 Escorts Ser...
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men  🔝Bangalore🔝   Esc...
➥🔝 7737669865 🔝▻ Bangalore Call-girls in Women Seeking Men 🔝Bangalore🔝 Esc...
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
 
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
Just Call Vip call girls Erode Escorts ☎️9352988975 Two shot with one girl (E...
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
Just Call Vip call girls Palakkad Escorts ☎️9352988975 Two shot with one girl...
 
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night StandCall Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Nandini Layout ☎ 7737669865 🥵 Book Your One night Stand
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Bommasandra Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 

R For Data Science - Linear Regression

  • 1. R for Data Science Course: Linear Regression Slide template by: https://www.slidescarnival.com/
  • 2. I am Ivo I am here to teach you about R. Hi! 2
  • 4. Linear Regression Simplicity One of the simplest algorithms to develop and interpret. With Linear Regression we want to find the best line that fit a set of points. 4 Continuous Variables In Linear Regression, you want to predict a continuous variable – variables that may have a theoretical infinite number of values, some examples: ○ House prices; ○ Weight of someone; ○ Height of someone; ○ A stock Portfolio return;
  • 5. Example Exercise – House Prices. [Scatter plot: House Price vs. House Area. x-axis: House Area (in Square Meters), 0 to 160; y-axis: House Price, 0 € to 350,000 €.]
  • 6. Example Exercise – House Prices. [Same scatter plot: House Price vs. House Area.] How much would this new house cost if we only know that it has an area of 122 square meters?
  • 7. Example Exercise – House Prices. [Same scatter plot: House Price vs. House Area.] One idea is to build a line that represents the relationship between the Area and the Price of the house, and then predict the price of the new point according to that line. That idea is Linear Regression!
  • 8. Example Exercise – House Prices. [Same scatter plot: House Price vs. House Area.] But... how exactly do we find this best line? There are infinitely many lines we could draw.
  • 9. Algebra Recap! $y = b + mx$. A line can be characterized by the equation above, where b is the intercept, m is the slope and x is the value of the variable.
  • 10. Linear Regression: Definition. Linear regression models the relationship between a dependent variable (commonly called the y variable, or target) and one or more independent variables (commonly called the x's, or features). This relationship is modeled by a simple linear equation of the following form; notice how it can be multivariate: $y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n$, where $y$ is the target variable, $b_0$ is the bias (intercept), $b_2$ is the coefficient for the 2nd variable (if it exists) and each $x_i$ is a variable value.
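A minimal sketch in R of how the equation above turns a bias, a set of coefficients and a set of variable values into one prediction. The helper name `predict_linear` and all numbers are made up for illustration; they do not come from the slides.

```r
# Hypothetical helper: computes y = b0 + b1*x1 + ... + bn*xn
predict_linear <- function(b0, b, x) {
  stopifnot(length(b) == length(x))  # one coefficient per variable
  b0 + sum(b * x)
}

# Illustrative values only: two variables with made-up coefficients
y_hat <- predict_linear(b0 = 10, b = c(2, 0.5), x = c(3, 4))
print(y_hat)  # 10 + 2*3 + 0.5*4 = 18
```

With a single variable this reduces to the line equation y = b + mx from the algebra recap.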
  • 11. How do we model the relationship for our “House Pricing” example?
  • 12. Linear Regression: House Pricing Example. $\text{House Price} = \text{Bias} + b_1 \cdot \text{Square Meters}$. For our example, where we want to predict the price of a house with 122 square meters, we know the following: House Price = ❓ (this is our objective: finding the house price); Bias = ❓; $b_1$ = ❓; Square Meters = 122.
  • 13. How exactly do we learn bias and b1 to get to our house price?
  • 14. Linear Regression: First Idea, Try Random Values. [Scatter plot: House Price vs. House Area.] $\text{House Price} = 10000 + 1000 \cdot \text{Square Meters}$, with Bias = 10000 and $b_1$ = 1000. For the new house: $10000 + 1000 \cdot 122 = 132000$, that is, 132,000 €. These values don't seem a good fit at all, so let's try more. It doesn't make sense for a house with 122 square meters to cost approximately the same as a 70 square meter one (assuming no other variables have an influence).
  • 15. Linear Regression: Second Idea, Try More Random Values. [Scatter plot: House Price vs. House Area.] $\text{House Price} = 10000 + 1500 \cdot \text{Square Meters}$, with Bias = 10000 and $b_1$ = 1500. For the new house: $10000 + 1500 \cdot 122 = 193000$, that is, 193,000 €. We are getting closer! Let's raise $b_1$ a bit more.
  • 16. Linear Regression: Third Idea, Try More Random Values. [Scatter plot: House Price vs. House Area.] $\text{House Price} = 10000 + 2000 \cdot \text{Square Meters}$, with Bias = 10000 and $b_1$ = 2000. For the new house: $10000 + 2000 \cdot 122 = 254000$, that is, 254,000 €. That seems a good fit! During this trial and error approach, we also developed an equation!
  • 17. Linear Regression: Equation Generated. $\text{House Price} = 10000 + 2000 \cdot \text{Square Meters}$. [Scatter plot: House Price vs. House Area.]
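The three trial equations can be reproduced with a few lines of R. This is only a sketch of the trial and error arithmetic from the slides; the variable names are chosen here for illustration.

```r
# Candidate lines from the slides: same bias, three values for b1
bias <- 10000
b1_candidates <- c(1000, 1500, 2000)
area <- 122  # square meters of the new house

# Price predicted by each candidate line for the new house
predicted_price <- bias + b1_candidates * area
print(data.frame(b1 = b1_candidates, predicted_price = predicted_price))
```

The three predictions are 132,000 €, 193,000 € and 254,000 €, matching the values on the slides.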
  • 18. Linear Regression – Least Squares. [Scatter plot: House Price vs. House Area, with a candidate line.] Visually, we want to construct a line that passes as close as possible to the points. Mathematically, we want the line that minimizes the squared difference between each point and the potential line. This is generally called least squares regression! This line is awful, as the difference between each point and the line is huge.
  • 19. Linear Regression – Least Squares. [Same scatter plot with a different candidate line.] This line is better, as the difference between each point and the line is lower.
  • 20. Linear Regression – Least Squares. [Same scatter plot with a different candidate line.] This line seems really good, as the error between each point and the line is really low.
  • 21. Cost Function. Our cost function (sometimes called the loss function, or the function we minimize) is the difference between the value of our line and each point. In regression, the most common cost function is the mean squared error: $MSE = \frac{1}{n} \sum_{i=1}^{n} (y_i - \tilde{y}_i)^2$. The larger this cost function, the worse our line is at predicting the house prices! Each $y_i$ is the real house price, each $\tilde{y}_i$ is the value predicted by our line, and n is the number of houses in our sample.
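A short R sketch of the mean squared error described above. The `houses` data frame is a made-up sample (the real data behind the slides is not available), and `mse` is a helper name chosen for this example.

```r
# Mean squared error between real values (y) and predictions (y_pred)
mse <- function(y, y_pred) {
  stopifnot(length(y) == length(y_pred))
  mean((y - y_pred)^2)
}

# Made-up sample of houses (area in square meters, price in euros)
houses <- data.frame(
  house_area  = c(50, 70, 90, 110, 130),
  house_price = c(115000, 148000, 195000, 228000, 272000)
)

# Cost of the candidate line House Price = 10000 + 2000 * area
pred <- 10000 + 2000 * houses$house_area
cost <- mse(houses$house_price, pred)
print(cost)
```

A line that sits further away from these points would produce a larger value of `cost`.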
  • 22. Linear Regression – Least Squares. [Scatter plot: House Price vs. House Area, highlighting two points and their predictions on the line.] Why square the difference? So that the errors don't cancel each other out. For the point $y_1$ with prediction $\tilde{y}_1$, the contribution to the cost function is $(y_1 - \tilde{y}_1)^2$; for the point $y_2$ with prediction $\tilde{y}_2$, it is $(y_2 - \tilde{y}_2)^2$.
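To see the cancelling effect in numbers, here is a tiny R sketch with made-up errors (real price minus predicted price), two positive and two negative.

```r
# Made-up errors: two predictions too low, two too high
errors <- c(5000, -5000, 3000, -3000)

mean_raw     <- mean(errors)    # positive and negative errors cancel out
mean_squared <- mean(errors^2)  # squaring keeps every error's contribution

print(mean_raw)      # 0, as if the line were perfect
print(mean_squared)  # 17000000, which reveals the misses
```

The raw mean suggests a perfect line even though every prediction is off by thousands of euros; the squared version does not have that blind spot.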
  • 23. Cost Function. Each of our different lines will produce a different cost function value. [Scatter plot: House Price vs. House Area, with a candidate line.] This line produces a mean squared error (our cost function value) of around 22,500,000,000.
  • 24. Cost Function. Each of our different lines will produce a different cost function value. [Scatter plot: House Price vs. House Area, with a candidate line.] This line produces a mean squared error (our cost function value) of around 4,900,000,000.
  • 25. Cost Function. Each of our different lines will produce a different cost function value. [Scatter plot: House Price vs. House Area, with a candidate line.] This line produces a mean squared error (our cost function value) of around 100,000,000.
  • 26. Cost Function Plot. Imagining we have $b_0$, or bias, fixed, each value of $b_1$ (also called the coefficient) will produce a different cost function value. [Line plot: Cost Function per Value of B1. x-axis: Value of B1, the coefficient to multiply with House Area, 0 to 5000; y-axis: Cost Function Value, 0 to 45,000,000,000.] The point at the bottom of the curve minimizes our error between the line and the points!
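A sketch of how such a curve could be computed in R: keep the bias fixed, try a grid of b1 values and record the mean squared error of each one. The `houses` sample is made up for illustration, so the cost values differ from the ones in the slides.

```r
# Made-up sample of houses (area in square meters, price in euros)
houses <- data.frame(
  house_area  = c(50, 70, 90, 110, 130),
  house_price = c(115000, 148000, 195000, 228000, 272000)
)

bias <- 10000                      # b0 kept fixed
b1_grid <- seq(0, 5000, by = 100)  # candidate coefficients to try

# Mean squared error of the line bias + b1 * area, for each candidate b1
cost <- sapply(b1_grid, function(b1) {
  pred <- bias + b1 * houses$house_area
  mean((houses$house_price - pred)^2)
})

# Candidate with the lowest cost: the bottom of the curve
best_b1 <- b1_grid[which.min(cost)]
print(best_b1)
```

Calling `plot(b1_grid, cost, type = "l")` draws the U-shaped curve; on this sample its lowest grid point is at b1 = 2000.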
  • 27. Gradient Descent. Many algorithm implementations perform this search with gradient descent: starting from some initial weights, they repeatedly move the weights in the direction that lowers the cost function. [Line plot: Cost Function per Value of B1. x-axis: Value of B1; y-axis: Cost Function Value.] Randomly initialized weights may start at a point high up on the curve!
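A minimal gradient descent sketch in R for the single coefficient b1, with the bias kept fixed as in the cost function plot. The data, the starting value, the learning rate and the number of steps are all assumptions made for this example, not values from the slides.

```r
# Made-up sample of houses (area in square meters, price in euros)
houses <- data.frame(
  house_area  = c(50, 70, 90, 110, 130),
  house_price = c(115000, 148000, 195000, 228000, 272000)
)

bias <- 10000          # kept fixed, as in the cost function plot
b1 <- 0                # starting weight (could also be drawn at random)
learning_rate <- 1e-5  # assumed step size, small enough for this data
n_steps <- 200

for (step in seq_len(n_steps)) {
  pred <- bias + b1 * houses$house_area
  # Derivative of the mean squared error with respect to b1
  gradient <- -2 * mean(houses$house_area * (houses$house_price - pred))
  # Move b1 against the gradient, towards lower cost
  b1 <- b1 - learning_rate * gradient
}

print(b1)
```

On this sample b1 settles near 2013, the bottom of the cost curve for a bias of 10000. A learning rate that is too large would make the updates overshoot and diverge, which is why the step size has to be tuned to the scale of the data.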
  • 28. Closed Form Solution. Particularly for Linear Regression, there is a formula, named the Closed Form Solution, that outputs the best coefficients directly; this formula is only valid for Linear Regression: $b = (X'X)^{-1} X'y$. Read $(X'X)^{-1}$ as the inverse of the transpose of the features matrix (independent variables), X', multiplied by the same matrix, X. Read $X'y$ as the transpose of the features matrix, X', multiplied by the target variable (dependent variable), y.
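The closed form solution can be written almost literally in R with `t()` for the transpose, `solve()` for the inverse and `%*%` for matrix multiplication. The sketch below uses a made-up `houses` sample and compares the result with R's `lm` function; the column of ones in X is what lets the formula estimate the bias.

```r
# Made-up sample of houses (area in square meters, price in euros)
houses <- data.frame(
  house_area  = c(50, 70, 90, 110, 130),
  house_price = c(115000, 148000, 195000, 228000, 272000)
)

# Features matrix with a column of ones for the bias (intercept)
X <- cbind(intercept = 1, house_area = houses$house_area)
y <- houses$house_price

# Closed form solution: inverse of (X'X), multiplied by X'y
b_closed <- solve(t(X) %*% X) %*% t(X) %*% y

# Same model fitted with lm, for comparison
b_lm <- coef(lm(house_price ~ house_area, data = houses))

print(as.vector(b_closed))
print(unname(b_lm))
```

Both approaches return the same bias and coefficient on this sample.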
  • 29. To retain ◉ Regression problems are problems where we want to predict a continuous variable. ◉ Linear Regression is one of the algorithms that solve that problem. ◉ A Linear Regression finds the equation that best fits our points, minimizing the error between the line and the points. ◉ We can find the best coefficients for our line using either Gradient Descent or the Closed Form Solution.
  • 30. How do we train linear regressions in R?
  • 31. lm function ◉ The lm function lets us train linear regressions really quickly in R: lm(y ~ x1 + x2 + ... + xn, data = dataframe) ◉ For our house prices example: lm(house_price ~ house_area, data = houses) ◉ You can read the formula as y as a function of x1 and x2 ... and xn.
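A runnable sketch of the house prices call. The slides do not include the underlying data, so the `houses` data frame below is a made-up sample; only the column names `house_price` and `house_area` come from the slide.

```r
# Made-up sample of houses (area in square meters, price in euros)
houses <- data.frame(
  house_area  = c(50, 70, 90, 110, 130),
  house_price = c(115000, 148000, 195000, 228000, 272000)
)

# Train the linear regression: price as a function of area
model <- lm(house_price ~ house_area, data = houses)
print(coef(model))  # bias (Intercept) and coefficient for house_area

# Predict the price of the new house with 122 square meters
new_house <- data.frame(house_area = 122)
predicted <- predict(model, newdata = new_house)
print(predicted)
```

The `newdata` data frame must use the same column names as the formula, otherwise `predict` cannot match the variables.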
  • 32. New example! Using the mtcars data frame
  • 33. A quick example using another dataset. The mtcars data frame is a built-in R data set that contains information about different car models. Let's see if we can predict consumption (mpg) as a function of cyl, disp, hp, drat and wt. These are all columns that characterize the car: number of cylinders, horsepower, etc. You can check their description using ?mtcars on the R console.
  • 34. Miles Per Gallon Prediction. Here is the call we will pass to R: lm(mpg ~ cyl + disp + hp + drat + wt, data = mtcars). The output is really cool: it gives us the coefficients (weights) for each variable. Can you guess the equation that was generated with these coefficients❓
  • 35. Miles Per Gallon Prediction. The equation generated: $MPG = 36.00836 - 1.10749 \cdot cyl + 0.01236 \cdot disp - 0.02402 \cdot hp + 0.95221 \cdot drat - 3.67329 \cdot wt$
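As a sketch, the generated equation can be checked in R by applying the coefficients by hand to one car and comparing with `predict`. The choice of the "Mazda RX4" row is arbitrary; any row of mtcars would work.

```r
# Fit the same model as in the slides
model <- lm(mpg ~ cyl + disp + hp + drat + wt, data = mtcars)
b <- coef(model)

# Pick one car from the data set (arbitrary choice)
car <- mtcars["Mazda RX4", ]

# Apply the equation by hand: bias plus coefficient times variable value
by_hand <- b[["(Intercept)"]] +
  b[["cyl"]]  * car$cyl +
  b[["disp"]] * car$disp +
  b[["hp"]]   * car$hp +
  b[["drat"]] * car$drat +
  b[["wt"]]   * car$wt

# Let R apply the equation for us
by_predict <- predict(model, newdata = car)

print(c(by_hand = by_hand, by_predict = unname(by_predict)))
```

Both numbers agree, which confirms that `predict` is simply evaluating the linear equation with the learned coefficients.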
  • 36. Miles Per Gallon Prediction. The cool thing is that the sign of each coefficient points towards the influence of the variable: - The more cylinders the motor has, the fewer miles per gallon the car will make (negative weight). - The more horsepower the motor has, the fewer miles per gallon the car will make (negative weight). - The more displacement the car has, the more miles per gallon the car will make (positive weight).
  • 37. Are all variables relevant?
  • 38. Miles Per Gallon Prediction. We can call summary to check more information about the regression; we'll explore more of the output of the summary command in the practical lectures: mpg_prediction <- lm(mpg ~ cyl + disp + hp + drat + wt, data = mtcars); summary(mpg_prediction). The significance marks printed next to each coefficient point us towards the relevance of the variable; they refer to the significance level of a hypothesis test. We want the residuals to be approximately normally distributed: - Median near 0; - Similar absolute value of max and min.
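A sketch of how the pieces of the summary output mentioned above can be pulled out programmatically, in case you want to inspect them without reading the printed report.

```r
mpg_prediction <- lm(mpg ~ cyl + disp + hp + drat + wt, data = mtcars)

# Coefficient table: estimate, standard error, t value and p-value per variable
coef_table <- coef(summary(mpg_prediction))
p_values <- coef_table[, "Pr(>|t|)"]
print(round(p_values, 4))  # small p-values suggest a relevant variable

# Residuals: real mpg minus predicted mpg for each car
res <- residuals(mpg_prediction)
print(summary(res))  # check that the median is near 0 and min/max are similar in size
```

The significance marks in the printed summary are derived from the same p-values shown here.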
  • 39. Assumptions - Linear Regression ◉ The target and features must have a linear relationship; ◉ No correlated features (extremely hard in real world scenarios); ◉ Independence of observations; ◉ Somewhat normal residuals; ◉ Homoscedasticity: constant variance across the errors of the predictions.
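Two of these assumptions can be given a quick, rough check in R. The sketch below looks at the correlation between the mtcars features and runs a Shapiro-Wilk normality test on the residuals; these are only first-pass diagnostics chosen for this example, not a complete validation of the model.

```r
features <- c("cyl", "disp", "hp", "drat", "wt")

# Correlation between features: values near 1 or -1 signal correlated features
feature_cor <- cor(mtcars[, features])
print(round(feature_cor, 2))

# Normality of the residuals: a very small p-value suggests non-normal residuals
model <- lm(mpg ~ cyl + disp + hp + drat + wt, data = mtcars)
normality <- shapiro.test(residuals(model))
print(normality$p.value)
```

On mtcars, several features (for example cyl and disp) are strongly correlated, which illustrates why the "no correlated features" assumption is hard to satisfy in practice. Calling `plot(fitted(model), residuals(model))` gives a visual check for constant variance.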
  • 40. You can find me at: LinkedIn, Udemy, Medium. Want to discover more? Join my R courses on Udemy risk-free (30 day refund policy), where you will have the chance to learn with practical exercises: - R for Absolute Beginners - R for Data Science
  • 41. Credits. Special thanks to all the people who made and released these awesome resources for free: ◉ Presentation template by SlidesCarnival