Linear Regression
OR: HOW TO EXPLAIN THE LEAST SQUARES METHOD TO A COMPLETE IDIOT
dr Dragoljub

(my version, in "Indian English")

Learning Objectives
In this chapter, you learn:
• How to use regression analysis to predict the value of a dependent variable based on an independent variable
• The meaning of the regression coefficients b0 and b1
• How to evaluate the assumptions of regression analysis and know what to do if the assumptions are violated
• To make inferences about the slope and correlation coefficient
• To estimate mean values and predict individual values

Correlation vs. Regression
• A scatter plot can be used to show the relationship between two variables
• Correlation analysis is used to measure the strength of the association (linear relationship) between two variables
• Correlation is only concerned with strength of the relationship
• No causal effect is implied with correlation
• Scatter plots were first presented in Ch. 2
• Correlation was first presented in Ch. 3

Introduction to Regression Analysis
Regression analysis is used to:
• Predict the value of a dependent variable based on the value of at least one independent variable
• Explain the impact of changes in an independent variable on the dependent variable
Dependent variable: the variable we wish to predict or explain
Independent variable: the variable used to predict or explain the dependent variable

Simple Linear Regression Model
• Only one independent variable, X
• Relationship between X and Y is described by a linear function
• Changes in Y are assumed to be related to changes in X

Types of Relationships
(Scatter-plot panels of Y versus X; only the panel titles are recoverable.)
• Linear relationships vs. curvilinear relationships
• Strong relationships vs. weak relationships
• No relationship

Simple Linear Regression Model

Yi = β0 + β1Xi + εi

where:
• Yi = dependent variable
• Xi = independent variable
• β0 = population Y intercept
• β1 = population slope coefficient
• εi = random error term
β0 + β1Xi is the linear component; εi is the random error component.

Simple Linear Regression Model (continued)
(Graph of the model: for a given Xi, the observed value of Y differs from the value predicted by the line, which has intercept β0 and slope β1, by the random error εi.)

Simple Linear Regression Equation (Prediction Line)
The simple linear regression equation provides an estimate of the population regression line:

Ŷi = b0 + b1Xi

where:
• Ŷi = estimated (or predicted) Y value for observation i
• b0 = estimate of the regression intercept
• b1 = estimate of the regression slope
• Xi = value of X for observation i

The Least Squares Method
b0 and b1 are obtained by finding the values that minimize the sum of the squared differences between Y and Ŷ:

min Σ(Yi − Ŷi)² = min Σ(Yi − (b0 + b1Xi))²

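A minimal sketch of this criterion in Python with NumPy (not from the slides, which use Excel); the closed-form solution below is the standard one and the function name is ours:

```python
import numpy as np

def least_squares_fit(x, y):
    """Return (b0, b1) minimizing sum((y - (b0 + b1*x))**2)."""
    x_bar, y_bar = x.mean(), y.mean()
    b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    b0 = y_bar - b1 * x_bar
    return b0, b1
```

Applied to the house data introduced below, this gives b0 ≈ 98.25 and b1 ≈ 0.1098, matching the Excel output.
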
Finding the Least Squares Equation
• The coefficients b0 and b1, and other regression results in this chapter, will be found using Excel
• Formulas are shown in the text for those who are interested

Interpretation of the Slope and the Intercept
• b0 is the estimated average value of Y when the value of X is zero
• b1 is the estimated change in the average value of Y as a result of a one-unit increase in X

Simple Linear Regression Example
• A real estate agent wishes to examine the relationship between the selling price of a home and its size (measured in square feet)
• A random sample of 10 houses is selected
  – Dependent variable (Y) = house price in $1000s
  – Independent variable (X) = square feet

Simple Linear Regression Example: Data

House Price in $1000s (Y)    Square Feet (X)
245                          1400
312                          1600
279                          1700
308                          1875
199                          1100
219                          1550
405                          2350
324                          2450
319                          1425
255                          1700

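For the later sketches, the same sample can be held as NumPy arrays (the variable names are ours, not from the slides):

```python
import numpy as np

# House price in $1000s (Y) and size in square feet (X), as tabulated above.
price_k = np.array([245, 312, 279, 308, 199, 219, 405, 324, 319, 255], dtype=float)
sqft = np.array([1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700], dtype=float)

# least_squares_fit(sqft, price_k) from the earlier sketch returns roughly (98.25, 0.1098).
```
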
Simple Linear Regression Example: Scatter Plot
(Scatter plot of house price versus square feet for the 10 sampled houses.)

Simple Linear Regression Example: Using the Excel Data Analysis Function
1. Choose Data
2. Choose Data Analysis
3. Choose Regression
Then enter the Y's and X's and the desired options.

Simple Linear Regression Example: Using PHStat
Add-Ins: PHStat: Regression: Simple Linear Regression

Simple Linear Regression Example: Excel Output

The regression equation is:
house price = 98.24833 + 0.10977 (square feet)

Regression Statistics
Multiple R          0.76211
R Square            0.58082
Adjusted R Square   0.52842
Standard Error      41.33032
Observations        10

ANOVA
             df   SS           MS           F         Significance F
Regression    1   18934.9348   18934.9348   11.0848   0.01039
Residual      8   13665.5652    1708.1957
Total         9   32600.5000

              Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept     98.24833       58.03348         1.69296   0.12892   -35.57720   232.07386
Square Feet    0.10977        0.03297         3.32938   0.01039     0.03374     0.18580

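The same numbers can be reproduced outside Excel; a hedged sketch using Python's statsmodels (our choice of tool, not the slides', which use Excel and PHStat):

```python
import numpy as np
import statsmodels.api as sm

price_k = np.array([245, 312, 279, 308, 199, 219, 405, 324, 319, 255], dtype=float)
sqft = np.array([1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700], dtype=float)

X = sm.add_constant(sqft)            # intercept column plus square feet
results = sm.OLS(price_k, X).fit()   # ordinary least squares
print(results.summary())             # R-squared ~ 0.581, intercept ~ 98.25, slope ~ 0.1098
```
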
Simple Linear Regression Example: Graphical Representation
(Scatter plot with the fitted prediction line: intercept = 98.248, slope = 0.10977.)
house price = 98.24833 + 0.10977 (square feet)

Simple Linear Regression Example: Interpretation of b0
house price = 98.24833 + 0.10977 (square feet)
• b0 is the estimated average value of Y when the value of X is zero (if X = 0 is in the range of observed X values)
• Because a house cannot have a square footage of 0, b0 has no practical application

Simple Linear Regression Example: Interpreting b1
house price = 98.24833 + 0.10977 (square feet)
• b1 estimates the change in the average value of Y as a result of a one-unit increase in X
• Here, b1 = 0.10977 tells us that the mean value of a house increases by 0.10977($1000) = $109.77, on average, for each additional square foot of size

Simple Linear Regression Example: Making Predictions
Predict the price for a house with 2000 square feet:

house price = 98.25 + 0.1098 (sq. ft.)
            = 98.25 + 0.1098(2000)
            = 317.85

The predicted price for a house with 2000 square feet is 317.85 ($1,000s) = $317,850

Simple Linear Regression Example: Making Predictions (continued)
• When using a regression model for prediction, only predict within the relevant range of data (the relevant range for interpolation)
• Do not try to extrapolate beyond the range of observed X's

Measures of Variation
Total variation is made up of two parts: SST = SSR + SSE

• SST = Total Sum of Squares:       SST = Σ(Yi − Ȳ)²
• SSR = Regression Sum of Squares:  SSR = Σ(Ŷi − Ȳ)²
• SSE = Error Sum of Squares:       SSE = Σ(Yi − Ŷi)²

where:
• Ȳ = mean value of the dependent variable
• Yi = observed value of the dependent variable
• Ŷi = predicted value of Y for the given Xi value

Measures of Variation (continued)
• SST = total sum of squares (Total Variation): measures the variation of the Yi values around their mean Ȳ
• SSR = regression sum of squares (Explained Variation): variation attributable to the relationship between X and Y
• SSE = error sum of squares (Unexplained Variation): variation in Y attributable to factors other than X

Measures of Variation (continued)
(Graph showing, for a single observation at Xi, the components SSE = Σ(Yi − Ŷi)², SSR = Σ(Ŷi − Ȳ)², and SST = Σ(Yi − Ȳ)² relative to the fitted line and the mean Ȳ.)

Coefficient of Determination, r²
• The coefficient of determination is the portion of the total variation in the dependent variable that is explained by variation in the independent variable
• The coefficient of determination is also called r-squared and is denoted as r²

r² = SSR / SST = regression sum of squares / total sum of squares

note: 0 ≤ r² ≤ 1

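A sketch of this decomposition for the house data, reusing the fitted coefficients from the Excel output (values are approximate because the coefficients are rounded):

```python
import numpy as np

price_k = np.array([245, 312, 279, 308, 199, 219, 405, 324, 319, 255], dtype=float)
sqft = np.array([1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700], dtype=float)

b0, b1 = 98.24833, 0.10977                       # estimates from the Excel output
y_hat = b0 + b1 * sqft

sst = np.sum((price_k - price_k.mean()) ** 2)    # total variation, ~ 32600.5
ssr = np.sum((y_hat - price_k.mean()) ** 2)      # explained variation, ~ 18934.9
sse = np.sum((price_k - y_hat) ** 2)             # unexplained variation, ~ 13665.6
r2 = ssr / sst                                   # ~ 0.581
```
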
Examples of Approximate r² Values
(Scatter-plot panels of Y versus X illustrating each case.)
• r² = 1: perfect linear relationship between X and Y; 100% of the variation in Y is explained by variation in X
• 0 < r² < 1: weaker linear relationships between X and Y; some but not all of the variation in Y is explained by variation in X
• r² = 0: no linear relationship between X and Y; the value of Y does not depend on X (none of the variation in Y is explained by variation in X)

Simple Linear Regression Example: Coefficient of Determination, r² in Excel

r² = SSR / SST = 18934.9348 / 32600.5000 = 0.58082

58.08% of the variation in house prices is explained by variation in square feet.
(This is the R Square entry of 0.58082 in the Regression Statistics of the Excel output shown above.)

Standard Error of Estimate
The standard deviation of the variation of observations around the regression line is estimated by

S_YX = √( SSE / (n − 2) ) = √( Σ(Yi − Ŷi)² / (n − 2) )

where:
• SSE = error sum of squares
• n = sample size

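A quick check of this formula with the SSE from the ANOVA table (a sketch; the result matches the "Standard Error" reported by Excel):

```python
import math

sse, n = 13665.5652, 10          # error sum of squares and sample size from the output
s_yx = math.sqrt(sse / (n - 2))  # ~ 41.33
```
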
Simple Linear Regression Example: Standard Error of Estimate in Excel
S_YX = 41.33032 — the "Standard Error" entry in the Regression Statistics of the Excel output shown above.

Comparing Standard Errors
S_YX is a measure of the variation of observed Y values from the regression line.
(Panels contrasting a small S_YX, with points tight around the line, and a large S_YX, with points widely scattered around the line.)
The magnitude of S_YX should always be judged relative to the size of the Y values in the sample data;
i.e., S_YX = $41.33K is moderately small relative to house prices in the $200K – $400K range.

Assumptions of Regression: L.I.N.E.
• Linearity: the relationship between X and Y is linear
• Independence of Errors: error values are statistically independent
• Normality of Error: error values are normally distributed for any given value of X
• Equal Variance (also called homoscedasticity): the probability distribution of the errors has constant variance

Residual Analysis

ei = Yi − Ŷi

• The residual for observation i, ei, is the difference between its observed and predicted value
• Check the assumptions of regression by examining the residuals:
  – Examine for linearity assumption
  – Evaluate independence assumption
  – Evaluate normal distribution assumption
  – Examine for constant variance for all levels of X (homoscedasticity)
• Graphical analysis of residuals: can plot residuals vs. X (see the sketch below)

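A hedged sketch of such residual plots with matplotlib (our choice of tool; the slides use Excel charts):

```python
import numpy as np
import matplotlib.pyplot as plt

price_k = np.array([245, 312, 279, 308, 199, 219, 405, 324, 319, 255], dtype=float)
sqft = np.array([1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700], dtype=float)
residuals = price_k - (98.24833 + 0.10977 * sqft)   # e_i = Y_i - Yhat_i

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 3))
ax1.scatter(sqft, residuals)           # look for curvature (linearity) or fanning (equal variance)
ax1.axhline(0, color="grey")
ax1.set_xlabel("square feet")
ax1.set_ylabel("residual")
ax2.hist(residuals, bins=5)            # rough check of normality
plt.show()
```
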
Residual Analysis for Linearity
(Residuals-versus-x panels: a curved residual pattern indicates the relationship is not linear; a patternless band of residuals indicates linearity.)

Residual Analysis for Independence
(Residual panels contrasting "Not Independent" patterns with an "Independent" random scatter.)

Checking for Normality
• Examine the stem-and-leaf display of the residuals
• Examine the boxplot of the residuals
• Examine the histogram of the residuals
• Construct a normal probability plot of the residuals

Residual Analysis for Normality
When using a normal probability plot, normal errors will approximately display in a straight line.
(Normal probability plot: percent versus residual.)

Residual Analysis for Equal Variance
(Residual panels contrasting non-constant variance with constant variance.)

Simple Linear Regression Example: Excel Residual Output

Observation   Predicted House Price   Residuals
1             251.92316                -6.923162
2             273.87671                38.12329
3             284.85348                -5.853484
4             304.06284                 3.937162
5             218.99284               -19.99284
6             268.38832               -49.38832
7             356.20251                48.79749
8             367.17929               -43.17929
9             254.6674                 64.33264
10            284.85348               -29.85348

The residuals do not appear to violate any regression assumptions.

Measuring Autocorrelation: The Durbin-Watson Statistic
• Used when data are collected over time to detect if autocorrelation is present
• Autocorrelation exists if residuals in one time period are related to residuals in another period

Autocorrelation
• Autocorrelation is correlation of the errors (residuals) over time
(Residual plot over time: here, the residuals show a cyclic pattern, not a random one; cyclical patterns are a sign of positive autocorrelation.)
• Autocorrelation violates the regression assumption that residuals are random and independent

The Durbin-Watson Statistic
• The Durbin-Watson statistic is used to test for autocorrelation
  H0: residuals are not correlated
  H1: positive autocorrelation is present

D = Σ_{i=2..n} (ei − ei−1)² / Σ_{i=1..n} ei²

• The possible range is 0 ≤ D ≤ 4
• D should be close to 2 if H0 is true
• D less than 2 may signal positive autocorrelation; D greater than 2 may signal negative autocorrelation

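A direct transcription of this statistic into Python (a sketch; the function name is ours):

```python
import numpy as np

def durbin_watson(residuals):
    """D = sum_{i=2..n} (e_i - e_{i-1})**2 / sum_{i=1..n} e_i**2."""
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
```

For the time-series example that follows, the two sums are 3296.18 and 3279.98, so D ≈ 1.00494.
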
Testing for Positive Autocorrelation
H0: positive autocorrelation does not exist
H1: positive autocorrelation is present
• Calculate the Durbin-Watson test statistic D (the Durbin-Watson statistic can be found using Excel or Minitab)
• Find the values dL and dU from the Durbin-Watson table (for sample size n and number of independent variables k)
• Decision rule: reject H0 if D < dL; the test is inconclusive if dL ≤ D ≤ dU; do not reject H0 if D > dU

Testing for Positive Autocorrelation (continued)
• Suppose we have the following time series data (shown in the slides as a scatter plot with a fitted trend line): is there autocorrelation?

Testing for Positive Autocorrelation (continued)
• Example with n = 25. Excel/PHStat output:

Durbin-Watson Calculations
Sum of Squared Difference of Residuals   3296.18
Sum of Squared Residuals                 3279.98
Durbin-Watson Statistic                  1.00494

D = Σ(ei − ei−1)² / Σ ei² = 3296.18 / 3279.98 = 1.00494

Testing for Positive Autocorrelation (continued)
• Here, n = 25 and there is k = 1 independent variable
• Using the Durbin-Watson table, dL = 1.29 and dU = 1.45
• D = 1.00494 < dL = 1.29, so reject H0 and conclude that significant positive autocorrelation exists

Inferences About the Slope
The standard error of the regression slope coefficient (b1) is estimated by

Sb1 = S_YX / √SSX = S_YX / √( Σ(Xi − X̄)² )

where:
• Sb1 = estimate of the standard error of the slope
• S_YX = √( SSE / (n − 2) ) = standard error of the estimate

Inferences About the Slope: t Test
• t test for a population slope: is there a linear relationship between X and Y?
• Null and alternative hypotheses:
  H0: β1 = 0 (no linear relationship)
  H1: β1 ≠ 0 (linear relationship does exist)
• Test statistic:

t_STAT = (b1 − β1) / Sb1,  with d.f. = n − 2

where:
• b1 = regression slope coefficient
• β1 = hypothesized slope
• Sb1 = standard error of the slope

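A sketch of this test in Python using the values from the Excel output (scipy is our addition; the slides read the t statistic and p-value directly from Excel):

```python
from scipy import stats

b1, sb1, n = 0.10977, 0.03297, 10                  # slope, its standard error, sample size
t_stat = (b1 - 0) / sb1                            # ~ 3.329
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)    # two-tailed p-value, ~ 0.0104
```
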
Inferences About the Slope: t Test Example
(Same sample of 10 houses as above: house price in $1000s (Y) versus square feet (X).)
Estimated regression equation: house price = 98.25 + 0.1098 (sq. ft.)
The slope of this model is 0.1098.
Is there a relationship between the square footage of the house and its sales price?

Inferences About the Slope: t Test Example (continued)
H0: β1 = 0
H1: β1 ≠ 0

From the Excel output:
              Coefficients   Standard Error   t Stat    P-value
Intercept     98.24833       58.03348         1.69296   0.12892
Square Feet    0.10977        0.03297         3.32938   0.01039

t_STAT = (b1 − β1) / Sb1 = (0.10977 − 0) / 0.03297 = 3.32938

Inferences About the Slope: t Test Example (continued)
H0: β1 = 0; H1: β1 ≠ 0
Test statistic: t_STAT = 3.329, with d.f. = 10 − 2 = 8
For α/2 = .025 the critical values are ±t_α/2 = ±2.3060, and t_STAT = 3.329 falls in the upper rejection region.
Decision: Reject H0. There is sufficient evidence that square footage affects house price.

Inferences About the Slope: t Test Example (p-value approach)
From the Excel output, the p-value for Square Feet is 0.01039.
Decision: Reject H0, since p-value < α.
There is sufficient evidence that square footage affects house price.

F Test for Significance

F test statistic: F_STAT = MSR / MSE

where:
• MSR = SSR / k
• MSE = SSE / (n − k − 1)

F_STAT follows an F distribution with k numerator and (n − k − 1) denominator degrees of freedom (k = the number of independent variables in the regression model).

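The same arithmetic in a short Python sketch, plugging in the sums of squares from the ANOVA table (scipy's F distribution is our addition):

```python
from scipy import stats

ssr, sse, n, k = 18934.9348, 13665.5652, 10, 1
msr, mse = ssr / k, sse / (n - k - 1)
f_stat = msr / mse                              # ~ 11.08
p_value = stats.f.sf(f_stat, k, n - k - 1)      # ~ 0.0104, the "Significance F" in the output
```
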
F Test for Significance: Excel Output

F_STAT = MSR / MSE = 18934.9348 / 1708.1957 = 11.0848, with 1 and 8 degrees of freedom

From the ANOVA table: F = 11.0848 and Significance F (the p-value for the F test) = 0.01039.

F Test for Significance (continued)
H0: β1 = 0; H1: β1 ≠ 0
α = .05, df1 = 1, df2 = 8
Test statistic: F_STAT = MSR / MSE = 11.08
Critical value: F.05 = 5.32
Decision: Reject H0 at α = 0.05, since F_STAT = 11.08 > 5.32
Conclusion: There is sufficient evidence that house size affects selling price

Confidence Interval Estimate for the Slope

Confidence interval estimate of the slope: b1 ± t_α/2 · Sb1, with d.f. = n − 2

Excel printout for house prices:
              Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept     98.24833       58.03348         1.69296   0.12892   -35.57720   232.07386
Square Feet    0.10977        0.03297         3.32938   0.01039     0.03374     0.18580

At the 95% level of confidence, the confidence interval for the slope is (0.0337, 0.1858).

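A sketch of the same interval computed directly (scipy supplies the t critical value):

```python
from scipy import stats

b1, sb1, n = 0.10977, 0.03297, 10
t_crit = stats.t.ppf(0.975, df=n - 2)        # ~ 2.306 for 8 d.f.
ci = (b1 - t_crit * sb1, b1 + t_crit * sb1)  # ~ (0.0337, 0.1858)
```
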
Confidence Interval Estimate for the Slope (continued)
Since the units of the house price variable are $1000s, we are 95% confident that the average impact on sales price is between $33.74 and $185.80 per square foot of house size.
This 95% confidence interval does not include 0.
Conclusion: There is a significant relationship between house price and square feet at the .05 level of significance.

t Test for a Correlation Coefficient
• Hypotheses:
  H0: ρ = 0 (no correlation between X and Y)
  H1: ρ ≠ 0 (correlation exists)
• Test statistic (with n − 2 degrees of freedom):

t_STAT = (r − ρ) / √( (1 − r²) / (n − 2) )

where:
• r = +√r² if b1 > 0
• r = −√r² if b1 < 0

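A sketch of this statistic for the house-price example, using r = 0.762 from the output (scipy provides the p-value; the slides use critical values instead):

```python
import math
from scipy import stats

r, n = 0.762, 10
t_stat = (r - 0) / math.sqrt((1 - r**2) / (n - 2))   # ~ 3.33
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)      # ~ 0.01
```
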
t Test for a Correlation Coefficient (continued)
Is there evidence of a linear relationship between square feet and house price at the .05 level of significance?
H0: ρ = 0 (no correlation)
H1: ρ ≠ 0 (correlation exists)
α = .05, d.f. = 10 − 2 = 8

t_STAT = (r − ρ) / √( (1 − r²) / (n − 2) ) = (.762 − 0) / √( (1 − .762²) / (10 − 2) ) = 3.329

The critical values for α/2 = .025 and d.f. = 8 are ±2.3060, and t_STAT = 3.329 falls in the upper rejection region.
Decision: Reject H0.
Conclusion: There is evidence of a linear association at the 5% level of significance.

Estimating Mean Values and Predicting Individual Values
Goal: form intervals around Ŷ to express uncertainty about the value of Y for a given Xi.
(Graph of Ŷ = b0 + b1Xi showing, at Xi, a confidence interval for the mean of Y given Xi and a wider prediction interval for an individual Y given Xi.)

Confidence Interval for the Average Y, Given X
Confidence interval estimate for the mean value of Y given a particular Xi:

Confidence interval for μ_{Y|X=Xi}:  Ŷ ± t_α/2 · S_YX · √hi

where hi = 1/n + (Xi − X̄)² / SSX = 1/n + (Xi − X̄)² / Σ(Xi − X̄)²
The size of the interval varies according to the distance of Xi from the mean X̄.

Prediction Interval for an Individual Y, Given X
Prediction interval estimate for an individual value of Y given a particular Xi:

Prediction interval for Y_{X=Xi}:  Ŷ ± t_α/2 · S_YX · √(1 + hi)

The extra term under the square root widens the interval to reflect the added uncertainty for an individual case.

Estimation of Mean Values: Example
Confidence interval estimate for μ_{Y|X=Xi}: find the 95% confidence interval for the mean price of 2,000-square-foot houses.

Predicted price: Ŷi = 317.85 ($1,000s)

Ŷ ± t_0.025 · S_YX · √( 1/n + (Xi − X̄)² / Σ(Xi − X̄)² ) = 317.85 ± 37.12

The confidence interval endpoints are 280.66 and 354.90, or from $280,660 to $354,900.

Estimation of Individual Values: Example
Prediction interval estimate for Y_{X=Xi}: find the 95% prediction interval for an individual house with 2,000 square feet.

Predicted price: Ŷi = 317.85 ($1,000s)

Ŷ ± t_0.025 · S_YX · √( 1 + 1/n + (Xi − X̄)² / Σ(Xi − X̄)² ) = 317.85 ± 102.28

The prediction interval endpoints are 215.50 and 420.07, or from $215,500 to $420,070.

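Both intervals can be reproduced from the sample data; a sketch assuming the S_YX and fitted coefficients reported in the Excel output:

```python
import numpy as np
from scipy import stats

sqft = np.array([1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700], dtype=float)

n = len(sqft)
b0, b1, s_yx = 98.24833, 0.10977, 41.33032
x_new = 2000.0
y_hat = b0 + b1 * x_new                              # ~ 317.8 (the slides round to 317.85)

h = 1 / n + (x_new - sqft.mean()) ** 2 / np.sum((sqft - sqft.mean()) ** 2)
t_crit = stats.t.ppf(0.975, df=n - 2)

half_ci = t_crit * s_yx * np.sqrt(h)                 # ~ 37.1, half-width for the mean response
half_pi = t_crit * s_yx * np.sqrt(1 + h)             # ~ 102.3, half-width for an individual house
ci = (y_hat - half_ci, y_hat + half_ci)
pi = (y_hat - half_pi, y_hat + half_pi)
```
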
Finding Confidence and Prediction Intervals in Excel
• From Excel, use PHStat | regression | simple linear regression …
• Check the "confidence and prediction interval for X =" box and enter the X-value and confidence level desired

Finding Confidence and Prediction Intervals in Excel (continued)
(PHStat output showing the input values, the confidence interval estimate for μ_{Y|X=Xi}, and the prediction interval estimate for Y_{X=Xi}.)

Pitfalls of Regression Analysis
• Lacking an awareness of the assumptions underlying least-squares regression
• Not knowing how to evaluate the assumptions
• Not knowing the alternatives to least-squares regression if a particular assumption is violated
• Using a regression model without knowledge of the subject matter
• Extrapolating outside the relevant range

Strategies for Avoiding the Pitfalls of Regression
• Start with a scatter plot of X vs. Y to observe a possible relationship
• Perform residual analysis to check the assumptions:
  – Plot the residuals vs. X to check for violations of assumptions such as homoscedasticity
  – Use a histogram, stem-and-leaf display, boxplot, or normal probability plot of the residuals to uncover possible non-normality
• If there is violation of any assumption, use alternative methods or models
• If there is no evidence of assumption violation, then test for the significance of the regression coefficients and construct confidence intervals and prediction intervals
• Avoid making predictions or forecasts outside the relevant range

Chapter Summary
• Introduced types of regression models
• Reviewed assumptions of regression and correlation
• Discussed determining the simple linear regression equation
• Described measures of variation
• Discussed residual analysis
• Addressed measuring autocorrelation
• Described inference about the slope
• Discussed correlation -- measuring the strength of the association
• Addressed estimation of mean values and prediction of individual values
• Discussed possible pitfalls in regression and recommended strategies to avoid them

Contenu connexe

Tendances

Regression Analysis
Regression AnalysisRegression Analysis
Regression AnalysisSalim Azad
 
Introduction to correlation and regression analysis
Introduction to correlation and regression analysisIntroduction to correlation and regression analysis
Introduction to correlation and regression analysisFarzad Javidanrad
 
Statistical inference: Statistical Power, ANOVA, and Post Hoc tests
Statistical inference: Statistical Power, ANOVA, and Post Hoc testsStatistical inference: Statistical Power, ANOVA, and Post Hoc tests
Statistical inference: Statistical Power, ANOVA, and Post Hoc testsEugene Yan Ziyou
 
Simple linear regression (final)
Simple linear regression (final)Simple linear regression (final)
Simple linear regression (final)Harsh Upadhyay
 
Kolmogorov smirnov test
Kolmogorov smirnov testKolmogorov smirnov test
Kolmogorov smirnov testRamesh Giri
 
Normal distribution
Normal distributionNormal distribution
Normal distributionCamilleJoy3
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regressionMohit Asija
 
Uniform Distribution
Uniform DistributionUniform Distribution
Uniform Distributionmathscontent
 
Multiple linear regression
Multiple linear regressionMultiple linear regression
Multiple linear regressionKen Plummer
 
Basis of statistical inference
Basis of statistical inferenceBasis of statistical inference
Basis of statistical inferencezahidacademy
 
Partial and multiple correlation and regression
Partial and multiple correlation and regressionPartial and multiple correlation and regression
Partial and multiple correlation and regressionKritika Jain
 
Multiple regression presentation
Multiple regression presentationMultiple regression presentation
Multiple regression presentationCarlo Magno
 
Basic Descriptive statistics
Basic Descriptive statisticsBasic Descriptive statistics
Basic Descriptive statisticsAjendra Sharma
 
Linear models for data science
Linear models for data scienceLinear models for data science
Linear models for data scienceBrad Klingenberg
 
Regression analysis
Regression analysisRegression analysis
Regression analysissaba khan
 
Regression analysis by Muthama JM
Regression analysis by Muthama JMRegression analysis by Muthama JM
Regression analysis by Muthama JMJapheth Muthama
 

Tendances (20)

Regression Analysis
Regression AnalysisRegression Analysis
Regression Analysis
 
Regression Analysis
Regression AnalysisRegression Analysis
Regression Analysis
 
Introduction to correlation and regression analysis
Introduction to correlation and regression analysisIntroduction to correlation and regression analysis
Introduction to correlation and regression analysis
 
Statistical inference: Statistical Power, ANOVA, and Post Hoc tests
Statistical inference: Statistical Power, ANOVA, and Post Hoc testsStatistical inference: Statistical Power, ANOVA, and Post Hoc tests
Statistical inference: Statistical Power, ANOVA, and Post Hoc tests
 
Simple linear regression (final)
Simple linear regression (final)Simple linear regression (final)
Simple linear regression (final)
 
Kolmogorov smirnov test
Kolmogorov smirnov testKolmogorov smirnov test
Kolmogorov smirnov test
 
Normal distribution
Normal distributionNormal distribution
Normal distribution
 
Normal distribution
Normal distributionNormal distribution
Normal distribution
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regression
 
Uniform Distribution
Uniform DistributionUniform Distribution
Uniform Distribution
 
Multiple linear regression
Multiple linear regressionMultiple linear regression
Multiple linear regression
 
Bivariate
BivariateBivariate
Bivariate
 
Basis of statistical inference
Basis of statistical inferenceBasis of statistical inference
Basis of statistical inference
 
Partial and multiple correlation and regression
Partial and multiple correlation and regressionPartial and multiple correlation and regression
Partial and multiple correlation and regression
 
Multiple regression presentation
Multiple regression presentationMultiple regression presentation
Multiple regression presentation
 
Regression analysis
Regression analysisRegression analysis
Regression analysis
 
Basic Descriptive statistics
Basic Descriptive statisticsBasic Descriptive statistics
Basic Descriptive statistics
 
Linear models for data science
Linear models for data scienceLinear models for data science
Linear models for data science
 
Regression analysis
Regression analysisRegression analysis
Regression analysis
 
Regression analysis by Muthama JM
Regression analysis by Muthama JMRegression analysis by Muthama JM
Regression analysis by Muthama JM
 

Similaire à metod linearne regresije

linear Regression, multiple Regression and Annova
linear Regression, multiple Regression and Annovalinear Regression, multiple Regression and Annova
linear Regression, multiple Regression and AnnovaMansi Rastogi
 
Linear regression and correlation analysis ppt @ bec doms
Linear regression and correlation analysis ppt @ bec domsLinear regression and correlation analysis ppt @ bec doms
Linear regression and correlation analysis ppt @ bec domsBabasab Patil
 
Correlation by Neeraj Bhandari ( Surkhet.Nepal )
Correlation by Neeraj Bhandari ( Surkhet.Nepal )Correlation by Neeraj Bhandari ( Surkhet.Nepal )
Correlation by Neeraj Bhandari ( Surkhet.Nepal )Neeraj Bhandari
 
Statistics-Regression analysis
Statistics-Regression analysisStatistics-Regression analysis
Statistics-Regression analysisRabin BK
 
Presentasi Tentang Regresi Linear
Presentasi Tentang Regresi LinearPresentasi Tentang Regresi Linear
Presentasi Tentang Regresi Lineardessybudiyanti
 
Simple Linier Regression
Simple Linier RegressionSimple Linier Regression
Simple Linier Regressiondessybudiyanti
 
lecture No. 3a.ppt
lecture No. 3a.pptlecture No. 3a.ppt
lecture No. 3a.pptHamidUllah50
 
Regression Analysis.pptx
Regression Analysis.pptxRegression Analysis.pptx
Regression Analysis.pptxShivankAggatwal
 
20 ms-me-amd-06 (simple linear regression)
20 ms-me-amd-06 (simple linear regression)20 ms-me-amd-06 (simple linear regression)
20 ms-me-amd-06 (simple linear regression)HassanShah124
 
Simple regression model
Simple regression modelSimple regression model
Simple regression modelMidoTami
 
Correlation and Regression Analysis.pptx
Correlation and Regression Analysis.pptxCorrelation and Regression Analysis.pptx
Correlation and Regression Analysis.pptxUnfold1
 
Corr-and-Regress (1).ppt
Corr-and-Regress (1).pptCorr-and-Regress (1).ppt
Corr-and-Regress (1).pptMuhammadAftab89
 
Cr-and-Regress.ppt
Cr-and-Regress.pptCr-and-Regress.ppt
Cr-and-Regress.pptRidaIrfan10
 
Corr-and-Regress.ppt
Corr-and-Regress.pptCorr-and-Regress.ppt
Corr-and-Regress.pptkrunal soni
 

Similaire à metod linearne regresije (20)

linear Regression, multiple Regression and Annova
linear Regression, multiple Regression and Annovalinear Regression, multiple Regression and Annova
linear Regression, multiple Regression and Annova
 
Linear regression and correlation analysis ppt @ bec doms
Linear regression and correlation analysis ppt @ bec domsLinear regression and correlation analysis ppt @ bec doms
Linear regression and correlation analysis ppt @ bec doms
 
Correlation by Neeraj Bhandari ( Surkhet.Nepal )
Correlation by Neeraj Bhandari ( Surkhet.Nepal )Correlation by Neeraj Bhandari ( Surkhet.Nepal )
Correlation by Neeraj Bhandari ( Surkhet.Nepal )
 
Linear regression
Linear regressionLinear regression
Linear regression
 
Statistics-Regression analysis
Statistics-Regression analysisStatistics-Regression analysis
Statistics-Regression analysis
 
Presentasi Tentang Regresi Linear
Presentasi Tentang Regresi LinearPresentasi Tentang Regresi Linear
Presentasi Tentang Regresi Linear
 
Simple Linier Regression
Simple Linier RegressionSimple Linier Regression
Simple Linier Regression
 
lecture No. 3a.ppt
lecture No. 3a.pptlecture No. 3a.ppt
lecture No. 3a.ppt
 
Regression Analysis.pptx
Regression Analysis.pptxRegression Analysis.pptx
Regression Analysis.pptx
 
Regression
RegressionRegression
Regression
 
20 ms-me-amd-06 (simple linear regression)
20 ms-me-amd-06 (simple linear regression)20 ms-me-amd-06 (simple linear regression)
20 ms-me-amd-06 (simple linear regression)
 
Simple regression model
Simple regression modelSimple regression model
Simple regression model
 
Correlation and Regression Analysis.pptx
Correlation and Regression Analysis.pptxCorrelation and Regression Analysis.pptx
Correlation and Regression Analysis.pptx
 
Regression
RegressionRegression
Regression
 
Reg
RegReg
Reg
 
Corr-and-Regress (1).ppt
Corr-and-Regress (1).pptCorr-and-Regress (1).ppt
Corr-and-Regress (1).ppt
 
Corr-and-Regress.ppt
Corr-and-Regress.pptCorr-and-Regress.ppt
Corr-and-Regress.ppt
 
Cr-and-Regress.ppt
Cr-and-Regress.pptCr-and-Regress.ppt
Cr-and-Regress.ppt
 
Corr-and-Regress.ppt
Corr-and-Regress.pptCorr-and-Regress.ppt
Corr-and-Regress.ppt
 
Corr-and-Regress.ppt
Corr-and-Regress.pptCorr-and-Regress.ppt
Corr-and-Regress.ppt
 

Plus de univerzitet u beogradu (9)

Osobine Upravnog Akta
Osobine Upravnog AktaOsobine Upravnog Akta
Osobine Upravnog Akta
 
osobine upravnih akata
osobine upravnih akataosobine upravnih akata
osobine upravnih akata
 
Prekrsajno pravo
Prekrsajno pravo Prekrsajno pravo
Prekrsajno pravo
 
Prekršajno pravo
Prekršajno pravo Prekršajno pravo
Prekršajno pravo
 
Prekrsajno pravo FVL
 Prekrsajno pravo  FVL Prekrsajno pravo  FVL
Prekrsajno pravo FVL
 
Uputstva za pisanje master rada
Uputstva za pisanje master radaUputstva za pisanje master rada
Uputstva za pisanje master rada
 
Kriminologija
KriminologijaKriminologija
Kriminologija
 
Dobra pp-prezentacija (1)
Dobra pp-prezentacija (1)Dobra pp-prezentacija (1)
Dobra pp-prezentacija (1)
 
Meтеорологија увод
Meтеорологија уводMeтеорологија увод
Meтеорологија увод
 

Dernier

Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Shubhangi Sonawane
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfAyushMahapatra5
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...christianmathematics
 
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-IIFood Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-IIShubhangi Sonawane
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxVishalSingh1417
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptxMaritesTamaniVerdade
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.christianmathematics
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfPoh-Sun Goh
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxRamakrishna Reddy Bijjam
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfJayanti Pande
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.MaryamAhmad92
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxVishalSingh1417
 

Dernier (20)

Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-IIFood Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
Food Chain and Food Web (Ecosystem) EVS, B. Pharmacy 1st Year, Sem-II
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
2024-NATIONAL-LEARNING-CAMP-AND-OTHER.pptx
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 

metod linearne regresije

  • 1. Linearna Regresija ILI - KAKO OBJASNITI METODU NAJMANJEG KVADRANTA TOTALNOM DEBILU dr Dragoljub mojja verzija na" Indijanskom Egleskom"
  • 2. Learning Objectives In this chapter, you learn:  How to use regression analysis to predict the value of a dependent variable based on an independent variable  The meaning of the regression coefficients b 0 and b1  How to evaluate the assumptions of regression analysis and know what to do if the assumptions are violated  To make inferences about the slope and correlation coefficient  To estimate mean values and predict individual values 13-2
  • 3. Correlation vs. Regression   A scatter plot can be used to show the relationship between two variables DCOVA Correlation analysis is used to measure the strength of the association (linear relationship) between two variables  Correlation is only concerned with strength of the relationship  No causal effect is implied with correlation  Scatter plots were first presented in Ch. 2  Correlation was first presented in Ch. 3 13-3
  • 4. Introduction to Regression Analysis DCOVA  Regression analysis is used to:   Predict the value of a dependent variable based on the value of at least one independent variable Explain the impact of changes in an independent variable on the dependent variable Dependent variable: the variable we wish to predict or explain Independent variable: the variable used to predict or explain the dependent variable 13-4
  • 5. Simple Linear Regression Model DCOVA    Only one independent variable, X Relationship between X and Y is described by a linear function Changes in Y are assumed to be related to changes in X 13-5
  • 6. Types of Relationships DCOVA Linear relationships Y Curvilinear relationships Y X Y X Y X X 13-6
  • 7. Types of Relationships DCOVA (continued) Strong relationships Y Weak relationships Y X Y X Y X X 13-7
  • 9. Simple Linear Regression Model DCOVA Population Y intercept Dependent Variable Population Slope Coefficient Independent Variable Random Error term Yi  β0  β1Xi  ε i Linear component Random Error component 13-9
  • 10. Simple Linear Regression DCOVA Model (continued) Y Yi  β0  β1Xi  ε i Observed Value of Y for Xi εi Predicted Value of Y for Xi Slope = β1 Random Error for this Xi value Intercept = β0 Xi X 13-10
  • 11. Simple Linear Regression Equation (Prediction Line) DCOVA The simple linear regression equation provides an estimate of the population regression line Estimated (or predicted) Y value for observation i Estimate of the regression intercept Estimate of the regression slope ˆ b b X Yi 0 1 i Value of X for observation i 13-11
  • 12. The Least Squares Method DCOVA b0 and b1 are obtained by finding the values of that minimize the sum of the squared ˆ differences between Y and Y : ˆ )2  min  (Y  (b  b X ))2 min  (Yi Yi i 0 1 i 13-12
  • 13. Finding the Least Squares Equation DCOVA  The coefficients b0 and b1 , and other regression results in this chapter, will be found using Excel Formulas are shown in the text for those who are interested 13-13
  • 14. Interpretation of the Slope and the Intercept DCOVA   b0 is the estimated average value of Y when the value of X is zero b1 is the estimated change in the average value of Y as a result of a one-unit increase in X 13-14
  • 15. Simple Linear Regression Example DCOVA   A real estate agent wishes to examine the relationship between the selling price of a home and its size (measured in square feet) A random sample of 10 houses is selected  Dependent variable (Y) = house price in $1000s  Independent variable (X) = square feet 13-15
  • 16. Simple Linear Regression Example: Data House Price in $1000s (Y) Square Feet (X) 245 1400 312 1600 279 1700 308 1875 199 1100 219 1550 405 2350 324 2450 319 1425 255 DCOVA 1700 13-16
  • 17. Simple Linear Regression Example: Scatter Plot House price model: Scatter Plot DCOVA 13-17
  • 18. Simple Linear Regression Example: Using Excel Data Analysis Function 1. Choose Data DCOVA 2. Choose Data Analysis 3. Choose Regression 13-18
  • 19. Simple Linear Regression Example: Using Excel Data Analysis Function (continued) Enter Y’s and X’s and desired options DCOVA 13-19
  • 20. Simple Linear Regression Example: Using PHStat Add-Ins: PHStat: Regression: Simple Linear Regression 13-20
  • 21. Simple Linear Regression Example: Excel Output DCOVA Regression Statistics Multiple R 0.76211 R Square 0.58082 Adjusted R Square 0.52842 Standard Error The regression equation is: house price  98.24833  0.10977 (square feet) 41.33032 Observations 10 ANOVA df SS MS Regression 1 18934.9348 18934.9348 Residual 8 13665.5652 9 32600.5000 Coefficients Intercept Square Feet Standard Error t Stat 11.0848 Significance F 1708.1957 Total F P-value 0.01039 Lower 95% Upper 95% 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580 13-21
  • 22. Simple Linear Regression Example: Graphical Representation DCOVA House price model: Scatter Plot and Prediction Line Slope = 0.10977 Intercept = 98.248 house price  98.24833  0.10977 (square feet) 13-22
  • 23. Simple Linear Regression Example: Interpretation of bo DCOVA house price  98.24833  0.10977 (square feet)   b0 is the estimated average value of Y when the value of X is zero (if X = 0 is in the range of observed X values) Because a house cannot have a square footage of 0, b0 has no practical application 13-23
  • 24. Simple Linear Regression Example: Interpreting b1 DCOVA house price  98.24833  0.10977 (square feet)  b1 estimates the change in the average value of Y as a result of a one-unit increase in X  Here, b1 = 0.10977 tells us that the mean value of a house increases by .10977($1000) = $109.77, on average, for each additional one square foot of size 13-24
  • 25. Simple Linear Regression Example: Making Predictions Predict the price for a house with 2000 square feet: DCOVA house price  98.25  0.1098 (sq.ft.)  98.25  0.1098(200 0)  317.85 The predicted price for a house with 2000 square feet is 317.85($1,000s) = $317,850 13-25
  • 26. Simple Linear Regression Example: Making Predictions DCOVA  When using a regression model for prediction, only predict within the relevant range of data Relevant range for interpolation Do not try to extrapolate beyond the range of observed X’s 13-26
  • 27. Measures of Variation DCOVA  Total variation is made up of two parts: SST  SSR  Total Sum of Squares Regression Sum of Squares SST   ( Yi  Y )2 ˆ SSR   ( Yi  Y )2 SSE Error Sum of Squares ˆ SSE   ( Yi  Yi )2 where: Y = Mean value of the dependent variable Yi = Observed value of the dependent variable ˆ Yi = Predicted value of Y for the given X i value 13-27
  • 28. Measures of Variation (continued) DCOVA  SST = total sum of squares   Measures the variation of the Y i values around their mean Y SSR = regression sum of squares (Explained Variation)   (Total Variation) Variation attributable to the relationship between X and Y SSE = error sum of squares (Unexplained Variation)  Variation in Y attributable to factors other than X 13-28
  • 29. Measures of Variation (continued) DCOVA Y Yi  SSE = (Yi - Yi )2  Y _  Y SST = (Yi - Y)2  _ SSR = (Yi - Y)2 _ Y Xi _ Y X 13-29
  • 30. Coefficient of Determination, r2 DCOVA   The coefficient of determination is the portion of the total variation in the dependent variable that is explained by variation in the independent variable The coefficient of determination is also called r-squared and is denoted as r2 SSR regression sum of squares r   SST total sum of squares 2 note: 0 r 1 2 13-30
  • 31. Examples of Approximate r2 Values DCOVA Y r2 = 1 r2 = 1 X 100% of the variation in Y is explained by variation in X Y r2 =1 Perfect linear relationship between X and Y: X 13-31
  • 32. Examples of Approximate r2 Values DCOVA Y 0 < r2 < 1 X Weaker linear relationships between X and Y: Some but not all of the variation in Y is explained by variation in X Y X 13-32
  • 33. Examples of Approximate r2 Values DCOVA r2 = 0 Y No linear relationship between X and Y: r2 = 0 X The value of Y does not depend on X. (None of the variation in Y is explained by variation in X) 13-33
  • 34. Simple Linear Regression Example: Coefficient of Determination, r2 in Excel DCOVA SSR 18934.9348 r    0.58082 SST 32600.5000 2 Regression Statistics Multiple R 0.76211 R Square 0.58082 Adjusted R Square 0.52842 Standard Error 58.08% of the variation in house prices is explained by variation in square feet 41.33032 Observations 10 ANOVA df SS MS Regression 1 18934.9348 18934.9348 Residual 8 13665.5652 9 32600.5000 Coefficients Intercept Square Feet Standard Error t Stat 11.0848 Significance F 1708.1957 Total F P-value 0.01039 Lower 95% Upper 95% 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580 13-34
  • 35. Standard Error of Estimate DCOVA  The standard deviation of the variation of observations around the regression line is estimated by n SSE S YX   n2  ˆ (Yi  Yi ) 2 i 1 n2 Where SSE = error sum of squares n = sample size 13-35
  • 36. Simple Linear Regression Example: Standard Error of Estimate in Excel DCOVA Regression Statistics Multiple R 0.76211 R Square 0.58082 Adjusted R Square S YX  41.33032 0.52842 Standard Error 41.33032 Observations 10 ANOVA df SS MS Regression 1 18934.9348 18934.9348 Residual 8 13665.5652 9 32600.5000 Coefficients Intercept Square Feet Standard Error t Stat 11.0848 Significance F 1708.1957 Total F P-value 0.01039 Lower 95% Upper 95% 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580 13-36
  • 37. Comparing Standard Errors SYX is a measure of the variation of observed Y values from the regression line Y DCOVA Y small SYX X large SYX X The magnitude of SYX should always be judged relative to the size of the Y values in the sample data i.e., SYX = $41.33K is moderately small relative to house prices in the $200K - $400K range 13-37
  • 38. Assumptions of Regression L.I.N.E DCOVA     Linearity  The relationship between X and Y is linear Independence of Errors  Error values are statistically independent Normality of Error  Error values are normally distributed for any given value of X Equal Variance (also called homoscedasticity)  The probability distribution of the errors has constant variance 13-38
  • 39. Residual Analysis ˆ ei  Yi  Yi   The residual for observation i, e i, is the difference between its observed and predicted value Check the assumptions of regression by examining the residuals  Examine for linearity assumption  Evaluate independence assumption  Evaluate normal distribution assumption   DCOVA Examine for constant variance for all levels of X (homoscedasticity) Graphical Analysis of Residuals  Can plot residuals vs. X 13-39
  • 40. Residual Analysis for Linearity DCOVA Y Y x x Not Linear residuals residuals x x  Linear 13-40
  • 41. Residual Analysis for Independence DCOVA Not Independent X residuals residuals X residuals  Independent X 13-41
  • 42. Checking for Normality DCOVA     Examine the Stem-and-Leaf Display of the Residuals Examine the Boxplot of the Residuals Examine the Histogram of the Residuals Construct a Normal Probability Plot of the Residuals 13-42
  • 43. Residual Analysis for Normality DCOVA When using a normal probability plot, normal errors will approximately display in a straight line Percent 100 0 -3 -2 -1 0 1 2 3 Residual 13-43
  • 44. Residual Analysis for Equal Variance DCOVA Y Y x x Non-constant variance residuals residuals x x  Constant variance 13-44
  • 45. Simple Linear Regression Example: Excel Residual Output DCOVA RESIDUAL OUTPUT Predicted House Price Residuals 1 251.92316 -6.923162 2 273.87671 38.12329 3 284.85348 -5.853484 4 304.06284 3.937162 5 218.99284 -19.99284 6 268.38832 -49.38832 7 356.20251 48.79749 8 367.17929 -43.17929 9 254.6674 64.33264 10 284.85348 -29.85348 Does not appear to violate any regression assumptions 13-45
  • 46. Measuring Autocorrelation: The Durbin-Watson Statistic DCOVA   Used when data are collected over time to detect if autocorrelation is present Autocorrelation exists if residuals in one time period are related to residuals in another period 13-46
  • 47. Autocorrelation DCOVA  Autocorrelation is correlation of the errors (residuals) over time Here, residuals show a cyclic pattern, not random. Cyclical patterns are a sign of positive autocorrelation Violates the regression assumption that residuals are random and independent 13-47
  • 48. The Durbin-Watson Statistic DCOVA  The Durbin-Watson statistic is used to test for autocorrelation H0: residuals are not correlated H1: positive autocorrelation is present n D  (e  e i 2 i i1 2 ) The possible range is 0 ≤ D ≤ 4 D should be close to 2 if H 0 is true n ei2  i1 D less than 2 may signal positive autocorrelation, D greater than 2 may signal negative autocorrelation 13-48
  • 49. Testing for Positive Autocorrelation H0: positive autocorrelation does not exist DCOVA H1: positive autocorrelation is present Calculate the Durbin-Watson test statistic = D (The Durbin-Watson Statistic can be found using Excel or Minitab) Find the values dL and dU from the Durbin-Watson table (for sample size n and number of independent variables k) Decision rule: reject H0 if D < dL Reject H0 0 Inconclusive dL Do not reject H0 dU 2 13-49
  • 50. Testing for Positive Autocorrelation (continued) DCOVA  Suppose we have the following time series data:  Is there autocorrelation? 13-50
  • 51. Testing for Positive Autocorrelation  (continued) DCOVA Example with n = 25: Excel/PHStat output: Durbin-Watson Calculations Sum of Squared Difference of Residuals 3296.18 Sum of Squared Residuals 3279.98 Durbin-Watson Statistic 1.00494 n D (ei  ei1 )2  i 2 n e i1 2  3296.18  1.00494 3279.98 i 13-51
  • 52. Testing for Positive Autocorrelation (continued)  DCOVA Here, n = 25 and there is k = 1 one independent variable  Using the Durbin-Watson table, dL = 1.29 and dU = 1.45  D = 1.00494 < dL = 1.29, so reject H0 and conclude that significant positive autocorrelation exists Decision: reject H0 since D = 1.00494 < d L Reject H0 0 Inconclusive dL=1.29 Do not reject H0 dU=1.45 2 13-52
  • 53. Inferences About the Slope DCOVA  The standard error of the regression slope coefficient (b1) is estimated by S YX Sb1   SSX S YX (Xi  X)2  where: Sb1 = Estimate of the standard error of the slope S YX SSE = Standard error of the estimate  n2 13-53
  • 54. Inferences About the Slope: t Test DCOVA  t test for a population slope: is there a linear relationship between X and Y? Null and alternative hypotheses: H0: β1 = 0 (no linear relationship); H1: β1 ≠ 0 (a linear relationship does exist). Test statistic:

t_{STAT} = \frac{b_1 - \beta_1}{S_{b_1}}, \qquad d.f. = n - 2

where b1 = regression slope coefficient, β1 = hypothesized slope, S_{b_1} = standard error of the slope. 13-54
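A minimal Python sketch of the slope standard error and t statistic, assuming arrays `x` and `y` hold the paired observations; the chapter obtains the same quantities from Excel, so this is only an illustrative cross-check.

```python
import numpy as np
from scipy import stats

def slope_t_test(x, y, beta1_hypothesized=0.0):
    """Return b1, its standard error, the t statistic, and the two-tailed p-value."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    ssx = np.sum((x - x.mean()) ** 2)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / ssx
    b0 = y.mean() - b1 * x.mean()
    sse = np.sum((y - (b0 + b1 * x)) ** 2)
    s_yx = np.sqrt(sse / (n - 2))          # standard error of the estimate
    s_b1 = s_yx / np.sqrt(ssx)             # standard error of the slope
    t_stat = (b1 - beta1_hypothesized) / s_b1
    p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)   # two-tailed
    return b1, s_b1, t_stat, p_value
```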
  • 55. Inferences About the Slope: t Test Example DCOVA

House Price in $1000s (Y)   Square Feet (X)
245                          1400
312                          1600
279                          1700
308                          1875
199                          1100
219                          1550
405                          2350
324                          2450
319                          1425
255                          1700

Estimated regression equation: house price = 98.25 + 0.1098 (sq. ft.). The slope of this model is 0.1098. Is there a relationship between the square footage of the house and its sales price? 13-55
  • 56. Inferences About the Slope: t Test Example DCOVA  H0: β1 = 0; H1: β1 ≠ 0. From Excel output:

              Coefficients   Standard Error   t Stat    P-value
Intercept     98.24833       58.03348         1.69296   0.12892
Square Feet    0.10977        0.03297         3.32938   0.01039

t_{STAT} = \frac{b_1 - \beta_1}{S_{b_1}} = \frac{0.10977 - 0}{0.03297} = 3.32938

13-56
  • 57. Inferences About the Slope: t Test Example DCOVA  H0: β1 = 0; H1: β1 ≠ 0. Test statistic: t_STAT = 3.329 with d.f. = 10 − 2 = 8. With α/2 = .025, the rejection regions lie below −t_{α/2} = −2.3060 and above t_{α/2} = 2.3060. Since 3.329 > 2.3060, reject H0: there is sufficient evidence that square footage affects house price. 13-57
  • 58. Inferences About the Slope: t Test Example DCOVA  H0: β1 = 0; H1: β1 ≠ 0. From the Excel output on slide 56, the p-value for Square Feet is 0.01039. Decision: reject H0, since the p-value < α. There is sufficient evidence that square footage affects house price. 13-58
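As a cross-check on the Excel output above, here is a sketch that applies the slope t test directly to the ten-house data from slide 55; the printed values should be close to b0 = 98.25, b1 = 0.10977, t = 3.329, and p = 0.0104, up to rounding.

```python
import numpy as np
from scipy import stats

# House-price data from slide 55 (price in $1000s, size in square feet)
sqft  = np.array([1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700], float)
price = np.array([ 245,  312,  279,  308,  199,  219,  405,  324,  319,  255], float)

n = len(sqft)
ssx = np.sum((sqft - sqft.mean()) ** 2)
b1 = np.sum((sqft - sqft.mean()) * (price - price.mean())) / ssx
b0 = price.mean() - b1 * sqft.mean()
sse = np.sum((price - (b0 + b1 * sqft)) ** 2)
s_b1 = np.sqrt(sse / (n - 2)) / np.sqrt(ssx)
t_stat = b1 / s_b1
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)
print(f"b0 = {b0:.3f}, b1 = {b1:.5f}, t = {t_stat:.3f}, p = {p_value:.5f}")
```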
  • 59. F Test for Significance DCOVA  F test statistic:

F_{STAT} = \frac{MSR}{MSE}, \qquad MSR = \frac{SSR}{k}, \qquad MSE = \frac{SSE}{n - k - 1}

F_STAT follows an F distribution with k numerator and (n − k − 1) denominator degrees of freedom, where k is the number of independent variables in the regression model. 13-59
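A minimal sketch of this test, assuming SSR, SSE, n, and k are already available (for example, from an ANOVA table); scipy's F distribution supplies the p-value. The values in the example call are taken from the house-price ANOVA output on the next slide.

```python
from scipy import stats

def regression_f_test(ssr, sse, n, k):
    """Overall F test for a regression with k independent variables."""
    msr = ssr / k
    mse = sse / (n - k - 1)
    f_stat = msr / mse
    p_value = stats.f.sf(f_stat, k, n - k - 1)
    return f_stat, p_value

# Sums of squares from the house-price ANOVA table (slide 60)
f_stat, p = regression_f_test(ssr=18934.9348, sse=13665.5652, n=10, k=1)
print(f"F = {f_stat:.4f}, p = {p:.5f}")   # roughly F = 11.08, p = 0.0104
```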
  • 60. F Test for Significance: Excel Output DCOVA

Regression Statistics
Multiple R            0.76211
R Square              0.58082
Adjusted R Square     0.52842
Standard Error       41.33032
Observations         10

ANOVA
              df   SS           MS           F         Significance F
Regression     1   18934.9348   18934.9348   11.0848   0.01039
Residual       8   13665.5652    1708.1957
Total          9   32600.5000

F_{STAT} = \frac{MSR}{MSE} = \frac{18934.9348}{1708.1957} = 11.0848, with 1 and 8 degrees of freedom; the p-value for the F test is 0.01039. 13-60
  • 61. F Test for Significance (continued) DCOVA  H0: β1 = 0; H1: β1 ≠ 0; α = .05; df1 = 1, df2 = 8. Test statistic: F_STAT = MSR / MSE = 11.08. Critical value: F.05 = 5.32. Decision: since 11.08 > 5.32, reject H0 at α = 0.05. Conclusion: there is sufficient evidence that house size affects selling price. 13-61
  • 62. Confidence Interval Estimate for the Slope DCOVA  Confidence interval estimate of the slope:

b_1 \pm t_{\alpha/2} S_{b_1}, \qquad d.f. = n - 2

Excel printout for house prices:

              Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept     98.24833       58.03348         1.69296   0.12892   -35.57720   232.07386
Square Feet    0.10977        0.03297         3.32938   0.01039     0.03374     0.18580

At the 95% level of confidence, the confidence interval for the slope is (0.0337, 0.1858). 13-62
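A sketch of the same interval computed directly, assuming the slope estimate and its standard error are known (here taken from the Excel printout above); scipy provides the t critical value.

```python
from scipy import stats

# Slope estimate and standard error from the Excel output; n = 10 observations
b1, s_b1, n = 0.10977, 0.03297, 10
t_crit = stats.t.ppf(0.975, df=n - 2)          # t critical value for 95% confidence
lower, upper = b1 - t_crit * s_b1, b1 + t_crit * s_b1
print(f"95% CI for the slope: ({lower:.4f}, {upper:.4f})")  # approx. (0.0337, 0.1858)
```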
  • 63. Confidence Interval Estimate for the Slope (continued) DCOVA  Since the units of the house price variable are $1000s, we are 95% confident that the average impact on sales price is between $33.74 and $185.80 per additional square foot of house size. This 95% confidence interval does not include 0. Conclusion: there is a significant relationship between house price and square feet at the .05 level of significance. 13-63
  • 64. t Test for a Correlation Coefficient DCOVA  Hypotheses: H0: ρ = 0 (no correlation between X and Y); H1: ρ ≠ 0 (correlation exists). Test statistic (with n − 2 degrees of freedom):

t_{STAT} = \frac{r - \rho}{\sqrt{\dfrac{1 - r^2}{n - 2}}}, \qquad r = +\sqrt{r^2} \text{ if } b_1 > 0, \qquad r = -\sqrt{r^2} \text{ if } b_1 < 0

13-64
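A minimal Python sketch of this test, assuming paired arrays `x` and `y`; the correlation is computed directly rather than from r² and the sign of b1, which yields the same value.

```python
import numpy as np
from scipy import stats

def correlation_t_test(x, y):
    """t test of H0: rho = 0 against H1: rho != 0."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    r = np.corrcoef(x, y)[0, 1]                  # sample correlation coefficient
    t_stat = r / np.sqrt((1 - r ** 2) / (n - 2))
    p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)
    return r, t_stat, p_value
```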
  • 65. t Test for a Correlation Coefficient (continued) DCOVA  Is there evidence of a linear relationship between square feet and house price at the .05 level of significance? H0: ρ = 0 (no correlation); H1: ρ ≠ 0 (correlation exists); α = .05, d.f. = 10 − 2 = 8.

t_{STAT} = \frac{r - \rho}{\sqrt{\dfrac{1 - r^2}{n - 2}}} = \frac{.762 - 0}{\sqrt{\dfrac{1 - .762^2}{10 - 2}}} = 3.329

13-65
  • 66. t Test for a Correlation Coefficient (continued) DCOVA  With d.f. = 10 − 2 = 8 and α/2 = .025, the rejection regions lie below −t_{α/2} = −2.3060 and above t_{α/2} = 2.3060. Since t_STAT = 3.329 > 2.3060, reject H0. Conclusion: there is evidence of a linear association at the 5% level of significance. 13-66
  • 67. Estimating Mean Values and Predicting Individual Values DCOVA  Goal: form intervals around Ŷ to express uncertainty about the value of Y for a given Xi. [Plot of the prediction line Ŷ = b0 + b1Xi showing, at a given Xi, a narrower confidence interval for the mean of Y and a wider prediction interval for an individual Y.] 13-67
  • 68. Confidence Interval for the Average Y, Given X DCOVA  Confidence interval estimate for the mean value of Y given a particular Xi (confidence interval for μ_{Y|X=Xi}):

\hat{Y} \pm t_{\alpha/2}\, S_{YX} \sqrt{h_i}, \qquad h_i = \frac{1}{n} + \frac{(X_i - \bar{X})^2}{SSX} = \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum (X_i - \bar{X})^2}

The size of the interval varies according to the distance of Xi from the mean, X̄. 13-68
  • 69. Prediction Interval for an Individual Y, Given X DCOVA  Interval estimate for an individual value of Y given a particular Xi (prediction interval for Y_{X=Xi}):

\hat{Y} \pm t_{\alpha/2}\, S_{YX} \sqrt{1 + h_i}

The extra 1 under the square root adds to the interval width to reflect the added uncertainty for an individual case. 13-69
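A sketch of both intervals, assuming arrays `x` and `y` and a target value `x_new`; the helper recomputes the fit, S_YX, and h_i from the formulas on the two preceding slides, so it is illustrative rather than the chapter's PHStat procedure. The example call uses the ten-house data and Xi = 2,000 square feet, matching the slides that follow.

```python
import numpy as np
from scipy import stats

def mean_and_prediction_intervals(x, y, x_new, confidence=0.95):
    """Confidence interval for the mean of Y and prediction interval for an
    individual Y at x_new, from a simple linear regression of y on x."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    ssx = np.sum((x - x.mean()) ** 2)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / ssx
    b0 = y.mean() - b1 * x.mean()
    s_yx = np.sqrt(np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2))
    h = 1 / n + (x_new - x.mean()) ** 2 / ssx
    t_crit = stats.t.ppf((1 + confidence) / 2, df=n - 2)
    y_hat = b0 + b1 * x_new
    ci = (y_hat - t_crit * s_yx * np.sqrt(h),     y_hat + t_crit * s_yx * np.sqrt(h))
    pi = (y_hat - t_crit * s_yx * np.sqrt(1 + h), y_hat + t_crit * s_yx * np.sqrt(1 + h))
    return y_hat, ci, pi

# House-price data from slide 55; Xi = 2,000 sq. ft. matches slides 70-71
sqft  = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]
y_hat, ci, pi = mean_and_prediction_intervals(sqft, price, x_new=2000)
print(y_hat, ci, pi)   # roughly 317.8, CI about +/- 37.1, PI about +/- 102.3
```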
  • 70. Estimation of Mean Values: Example DCOVA  Confidence interval estimate for μ_{Y|X=Xi}: find the 95% confidence interval for the mean price of 2,000-square-foot houses. Predicted price Ŷi = 317.85 ($1,000s).

\hat{Y} \pm t_{0.025}\, S_{YX} \sqrt{\frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum (X_i - \bar{X})^2}} = 317.85 \pm 37.12

The confidence interval endpoints are 280.66 and 354.90, or from $280,660 to $354,900. 13-70
  • 71. Estimation of Individual Values: Example DCOVA  Prediction interval estimate for Y_{X=Xi}: find the 95% prediction interval for an individual house with 2,000 square feet. Predicted price Ŷi = 317.85 ($1,000s).

\hat{Y} \pm t_{0.025}\, S_{YX} \sqrt{1 + \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum (X_i - \bar{X})^2}} = 317.85 \pm 102.28

The prediction interval endpoints are 215.50 and 420.07, or from $215,500 to $420,070. 13-71
  • 72. Finding Confidence and Prediction Intervals in Excel DCOVA  From Excel, use PHStat | regression | simple linear regression …  Check the “confidence and prediction interval for X= ” box and enter the X-value and confidence level desired 13-72
  • 73. Finding Confidence and Prediction Intervals in Excel (continued) DCOVA [PHStat output showing the input values, the confidence interval estimate for μ_{Y|X=Xi}, and the prediction interval estimate for Y_{X=Xi}.] 13-73
  • 74. Pitfalls of Regression Analysis  Lacking an awareness of the assumptions underlying least-squares regression; not knowing how to evaluate the assumptions; not knowing the alternatives to least-squares regression if a particular assumption is violated; using a regression model without knowledge of the subject matter; extrapolating outside the relevant range. 13-74
  • 75. Strategies for Avoiding the Pitfalls of Regression  Start with a scatter plot of X vs. Y to observe any possible relationship. Perform residual analysis to check the assumptions: plot the residuals vs. X to check for violations of assumptions such as homoscedasticity, and use a histogram, stem-and-leaf display, boxplot, or normal probability plot of the residuals to uncover possible non-normality. 13-75
  • 76. Strategies for Avoiding the Pitfalls of Regression (continued)  If there is a violation of any assumption, use alternative methods or models. If there is no evidence of assumption violation, then test for the significance of the regression coefficients and construct confidence intervals and prediction intervals. Avoid making predictions or forecasts outside the relevant range. 13-76
  • 77. Chapter Summary  Introduced types of regression models; reviewed assumptions of regression and correlation; discussed determining the simple linear regression equation; described measures of variation; discussed residual analysis; addressed measuring autocorrelation. 13-77
  • 78. Chapter Summary (continued)  Described inference about the slope; discussed correlation as a measure of the strength of the association; addressed estimation of mean values and prediction of individual values; discussed possible pitfalls in regression and recommended strategies to avoid them. 13-78