Chapter 1

The Two Variable Linear Model

1.1     The Basic Linear Model
The goal of this section is to build a simple model for the non-exact relationship between
two variables Y and X, related by some economic theory. For example, consumption and
income, quantity consumed and price, etc.
    The proposed model:

$$Y_i = \alpha + \beta X_i + u_i, \qquad i = 1, \dots, n \tag{1.1}$$

where α and β are the unknown parameters to be estimated. What we will call 'data' are
the n realizations of $(X_i, Y_i)$. We are abusing notation slightly by using the same
letters to refer to random variables and their realizations.
    $u_i$ is an unobserved random variable which represents the fact that the relationship
between Y and X is not exactly linear. We will momentarily assume that $u_i$ has expected
value zero. Note that if $u_i = 0$, the relationship between $Y_i$ and $X_i$ would be exactly
linear, so it is the presence of $u_i$ that breaks this exact nature of the relationship. Y is
usually referred to as the explained or dependent variable; X is the explanatory or
independent variable.
    We will refer to $u_i$ as the 'error term', a terminology more appropriate to the
experimental sciences, where a cause x (say, the dose of a drug) is administered to
different subjects and an effect y is then measured (say, body temperature). In this case
$u_i$ might be a measurement error due to the erratic behavior of a measurement instrument
(for example, a thermometer). In a social science like economics, $u_i$ stands for a broader
notion of 'ignorance': everything that affects y besides x but is not observed (through
ignorance, omission, etc.).

                           [ FIGURE 1: SCATTER DIAGRAM ]

   The first goal will be to find reasonable estimates for α and β based solely on the data,
that is (Xi , Yi ), i = 1, . . . , n.


1.2     The Least Squares Method
Let us denote by $\hat{\alpha}$ and $\hat{\beta}$ the estimates of α and β in the simple
linear model, and let us also define the following quantities. The first one is an estimate
of $Y_i$:

$$\hat{Y}_i \equiv \hat{\alpha} + \hat{\beta} X_i$$

Intuitively, we have replaced α and β by their estimates, and treated $u_i$ as if the
relationship were exactly linear, i.e., as if $u_i$ were zero. $\hat{Y}_i$ will be understood
as an estimate of $Y_i$. It is then natural to define a notion of estimation error as follows:

$$e_i \equiv Y_i - \hat{Y}_i$$

which measures the difference between Yi and its estimate.
    A natural goal is to find $\hat{\alpha}$ and $\hat{\beta}$ such that the $e_i$'s are
'small' in some sense. It is interesting to see how the problem works from a graphical
perspective. The data correspond to n points scattered in the (X, Y) plane. The presence of
a linear relationship like (1.1) is consistent with points scattered around an imaginary
straight line. Note that if the $u_i$ were indeed zero, all points would lie along the same
line, consistent with an exact linear relationship. As mentioned above, it is the presence
of $u_i$ that breaks this exact relationship.
    Now note that for any given values of $\hat{\alpha}$ and $\hat{\beta}$, the points
determined by the fitted model
$$\hat{Y} \equiv \hat{\alpha} + \hat{\beta} X$$
correspond to a line in the (X, Y) plane. Hence different values of $\hat{\alpha}$ and
$\hat{\beta}$ correspond to different estimated lines, which implies that choosing
particular values is equivalent to choosing a specific line on the plane. For the i-th
observation, the estimation error $e_i$ can be seen graphically as the vertical distance
between the points $(X_i, Y_i)$ and $(X_i, \hat{Y}_i)$, that is, between $(X_i, Y_i)$ and
the fitted line. So, intuitively, we want values of $\hat{\alpha}$ and $\hat{\beta}$ such
that the fitted line they induce passes as close as possible to all the points in the
scatter, making the errors as small as possible.

            [ FIGURE 2: SCATTER DIAGRAM WITH ‘CANDIDATE’ LINE]

    Note that if we had only two observations, the problem has a very simple solution: it
reduces to finding the only two values of $\hat{\alpha}$ and $\hat{\beta}$ that make the
estimation errors exactly equal to zero. Graphically, this amounts to finding the only
straight line that passes through the two available observations. Trivially, in this extreme
case all estimation errors will be zero.
    The more realistic case arises when we have more than two observations, not all of them
lying on a single line. Obviously, a line cannot pass through more than two non-aligned
points, so we cannot make all the errors equal to zero. The problem now is to find values of
$\hat{\alpha}$ and $\hat{\beta}$ that determine a line passing as close as possible to all
the points, so that estimation errors are, in the aggregate, small. For this we need to
introduce a criterion
of what we mean by the line being close to or far from the points. Let us define a penalty
function that adds up all the squared estimation errors, so that positive and negative
errors matter alike. For any $\hat{\alpha}$ and $\hat{\beta}$, this gives us an idea of how
large the aggregate estimation error is:
$$SSR(\hat{\alpha}, \hat{\beta}) = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left(Y_i - \hat{\alpha} - \hat{\beta} X_i\right)^2$$

    SSR stands for sum of squared residuals. Note that given the observations $Y_i$ and
$X_i$, this is a function that depends on $\hat{\alpha}$ and $\hat{\beta}$; that is,
different values of $\hat{\alpha}$ and $\hat{\beta}$ correspond to different lines passing
through the data points, implying different estimation errors. It is now natural to look
for $\hat{\alpha}$ and $\hat{\beta}$ so as to make this aggregate error as small as possible.
    The values of $\hat{\alpha}$ and $\hat{\beta}$ that minimize the sum of squared
residuals are:

$$\hat{\beta} = \frac{\sum X_i Y_i - n\bar{Y}\bar{X}}{\sum X_i^2 - n\bar{X}^2}$$


and
$$\hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X}$$
which are known as the least squares estimators of β and α.
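As a minimal numerical sketch of these formulas (assuming Python with NumPy; the data are
made up for illustration):

```python
import numpy as np

# Hypothetical data: n = 5 observations of (X, Y)
X = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
Y = np.array([12.0, 18.0, 29.0, 35.0, 45.0])
n = len(X)

# Least squares estimates from the closed-form formulas above
beta_hat = (np.sum(X * Y) - n * Y.mean() * X.mean()) / (np.sum(X**2) - n * X.mean()**2)
alpha_hat = Y.mean() - beta_hat * X.mean()

e = Y - (alpha_hat + beta_hat * X)   # residuals of the fitted line
print(alpha_hat, beta_hat, np.sum(e**2))
```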




      Derivation of the Least Squares Estimators
      The next paragraphs show how to obtain these estimators. Fortunately, it is easy to
      show that $SSR(\hat{\alpha}, \hat{\beta})$ is globally convex and differentiable, so
      the first order conditions for a minimum are:


$$\frac{\partial SSR(\hat{\alpha}, \hat{\beta})}{\partial \hat{\alpha}} = 0, \qquad
\frac{\partial SSR(\hat{\alpha}, \hat{\beta})}{\partial \hat{\beta}} = 0$$

      The first order condition is:
$$\frac{\partial \sum e_i^2}{\partial \hat{\alpha}} = -2 \sum (Y_i - \hat{\alpha} - \hat{\beta} X_i) = 0 \tag{1.2}$$
      Dividing by minus 2 and distributing the summations:
$$\sum Y_i = n\hat{\alpha} + \hat{\beta} \sum X_i \tag{1.3}$$
      This last expression is very important, and we will return to it frequently. From the
      second first order condition:

$$\frac{\partial \sum e_i^2}{\partial \hat{\beta}} = -2 \sum X_i (Y_i - \hat{\alpha} - \hat{\beta} X_i) = 0 \tag{1.4}$$
      Dividing by minus 2 and distributing the summations:
$$\sum X_i Y_i = \hat{\alpha} \sum X_i + \hat{\beta} \sum X_i^2 \tag{1.5}$$

      (1.3) and (1.5) form a system of two linear equations in two unknowns ($\hat{\alpha}$
      and $\hat{\beta}$) known as the normal equations.
      Dividing (1.3) by n and solving for $\hat{\alpha}$ we get:
$$\hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X} \tag{1.6}$$

      Substituting into (1.5):
$$\begin{aligned}
\sum X_i Y_i &= (\bar{Y} - \hat{\beta}\bar{X}) \sum X_i + \hat{\beta} \sum X_i^2 \\
\sum X_i Y_i &= \bar{Y} \sum X_i - \hat{\beta}\bar{X} \sum X_i + \hat{\beta} \sum X_i^2 \\
\sum X_i Y_i - \bar{Y} \sum X_i &= \hat{\beta} \left( \sum X_i^2 - \bar{X} \sum X_i \right)
\end{aligned}$$
$$\hat{\beta} = \frac{\sum X_i Y_i - \bar{Y} \sum X_i}{\sum X_i^2 - \bar{X} \sum X_i}$$

      Note that $\bar{X} = \sum X_i / n$ implies $\sum Z_i = n\bar{Z}$ for any variable Z.
      Substituting, we get:
$$\hat{\beta} = \frac{\sum X_i Y_i - n\bar{Y}\bar{X}}{\sum X_i^2 - n\bar{X}^2} \tag{1.7}$$




   It will be useful to adopt the following notation: let $x_i = X_i - \bar{X}$ and
$y_i = Y_i - \bar{Y}$, so lowercase letters denote the observations as deviations from
their sample means.
   Using this notation:

$$\begin{aligned}
\sum x_i y_i &= \sum (X_i - \bar{X})(Y_i - \bar{Y}) \\
&= \sum \left(X_i Y_i - X_i \bar{Y} - \bar{X} Y_i + \bar{X}\bar{Y}\right) \\
&= \sum X_i Y_i - \bar{Y} \sum X_i - \bar{X} \sum Y_i + n\bar{X}\bar{Y} \\
&= \sum X_i Y_i - n\bar{Y}\bar{X} - n\bar{X}\bar{Y} + n\bar{X}\bar{Y} \\
&= \sum X_i Y_i - n\bar{Y}\bar{X}
\end{aligned}$$
corresponds to the numerator of (1.7). Performing a similar operation on the denominator of
(1.7), we get the following alternative expression for the least squares estimator of β:
$$\hat{\beta} = \frac{\sum x_i y_i}{\sum x_i^2}$$
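As a quick numerical check (a sketch in Python with NumPy, using made-up data), the levels
form (1.7) and the deviations form give the same estimate:

```python
import numpy as np

X = np.array([10.0, 20.0, 30.0, 40.0, 50.0])  # hypothetical data
Y = np.array([12.0, 18.0, 29.0, 35.0, 45.0])
n = len(X)

# Formula (1.7), in levels
beta_levels = (np.sum(X * Y) - n * Y.mean() * X.mean()) / (np.sum(X**2) - n * X.mean()**2)

# Deviations-from-means form
x, y = X - X.mean(), Y - Y.mean()
beta_dev = np.sum(x * y) / np.sum(x**2)

print(np.isclose(beta_levels, beta_dev))  # True: the two forms coincide
```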

                 [ FIGURE 3: SCATTER DIAGRAM AND OLS LINE ]


1.3     Algebraic Properties of Least Squares Estimators
By algebraic properties of the estimator we mean those that are a direct consequence of
the minimizacion process, stressing the difference with statistical properties, which will be
studied in the next section.

   • Property 1: $\sum e_i = 0$. Dividing the first order condition (1.2) by minus 2 and
     using the definition of $e_i$, we easily verify that, as a consequence of minimizing
     the sum of squared residuals, the sum of the residuals, and consequently their average,
     is equal to zero.
   • Property 2: $\sum X_i e_i = 0$. This can be checked by dividing the second first order
     condition (1.4) by minus 2. The sample covariance between X and e is given by:
$$\begin{aligned}
Cov(X, e) &= \frac{1}{n-1} \sum (X_i - \bar{X})(e_i - \bar{e}) \\
&= \frac{1}{n-1} \left( \sum X_i e_i - \bar{e} \sum X_i - \bar{X} \sum e_i + n\bar{X}\bar{e} \right) \\
&= \frac{1}{n-1} \sum X_i e_i
\end{aligned}$$
     since from the previous property $\sum e_i$, and hence $\bar{e}$, are equal to zero.
     Then this property says that, as a consequence of using the method of least squares,
     the sample covariance between the explanatory variable X and the residual e is zero
     or, which is the same, the residuals are linearly unrelated to the explanatory variable.
   • Property 3: The estimated regression line corresponds to the function
     $\hat{Y}(X) = \hat{\alpha} + \hat{\beta} X$, where we take $\hat{\alpha}$ and
     $\hat{\beta}$ as parameters, so that $\hat{Y}$ is a function that depends on X.
     Consider what happens when we evaluate this function at $\bar{X}$, the mean of X:
$$\hat{Y}(\bar{X}) = \hat{\alpha} + \hat{\beta}\bar{X}$$
     But from (1.6):
$$\hat{\alpha} + \hat{\beta}\bar{X} = \bar{Y}$$
     Then $\hat{Y}(\bar{X}) = \bar{Y}$; that is, the regression line estimated by the method
     of least squares passes through the point of means.
   • Property 4: Relationship between regression and correlation. Recall that the sample
     correlation coefficient between X and Y for a sample of n observations $(X_i, Y_i)$,
     $i = 1, 2, \dots, n$, is defined as:
$$r_{XY} = \frac{Cov(X, Y)}{S_X S_Y}$$
     The following result establishes the relationship between $r_{XY}$ and $\hat{\beta}$.

$$\begin{aligned}
\hat{\beta} &= \frac{\sum x_i y_i}{\sum x_i^2} \\
&= \frac{\sum x_i y_i}{\sqrt{\sum x_i^2}\sqrt{\sum x_i^2}} \cdot \frac{\sqrt{\sum y_i^2}}{\sqrt{\sum y_i^2}} \\
&= \frac{\sum x_i y_i}{\sqrt{\sum x_i^2}\sqrt{\sum y_i^2}} \cdot \frac{\sqrt{\sum y_i^2}/\sqrt{n}}{\sqrt{\sum x_i^2}/\sqrt{n}}
\end{aligned}$$
$$\hat{\beta} = r\,\frac{S_Y}{S_X}$$

     If $r = 0$ then $\hat{\beta} = 0$. Note that if both variables have the same sample
     variance, then the correlation coefficient is equal to the regression coefficient. We
     can also see that, unlike the correlation coefficient, $\hat{\beta}$ is not invariant
     to changes in scale or units of measurement.
   • Property 5: The sample means of $Y_i$ and $\hat{Y}_i$ are the same. By definition,
     $Y_i = \hat{Y}_i + e_i$ for $i = 1, \dots, n$. Then, summing over all i:
$$\sum Y_i = \sum \hat{Y}_i + \sum e_i$$
     and dividing by n:
$$\frac{\sum Y_i}{n} = \frac{\sum \hat{Y}_i}{n}$$
     since $\sum e_i = 0$ from the first order conditions. Then:
$$\bar{Y} = \bar{\hat{Y}}$$
     which is the desired result.
CHAPTER 1. THE TWO VARIABLE LINEAR MODEL                                                       7


   • Property 6: $\hat{\beta}$ is a linear function of the $Y_i$'s. That is, $\hat{\beta}$
     can be written as $\hat{\beta} = \sum w_i Y_i$, where the $w_i$'s are real numbers,
     not all of them equal to zero.
           This is easy to prove. Let us start by writing $\hat{\beta}$ as follows:
$$\hat{\beta} = \sum \frac{x_i}{\sum x_i^2}\, y_i$$
           and call $w_i = x_i / \sum x_i^2$. Note that:
$$\sum x_i = \sum (X_i - \bar{X}) = \sum X_i - n\bar{X} = 0$$
           which implies $\sum w_i = 0$. From the previous result:
$$\begin{aligned}
\hat{\beta} &= \sum w_i y_i \\
&= \sum w_i (Y_i - \bar{Y}) \\
&= \sum w_i Y_i - \bar{Y} \sum w_i \\
&= \sum w_i Y_i
\end{aligned}$$
           which gives the desired result.

       This does not have much intuitive meaning so far, but it will be useful for later
       results. The following sketch checks the properties above numerically.
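A minimal numerical check of Properties 1 through 6 (a sketch in Python with NumPy; the
data are made up):

```python
import numpy as np

X = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
Y = np.array([12.0, 18.0, 29.0, 35.0, 45.0])

x, y = X - X.mean(), Y - Y.mean()
beta = np.sum(x * y) / np.sum(x**2)
alpha = Y.mean() - beta * X.mean()
Y_hat = alpha + beta * X
e = Y - Y_hat

print(np.isclose(np.sum(e), 0))                       # Property 1: residuals sum to zero
print(np.isclose(np.sum(X * e), 0))                   # Property 2: X and e uncorrelated
print(np.isclose(alpha + beta * X.mean(), Y.mean()))  # Property 3: line through the means
r = np.sum(x * y) / np.sqrt(np.sum(x**2) * np.sum(y**2))
print(np.isclose(beta, r * y.std() / x.std()))        # Property 4: beta = r * S_Y / S_X
print(np.isclose(Y_hat.mean(), Y.mean()))             # Property 5: equal sample means
w = x / np.sum(x**2)
print(np.isclose(beta, np.sum(w * Y)))                # Property 6: beta is linear in Y
```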


1.4     The Two-Variable Linear Model under the Classical Assumptions

$$Y_i = \alpha + \beta X_i + u_i, \qquad i = 1, \dots, n$$

   In addition to the linear relationship between Y and X, we will assume:

 1. E(ui ) = 0,    i = 1, 2, . . . , n. ‘On average’ the relationship between Y and X is linear.

 2. $Var(u_i) = E[(u_i - E(u_i))^2] = E(u_i^2) = \sigma^2$, $i = 1, 2, \dots, n$. The
    variance of the error term is constant for all observations. We will say that the error
    term is homoskedastic.

 3. $Cov(u_i, u_j) = 0$ for all $i \neq j$. The error term for an observation i is not
    linearly related to the error term of any other observation j. If the variables are
    measured over time, i.e., $i = 1980, 1981, \dots, 1997$, we will say that there is no
    autocorrelation. In general, we will say that there is no serial correlation. Note that
    since $E(u_i) = 0$, assuming $Cov(u_i, u_j) = 0$ is equivalent to assuming
    $E(u_i u_j) = 0$.

 4. The values of Xi are non-stochastic and not all of them equal.
    The classical assumptions provide a basic probabilistic structure for studying the
linear model. Most of them are of a pedagogic nature, and we will study later on how they
can be relaxed. Nevertheless, they provide a simple framework in which to explore the
nature of the least squares estimators.
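As a minimal sketch of a data generating process satisfying these assumptions (Python with
NumPy; all parameter values are made up, and the normal draws are merely one convenient
choice here, since normality itself is not assumed until Section 1.7):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
alpha, beta, sigma = 1.0, 0.75, 2.0   # hypothetical true parameters

X = np.linspace(1, 10, n)         # non-stochastic, not all equal (assumption 4)
u = rng.normal(0.0, sigma, n)     # mean zero (1), homoskedastic (2), independent draws (3)
Y = alpha + beta * X + u          # the linear model
```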


1.5     Statistical Properties of Least Squares Estimators
The problem is to find good estimates of α, β and σ². The previous section presented
estimates of the first two based on the principle of least squares, so, trivially, these
estimates are 'good' in the sense that they minimize a certain notion of fit: they make the
sum of squared residuals as small as possible. It is relevant to remark that in obtaining
the least squares estimators we have made no use of the classical assumptions described
above. Hence, the natural step is to explore whether we can deduce additional properties
satisfied by the least squares estimators, so that we can say they are good in a sense that
goes beyond that implicit in the least squares criterion. The following are called
statistical properties, since they arise as a consequence of the statistical structure of
the model.
    We will use repeatedly the following expressions for the LS estimators:

$$\hat{\beta} = \frac{\sum x_i y_i}{\sum x_i^2}, \qquad \hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X}$$
    We will first explore the main properties of $\hat{\beta}$ in detail, and leave the
analysis of $\hat{\alpha}$ as an exercise. The starting conceptual point is to see that
$\hat{\beta}$ depends explicitly on the $Y_i$'s which, in turn, depend on the $u_i$'s,
which are, by construction, random variables. Then $\hat{\beta}$ is a random variable, so
it makes sense to talk about its moments (mean and variance, for example) and its
distribution.
    It is easy to verify that:

$$y_i = x_i \beta + u_i^*$$
where $u_i^* = u_i - \bar{u}$ and, according to the classical assumptions, $E(u_i^*) = 0$
and, consequently, $E(y_i) = x_i \beta$. This is known as the classical two-variable linear
model in deviations from the means.

   • $\hat{\beta}$ is an unbiased estimator, that is: $E(\hat{\beta}) = \beta$.

           To prove the result, from the linearity property of the previous section:
$$\begin{aligned}
\hat{\beta} &= \sum w_i y_i \\
E(\hat{\beta}) &= \sum w_i E(y_i) \qquad (w_i\text{'s are non-stochastic}) \\
&= \sum w_i x_i \beta \\
&= \beta \sum w_i x_i \\
&= \beta \sum x_i^2 / \left(\sum x_i^2\right) \\
&= \beta
\end{aligned}$$

   • The variance of $\hat{\beta}$ is $\sigma^2 / \sum x_i^2$.

          From the linearity property, $\hat{\beta} = \sum w_i Y_i$, then
$$V(\hat{\beta}) = V\left( \sum w_i Y_i \right)$$
          Now note two things. First:
$$V(Y_i) = V(\alpha + \beta X_i + u_i) = V(u_i) = \sigma^2$$
          since $X_i$ is non-stochastic. Second, note that $E(Y_i) = \alpha + \beta X_i$, so
$$Cov(Y_i, Y_j) = E\left[(Y_i - E(Y_i))(Y_j - E(Y_j))\right] = E(u_i u_j) = 0$$
          by the no serial correlation assumption. Then $V(\sum w_i Y_i)$ is the variance of
          a (weighted) sum of uncorrelated terms. Hence
$$V(\hat{\beta}) = \sum w_i^2 V(Y_i) = \sigma^2 \sum w_i^2 = \sigma^2 \sum x_i^2 / \left(\sum x_i^2\right)^2 = \sigma^2 / \sum x_i^2$$

   • Gauss-Markov Theorem: under the classical assumptions, $\hat{\beta}$, the LS estimator
     of β, has the smallest variance among the class of linear and unbiased estimators.
     More formally, if $\beta^*$ is any linear and unbiased estimator of β, then:
$$V(\beta^*) \geq V(\hat{\beta})$$
     The proof of a more general version of this result will be postponed until Chapter 3.
     Discussion: OLS is BLUE (best linear unbiased estimator), but 'best' does not mean
     'good': ideally we would want the minimum variance unbiased estimator, without the
     restriction to 'linear', which is not an interesting class in itself. If we drop any
     of the assumptions, the OLS estimator is no longer guaranteed to be BLUE. This
     justifies the use of OLS when all the assumptions are correct.
Estimation of σ 2
So far we have concentrated the analysis on α and β. As an estimate of σ² we will propose:
$$S^2 = \frac{\sum e_i^2}{n-2}$$
We will later show that $S^2$ provides an unbiased estimator of σ².
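A Monte Carlo sketch of these statistical properties (Python with NumPy; the true
parameters and the design are made up). Across many simulated samples, the average of
$\hat{\beta}$ should approach β, its sample variance should approach $\sigma^2/\sum x_i^2$,
and the average of $S^2$ should approach σ²:

```python
import numpy as np

rng = np.random.default_rng(42)
n, reps = 50, 20_000
alpha, beta, sigma = 1.0, 0.75, 2.0   # hypothetical true parameters
X = np.linspace(1, 10, n)             # fixed (non-stochastic) regressor
x = X - X.mean()

beta_hats, s2s = [], []
for _ in range(reps):
    Y = alpha + beta * X + rng.normal(0, sigma, n)
    b = np.sum(x * (Y - Y.mean())) / np.sum(x**2)
    a = Y.mean() - b * X.mean()
    e = Y - a - b * X
    beta_hats.append(b)
    s2s.append(np.sum(e**2) / (n - 2))

print(np.mean(beta_hats), beta)                    # unbiasedness: mean close to 0.75
print(np.var(beta_hats), sigma**2 / np.sum(x**2))  # variance close to sigma^2 / sum x_i^2
print(np.mean(s2s), sigma**2)                      # E(S^2) = sigma^2: mean close to 4.0
```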


1.6    Goodness of fit
After estimating the parameters of the regression line, it is interesting to check how well
the estimated model fits the data. We want a measure of how well the fitted line represents
the observations of the variables of the model.
    To construct such a measure of goodness of fit, we start from the definition of the
residual, $e_i = Y_i - \hat{Y}_i$, solve for $Y_i$ and subtract the sample mean of $Y_i$
from both sides to obtain:
$$Y_i - \bar{Y} = \hat{Y}_i - \bar{Y} + e_i$$
$$y_i = \hat{y}_i + e_i$$

using the notation defined before and noting that, from Property 5, $\bar{\hat{Y}} = \bar{Y}$.
Taking the square of both sides and summing over all the observations:

$$\begin{aligned}
\sum y_i^2 &= \sum (\hat{y}_i + e_i)^2 \\
&= \sum \hat{y}_i^2 + \sum e_i^2 + 2 \sum \hat{y}_i e_i
\end{aligned}$$

    The next step is to show that $\sum \hat{y}_i e_i = 0$:
$$\sum \hat{y}_i e_i = \sum (\hat{\alpha} + \hat{\beta} X_i) e_i = \hat{\alpha} \sum e_i + \hat{\beta} \sum X_i e_i = 0 + 0$$

from the first order conditions. Then we get the following important decomposition:
$$\sum y_i^2 = \sum \hat{y}_i^2 + \sum e_i^2$$
$$TSS = ESS + RSS$$

This is a key result: it indicates that when we use the least squares method, the total
variability of the dependent variable around its sample mean (TSS, the total sum of squares)
can be decomposed as the sum of two factors. The first corresponds to the variability of
$\hat{Y}$ (ESS, the explained sum of squares) and represents the variability explained by
the fitted model. The second term represents the variability not explained by the model
(RSS, the residual sum of squares), associated with the error term.
    For a given model, the best situation arises when all errors are zero, in which case
the total variability (TSS) coincides with the explained variability (ESS). The worst case
corresponds to the situation in which the fitted model explains nothing of the total
variability, in which case TSS coincides with RSS. From this observation, it is natural to
suggest the following goodness of fit measure, known as $R^2$, or coefficient of
determination:
$$R^2 = \frac{ESS}{TSS} = 1 - \frac{RSS}{TSS}$$
    It can be shown (we will do it in the exercises) that $R^2 = r^2$. Consequently,
$0 \leq R^2 \leq 1$. When $R^2 = 1$, $|r| = 1$, which corresponds to the case in which the
relationship between Y and X is exactly linear. On the other hand, $R^2 = 0$ is equivalent
to $r = 0$, which corresponds to the case in which Y and X are linearly unrelated. It is
interesting to note that TSS does not depend on the estimated model, that is, it does not
depend on $\hat{\beta}$ or $\hat{\alpha}$. Then, if $\hat{\beta}$ and $\hat{\alpha}$ are
chosen so as to minimize SSR, they automatically maximize $R^2$. This implies that, for a
given model, the least squares estimates maximize $R^2$.
    The R2 is, arguably, the most used and abused measure of quality of a regression model.
A detailed analysis of the extent to which a high R2 can be taken as representative of a
‘good’ model will be undertaken in Chapter 4.
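A short sketch of the decomposition and of $R^2$ (Python with NumPy; same made-up data as
in the earlier sketches):

```python
import numpy as np

X = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
Y = np.array([12.0, 18.0, 29.0, 35.0, 45.0])
x, y = X - X.mean(), Y - Y.mean()

beta = np.sum(x * y) / np.sum(x**2)
alpha = Y.mean() - beta * X.mean()
e = Y - alpha - beta * X

TSS = np.sum(y**2)          # total sum of squares
RSS = np.sum(e**2)          # residual sum of squares
ESS = TSS - RSS             # explained sum of squares
R2 = ESS / TSS

r = np.sum(x * y) / np.sqrt(np.sum(x**2) * np.sum(y**2))
print(R2, r**2)             # R^2 equals the squared correlation coefficient
```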


1.7    Inference in the two-variable linear model
The methods discussed so far provide reasonably good point estimates of the parameters of
interest α, β and σ², but usually we will also be interested in evaluating hypotheses
involving the parameters, or in constructing confidence intervals for them. For example,
consider the case of a simple consumption function where consumption is specified as a
linear function of income. We could be interested in evaluating whether the marginal
propensity to consume is equal to, say, 0.75, or whether autonomous consumption is equal
to zero.
     In general terms, a hypothesis about a parameter of the model is a conjecture about it
that can be either true or false. The central problem is that, in order to check whether
such a statement is true or false, we do not have the chance to observe the parameter
itself; instead, based on the available data, we have an estimate of it. As an example,
suppose we are interested in evaluating the, rather strong, null hypothesis that income is
not an explanatory factor of consumption, against the hypothesis that it is a relevant
factor. In our simple setup this corresponds to $H_0: \beta = 0$ against
$H_A: \beta \neq 0$. The logic we will use is the following: if the null hypothesis were in
fact true, β would be exactly zero. Realizations of $\hat{\beta}$ can potentially take any
value, since $\hat{\beta}$ is, by construction, a random variable. But if $\hat{\beta}$ is
a 'good' estimator of β, when the null hypothesis is true it should take values close to
zero. On the other hand, if the null hypothesis were false, the realizations of
$\hat{\beta}$ should be significantly different from zero. Then, the procedure consists in
computing $\hat{\beta}$ from the data, rejecting the null if the obtained value is
significantly different from zero, and accepting it otherwise.
    Of course, the central concept behind this procedure lies in specifying what we mean by
'very close' or 'very far', given that $\hat{\beta}$ is a random variable. More
specifically, we need to know the distribution of $\hat{\beta}$ under the null hypothesis
so we can define precisely the notion of 'significantly different from zero'. In this
context such a statement is necessarily probabilistic; that is, we will take as the
rejection region a set of values that lie 'far away' from zero or, equivalently, a set of
values that appear with very low probability under the null hypothesis.
    The properties discussed in the previous section are informative about certain moments
of $\hat{\beta}$ or $\hat{\alpha}$ (for example, their means and variances), but they are
not enough for the purpose of knowing their distributions. Consequently, we need to
introduce an additional assumption: we will assume that $u_i$ is normally distributed, for
$i = 1, \dots, n$. Given that we have already assumed that $u_i$ has zero mean and constant
variance equal to σ², we have:

$$u_i \sim N(0, \sigma^2)$$

    Given that $Y_i = \alpha + \beta X_i + u_i$ and that the $X_i$'s are non-stochastic, we
immediately see that the $Y_i$'s are also normally distributed, since linear transformations
of normal random variables are also normal. In particular, given that the normal
distribution can be characterized by its mean and variance only, we get:
$$Y_i \sim N(\alpha + \beta X_i, \sigma^2)$$
for every $i = 1, \dots, n$. In a similar fashion, $\hat{\beta}$ is also normally
distributed since, by Property 6, it is a linear combination of the $Y_i$'s, that is:

$$\hat{\beta} \sim N\!\left(\beta,\; \sigma^2 / \textstyle\sum x_i^2\right)$$

   If σ² were known, we could use this result to test simple hypotheses like:
$$H_0: \beta = \beta_0 \quad \text{vs.} \quad H_A: \beta \neq \beta_0$$
Subtracting from $\hat{\beta}$ its expected value and dividing by its standard deviation we
get:
$$z = \frac{\hat{\beta} - \beta_0}{\sigma / \sqrt{\sum x_i^2}} \sim N(0, 1)$$

Hence, if the null hypothesis is true, z should take values that are small in absolute
value, and large otherwise. As you should remember from a basic statistics course, this is
accomplished by defining a rejection region and an acceptance region as follows. The
acceptance region includes values that lie close to the one corresponding to the null
hypothesis. Let $c < 1$ and let $z_c$ be a number such that:
$$Pr(-z_c \leq z \leq z_c) = 1 - c$$

Replacing z by its definition:


$$Pr\left( \beta_0 - z_c\,\sigma/\sqrt{\textstyle\sum x_i^2} \;\leq\; \hat{\beta} \;\leq\; \beta_0 + z_c\,\sigma/\sqrt{\textstyle\sum x_i^2} \right) = 1 - c$$

   Then the acceptance region is given by the interval:

$$\beta_0 \pm z_c\,\sigma/\sqrt{\textstyle\sum x_i^2}$$

so we accept the null hypothesis if the observed realization of $\hat{\beta}$ lies within
this interval, and reject it otherwise. The number c is specified in advance and is usually
a small number; it is called the significance level of the test. Note that it gives the
probability of rejecting the null hypothesis when it is correct. Under the normality
assumption, the value $z_c$ can easily be obtained from a table of percentiles of the
standard normal distribution.
    As you should also remember from a basic statistics class, a similar logic can be
applied to construct a confidence interval for $\beta_0$. Note that:
$$Pr\left( \hat{\beta} - z_c\,\sigma/\sqrt{\textstyle\sum x_i^2} \;\leq\; \beta_0 \;\leq\; \hat{\beta} + z_c\,\sigma/\sqrt{\textstyle\sum x_i^2} \right) = 1 - c$$


Then a $1 - c$ confidence interval for $\beta_0$ will be given by:
$$\hat{\beta} \pm z_c\,\sigma/\sqrt{\textstyle\sum x_i^2}$$
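A sketch of this z-test and confidence interval when σ² is known (Python with NumPy and
SciPy; the data, the value of σ, and the null value $\beta_0$ are all made up):

```python
import numpy as np
from scipy.stats import norm

X = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
Y = np.array([12.0, 18.0, 29.0, 35.0, 45.0])
x = X - X.mean()
beta_hat = np.sum(x * (Y - Y.mean())) / np.sum(x**2)

sigma = 2.0              # assumed known in this scenario
beta_0 = 0.75            # hypothetical null value
se = sigma / np.sqrt(np.sum(x**2))

z = (beta_hat - beta_0) / se
c = 0.05
z_c = norm.ppf(1 - c / 2)                        # two-sided critical value
print(z, abs(z) > z_c)                           # reject H0 if |z| exceeds z_c
print(beta_hat - z_c * se, beta_hat + z_c * se)  # 1 - c confidence interval
```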


      The practical problem with the previous procedures is that they require that we know
σ², which is usually not available. Instead, we can compute its estimated version $S^2$.
Define t as:
$$t = \frac{\hat{\beta} - \beta}{S/\sqrt{\sum x_i^2}}$$

t is simply z where we have replaced σ² by $S^2$. A very important result is that by doing
this replacement we have:
$$t \sim t_{n-2}$$
that is, the 't-statistic' has the so-called t-distribution with $n - 2$ degrees of freedom.
Hence, when we use the estimated version of the variance, we obtain a different distribution
for the statistic used to test simple hypotheses and construct confidence intervals.
    Consequently, applying once again the same logic, in order to test the null hypothesis
$H_0: \beta = \beta_0$ against $H_A: \beta \neq \beta_0$ we use the t-statistic:
$$t = \frac{\hat{\beta} - \beta_0}{S/\sqrt{\sum x_i^2}} \sim t_{n-2}$$
and a $1 - c$ confidence interval for $\beta_0$ will be given by:
$$\hat{\beta} \pm t_c\,S/\sqrt{\textstyle\sum x_i^2}$$
where now $t_c$ is a percentile of the t distribution with $n - 2$ degrees of freedom,
which is usually tabulated in basic statistics and econometrics textbooks.
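A sketch of the feasible version with $S^2$ in place of σ² (Python with SciPy; same made-up
data, hypothetical null value $\beta_0 = 0.75$):

```python
import numpy as np
from scipy.stats import t as tdist

X = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
Y = np.array([12.0, 18.0, 29.0, 35.0, 45.0])
n = len(X)
x = X - X.mean()

beta_hat = np.sum(x * (Y - Y.mean())) / np.sum(x**2)
alpha_hat = Y.mean() - beta_hat * X.mean()
e = Y - alpha_hat - beta_hat * X

S2 = np.sum(e**2) / (n - 2)          # estimate of sigma^2
se = np.sqrt(S2 / np.sum(x**2))

beta_0 = 0.75                        # hypothetical null value
t_stat = (beta_hat - beta_0) / se
c = 0.05
t_c = tdist.ppf(1 - c / 2, df=n - 2)
print(t_stat, abs(t_stat) > t_c)                   # reject H0 if |t| exceeds t_c
print(beta_hat - t_c * se, beta_hat + t_c * se)    # 1 - c confidence interval
```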
    An important particular case is the insignificance hypothesis, that is,
$H_0: \beta = 0$ against $H_A: \beta \neq 0$. Under the null, X does not help explain Y;
under the alternative, X is linearly related to Y. Replacing $\beta_0$ by 0 above we get:
$$t_I = \frac{\hat{\beta}}{S/\sqrt{\sum x_i^2}} \sim t_{n-2}$$

which is usually reported as a standard outcome in most regression packages.
    Another alternative for checking the significance of the linear relationship is to look
at how large the explained sum of squares ESS is. Recall that if the model has an intercept
we have:
$$TSS = ESS + RSS$$
If there is no linear relationship between Y and X, ESS should be very close to zero.
Consider the following statistic, which is just a 'standardized' version of the ESS:
$$F = \frac{ESS}{RSS/(n-2)}$$
It can be shown that, under the normality assumption, F has the F distribution with 1
degree of freedom in the numerator and $n - 2$ degrees of freedom in the denominator,
usually labeled $F(1, n-2)$. Note that if X does not help explain Y in a linear sense, ESS
should be very small, which would make F very small. Then, we should reject the null
hypothesis that X does not help explain Y if the F statistic computed from the data takes a
large value, and accept it otherwise.
    Note that, by definition, $R^2 = ESS/TSS = 1 - RSS/TSS$. Dividing both the numerator
and the denominator of the F statistic by TSS, and using these relations, we can write the
F statistic in terms of the $R^2$ coefficient as:
$$F = \frac{R^2}{(1 - R^2)/(n-2)}$$
Then, the F test is actually looking at whether the $R^2$ is significantly high. As is to
be expected, there is a close relationship between the F statistic and the t statistic for
the insignificance hypothesis ($t_I$): when there is no linear relationship between Y and
X, ESS is zero, which happens exactly when $\hat{\beta} = 0$. In fact, it can easily be
shown that:
$$F = t_I^2$$

We will leave the proof as an exercise.
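A sketch verifying the F statistic and the relation $F = t_I^2$ numerically (Python with
NumPy; same made-up data as above):

```python
import numpy as np

X = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
Y = np.array([12.0, 18.0, 29.0, 35.0, 45.0])
n = len(X)
x, y = X - X.mean(), Y - Y.mean()

beta_hat = np.sum(x * y) / np.sum(x**2)
e = y - beta_hat * x                 # residuals, computed in deviations form
TSS, RSS = np.sum(y**2), np.sum(e**2)
ESS = TSS - RSS

F = ESS / (RSS / (n - 2))
t_I = beta_hat / np.sqrt((RSS / (n - 2)) / np.sum(x**2))
print(F, t_I**2)                     # the two agree: F = t_I^2
```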

New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 

Two variable linear model

Now note that, for any given values of α̂ and β̂, the points determined by the fitted model

    Ŷ ≡ α̂ + β̂X

correspond to a line in the (X, Y) plane. Hence different values of α̂ and β̂ correspond to different estimated lines, so choosing particular values is equivalent to choosing a specific line on the plane. For the i-th observation, the estimation error ei can be seen graphically as the vertical distance between the points (Xi, Yi) and (Xi, Ŷi), that is, between (Xi, Yi) and the fitted line. Intuitively, then, we want values of α̂ and β̂ such that the fitted line they induce passes as close as possible to all the points in the scatter, so that the errors are as small as possible.

[ FIGURE 2: SCATTER DIAGRAM WITH 'CANDIDATE' LINE ]

Note that if we had only two observations the problem has a very simple solution: it reduces to finding the only two values of α̂ and β̂ that make the estimation errors exactly equal to zero. Graphically, this is equivalent to finding the only straight line that passes through the two available observations. Trivially, in this extreme case all estimation errors are zero.

The more realistic case arises when we have more than two observations, not all of them lying on a single line. A line cannot pass through more than two non-aligned points, so we cannot make all errors equal to zero. The problem now is to find values of α̂ and β̂ that determine a line passing as close as possible to all the points, so that the estimation errors are, in the aggregate, small. For this we need to introduce a criterion
for what we mean by the line being close to or far from the points. Let us define a penalty function which adds up the squared estimation errors, so that positive and negative errors matter alike. For any α̂ and β̂, this gives us an idea of how large the aggregate estimation error is:

    SSR(α̂, β̂) = Σ ei² = Σ (Yi − α̂ − β̂Xi)²

SSR stands for sum of squared residuals. Note that, given the observations Yi and Xi, this is a function that depends on α̂ and β̂: different values of α̂ and β̂ correspond to different lines passing through the data points, implying different estimation errors. It is now natural to look for the α̂ and β̂ that make this aggregate error as small as possible. The values of α̂ and β̂ that minimize the sum of squared residuals are:

    β̂ = (Σ XiYi − nȲX̄) / (Σ Xi² − nX̄²)

and

    α̂ = Ȳ − β̂X̄

which are known as the least squares estimators of β and α.

Derivation of the Least Squares Estimators

The next paragraphs show how to obtain these estimators. Fortunately, it is easy to show that SSR(α̂, β̂) is globally convex and differentiable, so the first order conditions characterize a minimum:

    ∂SSR(α̂, β̂)/∂α̂ = 0        ∂SSR(α̂, β̂)/∂β̂ = 0

The first of these conditions is:

    ∂Σ ei²/∂α̂ = −2 Σ (Yi − α̂ − β̂Xi) = 0        (1.2)

Dividing by minus 2 and distributing the summations:

    Σ Yi = nα̂ + β̂ Σ Xi        (1.3)

This last expression is very important, and we will return to it frequently. From the second first order condition:
    ∂Σ ei²/∂β̂ = −2 Σ Xi(Yi − α̂ − β̂Xi) = 0        (1.4)

Dividing by −2 and distributing the summations:

    Σ XiYi = α̂ Σ Xi + β̂ Σ Xi²        (1.5)

(1.3) and (1.5) form a system of two linear equations in two unknowns (α̂ and β̂) known as the normal equations. Dividing (1.3) by n and solving for α̂ we get:

    α̂ = Ȳ − β̂X̄        (1.6)

Replacing in (1.5):

    Σ XiYi = (Ȳ − β̂X̄) Σ Xi + β̂ Σ Xi²
    Σ XiYi = Ȳ Σ Xi − β̂X̄ Σ Xi + β̂ Σ Xi²
    Σ XiYi − Ȳ Σ Xi = β̂ (Σ Xi² − X̄ Σ Xi)

    β̂ = (Σ XiYi − Ȳ Σ Xi) / (Σ Xi² − X̄ Σ Xi)

Note that X̄ = Σ Xi / n, so that Σ Zi = nZ̄ for any variable Z. Replacing, we get:

    β̂ = (Σ XiYi − nȲX̄) / (Σ Xi² − nX̄²)        (1.7)

It will be useful to adopt the following notation: xi = Xi − X̄ and yi = Yi − Ȳ, so that lowercase letters denote the observations as deviations from their sample means. Using this notation:

    Σ xiyi = Σ (Xi − X̄)(Yi − Ȳ)
           = Σ (XiYi − XiȲ − X̄Yi + X̄Ȳ)
           = Σ XiYi − Ȳ Σ Xi − X̄ Σ Yi + nX̄Ȳ
           = Σ XiYi − nȲX̄ − nX̄Ȳ + nX̄Ȳ
           = Σ XiYi − nȲX̄
which corresponds to the numerator of (1.7). Operating similarly on the denominator of (1.7), we get the following alternative expression for the least squares estimate of β:

    β̂ = Σ xiyi / Σ xi²

[ FIGURE 3: SCATTER DIAGRAM AND OLS LINE ]

1.3 Algebraic Properties of Least Squares Estimators

By algebraic properties of the estimators we mean those that are a direct consequence of the minimization process, stressing the difference with statistical properties, which will be studied in the next section.

• Property 1: Σ ei = 0. From the first normal equation (1.2), dividing by minus 2 and replacing the definition of ei, we easily verify that, as a consequence of minimizing the sum of squared residuals, the sum of the residuals, and consequently their average, is equal to zero.

• Property 2: Σ Xiei = 0. This can be checked by dividing the second normal equation (1.4) by minus 2. The sample covariance between X and e is given by:

    Cov(X, e) = 1/(n−1) Σ (Xi − X̄)(ei − ē)
              = 1/(n−1) [Σ Xiei − ē Σ Xi − X̄ Σ ei + nX̄ē]
              = 1/(n−1) Σ Xiei

since, from the previous property, Σ ei, and hence ē, are equal to zero. This property says that, as a consequence of using the method of least squares, the sample covariance between the explanatory variable X and the error term e is zero, or, what is the same, the residuals are linearly unrelated to the explanatory variable.

• Property 3: The estimated regression line corresponds to the function Ŷ(X) = α̂ + β̂X, where we take α̂ and β̂ as parameters, so that Ŷ is a function that depends on X. Consider what happens when we evaluate this function at X̄, the mean of X:

    Ŷ(X̄) = α̂ + β̂X̄

But from (1.6):

    α̂ + β̂X̄ = Ȳ

Then Ŷ(X̄) = Ȳ, that is, the regression line estimated by the method of least squares passes through the point of means.
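Before moving on, it may help to see these formulas at work. The following is a minimal Python sketch, not part of the original notes: the function name ols, the seed and the simulated sample are our own illustrative choices. It computes the least squares estimates in deviations form, checks that the textbook formula (1.7) gives the same β̂, and verifies Properties 1 to 3 numerically.

    import numpy as np

    def ols(X, Y):
        """Least squares estimates for Yi = alpha + beta*Xi + ui."""
        x = X - X.mean()                      # deviations from sample means
        y = Y - Y.mean()
        beta = (x * y).sum() / (x ** 2).sum() # beta-hat = sum(x_i y_i) / sum(x_i^2)
        alpha = Y.mean() - beta * X.mean()    # alpha-hat = Ybar - beta-hat * Xbar
        return alpha, beta

    # Hypothetical simulated sample, only for illustration.
    rng = np.random.default_rng(0)
    X = rng.uniform(0, 10, size=50)
    Y = 2.0 + 0.75 * X + rng.normal(0, 1, size=50)

    alpha, beta = ols(X, Y)
    e = Y - (alpha + beta * X)                # residuals e_i = Y_i - Yhat_i

    # The textbook form (1.7) gives the same beta-hat:
    n = len(X)
    beta_17 = ((X * Y).sum() - n * Y.mean() * X.mean()) / ((X ** 2).sum() - n * X.mean() ** 2)
    print(np.isclose(beta, beta_17))                     # True

    print(np.isclose(e.sum(), 0.0))                      # Property 1: residuals sum to zero
    print(np.isclose((X * e).sum(), 0.0))                # Property 2: X and e are 'orthogonal'
    print(np.isclose(alpha + beta * X.mean(), Y.mean())) # Property 3: line through the means

The sketches that follow in this chapter reuse the objects (X, Y, alpha, beta, e) defined here.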
• Property 4: Relationship between regression and correlation. Recall that the sample correlation coefficient between X and Y for a sample of n observations (Xi, Yi), i = 1, 2, . . . , n, is defined as:

    rXY = Cov(X, Y) / (SX SY)

The following result establishes the relationship between rXY and β̂:

    β̂ = Σ xiyi / Σ xi²
       = [Σ xiyi / (√Σ xi² √Σ yi²)] · (√Σ yi² / √Σ xi²)
       = rXY (SY / SX)

where the second line multiplies and divides by √Σ yi², and the last step uses the fact that the normalizing constants in Cov(X, Y), SX and SY cancel, so that rXY = Σ xiyi / (√Σ xi² √Σ yi²) and SY/SX = √Σ yi² / √Σ xi². If r = 0 then β̂ = 0. Note that if both variables have the same sample variance, the correlation coefficient is equal to the regression coefficient β̂. We can also see that, unlike the correlation coefficient, β̂ is not invariant to changes in scale or units of measurement.

• Property 5: The sample means of Yi and Ŷi are the same. By definition, Yi = Ŷi + ei for i = 1, . . . , n. Summing over i:

    Σ Yi = Σ Ŷi + Σ ei

and dividing by n:

    Σ Yi / n = Σ Ŷi / n

since Σ ei = 0 from the first order conditions. Then the sample mean of the fitted values equals Ȳ, which is the desired result.
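As a quick numerical check of Properties 4 and 5, continuing with the sample and estimates from the previous sketch (this fragment is illustrative and assumes those objects are still defined):

    r = np.corrcoef(X, Y)[0, 1]               # sample correlation coefficient r_XY
    S_X = X.std(ddof=1)                       # sample standard deviations
    S_Y = Y.std(ddof=1)
    print(np.isclose(beta, r * S_Y / S_X))    # Property 4: beta-hat = r * S_Y / S_X

    Yhat = alpha + beta * X                   # fitted values
    print(np.isclose(Yhat.mean(), Y.mean()))  # Property 5: equal sample means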
• Property 6: β̂ is a linear function of the Yi's. That is, β̂ can be written as β̂ = Σ wiYi, where the wi's are real numbers, not all of them equal to zero.

This is easy to prove. Let us start by writing β̂ as follows:

    β̂ = Σ (xi / Σ xi²) yi

and call wi = xi / Σ xi². Note that:

    Σ xi = Σ (Xi − X̄) = Σ Xi − nX̄ = 0

which implies Σ wi = 0. From the previous result:

    β̂ = Σ wiyi = Σ wi(Yi − Ȳ) = Σ wiYi − Ȳ Σ wi = Σ wiYi

which gives the desired result. This does not have much intuitive meaning so far, but it will be useful for later results.

1.4 The Two-Variable Linear Model under the Classical Assumptions

    Yi = α + βXi + ui ,        i = 1, . . . , n

In addition to the linear relationship between Y and X we will assume:

1. E(ui) = 0, i = 1, 2, . . . , n. 'On average' the relationship between Y and X is linear.

2. Var(ui) = E[(ui − E(ui))²] = E(ui²) = σ², i = 1, 2, . . . , n. The variance of the error term is constant for all observations. We will say that the error term is homoskedastic.

3. Cov(ui, uj) = 0 for all i ≠ j. The error term for an observation i is not linearly related to the error term of any other observation j. If variables are measured over time, i.e., i = 1980, 1981, . . . , 1997, we will say that there is no autocorrelation. In general, we will say that there is no serial correlation. Note that since E(ui) = 0, assuming Cov(ui, uj) = 0 is equivalent to assuming E(uiuj) = 0.

4. The values of Xi are non-stochastic and not all of them equal.
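To make these assumptions concrete, here is a minimal sketch of a data generating process consistent with them; all numerical values and the function name draw_sample are hypothetical choices of ours, not part of the original text. Drawing the errors as independent normals is stronger than assumptions 1 to 3 require, but it is convenient and anticipates the assumption added in Section 1.7.

    import numpy as np

    def draw_sample(alpha=2.0, beta=0.75, sigma=1.0, n=50, seed=0):
        """One sample from a process satisfying assumptions 1-4."""
        rng = np.random.default_rng(seed)
        X = np.linspace(1, 10, n)            # non-stochastic, not all equal (assumption 4)
        u = rng.normal(0.0, sigma, size=n)   # E(u)=0, constant variance, independent draws
        Y = alpha + beta * X + u
        return X, Y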
The classical assumptions provide a basic probabilistic structure to study the linear model. Most of them are of a pedagogic nature, and we will study later on how they can be relaxed. Nevertheless, they provide a simple framework to explore the nature of the least squares estimator.

1.5 Statistical Properties of Least Squares Estimators

Actually, the problem is to find good estimates of α, β and σ². The previous section presented estimates of the first two based on the principle of least squares, so, trivially, these estimates are 'good' in the sense that they minimize a certain notion of fit: they make the sum of squared residuals as small as possible. It is relevant to remark that in obtaining the least squares estimators we have made no use of the classical assumptions described above. Hence, the natural step is to explore whether we can deduce additional properties satisfied by the least squares estimator, so we can say that it is good in a sense that goes beyond that implicit in the least squares criterion. The following are called statistical properties since they arise as a consequence of the statistical structure of the model.

We will use repeatedly the following expressions for the LS estimators:

    β̂ = Σ xiyi / Σ xi²        α̂ = Ȳ − β̂X̄

We will first explore the main properties of β̂ in detail, and leave the analysis of α̂ as exercises. The starting conceptual point is to see that β̂ depends explicitly on the Yi's which, in turn, depend on the ui's, which are, by construction, random variables. Hence β̂ is itself a random variable, and it makes sense to talk about its moments (mean and variance, for example) and its distribution.

It is easy to verify that:

    yi = xiβ + ui*

where ui* = ui − ū and, according to the classical assumptions, E(ui*) = 0 and, consequently, E(yi) = xiβ. This is known as the classical two-variable linear model in deviations from the means.

• β̂ is an unbiased estimator, that is, E(β̂) = β. To prove the result, start from the linearity property of the previous section:
    β̂ = Σ wiyi
    E(β̂) = Σ wi E(yi)        (the wi's are non-stochastic)
          = Σ wixi β
          = β Σ wixi
          = β Σ xi² / Σ xi²
          = β

• The variance of β̂ is σ² / Σ xi².

From the linearity property, β̂ = Σ wiYi, so

    V(β̂) = V(Σ wiYi)

Now note two things. First:

    V(Yi) = V(α + βXi + ui) = V(ui) = σ²

since Xi is non-stochastic. Second, note that E(Yi) = α + βXi, so:

    Cov(Yi, Yj) = E[(Yi − E(Yi))(Yj − E(Yj))] = E(uiuj) = 0

by the no serial correlation assumption. Then V(Σ wiYi) is the variance of a (weighted) sum of uncorrelated terms. Hence:

    V(β̂) = V(Σ wiYi) = Σ wi² V(Yi) = σ² Σ wi² = σ² (Σ xi²) / (Σ xi²)² = σ² / Σ xi²

• Gauss-Markov Theorem: under the classical assumptions, β̂, the LS estimator of β, has the smallest variance in the class of linear and unbiased estimators. More formally, if β* is any linear and unbiased estimator of β, then:

    V(β*) ≥ V(β̂)

The proof of a more general version of this result is postponed until Chapter 3.

Discussion: β̂ is BLUE (best linear unbiased estimator), but 'best' does not mean 'good': we would rather have the minimum variance unbiased estimator, without the 'linear' qualification, and 'linear' is not a particularly interesting class in itself. Moreover, if we drop any of the assumptions, the OLS estimator is no longer guaranteed to be BLUE. This justifies the use of OLS when all the assumptions are correct.
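A small Monte Carlo experiment illustrates the two statistical properties just derived. The sketch below is our own illustration, with arbitrary 'true' parameter values, seed and number of replications: it draws many samples from the classical model and compares the empirical mean and variance of β̂ with β and σ² / Σ xi².

    import numpy as np

    alpha0, beta0, sigma, n = 2.0, 0.75, 1.0, 50    # hypothetical 'true' values
    X = np.linspace(1, 10, n)                       # fixed across replications
    x = X - X.mean()
    rng = np.random.default_rng(1)

    draws = []
    for _ in range(20_000):
        Y = alpha0 + beta0 * X + rng.normal(0.0, sigma, size=n)
        draws.append((x * (Y - Y.mean())).sum() / (x ** 2).sum())   # beta-hat
    draws = np.array(draws)

    print(draws.mean())                              # close to beta0 = 0.75 (unbiasedness)
    print(draws.var(), sigma ** 2 / (x ** 2).sum())  # both close to V(beta-hat)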
Estimation of σ²

So far we have concentrated the analysis on α and β. As an estimator for σ² we will propose:

    S² = Σ ei² / (n − 2)

We will later show that S² provides an unbiased estimator of σ².

1.6 Goodness of fit

After estimating the parameters of the regression line, it is interesting to check how well the estimated model fits the data. We want a measure of how well the fitted line represents the observations of the variables of the model.

To look for such a measure of goodness of fit, we start from the definition of the estimation error, ei = Yi − Ŷi, solve for Yi and subtract the sample mean of Yi from both sides to obtain:

    Yi − Ȳ = Ŷi − Ȳ + ei
    yi = ŷi + ei

using the notation defined before and noting that, from Property 5, Ȳ is also the sample mean of the Ŷi's. Squaring both sides and summing over all the observations:

    Σ yi² = Σ (ŷi + ei)²
          = Σ ŷi² + Σ ei² + 2 Σ ŷiei

The next step is to show that Σ ŷiei = 0. Using Σ ei = 0 to replace ŷi by Ŷi = α̂ + β̂Xi:

    Σ ŷiei = Σ Ŷiei − Ȳ Σ ei = Σ (α̂ + β̂Xi)ei = α̂ Σ ei + β̂ Σ Xiei = 0 + 0

from the first order conditions. Then we get the following important decomposition:

    Σ yi² = Σ ŷi² + Σ ei²

    TSS = ESS + RSS

This is a key result: when we use the least squares method, the total variability of the dependent variable around its sample mean (TSS, the total sum of squares) can be decomposed as the sum of two factors. The first one corresponds to the variability of Ŷ (ESS, the explained sum of squares) and represents the variability explained by the fitted model. The second term represents the variability not explained by the model (RSS, the residual sum of squares), associated with the error term.
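Numerically, the decomposition and the proposed estimator S² can be checked as follows, reusing X, Y, alpha, beta and e from the first sketch of this chapter (again an illustrative fragment, assuming those objects are still defined):

    Yhat = alpha + beta * X
    TSS = ((Y - Y.mean()) ** 2).sum()       # total sum of squares
    ESS = ((Yhat - Y.mean()) ** 2).sum()    # explained sum of squares
    RSS = (e ** 2).sum()                    # residual sum of squares
    print(np.isclose(TSS, ESS + RSS))       # True: the decomposition holds

    S2 = RSS / (len(X) - 2)                 # S^2, the proposed estimator of sigma^2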
For a given model, the best situation arises when the errors are all zero, in which case the total variability (TSS) coincides with the explained variability (ESS). The worst case corresponds to the situation in which the fitted model explains nothing of the total variability, in which case TSS coincides with RSS. From this observation, it is natural to suggest the following goodness of fit measure, known as R², or coefficient of determination:

    R² = ESS / TSS = 1 − RSS / TSS

It can be shown (we will do it in the exercises) that R² = r². Consequently, 0 ≤ R² ≤ 1. When R² = 1, |r| = 1, which corresponds to the case in which the relationship between Y and X is exactly linear. On the other hand, R² = 0 is equivalent to r = 0, which corresponds to the case in which Y and X are linearly unrelated. It is interesting to note that TSS does not depend on the estimated model, that is, it depends on neither β̂ nor α̂. Then, if β̂ and α̂ are chosen so as to minimize SSR, they automatically maximize R². This implies that, for a given model, the least squares estimates maximize R².

The R² is, arguably, the most used and abused measure of the quality of a regression model. A detailed analysis of the extent to which a high R² can be taken as representative of a 'good' model will be undertaken in Chapter 4.

1.7 Inference in the two-variable linear model

The methods discussed so far provide reasonably good point estimates of the parameters of interest α, β and σ², but usually we will be interested in evaluating hypotheses involving the parameters, or in constructing confidence intervals for them. For example, consider the case of a simple consumption function where consumption is specified as a linear function of income. We could be interested in evaluating whether the marginal propensity to consume is equal to, say, 0.75, or whether autonomous consumption is equal to zero.

In general terms, a hypothesis about a parameter of the model is a conjecture about it that can be either true or false. The central problem is that, in order to check whether such a statement is true or false, we do not have the chance to observe the parameter itself. Instead, based on the available data, we have an estimate of it. As an example, suppose we are interested in evaluating the, rather strong, null hypothesis that income is not an explanatory factor of consumption, against the hypothesis that it is a relevant factor. In our simple setup this corresponds to H0 : β = 0 against HA : β ≠ 0. The logic we will use is the following: if the null hypothesis were in fact true, β would be exactly zero. Realizations of β̂ can potentially take any value, since β̂ is, by construction, a random variable. But if β̂ is a 'good' estimator of β, when the null hypothesis is true it should take values close to zero. On the other hand, if the null hypothesis were false, the realizations of β̂ should be significantly different from zero. The procedure, then, consists in computing β̂ from the data and rejecting the null hypothesis if the obtained value is significantly different from zero, accepting it otherwise.
Of course, the central concept behind this procedure lies in specifying what we mean by 'very close' or 'very far', given that β̂ is a random variable. More specifically, we need to know the distribution of β̂ under the null hypothesis, so that we can define precisely the notion of 'significantly different from zero'. In this context such a statement is necessarily probabilistic: we will take as the rejection region a set of values that lie 'far away' from zero, that is, a set of values that under the null hypothesis appear with very low probability.

The properties discussed in the previous section are informative about certain moments of β̂ or α̂ (for example, their means and variances), but they are not enough for the purpose of knowing their distributions. Consequently, we need to introduce an additional assumption: we will assume that ui is normally distributed, for i = 1, . . . , n. Given that we have already assumed that ui has zero mean and constant variance equal to σ², we have:

    ui ∼ N(0, σ²)

Given that Yi = α + βXi + ui and that the Xi's are non-stochastic, we immediately see that the Yi's are also normally distributed, since linear transformations of normal random variables are also normal. In particular, given that the normal distribution is characterized by its mean and variance only, we get:

    Yi ∼ N(α + βXi, σ²)

for every i = 1, . . . , n. In a similar fashion, β̂ is also normally distributed since, by Property 6, it is a linear combination of the Yi's, that is:

    β̂ ∼ N(β, σ² / Σ xi²)

If σ² were known we could use this result to test simple hypotheses like:

    Ho : β = βo    vs.    HA : β ≠ βo

Subtracting from β̂ its expected value under the null and dividing by its standard deviation we get:

    z = (β̂ − βo) / (σ/√Σ xi²) ∼ N(0, 1)

Hence, if the null hypothesis is true, z should take values that are small in absolute value, while if it is false, z should take large absolute values. As you should remember from a basic statistics course, this is accomplished by defining a rejection region and an acceptance region as follows. The acceptance region includes values that lie close to the one corresponding to the null hypothesis. Let 0 < c < 1 and let zc be a number such that:

    Pr(−zc ≤ z ≤ zc) = 1 − c

Replacing z by its definition:
    Pr(βo − zc σ/√Σ xi² ≤ β̂ ≤ βo + zc σ/√Σ xi²) = 1 − c

Then the acceptance region is given by the interval:

    βo ± zc (σ/√Σ xi²)

so we accept the null hypothesis if the observed realization of β̂ lies within this interval, and reject it otherwise. The number c is specified in advance and is usually small. It is called the significance level of the test: it gives the probability of rejecting the null hypothesis when it is correct. Under the normality assumption, the value zc can be easily obtained from a table of percentiles of the standard normal distribution.

As you should also remember from a basic statistics class, a similar logic can be applied to construct a confidence interval for βo. Note that:

    Pr(β̂ − zc (σ/√Σ xi²) ≤ βo ≤ β̂ + zc (σ/√Σ xi²)) = 1 − c

Then a 1 − c confidence interval for βo will be given by:

    β̂ ± zc (σ/√Σ xi²)

The practical problem with the previous procedures is that they require that we know σ², which is usually not the case. Instead, we can compute its estimated version S². Define t as:

    t = (β̂ − β) / (S/√Σ xi²)

t is simply z with σ² replaced by its estimate S². A very important result is that, by making this replacement, we have:

    t ∼ t(n−2)

that is, the 't-statistic' has the so-called 't distribution with n − 2 degrees of freedom'. Hence, when we use the estimated version of the variance we obtain a different distribution for the statistic used to test simple hypotheses and construct confidence intervals. Consequently, applying once again the same logic, in order to test the null hypothesis Ho : β = βo against HA : β ≠ βo we use the t-statistic:

    t = (β̂ − βo) / (S/√Σ xi²) ∼ t(n−2)
and a 1 − c confidence interval for βo will be given by:

    β̂ ± tc (S/√Σ xi²)

where now tc is a percentile of the 't' distribution with n − 2 degrees of freedom, which is usually tabulated in basic statistics and econometrics textbooks.

An important particular case is the insignificance hypothesis, Ho : β = 0 against HA : β ≠ 0. Under the null X does not help explain Y, and under the alternative X is linearly related to Y. Replacing βo by 0 above we get:

    tI = β̂ / (S/√Σ xi²) ∼ t(n−2)

which is usually reported as a standard outcome in most regression packages.

An alternative way to check the significance of the linear relationship is to look at how large the explained sum of squares ESS is. Recall that if the model has an intercept we have:

    TSS = ESS + RSS

If there is no linear relationship between Y and X, ESS should be very close to zero. Consider the following statistic, which is just a 'standardized' version of the ESS:

    F = ESS / (RSS/(n − 2))

It can be shown that, under the normality assumption, F has the F distribution with 1 degree of freedom in the numerator and n − 2 degrees of freedom in the denominator, usually labeled F(1, n − 2). Note that if X does not help explain Y in a linear sense, ESS, and hence F, should be very small. Then, we should reject the null hypothesis that X does not help explain Y if the F statistic computed from the data takes a large value, and accept it otherwise.

Note that by definition R² = ESS/TSS = 1 − RSS/TSS. Dividing both the numerator and the denominator of the F statistic by TSS and using these definitions, we can write the F statistic in terms of the R² coefficient as:

    F = R² / ((1 − R²)/(n − 2))

Then the F test is actually looking at whether the R² is significantly high. As expected, there is a close relationship between the F statistic and the 't' statistic for the insignificance hypothesis (tI): under the null of no linear relationship between Y and X, β = 0 and ESS is zero, and it can be easily shown that:

    F = tI²

We will leave the proof as an exercise.
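To close the chapter's running numerical illustration, the sketch below gathers the inference tools of this section: the insignificance t-statistic, a confidence interval, the F statistic, and the identities just stated. It is our own hedged example, not part of the original notes: it reuses X, Y, beta, S2, TSS, ESS and RSS from the earlier sketches, the 5% significance level is an arbitrary choice, and scipy is used only to obtain the 't' percentile.

    import numpy as np
    from scipy import stats

    n = len(X)
    x = X - X.mean()
    se = np.sqrt(S2 / (x ** 2).sum())          # estimated standard error of beta-hat

    t_I = beta / se                            # t-statistic for Ho: beta = 0
    t_c = stats.t.ppf(1 - 0.05 / 2, df=n - 2)  # two-sided critical value at c = 0.05
    print(abs(t_I) > t_c)                      # True in this sample: reject Ho

    ci = (beta - t_c * se, beta + t_c * se)    # 95% confidence interval for beta
    print(ci)

    R2 = ESS / TSS
    F = ESS / (RSS / (n - 2))
    print(np.isclose(F, R2 / ((1 - R2) / (n - 2))))    # F written in terms of R^2
    print(np.isclose(F, t_I ** 2))                     # F = t_I squared
    print(np.isclose(R2, np.corrcoef(X, Y)[0, 1]**2))  # R^2 = r^2 (left as an exercise)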