ESTIMATION OF PARAMETERS
Rajender Parsad
I.A.S.R.I., Library Avenue, New Delhi-110 012, India
1. Introduction
Statistics is a science which deals with collection, presentation, analysis and
interpretation of results. The procedures involved with collection, presentation, analysis
and interpretation of results can be classified into two broad categories viz.:
1. Descriptive Statistics: It deals with the methods of collecting, presenting and
describing a set of data so as to yield meaningful information.
2. Statistical Inference: It comprises those methods that are concerned with the
analysis of a subset of data leading to predictions or inferences about the entire set of
data. It is also known as the art of evaluating information to draw reliable inferences
about the true value of the phenomenon under study.
The main purposes of statistical inference are:
1. to estimate the parameters (the quantities that represent a particular characteristic of a
population) on the basis of sample observations through a statistic (a function of
sample values which does not involve any parameter); this is the theory of estimation;
2. to compare these parameters among themselves on the basis of observations and their
estimates; this is testing of hypothesis.
To distinguish clearly between theory of estimation and testing of hypothesis, let us
consider the following examples:
Example 1.1: A storage bin contains 1,00,000 seeds. These seeds after germination will
produce either red or white coloured flowers. We want to know the percentage of seeds
that will produce red coloured flowers. This is the problem of estimation.
Example 1.2: It is known that the chir pine trees can yield an average of 4 kg of resin per
blaze per season. On some trees the healed up channels were treated with a chemical.
We want to know whether the chemical treatment of healed up channels enhances the
yields of resin. This is the problem of testing of hypothesis.
In many cases like above, it may not be possible to determine and test about the value of
a population parameter by analysing the entire set of population values. The process of
determining the value of the parameter may destroy the population units or it may simply
be too expensive in money and /or time to analyse the units. Therefore, for making
statistical inferences, the experimenter has to have one or more samples of observations
from one or more variables. These observations are required to satisfy certain
assumptions viz. the observations should belong to a population having some specified
probability distribution and that they are independent. For example, in the case of Example 1.1,
it will not be practical to germinate all the seeds and compute the percentage directly,
as we may not be willing to use all the seeds at one time, or there may be a lack of
resources for maintaining the whole bulk at a time. In Example 1.2, it will not be possible
to apply the chemical treatment to the healed up channels of all the chir pine trees, and
hence we have to test our hypothesis on the basis of a sample of chir pine trees. More
specifically, statistical inference is the process of selecting and using a sample
statistic (a function of sample observations) to draw inferences about population
parameter(s) (functions of population values). Before proceeding further, it will not be
out of place to describe the meaning of parameter and statistic.
Parameter: Any value describing a characteristic of the population is called a parameter.
For example, consider the following set of data representing the number of errors made
by a secretary on 10 different pages of a document: 1, 0, 1, 2, 3, 1, 1, 4, 0 and 2. Let us
assume that the document contains exactly 10 pages so that the data constitute a small
finite population. A quick study of this population leads to a number of conclusions. For
instance, we could make the statement that the largest number of typing errors on any
single page was 4, or we might say that the arithmetic mean of the 10 numbers is 1.5. The
4 and 1.5 are descriptive properties of the population and are called parameters.
Customarily, the parameters are represented by Greek letters. Therefore, population mean
of the typing errors is µ = 1.5. It may be noted that the parameter is a constant value
describing the population.
Statistic: Any numerical value describing a characteristic of a sample is called a
statistic. For example, let us suppose that the data representing the number of typing
errors constitute a sample obtained by counting the number of errors on 10 pages
randomly selected from a large manuscript. Clearly, the population is now a much larger
set of data about which we only have partial information provided by the sample. The
numbers 4 and 1.5 are now descriptive measures of the sample and are called statistics. A
statistic is usually represented by ordinary letters of the English alphabet. If the statistic
happens to be the sample mean, we denote it by $\bar{x}$. For our random sample of typing
errors we have $\bar{x} = 1.5$. Since many random samples are possible from the same
population, we would expect the statistic to vary from sample to sample.
Now coming back to the problem of statistical inference: let $x_1, x_2, \ldots, x_n$ be a random
sample from a population whose distribution has a completely known form except that it
contains some unknown parameters, and the probability density function (pdf) or
probability mass function (pmf) of the population is given by f(X,θ). In this
situation the distribution is not known completely until we know the values of the
unknown parameters. For simplicity, let us take the case of a single unknown parameter.
The unknown parameter θ has some admissible values, which lie on the real line in the case of
a single parameter, in a plane for two parameters, in three-dimensional space in the case of three
parameters, and so on. The set of all possible values of the parameter(s) θ is called the
parametric space and is denoted by Θ. If Θ is the parameter space, then the set
{f(X,θ) : θ ∈ Θ} is called the family of pdf's of X if X is continuous and the family of
pmf's of X if X is discrete. To be clearer, let us consider the following examples.
Example 1.3: Let X∼ B(n,p) and p is unknown. Then Θ = {p: 0<p<1} and {B(n,p):
0<p<1} is the family of pmf’s of X.
Example 1.4: Let X ∼ N(µ, σ²). If both µ and σ² are unknown, then
Θ = {(µ, σ²) : −∞ < µ < ∞, σ² > 0}; if µ = µ0 (say) and σ² is unknown, then
Θ = {(µ0, σ²) : σ² > 0}.
On the basis of a random sample $x_1, x_2, \ldots, x_n$ from a population, our aim is to estimate
the unknown parameter θ. Henceforth, we shall discuss only the theory of estimation.
The estimation of unknown population parameter(s) through sample values can be done
in two ways:
1. Point Estimation
2. Interval Estimation
In the first case we are required to determine a single number which can be taken as the value of
θ, whereas in the second we are required to determine an interval (a, b) in which the
unknown parameter θ is expected to lie. For example, if the population is normal, then a
possible point estimate of the population mean is the sample mean, and a possible
interval estimate of the mean is $(\bar{x} - 3s, \; \bar{x} + 3s)$, where
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ is the sample mean and
$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$ is the sample variance.
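As a minimal illustration (a sketch, not part of the original notes; it reuses the ten typing-error counts from the earlier example as the sample), the following computes the point estimate and this rough interval estimate:

```python
import statistics

# Sample: typing errors on 10 pages (from the earlier example)
errors = [1, 0, 1, 2, 3, 1, 1, 4, 0, 2]

x_bar = statistics.mean(errors)   # point estimate of the population mean
s = statistics.stdev(errors)      # sample standard deviation (n - 1 divisor)

# A rough interval estimate of the mean: (x_bar - 3s, x_bar + 3s)
print(x_bar, s, (x_bar - 3 * s, x_bar + 3 * s))
```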
Estimator: An estimator is a function of sample observations whose value at a given realization of
the observations gives the estimate of the population parameter. On the other hand, an
estimate is the numerical value of the estimator for a given sample. Thus, an
estimator is a random variable calculated from the sample data that supplies either
interval estimates or point estimates for population parameters.
It is essential to distinguish between an estimator and an estimate. The distinction
between an estimator and an estimate is the same as that between a function 'f', regarded as
defined for a range of a variable X, and the particular value which the function assumes,
say f(a), for a specified value X = a. For instance, if the sample mean $\bar{x}$ is used to
estimate a population mean (µ), and the sample mean is 15, the estimator used is the
sample mean whereas the estimate is 15. Thus the statistic which is used to estimate a
parameter is an estimator whereas the numerical value of the estimator is called an
estimate.
2. Point Estimation
A point estimator is a random variable calculated from the sample data that
supplies the point estimates for population parameters.
Let $x_1, x_2, \ldots, x_n$ be a random sample from a population with pdf or pmf
$f(X, \theta)$, θ ∈ Θ, where θ is unknown. We want to estimate θ or τ(θ). Then
$t_n = t(x_1, x_2, \ldots, x_n)$ is said to be a point estimator of θ or τ(θ) if $t_n$ is close to θ or τ(θ).
In general there can be several alternative procedures that can be adopted to obtain the
point estimate of the population parameter. For instance, we may compute arithmetic
mean, median, or geometric mean to estimate the population mean. As we never know
the true value of the parameter, it does not make sense to ask whether our estimate is
exactly correct. Therefore, if there is more than one estimator, the question arises as to which
among them is better. This means that we must stipulate some criteria which can be
applied to decide whether one estimator is better than another: although an estimator
is not expected to estimate the population parameter without error, we also do not
expect it to be very far off. The following are the criteria which should be satisfied
by a good estimator:
1. Unbiasedness
2. Consistency
3. Efficiency
4. Sufficiency
Unbiasedness: An estimator (T) is said to be unbiased if the expected value of the
estimator is equal to the population parameter (θ) being estimated. That is, if the
same estimator is computed for all possible samples and we average these values,
we would expect the average to be the same as the true value of the parameter. For instance, if
the sample mean ($\bar{x}$) is an unbiased estimator for the population mean (µ), then the
expected value of the sample mean equals the population mean; in symbols, E($\bar{x}$) = µ.
Similarly, the sample variance $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$ is an unbiased
estimator of the population variance (σ²) because E(s²) = σ².
Steps to check whether a given estimator is unbiased or not
1. Draw all possible samples of a given size from the population
2. Calculate the value of given estimator for all these samples separately.
3. Take the average of all these values obtained in Step 2.
If this average is equal to the population parameter, the estimator is unbiased; if the
average exceeds the population parameter, the estimator is said to be positively biased;
and if it is less than the population parameter, it is said to be negatively biased.
Consistency: An estimator (T) is said to be consistent estimator of parameter (θ) if, as
the sample size 'n' is increased, the estimator (T) converges to θ in probability. This is
an intuitively appealing characteristic for an estimator to possess, for it says that as the
sample size increases (which should mean, in most reasonable circumstances, that more
information becomes available), the estimate becomes "better" in the sense indicated.
Steps to check whether a given estimator is consistent or not
1. Show that T is an unbiased estimator of θ.
2. Obtain the variance of T i.e. variance of all the values of T obtained from all possible
samples of a particular size.
3. Increase the sample size and repeat the above two steps.
4. If the variance of T decreases as the sample size (n) increases and approaches zero as n
becomes infinitely large, then T is said to be a consistent estimator of θ.
Efficiency: It is sometimes possible to find more than one estimator which is unbiased
and consistent for a population parameter. For instance, in the case of a normal population
N(µ, σ²), the sample mean ($\bar{x}$) and the sample median ($x_{md}$) are both unbiased
and consistent estimators for the population mean (µ). However, it can easily be seen that

$\mathrm{Var}(x_{md}) \approx \frac{\pi}{2}\left(\frac{\sigma^2}{n}\right) > \frac{\sigma^2}{n} = \mathrm{Var}(\bar{x})$, as $\frac{\pi}{2} > 1$.
The variance of a random variable measures the variability of the random variable
about its expected value. Hence, it is intuitively appealing that an unbiased estimator with
smaller variance is preferable to an unbiased estimator with larger variance. Therefore,
in the above example the sample mean is preferable to the sample median. Thus there is a need
for some further criterion which will enable us to choose the best estimator. Such a
criterion, based on the concept of variance, is known as efficiency.
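This variance comparison is easy to see by simulation. The sketch below (an illustrative aid; the sample size and replication count are arbitrary choices) draws repeated normal samples and compares the empirical variances of the sample mean and the sample median:

```python
import random
import statistics

random.seed(1)
n, reps, mu, sigma = 25, 20000, 0.0, 1.0

means, medians = [], []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    means.append(statistics.mean(sample))
    medians.append(statistics.median(sample))

# Var(mean) is near sigma^2/n; Var(median) is near (pi/2) * sigma^2/n
print("Var(mean):  ", statistics.pvariance(means))
print("Var(median):", statistics.pvariance(medians))
print("ratio (expect about pi/2 = 1.57):",
      statistics.pvariance(medians) / statistics.pvariance(means))
```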
Minimum Variance Unbiased Estimator (MVUE): An estimator (T) is said to be an
MVUE for a population parameter θ if T is unbiased and has the smallest variance among
all the unbiased estimators of θ; for this reason it is also called the most efficient
estimator of θ.
The ratio of the variance of an MVUE to the variance of a given unbiased estimator is
termed the efficiency of the given unbiased estimator.
There exist some general techniques, viz. the Cramer-Rao inequality, the Rao-Blackwell
theorem and the Lehmann-Scheffe theorem, for finding minimum variance unbiased estimators.
Best Linear Unbiased Estimator: An estimator (T) is said to be the Best Linear Unbiased
Estimator (BLUE) of θ if
1. T is unbiased for θ.
2. T is a linear function of sample observations.
3. T has the minimum variance among all unbiased estimators of θ which are linear
functions of the sample observations.
Example 2.1: It is claimed that a particular chemical, when applied to some ornamental
plants, will rapidly increase the height of the plant within a period of one week. The increases
in heights of 5 plants to which this chemical was applied are given below:
Plants 1 2 3 4 5
Increase in Height (cm) 5 7 8 9 6
Assuming that the distribution of increases in height is normal, draw all possible samples of size 3
with replacement and show that the sample mean ($\bar{x}$) and the sample median ($x_{md}$) are
unbiased estimators for the population mean.
Solution:
Step 1: Obtain the population mean.
Population mean = (5 + 7 + 8 + 9 + 6)/5 = 35/5 = 7 cm.
Step 2: Draw all possible samples of size three with replacement and obtain their sample
mean and sample median.
Sample Mean   Frequency      Sample Median   Frequency
15/3          1              5               13
16/3          3              6               31
17/3          6              7               37
18/3          10             8               31
19/3          15             9               13
20/3          18
21/3          19
22/3          18
23/3          15
24/3          10
25/3          6
26/3          3
27/3          1
Step 3: Obtain the mean of the sample means.
Mean of the sample means = 7 cm; variance of the sample means = 0.667 cm².
Step 4: Obtain the mean of the sample medians.
Mean of the sample medians = 7 cm; variance of the sample medians = 1.328 cm².
(The population median is also 7 cm.)
Therefore, we can see that both the sample mean and the sample median are unbiased
estimators for the population mean in the case of a normal population.
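This enumeration is easy to verify programmatically. The sketch below (an illustrative aid, not part of the original notes) generates all 5³ = 125 ordered samples of size 3 drawn with replacement and reproduces the means and variances quoted above:

```python
from itertools import product
from statistics import mean, median, pvariance

population = [5, 7, 8, 9, 6]  # increases in height (cm)

samples = list(product(population, repeat=3))  # all 125 ordered samples
sample_means = [mean(s) for s in samples]
sample_medians = [median(s) for s in samples]

print(mean(sample_means), pvariance(sample_means))      # 7.0, 0.667
print(mean(sample_medians), pvariance(sample_medians))  # 7.0, 1.328
```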
Sufficiency: An estimator (T) is said to be sufficient for the parameter θ, if it contains
all the information in the sample regarding the parameter. This criterion has a practical
importance in the sense that after the data is collected either from a sample survey or a
designed experiment, the job of a statistician is to draw some statistically valid
Estimation of Parameters
53
conclusions about the population under investigation. The raw data by themselves,
besides being costly to store, are not suitable for this purpose. Therefore, the statistician
would like to condense the data by computing some statistic from them and to base his
analysis on this statistic, provided that there is no loss of information in doing so. In
many problems of statistical inference, a function of the observations contains as
much information about the unknown parameter as do all the observed values. To make
it clearer, let us consider the following example:
Example 2.2: Suppose you wish to play a coin tossing game against an adversary who
supplies the coin. If the coin is fair and you win a dollar if you predict the outcome of a
toss correctly and lose a dollar otherwise, then your net expected gain is zero. Since your
adversary supplies the coin you may want to check if the coin is fair before you start
playing the game, i.e., to test H0: p = 0.5 against H1: p ≠ 0.5. You toss the coin 'n' times;
should you record the outcome of each trial, or is it enough to know the total number of
heads in ‘n’ tosses to test H0? Intuitively it seems clear that the number of heads in ‘n’
trials contains all the information about the unknown parameter ‘p’ and precisely this is
the information which we have used so far in the problems of inference concerning ‘p’.
Writing $x_i = 1$ if the $i$th toss results in a head and $x_i = 0$ otherwise, and setting
$T = T(x_1, \ldots, x_n) = \sum_{i=1}^{n} x_i$, we note that T is the number of heads in 'n'
trials. Clearly, there is a substantial reduction in data collection and storage if we record
the value T = t rather than the observation vector $(x_1, \ldots, x_n)$, because t can take
only n + 1 values whereas the vector can take $2^n$ values. Therefore, whatever decision we
make about H0 should depend on the value of t.
It can easily be seen that the trivial statistic $T(x_1, \ldots, x_n) = (x_1, \ldots, x_n)$ is always
sufficient but does not provide any reduction of the data. Hence, it is not preferable, as our
aim is to condense the data while retaining all the information about the parameter
contained in the sample. A sufficient statistic which reduces the data most is
called a minimal sufficient statistic.
One way to check whether a given statistic is sufficient is to verify that the
conditional distribution of $x_1, \ldots, x_n$ given T (the proposed sufficient statistic) is
independent of the population parameter.
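For the coin-tossing example this check can be carried out by brute force. The sketch below (illustrative only; the value of n and the two p values are arbitrary) enumerates all sequences of n Bernoulli trials and shows that P(x1, ..., xn | T = t) does not depend on p:

```python
from itertools import product

def conditional_probs(p, n=4):
    # P(sequence) = p^t (1-p)^(n-t) with t = number of heads, so given
    # T = t each of the C(n, t) sequences is equally likely: p cancels.
    seq_prob = {s: p**sum(s) * (1 - p)**(n - sum(s))
                for s in product([0, 1], repeat=n)}
    t_prob = {}
    for s, pr in seq_prob.items():
        t_prob[sum(s)] = t_prob.get(sum(s), 0.0) + pr
    return {s: seq_prob[s] / t_prob[sum(s)] for s in seq_prob}

# The conditional distribution is identical for two different values of p:
a, b = conditional_probs(0.3), conditional_probs(0.7)
print(all(abs(a[s] - b[s]) < 1e-12 for s in a))  # True
```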
Until now, we have discussed several properties of good estimators, viz. unbiasedness,
consistency, efficiency and sufficiency, that seem desirable in the context of point
estimation. Thus, we would like to check whether a proposed estimator satisfies all or
some of these criteria. However, if we are faced with a point estimation problem, the
question arises as to where we can start to look for an estimator. Therefore, it is
convenient to have one (or several) intuitively reasonable methods of generating possibly
good estimators to study our problem. The principal methods of obtaining point
estimators are:
1. Method of moments
2. Method of minimum chi-square
3. Method of least squares
4. Method of maximum likelihood.
The application of the above-mentioned methods in particular cases leads to estimators
which may differ and hence possess different attributes of goodness. The most important
method of point estimation is the method of maximum likelihood which provides
estimators with desirable properties.
Method of Maximum Likelihood: To introduce the method of maximum likelihood,
consider a very simple estimation problem. Suppose that an urn contains a number of
black and a number of white balls and suppose that it is known that the ratio of the
numbers is 3:1 but that it is not known whether the black or white balls are more
numerous; i.e., the probability of drawing a black ball is either 1/4 or 3/4. Suppose 'n' balls
are drawn with replacement from the urn. Then the distribution of X, the number of black
balls, is binomial with probability mass function

$f(X, p) = \binom{n}{X} p^X q^{n-X}$, for X = 0, 1, ..., n,
where q = 1 - p and p is the probability of drawing a black ball, here p = 1/4 or 3/4. We
shall draw a sample of three balls, i.e., n = 3 with replacement and attempt to estimate the
unknown parameter ‘p’ of the distribution. The estimation problem is particularly simple
in this case because we have only to choose between the two numbers 1/4 and 3/4. The
possible outcomes of the sample and their probabilities are given below:
Outcome : X 0 1 2 3
f(X;3/4) 1/64 9/64 27/64 27/64
f(X;1/4) 27/64 27/64 9/64 1/64
In the present example, if we found that X=0, the estimate 1/4 for ‘p’ would be preferred
over 3/4 because the probability 27/64 is greater than 1/64, i.e., because a sample with
X=0 is more likely ( in the sense of having larger probability ) to arise from a population
with p=1/4 than from one with p=3/4. In general, we substitute ‘p’ by 1/4 when X=0 or
1 and by 3/4 when X = 2 or 3. The estimator may thus be defined as

$\hat{p} = \hat{p}(X) = \begin{cases} 1/4, & \text{for } X = 0 \text{ or } 1 \\ 3/4, & \text{for } X = 2 \text{ or } 3. \end{cases}$
The estimator thus selects, for every possible value of X, the value of p, say $\hat{p}$, such that
$f(X; \hat{p}) > f(X; p')$,
where $p'$ is any other value of p, 0 < p' < 1.
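The outcome-probability table above, and the resulting maximum likelihood choice, can be reproduced with a few lines of code (an illustrative sketch; the candidate set {1/4, 3/4} comes from the example):

```python
from math import comb

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, candidates = 3, (0.25, 0.75)
for x in range(n + 1):
    like = {p: binom_pmf(x, n, p) for p in candidates}
    p_hat = max(like, key=like.get)  # MLE over the two candidate values
    print(f"X={x}: f(X;1/4)={like[0.25]:.4f}, f(X;3/4)={like[0.75]:.4f}, "
          f"p_hat={p_hat}")
```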
Let us consider another experimental situation. A lion has turned man-eater. The lion has
three possible states of activity each night; they are "very active" (denoted by θ1),
“moderately active” (denoted by θ2) and “lethargic” (denoted by θ3). This lion eats ‘i’
people with probability P(i/θ), θ ∈ Θ = {θ1, θ2, θ3}. The numerical values are given in
the table below:
Lion’s Appetite Distribution
i 0 1 2 3 4
P(i/θ1) .00 .05 .05 .80 .10
P(i/θ2) .05 .05 .80 .10 .00
P(i/θ3) .90 .08 .02 .00 .00
Suppose we are told that X = x0 people were eaten last night and are asked to estimate the
lion's activity state θ1, θ2 or θ3. One seemingly reasonable method is to estimate θ as that
θ ∈ Θ which provides the largest probability of observing what we did observe. From the
table, $\hat\theta(0) = \theta_3$, $\hat\theta(1) = \theta_3$, $\hat\theta(2) = \theta_2$, and
$\hat\theta(3) = \hat\theta(4) = \theta_1$. Thus the maximum likelihood estimator ($\hat\theta$)
of a population parameter is that value of θ which maximizes the likelihood function, i.e., the
joint pdf/pmf of the sample observations taken as a function of θ.
MLE for the population mean: The MLE of the population mean µ, based on a random
sample of size n, is the sample mean $\bar{x}$, and if the variance of the population units
$X_i$ is σ², then the variance of $\bar{x}$ is σ²/n. It can easily be seen that $\bar{x}$ is an
unbiased, consistent, sufficient and efficient estimator of µ.
MLE for a proportion: The MLE of the proportion 'p' in a binomial experiment is given
by $\hat{p} = x/n$, where x represents the number of successes in 'n' trials. The variance of
$\hat{p}$ is p(1-p)/n and E($\hat{p}$) = p, so the sample proportion $\hat{p} = x/n$ is an
unbiased, consistent and sufficient estimator of 'p'.
MLE for the population variance: In the case of large samples from any population, or small
samples from a normal population, the MLE of the population variance σ² when the
population mean is unknown is given by

$S^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$,

where $x_1, \ldots, x_n$ are the sample observations and $\bar{x}$ is the sample mean. Here

$E(S^2) = \left(1 - \frac{1}{n}\right)\sigma^2$ and $\mathrm{Var}(S^2) = \frac{2(n-1)}{n^2}\sigma^4$.

It can easily be seen that as n → ∞, S² is a consistent estimator of the population variance;
it can also be proved that it is an asymptotically unbiased and asymptotically efficient
estimator of the population variance. However, an exact unbiased estimator of the population
variance is $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$. Therefore, it can be
inferred that MLEs are not in general unbiased.
However, quite often the bias may be removed by multiplying by an appropriate constant, as
in the above case: if we multiply S² by n/(n-1) we get s², an unbiased estimator for σ².
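A simulation makes the bias of the MLE visible (a minimal sketch; the normal population, the seed and the sample size are arbitrary choices, with a small n so the bias is obvious):

```python
import random
import statistics

random.seed(0)
n, reps, sigma2 = 5, 50000, 4.0

mle_vals, unbiased_vals = [], []
for _ in range(reps):
    x = [random.gauss(0.0, sigma2**0.5) for _ in range(n)]
    xbar = sum(x) / n
    ss = sum((xi - xbar)**2 for xi in x)
    mle_vals.append(ss / n)             # S^2, the MLE
    unbiased_vals.append(ss / (n - 1))  # s^2, the unbiased estimator

print(statistics.mean(mle_vals))       # near (1 - 1/n) * sigma2 = 3.2
print(statistics.mean(unbiased_vals))  # near sigma2 = 4.0
```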
Point estimators, however, are not ideal estimators of population parameters in the
sense that even an MVUE is unlikely to estimate the population parameter exactly. It is
true that accuracy increases with large samples, but there is still no reason to expect a
point estimate from a given sample to be exactly equal to the population parameter it is
supposed to estimate. Point estimators fail to throw light on how close such an estimate
can be expected to lie to the population parameter we wish to estimate; we cannot
associate a probability statement with a point estimator. Therefore, it is desirable to
determine an interval within which we expect to find the value of the parameter, with
some probability statement associated with it. This is done through interval estimation.
3. Interval Estimation
An interval estimator is a formula that tells us how to use sample data to calculate an
interval that estimates a population parameter.
Let $x_1, x_2, \ldots, x_n$ be a sample from a population with pdf or pmf f(x,θ), θ ∈ Θ. Our aim
is to find two estimators $T_1 = T_1(x_1, \ldots, x_n)$ and $T_2 = T_2(x_1, \ldots, x_n)$ such that

P{T1 ≤ θ ≤ T2} = 1 - α.

Then the interval (T1, T2) is called a 100(1-α)% confidence interval (CI) estimator, with
100(1-α)% as the confidence coefficient. The confidence coefficient is the probability that
an interval estimator encloses the population parameter if the estimator is used repeatedly
a large number of times. T1 and T2 are the lower and upper bounds of the CI; for a particular
application we substitute the appropriate numerical values for the confidence coefficient
and the lower and upper bounds. The
above statement reflects our confidence in the process rather than in the particular
interval formed. We know that 100 (1-α)% of the resulting intervals will contain the
population parameter. There is usually no way to determine whether a particular interval
is one of those which contain the population parameter or one that does not. However,
unlike point estimators, confidence intervals have some measure of reliability, the
confidence coefficient, associated with them, and for that reason they are preferred to
point estimators.
Thus, to obtain a 100(1-α)% confidence interval: if α = 0.05 we have a 95% confidence
interval, and when α = 0.01 we obtain a wider 99% confidence interval. The wider the
confidence interval is, the more confident we can be that it contains the unknown
parameter. Of course, it is better to be 95% confident that the average life of a
machine is between 12 and 15 years than to be 99% confident that it is between 8 and 18
years. Ideally, we prefer a short interval with a high degree of confidence. Sometimes,
restrictions on the size of our sample prevent us from achieving short intervals without
sacrificing some of our degree of confidence.
Confidence Interval for the Population Mean
Consider a sample selected from a normal population or, failing this, a sample with 'n'
sufficiently large. Let the population mean be µ and the population variance be σ².
Confidence Interval for µ, σ known
If $\bar{x}$ is the mean of a random sample of size n from a population with known variance σ²,
a 100(1-α)% confidence interval for µ is

$\bar{x} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$

where $Z_{\alpha/2}$ is the Z-value with an area α/2 to its right.
The 100(1-α)% CI provides an estimate of the accuracy of our point estimate: if $\bar{x}$ is
used as an estimate of µ, we can be 100(1-α)% confident that the error will not exceed
$Z_{\alpha/2}\,\sigma/\sqrt{n}$.
Frequently, we wish to know how large a sample is necessary to ensure that the error in
estimating the population mean µ will not exceed a specified amount e. By the above, we
must choose n such that $Z_{\alpha/2}\,\sigma/\sqrt{n} = e$.
Sample size for estimating µ
If $\bar{x}$ is used as an estimate of µ, we can be 100(1-α)% confident that the error will not
exceed 'e' above or 'e' below (i.e., the width of the interval will not exceed W = 2e) when
the sample size is

$n = \left(\frac{Z_{\alpha/2}\,\sigma}{e}\right)^2$ or $n = 4\left(\frac{Z_{\alpha/2}\,\sigma}{W}\right)^2$.

When solving for the sample size n, all fractional values are rounded up to the next whole
number.
When the value of σ is unknown and the sample size is large, σ can be replaced by the sample
standard deviation S, where $S^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$, and the above
formulae can be used.
Example 3.1: Unoccupied seats on flights cause the airlines to lose revenue. Suppose a
large airline wants to estimate its average number of unoccupied seats per flight over the
past year. To accomplish this, the records of 225 flights are randomly selected and the
number of unoccupied seats is noted for each of the sample flights. The sample mean
and standard deviation are
$\bar{x}$ = 11.6 seats, S = 4.1 seats
Estimate µ, the mean number of unoccupied seats per flight during the past year using a
90% confidence interval.
Solution: For a 90% confidence interval, α = 0.10. The general form of the large-sample
90% confidence interval for a population mean is

$\bar{x} \pm Z_{\alpha/2}\frac{S}{\sqrt{n}} = 11.6 \pm 1.645 \times \frac{4.1}{\sqrt{225}} = 11.6 \pm 0.45$,

i.e., (11.15, 12.05). That is, the airline can be 90% confident that the mean number of
unoccupied seats per flight was between 11.15 and 12.05 during the sampled year.
In this example, we are 90% confident that the sample mean $\bar{x}$ differs from the true mean
by no more than 0.45. If, in the above example, we want the estimate of µ to be off by no
more than 0.05 seats, we set $0.05 = 1.645 \times 4.1/\sqrt{n}$, which implies
$n = \left(\frac{1.645 \times 4.1}{0.05}\right)^2 = 18195.31$, i.e., a sample of 18196 flights.
However, if we can tolerate an error margin of 1 seat, then
$n = \left(\frac{1.645 \times 4.1}{1}\right)^2 = 45.49 \approx 46$ is enough.
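The computations in this example are easy to script (a sketch assuming scipy is available for the normal quantile; the helper function name is mine):

```python
import math
from scipy.stats import norm

xbar, s, n, alpha = 11.6, 4.1, 225, 0.10
z = norm.ppf(1 - alpha / 2)  # about 1.645 for a 90% CI

half_width = z * s / math.sqrt(n)
print(xbar - half_width, xbar + half_width)  # about (11.15, 12.05)

def sample_size(sigma, e, alpha=0.10):
    # n needed so the estimation error does not exceed e
    z = norm.ppf(1 - alpha / 2)
    return math.ceil((z * sigma / e) ** 2)

print(sample_size(4.1, 0.05))  # about 18193 (the text, with z = 1.645, gets 18195.3)
print(sample_size(4.1, 1.0))   # 46
```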
Exercise 1: The mean and standard deviation for quality grade-point averages of a
random sample of 36 college seniors are calculated to be 2.6 and 0.3, respectively. Obtain
95% and 99% confidence intervals for the mean grade-point average of the entire senior
class. (Use $Z_{0.025}$ = 1.96 and $Z_{0.005}$ = 2.575.)
Small sample confidence interval for µ, σ unknown
If $\bar{x}$ and s are the mean and standard deviation of a random sample of size n < 30 from an
approximately normal population with unknown variance σ², a 100(1-α)% confidence
interval for µ is

$\bar{x} - t_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2}\frac{s}{\sqrt{n}}$

where $t_{\alpha/2}$ is the t-value with n - 1 degrees of freedom leaving an area of α/2 to the right.
Estimating the difference between two population means
Confidence interval for µ1 - µ2, σ1² and σ2² known: If $\bar{x}_1$ and $\bar{x}_2$ are the means of
independent random samples of sizes n1 and n2 from populations with known variances
σ1² and σ2² respectively, a 100(1-α)% confidence interval for µ1 - µ2 is given by

$(\bar{x}_1 - \bar{x}_2) - Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + Z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
The above CI for estimating the difference between two means is applicable if σ1² and σ2²
are known or can be estimated from large samples. If the sample sizes n1 and n2 are small
(< 30) and σ1² and σ2² are unknown, the above interval will not be reliable.
Small-sample confidence interval for µ1 - µ2; σ1² = σ2² = σ² unknown: If $\bar{x}_1$ and
$\bar{x}_2$ are the means of small independent random samples of sizes n1 and n2 respectively,
from approximately normal populations with unknown but equal variances, a 100(1-α)%
CI for µ1 - µ2 is given by

$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$

where $s_p$ is the pooled estimate of the population standard deviation,

$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$,

and $t_{\alpha/2}$ is the t-value with n1 + n2 - 2 degrees of freedom, leaving an area of α/2 to the right.
Small-sample confidence interval for µ1 - µ2; σ1² ≠ σ2² unknown: If $\bar{x}_1$, $s_1^2$ and
$\bar{x}_2$, $s_2^2$ are the means and variances of small independent samples of sizes n1 and n2
respectively, from approximately normal distributions with unknown and unequal
variances, an approximate 100(1-α)% confidence interval for µ1 - µ2 is given by

$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

where $t_{\alpha/2}$ is the t-value with

$\nu = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$

degrees of freedom, leaving an area α/2 to the right.
Confidence Interval for µD = µ1 -µ2 for paired observations
If $\bar{d}$ and $s_d$ are the mean and standard deviation of the differences of n random pairs of
measurements, a 100(1-α)% confidence interval for $\mu_D = \mu_1 - \mu_2$ is

$\bar{d} - t_{\alpha/2}\frac{s_d}{\sqrt{n}} < \mu_D < \bar{d} + t_{\alpha/2}\frac{s_d}{\sqrt{n}}$

where $t_{\alpha/2}$ is the t-value with n - 1 degrees of freedom, leaving an area of α/2 to the right.
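The t-based intervals above differ only in their standard error and degrees of freedom. A small helper for the unequal-variance case (a sketch assuming scipy; the function name is mine, and the illustrative inputs simply borrow the summary data of Example 3.4 below):

```python
import math
from scipy.stats import t

def welch_ci(xbar1, s1, n1, xbar2, s2, n2, alpha=0.05):
    # Approximate CI for mu1 - mu2 with unequal, unknown variances
    v1, v2 = s1**2 / n1, s2**2 / n2
    se = math.sqrt(v1 + v2)
    # Welch-Satterthwaite degrees of freedom
    df = (v1 + v2)**2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    margin = t.ppf(1 - alpha / 2, df) * se
    diff = xbar1 - xbar2
    return diff - margin, diff + margin

print(welch_ci(82, 8, 25, 78, 7, 16))
```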
Example 3.2: A random sample of size 30 was taken from an apple orchard. The
distribution of the weights of the apples is given below:
Wt (gm):   125 150 175 200 225 250 275 300 325 350
Frequency:   1   4   3   5   4   7   4   1   1   0
Construct a 95% confidence interval for the population mean, i.e., the average weight of
the apples, if
i) the population variance is given to be 46.875 gm²;
ii) the population variance is unknown.
Solution:
i) Step 1: Obtain the sample mean: $\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = 220.833$.
Step 2: As α = 0.05, $Z_{\alpha/2} = Z_{0.025} = 1.96$.
Step 3: Obtain the interval as

$\left(\bar{x} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}, \; \bar{x} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = (218.38, 223.28)$.

ii) Step 1: Obtain the sample variance: $s^2 = \frac{1}{n-1}\sum_i f_i (x_i - \bar{x})^2 = 2503.592$.
Step 2: Look up $t_{29}(0.025) = 2.045$.
Step 3: Obtain the confidence interval as

$\left(\bar{x} - t_{n-1,\alpha/2}\frac{s}{\sqrt{n}}, \; \bar{x} + t_{n-1,\alpha/2}\frac{s}{\sqrt{n}}\right) = (202.152, 239.512)$.
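These numbers can be checked by expanding the frequency table (a verification sketch assuming scipy; the printed values match the solution above):

```python
import math
from scipy.stats import norm, t

weights = [125, 150, 175, 200, 225, 250, 275, 300, 325, 350]
freqs   = [  1,   4,   3,   5,   4,   7,   4,   1,   1,   0]

data = [w for w, f in zip(weights, freqs) for _ in range(f)]
n = len(data)                                    # 30
xbar = sum(data) / n                             # 220.833
s2 = sum((x - xbar)**2 for x in data) / (n - 1)  # 2503.592

# (i) known population variance 46.875 gm^2
m = norm.ppf(0.975) * math.sqrt(46.875 / n)
print(xbar - m, xbar + m)                        # about (218.38, 223.28)

# (ii) unknown variance: use t with n - 1 degrees of freedom
m = t.ppf(0.975, n - 1) * math.sqrt(s2 / n)
print(xbar - m, xbar + m)                        # about (202.15, 239.51)
```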
Large-sample confidence interval for p: If $\hat{p}$ is the proportion of successes in a
random sample of size n, and $\hat{q} = 1 - \hat{p}$, an approximate 100(1-α)% confidence
interval for the binomial parameter p is given by

$\hat{p} - Z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + Z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$

where $Z_{\alpha/2}$ is the Z-value leaving an area of α/2 to the right.
The method for finding a confidence interval for the binomial parameter p is also
applicable when the binomial distribution is being used to approximate the
hypergeometric distribution, that is, when n is small relative to the population size N.
Error in estimating p: If $\hat{p}$ is used as an estimate of p, then we can be 100(1-α)%
confident that the error will not exceed $Z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$.
Sample size for estimating p: If $\hat{p}$ is used as an estimate of p, then we can be
100(1-α)% confident that the error will not exceed a specified amount e above or below
when the sample size is

$n = \frac{Z_{\alpha/2}^2\,\hat{p}\hat{q}}{e^2}$.

The above result is somewhat misleading in the sense that we must use $\hat{p}$ to determine
the sample size n, but $\hat{p}$ is computed from the sample. If a crude estimate of p can be
made without taking a sample, we could use this value for $\hat{p}$ and then determine n.
Lacking such an estimate, we could take a preliminary sample of size n ≥ 30 to provide an
estimate of p; then, using the above result, we could determine approximately how many
observations are needed to provide the desired degree of accuracy. Once again, all
fractional values of n are rounded up to the next whole number.
Alternatively, we may substitute $\hat{p}$ = 1/2 into the formula for n; when p actually
differs from 1/2, n will turn out to be larger than necessary for the specified degree of
confidence, and as a result our degree of confidence will increase. That is, if $\hat{p}$ = 1/2
is used, we can be at least 100(1-α)% confident that the error will not exceed a specified
amount e when the sample size is

$n = \frac{Z_{\alpha/2}^2}{4e^2}$.
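A short sketch of both sample-size rules (illustrative; the helper name and the numeric inputs are mine):

```python
import math
from scipy.stats import norm

def n_for_proportion(e, p_hat=None, alpha=0.05):
    # Sample size so the error in estimating p stays below e.
    # With no prior guess for p, use the conservative p_hat = 1/2,
    # which maximizes p(1 - p).
    z = norm.ppf(1 - alpha / 2)
    p = 0.5 if p_hat is None else p_hat
    return math.ceil(z**2 * p * (1 - p) / e**2)

print(n_for_proportion(0.03))             # conservative: 1068
print(n_for_proportion(0.03, p_hat=0.2))  # smaller with a crude guess: 683
```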
Large-sample confidence interval for p1 - p2: If $\hat{p}_1$ and $\hat{p}_2$ are the
proportions of successes in random samples of sizes n1 and n2 respectively,
$\hat{q}_1 = 1 - \hat{p}_1$ and $\hat{q}_2 = 1 - \hat{p}_2$, an approximate 100(1-α)%
confidence interval for the difference of two binomial parameters, p1 - p2, is given by

$(\hat{p}_1 - \hat{p}_2) - Z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} < p_1 - p_2 < (\hat{p}_1 - \hat{p}_2) + Z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$

where $Z_{\alpha/2}$ is the Z-value leaving an area of α/2 to the right.
Confidence interval for σ²: If s² is the variance of a random sample of size n from a
normal population, a 100(1-α)% confidence interval for σ² is given by

$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$

where $\chi^2_{\alpha/2}$ and $\chi^2_{1-\alpha/2}$ are χ² values with n - 1 degrees of freedom
leaving areas of α/2 and 1 - α/2, respectively, to the right. A 100(1-α)% confidence interval
for σ is obtained by taking the square root of each endpoint of the interval for σ².
Example 3.3: The following are the volumes, in deciliters, of 10 cans of peaches
distributed by a certain company: 46.4, 46.1, 45.8, 47.0, 46.1, 45.9, 45.8, 46.9, 45.2 and
46.0. Find a 95% confidence interval for the variance of all such cans of peaches
distributed by this company, assuming volume to be a normally distributed variable.
Solution:
Step 1: Find the sample variance: s² = 0.286.
Step 2: To obtain a 95% confidence interval, we choose α = 0.05. Then
$\chi^2_{0.025}(9) = 19.023$ and $\chi^2_{0.975}(9) = 2.700$.
Step 3: Substitute the above values in the formula

$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$.

We get the 95% confidence interval

$\frac{(9)(0.286)}{19.023} < \sigma^2 < \frac{(9)(0.286)}{2.700}$,

or simply 0.135 < σ² < 0.953.
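A verification sketch for this example, using scipy's chi-square quantiles (note that the quantile with area α/2 to its right is chi2.ppf(1 - α/2, n - 1)):

```python
from scipy.stats import chi2

volumes = [46.4, 46.1, 45.8, 47.0, 46.1, 45.9, 45.8, 46.9, 45.2, 46.0]
n = len(volumes)
xbar = sum(volumes) / n
s2 = sum((v - xbar)**2 for v in volumes) / (n - 1)     # 0.286

alpha = 0.05
lower = (n - 1) * s2 / chi2.ppf(1 - alpha / 2, n - 1)  # divide by 19.023
upper = (n - 1) * s2 / chi2.ppf(alpha / 2, n - 1)      # divide by 2.700
print(s2, (lower, upper))  # 0.286, about (0.135, 0.953)
```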
Confidence interval for σ1²/σ2²: If $s_1^2$ and $s_2^2$ are the variances of independent
samples of sizes n1 and n2, respectively, from normal populations, then a 100(1-α)%
confidence interval for σ1²/σ2² is

$\frac{s_1^2}{s_2^2}\,\frac{1}{F_{\alpha/2}(v_1, v_2)} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{s_1^2}{s_2^2}\,F_{\alpha/2}(v_2, v_1)$

where $F_{\alpha/2}(v_1, v_2)$ is an F value with $v_1 = n_1 - 1$ and $v_2 = n_2 - 1$ degrees of
freedom leaving an area of α/2 to the right, and $F_{\alpha/2}(v_2, v_1)$ is a similar F value
with $v_2 = n_2 - 1$ and $v_1 = n_1 - 1$ degrees of freedom.
Example 3.4: A standardized placement test in mathematics was given to 25 boys and
16 girls. The boys made an average grade of 82 with a standard deviation of 8, while the
girls made an average grade of 78 with a standard deviation of 7. Find a 98% confidence
interval for σ1²/σ2² and for σ1/σ2, where σ1² and σ2² are the variances of the populations
of grades for all boys and girls, respectively, who at some time have taken or will take
this test. Assume the populations to be normally distributed.
Solution: We have n1 = 25, n2 = 16, s1 = 8 and s2 = 7.
Step 1: For a 98% confidence interval, α = 0.02, so $F_{0.01}(24, 15) = 3.29$ and
$F_{0.01}(15, 24) = 2.89$.
Step 2: Substituting these in the formula

$\frac{s_1^2}{s_2^2}\,\frac{1}{F_{\alpha/2}(v_1, v_2)} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{s_1^2}{s_2^2}\,F_{\alpha/2}(v_2, v_1)$,

we obtain the 98% confidence interval

$\frac{64}{49} \cdot \frac{1}{3.29} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{64}{49} \cdot 2.89$,

which simplifies to $0.397 < \frac{\sigma_1^2}{\sigma_2^2} < 3.775$.

Step 3: Taking square roots of the confidence limits, a 98% confidence interval for σ1/σ2 is

$0.630 < \frac{\sigma_1}{\sigma_2} < 1.943$.
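The same interval with scipy's F quantiles (a verification sketch; f.ppf(0.99, 24, 15) is the upper 1% point used above):

```python
from scipy.stats import f

n1, n2, s1, s2, alpha = 25, 16, 8.0, 7.0, 0.02
ratio = s1**2 / s2**2  # 64/49

lower = ratio / f.ppf(1 - alpha / 2, n1 - 1, n2 - 1)  # divide by F_0.01(24, 15)
upper = ratio * f.ppf(1 - alpha / 2, n2 - 1, n1 - 1)  # multiply by F_0.01(15, 24)
print(lower, upper)            # about (0.397, 3.775)
print(lower**0.5, upper**0.5)  # about (0.630, 1.943)
```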

More Related Content

What's hot

Point Estimate, Confidence Interval, Hypotesis tests
Point Estimate, Confidence Interval, Hypotesis testsPoint Estimate, Confidence Interval, Hypotesis tests
Point Estimate, Confidence Interval, Hypotesis tests
University of Salerno
 

What's hot (20)

Econometrics of High-Dimensional Sparse Models
Econometrics of High-Dimensional Sparse ModelsEconometrics of High-Dimensional Sparse Models
Econometrics of High-Dimensional Sparse Models
 
Point Estimation
Point Estimation Point Estimation
Point Estimation
 
Estimation Theory
Estimation TheoryEstimation Theory
Estimation Theory
 
Probability distribution Function & Decision Trees in machine learning
Probability distribution Function  & Decision Trees in machine learningProbability distribution Function  & Decision Trees in machine learning
Probability distribution Function & Decision Trees in machine learning
 
hypothesis test
 hypothesis test hypothesis test
hypothesis test
 
QNT 275 Exceptional Education - snaptutorial.com
QNT 275   Exceptional Education - snaptutorial.comQNT 275   Exceptional Education - snaptutorial.com
QNT 275 Exceptional Education - snaptutorial.com
 
Resampling methods
Resampling methodsResampling methods
Resampling methods
 
Interval Estimation & Estimation Of Proportion
Interval Estimation & Estimation Of ProportionInterval Estimation & Estimation Of Proportion
Interval Estimation & Estimation Of Proportion
 
QNT 275 Inspiring Innovation / tutorialrank.com
QNT 275 Inspiring Innovation / tutorialrank.comQNT 275 Inspiring Innovation / tutorialrank.com
QNT 275 Inspiring Innovation / tutorialrank.com
 
Hypothesis Testing
Hypothesis TestingHypothesis Testing
Hypothesis Testing
 
Hypo
HypoHypo
Hypo
 
Qnt 275 Enhance teaching / snaptutorial.com
Qnt 275 Enhance teaching / snaptutorial.comQnt 275 Enhance teaching / snaptutorial.com
Qnt 275 Enhance teaching / snaptutorial.com
 
STATISTIC ESTIMATION
STATISTIC ESTIMATIONSTATISTIC ESTIMATION
STATISTIC ESTIMATION
 
6 estimation hypothesis testing t test
6 estimation hypothesis testing t test6 estimation hypothesis testing t test
6 estimation hypothesis testing t test
 
Estimating a Population Mean
Estimating a Population Mean  Estimating a Population Mean
Estimating a Population Mean
 
Evaluating hypothesis
Evaluating  hypothesisEvaluating  hypothesis
Evaluating hypothesis
 
Basic of Hypothesis Testing TEKU QM
Basic of Hypothesis Testing TEKU QMBasic of Hypothesis Testing TEKU QM
Basic of Hypothesis Testing TEKU QM
 
Point Estimation
Point EstimationPoint Estimation
Point Estimation
 
Point Estimate, Confidence Interval, Hypotesis tests
Point Estimate, Confidence Interval, Hypotesis testsPoint Estimate, Confidence Interval, Hypotesis tests
Point Estimate, Confidence Interval, Hypotesis tests
 
Statistics
StatisticsStatistics
Statistics
 

Viewers also liked

Analisis peran perusahaan_penjaminan_kredit_daerah_dalam_meningkatkan
Analisis peran perusahaan_penjaminan_kredit_daerah_dalam_meningkatkanAnalisis peran perusahaan_penjaminan_kredit_daerah_dalam_meningkatkan
Analisis peran perusahaan_penjaminan_kredit_daerah_dalam_meningkatkan
sri wahyuni
 
Pierre fleur isa mas1111
Pierre fleur isa mas1111Pierre fleur isa mas1111
Pierre fleur isa mas1111
Rabolliot
 

Viewers also liked (14)

Exceptionalities ii
Exceptionalities iiExceptionalities ii
Exceptionalities ii
 
Architecture case study - IIM Ahemdabad
Architecture case study - IIM AhemdabadArchitecture case study - IIM Ahemdabad
Architecture case study - IIM Ahemdabad
 
Business solution for book shop
Business solution for book shopBusiness solution for book shop
Business solution for book shop
 
Srrk it limited business proposal
Srrk it limited business proposalSrrk it limited business proposal
Srrk it limited business proposal
 
Analisis peran perusahaan_penjaminan_kredit_daerah_dalam_meningkatkan
Analisis peran perusahaan_penjaminan_kredit_daerah_dalam_meningkatkanAnalisis peran perusahaan_penjaminan_kredit_daerah_dalam_meningkatkan
Analisis peran perusahaan_penjaminan_kredit_daerah_dalam_meningkatkan
 
Archivio138
Archivio138Archivio138
Archivio138
 
электронная почта и другие сервисы компьютерных сетей
электронная почта и другие сервисы компьютерных сетейэлектронная почта и другие сервисы компьютерных сетей
электронная почта и другие сервисы компьютерных сетей
 
тесла
теслатесла
тесла
 
Abacavir Sulfate API PELLETS MANUFACTURER EXPORTER
Abacavir Sulfate API PELLETS MANUFACTURER EXPORTERAbacavir Sulfate API PELLETS MANUFACTURER EXPORTER
Abacavir Sulfate API PELLETS MANUFACTURER EXPORTER
 
Abacavir Sulfate API INTERMEDIATE MANUFACTURER EXPORTER
Abacavir Sulfate API INTERMEDIATE MANUFACTURER EXPORTERAbacavir Sulfate API INTERMEDIATE MANUFACTURER EXPORTER
Abacavir Sulfate API INTERMEDIATE MANUFACTURER EXPORTER
 
Writing Style and Techniques of J. K. Rowling in Harry Potter
Writing Style and Techniques of J. K. Rowling in Harry PotterWriting Style and Techniques of J. K. Rowling in Harry Potter
Writing Style and Techniques of J. K. Rowling in Harry Potter
 
Pierre fleur isa mas1111
Pierre fleur isa mas1111Pierre fleur isa mas1111
Pierre fleur isa mas1111
 
Phb 3000-users-manual
Phb 3000-users-manualPhb 3000-users-manual
Phb 3000-users-manual
 
Webster users-manual
Webster users-manualWebster users-manual
Webster users-manual
 

Similar to 3 es timation-of_parameters[1]

In the last column we discussed the use of pooling to get a be
In the last column we discussed the use of pooling to get a beIn the last column we discussed the use of pooling to get a be
In the last column we discussed the use of pooling to get a be
MalikPinckney86
 
Elementary statistics for Food Indusrty
Elementary statistics for Food IndusrtyElementary statistics for Food Indusrty
Elementary statistics for Food Indusrty
Atcharaporn Khoomtong
 

Similar to 3 es timation-of_parameters[1] (20)

Point estimation.pptx
Point estimation.pptxPoint estimation.pptx
Point estimation.pptx
 
Data science
Data scienceData science
Data science
 
Qt notes by mj
Qt notes by mjQt notes by mj
Qt notes by mj
 
CHAPTER I- Part 1.pptx
CHAPTER I- Part 1.pptxCHAPTER I- Part 1.pptx
CHAPTER I- Part 1.pptx
 
Module-2_Notes-with-Example for data science
Module-2_Notes-with-Example for data scienceModule-2_Notes-with-Example for data science
Module-2_Notes-with-Example for data science
 
Factor Extraction method in factor analysis with example in R studio.pptx
Factor Extraction method in factor analysis with example in R studio.pptxFactor Extraction method in factor analysis with example in R studio.pptx
Factor Extraction method in factor analysis with example in R studio.pptx
 
Statistics
StatisticsStatistics
Statistics
 
statistical estimation
statistical estimationstatistical estimation
statistical estimation
 
Statistics
StatisticsStatistics
Statistics
 
In the last column we discussed the use of pooling to get a be
In the last column we discussed the use of pooling to get a beIn the last column we discussed the use of pooling to get a be
In the last column we discussed the use of pooling to get a be
 
elementary statistic
elementary statisticelementary statistic
elementary statistic
 
A Method for Constructing Non-Isosceles Triangular Fuzzy Numbers Using Freque...
A Method for Constructing Non-Isosceles Triangular Fuzzy Numbers Using Freque...A Method for Constructing Non-Isosceles Triangular Fuzzy Numbers Using Freque...
A Method for Constructing Non-Isosceles Triangular Fuzzy Numbers Using Freque...
 
TEST OF SIGNIFICANCE.pptx
TEST OF SIGNIFICANCE.pptxTEST OF SIGNIFICANCE.pptx
TEST OF SIGNIFICANCE.pptx
 
Point estimation
Point estimationPoint estimation
Point estimation
 
Areas In Statistics
Areas In StatisticsAreas In Statistics
Areas In Statistics
 
Parametric & non parametric
Parametric & non parametricParametric & non parametric
Parametric & non parametric
 
chap4_Parametric_Methods.ppt
chap4_Parametric_Methods.pptchap4_Parametric_Methods.ppt
chap4_Parametric_Methods.ppt
 
Statistics and probability pp
Statistics and  probability ppStatistics and  probability pp
Statistics and probability pp
 
Statistical Significance Tests.pptx
Statistical Significance Tests.pptxStatistical Significance Tests.pptx
Statistical Significance Tests.pptx
 
Elementary statistics for Food Indusrty
Elementary statistics for Food IndusrtyElementary statistics for Food Indusrty
Elementary statistics for Food Indusrty
 

Recently uploaded

Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
ciinovamais
 
Making and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdfMaking and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdf
Chris Hunter
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
heathfieldcps1
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
QucHHunhnh
 

Recently uploaded (20)

Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
PROCESS RECORDING FORMAT.docx
PROCESS      RECORDING        FORMAT.docxPROCESS      RECORDING        FORMAT.docx
PROCESS RECORDING FORMAT.docx
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 
Making and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdfMaking and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdf
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docxPython Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
Energy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural Resources
Energy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural ResourcesEnergy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural Resources
Energy Resources. ( B. Pharmacy, 1st Year, Sem-II) Natural Resources
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 

3 es timation-of_parameters[1]

  • 1. ESTIMATION OF PARAMETERS Rajender Parsad I.A.S.R.I., Library Avenue, New Delhi-110 012, India 1. Introduction Statistics is a science which deals with collection, presentation, analysis and interpretation of results. The procedures involved with collection, presentation, analysis and interpretation of results can be classified into two broad categories viz.: 1. Descriptive Statistics: It deals with the methods of collecting, presenting and describing a set of data so as to yield meaningful information. 2. Statistical Inference: It comprises of those methods which are concerned with the analysis of a subset of data leading to predictions or inferences about the entire set of data. It is also known as the art of evaluating information to draw reliable inferences about the true value of the phenomenon under study. The main purpose of statistical inference is 1. to estimate the parameters(the quantities that represent a particular characteristic of a population) on the basis of sample observations through a statistic ( a function of sample values which does not involve any parameter). (Theory of Estimation). 2. to compare these parameters among themselves on the basis of observations and their estimates ( testing of hypothesis) To distinguish clearly between theory of estimation and testing of hypothesis, let us consider the following examples: Example 1.1: A storage bin contains 1,00,000 seeds. These seeds after germination will produce either red or white coloured flowers. We want to know the percentage of seeds that will produce red coloured flowers. This is the problem of estimation. Example 1.2: It is known that the chir pine trees can yield an average of 4 kg of resin per blaze per season. On some trees the healed up channels were treated with a chemical. We want to know whether the chemical treatment of healed up channels enhances the yields of resin. This is the problem of testing of hypothesis. In many cases like above, it may not be possible to determine and test about the value of a population parameter by analysing the entire set of population values. The process of determining the value of the parameter may destroy the population units or it may simply be too expensive in money and /or time to analyse the units. Therefore, for making statistical inferences, the experimenter has to have one or more samples of observations from one or more variables. These observations are required to satisfy certain assumptions viz. the observations should belong to a population having some specified probability distribution and that they are independent. For example in case of example 1,
  • 2. ESTIMATION OF PARAMETERS II-48 it will not be natural to compute the percentage of seeds after germination of all the seeds, as we may not be willing to use all the seeds at one time or there may be lack of resources for maintaining the whole bulk at a time. In example 2, it will not be possible to apply the chemical treatment to the healed up channels of all the chir pine trees and hence we have to test our hypothesis on the basis of a sample of chir pine trees. More specifically, the statistical inference is the process of selecting and using a sample statistic (a function of sample observations) to draw inferences about population parameters(s){a function of population values}.. Before proceeding further, it will not be out of place to describe the meaning of parameter and statistic. Parameter: Any value describing a characteristic of the population is called a parameter. For example, consider the following set of data representing the number of errors made by a secretary on 10 different pages of a document: 1, 0, 1, 2, 3, 1, 1, 4, 0 and 2. Let us assume that the document contains exactly 10 pages so that the data constitute of a small finite population. A quick study of this population leads to a number of conclusions. For instance, we could make the statement that the largest number of typing errors on any single page was 4, or we might say that the arithmetic mean of the 10 numbers is 1.5. The 4 and 1.5 are descriptive properties of the population and are called as parameters. Customarily, the parameters are represented by Greek letters. Therefore, population mean of the typing errors is µ = 1.5. It may be noted hat the parameter is a constant value describing the population. Statistic: Any numerical value describing a characteristics of a sample is called a statistic. For example, let us suppose that the data representing the number of typing errors constitute a sample obtained by counting the number of errors on 10 pages randomly selected from a large manuscript. Clearly, the population is now a much larger set of data about which we only have partial information provided by the sample. The numbers 4 and 1.5 are now descriptive measures of sample and are called statistic. A statistic is usually represented by ordinary letters of the English alphabet. If the statistics happens to be the sample mean, we denote it by x . For our random sample of typing errors we have x =1.5. Since many random samples are possible from the same population, we would expect the statistic to vary from sample to sample. Now coming back to the problem of statistical inference: Let nxxx ,,, 21 K be a random sample from a population which is distributed in a form which is completely known except that it contains some unknown parameters and probability density function (pdf) or probability mass function (pmf) of the population is given by f(X,θ). Therefore, in this situation the distribution is not known completely until we know the values of the unknown parameters. For simplicity let us take the case of single unknown parameter. The unknown parameter θ have some admissible values which lie on a real line in case of a single parameter, in a plane for two parameters, three dimensional plane in case of three parameters and so on. The set of all possible values of the parameter(s) θ is called the parameteric space and is denoted by Θ. If Θ is the parameter space, then the set of all {f(X,θ) ; θ∈Θ } is called the family of pdf’s of X if X is continuos and the family of pmf’s of X if X is discrete. 
To be clearer, let us consider the following examples.
Example 1.3: Let X ∼ B(n, p) with p unknown. Then Θ = {p : 0 < p < 1}, and {B(n, p) : 0 < p < 1} is the family of pmf's of X.

Example 1.4: Let X ∼ N(µ, σ²). If both µ and σ² are unknown, then Θ = {(µ, σ²) : −∞ < µ < ∞, σ² > 0}; if µ = µ₀, say, and σ² is unknown, then Θ = {(µ₀, σ²) : σ² > 0}.

On the basis of a random sample $x_1, x_2, \ldots, x_n$ from a population, our aim is to estimate the unknown parameter θ. Henceforth, we shall discuss only the theory of estimation. The estimation of unknown population parameter(s) through sample values can be done in two ways:
1. Point Estimation
2. Interval Estimation

In the first case we are required to determine a single number which can be taken as the value of θ, whereas in the second we are required to determine an interval (a, b) in which the unknown parameter θ is expected to lie. For example, if the population is normal, then a possible point estimate of the population mean is the sample mean, and a possible interval estimate of the mean is $(\bar{x} - 3s, \bar{x} + 3s)$, where $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ is the sample mean and $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$ is the sample variance.

Estimator: An estimator is a function of sample observations whose value at a given realization of the observations gives the estimate of the population parameter. An estimate, on the other hand, is the numerical value of the estimator for a given sample. Thus, an estimator is a random variable calculated from the sample data that supplies either interval estimates or point estimates for population parameters. It is essential to distinguish between an estimator and an estimate. The distinction is the same as that between a function f, regarded as defined for a range of a variable X, and the particular value f(a) which the function assumes for a specified value X = a. For instance, if the sample mean x̄ is used to estimate a population mean µ and the sample mean is 15, the estimator used is the sample mean, whereas the estimate is 15. Thus the statistic which is used to estimate a parameter is an estimator, whereas the numerical value of the estimator is called an estimate.
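To make the distinction concrete, here is a minimal Python sketch (not part of the original notes) that treats the typing-error counts above as a sample: the estimator is a rule, and applying it to the observed data yields the estimate. It also forms the rough interval estimate (x̄ − 3s, x̄ + 3s) described above.

    # A minimal sketch: estimator vs estimate, and point vs interval estimates.
    from math import sqrt

    def mean_estimator(sample):               # the estimator: a rule (function)
        return sum(sample) / len(sample)

    data = [1, 0, 1, 2, 3, 1, 1, 4, 0, 2]     # the typing-error sample above
    xbar = mean_estimator(data)               # the estimate: a number (1.5)
    s = sqrt(sum((x - xbar) ** 2 for x in data) / (len(data) - 1))
    print(xbar)                               # point estimate of the mean
    print((xbar - 3 * s, xbar + 3 * s))       # the rough interval estimate above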
2. Point Estimation
A point estimator is a random variable calculated from the sample data that supplies point estimates for population parameters.

Let $x_1, x_2, \ldots, x_n$ be a random sample from a population with pdf or pmf f(X, θ), θ ∈ Θ, where θ is unknown. We want to estimate θ, or some function τ(θ). Then $t_n = f(x_1, x_2, \ldots, x_n)$ is said to be a point estimator of θ or τ(θ) if $t_n$ is close to θ or τ(θ). In general there can be several alternative procedures for obtaining a point estimate of a population parameter. For instance, we may compute the arithmetic mean, the median or the geometric mean to estimate the population mean. As we never know the true value of the parameter, it does not make sense to ask whether an estimate is exactly correct. If there is more than one estimator, the question therefore arises which among them is better. This means that we must stipulate some criteria which can be applied to decide whether one estimator is better than another: although an estimator is not expected to estimate the population parameter without error, we do not expect it to be very far off either. The following are the criteria which should be satisfied by a good estimator:
1. Unbiasedness
2. Consistency
3. Efficiency
4. Sufficiency

Unbiasedness: An estimator T is said to be unbiased if the expected value of the estimator is equal to the population parameter θ being estimated. In other words, if the same estimator is used repeatedly for all possible samples and we average these values, we would expect the average to equal the true value of the parameter. For instance, if the sample mean x̄ is an unbiased estimator for the population mean µ, then the expected value of the sample mean equals the population mean; in symbols, E(x̄) = µ. Similarly, the sample variance $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$ is an unbiased estimator of the population variance σ², because E(s²) = σ².

Steps to check whether a given estimator is unbiased or not (see the sketch that follows this list):
1. Draw all possible samples of a given size from the population.
2. Calculate the value of the given estimator for all these samples separately.
3. Take the average of all the values obtained in Step 2. If this average is equal to the population parameter, the estimator is unbiased; if it is more than the population parameter, the estimator is said to be positively biased; and if it is less than the population parameter, it is said to be negatively biased.
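The three steps above can be carried out exactly for a small finite population by enumeration. The following sketch assumes a toy population {2, 4, 6} (hypothetical, chosen only for illustration) and samples of size 2 drawn with replacement.

    # A minimal sketch of the three steps above.
    from itertools import product

    population = [2, 4, 6]
    n = 2
    mu = sum(population) / len(population)            # population mean = 4
    samples = list(product(population, repeat=n))     # Step 1: all 9 samples

    # Steps 2 and 3 for the sample mean: the average of the estimates
    # equals the parameter, so the sample mean is unbiased.
    means = [sum(s) / n for s in samples]
    print(sum(means) / len(means), "vs", mu)          # 4.0 vs 4.0

    # The same steps show the divisor-n variance is negatively biased,
    # while the divisor-(n-1) version is unbiased.
    def var(s, divisor):
        m = sum(s) / len(s)
        return sum((x - m) ** 2 for x in s) / divisor

    sigma2 = var(population, len(population))                  # 8/3
    print(sum(var(s, n) for s in samples) / len(samples))      # 4/3 < 8/3
    print(sum(var(s, n - 1) for s in samples) / len(samples))  # 8/3, unbiased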
Consistency: An estimator T is said to be a consistent estimator of a parameter θ if, as the sample size n is increased, T converges to θ in probability. This too is an intuitively appealing characteristic for an estimator to possess, for it says that as the sample size increases (which should mean, in most reasonable circumstances, that more information becomes available), the estimate becomes "better" in the sense indicated.

Steps to check whether a given estimator is consistent or not:
1. Show that T is an unbiased estimator of θ.
2. Obtain the variance of T, i.e., the variance of all the values of T obtained from all possible samples of a particular size.
3. Increase the sample size and repeat the above two steps.
4. If the variance of T decreases as the sample size n increases, and approaches zero as n becomes infinitely large, then T is a consistent estimator of θ.

Efficiency: It is sometimes possible to find more than one estimator which is unbiased and consistent for a population parameter. For instance, in the case of a normal population N(µ, σ²), the sample mean x̄ and the sample median $x_{md}$ are both unbiased and consistent estimators of the population mean µ. However, it can be shown that $Var(x_{md}) \approx \frac{\pi\sigma^2}{2n} > \frac{\sigma^2}{n} = Var(\bar{x})$, since π/2 > 1. The variance of a random variable measures the variability of the random variable about its expected value, so it is intuitively appealing that an unbiased estimator with smaller variance is preferable to an unbiased estimator with larger variance. Therefore, in the above example, the sample mean is preferable to the sample median (a simulation sketch following the BLUE definition below illustrates this). Thus there is a need for some further criterion which will enable us to choose the best estimator. Such a criterion, based on the concept of variance, is known as efficiency.

Minimum Variance Unbiased Estimator (MVUE): An estimator T is said to be an MVUE for a population parameter θ if T is unbiased and has the smallest variance among all unbiased estimators of θ. It is also called the most efficient estimator of θ. The ratio of the variance of an MVUE to the variance of a given unbiased estimator is termed the efficiency of the given unbiased estimator. There exist some general techniques, viz. the Cramer-Rao inequality, the Rao-Blackwell theorem and the Lehmann-Scheffe theorem, for finding minimum variance unbiased estimators.

Best Linear Unbiased Estimator: An estimator T is said to be the Best Linear Unbiased Estimator (BLUE) of θ if
1. T is unbiased for θ,
2. T is a linear function of the sample observations, and
3. T has the minimum variance among all unbiased estimators of θ which are linear functions of the sample observations.
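Because the mean-versus-median comparison above is an asymptotic statement, a quick Monte Carlo check is instructive. The sketch below (illustrative only; the seed, sample size and number of replicates are arbitrary) draws repeated N(0, 1) samples and compares the spread of the two estimators; the variance ratio should come out roughly π/2 ≈ 1.57.

    # A simulation sketch comparing the sample mean and sample median.
    import random
    import statistics

    random.seed(1)
    n, reps = 25, 20000
    means, medians = [], []
    for _ in range(reps):
        s = [random.gauss(0, 1) for _ in range(n)]
        means.append(statistics.fmean(s))
        medians.append(statistics.median(s))

    v_mean = statistics.pvariance(means)
    v_med = statistics.pvariance(medians)
    print(v_mean, v_med, v_med / v_mean)   # ratio roughly pi/2 (asymptotically)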
Example 2.1: It is claimed that a particular chemical, when applied to some ornamental plants, will rapidly increase the height of the plant in a period of one week. The increases in height of the 5 plants to which this chemical was applied are given below:

Plant:                    1  2  3  4  5
Increase in height (cm):  5  7  8  9  6

Assuming that the distribution of the increases in height is normal, draw all possible samples of size 3 with replacement and show that the sample mean x̄ and the sample median $x_{md}$ are unbiased estimators of the population mean.

Solution:
Step 1: Obtain the population mean. Population mean = (5 + 7 + 8 + 9 + 6)/5 = 35/5 = 7 cm.

Step 2: Draw all possible samples of size three with replacement and obtain their sample means and sample medians. The resulting frequency distributions are:

Sample mean:  15/3  16/3  17/3  18/3  19/3  20/3  21/3  22/3  23/3  24/3  25/3  26/3  27/3
Frequency:       1     3     6    10    15    18    19    18    15    10     6     3     1

Sample median:   5    6    7    8    9
Frequency:      13   31   37   31   13

Step 3: Obtain the mean of the sample means. Mean of the sample means = 7 cm; variance of the sample means = 0.667 cm².

Step 4: Obtain the mean of the sample medians. Mean of the sample medians = 7 cm; variance of the sample medians = 1.328 cm².

Since both averages equal the population mean of 7 cm, both the sample mean and the sample median are unbiased estimators of the population mean in the case of a normal population, with the sample mean having the smaller variance.
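The enumeration in this example is easy to reproduce. The following sketch (not part of the original solution) generates all 5³ = 125 equally likely samples and recovers the means and variances reported in Steps 3 and 4.

    # A sketch reproducing the enumeration in Example 2.1.
    from itertools import product
    from statistics import fmean, median, pvariance

    heights = [5, 7, 8, 9, 6]
    samples = list(product(heights, repeat=3))   # 125 equally likely samples
    means = [fmean(s) for s in samples]
    medians = [median(s) for s in samples]

    print(fmean(means), pvariance(means))        # 7.0 and ~0.667 cm^2
    print(fmean(medians), pvariance(medians))    # 7.0 and ~1.328 cm^2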
Sufficiency: An estimator T is said to be sufficient for the parameter θ if it contains all the information in the sample regarding the parameter. This criterion has a practical importance in the sense that after the data are collected, either from a sample survey or from a designed experiment, the job of the statistician is to draw some statistically valid conclusions about the population under investigation. The raw data by themselves, besides being costly to store, are not suitable for this purpose. Therefore, the statistician would like to condense the data by computing some statistic from them and to base the analysis on this statistic, provided that there is no loss of information in doing so. In many problems of statistical inference a function of the observations contains as much information about the unknown parameter as do all the observed values. To make this clearer, let us consider the following example.

Example 2.2: Suppose you wish to play a coin-tossing game against an adversary who supplies the coin. If the coin is fair and you win a dollar if you predict the outcome of a toss correctly and lose a dollar otherwise, then your net expected gain is zero. Since your adversary supplies the coin, you may want to check whether the coin is fair before you start playing the game, i.e., to test H₀: p = 0.5 against H₁: p ≠ 0.5. You toss the coin n times; should you record the outcome of each trial, or is it enough to know the total number of heads in the n tosses to test H₀? Intuitively it seems clear that the number of heads in n trials contains all the information about the unknown parameter p, and precisely this is the information which we have used so far in problems of inference concerning p. Writing xᵢ = 1 if the i-th toss results in a head and zero otherwise, and setting $T = T(x_1, \ldots, x_n) = \sum_{i=1}^{n} x_i$, we note that T is the number of heads in n trials. Clearly, there is a substantial reduction in data collection and storage if we record the value T = t rather than the observation vector (x₁, ..., xₙ), because t can take only n + 1 values whereas the vector can take 2ⁿ values. Therefore, whatever decision we make about H₀ should depend on the value of t.

It can easily be seen that the trivial statistic T(x₁, ..., xₙ) = (x₁, ..., xₙ) is always sufficient but provides no reduction of the data; hence it is not preferable, as our aim is to condense the data while retaining all the information about the parameter contained in the sample. A sufficient statistic which reduces the data most is called a minimal sufficient statistic. One way to check whether a given statistic is sufficient is to verify that the conditional distribution of x₁, ..., xₙ given T (the proposed sufficient statistic) does not depend on the population parameter; a small numerical check of this for the coin-tossing example is sketched below.
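Here is the promised numerical check, a sketch (illustrative only) for the coin-tossing setting with n = 3 Bernoulli(p) tosses: the conditional probability of each sequence given T = t is computed for two different values of p and found to be identical, i.e., free of p.

    # Sketch: P(x1, x2, x3 | T = t) does not depend on p, so T is sufficient.
    from itertools import product

    def cond_dist(p, n=3):
        seqs = list(product([0, 1], repeat=n))
        prob = {s: p ** sum(s) * (1 - p) ** (n - sum(s)) for s in seqs}
        cond = {}
        for s in seqs:
            p_t = sum(q for r, q in prob.items() if sum(r) == sum(s))
            cond[s] = prob[s] / p_t          # equals 1 / C(n, t)
        return cond

    a, b = cond_dist(0.3), cond_dist(0.8)
    print(all(abs(a[s] - b[s]) < 1e-12 for s in a))   # True: free of p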
Until now, we have discussed several properties of good estimators, viz. unbiasedness, consistency, efficiency and sufficiency, that seem desirable in the context of point estimation, and we would like to check whether a proposed estimator satisfies all or some of these criteria. However, if we are faced with a point estimation problem, the question arises where we can start to look for an estimator. It would therefore be convenient to have one (or several) intuitively reasonable methods of generating possibly good estimators. The principal methods of obtaining point estimators are:
1. Method of moments
2. Method of minimum chi-square
3. Method of least squares
4. Method of maximum likelihood

The application of these methods in particular cases leads to estimators which may differ and hence possess different attributes of goodness. The most important method of point estimation is the method of maximum likelihood, which provides estimators with desirable properties.

Method of Maximum Likelihood: To introduce the method of maximum likelihood, consider a very simple estimation problem. Suppose that an urn contains a number of black and a number of white balls, and suppose it is known that the ratio of the numbers is 3:1, but not whether the black or the white balls are more numerous; i.e., the probability of drawing a black ball is either 1/4 or 3/4. If n balls are drawn with replacement from the urn, the distribution of X, the number of black balls, is binomial with probability mass function

$f(X; p) = \binom{n}{X} p^X q^{n-X}$, for X = 0, 1, ..., n,

where q = 1 − p and p is the probability of drawing a black ball; here p = 1/4 or 3/4. We shall draw a sample of three balls, i.e., n = 3, with replacement and attempt to estimate the unknown parameter p of the distribution. The estimation problem is particularly simple in this case because we have only to choose between the two numbers 1/4 and 3/4. The possible outcomes of the sample and their probabilities are given below:

Outcome X:      0      1      2      3
f(X; 3/4):   1/64   9/64  27/64  27/64
f(X; 1/4):  27/64  27/64   9/64   1/64

In the present example, if we found that X = 0, the estimate 1/4 for p would be preferred over 3/4 because the probability 27/64 is greater than 1/64, i.e., because a sample with X = 0 is more likely (in the sense of having the larger probability) to arise from a population with p = 1/4 than from one with p = 3/4. In general, we estimate p by 1/4 when X = 0 or 1, and by 3/4 when X = 2 or 3. The estimator may thus be defined as p̂ = p̂(X) = 1/4 for X = 0, 1 and p̂(X) = 3/4 for X = 2, 3. The estimator thus selects, for every possible value of X, the value of p, say p̂, such that $f(X; \hat{p}) > f(X; p')$, where p′ is any other value of p, 0 < p < 1.
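The rule just derived can be expressed directly as a computation. The sketch below (illustrative, not part of the original notes) evaluates the binomial likelihood at the two admissible values of p and picks the maximizer for each possible X.

    # A sketch of the maximum likelihood choice in the urn example.
    from math import comb

    def likelihood(x, p, n=3):
        return comb(n, x) * p ** x * (1 - p) ** (n - x)

    for x in range(4):
        p_hat = max((0.25, 0.75), key=lambda p: likelihood(x, p))
        print(x, p_hat)   # 0 -> 0.25, 1 -> 0.25, 2 -> 0.75, 3 -> 0.75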
Let us consider another experimental situation. A lion has turned man-eater. The lion has three possible states of activity each night: "very active" (denoted by θ₁), "moderately active" (denoted by θ₂) and "lethargic" (denoted by θ₃). The lion eats i people with probability P(i|θ), θ ∈ Θ = {θ₁, θ₂, θ₃}. The numerical values are given in the table below:

Lion's Appetite Distribution
i:         0    1    2    3    4
P(i|θ₁): .00  .05  .05  .80  .10
P(i|θ₂): .05  .05  .80  .10  .00
P(i|θ₃): .90  .08  .02  .00  .00

Suppose we are told that X = x₀ people were eaten last night and are asked to estimate the lion's activity state θ₁, θ₂ or θ₃. One seemingly reasonable method is to estimate θ as that θ ∈ Θ which provides the largest probability of observing what we did observe. It can easily be seen that θ̂(0) = θ̂(1) = θ₃, θ̂(2) = θ₂, and θ̂(3) = θ̂(4) = θ₁.

Thus the maximum likelihood estimator θ̂ of a population parameter is that value of θ which maximizes the likelihood function, i.e., the joint pdf/pmf of the sample observations regarded as a function of θ.

MLE for the population mean: For a normal population, the MLE of the population mean µ, based on a random sample of size n, is the sample mean x̄, and if the variance of the population units Xᵢ is σ², then the variance of x̄ is σ²/n. It can easily be seen that x̄ is an unbiased, consistent, sufficient and efficient estimator of µ.

MLE for a proportion: The MLE of the proportion p in a binomial experiment is p̂ = x/n, where x represents the number of successes in n trials. The variance of p̂ is p(1 − p)/n, and since E(p̂) = p, the sample proportion p̂ is an unbiased, consistent and sufficient estimator of p.

MLE for the population variance: In the case of large samples from any population, or of small samples from a normal population, the MLE of the population variance σ² when the population mean is unknown is $S^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$, where x₁, ..., xₙ are the sample observations and x̄ is the sample mean. Here $E(S^2) = \left(1 - \frac{1}{n}\right)\sigma^2$ and $Var(S^2) = \frac{2(n-1)}{n^2}\sigma^4$. It can easily be seen that as n → ∞, S² is a consistent estimator of the population variance; it can also be proved that it is an asymptotically unbiased and asymptotically efficient estimator of the population variance. However, an exactly unbiased estimator of the population variance is $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$. It can therefore be inferred that MLEs are not in general unbiased.
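A short simulation (assuming a normal population with σ² = 4; the seed, sample size and number of replicates are arbitrary) illustrates the bias factor (1 − 1/n) of S² and the unbiasedness of s².

    # Sketch: E(S^2) = (1 - 1/n) * sigma^2, while E(s^2) = sigma^2.
    import random

    random.seed(2)
    n, reps, sigma2 = 5, 40000, 4.0
    tot_S2 = tot_s2 = 0.0
    for _ in range(reps):
        x = [random.gauss(0, sigma2 ** 0.5) for _ in range(n)]
        xbar = sum(x) / n
        ss = sum((v - xbar) ** 2 for v in x)
        tot_S2 += ss / n
        tot_s2 += ss / (n - 1)
    print(tot_S2 / reps)   # near (1 - 1/5) * 4 = 3.2
    print(tot_s2 / reps)   # near 4.0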
However, quite often the bias may be removed by multiplying by an appropriate constant; in the above case, multiplying S² by n/(n − 1) gives s², an unbiased estimator of σ².

Point estimators, however, are not entirely satisfactory estimators of population parameters, in the sense that even an MVUE is unlikely to estimate the population parameter exactly. It is true that our accuracy increases with large samples, but there is still no reason to expect a point estimate from a given sample to be exactly equal to the population parameter it is supposed to estimate. Point estimators fail to throw light on how close we can expect the estimator to be to the population parameter, so we cannot associate a probability statement with a point estimate. It would therefore be desirable to determine an interval within which we expect to find the value of the parameter, with some probability statement associated with it. This is done through interval estimation.

3. Interval Estimation
An interval estimator is a formula that tells us how to use sample data to calculate an interval that estimates a population parameter. Let x₁, x₂, ..., xₙ be a sample from a population with pdf or pmf f(x, θ), θ ∈ Θ. Our aim is to find two estimators T₁ = T₁(x₁, ..., xₙ) and T₂ = T₂(x₁, ..., xₙ) such that P{T₁ ≤ θ ≤ T₂} = 1 − α. The interval (T₁, T₂) is then called a 100(1 − α)% confidence interval (CI), with 100(1 − α)% as the confidence coefficient. The confidence coefficient is the probability that the interval estimator encloses the population parameter if the estimator is used repeatedly a large number of times. T₁ and T₂ are the lower and upper bounds of the CI; for a particular application we substitute the appropriate numerical values for the confidence coefficient and the lower and upper bounds.

The above statement reflects our confidence in the process rather than in any particular interval formed: we know that 100(1 − α)% of the resulting intervals will contain the population parameter, but there is usually no way to determine whether a particular interval is one of those which contain the population parameter or one that does not (the simulation sketch below illustrates this). However, unlike point estimators, confidence intervals have a measure of reliability, the confidence coefficient, associated with them, and for that reason they are preferred to point estimators. Thus, with α = 0.05 we have a 95% confidence interval, and with α = 0.01 we obtain a wider 99% confidence interval. The wider the confidence interval, the more confident we can be that the interval contains the unknown parameter. Of course, it is better to be 95% confident that the average life of a machine is between 12 and 15 years than to be 99% confident that it is between 8 and 18 years. Ideally, we prefer a short interval with a high degree of confidence. Sometimes restrictions on the size of our sample prevent us from achieving short intervals without sacrificing some of our degree of confidence.
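The "confidence in the process" interpretation can be checked by simulation. The sketch below (assuming a N(10, 2²) population with σ known; all numbers are illustrative) constructs the interval x̄ ± 1.96 σ/√n repeatedly and counts how often it covers the true mean.

    # Sketch: about 95% of repeatedly constructed intervals cover mu.
    import random
    from math import sqrt

    random.seed(3)
    mu, sigma, n, reps = 10.0, 2.0, 30, 10000
    hits = 0
    for _ in range(reps):
        x = [random.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(x) / n
        half = 1.96 * sigma / sqrt(n)
        hits += (xbar - half <= mu <= xbar + half)
    print(hits / reps)   # close to 0.95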
Confidence Interval for the Population Mean
Suppose a sample has been selected from a normal population or, failing this, that n is sufficiently large. Let the population mean be µ and the population variance σ².

Confidence Interval for µ, σ known: If x̄ is the mean of a random sample of size n from a population with known variance σ², a 100(1 − α)% confidence interval for µ is

$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$,

where $z_{\alpha/2}$ is the z-value with an area of α/2 to its right. The 100(1 − α)% confidence interval provides an estimate of the accuracy of our point estimate: if x̄ is used as an estimate of µ, we can be 100(1 − α)% confident that the error will not exceed $z_{\alpha/2}\,\sigma/\sqrt{n}$.

Frequently we wish to know how large a sample is necessary to ensure that the error in estimating the population mean µ will not exceed a specified amount e. By the above, we must choose n such that $e = z_{\alpha/2}\,\sigma/\sqrt{n}$.

Sample size for estimating µ: If x̄ is used as an estimate of µ, we can be 100(1 − α)% confident that the error will not exceed e (equivalently, that the width of the interval will not exceed W = 2e) when the sample size is

$n = \left(\frac{z_{\alpha/2}\,\sigma}{e}\right)^2$  or  $n = 4\left(\frac{z_{\alpha/2}\,\sigma}{W}\right)^2$.

When solving for the sample size n, all fractional values are rounded up to the next whole number. When the value of σ is unknown and the sample size is large, σ may be replaced by the sample standard deviation S, where $S^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$, and the above formulae can be used.
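As a small worked sketch (the values x̄ = 52.3, σ = 6 and n = 40 are hypothetical), the interval and the sample-size formula above can be computed as follows.

    # Sketch: z-interval for mu with sigma known, and required sample size.
    from math import sqrt, ceil

    z = 1.96                              # z_{alpha/2} for a 95% interval
    xbar, sigma, n = 52.3, 6.0, 40
    half = z * sigma / sqrt(n)
    print((xbar - half, xbar + half))     # 95% CI for mu

    e = 0.5                               # desired error bound
    print(ceil((z * sigma / e) ** 2))     # sample size, rounded up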
Example 3.1: Unoccupied seats on flights cause airlines to lose revenue. Suppose a large airline wants to estimate its average number of unoccupied seats per flight over the past year. To accomplish this, the records of 225 flights are randomly selected and the number of unoccupied seats is noted for each of the sampled flights. The sample mean and standard deviation are

x̄ = 11.6 seats, S = 4.1 seats.

Estimate µ, the mean number of unoccupied seats per flight during the past year, using a 90% confidence interval.

Solution: For a 90% confidence interval, α = 0.10. The general form of a large-sample 90% confidence interval for a population mean is

$\bar{x} \pm z_{\alpha/2}\frac{S}{\sqrt{n}} = 11.6 \pm 1.645 \times \frac{4.1}{\sqrt{225}} = 11.6 \pm 0.45$, i.e., (11.15, 12.05).

That is, the airline can be 90% confident that the mean number of unoccupied seats per flight was between 11.15 and 12.05 during the sampled year. In this example, we are 90% confident that the sample mean x̄ differs from the true mean by no more than 0.45.

If, in the above example, we want to know the sample size required so that our estimate of µ is not off by more than 0.05 seats, we solve 0.05 = 1.645 × 4.1/√n, which gives n = (1.645 × 4.1/0.05)² = 18195.31, i.e., n = 18196. However, if we can tolerate an error margin of 1 seat, then a sample size of n = (1.645 × 4.1/1)² = 45.49 ≈ 46 is enough.

Exercise 1: The mean and standard deviation of the quality grade-point averages of a random sample of 36 college seniors are calculated to be 2.6 and 0.3, respectively. Obtain 95% and 99% confidence intervals for the mean of the entire senior class. (For 95%, $z_{\alpha/2}$ = 1.96; for 99%, $z_{\alpha/2}$ = 2.575.)

Small-sample Confidence Interval for µ, σ unknown: If x̄ and s are the mean and standard deviation of a random sample of size n < 30 from an approximately normal population with unknown variance σ², a 100(1 − α)% confidence interval for µ is

$\bar{x} - t_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2}\frac{s}{\sqrt{n}}$,

where $t_{\alpha/2}$ is the t-value with n − 1 degrees of freedom leaving an area of α/2 to the right.
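A sketch of this t-interval, assuming SciPy is available for the t quantile (the data values are hypothetical):

    # Sketch: small-sample t-interval for mu.
    from math import sqrt
    from statistics import fmean, stdev
    from scipy.stats import t

    data = [12.1, 11.4, 13.0, 12.6, 11.9, 12.3, 12.8, 11.7]
    n = len(data)
    xbar, s = fmean(data), stdev(data)        # stdev uses the n-1 divisor
    tval = t.ppf(1 - 0.05 / 2, df=n - 1)      # t_{alpha/2} for a 95% interval
    half = tval * s / sqrt(n)
    print((xbar - half, xbar + half))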
Estimating the difference between two population means
Confidence Interval for µ₁ − µ₂, σ₁² and σ₂² known: If x̄₁ and x̄₂ are the means of independent random samples of sizes n₁ and n₂ from populations with known variances σ₁² and σ₂² respectively, a 100(1 − α)% confidence interval for µ₁ − µ₂ is given by

$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$.

The above CI for estimating the difference between two means is applicable if σ₁² and σ₂² are known or can be estimated from large samples. If the sample sizes n₁ and n₂ are small (< 30) and σ₁² and σ₂² are unknown, the above interval will not be reliable.

Small-sample Confidence Interval for µ₁ − µ₂; σ₁² = σ₂² = σ² unknown: If x̄₁ and x̄₂ are the means of small independent random samples of sizes n₁ and n₂ respectively, from approximately normal populations with unknown but equal variances, a 100(1 − α)% CI for µ₁ − µ₂ is given by

$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2}\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$,

where $s_p$ is the pooled estimate of the population standard deviation, with

$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$,

and $t_{\alpha/2}$ is the t-value with n₁ + n₂ − 2 degrees of freedom leaving an area of α/2 to the right (a sketch of this pooled interval appears after the paired case below).

Small-sample Confidence Interval for µ₁ − µ₂; σ₁² ≠ σ₂² unknown: If x̄₁, s₁² and x̄₂, s₂² are the means and variances of small independent samples of sizes n₁ and n₂ respectively, from approximately normal distributions with unknown and unequal variances, an approximate 100(1 − α)% confidence interval for µ₁ − µ₂ is given by

$(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$,

where $t_{\alpha/2}$ is the t-value with

$\nu = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$

degrees of freedom, leaving an area of α/2 to the right.

Confidence Interval for µ_D = µ₁ − µ₂ for paired observations: If d̄ and $s_d$ are the mean and standard deviation of the differences of n random pairs of measurements, a 100(1 − α)% confidence interval for µ_D = µ₁ − µ₂ is

$\bar{d} - t_{\alpha/2}\frac{s_d}{\sqrt{n}} < \mu_D < \bar{d} + t_{\alpha/2}\frac{s_d}{\sqrt{n}}$,

where $t_{\alpha/2}$ is the t-value with n − 1 degrees of freedom, leaving an area of α/2 to the right.
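Here is the pooled-variance sketch promised above (SciPy assumed for the t quantile; the two samples are hypothetical):

    # Sketch: pooled-variance interval for mu1 - mu2.
    from math import sqrt
    from statistics import fmean, variance
    from scipy.stats import t

    a = [5.2, 6.1, 5.8, 6.4, 5.5]
    b = [4.6, 5.0, 5.3, 4.8, 5.1, 4.9]
    n1, n2 = len(a), len(b)
    sp2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    tval = t.ppf(1 - 0.05 / 2, df=n1 + n2 - 2)
    half = tval * sqrt(sp2) * sqrt(1 / n1 + 1 / n2)
    diff = fmean(a) - fmean(b)
    print((diff - half, diff + half))     # 95% CI for mu1 - mu2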
Example 3.2: A random sample of size 30 was taken from an apple orchard. The distribution of the weights of the apples is given below:

Weight (gm): 125  150  175  200  225  250  275  300  325  350
Frequency:     1    4    3    5    4    7    4    1    1    0

Construct a 95% confidence interval for the population mean, i.e., the average weight of the apples, if (i) the population variance is given to be 46.875 gm², and (ii) the population variance is unknown.

Solution:
(i) Step 1: Obtain the sample mean, $\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = 220.833$.
Step 2: As α = 0.05, $z_{\alpha/2} = z_{0.025} = 1.96$.
Step 3: Obtain the interval $\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = (218.38, 223.28)$.

(ii) Step 1: Obtain the sample variance, $s^2 = \frac{1}{n-1}\sum_i f_i (x_i - \bar{x})^2 = 2503.592$.
Step 2: Look up $t_{29}(0.025) = 2.045$.
Step 3: Obtain the confidence interval $\left(\bar{x} - t_{n-1,\alpha/2}\frac{s}{\sqrt{n}},\ \bar{x} + t_{n-1,\alpha/2}\frac{s}{\sqrt{n}}\right) = (202.152, 239.512)$.

Large-sample Confidence Interval for p: If p̂ is the proportion of successes in a random sample of size n, and q̂ = 1 − p̂, an approximate 100(1 − α)% confidence interval for the binomial parameter p is given by

$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$,

where $z_{\alpha/2}$ is the z-value leaving an area of α/2 to the right. The method for finding a confidence interval for the binomial parameter p is also applicable when the binomial distribution is being used to approximate the hypergeometric distribution, that is, when n is small relative to the population size N.

Error in Estimating p: If p̂ is used as an estimate of p, then we can be 100(1 − α)% confident that the error will not exceed $z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$.
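A sketch of this interval and its error bound (the counts are hypothetical):

    # Sketch: large-sample interval for a binomial proportion p.
    from math import sqrt

    z = 1.96                              # 95% interval
    successes, n = 340, 500
    p_hat = successes / n
    q_hat = 1 - p_hat
    half = z * sqrt(p_hat * q_hat / n)    # error bound z * sqrt(pq/n)
    print((p_hat - half, p_hat + half))   # 95% CI for p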
Sample Size for Estimating p: If p̂ is used as an estimate of p, then we can be 100(1 − α)% confident that the error will not exceed a specified amount e when the sample size is

$n = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{e^2}$.

This result is somewhat misleading in the sense that we must use p̂ to determine the sample size n, but p̂ is computed from the sample. If a crude estimate of p can be made without taking a sample, we can use this value for p̂ and then determine n. Lacking such an estimate, we can take a preliminary sample of size n ≥ 30 to provide an estimate of p, and then use the above result to determine approximately how many observations are needed to provide the desired degree of accuracy. Once again, all fractional values of n are rounded up to the next whole number. Alternatively, we may substitute p̂ = 1/2 into the formula for n; when p actually differs from 1/2, n will turn out to be larger than necessary for the specified degree of confidence, and as a result our degree of confidence will increase. Thus, if p̂ = 1/2 is used as an estimate of p, we can be at least 100(1 − α)% confident that the error will not exceed a specified amount e when the sample size is

$n = \frac{z_{\alpha/2}^2}{4e^2}$.

Large-sample Confidence Interval for p₁ − p₂: If p̂₁ and p̂₂ are the proportions of successes in random samples of sizes n₁ and n₂ respectively, with q̂₁ = 1 − p̂₁ and q̂₂ = 1 − p̂₂, an approximate 100(1 − α)% confidence interval for the difference of the two binomial parameters, p₁ − p₂, is given by

$(\hat{p}_1 - \hat{p}_2) - z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} < p_1 - p_2 < (\hat{p}_1 - \hat{p}_2) + z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$,

where $z_{\alpha/2}$ is the z-value leaving an area of α/2 to the right.
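A sketch of the interval for p₁ − p₂ (hypothetical counts):

    # Sketch: large-sample interval for p1 - p2.
    from math import sqrt

    z = 1.96
    x1, n1 = 120, 200
    x2, n2 = 93, 180
    p1, p2 = x1 / n1, x2 / n2
    half = z * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    print((p1 - p2 - half, p1 - p2 + half))   # 95% CI for p1 - p2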
Confidence Interval for σ²: If s² is the variance of a random sample of size n from a normal population, a 100(1 − α)% confidence interval for σ² is given by

$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$,

where $\chi^2_{\alpha/2}$ and $\chi^2_{1-\alpha/2}$ are χ²-values with n − 1 degrees of freedom leaving areas of α/2 and 1 − α/2, respectively, to the right. A 100(1 − α)% confidence interval for σ is obtained by taking the square root of each endpoint of the interval for σ².

Example 3.3: The following are the volumes, in deciliters, of 10 cans of peaches distributed by a certain company: 46.4, 46.1, 45.8, 47.0, 46.1, 45.9, 45.8, 46.9, 45.2 and 46.0. Find a 95% confidence interval for the variance of all such cans of peaches distributed by this company, assuming volume to be a normally distributed variable.

Solution:
Step 1: Find the sample variance, s² = 0.286.
Step 2: To obtain a 95% confidence interval, we choose α = 0.05. Then, with 9 degrees of freedom, $\chi^2_{0.025} = 19.023$ and $\chi^2_{0.975} = 2.700$.
Step 3: Substituting these values in the formula

$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$,

we get the 95% confidence interval

$\frac{(9)(0.286)}{19.023} < \sigma^2 < \frac{(9)(0.286)}{2.700}$, or simply 0.135 < σ² < 0.953.

Confidence Interval for σ₁²/σ₂²: If s₁² and s₂² are the variances of independent samples of sizes n₁ and n₂, respectively, from normal populations, then a 100(1 − α)% confidence interval for σ₁²/σ₂² is

$\frac{s_1^2}{s_2^2}\,\frac{1}{F_{\alpha/2}(v_1, v_2)} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{s_1^2}{s_2^2}\,F_{\alpha/2}(v_2, v_1)$,

where $F_{\alpha/2}(v_1, v_2)$ is an F-value with v₁ = n₁ − 1 and v₂ = n₂ − 1 degrees of freedom leaving an area of α/2 to the right, and $F_{\alpha/2}(v_2, v_1)$ is a similar F-value with v₂ = n₂ − 1 and v₁ = n₁ − 1 degrees of freedom.
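A sketch of this variance-ratio interval, assuming SciPy for the F quantiles (the sample variances and sizes are hypothetical):

    # Sketch: interval for sigma1^2 / sigma2^2.
    from scipy.stats import f

    n1, n2 = 13, 10
    s1_sq, s2_sq = 12.5, 8.4
    alpha = 0.10
    ratio = s1_sq / s2_sq
    lo = ratio / f.ppf(1 - alpha / 2, n1 - 1, n2 - 1)
    hi = ratio * f.ppf(1 - alpha / 2, n2 - 1, n1 - 1)
    print((lo, hi))    # 90% CI for sigma1^2 / sigma2^2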
Example 3.4: A standardized placement test in mathematics was given to 25 boys and 16 girls. The boys made an average grade of 82 with a standard deviation of 8, while the girls made an average grade of 78 with a standard deviation of 7. Find a 98% confidence interval for σ₁²/σ₂² and for σ₁/σ₂, where σ₁² and σ₂² are the variances of the populations of grades of all boys and girls, respectively, who at some time have taken or will take this test. Assume the populations to be normally distributed.

Solution: We have n₁ = 25, n₂ = 16, s₁ = 8 and s₂ = 7.
Step 1: For a 98% confidence interval, α = 0.02, and $F_{0.01}(24, 15) = 3.29$ and $F_{0.01}(15, 24) = 2.89$.
Step 2: Substituting these in the formula

$\frac{s_1^2}{s_2^2}\,\frac{1}{F_{\alpha/2}(v_1, v_2)} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{s_1^2}{s_2^2}\,F_{\alpha/2}(v_2, v_1)$,

we obtain the 98% confidence interval

$\frac{64}{49} \cdot \frac{1}{3.29} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{64}{49} \cdot 2.89$,

which simplifies to 0.397 < σ₁²/σ₂² < 3.775.

Step 3: Taking square roots of the confidence limits, a 98% confidence interval for σ₁/σ₂ is 0.630 < σ₁/σ₂ < 1.943.