10. Sampling and Hypothesis

Census & Sample-
• In the census (complete enumeration) survey
method, data are collected for each
and every unit of the population/
universe, which is the complete set of
items of interest in a
particular situation.
• A sample is a portion
chosen from the population.

Sampling
• Sampling is the study of a sample.
• In the sampling technique, only a
part of the population is studied instead of
every unit, and conclusions about the
entire universe are drawn on that
basis.
• It is a process of learning about the
population on the basis of a sample
drawn from it.

Methods of Sampling
Probability/Random:
-Simple/Unrestricted
-Stratified
-Systematic
-Cluster/Multistage
Non-Probability/Non-random:
-Judgment/Purposive
-Quota
-Convenience
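
Three of the probability methods listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 20 numbered units, the two strata ("low" and "high") and the function names are hypothetical and do not come from the slides.

```python
import random

def simple_random_sample(units, n, rng):
    """Simple/unrestricted: every unit has the same chance (no replacement)."""
    return rng.sample(units, n)

def systematic_sample(units, n, rng):
    """Systematic: random start, then every k-th unit, with k = N // n."""
    k = len(units) // n
    start = rng.randrange(k)
    return [units[start + i * k] for i in range(n)]

def stratified_sample(strata, fraction, rng):
    """Stratified: draw the same fraction at random from each stratum."""
    chosen = []
    for members in strata.values():
        size = max(1, round(len(members) * fraction))
        chosen.extend(rng.sample(members, size))
    return chosen

rng = random.Random(0)                      # fixed seed so the run is repeatable
population = list(range(1, 21))             # hypothetical units numbered 1..20
strata = {"low": population[:10], "high": population[10:]}

srs = simple_random_sample(population, 5, rng)
sys_s = systematic_sample(population, 5, rng)
strat = stratified_sample(strata, 0.2, rng)
print(srs, sys_s, strat)
```

The non-probability methods (judgment, quota, convenience) depend on the investigator's choice, so they are not modelled here.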

Statistic & Parameter-
• Statistic: describes a characteristic
of a sample.
• It is a value obtained from the study of a
sample, such as the mean, median, standard
deviation, etc.
• Parameter: describes a characteristic
of a population.
• It is a value obtained from the population,
such as the mean, median, standard deviation,
etc.

             POPULATION                    SAMPLE
DEFINITION   Collection of items           Part or portion of the
             being considered              population chosen for study
PROPERTIES   Parameters                    Statistics
SYMBOLS      Population size = N           Sample size = n
             Population mean = $\mu$       Sample mean = $\bar{x}$
             Population s.d. = $\sigma$    Sample s.d. = s
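
A small Python sketch of the distinction in the table: parameters are computed from the whole population, statistics from a sample. The simulated population (mean 50, s.d. 10, N = 1000) and the sample size of 40 are assumptions made purely for illustration. The sample s.d. here uses the n - 1 divisor, which is the convention of Python's `statistics.stdev`; the slides do not say which divisor they intend.

```python
import random
import statistics

rng = random.Random(1)
population = [rng.gauss(50, 10) for _ in range(1000)]   # hypothetical population
sample = rng.sample(population, 40)                      # simple random sample

N, n = len(population), len(sample)

# Parameters: computed from every unit of the population.
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)    # divides by N

# Statistics: computed from the sample only.
x_bar = statistics.fmean(sample)
s = statistics.stdev(sample)             # divides by n - 1

print(f"N={N}, mu={mu:.2f}, sigma={sigma:.2f}")
print(f"n={n}, x_bar={x_bar:.2f}, s={s:.2f}")
```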

Sampling Error-
• The difference between the result obtained by
studying a sample and inferring a result
about the population, and the result of a
census of the whole population.
• The error arising from drawing
inferences about the population on the
basis of a few observations (sampling).
• Two types:
  - Biased errors
  - Unbiased errors

• Sampling error is non-existent in a
complete enumeration survey.
• Non-sampling errors: errors that
occur in acquiring, recording or
tabulating statistical data and that cannot
be ascribed to sampling error.
• They may arise in either a census or a
sample.

Statistical hypothesis
A statement about the population
parameter.
Assertion or assumption, that we make
about a population parameter, which may
or may not be valid, but is used as a basis
for reasoning.
Hypothesis testing: A process of testing
a statement or belief about a population
parameter by the use of information
collected from a sample(s).

Hypothesis Testing
• Null hypothesis: it states that there
is no real difference between the sample
and the population in the particular
matter under consideration.
• Denoted by Ho.

Null hypothesis
• States that the “null” condition exists
• There is nothing new happening
• The old standard is correct
• The old theory is still true
• The system is in control.
Key word- difference is not significant

Alternative hypothesis
The alternative hypothesis is complementary
to the null hypothesis and specifies those
values that the researcher believes to
hold true.
Denoted by Ha.
The two hypotheses are such that if one
is accepted, the other is rejected.

Alternative hypothesis
 The new theory is true
 Something is happening.
 There are new standards
 The system is out-of-control,
Key word- difference is significant,
i.e. the results of the experiment are unlikely to be due to
chance, so the null hypothesis is rejected.

CASELETS
 Flour packaged by a manufacturer is
supposed to weigh on an average 40
ounces.
 The manufacturer wants to test the
packaging process
Null hypothesis: the average weight of the
packages is 40 ounces (no problem).
Alternative hypothesis: the average is not 40
ounces (process is out of control).

 A Company has found mean life time of
fluorescent light bulbs is 1600 hrs.
• Due to improvements in technology,
officials believe that the lifetime of the bulbs
is now greater than 1600 hrs.
NULL HYPOTHESIS:
Life time of bulbs is still 1600 hrs (OLD IDEA)
ALTERNATE HYPOTHESIS:
Life time of bulbs is greater than 1600 hrs.
(NEW THEORY)

• You are investigating the effects of a
new pain reliever.
• Hope the new drug relieves pains longer
than the leading pain reliever.
Null hypothesis: the new pain reliever is
no better than the leading pain reliever.
Alternate hypothesis: the new pain reliever
lasts longer than the leading pain reliever.

• An automobile manufacturer claims a new model gets
at least 27 miles per gallon. A consumer group
disputes this claim and would like to show the mean
miles per gallon is lower.
(H0: $\mu \geq 27$ and Ha: $\mu < 27$)
• A freezer is set to cool food to 10°. If the
temperature is higher, the food could spoil, and if
it is lower, the freezer is wasting energy. Random
freezers are selected and tested as they come off
the assembly line. The assembly line is stopped if
there is any evidence to suggest improper cooling.
H0: $\mu = 10$ and Ha: $\mu \neq 10$

Level of Significance
• Ho is tested against Ha
at a certain level of significance.
• The risk with which an experimenter
rejects or retains a null hypothesis
depends upon the significance level
adopted.
• 5%: the probability of rejecting the null hypothesis
when it is true is 0.05.
• Denoted by $\alpha$; it is specified before the
samples are drawn.

[Figure: sampling distribution with a central acceptance region and a rejection (critical) region in each tail. Accept the null hypothesis if the sample statistic falls in the central region; reject it if the sample statistic falls in either of the two tail regions.]

Errors in Hypothesis Testing
• Type I Error:
Reject Ho when it is true.
P{Reject Ho | Ho is true} = $\alpha$
• Type II Error:
Accept Ho when it is false.
P{Accept Ho | Ha is true} = $\beta$

Decision     Ho is true                       Ho is false
Accept Ho    Correct decision with            Type II error ($\beta$)
             confidence ($1 - \alpha$)
Reject Ho    Type I error ($\alpha$)          Correct decision ($1 - \beta$)

P{Reject a lot when it is good} = $\alpha$ (Producer's risk)
P{Accept a lot when it is bad} = $\beta$ (Consumer's risk)

Note:
1. We would like $\alpha$ and $\beta$ to be as small as possible.
2. For a fixed sample size, $\alpha$ and $\beta$ are inversely related: reducing one increases the other.
3. Usually $\alpha$ is set in advance (as .01 or .05).
4. $1 - \alpha$: known as the confidence coefficient/
degree of confidence.
5. $1 - \beta$: THE POWER OF THE STATISTICAL
TEST.
A measure of the ability of a hypothesis
test to reject a false null hypothesis.
6. Regardless of the outcome of a hypothesis
test, we never really know for sure if we
have made the correct decision.
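
The meaning of $\alpha$ and of the power $1 - \beta$ can be checked by simulation. The sketch below is illustrative only: it assumes a two-tailed z-test with known $\sigma$, and the numbers (hypothesized mean 100, $\sigma$ = 15, n = 30, a true mean of 110 for the "Ho false" case, 4000 repetitions) are hypothetical choices, not values from the slides.

```python
import random
from statistics import NormalDist, fmean

def rejects_h0(sample, mu0, sigma, alpha):
    """Two-tailed z-test: True when Ho: mu = mu0 is rejected."""
    z = (fmean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > z_crit

def rejection_rate(true_mu, mu0, sigma, n, alpha, trials, rng):
    """Fraction of simulated samples for which Ho is rejected."""
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        hits += rejects_h0(sample, mu0, sigma, alpha)
    return hits / trials

rng = random.Random(42)
# Ho true: every rejection is a Type I error, so the rate should be near alpha.
type1_rate = rejection_rate(100, 100, 15, 30, 0.05, 4000, rng)
# Ho false: the rejection rate estimates the power, 1 - beta.
power = rejection_rate(110, 100, 15, 30, 0.05, 4000, rng)
type2_rate = 1 - power

print(f"Type I rate ~ {type1_rate:.3f}, power ~ {power:.3f}, beta ~ {type2_rate:.3f}")
```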

Two-tailed Test
• The alternative hypothesis states that the
population parameter may be either less
than or greater than the value stated in
Ho.
• Ho: $\mu = \mu_0$   Ha: $\mu \neq \mu_0$
- The rejection region is located in both tails.

One-tailed Test
• The alternative hypothesis states
that the population parameter differs
from the value stated in Ho in one
particular direction.
• Ho: $\mu \leq \mu_0$   Ha: $\mu > \mu_0$ (Right-tailed test)
• Ho: $\mu \geq \mu_0$   Ha: $\mu < \mu_0$ (Left-tailed test)
- The critical region is located in only one
tail of the sampling distribution.

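One-tailed Test
[Figure: two curves. Right/upper-tailed test with the critical region in the right tail; left/lower-tailed test with the critical region in the left tail.]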

Critical values/Significant values
The value of the test statistic which separates
the critical (rejection) region from the
acceptance region. It depends on
1) Level of significance
2) Type of tail (one-tailed or two-tailed test)

Standard Normal Distribution
[Figure: standard normal curve over Z from -5 to 5, with central area $(1 - \alpha)$ between $-z_{\alpha/2}$ and $z_{\alpha/2}$ and area $\alpha/2$ in each tail.]

Summary of certain Critical
Values for Sample Statistic z

Rejection     Level of Significance
Region        $\alpha$=0.10   $\alpha$=0.05   $\alpha$=0.01   $\alpha$=0.005
One-tailed    1.28            1.645           2.33            2.58
Two-tailed    1.645           1.96            2.58            2.81

Sampling Distribution of
a Statistic
The probability distribution of all possible
values a statistic may assume when
computed from random samples of the same
size drawn from a specified population.

• Draw ‘k’ samples of size n from given
finite population of size N.
• Compute some statistic, such as the mean or
variance, for each of these k
samples.
• The set of the values of the statistic
so obtained (one for each sample)
constitutes the ‘sampling distribution
of the statistic’
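
The three steps above can be sketched as a short simulation. The population (500 random integers between 1 and 100), k = 2000 and n = 25 are assumptions chosen only to make the example concrete.

```python
import random
import statistics

rng = random.Random(7)
population = [rng.randint(1, 100) for _ in range(500)]   # finite population, N = 500
k, n = 2000, 25                                          # k samples, each of size n

# One value of the statistic (here the mean) per sample.
sample_means = [statistics.fmean(rng.sample(population, n)) for _ in range(k)]

# The collection sample_means approximates the sampling distribution of the mean.
print("population mean      :", round(statistics.fmean(population), 2))
print("mean of sample means :", round(statistics.fmean(sample_means), 2))
print("s.d. of sample means :", round(statistics.stdev(sample_means), 2))
```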

Properties of Sampling
Distribution of Mean-
• The arithmetic mean of the sampling
distribution of means is equal to the
mean of the population from which the
samples were drawn.
$\mu_{\bar{x}} = \mu$
• For large samples, the sampling distribution of means is approximately normally
distributed (irrespective of the distribution of
the universe).

The sampling distribution of the mean has a
standard deviation (the standard error)
equal to the population standard
deviation divided by the square root of the sample
size.
$S.E.(\bar{x}) = \sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$
Standard error
The standard deviation of the sampling distribution of
a statistic about population parameter is known as its
standard error
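
For a very small population the sampling distribution of the mean can be listed exactly, which makes the two properties easy to verify. The sketch below assumes sampling with replacement (for which $\sigma/\sqrt{n}$ holds exactly) and uses a hypothetical population of four values.

```python
from itertools import product
from statistics import fmean, pstdev

population = [2, 4, 6, 8]      # hypothetical population, N = 4
n = 2                          # sample size

# Every ordered sample of size n drawn with replacement (N**n = 16 samples).
samples = list(product(population, repeat=n))
means = [fmean(sample) for sample in samples]

mu, sigma = fmean(population), pstdev(population)
mean_of_means = fmean(means)       # should equal mu
standard_error = pstdev(means)     # should equal sigma / sqrt(n)

print(mean_of_means, mu)
print(standard_error, sigma / n ** 0.5)
```

When sampling without replacement from a finite population, the standard error is somewhat smaller than $\sigma/\sqrt{n}$, so the equality is only approximate in that case.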

Test of Significance for
single mean (Large Samples)
• If $x_i$, i = 1, 2, ..., n, is a random sample of size
n from a normal population with mean $\mu$
and standard deviation $\sigma$, then the sample
mean is distributed normally with mean $\mu$
and standard deviation $\sigma/\sqrt{n}$:
$\bar{x} \sim N\left(\mu, \dfrac{\sigma^2}{n}\right)$

• For large samples, the standard normal variate
is
Test statistic = (value of sample statistic - value of
hypothesized population parameter) / (standard error of statistic)
$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$

Procedure in hypothesis
testing
 Formulate a Hypothesis
 Set up suitable significance level
 Select test criterion
 Compute ‘z’
 Make decisions.

Case-Let
• A company manufacturing automobile tyres
finds that tyre life is normally distributed
with a mean of 40,000 km and a standard
deviation of 3000 km. It is believed that a
change in the production process will result
in a better product, and the company has
developed a new tyre. A sample of 100 new
tyres has been selected. The company has
found that the mean life of these new
tyres is 40,900 km. Can it be concluded
that the new tyre is significantly better
than the old one, using a significance
level of 0.01?

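Solution-
1. Null hypothesis: H0: $\mu = 40{,}000$
2. Alternate hypothesis: Ha: $\mu > 40{,}000$
3. Level of significance ($\alpha$) = 0.01
4. Test criterion: z-test
5. Computation:
$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{3000}{\sqrt{100}} = 300$
$z = \dfrac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \dfrac{40{,}900 - 40{,}000}{300} = 3$

• At the 0.01 level, the critical value of z (one-tailed) is
2.33.
• Zcal = 3
[Figure: right-tailed test with critical value 2.33 cutting off an area of .01 in the upper tail; Zcal = 3 lies in the rejection region. Decision rule noted on the figure: accept Ho when Z tab > Z cal.]
As the computed value
falls in the rejection
region (Z cal > Z tab), we reject
the null
hypothesis,
i.e. the alternate hypothesis that $\mu$ is
greater than 40,000 km is accepted.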
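
The tyre computation can be checked with a small helper that follows the five steps of the procedure. This is a sketch under the document's assumptions (known $\sigma$, large sample); the function name `z_test_mean` and its `tail` argument are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test_mean(x_bar, mu0, sigma, n, alpha, tail):
    """One-sample z-test. tail is 'right', 'left' or 'two'.

    Returns the computed z and whether Ho is rejected at level alpha.
    """
    z = (x_bar - mu0) / (sigma / sqrt(n))        # (statistic - parameter) / S.E.
    std_normal = NormalDist()
    if tail == "right":
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif tail == "left":
        reject = z < std_normal.inv_cdf(alpha)
    else:
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    return z, reject

# Tyre case-let: Ho: mu = 40,000 against Ha: mu > 40,000 at alpha = 0.01.
z, reject = z_test_mean(40_900, 40_000, 3_000, 100, 0.01, "right")
print(z, reject)
```

The same helper can be applied to the case-lets that follow by changing the numbers and the `tail` argument.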

Case let
• An ambulance service claims that it takes,
on average, 8.9 minutes to reach its
destination in emergency calls. To check
this claim, the agency which licenses
ambulance services timed it on 50
emergency calls, getting a mean of 9.3
minutes with a standard deviation of 1.8
minutes. Does this constitute evidence, at the 1% level
of significance, that
the figure claimed is not right?
Hint: Ho: $\mu = 8.9$; Ha: $\mu \neq 8.9$; Zcal ≈ 1.57;
Ho accepted.

 A random sample of boots worn by 40
combat soldiers in a desert region showed
an average life of 1.08 yrs with a standard
deviation of 0.05. Under standard
conditions, the boots are known to have an
average life of 1.28 yrs. Is there reason to
assert at a level of significance of 0.05
that use in the desert causes the mean life
of such boots to decrease?
Hint: Ho: $\mu = 1.28$; Ha: $\mu < 1.28$; Zcal ≈ -25.30;
Ho rejected.

 Hinton Press hypothesizes that the
average life of its largest web press
is 14,500 hrs. They know that the
standard deviation of press life is
2100 hrs. From a sample of 25
presses, the company finds a sample
mean of 13,000 hrs. At a 0.01
significance level, should the company
conclude that the average life of the
presses is less than the hypothesized
14,500 hours?
• Ans: Ho rejected.

 ABC company is engaged in the
packaging of a superior quality tea in
jars of 500 gm each. The company is
of the view that as long as the jars
contain 500 gm of tea, the process is
in control. The standard deviation is
50 gm. A sample of 225 jars is taken
at random and the sample average is
found to be 510 gm. Has the process
gone out of control?
Hint: Ho: $\mu = 500$; Ha: $\mu \neq 500$; Zcal = 3;
Ho rejected

Case let
• American Theaters knows that a
certain hit movie ran an average of
84 days in each city, and the
corresponding standard deviation was
10 days. The manager of the
southeastern district was interested
in comparing the movie's popularity in
his region with that in all of
American's other theaters. He
randomly chose 75 theaters in his
region and found that they ran the
movie an average of 81.5 days.

• State appropriate hypotheses for
testing whether there was a
significant difference in the length of
the picture's run between theaters in
the southeastern district and all of
American's other theaters.
• At a 1% significance level, test these
hypotheses.
• (Ans: Accept Ho)

Test of significance of
difference of means
• Here, we study two populations.
• Let $\bar{x}_1$ be the mean of a sample of size $n_1$
from a population with mean $\mu_1$ and variance $\sigma_1^2$,
and let $\bar{x}_2$ be the mean of an independent
random sample of size $n_2$ from another
population with mean $\mu_2$ and variance $\sigma_2^2$. Then
$\bar{x}_1 \sim N(\mu_1, \sigma_1^2/n_1)$
$\bar{x}_2 \sim N(\mu_2, \sigma_2^2/n_2)$

• The mean of the sampling distribution of
the difference between sample means is,
symbolically,
$\mu_{\bar{x}_1 - \bar{x}_2} = \mu_{\bar{x}_1} - \mu_{\bar{x}_2} = \mu_1 - \mu_2$
• The standard deviation of the sampling
distribution of the difference between the
sample means is called the STANDARD
ERROR of the difference between two
means.
$\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$

• When n > 30 (large samples), the test
statistic is
$z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
• As Ho: $\mu_1 = \mu_2$,
$z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$

 The means of two single large samples
of 1000 and 2000 members are 67.5
inches and 68.0 inches respectively.
Can the samples be regarded as
drawn from the same population of
standard deviation 2.5 inches? Test at the
5% level of significance.
• Sol: $n_1$ = 1000, $n_2$ = 2000,
• $\bar{x}_1$ = 67.5 inches, $\bar{x}_2$ = 68.0 inches,
• $\sigma_1 = \sigma_2$ = 2.5 inches

• Ho: $\mu_1 = \mu_2$ (the samples are drawn
from the same population)
• Ha: $\mu_1 \neq \mu_2$ (Two-tailed)
• Level of significance ($\alpha$) = 0.05
• Z-test
$z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \dfrac{67.5 - 68.0}{2.5\sqrt{\dfrac{1}{1000} + \dfrac{1}{2000}}} \approx -5.16$

• Zcal ≈ -5.16
• Z tab = 1.96 (at 5%, two-tailed)
As |Z cal| > Z tab, we
reject the null hypothesis, i.e. the samples cannot be regarded as
drawn from the same population
with standard deviation 2.5.
[Figure: two-tailed test; Z cal ≈ -5.16 lies in the left rejection region.]
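
The two-sample computation can be reproduced in Python as a check. This is a sketch for the two-tailed, large-sample case with the standard deviations treated as known; the function name `z_test_two_means` is a hypothetical choice.

```python
from math import sqrt
from statistics import NormalDist

def z_test_two_means(x1, x2, sd1, sd2, n1, n2, alpha):
    """Two-tailed large-sample z-test of Ho: mu1 = mu2.

    Returns the standard error of the difference, z, and the decision.
    """
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)   # S.E. of (x1 - x2)
    z = (x1 - x2) / se
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return se, z, abs(z) > z_crit

# Heights example: n1 = 1000, n2 = 2000, common s.d. 2.5 inches, alpha = 0.05.
se, z, reject = z_test_two_means(67.5, 68.0, 2.5, 2.5, 1000, 2000, 0.05)
print(f"S.E. = {se:.4f}, z = {z:.2f}, reject Ho: {reject}")
```

Carried to two decimals the statistic is about -5.16, well beyond the critical value of 1.96 in absolute terms.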

 In a survey of buying habits,400 women
shoppers are chosen at random in supermarket
'A', located in a certain section of
the city. Their average weekly food
expenditure is Rs 250 with a standard
deviation of Rs 40. For 400 women
shoppers chosen at random in supermarket
'B' in another section of the city, the
average weekly food expenditure is Rs 220
with a standard deviation of Rs 55. Test at the
1% level of significance whether the
average weekly food expenditure of the
two populations of shoppers is equal.
Ans: Ho: $\mu_1 = \mu_2$; Ha: $\mu_1 \neq \mu_2$ (Two-tailed)
Ho rejected

 The average hourly wage of a sample of
150 workers in plant 'A' was Rs 2.56 with
a standard deviation of Rs 1.08. The
average hourly wage of a sample of 200
workers in plant 'B' was Rs 2.87 with a
standard deviation of Rs 1.28. Can an
applicant safely assume that the hourly
wages paid by plant 'B' are higher than
those paid by plant 'A'?
Ans: Ho: $\mu_1 = \mu_2$; Ha: $\mu_1 < \mu_2$ (Left-tailed)
Ho rejected

 In 1993,Financial accounting Standard
Board (FASB) was considering a proposal to
require companies to report the potential effect
of employee stock options on earnings per
share (EPS). A random sample of 41 high-tech
firms revealed that the new proposal would reduce
EPS by an average of 13.8%, with a s.d. of 18.9%. A
random sample of 35 producers of consumer goods
showed that the proposal would reduce EPS by 9.1% on average, with a s.d.
of 8.7%. On the basis of these samples, is it
reasonable to conclude (at the 10% level of
significance) that the FASB proposal will cause a
greater reduction in EPS for high-tech firms than
for producers of consumer goods?
Ans: Ho: $\mu_1 = \mu_2$; Ha: $\mu_1 > \mu_2$ (Right-tailed)
Ho rejected

Que: Two independent samples of
observations were collected. For the first
sample of 60 elements, the mean was 86
and the standard deviation 6. The second
sample of 75 elements had a mean of
82 and a standard deviation of 9.
• Compute the standard error of the
difference between means.
• Using $\alpha$ = 0.01, test whether the two samples
can reasonably be considered to have come
from populations with the same mean.

 A potential buyer wants to decide which of
the two brands of electric bulbs he should
buy, as he has to buy them in bulk. As a
specimen, he buys 100 bulbs of each of the
two brands, A and B. On using these bulbs,
he finds that brand A has a mean life of
1200 hrs with a standard deviation of 50
hrs and brand B has a mean life of 1150
hrs with a s.d. of 40 hrs. Do the two brands
differ significantly in quality? (Use $\alpha$ =
0.05)
Ans: Ho: $\mu_1 = \mu_2$; Ha: $\mu_1 \neq \mu_2$ (Two-tailed)
Ho rejected

Test of Significance for Single
Proportion-
• Suppose we take a sample of n
persons from a population and x of
these persons possess a
particular characteristic (say,
educated); then the sample proportion is
p = x/n
• Population proportion = P

• The mean of the sampling distribution of
proportions is
$\mu_p = P$
• The standard deviation of the sampling distribution
of proportions is the STANDARD ERROR of the
proportion
$\sigma_p = \sqrt{\dfrac{PQ}{n}} = \sqrt{\dfrac{P(1 - P)}{n}}$, where Q = 1 - P
• The standard normal variable is
$z = \dfrac{p - P}{\sqrt{\dfrac{PQ}{n}}}$

 Q: A company engaged in the
manufacture of superior quality
diaries, which are primarily meant for
senior executives in the corporate
world, claims that 75% of the
executives employed in Delhi use its
diaries. A random sample of 800
executives was taken and it was found
that 570 executives did use its diary
when the survey was
undertaken. Verify the company's
claim, using the 5% level of significance.

• Ho: P = .75
• Ha: P $\neq$ .75
• Level of significance ($\alpha$) = .05
p = 570/800 = .7125
$z = \dfrac{p - P}{\sqrt{\dfrac{PQ}{n}}} = \dfrac{0.7125 - 0.75}{\sqrt{\dfrac{0.75(1 - 0.75)}{800}}} = -2.45$

• Z cal = -2.45
• Z tab = 1.96 (at 5%, two-tailed)
• Ho rejected, since |Z cal| > Z tab. This implies the claim of
the company is exaggerated and is
not supported by our test.
[Figure: two-tailed test; Z cal = -2.45 lies in the left rejection region.]
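
The proportion test can be checked the same way. A minimal sketch, assuming the large-sample normal approximation used in the slides; the function name `z_test_proportion` and its `tail` argument are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test_proportion(x, n, p0, alpha, tail="two"):
    """Large-sample z-test for a single proportion.

    x successes out of n, hypothesized proportion p0;
    tail is 'right', 'left' or 'two'.
    """
    p_hat = x / n
    se = sqrt(p0 * (1 - p0) / n)            # standard error under Ho
    z = (p_hat - p0) / se
    std_normal = NormalDist()
    if tail == "right":
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif tail == "left":
        reject = z < std_normal.inv_cdf(alpha)
    else:
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    return p_hat, z, reject

# Diaries example: 570 of 800 executives, Ho: P = 0.75, alpha = 0.05, two-tailed.
p_hat, z, reject = z_test_proportion(570, 800, 0.75, 0.05)
print(f"p = {p_hat:.4f}, z = {z:.2f}, reject Ho: {reject}")
```

Passing the raw counts rather than a rounded proportion avoids rounding effects, which can matter when z lies close to the critical value.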

 Fifty people were attacked by the
disease and only 45 survived. Will you
reject the hypothesis that the
survival rate, if attacked by the disease,
is 85% in favor of the hypothesis
that it is more, at the 5% level?
Hint: p = 45/50 = 0.9; Ho: P = .85;
Ha: P > .85 (right-tailed)
Accept Ho.

 A ketch-up manufacturer is in the process
of deciding whether to produce a new
extra-spicy brand. The company's marketing
research department used a national
telephone survey of 6000 households and
found that the extra-spicy ketchup would
be purchased by 335 of them. A much more
extensive study made 2 years ago showed
that 5% of the households would purchase
the brand then. At a 2% significance level,
should the company conclude that there is
an increased interest in the extra-spicy
flavor?
Hint: p = 335/6000 ≈ 0.0558; Ho: P = 0.05; Ha: P > 0.05
Ho rejected.

 A manufacturer claims that at least
95% of the equipment which he
supplied to a factory conformed to
the specification. An examination of
a sample of 200 pieces of
equipment revealed that 18 were
faulty. Test the claim of the
manufacturer.
Hint: Ho: P = .95; Ha: P < .95; p = 1 - 18/200 = .91
Ho rejected.
