SAMPLE AND SAMPLE DISTRIBUTIONS

CN303/3/

1

UNIT 3

SAMPLE AND SAMPLE DISTRIBUTIONS

OBJECTIVES

General Objective
• To understand the concept of sampling and sample distributions.

Specific Objectives
At the end of the unit you should be able to:
• Define the sampling distribution concept, which is the basis of inferential
  statistics.
• Express the relationship between sample statistics and population
  parameters.
• Explain the concept of the sampling distribution of sample means based on
  random samples taken with and without replacement from a population.
• Calculate the mean, variance and standard deviation of the distribution of
  the sample means taken with or without replacement from a population.
• State the criterion for large samples ($n \ge 30$).
• Study the characteristics of the distributions of the means of samples
  taken from a population.
• Use the central limit theorem to solve probability problems involving the
  distribution of sample means for large samples.

INPUT

3.0 INTRODUCTION

As an engineer, you may be required to find the mean service life of newly
developed light bulbs. One approach is to pick at random, say, 50 light bulbs
from the whole population of a thousand bulbs produced and have them tested.
The mean of the tested bulbs is then used to estimate the mean service life of
all the bulbs. This method is known as sampling.
3.1 SAMPLE DISTRIBUTIONS

Every sample is a subset of a population. By studying the sample, it is
possible to find out the characteristics of the sample and from them estimate
the characteristics of the whole population. It would be ideal if the sample were a
perfect miniature of the population in all characteristics. This ideal, however, is
impossible to achieve. The best that can be done is to select a sample that will
be representative with respect to some characteristics, preferably those
pertaining to the study.
For a sample to be a random sample, every member of the population must
have an equal chance of being selected. A sample selected without bias is
representative of the population.

3.1.1 SAMPLE STATISTICS AND POPULATION PARAMETERS
The probability distribution concept can be applied to sample statistics. Examples
of sample statistics are measures of central tendency for a given sample, such as
the mean ($\bar{x}$), and measures of variation, such as the standard deviation, $s$.
The population mean, $\mu$, and the population standard deviation, $\sigma$, are the
corresponding population parameters. Below is a table of sample statistics and
population parameters:
Quantity             Sample statistic   Population parameter
Size                 $n$                $N$
Mean                 $\bar{x}$          $\mu$
Variance             $s^2$              $\sigma^2$
Standard deviation   $s$                $\sigma$
Proportion           $\hat{p}$          $p$

3.1.2 DISTRIBUTION OF SAMPLE MEANS
Suppose we select 100 samples of a specific size from a large population and compute
the mean of the same variable for each of the 100 samples. The sample means,
$\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_{100}$, constitute a sampling distribution of sample means.
If the samples are randomly selected with replacement, the sample means will, for
the most part, be somewhat different from the population mean $\mu$. These
differences are caused by sampling error.
Properties of the Distribution of Sample Means
1. The mean of the sample means will be the same as the
   population mean.
2. The standard deviation of the sample means will be smaller than
   the standard deviation of the population, and will be equal to the
   population standard deviation divided by the square root of the
   sample size.

Example 3.1
1. Suppose a lecturer gave an eight-point quiz to a small class of four
   students. The results of the quiz were 2, 6, 4, and 8. Assume the four
   students constitute the population.
   Find i) the population mean $\mu$ and standard deviation $\sigma$, and draw the graph of the sample means
        ii) the mean $\mu_{\bar{x}}$ and standard deviation $\sigma_{\bar{x}}$ of the sample means.
2. Assume that we have a population consisting of the three numbers 1, 2, and
   3. The probability distribution for these numbers is

   x      1     2     3
   P(x)   1/3   1/3   1/3

   Find
   i) The population mean, variance and standard deviation.
   ii) Now, if all samples of size 2 are taken with replacement, and the
       mean of each sample is found, find:
       a) The probability distribution of the sample means, $\bar{x}$ (draw a table)
       b) The mean of the sample means
       c) The variance and standard deviation of the sample means

Solution to Example 3.1
1. The mean of the population is

$$\mu = \frac{2 + 6 + 4 + 8}{4} = 5$$

The standard deviation of the population is

$$\sigma = \sqrt{\frac{(2-5)^2 + (6-5)^2 + (4-5)^2 + (8-5)^2}{4}} = 2.236$$

Below is the graph of the population scores. Each score occurs once, so every
bar has the same height.

[Figure: histogram of the population scores; x-axis: score, y-axis: frequency (each bar has frequency 1)]

Now, if all samples of size 2 are taken with replacement, and the mean of each
sample is found, the distribution is shown next. (You can draw a tree diagram if
you wish)
Sample   Mean      Sample   Mean
2, 2     2         6, 2     4
2, 4     3         6, 4     5
2, 6     4         6, 6     6
2, 8     5         6, 8     7
4, 2     3         8, 2     5
4, 4     4         8, 4     6
4, 6     5         8, 6     7
4, 8     6         8, 8     8

A frequency distribution of sample means is as follows.


$\bar{x}$   2   3   4   5   6   7   8
f           1   2   3   4   3   2   1

Below is the graph of the sample means. The graph appears to be somewhat
normal, even though it is a histogram.
[Figure: histogram of the sample means; x-axis: sample mean (2 to 8), y-axis: frequency (0 to 5)]

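The mean of the sample means, denoted by $\mu_{\bar{x}}$, is

$$\mu_{\bar{x}} = \frac{2 + 3 + \cdots + 8}{16} = \frac{80}{16} = 5$$

which is the same as the population mean. Hence $\mu_{\bar{x}} = \mu$.
The standard deviation of the sample means, denoted by $\sigma_{\bar{x}}$, is

$$\sigma_{\bar{x}} = \sqrt{\frac{(2-5)^2 + (3-5)^2 + \cdots + (8-5)^2}{16}} = 1.581$$

which is the same as the population standard deviation divided by $\sqrt{2}$:

$$\sigma_{\bar{x}} = \frac{2.236}{\sqrt{2}} = 1.581$$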
Note: if all possible samples of size $n$ are taken with replacement from the same
population, the mean of the sample means, denoted by $\mu_{\bar{x}}$, equals the
population mean $\mu$; and the standard deviation of the sample means, denoted
by $\sigma_{\bar{x}}$, equals $\sigma / \sqrt{n}$.
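The result for the quiz scores can be checked by brute force. The following is a minimal Python sketch, assuming only the standard library; the helper names `mean` and `pop_std` are illustrative and not part of the unit. It lists every ordered sample of size 2 drawn with replacement from the population 2, 6, 4, 8 and compares the mean and standard deviation of the sample means with $\mu$ and $\sigma / \sqrt{n}$.

```python
from itertools import product
from math import sqrt

population = [2, 6, 4, 8]  # quiz scores from Example 3.1
n = 2                      # sample size


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def pop_std(values):
    """Standard deviation with divisor len(values), as used in the unit."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / len(values))


# Every ordered sample of size n drawn with replacement (4**2 = 16 samples).
sample_means = [mean(s) for s in product(population, repeat=n)]

mu, sigma = mean(population), pop_std(population)
mu_xbar, sigma_xbar = mean(sample_means), pop_std(sample_means)

print("population:   ", mu, round(sigma, 3))
print("sample means: ", mu_xbar, round(sigma_xbar, 3))
print("sigma/sqrt(n):", round(sigma / sqrt(n), 3))
```

Changing `population` or `n` repeats the check for other small populations, although the number of samples grows quickly with `n`.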
2. Population mean,

$$\mu = E(X) = \sum x\,p(x) = 1(1/3) + 2(1/3) + 3(1/3) = \tfrac{1}{3}(1 + 2 + 3) = 6/3 = 2$$

Population variance, $\sigma^2 = E(X^2) - [E(X)]^2$, where

$$E(X^2) = 1^2(1/3) + 2^2(1/3) + 3^2(1/3) = \tfrac{1}{3}(1 + 4 + 9) = \tfrac{14}{3}$$

$$\sigma^2 = \tfrac{14}{3} - 2^2 = \tfrac{2}{3}$$

Therefore $\sigma = \sqrt{2/3}$.

ii)
Sample   Mean, $\bar{x}$
1, 1     1.0
1, 2     1.5
1, 3     2.0
2, 1     1.5
2, 2     2.0
2, 3     2.5
3, 1     2.0
3, 2     2.5
3, 3     3.0

The probability distribution for sample means, $\bar{x}$:

$\bar{x}$      1.0   1.5   2.0   2.5   3.0
$P(\bar{x})$   1/9   2/9   3/9   2/9   1/9

You can draw a histogram for sample means, $\bar{x}$ against $P(\bar{x})$, as in 'Activity 3A',
and then find the mean of the sample means:

$$\mu_{\bar{x}} = E(\bar{x}) = \sum \bar{x}\,p(\bar{x}) = 1(1/9) + 1.5(2/9) + 2(3/9) + 2.5(2/9) + 3(1/9) = \tfrac{18}{9} = 2 = \mu \ \text{(population mean)}$$

Variance of the sample means, $\bar{x}$, is $\sigma_{\bar{x}}^2 = E(\bar{x}^2) - [E(\bar{x})]^2$, where

$$E(\bar{x}^2) = \sum \bar{x}^2 p(\bar{x}) = 1^2(1/9) + 1.5^2(2/9) + 2.0^2(3/9) + 2.5^2(2/9) + 3.0^2(1/9) = \tfrac{13}{3}$$

$$\sigma_{\bar{x}}^2 = \tfrac{13}{3} - 2^2 = \tfrac{1}{3}$$

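Standard deviation of the sample means, $\bar{x}$: $\sigma_{\bar{x}} = \sqrt{1/3}$

Look: with $\sigma^2 = \tfrac{2}{3}$ and $n = 2$,

$$\sigma_{\bar{x}} = \sqrt{\tfrac{1}{3}} = \frac{\sqrt{2/3}}{\sqrt{2}} = \sqrt{\frac{\sigma^2}{n}} = \frac{\sigma}{\sqrt{n}}$$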
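The probability table for the sample means can also be built mechanically. The sketch below is one possible way to do it in Python, assuming the standard library only; the variable names are illustrative. It uses exact fractions so that the results can be compared directly with 2/3 and 1/3.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

values = [1, 2, 3]
p = {x: Fraction(1, 3) for x in values}  # P(x) = 1/3 for each value

# Population mean and variance: mu = E(X), sigma^2 = E(X^2) - mu^2
mu = sum(x * p[x] for x in values)
var = sum(x * x * p[x] for x in values) - mu ** 2

# Distribution of the mean of two draws taken with replacement.
dist = Counter()
for a, b in product(values, repeat=2):
    dist[Fraction(a + b, 2)] += p[a] * p[b]

mu_xbar = sum(m * q for m, q in dist.items())
var_xbar = sum(m * m * q for m, q in dist.items()) - mu_xbar ** 2

for m in sorted(dist):
    print(float(m), dist[m])  # sample mean and its probability
print("mu =", mu, " sigma^2 =", var)
print("mu_xbar =", mu_xbar, " sigma_xbar^2 =", var_xbar)
```

Fractions are printed in lowest terms, so a probability of 3/9 appears as 1/3.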

ACTIVITY 3A

TEST YOUR UNDERSTANDING BEFORE PROCEEDING TO THE NEXT
INPUT…!
1. Let the population consist of the digits 1, 2 and 3. Find the population
mean and the population standard deviation.
2. 10000 female students are found to have a mean weight of 63 kg with a
standard deviation of 7 kg. 100 samples of size 36 are taken, without
replacement, from the above. Estimate the mean and standard deviation
of the sample-means.

FEEDBACK TO ACTIVITY 3A

1. $\mu = 2$, $\sigma = \sqrt{2/3}$

2. $\mu_{\bar{x}} = 63$ kg and $\sigma_{\bar{x}} = 1.17$ kg

INPUT

3.2 THE CENTRAL LIMIT THEOREM

As the sample size $n$ increases, the shape of the distribution of the sample
means taken with replacement from a population with mean $\mu$ and standard
deviation $\sigma$ will approach the normal distribution. As previously shown, this
distribution will have mean $\mu$ and standard deviation $\sigma / \sqrt{n}$.
The central limit theorem can be used to answer questions about sample means
in the same manner that the normal distribution can be used to answer questions
about individual values. The only difference is that a new formula must be used
for the z values:

$$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

Notice that $\bar{X}$ is the sample mean, and the denominator is the
standard error of the mean. It is important to remember two things when using
the central limit theorem:

1. When the original variable is normally distributed, the distribution of the sample
   means will be normally distributed, for any sample size $n$.
2. When the distribution of the original variable departs from normality, a sample
   size of 30 or more is needed to use the normal distribution to approximate the
   distribution of the sample means. The larger the sample, the better the
   approximation will be.
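The theorem can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the population is taken to be exponential with mean 1 and standard deviation 1 (a strongly skewed distribution chosen for this example, not one used in the unit), the sample size is 30, and the number of repetitions and the random seed are arbitrary.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

random.seed(0)  # fixed seed so the run is repeatable


def sample_mean(n):
    """Mean of n independent draws from an exponential population (mean 1, sd 1)."""
    return fmean(random.expovariate(1.0) for _ in range(n))


n = 30
means = [sample_mean(n) for _ in range(20000)]

centre = fmean(means)   # should be close to mu = 1
spread = pstdev(means)  # should be close to sigma / sqrt(n)
# Share of sample means within one standard error of the centre;
# for a normal curve this share is about 0.68.
within_one_se = sum(abs(m - centre) <= spread for m in means) / len(means)

print("mean of sample means:", round(centre, 3))
print("sd of sample means:  ", round(spread, 3), "vs", round(1 / sqrt(n), 3))
print("share within one SE: ", round(within_one_se, 3))
```

Raising `n` should bring the histogram of `means` closer to a normal curve, in line with the second point above.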
……………………………………………………………………………………………..

NOTE
When the sample size is 30 or larger, the normality assumption is not necessary.
When do we use $z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$ and when $z = \dfrac{X - \mu}{\sigma}$?
The formula $z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$ should be used to gain information about a sample mean,
whereas the formula $z = \dfrac{X - \mu}{\sigma}$ is used to gain information about an individual
data value obtained from the population. See the example below.
……………………………………………………………………………………………

Example 3.2
1. Students in semester 1 and 2 in Polytechnics spend an average of 25
hours sleeping in a week. Assume the variable is normally distributed and
the standard deviation is 3 hours. If 20 students from semester 1 and 2
are randomly selected, find the probability that the mean of the number of
hours they sleep will be greater than 26.3 hours.
2. The average age of motorcycles registered in polytechnics is 8 years, or
96 months. Assume the standard deviation is 16 months. If a random
sample of 36 motorcycles is selected, find the probability that the mean of
their ages is between 90 and 100 months.
3. The average number of pounds of meat a person consumes in a year is
218.4 pounds. Assume that the standard deviation is 25 pounds and the
distribution is approximately normal.
i) Find the probability that a person selected at random consumes
less than 224 pounds per year.
ii) If a sample of 40 individuals is selected, find the probability that
the mean of the sample will be less than 224 pounds per year.

Solution to Example 3.2

1. Since the variable is approximately normally distributed, the distribution of
sample means will be approximately normal, with a mean of 25. The
standard deviation of the sample means is

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{20}} = 0.671$$

[Figure: normal curve of the sample means centred at 25, with the area to the right of 26.3 shaded]

The distribution of the means is shown above, with the appropriate area shaded.
The z-value is

$$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{26.3 - 25}{3 / \sqrt{20}} = \frac{1.3}{0.671} = 1.94$$

The area between 0 and 1.94 is 0.4738. Since the desired area is in the tail,
subtract 0.4738 from 0.5000. Hence 0.5000 - 0.4738 = 0.0262, or 2.62%.
One can conclude that the probability of obtaining a sample mean larger than
26.3 hours is 2.62% (i.e., $P(\bar{X} > 26.3) = 2.62\%$).
2. The desired area is shown in the figure below:

[Figure: normal curve of the sample means centred at 96, with the area between 90 and 100 shaded]

The two z-values are

$$z_1 = \frac{90 - 96}{16 / \sqrt{36}} = -2.25 \quad \text{and} \quad z_2 = \frac{100 - 96}{16 / \sqrt{36}} = 1.50$$

The two areas corresponding to the z-values of -2.25 and 1.50 are
0.4878 and 0.4332, respectively. Since the z-values are on opposite sides of the mean, find
the probability by adding the areas: 0.4878 + 0.4332 = 0.9210, or 92.1%.
Hence, the probability of obtaining a sample mean between 90 and 100 months
is 92.1%, i.e., $P(90 < \bar{X} < 100) = 92.1\%$.
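The table lookup for the motorcycle problem can be reproduced with Python's standard library, which provides the normal cumulative distribution through `statistics.NormalDist`. The sketch below is a minimal illustration; the function name `prob_mean_between` is hypothetical and chosen for this example.

```python
from math import sqrt
from statistics import NormalDist


def prob_mean_between(low, high, mu, sigma, n):
    """P(low < sample mean < high), assuming the sample mean is (approximately) normal."""
    se = sigma / sqrt(n)          # standard error of the mean
    z_low = (low - mu) / se
    z_high = (high - mu) / se
    std = NormalDist()            # standard normal distribution
    return std.cdf(z_high) - std.cdf(z_low)


# Motorcycle ages: mu = 96 months, sigma = 16 months, n = 36
p = prob_mean_between(90, 100, mu=96, sigma=16, n=36)
print(round(p, 3))
```

The value agrees with the 92.1% found from the table, since both z-values here are exact to two decimal places.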
3. (i) Since the question asks about an individual person, the formula
$z = \dfrac{X - \mu}{\sigma}$ is used. The distribution is shown in the figure below.

[Figure: distribution of individual data values for the population, centred at 218.4, with the area to the left of 224 shaded]

The z value is

$$z = \frac{X - \mu}{\sigma} = \frac{224 - 218.4}{25} = 0.22$$

The area between 0 and 0.22 is 0.0871; this area must be added to 0.5000 to get
the total area to the left of z = 0.22.
0.0871 + 0.5000 = 0.5871
Hence, the probability of selecting an individual who consumes less than 224
pounds of meat per year is 0.5871, or 58.71% (i.e., $P(X < 224) = 0.5871$).
(ii) Since the question concerns the mean of a sample with a size of
40, the formula $z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$ is used. The area is shown in the figure
below:

[Figure: normal curve of the sample means centred at 218.4, with the area to the left of 224 shaded]

The z value is

$$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{224 - 218.4}{25 / \sqrt{40}} = 1.42$$

The area between z = 0 and z = 1.42 is 0.4222; this value must be added to
0.5000 to get the total area.
0.4222 + 0.5000 = 0.9222

Hence, the probability that the mean of a sample of 40 individuals is less than
224 pounds per year is 0.9222, or 92.22%. That is, $P(\bar{X} < 224) = 0.9222$.

Comparing the two probabilities, one can see that the probability of selecting an
individual who consumes less than 224 pounds of meat per year is 58.71%, but
the probability of selecting a sample of 40 people with a mean consumption of
meat that is less than 224 pounds per year is 92.22%. This rather large
difference is due to the fact that the distribution of sample means is much less
variable than the distribution of individual data values.
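The contrast between an individual value and a sample mean can be seen side by side in a short Python sketch. It assumes the standard library only and uses the figures of the meat-consumption problem; the variable names are illustrative. Because the z-values are not rounded to two decimals before the lookup, the printed probabilities differ slightly in the third decimal place from the table-based answers 0.5871 and 0.9222.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n, x = 218.4, 25, 40, 224

individual = NormalDist(mu, sigma)             # distribution of one person's consumption
sample_mean = NormalDist(mu, sigma / sqrt(n))  # distribution of the mean of n people

p_individual = individual.cdf(x)  # P(X < 224)
p_mean = sample_mean.cdf(x)       # P(sample mean < 224)

print("individual: ", round(p_individual, 4))
print("sample mean:", round(p_mean, 4))
```

The only difference between the two lines is the standard deviation passed to `NormalDist`, which mirrors the difference between the two z formulas.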

ACTIVITY 3B

TEST YOUR UNDERSTANDING BEFORE PROCEEDING TO THE NEXT
INPUT…!
1. The average salary for workers at an electronics factory is RM13.50 per
hour. Assume that the standard deviation is RM2.90 per hour and the
distribution is approximately normal. If $\bar{X}$ is the mean salary per hour for a
random sample of the workers at the factory, find the mean and standard
deviation of the sampling distribution of $\bar{X}$ if the sample size is (a) 30 workers,
and (b) 75 workers.

2. The average weight of sugar sachets is 32 grams. Assume the standard
deviation is 0.3 gram. If a random sample of 20 sachets is selected, find
the probability that the mean of their weights is between 31.8 and 31.9
grams.

3. Analysis of 150 compressive strength results gave a mean strength of 32
N/mm2 and standard deviation 6.5 N/mm2. Given that 10 samples of 12
results are considered, find the number of samples with mean strength
greater than 33 N/mm2.

4. Asbestos-cement sheets are manufactured with a mean length of 2400 mm
and standard deviation 3 mm. Given that 20 batches consisting of 3 dozen
sheets are considered, determine
(a) the probability that a batch (chosen at random) has a mean
length between 2399.5 mm and 2400.6 mm
(b) the number of batches with mean length less than 2399.3 mm.
FEEDBACK TO ACTIVITY 3B

1. a) $\mu_{\bar{x}} = \mu$ = RM13.50, $\sigma_{\bar{x}} = 0.53$
   b) $\mu_{\bar{x}} = \mu$ = RM13.50, $\sigma_{\bar{x}} = 0.33$
2. 0.0667 or 6.67%
3. $\mu_{\bar{x}} = \mu = 32$ N/mm2, $\sigma_{\bar{x}} = 1.81$ N/mm2, $P(\bar{x} > 33) = 0.2912$, so about 3 samples
4. (a) $P(2399.5 < \bar{x} < 2400.6) = 0.7262$
   (b) $P(\bar{x} < 2399.3) = 0.0808$, so about 2 batches

INPUT

3.3 DISTRIBUTION OF THE SAMPLE MEANS
a. Distribution of the sample means with replacement
Statement 1
The distribution of the sample means $\bar{X}$ taken with replacement
from a normally distributed population with known mean $\mu$ and standard
deviation $\sigma$ is normal, regardless of the sample size ($n$). As previously shown,
this distribution has mean $\mu$ and standard deviation $\sigma / \sqrt{n}$.
Statement 2
If the sample is taken from any population with known $\mu$ and $\sigma$, and the sample
size is large ($n \ge 30$), the distribution of the sample mean is almost normal with
mean $\mu$ and standard deviation $\sigma / \sqrt{n}$, that is, $\bar{x} \sim N(\mu, \sigma^2 / n)$.
b. Distribution of the sample means without replacement
The formula for the standard error of the mean, $\sigma / \sqrt{n}$, is accurate when the samples
are drawn with replacement, or without replacement from a very large or infinite
population. Since sampling with replacement is for the most part unrealistic, a
correction factor is necessary for computing the standard error of the mean for
samples drawn without replacement from a finite population. Compute the
correction factor by using the following formula:

$$\sqrt{\frac{N - n}{N - 1}}$$

where $N$ is the population size and $n$ is the sample size.

This correction factor is necessary if relatively large samples are
taken from a small population, because the sample mean will then
estimate the population mean more accurately and there will be less error in the
estimation. Therefore, the standard error of the mean must be multiplied by the
correction factor to adjust it for large samples taken from a small population. That
is

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$$

Finally, the formula for the z value becomes

$$z = \frac{\bar{X} - \mu}{\dfrac{\sigma}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}}}$$

When the population is large and the sample is small, the correction factor is
generally not used, since it will be very close to 1.000. Therefore

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
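The two versions of the standard error can be wrapped in one small helper. The following Python sketch is an illustration only; the function name `standard_error` and its optional `N` argument are assumptions made for this example. The sample call uses the figures of a population of 2500 with $\sigma = 7$ and samples of size 30.

```python
from math import sqrt


def standard_error(sigma, n, N=None):
    """Standard error of the mean.

    sigma: population standard deviation
    n:     sample size
    N:     population size; if given, the finite population correction
           sqrt((N - n) / (N - 1)) is applied (sampling without replacement).
    """
    se = sigma / sqrt(n)
    if N is not None:
        se *= sqrt((N - n) / (N - 1))
    return se


with_replacement = standard_error(7, 30)
without_replacement = standard_error(7, 30, N=2500)
print(round(with_replacement, 3), round(without_replacement, 3))
```

With a population this much larger than the sample, the correction changes the standard error only in the third decimal place, which is why it is usually dropped in that situation.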

Example 3.3
1. The average price of houses in Jitra is RM157000 and the price distribution
is rather skewed. Assume the standard deviation is RM29500. If $\bar{x}$ is the mean
price for a sample of 400 houses selected at random, find the probability:
a) That the sample mean is between RM154000 and RM160000.
b) That the mean price for this sample is below RM154000.

2. The average time taken by line workers in an electronics firm to assemble
the electronic components is 80 hours with a standard deviation of 8
hours. Find the following probabilities for the mean assembly time if a random
sample of 16 workers is selected.
a. $P(78 < \bar{x} < 82)$
b. $P(76 < \bar{x} < 84)$
c. $P(74 < \bar{x} < 86)$

3. The average service life of 400 batteries is 800 hours with a standard
deviation of 45 hours. If a random sample of 45 batteries is selected, what is the
probability that the sample mean is between 790 and 810 hours?

4. The data show the number of children of each of a group of 50
Polytechnic lecturers.

No. of children    0    1    2    3    4
No. of lecturers   1    18   24   4    3

a. Find the mean and the standard deviation of the data above.
b. If a sample of 10 lecturers is taken, find the probability that the mean
number of children in this sample is more than 2.

Solution to Example 3.3
1. Although the price of houses in Jitra is skewed and not normally distributed,
the sample mean price is approximately normal because of the large sample size ($n = 400$).
Therefore the central limit theorem is applicable.
Given $\mu$ = RM157000 and $\sigma$ = RM29500,

$$\mu_{\bar{x}} = \mu = \text{RM}157000$$

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{29500}{\sqrt{400}} = \text{RM}1475$$

Therefore $\bar{x} \sim N(157000, 1475^2)$.

a.
$$P(154000 < \bar{x} < 160000) = P\left(\frac{154000 - 157000}{1475} < \frac{\bar{x} - 157000}{1475} < \frac{160000 - 157000}{1475}\right) = P(-2.03 < Z < 2.03) = 0.9576$$

b.
$$P(\bar{x} < 154000) = P\left(\frac{\bar{x} - 157000}{1475} < \frac{154000 - 157000}{1475}\right) = P(Z < -2.03) = 0.0212$$
2. Although the sample size is small ($n = 16$), the time taken to
assemble the components is normally distributed. Therefore the
sample mean $\bar{x}$ is normally distributed with mean 80 hours and
standard deviation $\sigma_{\bar{x}} = 8 / \sqrt{16} = 2$ hours.

a.
$$P(78 < \bar{x} < 82) = P\left(\frac{78 - 80}{2} < \frac{\bar{x} - 80}{2} < \frac{82 - 80}{2}\right) = P(-1 < Z < 1) = 0.6826$$

b.
$$P(76 < \bar{x} < 84) = P\left(\frac{76 - 80}{2} < \frac{\bar{x} - 80}{2} < \frac{84 - 80}{2}\right) = P(-2 < Z < 2) = 0.9544$$

c.
$$P(74 < \bar{x} < 86) = P\left(\frac{74 - 80}{2} < \frac{\bar{x} - 80}{2} < \frac{86 - 80}{2}\right) = P(-3 < Z < 3) = 0.9974$$
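The three interval probabilities for the assembly times follow one pattern, so they can be computed in a loop. The Python sketch below assumes the standard library only; the dictionary name `probs` is illustrative. The results may differ from the table values in the fourth decimal place because the table areas are rounded.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 80, 8, 16
xbar = NormalDist(mu, sigma / sqrt(n))  # sampling distribution of the mean (sd = 2 hours)

probs = {}
for half_width in (2, 4, 6):  # 1, 2 and 3 standard errors either side of the mean
    low, high = mu - half_width, mu + half_width
    probs[half_width] = xbar.cdf(high) - xbar.cdf(low)
    print(f"P({low} < mean < {high}) = {probs[half_width]:.4f}")
```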

3. The probability that the sample mean is between 790 and 810 hours is
0.9066.

4. The probability distribution is:

No. of children ($x$)         0      1      2      3      4
Relative frequency, $p(x)$    0.02   0.36   0.48   0.08   0.06

a)
$$\mu = \sum x\,p(x) = 0(0.02) + 1(0.36) + 2(0.48) + 3(0.08) + 4(0.06) = 1.8 \approx 2$$

$$\sigma^2 = \sum x^2 p(x) - \mu^2 = 0^2(0.02) + 1^2(0.36) + 2^2(0.48) + 3^2(0.08) + 4^2(0.06) - 1.8^2 = 0.72$$

$$\sigma = \sqrt{0.72} = 0.8485$$
b)
Because the population is finite ($N = 50$) and the 10 lecturers were selected without
replacement, the correction factor applies. The sampling distribution of the
sample means is taken to be almost normal with $\mu_{\bar{x}} = 1.8$ and

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} = \frac{0.8485}{\sqrt{10}} \sqrt{\frac{50 - 10}{50 - 1}} = 0.2424$$

Therefore $\bar{x} \sim N(1.8, 0.2424^2)$ and

$$P(\bar{x} > 2) = P\left(\frac{\bar{x} - 1.8}{0.2424} > \frac{2 - 1.8}{0.2424}\right) = P(Z > 0.83) = 0.2033$$
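The whole calculation for the lecturers' data, from the frequency table to the final probability, can be sketched in Python as follows. Only the standard library is assumed and the variable names are illustrative. Since z is not rounded to 0.83 before the lookup, the printed probability differs slightly in the third decimal place from the table-based 0.2033.

```python
from math import sqrt
from statistics import NormalDist

freq = {0: 1, 1: 18, 2: 24, 3: 4, 4: 3}  # number of children -> number of lecturers
N = sum(freq.values())                    # population size (50 lecturers)
n = 10                                    # sample size

# Population mean and variance from the frequency table.
mu = sum(x * f for x, f in freq.items()) / N
var = sum(x * x * f for x, f in freq.items()) / N - mu ** 2
sigma = sqrt(var)

# Standard error with the finite population correction factor.
se = sigma / sqrt(n) * sqrt((N - n) / (N - 1))

# P(sample mean > 2) under the normal approximation.
p = 1 - NormalDist(mu, se).cdf(2)

print("mu =", round(mu, 4), " sigma =", round(sigma, 4))
print("standard error =", round(se, 4))
print("P(mean > 2) =", round(p, 4))
```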

ACTIVITY 3C

TEST YOUR UNDERSTANDING BEFORE PROCEEDING TO THE NEXT
INPUT…!
1. The heights of 2500 men are normally distributed with a mean of 170 cm and a
standard deviation of 7 cm. If random samples are taken of 30 men, predict
the standard deviation and the mean of sampling distribution of means, if
sampling is done (a) with replacement, and (b) without replacement.
2. A group of 1000 ingots of metal have a mean mass of 7.4 kg and a standard
deviation of 0.4 kg. Find the probability that a sample of 50 ingots chosen at
random from the group, without replacement, will have a combined mass of (a)
between 360 and 377.5 kg, and (b) more than 375 kg.
3. Determine the mean and standard deviation of the set of numbers 1, 2, 4, 5,
and 6, correct to three decimal places. By selecting all possible different
samples of size 2 which can be drawn with replacement (25 pairs) determine
(a) the mean of the sampling distribution of means, and (b) the standard error
of the means, correct to three decimal places.
4. Determine the standard error of the means for problem 3, if sampling is without
replacement, correct to three significant figures.
5. The length of 1500 bolts is normally distributed with a mean of 22.4 cm and a
standard deviation of 0.048 cm. If 30 samples are drawn at random from this
population, each of size 36 bolts, determine the mean of the sampling
distribution and the standard error of the means when sampling is done with
replacement.
6. Determine the standard error of the means in problem 5, if sampling is done
without replacement, correct to 4 decimal places.

7. If a random sample of 64 lamps is drawn from a batch, determine the
probability that the mean time to failure will be less than 785 hours, correct to 3
decimal places.
8. Determine the probability that the mean time to failure of a random sample of
16 lamps will be between 790 hours and 810 hours, correct to 3 decimal
places.
9. For a random sample of 64 lamps, determine the probability that the mean
time to failure will exceed 820 hours, correct to 2 significant figures.
FEEDBACK TO ACTIVITY 3C

1. (a) $\mu_{\bar{x}} = 170$ cm, $\sigma_{\bar{x}} = 1.278$ cm
   (b) $\mu_{\bar{x}} = 170$ cm, $\sigma_{\bar{x}} = 1.271$ cm
2. The mean of the sampling distribution of means, $\mu_{\bar{x}} = \mu = 7.4$ kg
   The standard error of the means, $\sigma_{\bar{x}} = 0.0552$ kg
   (a) 0.9966 (b) 0.0351
3. $\mu = 3.600$, $\sigma = 1.855$, (a) $\mu_{\bar{x}} = 3.600$, (b) $\sigma_{\bar{x}} = 1.312$
4. $\sigma_{\bar{x}} = 1.136$
5. $\mu_{\bar{x}} = 22.4$ cm, $\sigma_{\bar{x}} = 0.008$ cm
6. $\sigma_{\bar{x}} = 0.0079$ cm
7. 0.023
8. 0.497
9. 0.0038

SELF ASSESSMENT 3

You are approaching success. Try all the questions in this self-assessment
section and check your answers on the next page. If you encounter any
problems, consult your instructor. Good luck.
1. If the samples of a specific size are selected from a population and the
means are computed, what is this distribution of means called?
2. What is the mean of the sample means?
3. What does the central limit theorem say about the shape of the distribution
of sample means?
4. What formula is used to gain information about a sample mean when the
variable is normally distributed or when the sample size is 30 or more?

For the exercises below, assume that the sample is taken from a large
population and the correction factor can be ignored.
5. The mean serum cholesterol of a large population of overweight
adults is 220 mg/dl and the standard deviation is 16.3 mg/dl. If a sample of
adults is selected, find the probability that the mean will be between 220
and 222 mg/dl.
6. The mean weight of 18-year-old females is 126 pounds, and the standard
deviation is 15.7 pounds. If a sample of 25 females is selected, find the
probability that the mean of the sample will be greater than 128.3 pounds.
Assume the variable is normally distributed.
7. The average price of a pound of sliced bacon is RM2.02. Assume the
standard deviation is RM0.08. If a random sample of 40 one-pound
packages is selected, find the probability that the mean of the sample will
be less than RM2.00.

8. The mean score on a dexterity test for 12-year-olds is 30. The standard
deviation is . If a psychologist administers the test to a class of 22 students,
find the probability that the mean of the sample will be between 27 and 31.
Assume the variable is normally distributed.

9. The average age of lawyers is 43.6 years, with a standard deviation of 5.1
years. If a law firm employs 50 lawyers, find the probability that the
average age of the group is greater than 44.2 years.

10. Procter & Gamble reported that an American family of 4 washes an
average of one ton (2000 pounds) of clothes each year. If the standard
deviation of the distribution is 187.5 pounds, find the probability that the
mean of a randomly selected sample of 50 families of four will be
between 1980 and 1990 pounds.

11. The average time it takes a group of adults to complete a certain
achievement test is 46.2 minutes. The standard deviation is 8.0 minutes.
Assume the variable is normally distributed.
a) Find the probability that a randomly selected adult will
complete the test in less than 43 minutes.
b) Find the probability that if 50 randomly selected adults take
the test, the mean time it takes the group to complete the
test will be less than 43 minutes.
c) Does it seem reasonable that an adult would finish the test in
less than 43 minutes? Explain.
d) Does it seem reasonable that the mean of the 50 adults
could be less than 43 minutes?
12. The average cholesterol content of a certain brand of eggs is 215
milligrams and the standard deviation is 15 milligrams. Assume the
variable is normally distributed.
a) If a single egg is selected, find the probability that the
cholesterol content will be more than 220 milligrams.
b) If a sample of eggs is selected, find the probability that the
mean of the sample will be larger than 220 milligrams.

13. The average labour cost for car repairs for a large chain of car repair shops is
RM 48.25. The standard deviation is RM 4.20. Assume the variable is
normally distributed.
(a) If a store is selected at random, find the probability that the
labour cost will range between RM 46 and RM 48.
(b) If stores are selected at random, find the probability that the
mean of the sample will be between RM 46 and RM 48.
(c) Which answer is larger? Explain why.

FEEDBACK TO SELF-ASSESSMENT 3

Have you tried the questions??? If “YES”, check your answers now.
1. The distribution is called the sampling distribution of sample means.
2. The mean of the sample means is equal to the population mean.
3. The distribution will be approximately normal when the sample size is
large.
4. $z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
5. 0.2486
6. 0.2327
7. 0.0571
8. 0.8239
9. 0.2033
10. 0.1254
11. a) 0.3446
    b) 0.0023
    c) Yes, since it is within one standard deviation of the mean.
    d) Very unlikely.
12. a) 0.3707
    b) 0.0475
13. a) 0.1815
    b) 0.3854
    c) Means are less variable than individual data.

 
A study on the ANOVA ANALYSIS OF VARIANCE.pptx
A study on the ANOVA ANALYSIS OF VARIANCE.pptxA study on the ANOVA ANALYSIS OF VARIANCE.pptx
A study on the ANOVA ANALYSIS OF VARIANCE.pptxjibinjohn140
 
EMBODO LP Grade 12 Mean and Variance of the Sampling Distribution of the Samp...
EMBODO LP Grade 12 Mean and Variance of the Sampling Distribution of the Samp...EMBODO LP Grade 12 Mean and Variance of the Sampling Distribution of the Samp...
EMBODO LP Grade 12 Mean and Variance of the Sampling Distribution of the Samp...Elton John Embodo
 
Z and t_tests
Z and t_testsZ and t_tests
Z and t_testseducation
 
Sampling distribution.pptx
Sampling distribution.pptxSampling distribution.pptx
Sampling distribution.pptxssusera0e0e9
 
Chp11 - Research Methods for Business By Authors Uma Sekaran and Roger Bougie
Chp11  - Research Methods for Business By Authors Uma Sekaran and Roger BougieChp11  - Research Methods for Business By Authors Uma Sekaran and Roger Bougie
Chp11 - Research Methods for Business By Authors Uma Sekaran and Roger BougieHassan Usman
 
Sampling Distribution -I
Sampling Distribution -ISampling Distribution -I
Sampling Distribution -ISadam Hussen
 
Sqqs1013 ch6-a122
Sqqs1013 ch6-a122Sqqs1013 ch6-a122
Sqqs1013 ch6-a122kim rae KI
 
Identifying the sampling distribution module5
Identifying the sampling distribution module5Identifying the sampling distribution module5
Identifying the sampling distribution module5REYEMMANUELILUMBA
 
tps5e_Ch10_2.ppt
tps5e_Ch10_2.ppttps5e_Ch10_2.ppt
tps5e_Ch10_2.pptDunakanshon
 

Similaire à Sample sample distribution (20)

Chapter 3 sampling and sampling distribution
Chapter 3   sampling and sampling distributionChapter 3   sampling and sampling distribution
Chapter 3 sampling and sampling distribution
 
7-THE-SAMPLING-DISTRIBUTION-OF-SAMPLE-MEANS-CLT.pptx
7-THE-SAMPLING-DISTRIBUTION-OF-SAMPLE-MEANS-CLT.pptx7-THE-SAMPLING-DISTRIBUTION-OF-SAMPLE-MEANS-CLT.pptx
7-THE-SAMPLING-DISTRIBUTION-OF-SAMPLE-MEANS-CLT.pptx
 
Sampling distribution
Sampling distributionSampling distribution
Sampling distribution
 
Lect w2 measures_of_location_and_spread
Lect w2 measures_of_location_and_spreadLect w2 measures_of_location_and_spread
Lect w2 measures_of_location_and_spread
 
Chapter one on sampling distributions.ppt
Chapter one on sampling distributions.pptChapter one on sampling distributions.ppt
Chapter one on sampling distributions.ppt
 
Lecture-6.pdf
Lecture-6.pdfLecture-6.pdf
Lecture-6.pdf
 
estimation
estimationestimation
estimation
 
Estimation
EstimationEstimation
Estimation
 
Sampling distribution
Sampling distributionSampling distribution
Sampling distribution
 
Lecture 5 Sampling distribution of sample mean.pptx
Lecture 5 Sampling distribution of sample mean.pptxLecture 5 Sampling distribution of sample mean.pptx
Lecture 5 Sampling distribution of sample mean.pptx
 
A study on the ANOVA ANALYSIS OF VARIANCE.pptx
A study on the ANOVA ANALYSIS OF VARIANCE.pptxA study on the ANOVA ANALYSIS OF VARIANCE.pptx
A study on the ANOVA ANALYSIS OF VARIANCE.pptx
 
EMBODO LP Grade 12 Mean and Variance of the Sampling Distribution of the Samp...
EMBODO LP Grade 12 Mean and Variance of the Sampling Distribution of the Samp...EMBODO LP Grade 12 Mean and Variance of the Sampling Distribution of the Samp...
EMBODO LP Grade 12 Mean and Variance of the Sampling Distribution of the Samp...
 
Z and t_tests
Z and t_testsZ and t_tests
Z and t_tests
 
Sampling distribution.pptx
Sampling distribution.pptxSampling distribution.pptx
Sampling distribution.pptx
 
Chap 6
Chap 6Chap 6
Chap 6
 
Chp11 - Research Methods for Business By Authors Uma Sekaran and Roger Bougie
Chp11  - Research Methods for Business By Authors Uma Sekaran and Roger BougieChp11  - Research Methods for Business By Authors Uma Sekaran and Roger Bougie
Chp11 - Research Methods for Business By Authors Uma Sekaran and Roger Bougie
 
Sampling Distribution -I
Sampling Distribution -ISampling Distribution -I
Sampling Distribution -I
 
Sqqs1013 ch6-a122
Sqqs1013 ch6-a122Sqqs1013 ch6-a122
Sqqs1013 ch6-a122
 
Identifying the sampling distribution module5
Identifying the sampling distribution module5Identifying the sampling distribution module5
Identifying the sampling distribution module5
 
tps5e_Ch10_2.ppt
tps5e_Ch10_2.ppttps5e_Ch10_2.ppt
tps5e_Ch10_2.ppt
 

Sample sample distribution

  • 1. SAMPLE AND SAMPLE DISTRIBUTIONS CN303/3/ 1
    UNIT 3: SAMPLE AND SAMPLE DISTRIBUTIONS
    OBJECTIVES
    General Objective
    - To understand the concept of sampling and sample distributions.
    Specific Objectives
    At the end of the unit you should be able to:
    - Define the concept of a sampling distribution, which is the basis of inferential statistics.
    - Express the relationship between sample statistics and population parameters.
    - Explain the concept of the sampling distribution of sample means based on random samples taken with and without replacement from a population.
    - Calculate the mean, variance and standard deviation of the distribution of the sample means taken with or without replacement from a population.
    - State the criterion for large samples (n > 30).
    - Study the characteristics of the distributions of the means of samples taken from a population.
    - Use the central limit theorem to solve probability problems involving the distribution of sample means for large samples.
  • 2. INPUT
    3.0 INTRODUCTION
    As an engineer, you are required to find the mean service life of newly developed light bulbs. One approach is to randomly pick, say, 50 light bulbs from the whole population of a thousand bulbs produced and have them tested. In doing so, you can approximate the mean service life of all the bulbs. This method is known as sampling.
    3.1 SAMPLE DISTRIBUTIONS
    Every sample is a subset of a population. By studying the sample, it is possible to find the characteristics of the sample and from them estimate the characteristics of the whole population. It would be ideal if the sample were a perfect miniature of the population in all characteristics. This ideal, however, is impossible to achieve. The best that can be done is to select a sample that is representative with respect to some characteristics, preferably those pertaining to the study.
    For a sample to be a random sample, every member of the population must have an equal chance of being selected. A sample selected without bias is representative of the population.
  • 3. 3.1.1 SAMPLE STATISTICS AND POPULATION PARAMETERS
    The concept of a probability distribution can be applied to sample statistics. Examples of sample statistics are a measure of central tendency for a given sample, such as the mean $\bar{x}$, or a measure of variation, such as the standard deviation $s$. The population mean $\mu$ and the population standard deviation $\sigma$ are the corresponding measures for the whole population. Below is a table of sample statistics and population parameters:

    Quantity              Sample statistic    Population parameter
    Size                  $n$                 $N$
    Mean                  $\bar{x}$           $\mu$
    Variance              $s^2$               $\sigma^2$
    Standard deviation    $s$                 $\sigma$
    Proportion            $\hat{p}$           $p$

    3.1.2 DISTRIBUTION OF SAMPLE MEANS
    Suppose we select 100 samples of a specific size from a large population and compute the mean of the same variable for each of the 100 samples. The sample means $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_{100}$ constitute a sampling distribution of sample means.
    If the samples are randomly selected with replacement, the sample means will, for the most part, be somewhat different from the population mean $\mu$. These differences are caused by sampling error.
    Properties of the distribution of sample means
    1. The mean of the sample means will be the same as the population mean.
    2. The standard deviation of the sample means will be smaller than the standard deviation of the population, and will be equal to the population standard deviation divided by the square root of the sample size.
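The two properties can be justified by a short calculation. The sketch below assumes the $n$ observations $X_1, \ldots, X_n$ in a sample are drawn independently (that is, with replacement) from a population with mean $\mu$ and variance $\sigma^2$; the symbols $X_i$ are introduced here for illustration only.

```latex
\begin{align*}
\mu_{\bar{x}} &= E\!\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right]
  = \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \frac{n\mu}{n} = \mu, \\
\sigma_{\bar{x}}^{2} &= \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
  = \frac{1}{n^{2}}\sum_{i=1}^{n} \operatorname{Var}(X_i) = \frac{\sigma^{2}}{n},
  \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}.
\end{align*}
```

The variance step relies on independence; when sampling without replacement from a small population the draws are not independent and the result needs a correction.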
  • 4. Example 3.1
    1. Suppose a lecturer gave an eight-point quiz to a small class of four students. The results of the quiz were 2, 6, 4, and 8. Assume the four students constitute the population. Find
       i) the population mean $\mu$ and standard deviation $\sigma$, and draw the graph of the sample means;
       ii) $\mu_{\bar{x}}$ and $\sigma_{\bar{x}}$ of the sample means.
    2. Assume that we have a population consisting of the three numbers 1, 2, and 3. The probability distribution for these numbers is

       x       1      2      3
       P(x)    1/3    1/3    1/3

       Find
       i) the population mean, variance and standard deviation;
       ii) if all samples of size 2 are taken with replacement, and the mean of each sample is found:
          a) the probability distribution for the sample means $\bar{x}$ (draw a table);
          b) the mean of the sample means;
          c) the variance and standard deviation of the sample means.
    Solution to Example 3.1
    1. The mean of the population is
       $\mu = \dfrac{2 + 6 + 4 + 8}{4} = 5$.
       The standard deviation of the population is
       $\sigma = \sqrt{\dfrac{(2-5)^2 + (6-5)^2 + (4-5)^2 + (8-5)^2}{4}} = 2.236$.
  • 5. The graph of the population scores is shown below. Each score occurs once, so the graph is flat.
    [Figure: frequency of each score in the population; every score has frequency 1. Axes: score, frequency.]
    Now, if all samples of size 2 are taken with replacement, and the mean of each sample is found, the distribution is as shown next. (You can draw a tree diagram if you wish.)

    Sample    Mean        Sample    Mean
    2, 2      2           6, 2      4
    2, 4      3           6, 4      5
    2, 6      4           6, 6      6
    2, 8      5           6, 8      7
    4, 2      3           8, 2      5
    4, 4      4           8, 4      6
    4, 6      5           8, 6      7
    4, 8      6           8, 8      8

    A frequency distribution of the sample means is as follows.

    $\bar{x}$    2    3    4    5    6    7    8
    f            1    2    3    4    3    2    1
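The enumeration of samples above can be reproduced programmatically. The following is a minimal sketch that lists every ordered sample of size 2 drawn with replacement from the four quiz scores and tallies the sample means; the variable names are illustrative and not part of the original unit.

```python
from collections import Counter
from itertools import product

scores = [2, 4, 6, 8]  # the four quiz scores, treated as the whole population

# Every ordered sample of size 2 drawn with replacement (4 * 4 = 16 samples).
samples = list(product(scores, repeat=2))

# Mean of each sample, then a count of how often each mean occurs.
sample_means = [sum(s) / len(s) for s in samples]
frequency = Counter(sample_means)

# Print the frequency distribution of the sample means, smallest mean first.
for mean in sorted(frequency):
    print(f"{mean:g}\t{frequency[mean]}")
```

Using `itertools.product` with `repeat=2` treats (2, 4) and (4, 2) as different samples, which matches the table of 16 samples.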
  • 6. Below is the graph of the sample means. The graph appears to be somewhat normal, even though it is a histogram.
    [Figure: histogram of the sample means 2 to 8 with frequencies 1, 2, 3, 4, 3, 2, 1. Axes: sample mean, frequency.]
    The mean of the sample means, denoted by $\mu_{\bar{x}}$, is found by summing all 16 sample means:
    $\mu_{\bar{x}} = \dfrac{2 + 3 + \cdots + 8}{16} = \dfrac{80}{16} = 5$,
    which is the same as the population mean. Hence $\mu_{\bar{x}} = \mu$.
    The standard deviation of the sample means, denoted by $\sigma_{\bar{x}}$, is
    $\sigma_{\bar{x}} = \sqrt{\dfrac{(2-5)^2 + (3-5)^2 + \cdots + (8-5)^2}{16}} = 1.581$,
    which is the same as the population standard deviation divided by $\sqrt{2}$:
    $\sigma_{\bar{x}} = \dfrac{2.236}{\sqrt{2}} = 1.581$.
    Note: if all possible samples of size $n$ are taken with replacement from the same population, the mean of the sample means, $\mu_{\bar{x}}$, equals the population mean $\mu$, and the standard deviation of the sample means, $\sigma_{\bar{x}}$, equals $\sigma/\sqrt{n}$.
    2. i) Population mean:
       $\mu = E(X) = \sum x\,p(x) = 1(1/3) + 2(1/3) + 3(1/3) = \tfrac{1}{3}(1 + 2 + 3) = 6/3 = 2$.
       Population variance: $\sigma^2 = E(X^2) - [E(X)]^2$, where
       $E(X^2) = 1^2(\tfrac{1}{3}) + 2^2(\tfrac{1}{3}) + 3^2(\tfrac{1}{3}) = \tfrac{1}{3}(1 + 4 + 9) = \tfrac{14}{3}$,
       so $\sigma^2 = \tfrac{14}{3} - 2^2 = \tfrac{2}{3}$.
       Therefore $\sigma = \sqrt{2/3}$.
  • 7. ii) The samples of size 2 and their means are:

    Sample            1, 1   1, 2   1, 3   2, 1   2, 2   2, 3   3, 1   3, 2   3, 3
    Mean, $\bar{x}$   1.0    1.5    2.0    1.5    2.0    2.5    2.0    2.5    3.0

    The probability distribution for the sample means $\bar{x}$ is:

    $\bar{x}$      1.0    1.5    2.0    2.5    3.0
    $P(\bar{x})$   1/9    2/9    3/9    2/9    1/9

    You can draw a histogram of the sample means $\bar{x}$ against $P(\bar{x})$ as in Activity 3A, and then find the mean of the sample means:
    $\mu_{\bar{x}} = E(\bar{x}) = \sum \bar{x}\,p(\bar{x}) = 1(\tfrac{1}{9}) + 1.5(\tfrac{2}{9}) + 2(\tfrac{3}{9}) + 2.5(\tfrac{2}{9}) + 3(\tfrac{1}{9}) = \tfrac{18}{9} = 2 = \mu$ (the population mean).
    The variance of the sample means is $\sigma_{\bar{x}}^2 = E(\bar{x}^2) - [E(\bar{x})]^2$, where
    $E(\bar{x}^2) = \sum \bar{x}^2 p(\bar{x}) = 1^2(\tfrac{1}{9}) + 1.5^2(\tfrac{2}{9}) + 2.0^2(\tfrac{3}{9}) + 2.5^2(\tfrac{2}{9}) + 3.0^2(\tfrac{1}{9}) = \tfrac{13}{3}$,
    so $\sigma_{\bar{x}}^2 = \tfrac{13}{3} - 2^2 = \tfrac{1}{3}$.
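  • 8. Standard deviation of the sample means: $\sigma_{\bar{x}} = \sqrt{1/3}$.
    Note that $\sigma_{\bar{x}}^2 = \dfrac{1}{3} = \dfrac{2/3}{2} = \dfrac{\sigma^2}{n}$, with $\sigma^2 = \tfrac{2}{3}$ and $n = 2$, so $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \sqrt{\tfrac{1}{3}}$.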
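Both parts of Example 3.1 follow the same recipe, so they can be checked with one small routine. The sketch below assumes every member of the population is equally likely and enumerates all ordered samples drawn with replacement; the function names are hypothetical helpers written for this illustration.

```python
from itertools import product
from math import sqrt


def sampling_distribution_summary(population, n):
    """Return (mean, standard deviation) of the means of all ordered
    size-n samples drawn with replacement from `population`."""
    means = [sum(s) / n for s in product(population, repeat=n)]
    mu_xbar = sum(means) / len(means)
    var_xbar = sum((m - mu_xbar) ** 2 for m in means) / len(means)
    return mu_xbar, sqrt(var_xbar)


def population_sd(population):
    """Population standard deviation (divides by N, not N - 1)."""
    mu = sum(population) / len(population)
    return sqrt(sum((x - mu) ** 2 for x in population) / len(population))


for population in ([2, 6, 4, 8], [1, 2, 3]):
    mu_xbar, sigma_xbar = sampling_distribution_summary(population, n=2)
    sigma = population_sd(population)
    # The last two columns should agree: sigma_xbar versus sigma / sqrt(n).
    print(population, round(mu_xbar, 3), round(sigma_xbar, 3), round(sigma / sqrt(2), 3))
```

Exhaustive enumeration is only practical for tiny populations and sample sizes, since the number of samples grows as the population size raised to the power n.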
  • 9. ACTIVITY 3A
    TEST YOUR UNDERSTANDING BEFORE PROCEEDING TO THE NEXT INPUT!
    1. Let the population consist of the digits 1, 2 and 3. Find the population mean and the population standard deviation.
    2. 10000 female students are found to have a mean weight of 63 kg with a standard deviation of 7 kg. 100 samples of size 36 are taken, without replacement, from this population. Estimate the mean and standard deviation of the sample means.
  • 10. FEEDBACK TO ACTIVITY 3A
    1. $\mu = 2$, $\sigma = \sqrt{2/3}$
    2. $\mu_{\bar{x}} = 63$ kg and $\sigma_{\bar{x}} = 1.17$ kg
  • 11. INPUT
    3.2 THE CENTRAL LIMIT THEOREM
    As the sample size $n$ increases, the shape of the distribution of the sample means taken with replacement from a population with mean $\mu$ and standard deviation $\sigma$ will approach the normal distribution. As previously shown, this distribution will have mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.
    The central limit theorem can be used to answer questions about sample means in the same manner that the normal distribution can be used to answer questions about individual values. The only difference is that a new formula must be used for the z values:
    $z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
    Notice that $\bar{X}$ is the sample mean, and the denominator is the standard error of the mean.
    It is important to remember two things when using the central limit theorem:
    1. When the original variable is normally distributed, the distribution of the sample means will be normally distributed for any sample size $n$.
    2. When the distribution of the original variable departs from normality, a sample size of 30 or more is needed to use the normal distribution to approximate the distribution of the sample means. The larger the sample, the better the approximation will be.
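One way to see the approach to normality without random simulation is to compute the exact distribution of the sample total for a skewed population and watch its skewness shrink as $n$ grows (a normal distribution has skewness 0). The sketch below uses a made-up three-value population chosen only for illustration; the helper names are also hypothetical. The skewness of the sample total equals the skewness of the sample mean, because dividing by $n$ does not change it.

```python
from collections import defaultdict
from math import sqrt


def convolve(dist_a, dist_b):
    """Exact distribution of A + B for independent discrete A and B."""
    out = defaultdict(float)
    for a, pa in dist_a.items():
        for b, pb in dist_b.items():
            out[a + b] += pa * pb
    return dict(out)


def skewness(dist):
    """Third standardized moment of a discrete distribution {value: prob}."""
    mean = sum(x * p for x, p in dist.items())
    var = sum((x - mean) ** 2 * p for x, p in dist.items())
    third = sum((x - mean) ** 3 * p for x, p in dist.items())
    return third / var ** 1.5


# Hypothetical, strongly right-skewed population (not from the unit).
population = {0: 0.7, 1: 0.2, 10: 0.1}

total = population
skew_by_n = {1: skewness(total)}
for n in range(2, 31):
    total = convolve(total, population)  # distribution of the sum of n draws
    skew_by_n[n] = skewness(total)

for n in (1, 4, 16, 30):
    print(n, round(skew_by_n[n], 3))
```

The skewness falls in proportion to $1/\sqrt{n}$, which is one reason larger samples are needed when the population is far from normal.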
  • 12. NOTE
    Since the sample size is 30 or larger, the normality assumption is not necessary, as in the example above.
    When do we use $z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ or $z = \dfrac{X - \mu}{\sigma}$?
    The formula $z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ should be used to gain information about a sample mean, whereas the formula $z = \dfrac{X - \mu}{\sigma}$ is used to gain information about an individual data value obtained from the population. See the example below.
    Example 3.2
    1. Students in semesters 1 and 2 at polytechnics spend an average of 25 hours a week sleeping. Assume the variable is normally distributed and the standard deviation is 3 hours. If 20 students from semesters 1 and 2 are randomly selected, find the probability that the mean number of hours they sleep will be greater than 26.3 hours.
    2. The average age of motorcycles registered in polytechnics is 8 years, or 96 months. Assume the standard deviation is 16 months. If a random sample of 36 motorcycles is selected, find the probability that the mean of their ages is between 90 and 100 months.
    3. The average number of pounds of meat a person consumes in a year is 218.4 pounds. Assume that the standard deviation is 25 pounds and the distribution is approximately normal.
       i) Find the probability that a person selected at random consumes less than 224 pounds per year.
       ii) If a sample of 40 individuals is selected, find the probability that the mean of the sample will be less than 224 pounds per year.
  • 13. Solution to Example 3.2
    1. Since the variable is approximately normally distributed, the distribution of sample means will be approximately normal, with a mean of 25. The standard deviation of the sample means is
       $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{3}{\sqrt{20}} = 0.671$.
       [Figure: distribution of the sample means, marked at 25 and 26.3, with the appropriate area shaded.]
       The z value is
       $z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \dfrac{26.3 - 25}{3/\sqrt{20}} = \dfrac{1.3}{0.671} = 1.94$.
       The area between 0 and 1.94 is 0.4738. Since the desired area is in the tail, subtract 0.4738 from 0.5000: 0.5000 – 0.4738 = 0.0262, or 2.62%.
       One can conclude that the probability of obtaining a sample mean larger than 26.3 hours is 2.62%, i.e., $P(\bar{X} > 26.3) = 2.62\%$.
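The table lookup in this solution can be cross-checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (available from Python 3.8) in place of a printed z table; the function name `prob_sample_mean_above` is a hypothetical helper for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def prob_sample_mean_above(threshold, mu, sigma, n):
    """Return (z, P(sample mean > threshold)), assuming the sample mean is
    normal with mean mu and standard error sigma / sqrt(n)."""
    standard_error = sigma / sqrt(n)
    z = (threshold - mu) / standard_error
    return z, 1.0 - NormalDist().cdf(z)


# Sleep example: mu = 25 hours, sigma = 3 hours, n = 20 students.
z, p = prob_sample_mean_above(26.3, mu=25, sigma=3, n=20)
print(f"z = {z:.2f}, P = {p:.4f}")
```

The computed probability can differ from the table-based answer in the fourth decimal place, because the table method rounds z to two decimals before the lookup.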
  • 14. 2. The desired area is shown in the figure below:
       [Figure: distribution of the sample means, marked at 90, 96 and 100.]
       The two z values are
       $z_1 = \dfrac{90 - 96}{16/\sqrt{36}} = -2.25$ and $z_2 = \dfrac{100 - 96}{16/\sqrt{36}} = 1.50$.
       The two areas corresponding to the z values of -2.25 and 1.50 are 0.4878 and 0.4332, respectively. Since the z values are on opposite sides of the mean, find the probability by adding the areas: 0.4878 + 0.4332 = 0.9210, or 92.1%.
       Hence, the probability of obtaining a sample mean between 90 and 100 months is 92.1%, i.e., $P(90 < \bar{X} < 100) = 92.1\%$.
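A probability between two values of the sample mean is the difference of two cumulative probabilities. As a rough check of the motorcycle example, the sketch below models the sample mean directly as a normal distribution with standard deviation $\sigma/\sqrt{n}$; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 96, 16, 36  # months, months, sample size

# Sampling distribution of the mean: normal with standard error sigma / sqrt(n).
sampling_dist = NormalDist(mu=mu, sigma=sigma / sqrt(n))

# P(90 < sample mean < 100) as a difference of two CDF values.
p_between = sampling_dist.cdf(100) - sampling_dist.cdf(90)
print(f"P(90 < mean < 100) = {p_between:.4f}")
```

Passing the standard error as `sigma` avoids converting to z values by hand, although the two routes are equivalent.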
  • 15. 3. (i) Since the question asks about an individual person, the formula $z = \dfrac{X - \mu}{\sigma}$ is used. The distribution is shown in the figure below.
       [Figure: distribution of individual data values for the population, marked at 218.4 and 224.]
       The z value is
       $z = \dfrac{X - \mu}{\sigma} = \dfrac{224 - 218.4}{25} = 0.22$.
       The area between 0 and 0.22 is 0.0871; this area must be added to 0.5000 to get the total area to the left of z = 0.22:
       0.0871 + 0.5000 = 0.5871
       Hence, the probability of selecting an individual who consumes less than 224 pounds of meat per year is 0.5871, or 58.71%, i.e., $P(X < 224) = 0.5871$.
  • 16. (ii) Since the question concerns the mean of a sample of size 40, the formula $z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ is used. The area is shown in the figure below:
       [Figure: distribution of the sample means, marked at 218.4 and 224.]
       The z value is
       $z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \dfrac{224 - 218.4}{25/\sqrt{40}} = 1.42$.
       The area between z = 0 and z = 1.42 is 0.4222; this value must be added to 0.5000 to get the total area:
       0.4222 + 0.5000 = 0.9222
       Hence, the probability that the mean of a sample of 40 individuals is less than 224 pounds per year is 0.9222, or 92.22%. That is, $P(\bar{X} < 224) = 0.9222$.
  • 17. Comparing the two probabilities, one can see that the probability of selecting an individual who consumes less than 224 pounds of meat per year is 58.71%, but the probability of selecting a sample of 40 people with a mean consumption of meat that is less than 224 pounds per year is 92.22%. This rather large difference is due to the fact that the distribution of sample means is much less variable than the distribution of individual data values.
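The contrast between an individual value and a sample mean can be made concrete by evaluating both probabilities side by side. This sketch assumes, as the example does, that consumption is approximately normal with mean 218.4 and standard deviation 25 pounds; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 218.4, 25  # pounds per year

individual = NormalDist(mu, sigma)             # one person chosen at random
mean_of_40 = NormalDist(mu, sigma / sqrt(40))  # mean of a sample of 40 people

p_individual = individual.cdf(224)
p_mean = mean_of_40.cdf(224)

print(f"P(X < 224)     = {p_individual:.4f}")
print(f"P(X-bar < 224) = {p_mean:.4f}")
```

The same cutoff of 224 pounds sits much further out in the narrower sampling distribution, so far more of the probability lies below it. Small differences from the table-based answers come from rounding z to two decimals.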
  • 18. ACTIVITY 3B
    TEST YOUR UNDERSTANDING BEFORE PROCEEDING TO THE NEXT INPUT!
    1. The average salary for workers at an electronics factory is RM13.50 per hour. Assume that the standard deviation is RM2.90 per hour and the distribution is approximately normal. If $\bar{X}$ is the mean salary per hour for a random sample of the workers at the factory, find the mean and standard deviation of the sampling distribution of $\bar{X}$ if the sample size is (a) 30 workers, and (b) 75 workers.
    2. The average weight of sugar sachets is 32 grams. Assume the standard deviation is 0.3 gram. If a random sample of 20 sachets is selected, find the probability that the mean of their weights is between 31.8 and 31.9 grams.
    3. Analysis of 150 compressive strength results gave a mean strength of 32 N/mm² and a standard deviation of 6.5 N/mm². Given that 10 samples of 12 results are considered, find the number of samples with mean strength greater than 33 N/mm².
    4. Asbestos-cement sheets are manufactured with a mean length of 2400 mm and a standard deviation of 3 mm. Given that 20 batches consisting of 3 dozen sheets are considered, determine (a) the probability that a batch (chosen at random) has a mean length between 2399.5 mm and 2400.6 mm, and (b) the number of batches with mean length less than 2399.3 mm.
  • 19. FEEDBACK TO ACTIVITY 3B
    1. a) $\mu_{\bar{x}} = \mu = \text{RM}13.50$, $\sigma_{\bar{x}} = 0.53$
       b) $\mu_{\bar{x}} = \mu = \text{RM}13.50$, $\sigma_{\bar{x}} = 0.33$
    2. 0.0667 or 6.67%
    3. $\mu_{\bar{x}} = \mu = 32$ N/mm², $\sigma_{\bar{x}} = 1.81$ N/mm², $P(\bar{x} > 33) = 0.2912$, so about 3 samples
    4. (a) $P(2399.5 < \bar{x} < 2400.6) = 0.7262$
       (b) $P(\bar{x} < 2399.3) = 0.0808$, so about 2 of the 20 batches
  • 20. INPUT
    3.3 DISTRIBUTION OF THE SAMPLE MEANS
    a. Distribution of the sample means with replacement
    Statement 1
    If the samples are taken with replacement from a normally distributed population with known mean $\mu$ and standard deviation $\sigma$, the distribution of the sample means $\bar{X}$ is normal regardless of the sample size $n$. As previously shown, this distribution has mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.
    Statement 2
    If the sample is taken from any population with known $\mu$ and $\sigma$, and the sample size is large ($n \ge 30$), the distribution of the sample mean is approximately normal with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$, that is, $\bar{x} \sim N(\mu, \sigma^2/n)$.
    b. Distribution of the sample means without replacement
    The formula for the standard error of the mean, $\sigma/\sqrt{n}$, is accurate when the samples are drawn with replacement, or without replacement from a very large or infinite population. Since sampling with replacement is for the most part unrealistic, a correction factor is necessary for computing the standard error of the mean for samples drawn without replacement from a finite population. Compute the correction factor by using the following formula:
  • 21. $\sqrt{\dfrac{N - n}{N - 1}}$
    where $N$ is the population size and $n$ is the sample size.
    This correction factor is necessary if relatively large samples are taken from a small population, because the sample mean will then estimate the population mean more accurately and there will be less error in the estimation. Therefore, the standard error of the mean must be multiplied by the correction factor to adjust it for large samples taken from a small population. That is,
    $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}}$
    Finally, the formula for the z value becomes
    $z = \dfrac{\bar{X} - \mu}{\dfrac{\sigma}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}}}$
    When the population is large and the sample is small, the correction factor is generally not used, since it will be very close to 1.000. In that case $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$.
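The effect of the correction factor is easy to compute directly. The sketch below is a hypothetical helper, `standard_error`, that applies the factor only when a population size is supplied; it is evaluated with the figures from Activity 3A, question 2 ($\sigma = 7$ kg, $n = 36$, $N = 10000$).

```python
from math import sqrt


def standard_error(sigma, n, population_size=None):
    """Standard error of the mean. If `population_size` is given, apply the
    finite population correction sqrt((N - n) / (N - 1)) for sampling
    without replacement; otherwise return sigma / sqrt(n)."""
    se = sigma / sqrt(n)
    if population_size is not None:
        se *= sqrt((population_size - n) / (population_size - 1))
    return se


se_plain = standard_error(7, 36)                         # with replacement
se_fpc = standard_error(7, 36, population_size=10_000)   # without replacement

print(f"without correction: {se_plain:.4f}")
print(f"with correction:    {se_fpc:.4f}")
```

With a sample of 36 from 10000 the two values agree to two decimal places, which illustrates why the correction is usually dropped when the sample is a small fraction of the population.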
  • 22. Example 3.3
    1. The average price of houses in Jitra is RM157000, and the distribution of prices is rather skewed. Assume the standard deviation is RM29500. If $\bar{x}$ is the mean price for a sample of 400 houses selected at random, find the probability
       a) that the sample mean is between RM154000 and RM160000;
       b) that the mean price for this sample is below RM154000.
    2. The average time taken by line workers in an electronics firm to assemble the electronic components is 80 hours, with a standard deviation of 8 hours. Find the following probabilities for the mean assembly time if a random sample of 16 workers is selected.
       a. $P(78 \le \bar{x} \le 82)$
       b. $P(76 \le \bar{x} \le 84)$
       c. $P(74 \le \bar{x} \le 86)$
    3. The average service life of 400 batteries is 800 hours with a standard deviation of 45. If a random sample of 45 batteries is selected, what is the probability that the sample mean is between 790 and 810 hours?
    4. The data show the number of children belonging to a group of 50 polytechnic lecturers.

       No. of children     0    1     2     3    4
       No. of lecturers    1    18    24    4    3

       a. Find the mean and the standard deviation of the data above.
       b. If a sample of 10 lecturers is taken, find the probability that the mean number of children in this sample is more than 2.
  • 23. Solution to Example 3.3
    1. Although the price of houses in Jitra is skewed and not normally distributed, the sample mean price is approximately normal because of the large sample size (n = 400). Therefore the central limit theorem is applicable. Given $\mu$ = RM157000 and $\sigma$ = RM29500:
       $\mu_{\bar{x}} = \mu = \text{RM}157000$
       $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{29500}{\sqrt{400}} = \text{RM}1475$
       Therefore $\bar{x} \sim N(157000, 1475^2)$.
       a. $P(154000 \le \bar{x} \le 160000) = P\!\left(\dfrac{154000 - 157000}{1475} \le \dfrac{\bar{x} - 157000}{1475} \le \dfrac{160000 - 157000}{1475}\right) = P(-2.03 \le Z \le 2.03) = 0.9576$
       b. $P(\bar{x} < 154000) = P\!\left(\dfrac{\bar{x} - 157000}{1475} < \dfrac{154000 - 157000}{1475}\right) = P(Z < -2.03) = 0.0212$
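The house-price probabilities can be checked numerically as a rough guide. The sketch below relies on the central limit theorem approximation used in the solution (n = 400 is large, so the sample mean is treated as normal); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 157_000, 29_500, 400  # RM, RM, sample size

# Approximate sampling distribution of the mean price (standard error = 1475).
xbar = NormalDist(mu, sigma / sqrt(n))

p_between = xbar.cdf(160_000) - xbar.cdf(154_000)  # part (a)
p_below = xbar.cdf(154_000)                        # part (b)

print(f"standard error = {xbar.stdev:.0f}")
print(f"P(154000 <= mean <= 160000) = {p_between:.4f}")
print(f"P(mean < 154000)            = {p_below:.4f}")
```

Because the interval in part (a) is symmetric about the mean, the two answers are linked: the probability in (a) equals one minus twice the probability in (b).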
  • 24. 2. Although the sample size is small (n = 16), the time taken to assemble the components is normally distributed. Therefore the sample mean is normally distributed with mean $\mu_{\bar{x}} = 80$ hours and standard deviation $\sigma_{\bar{x}} = \dfrac{8}{\sqrt{16}} = 2$ hours.
       a. $P(78 \le \bar{x} \le 82) = P\!\left(\dfrac{78 - 80}{2} \le \dfrac{\bar{x} - 80}{2} \le \dfrac{82 - 80}{2}\right) = P(-1 \le Z \le 1) = 0.6826$
       b. $P(76 \le \bar{x} \le 84) = P\!\left(\dfrac{76 - 80}{2} \le \dfrac{\bar{x} - 80}{2} \le \dfrac{84 - 80}{2}\right) = P(-2 \le Z \le 2) = 0.9544$
       c. $P(74 \le \bar{x} \le 86) = P\!\left(\dfrac{74 - 80}{2} \le \dfrac{\bar{x} - 80}{2} \le \dfrac{86 - 80}{2}\right) = P(-3 \le Z \le 3) = 0.9974$
    3. The probability that the sample mean is between 790 and 810 hours is 0.9066.
    4. The probability distribution is:

       No. of children (x)            0       1       2       3       4
       Relative frequency, p(x)       0.02    0.36    0.48    0.08    0.06

       a) $\mu = \sum x\,p(x) = 0(0.02) + 1(0.36) + 2(0.48) + 3(0.08) + 4(0.06) = 1.8$
          $\sigma^2 = \sum x^2 p(x) - \mu^2 = 0^2(0.02) + 1^2(0.36) + 2^2(0.48) + 3^2(0.08) + 4^2(0.06) - 1.8^2 = 3.96 - 3.24 = 0.72$
          $\sigma = \sqrt{0.72} = 0.8485$
  • 25. b) Because the population is finite (N = 50) and the 10 lecturers are selected without replacement, the correction factor is applied. The sampling distribution of the sample mean is taken to be approximately normal with $\mu_{\bar{x}} = 1.8$ and
       $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} \sqrt{\dfrac{N - n}{N - 1}} = \dfrac{0.8485}{\sqrt{10}} \sqrt{\dfrac{50 - 10}{50 - 1}} = 0.2424$
       Therefore $\bar{x} \sim N(1.8, 0.2424^2)$ and
       $P(\bar{x} > 2) = P\!\left(\dfrac{\bar{x} - 1.8}{0.2424} > \dfrac{2 - 1.8}{0.2424}\right) = P(Z > 0.83) = 0.2033$
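The whole of question 4 can be traced from the frequency table to the final probability in a few lines. The sketch below is an illustration under the same assumptions as the solution (normal approximation for the sample mean, finite population correction for sampling without replacement); the variable names are not part of the original unit.

```python
from math import sqrt
from statistics import NormalDist

# Frequency table: number of children -> number of lecturers.
children = {0: 1, 1: 18, 2: 24, 3: 4, 4: 3}

N = sum(children.values())                             # population size
mu = sum(x * f for x, f in children.items()) / N       # population mean
variance = sum(x**2 * f for x, f in children.items()) / N - mu**2
sigma = sqrt(variance)                                 # population std dev

n = 10                                                 # sample size
se = sigma / sqrt(n) * sqrt((N - n) / (N - 1))         # corrected standard error

# P(sample mean > 2) under the normal approximation.
p = 1 - NormalDist(mu, se).cdf(2)

print(f"mu = {mu:.2f}, sigma = {sigma:.4f}, se = {se:.4f}, P = {p:.4f}")
```

The probability printed here uses the unrounded z value, so it can differ slightly from a table lookup at z = 0.83.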
  • 26. ACTIVITY 3C
    TEST YOUR UNDERSTANDING BEFORE PROCEEDING TO THE NEXT INPUT!
    1. The heights of 2500 men are normally distributed with a mean of 170 cm and a standard deviation of 7 cm. If random samples of 30 men are taken, predict the standard deviation and the mean of the sampling distribution of means, if sampling is done (a) with replacement, and (b) without replacement.
    2. A group of 1000 ingots of metal have a mean mass of 7.4 kg and a standard deviation of 0.4 kg. Find the probability that a sample of 50 ingots chosen at random from the group, without replacement, will have a combined mass of (a) between 360 and 377.5 kg, and (b) more than 375 kg.
    3. Determine the mean and standard deviation of the set of numbers 1, 2, 4, 5, and 6, correct to three decimal places. By selecting all possible different samples of size 2 which can be drawn with replacement (25 pairs), determine (a) the mean of the sampling distribution of means, and (b) the standard error of the means, correct to three decimal places.
    4. Determine the standard error of the means for problem 3, if sampling is without replacement, correct to three significant figures.
    5. The length of 1500 bolts is normally distributed with a mean of 22.4 cm and a standard deviation of 0.048 cm. If 30 samples are drawn at random from this population, each of size 36 bolts, determine the mean of the sampling distribution and the standard error of the means when sampling is done with replacement.
    6. Determine the standard error of the means in problem 5, if sampling is done without replacement, correct to 4 decimal places.
  • 27. 7. If a random sample of 64 lamps is drawn from a batch, determine the probability that the mean time to failure will be less than 785 hours, correct to 3 decimal places.
    8. Determine the probability that the mean time to failure of a random sample of 16 lamps will be between 790 hours and 810 hours, correct to 3 decimal places.
    9. For a random sample of 64 lamps, determine the probability that the mean time to failure will exceed 820 hours, correct to 2 significant figures.
  • 28. FEEDBACK TO ACTIVITY 3C
    1. (a) $\sigma_{\bar{x}} = 1.278$ cm, $\mu_{\bar{x}} = 170$ cm
       (b) $\sigma_{\bar{x}} = 1.271$ cm, $\mu_{\bar{x}} = 170$ cm
    2. The mean of the sampling distribution of means is $\mu_{\bar{x}} = \mu = 7.4$ kg.
       The standard error of the means is $\sigma_{\bar{x}} = 0.0552$ kg.
       (a) 0.9966 (b) 0.0351
    3. $\mu = 3.600$, $\sigma = 1.855$, (a) $\mu_{\bar{x}} = 3.600$, (b) $\sigma_{\bar{x}} = 1.312$
    4. $\sigma_{\bar{x}} = 1.136$
    5. $\mu_{\bar{x}} = 22.4$ cm, $\sigma_{\bar{x}} = 0.008$ cm
    6. $\sigma_{\bar{x}} = 0.0079$ cm
    7. 0.023
    8. 0.497
    9. 0.0038
  • 29. SELF ASSESSMENT 3
    You are approaching success. Try all the questions in this self-assessment section and check your answers on the next page. If you encounter any problems, consult your instructor. Good luck.
    1. If samples of a specific size are selected from a population and the means are computed, what is this distribution of means called?
    2. What is the mean of the sample means?
    3. What does the central limit theorem say about the shape of the distribution of sample means?
    4. What formula is used to gain information about a sample mean when the variable is normally distributed or when the sample size is 30 or more?
    For the exercises below, assume that the sample is taken from a large population and the correction factor can be ignored.
    5. The mean serum cholesterol of a large population of overweight adults is 220 mg/dl and the standard deviation is 16.3 mg/dl. If a sample of adults is selected, find the probability that the mean will be between 220 and 222 mg/dl.
    6. The mean weight of 18-year-old females is 126 pounds, and the standard deviation is 15.7. If a sample of 25 females is selected, find the probability that the mean of the sample will be greater than 128.3 pounds. Assume the variable is normally distributed.
  • 30. 7. The average price of a pound of sliced bacon is RM2.02. Assume the standard deviation is RM0.08. If a random sample of 40 one-pound packages is selected, find the probability that the mean of the sample will be less than RM2.00.
    8. The mean score on a dexterity test for 12-year-olds is 30. The standard deviation is [not stated]. If a psychologist administers the test to a class of 22 students, find the probability that the mean of the sample will be between 27 and 31. Assume the variable is normally distributed.
    9. The average age of lawyers is 43.6 years, with a standard deviation of 5.1 years. If a law firm employs 50 lawyers, find the probability that the average age of the group is greater than 44.2 years.
    10. Procter & Gamble reported that an American family of four washes an average of one ton (2000 pounds) of clothes each year. If the standard deviation of the distribution is 187.5 pounds, find the probability that the mean of a randomly selected sample of 50 families of four will be between 1980 and 1990 pounds.
    11. The average time it takes a group of adults to complete a certain achievement test is 46.2 minutes. The standard deviation is 8 minutes. Assume the variable is normally distributed.
       a) Find the probability that a randomly selected adult will complete the test in less than 43 minutes.
       b) Find the probability that, if 50 randomly selected adults take the test, the mean time it takes the group to complete the test will be less than 43 minutes.
       c) Does it seem reasonable that an adult would finish the test in less than 43 minutes? Explain.
       d) Does it seem reasonable that the mean of the 50 adults could be less than 43 minutes?
  • 31. 12. The average cholesterol content of a certain brand of eggs is 215 milligrams and the standard deviation is 15 milligrams. Assume the variable is normally distributed.
       a) If a single egg is selected, find the probability that the cholesterol content will be more than 220 milligrams.
       b) If a sample of eggs is selected, find the probability that the mean of the sample will be larger than 220 milligrams.
    13. The average labour cost for car repairs for a large chain of car repair shops is RM48.25. The standard deviation is RM4.20. Assume the variable is normally distributed.
       (a) If a store is selected at random, find the probability that the labour cost will range between RM46 and RM48.
       (b) If stores are selected at random, find the probability that the mean of the sample will be between RM46 and RM48.
       (c) Which answer is larger? Explain why.
  • 32. FEEDBACK TO SELF-ASSESSMENT 3
    Have you tried the questions? If "YES", check your answers now.
    1. The distribution is called the sampling distribution of sample means.
    2. The mean of the sample means is equal to the population mean.
    3. The distribution will be approximately normal when the sample size is large.
    4. $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
    5. 0.2486
    6. 0.2327
    7. 0.0571
    8. 0.8239
    9. 0.2033
    10. 0.1254
    11. a) 0.3446 b) 0.0023 c) Yes, since it is within one standard deviation of the mean. d) Very unlikely.
    12. a) 0.3707 b) 0.0475
    13. a) 0.1815 b) 0.3854 c) Means are less variable than individual data.