Sample sample distribution

SAMPLE AND SAMPLE DISTRIBUTIONS

CN303/3/


UNIT 3


OBJECTIVES

General Objective
- To understand the concept of sampling and sampling distributions.

Specific Objectives
At the end of the unit you should be able to:
- Define the concept of a sampling distribution, which is the basis of
  inferential statistics.
- Express the relationship between sample statistics and population
  parameters.
- Explain the concept of the sampling distribution of sample means based on
  random samples taken with and without replacement from a population.
- Calculate the mean, variance and standard deviation of the distribution of
  sample means taken with or without replacement from a population.
- State the criterion for large samples (n ≥ 30).
- Describe the characteristics of the distribution of the means of samples
  taken from a population.
- Use the central limit theorem to solve probability problems involving the
  distribution of sample means for large samples.



INPUT

3.0 INTRODUCTION

As an engineer, you may be required to find the mean service life of newly
developed light bulbs. One approach is to pick, say, 50 light bulbs at random
from the whole population of a thousand bulbs produced and have them tested.
The mean of the tested bulbs then serves as an estimate of the mean for all
the bulbs. This method is known as sampling.

3.1 SAMPLE DISTRIBUTIONS

Every sample is a subset of a population. By studying the sample, it is
possible to find the characteristics of the sample and from them estimate the
characteristics of the whole population. It would be ideal if the sample were a
perfect miniature of the population in all characteristics. This ideal, however,
is impossible to achieve. The best that can be done is to select a sample that
is representative with respect to some characteristics, preferably those
pertaining to the study.
For a sample to be a random sample, every member of the population must have
an equal chance of being selected. A sample selected in this way, without
bias, can be treated as representative of the population.



3.1.1 SAMPLE STATISTICS AND POPULATION PARAMETERS
The concept of a probability distribution can be applied to sample statistics.
A sample statistic is a measure computed from a given sample, such as the mean
($\bar{x}$) for central tendency or the standard deviation ($s$) for variation.
The population mean, $\mu$, and the population standard deviation, $\sigma$,
are the corresponding measures of central tendency and variation for the
population. Below is a table of sample statistics and population parameters:

Quantity             Sample statistic   Population parameter
Size                 $n$                $N$
Mean                 $\bar{x}$          $\mu$
Variance             $s^2$              $\sigma^2$
Standard deviation   $s$                $\sigma$
Proportion           $\hat{p}$          $p$


3.1.2 DISTRIBUTION OF SAMPLE MEANS
Suppose we select 100 samples of a specific size from a large population and
compute the mean of the same variable for each of the 100 samples. The sample
means, $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_{100}$, constitute a sampling
distribution of sample means.
If the samples are randomly selected with replacement, the sample means will,
for the most part, be somewhat different from the population mean $\mu$. These
differences are caused by sampling error.
Properties of the distribution of sample means
1. The mean of the sample means will be the same as the population mean.
2. The standard deviation of the sample means will be smaller than the
   standard deviation of the population, and will be equal to the population
   standard deviation divided by the square root of the sample size.


Example 3.1
1. Suppose a lecturer gave an eight-point quiz to a small class of four
   students. The results of the quiz were 2, 6, 4, and 8. Assume the four
   students constitute the population.
   Find i) the population mean $\mu$ and standard deviation $\sigma$;
        ii) the mean $\mu_{\bar{x}}$ and standard deviation $\sigma_{\bar{x}}$
            of the sample means for all samples of size 2 taken with
            replacement, and draw the graph of the sample means.

2. Assume that we have a population consisting of three numbers 1, 2, and 3.
   The probability distribution for these numbers is

   x      1     2     3
   P(x)   1/3   1/3   1/3

   Find
   i) the population mean, variance and standard deviation;
   ii) if all samples of size 2 are taken with replacement, and the mean of
       each sample is found:
       a) the probability distribution of the sample means $\bar{x}$, shown
          in a table
       b) the mean of the sample means
       c) the variance and standard deviation of the sample means

Solution to Example 3.1
1. The mean of the population is

   $\mu = \frac{2 + 6 + 4 + 8}{4} = 5$

   The standard deviation of the population is

   $\sigma = \sqrt{\frac{(2-5)^2 + (6-5)^2 + (4-5)^2 + (8-5)^2}{4}} = 2.236$

   Below is the graph of the population scores. Each of the four scores
   occurs once.

   [Figure: frequency against score for the population; every score has a
   frequency of 1.]

Now, if all samples of size 2 are taken with replacement, and the mean of each
sample is found, the distribution is shown next. (You can draw a tree diagram if
you wish)
Sample   Mean     Sample   Mean
2, 2     2        6, 2     4
2, 4     3        6, 4     5
2, 6     4        6, 6     6
2, 8     5        6, 8     7
4, 2     3        8, 2     5
4, 4     4        8, 4     6
4, 6     5        8, 6     7
4, 8     6        8, 8     8

A frequency distribution of sample means is as follows.

$\bar{x}$   2   3   4   5   6   7   8
f           1   2   3   4   3   2   1

Below is the graph of the sample means. The histogram appears to be somewhat
normal in shape.

[Figure: histogram of frequency (0 to 5) against sample mean (2 to 8).]

The mean of the sample means, denoted by $\mu_{\bar{x}}$, is

$\mu_{\bar{x}} = \frac{2 + 3 + \cdots + 8}{16} = \frac{80}{16} = 5$

which is the same as the population mean. Hence $\mu_{\bar{x}} = \mu$.
The standard deviation of the sample means, denoted by $\sigma_{\bar{x}}$, is

$\sigma_{\bar{x}} = \sqrt{\frac{(2-5)^2 + (3-5)^2 + \cdots + (8-5)^2}{16}} = 1.581$

which is the same as the population standard deviation divided by $\sqrt{2}$:

$\sigma_{\bar{x}} = \frac{2.236}{\sqrt{2}} = 1.581$

Note: if all possible samples of size n are taken with replacement from the
same population, the mean of the sample means, denoted by $\mu_{\bar{x}}$,
equals the population mean $\mu$; and the standard deviation of the sample
means, denoted by $\sigma_{\bar{x}}$, equals $\sigma / \sqrt{n}$.
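The enumeration in part 1 can be checked with a few lines of Python. This is only a sketch for illustration: the variable names are our own, and it uses the standard-library `statistics.pstdev` (population standard deviation, dividing by the number of values) to match the formulas above.

```python
from itertools import product
from math import isclose, sqrt
from statistics import mean, pstdev

population = [2, 6, 4, 8]   # quiz scores, treated as the whole population
n = 2                       # sample size

# Every ordered sample of size n drawn with replacement: 4**2 = 16 samples.
sample_means = [mean(sample) for sample in product(population, repeat=n)]

mu = mean(population)             # population mean
sigma = pstdev(population)        # population standard deviation
mu_xbar = mean(sample_means)      # mean of the sample means
sigma_xbar = pstdev(sample_means) # standard deviation of the sample means

print(len(sample_means), mu, round(sigma, 3))
print(mu_xbar, round(sigma_xbar, 3))
# The two routes to the standard error agree:
print(isclose(sigma_xbar, sigma / sqrt(n)))
```

Changing `population` or `n` repeats the same check for other small populations, such as the one in part 2.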
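2. i) Population mean:

   $\mu = E(X) = \sum x\,p(x) = 1(\tfrac{1}{3}) + 2(\tfrac{1}{3}) + 3(\tfrac{1}{3}) = \tfrac{1}{3}(1 + 2 + 3) = \tfrac{6}{3} = 2$

   Population variance: $\sigma^2 = E(X^2) - [E(X)]^2$, where

   $E(X^2) = 1^2(\tfrac{1}{3}) + 2^2(\tfrac{1}{3}) + 3^2(\tfrac{1}{3}) = \tfrac{1}{3}(1 + 4 + 9) = \tfrac{14}{3}$

   so that

   $\sigma^2 = \tfrac{14}{3} - 2^2 = \tfrac{2}{3}$

   Therefore the population standard deviation is

   $\sigma = \sqrt{\tfrac{2}{3}} \approx 0.816$
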

ii) The possible samples of size 2 and their means are:

Sample   Mean, $\bar{x}$
1, 1     1.0
1, 2     1.5
1, 3     2.0
2, 1     1.5
2, 2     2.0
2, 3     2.5
3, 1     2.0
3, 2     2.5
3, 3     3.0

a) The probability distribution of the sample means $\bar{x}$:

$\bar{x}$      1.0   1.5   2.0   2.5   3.0
$P(\bar{x})$   1/9   2/9   3/9   2/9   1/9


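You can draw a histogram of $P(\bar{x})$ against the sample means $\bar{x}$, as
in Activity 3A.
b) The mean of the sample means is

$\mu_{\bar{x}} = E(\bar{x}) = \sum \bar{x}\,p(\bar{x}) = 1(\tfrac{1}{9}) + 1.5(\tfrac{2}{9}) + 2(\tfrac{3}{9}) + 2.5(\tfrac{2}{9}) + 3(\tfrac{1}{9}) = \tfrac{18}{9} = 2 = \mu$ (the population mean)

c) The variance of the sample means is
$\sigma_{\bar{x}}^2 = E(\bar{x}^2) - [E(\bar{x})]^2$, where

$E(\bar{x}^2) = \sum \bar{x}^2 p(\bar{x}) = 1^2(\tfrac{1}{9}) + 1.5^2(\tfrac{2}{9}) + 2.0^2(\tfrac{3}{9}) + 2.5^2(\tfrac{2}{9}) + 3.0^2(\tfrac{1}{9}) = \tfrac{13}{3}$

so that

$\sigma_{\bar{x}}^2 = \tfrac{13}{3} - 2^2 = \tfrac{1}{3}$

The standard deviation of the sample means is
$\sigma_{\bar{x}} = \sqrt{\tfrac{1}{3}} \approx 0.577$.
Note that this agrees with $\sigma_{\bar{x}} = \sigma / \sqrt{n}$: with
$\sigma^2 = \tfrac{2}{3}$ and $n = 2$,

$\sigma_{\bar{x}} = \sqrt{\frac{\sigma^2}{n}} = \sqrt{\frac{2/3}{2}} = \sqrt{\tfrac{1}{3}}$
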

ACTIVITY 3A

TEST YOUR UNDERSTANDING BEFORE PROCEEDING TO THE NEXT
INPUT…!
1. Let the population consist of the digits 1, 2 and 3. Find the population
mean and the population standard deviation.
2. 10000 female students are found to have a mean weight of 63 kg with a
standard deviation of 7 kg. 100 samples of size 36 are taken, without
replacement, from the above. Estimate the mean and standard deviation
of the sample-means.


FEEDBACK TO ACTIVITY 3A

1. $\mu = 2$, $\sigma = \sqrt{2/3}$

2. $\mu_{\bar{x}} = 63$ kg and $\sigma_{\bar{x}} = 1.17$ kg


INPUT

3.2 THE CENTRAL LIMIT THEOREM

As the sample size n increases, the shape of the distribution of the sample
means taken with replacement from a population with mean $\mu$ and standard
deviation $\sigma$ will approach the normal distribution. As previously shown,
this distribution will have a mean $\mu$ and a standard deviation
$\sigma / \sqrt{n}$.
The central limit theorem can be used to answer questions about sample means
in the same manner that the normal distribution can be used to answer questions
about individual values. The only difference is that a new formula must be used
for the z values:

$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$

Notice that $\bar{X}$ is the sample mean, and the denominator is the standard
error of the mean. It is important to remember two things when using the
central limit theorem:

1. When the original variable is normally distributed, the distribution of the
   sample means will be normally distributed, for any sample size n.
2. When the distribution of the original variable departs from normality, a
   sample size of 30 or more is needed to use the normal distribution to
   approximate the distribution of the sample means. The larger the sample,
   the better the approximation will be.
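The second point can be illustrated with a small simulation. The sketch below is an illustration only, not part of the unit's method: the exponential population, the number of samples drawn and the seed are all our own assumptions. It draws many samples of size 30 from a strongly skewed population and compares the mean and spread of the sample means with $\mu$ and $\sigma / \sqrt{n}$.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)        # fixed seed so the run is repeatable

n = 30                # size of each sample
num_samples = 5000    # how many samples to draw (arbitrary choice)

# Assumed population: exponential with rate 1, which is strongly skewed
# and has mean 1 and standard deviation 1.
mu, sigma = 1.0, 1.0

sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

standard_error = sigma / sqrt(n)

# Share of sample means within one standard error of mu; for a normal
# distribution this share is about 0.68.
within_one = sum(abs(m - mu) <= standard_error for m in sample_means) / num_samples

print(round(mean(sample_means), 3), round(stdev(sample_means), 3))
print(round(standard_error, 3), round(within_one, 3))
```

Although the individual values are far from normal, the sample means cluster around $\mu$ with a spread close to $\sigma / \sqrt{n}$.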
……………………………………………………………………………………………..



NOTE
When the sample size is 30 or larger, the normality assumption is not
necessary. When do we use $z = \frac{X - \mu}{\sigma}$ and when
$z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$?

The formula $z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$ should be used to gain
information about a sample mean, whereas the formula
$z = \frac{X - \mu}{\sigma}$ is used to gain information about an individual
data value obtained from the population. See the example below.
……………………………………………………………………………………………

Example 3.2
1. Students in semester 1 and 2 in Polytechnics spend an average of 25
hours sleeping in a week. Assume the variable is normally distributed and
the standard deviation is 3 hours. If 20 students from semester 1 and 2
are randomly selected, find the probability that the mean of the number of
hours they sleep will be greater than 26.3 hours.
2. The average age of motorcycles registered in polytechnics is 8 years, or
96 months. Assume the standard deviation is 16 months. If a random
sample of 36 motorcycles is selected, find the probability that the mean of
their ages is between 90 and 100 months.
3. The average number of pounds of meat a person consumes a year is
218.4 pounds. Assume that the standard deviation is 25 pounds and the
distribution is approximately normal.
   i) Find the probability that a person selected at random consumes less
      than 224 pounds per year.
   ii) If a sample of 40 individuals is selected, find the probability that
       the mean of the sample will be less than 224 pounds per year.



Solution to Example 3.2
1. Since the variable is normally distributed, the distribution of sample
   means will also be normal, with a mean of 25. The standard deviation of the
   sample means is

   $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{20}} = 0.671$

   [Figure: normal curve of the sample means with 25 and 26.3 marked on the
   axis and the area to the right of 26.3 shaded.]

   The distribution of the means is shown above, with the appropriate area
   shaded. The z value is

   $z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{26.3 - 25}{3 / \sqrt{20}} = \frac{1.3}{0.671} = 1.94$

   The area between 0 and 1.94 is 0.4738. Since the desired area is in the
   tail, subtract 0.4738 from 0.5000. Hence 0.5000 - 0.4738 = 0.0262, or 2.62%.
   One can conclude that the probability of obtaining a sample mean larger
   than 26.3 hours is 2.62%, i.e. $P(\bar{X} > 26.3) = 2.62\%$.


2. The desired area is shown in the figure below:

   [Figure: normal curve of the sample means with 90, 96 and 100 marked on the
   axis.]

   The two z values are

   $z_1 = \frac{90 - 96}{16 / \sqrt{36}} = -2.25$ and $z_2 = \frac{100 - 96}{16 / \sqrt{36}} = 1.50$

   The two areas corresponding to the z values of -2.25 and 1.50 are 0.4878
   and 0.4332, respectively. Since the z values are on opposite sides of the
   mean, find the probability by adding the areas: 0.4878 + 0.4332 = 0.9210,
   or 92.1%.
   Hence, the probability of obtaining a sample mean between 90 and 100 months
   is 92.1%, i.e. $P(90 < \bar{X} < 100) = 92.1\%$.
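The table lookup for the motorcycle problem can be cross-checked numerically. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the variable names are our own.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 96, 16, 36            # months, months, sample size
standard_error = sigma / sqrt(n)     # sigma / sqrt(n)

# z values for the two ends of the interval 90 to 100 months
z1 = (90 - mu) / standard_error
z2 = (100 - mu) / standard_error

# Area under the standard normal curve between z1 and z2
standard_normal = NormalDist()
probability = standard_normal.cdf(z2) - standard_normal.cdf(z1)

print(round(z1, 2), round(z2, 2), round(probability, 3))
```

The result agrees with the 92.1% obtained from the table of areas.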


3. (i) Since the question asks about an individual person, the formula
   $z = \frac{X - \mu}{\sigma}$ is used. The distribution is shown in the
   figure below.

   [Figure: distribution of individual data values for the population, with
   218.4 and 224 marked on the axis.]

   The z value is

   $z = \frac{X - \mu}{\sigma} = \frac{224 - 218.4}{25} = 0.22$

   The area between 0 and 0.22 is 0.0871; this area must be added to 0.5000
   to get the total area to the left of z = 0.22.
   0.0871 + 0.5000 = 0.5871
   Hence, the probability of selecting an individual who consumes less than
   224 pounds of meat per year is 0.5871, or 58.71%, i.e. P(X < 224) = 0.5871.


   (ii) Since the question concerns the mean of a sample with a size of 40,
   the formula $z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$ is used. The area
   is shown in the figure below:

   [Figure: distribution of the sample means, with 218.4 and 224 marked on the
   axis.]

   The z value is

   $z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{224 - 218.4}{25 / \sqrt{40}} = 1.42$

   The area between z = 0 and z = 1.42 is 0.4222; this value must be added to
   0.5000 to get the total area.
   0.4222 + 0.5000 = 0.9222
   Hence, the probability that the mean of a sample of 40 individuals is less
   than 224 pounds per year is 0.9222, or 92.22%. That is,
   $P(\bar{X} < 224) = 0.9222$.


Comparing the two probabilities, one can see that the probability of selecting an
individual who consumes less than 224 pounds of meat per year is 58.71%, but
the probability of selecting a sample of 40 people with a mean consumption of
meat that is less than 224 pounds per year is 92.22%. This rather large
difference is due to the fact that the distribution of sample means is much less
variable than the distribution of individual data values.
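The contrast between the two answers can be reproduced by building the two distributions directly. This sketch uses `statistics.NormalDist` from the Python standard library with names of our own choosing. Because it does not round z to two decimal places, its results differ from the table values in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 218.4, 25, 40   # pounds per year, pounds per year, sample size

# Distribution of individual values: standard deviation sigma
individuals = NormalDist(mu, sigma)
# Distribution of sample means: standard deviation sigma / sqrt(n)
sample_means = NormalDist(mu, sigma / sqrt(n))

p_individual = individuals.cdf(224)     # P(X < 224)
p_sample_mean = sample_means.cdf(224)   # P(X-bar < 224)

print(round(p_individual, 3), round(p_sample_mean, 3))
```

The same cut-off of 224 pounds lies only about 0.22 standard deviations above the mean for an individual, but about 1.42 standard errors above it for the mean of 40 people.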



ACTIVITY 3B

TEST YOUR UNDERSTANDING BEFORE PROCEEDING TO THE NEXT
INPUT…!
1. The average salary for workers at an electronics factory is RM13.50 per
   hour. Assume that the standard deviation is RM2.90 per hour and the
   distribution is approximately normal. If $\bar{X}$ is the mean salary per
   hour for a random sample of the workers at the factory, find the mean and
   standard deviation of the sampling distribution of $\bar{X}$ if the sample
   size is (a) 30 workers, and (b) 75 workers.

2. The average weight of sugar sachets is 32 grams. Assume the standard
   deviation is 0.3 gram. If a random sample of 20 sachets is selected, find
   the probability that the mean of their weights is between 31.8 and 31.9
   grams.

3. Analysis of 150 compressive strength results gave a mean strength of
   32 N/mm² and a standard deviation of 6.5 N/mm². Given that 10 samples of 12
   results are considered, find the number of samples with mean strength
   greater than 33 N/mm².

4. Asbestos-cement sheets are manufactured with a mean length of 2400 mm and
   a standard deviation of 3 mm. Given that 20 batches consisting of 3 dozen
   sheets are considered, determine
   (a) the probability that a batch (chosen at random) has a mean length
       between 2399.5 mm and 2400.6 mm
   (b) the number of batches with mean length less than 2399.3 mm.



FEEDBACK TO ACTIVITY 3B

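1. a) $\mu_{\bar{x}} = \mu$ = RM13.50, $\sigma_{\bar{x}}$ = RM0.53
   b) $\mu_{\bar{x}} = \mu$ = RM13.50, $\sigma_{\bar{x}}$ = RM0.33
2. 0.0667 or 6.67%
3. $\mu_{\bar{x}} = \mu$ = 32 N/mm², $\sigma_{\bar{x}}$ = 1.81 N/mm²,
   $P(\bar{x} > 33) = 0.2912$, which is about 3 of the 10 samples
4. (a) $P(2399.5 < \bar{x} < 2400.6) = 0.7262$
   (b) $P(\bar{x} < 2399.3) = 0.0808$, which is about 2 of the 20 batches
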

INPUT

3.3 DISTRIBUTION OF THE SAMPLE MEANS
a. Distribution of the sample means with replacement
Statement 1
The distribution of the sample means $\bar{X}$ taken with replacement from a
normally distributed population with known mean $\mu$ and standard deviation
$\sigma$ will be normal, regardless of the sample size (n). As previously
shown, this distribution will have a mean $\mu$ and a standard deviation
$\sigma / \sqrt{n}$.
Statement 2
If the sample is taken from any population with known $\mu$ and $\sigma$, and
the sample size is large (n ≥ 30), the distribution of the sample mean is
approximately normal with mean $\mu$ and standard deviation
$\sigma / \sqrt{n}$, that is $\bar{x} \sim N(\mu, \sigma^2 / n)$.

b. Distribution of the sample means without replacement
The formula for the standard error of the mean, $\sigma / \sqrt{n}$, is
accurate when the samples are drawn with replacement, or without replacement
from a very large or infinite population. Since sampling with replacement is
for the most part unrealistic, a correction factor is necessary for computing
the standard error of the mean for samples drawn without replacement from a
finite population. Compute the correction factor by using the following
formula:

$\sqrt{\frac{N - n}{N - 1}}$

where N is the population size and n is the sample size.

This correction factor is necessary if relatively large samples are taken from
a small population, because the sample mean will then estimate the population
mean more accurately and there will be less error in the estimation. Therefore,
the standard error of the mean must be multiplied by the correction factor to
adjust it for large samples taken from a small population. That is

$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$

Finally the formula for the z value becomes

$z = \frac{\bar{X} - \mu}{\frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}}$

When the population is large and the sample is small, the correction factor is
generally not used, since it will be very close to 1.000. Therefore

$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
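The two cases can be wrapped in one small helper. This is a sketch only; the function name `standard_error` and its optional `population_size` argument are our own. The figures in the usage lines are those of Activity 3C, problem 1 (2500 men, standard deviation 7 cm, samples of 30).

```python
from math import sqrt

def standard_error(sigma, n, population_size=None):
    """Standard error of the mean for samples of size n.

    If population_size (N) is given, sampling is assumed to be without
    replacement from a finite population and the correction factor
    sqrt((N - n) / (N - 1)) is applied.
    """
    se = sigma / sqrt(n)
    if population_size is not None:
        se *= sqrt((population_size - n) / (population_size - 1))
    return se

with_replacement = standard_error(7, 30)
without_replacement = standard_error(7, 30, population_size=2500)

print(round(with_replacement, 3), round(without_replacement, 3))
```

With N = 2500 and n = 30 the correction factor is about 0.994, so the two results differ only in the third decimal place.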



Example 3.3
1. The average price of houses in Jitra is RM157000, and the distribution of
   prices is rather skewed. Assume the standard deviation is RM29500. If
   $\bar{x}$ is the mean price for a sample of 400 houses selected at random,
   find the probability that
   a) the sample mean is between RM154000 and RM160000;
   b) the sample mean is below RM154000.

2. The average time taken by line workers in an electronics firm to assemble
   the electronic components is 80 hours, with a standard deviation of 8
   hours. Find the following probabilities for the mean assembly time
   $\bar{x}$ if a random sample of 16 workers is selected.
   a. $P(78 \le \bar{x} \le 82)$
   b. $P(76 \le \bar{x} \le 84)$
   c. $P(74 \le \bar{x} \le 86)$

3. The average service life of 400 batteries is 800 hours, with a standard
   deviation of 45 hours. If a random sample of 45 batteries is selected, what
   is the probability that the sample mean is between 790 and 810 hours?

4. The data show the number of children belonging to a group of 50
   Polytechnic lecturers.

   No. of children    0    1    2    3    4
   No. of lecturers   1    18   24   4    3

   a. Find the mean and the standard deviation of the data above.
   b. If a sample of 10 lecturers is taken, find the probability that the
      mean number of children in this sample is more than 2.


Solution to Example 3.3
1. Although the price of houses in Jitra is skewed and not normally
   distributed, the distribution of the sample mean price is approximately
   normal because the sample size is large (n = 400). Therefore the central
   limit theorem is applicable.
   Given $\mu$ = RM157000 and $\sigma$ = RM29500,

   $\mu_{\bar{x}} = \mu$ = RM157000

   $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{29500}{\sqrt{400}}$ = RM1475

   Therefore $\bar{x} \sim N(157000, 1475^2)$.

   a. $P(154000 < \bar{x} < 160000) = P\left(\frac{154000 - 157000}{1475} < \frac{\bar{x} - 157000}{1475} < \frac{160000 - 157000}{1475}\right) = P(-2.03 < Z < 2.03) = 0.9576$

   b. $P(\bar{x} < 154000) = P\left(\frac{\bar{x} - 157000}{1475} < \frac{154000 - 157000}{1475}\right) = P(Z < -2.03) = 0.0212$

2. Although the sample size is small (n = 16), the time taken to assemble the
   components is normally distributed. Therefore the sample mean $\bar{x}$ is
   normally distributed with mean $\mu_{\bar{x}}$ = 80 hours and standard
   deviation $\sigma_{\bar{x}} = \frac{8}{\sqrt{16}}$ = 2 hours.

   a. $P(78 \le \bar{x} \le 82) = P\left(\frac{78 - 80}{2} \le \frac{\bar{x} - 80}{2} \le \frac{82 - 80}{2}\right) = P(-1 \le Z \le 1) = 0.6826$

   b. $P(76 \le \bar{x} \le 84) = P\left(\frac{76 - 80}{2} \le \frac{\bar{x} - 80}{2} \le \frac{84 - 80}{2}\right) = P(-2 \le Z \le 2) = 0.9544$

   c. $P(74 \le \bar{x} \le 86) = P\left(\frac{74 - 80}{2} \le \frac{\bar{x} - 80}{2} \le \frac{86 - 80}{2}\right) = P(-3 \le Z \le 3) = 0.9974$

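3. The 45 batteries are drawn without replacement from a finite population of
   N = 400, so the correction factor is applied:

   $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} = \frac{45}{\sqrt{45}} \sqrt{\frac{400 - 45}{400 - 1}} = 6.328$

   $P(790 < \bar{x} < 810) = P\left(\frac{790 - 800}{6.328} < Z < \frac{810 - 800}{6.328}\right) = P(-1.58 < Z < 1.58) = 0.8858$

   The probability that the sample mean is between 790 and 810 hours is
   0.8858.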

4. The probability distribution is:

   No. of children (x)        0      1      2      3      4
   Relative frequency, p(x)   0.02   0.36   0.48   0.08   0.06

a) $\mu = \sum x\,p(x) = 0(0.02) + 1(0.36) + 2(0.48) + 3(0.08) + 4(0.06) = 1.8 \approx 2$

   $\sigma^2 = \sum x^2 p(x) - \mu^2 = 0^2(0.02) + 1^2(0.36) + 2^2(0.48) + 3^2(0.08) + 4^2(0.06) - 1.8^2 = 3.96 - 3.24 = 0.72$

   $\sigma = \sqrt{0.72} = 0.8485$

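The mean and standard deviation in part a) can be checked directly from the frequency table. This is a sketch for illustration; the list names are our own.

```python
from math import sqrt

children = [0, 1, 2, 3, 4]       # number of children, x
lecturers = [1, 18, 24, 4, 3]    # number of lecturers with x children

total = sum(lecturers)                   # 50 lecturers in all
p = [f / total for f in lecturers]       # relative frequencies p(x)

# Mean: sum of x * p(x)
mu = sum(x * px for x, px in zip(children, p))
# Variance: sum of x^2 * p(x), minus the square of the mean
variance = sum(x**2 * px for x, px in zip(children, p)) - mu**2
sigma = sqrt(variance)

print(total, round(mu, 2), round(variance, 2), round(sigma, 4))
```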

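b) The 10 lecturers are selected without replacement from a finite population
   (N = 50), so the correction factor is applied. Taking the sampling
   distribution of the sample mean to be approximately normal, with
   $\mu_{\bar{x}} = 1.8$ and

   $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} = \frac{0.8485}{\sqrt{10}} \sqrt{\frac{50 - 10}{50 - 1}} = 0.2424$

   we have $\bar{x} \sim N(1.8, 0.2424^2)$. Therefore

   $P(\bar{x} > 2) = P\left(\frac{\bar{x} - 1.8}{0.2424} > \frac{2 - 1.8}{0.2424}\right) = P(Z > 0.83) = 0.2033$
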
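The calculation in part b) can be cross-checked as follows. This is a sketch using `statistics.NormalDist` from the Python standard library, with variable names of our own. It keeps z unrounded (about 0.825), so the probability it gives is slightly larger than the table value obtained with z = 0.83.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 1.8, 0.8485   # population mean and standard deviation from part a)
N, n = 50, 10             # population size and sample size

# Standard error with the finite population correction factor
standard_error = sigma / sqrt(n) * sqrt((N - n) / (N - 1))

z = (2 - mu) / standard_error
probability = 1 - NormalDist().cdf(z)   # P(Z > z)

print(round(standard_error, 4), round(z, 3), round(probability, 4))
```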



ACTIVITY 3C

TEST YOUR UNDERSTANDING BEFORE PROCEEDING TO THE NEXT
INPUT…!
1. The heights of 2500 men are normally distributed with a mean of 170 cm and a
standard deviation of 7 cm. If random samples are taken of 30 men, predict
the standard deviation and the mean of sampling distribution of means, if
sampling is done (a) with replacement, and (b) without replacement.
2. A group of 1000 ingots of metal have a mean mass of 7.4 kg and a standard
deviation of 0.4 kg. Find the probability that a sample of 50 ingots chosen at
random from the group, without replacement, will have a combined mass of (a)
between 360 and 377.5 kg, and (b) more than 375 kg.
3. Determine the mean and standard deviation of the set of numbers 1, 2, 4, 5,
and 6, correct to three decimal places. By selecting all possible different
samples of size 2 which can be drawn with replacement (25 pairs) determine
(a) the mean of the sampling distribution of means, and (b) the standard error
of the means, correct to three decimal places.
4. Determine the standard error of the means for problem 3, if sampling is without
replacement, correct to three significant figures.
5. The length of 1500 bolts is normally distributed with a mean of 22.4 cm and a
standard deviation of 0.048 cm. If 30 samples are drawn at random from this
population, each of size 36 bolts, determine the mean of the sampling
distribution and the standard error of the means when sampling is done with
replacement.
6. Determine the standard error of the means in problem 5, if sampling is done
without replacement, correct to 4 decimal places.



7. If a random sample of 64 lamps is drawn from a batch, determine the
probability that the mean time to failure will be less than 785 hours, correct to 3
decimal places.
8. Determine the probability that the mean time to failure of a random sample of
16 lamps will be between 790 hours and 810 hours, correct to 3 decimal
places.
9. For a random sample of 64 lamps, determine the probability that the mean
time to failure will exceed 820 hours, correct to 2 significant figures.



FEEDBACK TO ACTIVITY 3C

1. (a) $\mu_{\bar{x}}$ = 170 cm, $\sigma_{\bar{x}}$ = 1.278 cm
   (b) $\mu_{\bar{x}}$ = 170 cm, $\sigma_{\bar{x}}$ = 1.271 cm
2. The mean of the sampling distribution of means, $\mu_{\bar{x}} = \mu$ = 7.4 kg
   The standard error of the means, $\sigma_{\bar{x}}$ = 0.0552 kg
   (a) 0.9966 (b) 0.0351
3. $\mu$ = 3.600, $\sigma$ = 1.855, (a) $\mu_{\bar{x}}$ = 3.600, (b) $\sigma_{\bar{x}}$ = 1.312
4. $\sigma_{\bar{x}}$ = 1.136
5. $\mu_{\bar{x}}$ = 22.4 cm, $\sigma_{\bar{x}}$ = 0.008 cm
6. $\sigma_{\bar{x}}$ = 0.0079 cm
7. 0.023
8. 0.497
9. 0.0038


SELF ASSESSMENT 3

You are approaching success. Try all the questions in this self-assessment
section and check your answers on the next page. If you encounter any
problems, consult your instructor. Good luck.
1. If the samples of a specific size are selected from a population and the
means are computed, what is this distribution of means called?
2. What is the mean of the sample means?
3. What does the central limit theorem say about the shape of the distribution
of sample means?
4. What formula is used to gain information about a sample mean when the
variable is normally distributed or when the sample size is 30 or more?

For the exercises below, assume that the sample is taken from a large
population and the correction factor can be ignored.
5. The mean serum cholesterol of a large population of overweight adults is
   220 mg/dl and the standard deviation is 16.3 mg/dl. If a sample of adults
   is selected, find the probability that the mean will be between 220 and
   222 mg/dl.

6. The mean weight of 18-year-old females is 126 pounds, and the standard
   deviation is 15.7 pounds. If a sample of 25 females is selected, find the
   probability that the mean of the sample will be greater than 128.3 pounds.
   Assume the variable is normally distributed.

7. The average price of a pound of sliced bacon is RM2.02. Assume the
   standard deviation is RM0.08. If a random sample of 40 one-pound packages
   is selected, find the probability that the mean of the sample will be less
   than RM2.00.

8. The mean score on a dexterity test for 12-year-olds is 30. The standard
   deviation is ____. If a psychologist administers the test to a class of 22
   students, find the probability that the mean of the sample will be between
   27 and 31. Assume the variable is normally distributed.

9. The average age of lawyers is 43.6 years, with a standard deviation of 5.1
   years. If a law firm employs 50 lawyers, find the probability that the
   average age of the group is greater than 44.2 years.

10. Procter & Gamble reported that an American family of four washes an
    average of one ton (2000 pounds) of clothes each year. If the standard
    deviation of the distribution is 187.5 pounds, find the probability that
    the mean of a randomly selected sample of 50 families of four will be
    between 1980 and 1990 pounds.

11. The average time it takes a group of adults to complete a certain
    achievement test is 46.2 minutes. The standard deviation is 8.0 minutes.
    Assume the variable is normally distributed.
    a) Find the probability that a randomly selected adult will complete the
       test in less than 43 minutes.
    b) Find the probability that if 50 randomly selected adults take the
       test, the mean time it takes the group to complete the test will be
       less than 43 minutes.
    c) Does it seem reasonable that an adult would finish the test in less
       than 43 minutes? Explain.
    d) Does it seem reasonable that the mean of the 50 adults could be less
       than 43 minutes?

12. The average cholesterol content of a certain brand of eggs is 215
    milligrams and the standard deviation is 15 milligrams. Assume the
    variable is normally distributed.
    a) If a single egg is selected, find the probability that the cholesterol
       content will be more than 220 milligrams.
    b) If a sample of eggs is selected, find the probability that the mean of
       the sample will be larger than 220 milligrams.

13. The average labour cost for car repairs for a large chain of car repair
    shops is RM48.25. The standard deviation is RM4.20. Assume the variable is
    normally distributed.
    (a) If a store is selected at random, find the probability that the
        labour cost will range between RM46 and RM48.
    (b) If stores are selected at random, find the probability that the mean
        of the sample will be between RM46 and RM48.
    (c) Which answer is larger? Explain why.


FEEDBACK TO SELF-ASSESSMENT 3

Have you tried the questions??? If “YES”, check your answers now.
1. The distribution is called the sampling distribution of sample means.
2. The mean of the sample means is equal to the population mean.
3. The distribution will be approximately normal when the sample size is
   large.
4. $z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
5. 0.2486
6. 0.2327
7. 0.0571
8. 0.8239
9. 0.2033
10. 0.1254
11. a) 0.3446
    b) 0.0023
    c) Yes, since it is within one standard deviation of the mean.
    d) Very unlikely.
12. a) 0.3707
    b) 0.0475
13. a) 0.1815
    b) 0.3854
    c) Means are less variable than individual data.
