1. Statistics (Recap)
Finance & Management Students
Farzad Javidanrad
October 2013
University of Nottingham-Business School
2. Probability
• Some Preliminary Concepts:
◦ Random: Something that happens (occurs) by chance.
◦ Population: A set of all possible outcomes of a random experiment, or a collection of all members of a specific group under study. This collection forms a space from which all possible samples can be drawn; for that reason it is sometimes called the sample space.
◦ Sample: Any subset of the population (sample space).
In tossing a die:
The random event is the appearance of any face of the die.
The population (sample space) is the set {1, 2, 3, 4, 5, 6}.
A sample is any subset of the set above, such as {3} or {2, 4, 6}.
3. Probability
• Two events are mutually exclusive if they cannot happen together: the occurrence of one of them prevents the occurrence of the other. For example, if a baby is a boy it cannot be a girl, and vice versa.
• Two events are independent if the occurrence of one of them has no effect on the chance of occurrence of the other. For example, the result of rolling a die has no impact on the outcome of flipping a coin. But in the experiment of drawing two cards consecutively from a deck of 52 (if the cards are chosen equally likely), the chance of getting the second card is affected by the result of the first draw.
• Two events are exhaustive if together they include all possible outcomes. For example, in rolling a die, the events of getting an odd number or an even number are exhaustive.
4. Probability
• If event A can happen in m different ways out of n equally likely ways, the probability of event A can be expressed as its relative frequency, i.e.:

P(A) = m/n

where m is the number of ways that event A occurs and n is the total number of equally likely possible outcomes.

(In the Venn diagram: U is the sample space (population); A is an event (sample); A′ is the event mutually exclusive with A; A and A′ are collectively exhaustive.)
5. Probability
• As 0 ≤ m ≤ n, it can be concluded that

0 ≤ m/n ≤ 1

Or

0 ≤ P(A) ≤ 1

• P(A) = 0 means that event A cannot happen, and P(A) = 1 means that the event will happen with certainty.
• With A′ defined as the event of "non-occurrence" of event A, we can find that:

P(A′) = (n − m)/n = 1 − m/n = 1 − P(A)

Or

P(A) + P(A′) = 1
6. Probability of Multiple Events
• If A and B are not mutually exclusive events, the probability that at least one of them happens (A or B) can be calculated as follows:

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

(In the Venn diagram, P(A ∪ B) is "A or B" and P(A ∩ B) is "A and B".)
7. Probability of Multiple Events
In case we are dealing with more events:

P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(A ∩ C) − P(B ∩ C) + P(A ∩ B ∩ C)
8. Probability of Multiple Events
• Considering P(A ∪ B) = P(A) + P(B) − P(A ∩ B), we can have the following situations:
1. If A and B are mutually exclusive events, then: P(A ∩ B) = 0
2. If A and B are two independent events, then: P(A ∩ B) = P(A) × P(B)
3. If A and B are dependent events, then: P(A ∩ B) = P(A) × P(B|A) = P(B) × P(A|B)
where P(A|B) and P(B|A) are conditional probabilities; P(A|B) means the probability of event A given that event B has already happened.
9. Probability of Multiple Events
o The probability of picking at random a Heart or a Queen in a single draw from a deck of 52 cards is:

P(H ∪ Q) = P(H) + P(Q) − P(H ∩ Q) = 13/52 + 4/52 − 1/52 = 16/52 = 4/13

o The probability of getting a 1 or a 4 on a single toss of a fair die is:

P(1 ∪ 4) = P(1) + P(4) = 1/6 + 1/6 = 1/3

As they cannot happen together, they are mutually exclusive events and P(1 ∩ 4) = 0.
o The probability of getting two heads in the experiment of tossing two fair coins (two independent events) is:

P(H ∩ H) = 1/2 × 1/2 = 1/4
10. Probability of Multiple Events
o The probability of picking two aces without returning the first card to the deck of 52 playing cards, which involves a conditional probability, is:

P(1st ace ∩ 2nd ace) = P(1st ace) × P(2nd ace | 1st ace)

Or, written more compactly:

P(A₁ ∩ A₂) = P(A₁) × P(A₂|A₁) = 4/52 × 3/51 = 1/221

• If two events A and B are independent of each other, then:

P(A|B) = P(A) and P(B|A) = P(B)
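The rules above can be checked numerically; a minimal sketch in Python (standard library only, with `fractions` used to keep the arithmetic exact):

```python
from fractions import Fraction as F

# Union rule: P(Heart or Queen) = P(H) + P(Q) - P(H and Q)
p_heart, p_queen, p_both = F(13, 52), F(4, 52), F(1, 52)
p_heart_or_queen = p_heart + p_queen - p_both   # 4/13

# Multiplication rule for dependent events:
# P(two aces without replacement) = P(1st ace) * P(2nd ace | 1st ace)
p_two_aces = F(4, 52) * F(3, 51)                # 1/221

# Multiplication rule for independent events: two heads in two coin tosses
p_two_heads = F(1, 2) * F(1, 2)                 # 1/4
```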
11. Random Variable & Probability Distribution
Some Basic Concepts:
• Variable: A letter (symbol) which represents the elements of a specific set.
• Random Variable: A variable whose values appear randomly, according to a probability distribution.
• Probability Distribution: A rule (function) which assigns a probability to each value of a random variable.
• Variables (including random variables) are divided into two general categories:
1) Discrete Variables, and
2) Continuous Variables
12. Random Variable & Probability Distribution
• A discrete variable is one whose elements (values) can be put in correspondence with the natural numbers or any subset of them, so it is possible to order and count its elements (values). The number of elements can be finite or infinite.
• For a discrete variable it is not possible to define a neighbourhood, however small, around any value in its domain: there is a jump from one value to the next.
• If the elements of the domain of a variable can be put in correspondence with the real numbers or any subset of them, the variable is called continuous. It is not possible to order and count the elements of a continuous variable. A variable is continuous if a neighbourhood, however small, can be defined around any value in its domain.
13. Random Variable & Probability Distribution
• Probability Distribution: A rule (function) that assigns a probability either to each possible value of a random variable (RV) individually or to a set of them in an interval.*
• For a discrete RV this rule assigns a probability to each possible individual outcome. For example, the probability distribution for the occurrence of a Head when flipping a fair coin (note: Σp = 1):

In one trial {H, T}:
x      0    1
f(x)  0.5  0.5

In two trials {HH, HT, TH, TT}:
x      0    1    2
f(x)  0.25  0.5  0.25

o The probability distribution for the change in the price of a share in the stock market in one day:
x = price change   +1   0   −1
f(x)              0.6  0.1  0.3
14. Probability Distributions (Continuous)
• The probability that a continuous random variable takes exactly one particular value in its domain is zero, because the number of all possible outcomes n is infinite and m/∞ → 0.
• For the above reason, the probability for a continuous random variable needs to be calculated over an interval.
• The probability distribution of a continuous random variable is often called a probability density function (PDF), or simply probability function; it is usually denoted by f(x) and it has the following properties:
I. f(x) ≥ 0 (similar to P(x) ≥ 0 for a discrete RV*)
II. ∫_{−∞}^{+∞} f(x) dx = 1 (similar to ΣP(x) = 1 for a discrete RV)
III. ∫_a^b f(x) dx = P(a ≤ x ≤ b) = F(b) − F(a) (the probability assigned to the set of values in an interval [a, b])**
15. Probability Distributions (Continuous)
• where F(x) is the integral of the PDF f(x) and is called the Cumulative Distribution Function (CDF); for any real value x it is defined as:

F(x) ≡ P(X ≤ x)

The CDF gives the area under the PDF f(x) from −∞ to x. For a discrete random variable, the CDF gives the sum of all probabilities up to the value x.
Adopted from http://beyondbitsandatomsblog.stanford.edu/spring2010/tag/embodied-artifacts/
16. Some Characteristics of Probability Distributions
• Expected Value (Probabilistic Mean Value): One of the most important measures of the central tendency of a distribution. It is the weighted average of all possible values of the random variable X, and it is denoted by E(X).
• For a discrete RV (with n possible outcomes):

E(X) = x₁P(x₁) + x₂P(x₂) + ⋯ + xₙP(xₙ) = Σ_{i=1}^{n} xᵢP(xᵢ)

• For a continuous RV:

E(X) = ∫_{−∞}^{+∞} x·f(x) dx
17. Some Characteristics of Probability Distributions
• Properties of E(X):
i. If c is a constant then E(c) = c.
ii. If a and b are constants then E(aX + b) = aE(X) + b.
iii. If a₁, …, aₙ are constants then

E(a₁X₁ + ⋯ + aₙXₙ) = a₁E(X₁) + ⋯ + aₙE(Xₙ), or E(Σ_{i=1}^{n} aᵢXᵢ) = Σ_{i=1}^{n} aᵢE(Xᵢ)

iv. If X and Y are independent random variables then E(XY) = E(X)·E(Y)
18. Some Characteristics of Probability Distributions
v. If g(X) is a function of the random variable X then:

E[g(X)] = Σ g(x)·P(x) (for a discrete RV)
E[g(X)] = ∫ g(x)·f(x) dx (for a continuous RV)

• Variance: To measure how the random variable X is dispersed around its expected value, variance can help. If we write E(X) = μ, then

Var(X) = σ² = E[(X − E(X))²]
= E[(X − μ)²]
= E[X² − 2μX + μ²]
= E(X²) − 2μE(X) + μ²
= E(X²) − μ²
19. Some Characteristics of Probability Distributions
Var(X) = Σ_{i=1}^{n} (xᵢ − μ)²·P(xᵢ) (for a discrete RV)
Var(X) = ∫_{−∞}^{+∞} (x − μ)²·f(x) dx (for a continuous RV)

• Properties of Variance:
i. If c is a constant then Var(c) = 0.
ii. If a and b are constants then Var(aX + b) = a²Var(X).
iii. If X and Y are independent random variables then Var(X ± Y) = Var(X) + Var(Y) (this can be extended to more variables).
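The discrete formulas for E(X) and Var(X) can be verified numerically; a small sketch in Python (standard library only), using the share-price distribution given earlier (values +1, 0, −1 with probabilities 0.6, 0.1, 0.3):

```python
# Discrete distribution from the share-price example: value -> P(value)
dist = {1: 0.6, 0: 0.1, -1: 0.3}

# E(X) = sum of x * P(x)
mean = sum(x * p for x, p in dist.items())

# Var(X) = E[(X - mu)^2] = sum of (x - mu)^2 * P(x)
var = sum((x - mean) ** 2 * p for x, p in dist.items())

# Shortcut formula from the derivation: Var(X) = E(X^2) - mu^2
ex2 = sum(x ** 2 * p for x, p in dist.items())
var_shortcut = ex2 - mean ** 2   # both give about 0.81
```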
20. Probability Distributions (Discrete RV)
• Some of the well-known probability distributions are:
• The Binomial Distribution:
1. The probability of the occurrence of an event is p and does not change.
2. The experiment is repeated n times.
3. The probability that out of n trials the event appears x times is:

f(x) = n! / (x!(n − x)!) · p^x (1 − p)^(n−x)

The mean value and standard deviation of the binomial distribution are:

μ = Σ xᵢ·f(xᵢ) = np and σ = √( Σ (xᵢ − μ)²·f(xᵢ) ) = √(np(1 − p))

So, to show that the probability distribution of the random variable X is binomial we can write: X ~ Bi(np, np(1 − p)).
21. Probability Distributions (Discrete RV)
• A gambler thinks his chance of getting a 1 when rolling a die is high. What is his chance of getting four 1s out of six rolls of a fair die?
The probability of getting a one in an individual trial is 1/6 and it remains the same in all 6 trials. So,

f(x = 4) = 6!/(4!·2!) × (1/6)⁴ × (5/6)² = 375/46656 ≈ 0.008 ≈ 0.8%
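The binomial formula can be checked directly; a minimal sketch in Python (standard library only; `math.comb` gives the binomial coefficient, and `binomial_pmf` is a hand-rolled helper, not a library function):

```python
import math

def binomial_pmf(x, n, p):
    """P(exactly x successes in n trials, each with success probability p)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

# Gambler's example: four 1s in six rolls of a fair die
p = binomial_pmf(4, 6, 1 / 6)     # 375/46656, roughly 0.008

# Check mean np and variance np(1-p) against the direct sums
n, pr = 6, 1 / 6
mean = sum(x * binomial_pmf(x, n, pr) for x in range(n + 1))   # np = 1
var = sum((x - mean) ** 2 * binomial_pmf(x, n, pr) for x in range(n + 1))
```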
• The Poisson Distribution:
1. It is used to calculate the probability of a number of desired events (no. of successes) in a specific period of time.
2. The average number of desired events (no. of successes) per unit of time remains constant.
22. Probability Distributions (Discrete RV)
• So, the probability of having x successes is calculated by:

f(x) = λ^x · e^(−λ) / x!

where λ is the average number of successes in a specific period of time and e ≈ 2.7182.
• The mean value and standard deviation of the Poisson distribution are:

μ = Σ xᵢ·f(xᵢ) = λ and σ = √( Σ (xᵢ − μ)²·f(xᵢ) ) = √λ

So, to show that the probability distribution of the random variable X is Poisson we can write: X ~ Poi(λ, λ).
o The emergency section in a hospital receives 2 calls per half hour (4 calls in an hour). The probability of getting just 2 calls in a randomly chosen hour on a random day is:

f(x = 2) = 4² e⁻⁴ / 2! = 0.146 ≈ 15%
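The hospital example can be reproduced directly from the formula; a minimal sketch in Python (standard library only; `poisson_pmf` is a hand-rolled helper, not a library function):

```python
import math

def poisson_pmf(x, lam):
    """P(exactly x events in the period, given an average of lam per period)."""
    return lam ** x * math.exp(-lam) / math.factorial(x)

# Average of 4 calls per hour; probability of exactly 2 calls in an hour
p = poisson_pmf(2, 4)          # about 0.146

# Sanity check: the probabilities sum (essentially) to 1
total = sum(poisson_pmf(k, 4) for k in range(50))
```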
23. The Normal Distribution (Continuous RV)
• The Normal Distribution: It is the best-known probability distribution, and it reflects the nature of many random variables in the world. The probability density function (PDF) of the normal distribution is:
1. Symmetrical around its mean value (μ).
2. Bell-shaped, with two tails approaching the horizontal axis asymptotically as we move further away from the mean.
Adopted from http://www.pdnotebook.com/2010/06/statistical-tolerance-analysis-root-sum-square/
24. The Normal Distribution (Continuous RV)
3. The probability density function (PDF) of the normal distribution can be represented by:

f(x) = 1/(σ√(2π)) · e^(−(x − μ)²/(2σ²)) (−∞ < x < +∞)

where μ and σ are the mean and standard deviation respectively:

μ = ∫_{−∞}^{+∞} x·f(x) dx and σ = √( ∫_{−∞}^{+∞} (x − μ)²·f(x) dx )

So, X ~ N(μ, σ²).
• A linear combination of independent normally distributed random variables is itself normally distributed; that is,
if X ~ N(μ₁, σ₁²) and Y ~ N(μ₂, σ₂²), and if W = aX + bY, then

W ~ N(aμ₁ + bμ₂, a²σ₁² + b²σ₂²)

• This can be extended to more than two random variables.
25. The Normal Distribution (Continuous RV)
• Recalling the last property of the PDF (∫_a^b f(x) dx = P(a ≤ x ≤ b)), it is difficult to calculate probabilities using the above PDF for different values of μ and σ. The solution to this problem is to transform the normal variable X into the standardised normal variable (or simply, standard normal variable) Z, by:

Z = (X − μ)/σ

whose parameters (μ and σ²) are independent of the parameters of the other normally distributed random variables, because we always have E(Z) = 0 and Var(Z) = 1 (why?).
• The probability distribution of the standard normal variable is defined as:

f(z) = 1/√(2π) · e^(−z²/2), Z ~ N(0, 1)

X ~ N(μ, σ²) —(standardised)→ Z ~ N(0, 1)
Adopted and amended from http://www.mathsisfun.com/data/standard-normal-distribution.html
26. The Standard Normal Distribution
• Properties of the standard normal distribution curve:
1. It is symmetrical around the y-axis.
2. The area under the curve can be split into two equal areas, that is:

∫_{−∞}^{0} f(z) dz = ∫_{0}^{+∞} f(z) dz = 0.5

• To find the area under the curve to the left of z₁ = 1.26, using the z-table (next slide), we have:

P(z ≤ z₁ = 1.26) = ∫_{−∞}^{0} f(z) dz + ∫_{0}^{z₁} f(z) dz = 0.5 + 0.3962 = 0.8962 ≈ 90%
28. Working with the Z-Table
• To find the probability (here F(z) denotes the tabulated area from 0 to z):

P(0.89 < z < 1.5) = ∫_{0}^{z₂} f(z) dz − ∫_{0}^{z₁} f(z) dz = F(1.5) − F(0.89) = 0.4332 − 0.3133 = 0.1199 ≈ 12%

as both values are positive.
• To find a probability in the negative area we need to find the equivalent area on the positive side:

P(−1.32 < z < −1.25) = P(1.25 < z < 1.32) = F(1.32) − F(1.25) = 0.4066 − 0.3944 = 0.0122 ≈ 1%
29. Working with the Z-Table
• To find P(z < −2.15) we can write:

P(z < −2.15) = ∫_{−∞}^{−2.15} f(z) dz = ∫_{−∞}^{0} f(z) dz − ∫_{−2.15}^{0} f(z) dz = 0.5 − 0.4842 = 0.0158 ≈ 2%

using symmetry: ∫_{−2.15}^{0} f(z) dz = ∫_{0}^{2.15} f(z) dz = 0.4842.
• And finally, to find P(z ≥ 1.93), we have:

P(z ≥ 1.93) = ∫_{1.93}^{+∞} f(z) dz = ∫_{0}^{+∞} f(z) dz − ∫_{0}^{1.93} f(z) dz = 0.5 − 0.4732 = 0.0268
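Instead of a printed z-table, the same areas can be computed from the standard normal CDF, which can be built from `math.erf` in the Python standard library (`phi` is a hand-rolled helper, not a library function):

```python
import math

def phi(z):
    """Standard normal CDF: P(Z <= z), via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# The four z-table lookups from the slides:
p1 = phi(1.26)                 # P(z <= 1.26)        ~ 0.8962
p2 = phi(1.5) - phi(0.89)      # P(0.89 < z < 1.5)   ~ 0.1199
p3 = phi(-2.15)                # P(z < -2.15)        ~ 0.0158
p4 = 1 - phi(1.93)             # P(z >= 1.93)        ~ 0.0268
```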
30. An Example
o If the income of employees in a big company is normally distributed with μ = £20000 and σ = £4000, what is the probability that a randomly picked employee has an income a) above £22000, b) between £16000 and £24000?
a) We first need to transform x to z:

P(x > 22000) = P((x − 20000)/4000 > (22000 − 20000)/4000) = P(z > 0.5) = 0.5 − 0.1915 = 0.3085 ≈ 31%

b) P(16000 < x < 24000) = P((16000 − 20000)/4000 < (x − 20000)/4000 < (24000 − 20000)/4000)
= P(−1 < z < 1) = 0.3413 + 0.3413 = 0.6826 ≈ 68%
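The same two answers can be obtained without a table, again using a hand-rolled standard normal CDF built on `math.erf` (standard library only):

```python
import math

def phi(z):
    """Standard normal CDF: P(Z <= z)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu, sigma = 20000, 4000

# a) P(income > 22000): standardise, then take the upper tail
p_above = 1 - phi((22000 - mu) / sigma)      # ~ 0.3085

# b) P(16000 < income < 24000)
p_between = phi((24000 - mu) / sigma) - phi((16000 - mu) / sigma)  # ~ 0.6827
```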
31. The χ² (Chi-Squared) Distribution
• The χ² (Chi-Squared) Distribution:
Let Z₁, Z₂, …, Z_k be k independent standardised normally distributed random variables; then the sum of their squares

X = Σ_{i=1}^{k} Zᵢ²

has a Chi-Squared distribution with degrees of freedom equal to the number of random variables (d.f. = k). So, X ~ χ²_k.
The mean value and standard deviation of an RV with a Chi-Squared distribution are k and √(2k) respectively. So we can write: X ~ χ²(k, 2k).
Probability Density Function (PDF) of the χ² Distribution
Adopted from http://2012books.lardbucket.org/books/beginning-statistics/s15-chi-square-tests-and-f-tests.html
33. The t-Distribution
• If Z ~ N(0, 1) and X ~ χ²_k, and the two random variables Z and X are independent, then the random variable

t = Z / √(X/k) = Z·√k / √X

follows Student's t-distribution (the t-distribution) with k degrees of freedom. For a sample of size n we have d.f. = k = n − 1.
• The mean value and standard deviation of this distribution are:

μ = 0 (for k > 1; not defined for k = 1)
σ = √(k/(k − 2)) (for k > 2; → 1 as k → ∞; not defined for k = 1, 2)
34. The t-Distribution
• The t-distribution, like the standard normal distribution, is a bell-shaped and symmetrical distribution with zero mean (n > 2), but it is flatter; as the degrees of freedom increase (or n increases) it approaches the standard normal distribution, and for n ≥ 30 their behaviours are similar.
• From the table (next slide):

P(t ≥ 1.706 | d.f. = 26) = 0.05 ≈ 5%, or t₀.₀₅,₂₆ = 1.706

Adopted from http://education-portal.com/academy/lesson/what-is-a-t-test-procedure-interpretation-examples.html#lesson
36. The F Distribution
• If X₁ ~ χ²_{k₁} and X₂ ~ χ²_{k₂}, and X₁ and X₂ are independent, then the random variable

F = (X₁/k₁) / (X₂/k₂)

follows the F distribution with k₁ and k₂ degrees of freedom, i.e.:

F ~ F_{k₁,k₂} or F ~ F(k₁, k₂)

• This distribution is skewed to the right, like the Chi-Square distribution, but as k₁ and k₂ increase (k → ∞) it approaches the normal distribution.
Adopted from http://www.vosesoftware.com/ModelRiskHelp/index.htm#Distributions/Continuous_distributions/F_distribution.htm
37. The F Distribution
• The mean and standard deviation of the F distribution are:

μ = k₂/(k₂ − 2) for k₂ > 2, and
σ = (k₂/(k₂ − 2)) · √( 2(k₁ + k₂ − 2) / (k₁(k₂ − 4)) ) for k₂ > 4

• Relation between the t & Chi-Square distributions and the F distribution:
• For a random variable t ~ t_k it can be shown that t² ~ F_{1,k}. This can also be written as t_k² = F_{1,k}.
• If k₂ is large enough, then k₁·F_{k₁,k₂} ~ χ²_{k₁}.
38. α = 0.25
All adopted from http://www.stat.purdue.edu/~yuzhu/stat514s05/tables.html
43. Statistical Inference (Estimation)
• Statistical inference, or statistical induction, is one of the most important aspects of decision making; it refers to the process of drawing conclusions about the unknown parameters of a population from a sample of randomly chosen data.
• The idea is that a sample of randomly chosen data provides the best information about the parameters of the population, and it can be considered representative of the population when its size is reasonably (appropriately) large.
• The first step in statistical inference (induction) is estimation, which is the process of finding an estimate or approximation for the population parameters (such as the mean value and standard deviation) using the data in the sample.
44. Statistical Inference (Estimation)
• The value of X̄ (the sample mean) in a randomly chosen and appropriately large sample is a good estimator of the population mean μ. The value of s² (the sample variance) is likewise a good estimator of the population variance σ².
• Before taking any sample from the population (when the sample is not yet realised or observed), we can talk about the probability distribution of a hypothetical sample. The probability distribution of a random variable X in a hypothetical sample follows the probability distribution of the population, even if the sampling process is repeated many times.
• But the probability distribution of the sample mean X̄ in repeated sampling does not necessarily follow the probability distribution of its population as the number of samples increases.
45. Central Limit Theorem
• Central Limit Theorem:
Imagine a random variable X with any probability distribution, defined in a population with mean μ and variance σ². If we take n independent samples X₁, X₂, …, Xₙ and for each sample we calculate the mean values X̄₁, X̄₂, …, X̄ₙ (see figure below):

X ~ i.i.d.(μ, σ²) → X̄₁, X̄₂, …, X̄ₙ

i.i.d. ≡ Independent & Identically Distributed RVs
46. Central Limit Theorem
As the number of samples increases infinitely, the random variable X̄ has a normal distribution (regardless of the population distribution) and we have:

X̄ ~ N(μ, σ²/n) when n → +∞

And in the standard form:

Z = (X̄ − μ_X̄)/σ_X̄ = (X̄ − μ)/(σ/√n) = √n(X̄ − μ)/σ ~ N(0, 1)

o Taking a sample of 36 elements from a population with mean 20 and standard deviation 12, what is the probability that the sample mean falls between 18 and 24?

P(18 < x̄ < 24) = P(−1 < (x̄ − 20)/(12/√36) < 2) = 0.3413 + 0.4772 = 0.8185 ≈ 82%
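The sampling-distribution calculation above can be sketched in Python (standard library only; `phi` is a hand-rolled standard normal CDF built on `math.erf`):

```python
import math

def phi(z):
    """Standard normal CDF: P(Z <= z)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu, sigma, n = 20, 12, 36
se = sigma / math.sqrt(n)      # standard error of the mean: 12/6 = 2

# P(18 < sample mean < 24), standardising with the standard error
p = phi((24 - mu) / se) - phi((18 - mu) / se)   # ~ 0.8186
```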
47. Estimation
• In previous slides we introduced some of the most important probability distributions for discrete & continuous random variables.
• In many cases we know the nature of the probability distribution of a random variable defined in a population, but have no idea about its parameters, such as the mean value and/or standard deviation.
• Point Estimation:
• To estimate the unknown parameters of the probability distribution of a random variable we can use either a point estimate or an interval estimate, obtained from an estimator.
• The estimator is a function of the sample values x₁, x₂, …, xₙ and is often called a statistic. If θ̂ represents that estimator we have:

θ̂ = f(x₁, x₂, …, xₙ)
48. Estimation
• θ̂ is said to be an unbiased estimator of the true θ (a parameter of the population) if E(θ̂) = θ, because the bias itself is defined as:

Bias = E(θ̂) − θ

o For example, the sample mean X̄ is a point and unbiased estimator of the unknown parameter μ (the population mean):

θ̂ = X̄ = f(x₁, x₂, …, xₙ) = (1/n)(x₁ + x₂ + ⋯ + xₙ)

It is unbiased because E(X̄) = μ.
49. Estimation
• The sample variance in the form s² = Σ(xᵢ − x̄)²/n is a point but biased estimator of the population variance σ² in a small sample:

E(s²) = σ²(1 − 1/n) ≠ σ²

But it is a consistent estimator because it approaches σ² when the sample size n increases indefinitely (n → ∞).
• With Bessel's correction (changing n to (n − 1)) we can define another sample variance which is unbiased even for a small sample size:

s² = Σ(xᵢ − x̄)²/(n − 1)

• The most common methods of finding point estimators are the least-squares method and the maximum likelihood method; the first of these will be discussed later.
50. Interval Estimation
• Interval Estimation:
• Interval estimation, in contrast, provides an interval or range of possible estimates at a specific level of probability, called the level of confidence, within which the true value of the population parameter may lie.
• If θ̂₁ and θ̂₂ are respectively the lowest and highest estimates of θ, the probability that θ is covered by the interval (θ̂₁, θ̂₂) is:

Pr(θ̂₁ ≤ θ ≤ θ̂₂) = 1 − α (0 < α < 1)

where 1 − α is the level of confidence and α itself is called the level of significance. The interval (θ̂₁, θ̂₂) is called the confidence interval.
51. Interval Estimation
▪ How to find θ̂₁ and θ̂₂?
In order to find the lower and upper limits of a confidence interval we need prior knowledge about the nature of the distribution of the random variable in the population.
□ If the random variable X is normally distributed in the population and the population standard deviation (σ) is known, the 95% confidence interval for the unknown population mean (μ) can be constructed by finding the symmetric z-values associated with 95% of the area under the standard normal curve:

1 − α = 95% → α = 5% → α/2 = 2.5%

So, ±z₀.₀₂₅ = ±1.96
We know that Z = (X̄ − μ_X̄)/σ_X̄ = (X̄ − μ)/(σ/√n), so:

P(−z_{α/2} ≤ Z ≤ z_{α/2}) = 95%

Adopted & altered from http://upload.wikimedia.org/wikipedia/en/b/bf/NormalDist1.96.png
52. Interval Estimation
• So we can write:

P(x̄ − 1.96σ_x̄ ≤ μ ≤ x̄ + 1.96σ_x̄) = 0.95

Or

P(x̄ − 1.96·σ/√n ≤ μ ≤ x̄ + 1.96·σ/√n) = 0.95

Therefore, the interval (x̄ − 1.96·σ/√n, x̄ + 1.96·σ/√n) represents a 95% confidence interval (CI₉₅%) for the unknown value of μ.
It means that in repeated random sampling (say, 100 times) we expect 95 out of 100 such intervals to cover the unknown value of the population mean μ.
Adopted and altered from http://forums.anarchy-online.com/showthread.php?t=604728
53. Interval Estimation for population Proportion
□ A confidence interval can also be constructed for the population proportion (see the graph below):

X ~ Bi(np, np(1 − p))

p̂ in each sample (p̂₁, p̂₂, …, p̂ₙ) represents a sample proportion. In repeated random sampling p̂ has its own probability distribution, with mean value and variance:

μ_p̂ = E(p̂) = p and σ²_p̂ = Var(p̂) = p(1 − p)/n
54. Interval Estimation for population Proportion
• The 90% confidence interval for the population proportion p, when the sample size is bigger than 30 (n > 30) and there is no information about the population variance, is constructed as follows:

±z_{α/2} = (p̂ − p) / √(p̂(1 − p̂)/n)

P(−z_{α/2} ≤ Z ≤ +z_{α/2}) = 1 − α

P(p̂ − z_{α/2}·√(p̂(1 − p̂)/n) ≤ p ≤ p̂ + z_{α/2}·√(p̂(1 − p̂)/n)) = 0.9

So the confidence interval can be simply written as:

CI₉₀% = p̂ ± 1.645·√(p̂(1 − p̂)/n)

(1 − α = 90%, α/2 = 0.05, −z_{α/2} = −1.645, z_{α/2} = 1.645)
Obviously, if we had knowledge about the population variance we would be able to estimate the population proportion p directly. Why?
Adopted and altered from http://www.stat.wmich.edu/s216/book/node83.html
55. Examples
o Imagine the weight of people in a society is normally distributed. A random sample of 25, with sample mean 72 kg, is taken from this society. If the standard deviation of the population is 6 kg, find a) the 90%, b) the 95%, and c) the 99% confidence interval for the unknown population mean.

a) 1 − α = 0.9 → α/2 = 0.05 → z_{α/2} = 1.645
So, CI₉₀% = 72 ± 1.645 × 6/√25 = (70.03, 73.97)

b) 1 − α = 0.95 → α/2 = 0.025 → z_{α/2} = 1.96
So, CI₉₅% = 72 ± 1.96 × 6/√25 = (69.65, 74.35)

c) 1 − α = 0.99 → α/2 = 0.005 → z_{α/2} = 2.58
So, CI₉₉% = 72 ± 2.58 × 6/√25 = (68.9, 75.1)
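The three intervals follow the same pattern, x̄ ± z·σ/√n, so a short helper reproduces them; a sketch in Python (standard library only; `z_interval` is a hand-rolled helper, with the z critical values taken from the slide):

```python
import math

def z_interval(xbar, sigma, n, z):
    """CI for the mean when sigma is known: xbar +/- z * sigma / sqrt(n)."""
    margin = z * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin

xbar, sigma, n = 72, 6, 25
ci90 = z_interval(xbar, sigma, n, 1.645)   # ~ (70.03, 73.97)
ci95 = z_interval(xbar, sigma, n, 1.96)    # ~ (69.65, 74.35)
ci99 = z_interval(xbar, sigma, n, 2.58)    # ~ (68.90, 75.10)
```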
56. Examples
o Samples from one of the production lines in a factory suggest that 10% of the products are defective. If a range of 1% difference between the sample and population proportion is acceptable, what sample size do we need to construct a 95% confidence interval for the population proportion? What if the acceptable gap between the sample & population proportion is increased to 3%?

1 − α = 0.95 → α/2 = 0.025 → z_{α/2} = 1.96

z_{α/2} = (p̂ − p)/√(p(1 − p)/n) → 1.96 = 0.01/√(0.1 × 0.9/n) → n = (1.96 × 0.3/0.01)² ≈ 3458

If the gap increases to 3%, then:

1.96 = 0.03/√(0.1 × 0.9/n) → n = (1.96 × 0.3/0.03)² ≈ 385
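Solving the margin-of-error equation for n gives the rule n = p(1 − p)·(z/E)², rounded up; a sketch in Python (standard library only; `sample_size` is a hand-rolled helper):

```python
import math

def sample_size(p, margin, z=1.96):
    """Smallest n such that z * sqrt(p*(1-p)/n) <= margin."""
    return math.ceil(p * (1 - p) * (z / margin) ** 2)

n1 = sample_size(0.10, 0.01)   # 1% acceptable gap -> 3458
n2 = sample_size(0.10, 0.03)   # 3% acceptable gap -> 385
```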
57. Interval Estimation (Using t-distribution)
• If the population standard deviation σ is unknown and we use the sample standard deviation s instead, and the size of the sample is less than 30 (n < 30), then the random variable

(x̄ − μ)/(s/√n) ~ t_{n−1}

has a t-distribution with d.f. = n − 1.
This means a confidence interval for the population mean μ will be of the form:

CI₍₁₋α₎ = (x̄ − t_{α/2,n−1}·s/√n, x̄ + t_{α/2,n−1}·s/√n)

Adopted and altered from http://cnx.org/content/m46278/latest/?collection=col11521/latest
58. Interval Estimation
• The following flowchart can help in choosing between the Z and t distributions when an interval estimate is constructed for μ in the population (its remaining branch: use nonparametric methods).
Adopted from http://www.expertsmind.com/questions/flow-chart-for-confidence-interval-30112489.aspx
59. Interval Estimation
• Here is a list of confidence intervals for the various population parameters.
Adopted from http://www.bls-stats.org/uploads/1/7/6/7/1767713/250709.image0.jpg
60. Hypothesis Testing
• Hypothesis testing is one of the most important aspects of statistical inference. The main idea is to find out whether some claims/statements (in the form of hypotheses) about population parameters can be statistically rejected by the evidence from the sample, using a test statistic (a function of the sample).
• Claims are made in the form of a null hypothesis (H₀) against an alternative hypothesis (H₁), and they can only be rejected, never proven. The two hypotheses should be mutually exclusive and collectively exhaustive. For example:

H₀: μ = 0.8 against H₁: μ ≠ 0.8
H₀: μ ≥ 2.1 against H₁: μ < 2.1
H₀: σ² ≤ 0.4 against H₁: σ² > 0.4

◦ Always remember that the equality sign comes with H₀.
• If the value of the test statistic lies in the rejection area(s), the null hypothesis must be rejected; otherwise the sample does not provide sufficient evidence to reject the null hypothesis.
61. Hypothesis Testing
• Assuming we know the distribution of the random variable in the population, and that there is statistical independence between the different random variables, hypothesis testing follows these steps:
1. State the relevant null & alternative hypotheses. The form of the null hypothesis (whether it is =, ≥, or ≤ something) indicates how many rejection regions we will have (for the = sign we have two regions and for the others just one; depending on the difference between the value of the estimator and the claimed value of the population parameter, the rejection area can be on the right or the left of the distribution curve).

H₀: μ = 0.5 against H₁: μ ≠ 0.5 (two rejection regions)
H₀: μ ≥ 0.5 (or μ ≤ 0.5) against H₁: μ < 0.5 (or μ > 0.5) (one rejection region)

Graphs adopted from http://www.soc.napier.ac.uk/~cs181/Modules/CM/Statistics/Statistics%203.html
62. Hypothesis Testing
2. Identify the level of significance of the test (α), usually taken to be 5% or 1% depending on the nature of the test and the goals of the researcher. When α is known, together with prior knowledge about the sample distribution, the critical region(s) (or rejection area(s)) can be identified.
For the standard normal distribution, the one-tail critical values associated with the significance levels α = 5% and α = 1% are z_α = 1.65 and z_α = 2.33 respectively.
Adopted from http://www.psychstat.missouristate.edu/introbook/sbk26.htm
63. Hypothesis Testing
3. Construct a test statistic (a function based on the sample distribution & sample size). This function is used to decide whether or not to reject H₀.
The table (adopted from http://www.bls-stats.org/uploads/1/7/6/7/1767713/250714.image0.jpg) lists some of the test statistics for testing different hypotheses.
64. Hypothesis Testing
4. Take a random sample from the population and calculate the value of the test statistic. If the value is in the rejection area, the null hypothesis H₀ is rejected in favour of the alternative H₁ at the predetermined significance level α; otherwise the sample does not provide sufficient evidence to reject H₀ (this does not mean that we accept H₀).
The critical values are: −z_α or −t_{α,d.f.} for a left-tail test; +z_α or +t_{α,d.f.} for a right-tail test; ±z_{α/2} or ±t_{α/2,d.f.} for a two-tail test.
Adopted from http://www.onekobo.com/Articles/Statistics/03-Hypotheses/Stats3%20-%2010%20-%20Rejection%20Region.htm
65. Example
o A chocolate factory claims that its new tin of cocoa powder contains at least 500 g of powder. A standards-checking agency takes a random sample of n = 25 tins and finds that the sample mean weight is x̄ = 520 g and the sample standard deviation is s = 75 g. If we assume the weight of cocoa powder in tins has a normal distribution, does the sample provide enough evidence to support the claim at the 95% level of confidence?
1. H₀: μ ≥ 500 against H₁: μ < 500 (so it is a one-tail, left-tail test)
2. Level of significance α = 5% → t_{α,(n−1)} = t₀.₀₅,₂₄ = 1.711 (we use the t-distribution because n < 30 and we have no prior knowledge about the population standard deviation)
3. The value of the test statistic is:

t = (x̄ − μ)/(s/√n) = (520 − 500)/(75/√25) = 1.33

4. The rejection region for this left-tail test is t < −1.711; as t = 1.33 does not fall in it, the claim cannot be rejected at the 5% level of significance.
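The test statistic can be computed directly; a sketch in Python (standard library only; `t_statistic` is a hand-rolled helper, and the critical value t₀.₀₅,₂₄ = 1.711 is taken from the t-table, since the standard library has no t-distribution):

```python
import math

def t_statistic(xbar, mu0, s, n):
    """One-sample t statistic: (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / math.sqrt(n))

t = t_statistic(520, 500, 75, 25)   # 20 / 15 = 1.333...
t_crit = 1.711                       # t_{0.05, 24} from the table

# Left-tail test of H0: mu >= 500, so reject only if t < -t_crit
reject = t < -t_crit                 # False: the claim is not rejected
```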
66. Type I & Type II Errors
• Two types of errors can occur in hypothesis testing:
A. Type I error: based on our sample, we reject a true null hypothesis.
B. Type II error: based on our sample, we fail to reject a false null hypothesis.
• By reducing the level of significance α we can reduce the probability of making a Type I error (why?); however, at the same time, we increase the probability of making a Type II error.
• What would happen to Type I and Type II errors if we increased the sample size? (Hint: look at the confidence intervals.)
Adopted from http://whatilearned.wikia.com/wiki/Hypothesis_Testing?file=Type_I_and_Type_II_Error_Table.jpg
67. Type I & Type II Errors
• The following graph shows how a change in the critical line (critical value) changes the probabilities of making Type I and Type II errors:

P(Type I error) = α and P(Type II error) = β

The Power of a Test: the power of a test is the probability that the test correctly rejects a false null hypothesis, i.e. the probability of not committing a Type II error. The power is equal to 1 − β, which means that by reducing β the power of the test increases.
Adopted from http://www.weibull.com/hotwire/issue88/relbasics88.htm
68. The P-Value
• It is not unusual to reject H₀ at some level of significance, for example α = 5%, but be unable to reject it at another level, e.g. α = 1%. The dependence of the final decision on the value of α is the weak point of the classical approach.
• In the newer approach, we try to find the p-value, which is the lowest significance level at which H₀ can be rejected. If the level of significance is set at 5% and the lowest significance level at which H₀ can be rejected (the p-value) is 2%, then the null hypothesis should be rejected; i.e. reject H₀ when

p-value < α

◦ To understand this concept better, let's look at an example:
• Suppose we believe that the mean life expectancy of the people in a city is 75 years (H₀: μ = 75). But our observation shows a sample mean of 76 years for a sample of size 100 with a sample standard deviation of 4 years.
69. The P-Value
• The Z-score (test statistic) can be calculated as follows:

Z = (x̄ − μ)/(s/√n) = (76 − 75)/(4/√100) = 2.5

• At the 5% level of significance the critical Z-value is 1.96, so we must reject H₀. But we should not have obtained this result (or those observations in our random sample) in the first place if our assumption about the population mean μ was correct.
• The p-value is the probability of getting this type of result, or one even more extreme (i.e. a Z-score bigger than 2.5), given that the null hypothesis is correct:

P(Z ≥ 2.5 | μ = 75) = p-value ≈ 0.006

(it means that in 1000 samples this type of result could theoretically happen about 6 times; yet it happened in our very first random sample).
http://faculty.elgin.edu/dkernler/statistics/ch10/10-2.html
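The p-value of the example can be computed from the standard normal CDF; a sketch in Python (standard library only; `phi` is a hand-rolled helper built on `math.erf`):

```python
import math

def phi(z):
    """Standard normal CDF: P(Z <= z)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Life-expectancy example: H0: mu = 75, sample mean 76, s = 4, n = 100
z = (76 - 75) / (4 / math.sqrt(100))   # 2.5

# One-sided p-value: probability of a result at least this extreme under H0
p_value = 1 - phi(z)                    # ~ 0.0062

alpha = 0.05
reject = p_value < alpha                # True: reject H0
```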
70. The P-Value
• As we cannot deny what we have observed and obtained from the sample, we eventually need to change our belief about the population mean and reject our assumption about it.
• The smaller the p-value, the stronger the evidence against H₀.