Inferential Statistics.pdf

Dr. Shivakumar B. N.
Assistant Professor
Department of Mathematics
CMR Institute of Technology
Bengaluru
Inferential Statistics

• Definition of null, alternative, simple and composite
hypotheses;
• Level of significance;
• Type I and type II errors;
• Testing equality of single and two means (large
samples), Single and two proportions;
• Independence of attributes.

Definition
Inferential statistics takes data from a sample and
makes inferences about the larger population
from which the sample was drawn.

[Figure: the statistical inference cycle (ask a question, gather data, analyse data, draw conclusions). Data may be gathered from a census or from a sample.]
Can we reliably use the results from a single
sample to make conclusions about a
population?

Population: The totality of units (objects), having some common characteristic
of interest, under consideration for a statistical investigation, is called
population.
It might be finite or infinite. The size of a finite population is denoted by N.
Example: If we are undertaking a statistical investigation about ‘monthly
expenditure on food’, all the households in a city will constitute the population.
Parameters: The characteristics that describe the population like mean, variance
etc., are called the parameters of the population.

Sample: A small portion of the population selected for study.
Example: Let us assume that there are 1000 households in the
city and we randomly select only 50 households for the study.
These 50 households form a sample of size 50.
Statistic: The characteristics that describe the sample, such as the
sample mean and the sample variance, are called statistics.
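
The household example can be sketched in a few lines of Python to show the difference between a parameter and a statistic. This is only an illustration: the expenditure figures are simulated, and the mean of 8000 and spread of 1500 are assumptions, not values from the study.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Synthetic population: monthly food expenditure for N = 1000 households.
# The values are simulated for illustration only.
population = [random.gauss(8000, 1500) for _ in range(1000)]

# Parameter: computed from every unit in the population.
population_mean = statistics.fmean(population)

# Statistic: computed from a random sample of n = 50 households.
sample = random.sample(population, 50)
sample_mean = statistics.fmean(sample)

print(f"population mean (parameter): {population_mean:.1f}")
print(f"sample mean (statistic):     {sample_mean:.1f}")
```

The two means will generally differ a little; inferential statistics is about judging how much such a difference can be trusted.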

Statistical Hypothesis
A statistical hypothesis is an assertion or conjecture about the distribution of one or more
random variables. If a statistical hypothesis completely specifies the distribution, it is referred
to as a simple hypothesis, if not, it is referred to as a composite hypothesis.
A statistical hypothesis is denoted by H.
Example:
1. H: The population is normally distributed with parameters μ = 50 and σ² = 9
This is a simple hypothesis since not only the functional form of the distribution is specified, but
also the values of all its parameters.
2. H: θ ≥ 10,000
This is a composite hypothesis since θ ≥ 10,000 does not assign a specific value to the parameter θ,
nor does it specify the functional form of the distribution.

Null and alternative hypothesis
The hypothesis which is being tested statistically for possible rejection
is called the null hypothesis and is usually denoted by 𝐻0.
The rejection of the null hypothesis, 𝐻0, leads to the acceptance of an
alternative hypothesis, denoted by 𝐻1.

Type I error: Taking a wrong decision to reject the null hypothesis when it is
actually true is called the error of the first kind or type I error.
Type II error: Taking a wrong decision to accept the null hypothesis when it is
actually not true is called the error of the second kind or type II error.
Level of significance:
The probability of committing a type I error is called the level of
significance and is denoted by α.
Note (for a two-tailed test based on the standard normal distribution):
• If α = 0.05, the critical value k = 1.96 and
• If α = 0.01, the critical value k = 2.58
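
The two critical values quoted in the note can be reproduced from the standard normal distribution. The sketch below assumes a two-tailed test, so that α is split equally between the two tails; the function name is illustrative.

```python
from statistics import NormalDist

def two_tailed_critical_value(alpha: float) -> float:
    """Return k such that P(|Z| > k) = alpha for a standard normal Z."""
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.05, 0.01):
    k = two_tailed_critical_value(alpha)
    print(f"alpha = {alpha}: k = {k:.2f}")
```

Rounded to two decimals, this gives 1.96 and 2.58, matching the note.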

One-Tailed Test
A one-tailed test is a statistical test in which the critical area of a distribution is one-sided so
that it is either greater than or less than a certain value, but not both.
The Basics of a One-Tailed Test
• A basic concept in inferential statistics is hypothesis testing.
• Hypothesis testing is used to decide whether sample data support a claim about a population
parameter.
• When the test is set up to show that the sample mean would be higher or lower than the
population mean, but not both, it is referred to as a one-tailed test.

KEY TAKEAWAYS
• A one-tailed test is a statistical hypothesis test set up to show that
the sample mean would be higher or lower than the population mean,
but not both.
• When using a one-tailed test, the analyst is testing for the possibility
of the relationship in one direction of interest, and completely
disregarding the possibility of a relationship in another direction.
• Before running a one-tailed test, the analyst must set up a null
hypothesis and an alternative hypothesis and choose a level of
significance (α) against which the p-value will be compared.
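
To see how the choice of tails changes the critical region, the following sketch compares one-tailed and two-tailed critical values of the standard normal distribution. It assumes a large-sample Z test; the function name and the `tails` argument are illustrative.

```python
from statistics import NormalDist

def critical_value(alpha: float, tails: int) -> float:
    """Critical value of a standard normal test with 1 or 2 tails."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # A one-tailed test puts all of alpha in one tail,
    # a two-tailed test puts alpha / 2 in each tail.
    return NormalDist().inv_cdf(1 - alpha / tails)

for alpha in (0.05, 0.01):
    one = critical_value(alpha, tails=1)
    two = critical_value(alpha, tails=2)
    print(f"alpha = {alpha}: one-tailed k = {one:.3f}, two-tailed k = {two:.3f}")
```

At the same α, the one-tailed critical value is smaller (about 1.645 at the 5% level), because the whole rejection probability sits in a single tail.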

Test on two means (Tests for equality of means)
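$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
Here $\bar{x}_1, \bar{x}_2$ are the sample means, $\sigma_1, \sigma_2$ the population standard deviations and $n_1, n_2$ the sample sizes.
If, for the samples, $|Z_{cal}| > k$, $H_0$ is rejected and if, on the other hand, $|Z_{cal}| \le k$, $H_0$ is accepted.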
Level of significance:
• If 𝛼 = 0.05, the critical value 𝑘 = 1.96 and
• If 𝛼 = 0.01, the critical value 𝑘 = 2.58

Example: It is known that the IQ of boys has standard deviation 10 and that the IQ of girls has standard deviation 11. The mean IQ of 100
randomly selected boys is 95 and the mean IQ of 80 randomly selected girls is 97. Can it be concluded that, on average, boys and
girls have the same IQ? (Use 1% level of significance)
$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
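
The IQ example can be worked through numerically as follows. This is a minimal sketch: the function name is illustrative, and the critical value 2.58 is the two-tailed value for α = 0.01 quoted earlier.

```python
from math import sqrt

def two_mean_z(mean1, mean2, sigma1, sigma2, n1, n2):
    """Z statistic for H0: mu1 = mu2 with known population SDs (large samples)."""
    return (mean1 - mean2) / sqrt(sigma1**2 / n1 + sigma2**2 / n2)

# Boys: mean 95, SD 10, n = 100. Girls: mean 97, SD 11, n = 80.
z_cal = two_mean_z(95, 97, 10, 11, 100, 80)
k = 2.58  # two-tailed critical value at the 1% level

print(f"Z_cal = {z_cal:.2f}")
print("reject H0" if abs(z_cal) > k else "do not reject H0")
```

Here |Z_cal| is about 1.26, which is below 2.58, so at the 1% level the data do not contradict the hypothesis that boys and girls have the same mean IQ.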

Test for proportion
Suppose the proportion of an attribute in a population is not known
and we want to test whether the proportion equals a given value $P_0$.
$$Z = \frac{p - P_0}{\sqrt{\dfrac{P_0 Q_0}{n}}}$$
where $p = \dfrac{x}{n}$ is the sample proportion and $Q_0 = 1 - P_0$.

Example: The manufacturers of a certain brand of pens claimed that 35% of the pen users in Bengaluru used their brand of pens.
To verify this claim, a survey of pen users was conducted. Among 347 of them, 107 said they used that particular brand. Does
this figure support the manufacturers' claim?
$$Z = \frac{p - P_0}{\sqrt{\dfrac{P_0 Q_0}{n}}}, \qquad p = \frac{x}{n}$$
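
A short numerical sketch of the pen example follows. The example does not state a level of significance, so the 5% level (k = 1.96, two-tailed) is an assumption made here; the function name is illustrative.

```python
from math import sqrt

def one_proportion_z(x: int, n: int, p0: float) -> float:
    """Z statistic for H0: P = p0, using the sample proportion p = x / n."""
    p = x / n
    q0 = 1 - p0
    return (p - p0) / sqrt(p0 * q0 / n)

# 107 of 347 surveyed pen users use the brand; the claim is P0 = 0.35.
z_cal = one_proportion_z(x=107, n=347, p0=0.35)
k = 1.96  # assumed: two-tailed critical value at the 5% level

print(f"Z_cal = {z_cal:.2f}")
print("reject H0" if abs(z_cal) > k else "do not reject H0")
```

Under that assumption |Z_cal| is about 1.63, which is less than 1.96, so the survey does not contradict the manufacturers' claim.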

Test for equality of proportion
Suppose there are two populations with unknown proportions and we
wish to test whether the proportions in the two populations are
equal.
The null hypothesis is $H_0: P_1 = P_2$ (the proportions are equal)
The alternative hypothesis is $H_1: P_1 \ne P_2$
$$Z = \frac{p_1 - p_2}{\sqrt{PQ\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$
where
$$P = \frac{x_1 + x_2}{n_1 + n_2}, \qquad Q = 1 - P$$

Example: In a random sample of 326 teenagers, 143 claimed to watch the National Geographic channel regularly.
Among a random sample of 213 adults, 137 watch it regularly. Test whether the proportion of teenage viewers differs from
that of adult viewers. (Use 5% level of significance)
$$p_1 = \frac{x_1}{n_1} \quad \text{and} \quad p_2 = \frac{x_2}{n_2}$$
$$P = \frac{x_1 + x_2}{n_1 + n_2}, \qquad Q = 1 - P$$
$H_0: P_1 = P_2 = P$
$H_1: P_1 \ne P_2$
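
The viewer example can be evaluated with the pooled-proportion formula as sketched below. The function name is illustrative; the critical value 1.96 is the two-tailed value for α = 0.05.

```python
from math import sqrt

def two_proportion_z(x1: int, n1: int, x2: int, n2: int) -> float:
    """Z statistic for H0: P1 = P2 using the pooled proportion P."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)   # P
    q = 1 - pooled                   # Q = 1 - P
    return (p1 - p2) / sqrt(pooled * q * (1 / n1 + 1 / n2))

# Teenagers: 143 of 326. Adults: 137 of 213.
z_cal = two_proportion_z(143, 326, 137, 213)
k = 1.96  # two-tailed critical value at the 5% level

print(f"Z_cal = {z_cal:.2f}")
print("reject H0" if abs(z_cal) > k else "do not reject H0")
```

|Z_cal| comes out at roughly 4.65, well above 1.96, so at the 5% level the proportion of teenage viewers differs significantly from that of adult viewers.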

TESTS BASED ON t-DISTRIBUTION FOR SMALL SAMPLES
1. Test for single mean
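$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}, \qquad \text{where } s = \sqrt{\frac{\sum_i (x_i - \bar{x})^2}{n - 1}}$$
For the sample, if $|t_{cal}| > k$, $H_0$ is rejected.
If $|t_{cal}| \le k$, $H_0$ is accepted.
The critical value k for level of significance α is
$$k = t_{\alpha/2,\, n-1}$$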
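
When the raw observations are available, the t statistic can be computed directly from them. In the sketch below the five measurements and the hypothesized mean are hypothetical; `statistics.stdev` uses the n − 1 divisor, matching the definition of s for small samples.

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic(data, mu0: float) -> float:
    """t = (x_bar - mu0) / (s / sqrt(n)), with s computed using n - 1."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (divisor n - 1)
    return (x_bar - mu0) / (s / sqrt(n))

# Hypothetical measurements, for illustration only.
measurements = [0.71, 0.74, 0.73, 0.76, 0.75]
t_cal = t_statistic(measurements, mu0=0.70)
print(f"t_cal = {t_cal:.2f} with {len(measurements) - 1} degrees of freedom")
```

The computed value is then compared with the tabulated critical value for n − 1 degrees of freedom.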

Example 1: A manufacturer produces a kind of axle with a specified diameter of 0.700 inches. A random sample of 10
parts shows a mean diameter of 0.742 inches with a standard deviation of 0.040 inches. Is the manufacturer producing goods
that meet the specification? (Test at 5% level of significance).
$\mu = 0.700$ inches, $s = 0.040$ inches, $\bar{x} = 0.742$ inches
$n = 10$, i.e., d.f. $= n - 1 = 9$
1. $H_0: \mu = 0.700$ inches
$H_1: \mu \ne 0.700$ inches and
$\alpha = 0.05$
2. The test statistic is
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
The critical value is $k = t_{\alpha/2,\, n-1} = t_{0.025,\, 9} = 2.262$
3. The value of the test statistic using the sample is
$t_{cal} = 3.32$
4. Since $|t_{cal}| = 3.32 > 2.262$, $H_0$ is rejected: the sample indicates that the axles do not meet the specification.
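
The arithmetic of the axle example can be checked with a few lines of Python. The critical value 2.262 is taken from the example itself, since the standard library has no t-distribution table; the function name is illustrative.

```python
from math import sqrt

def t_from_summary(x_bar: float, mu0: float, s: float, n: int) -> float:
    """t statistic from summary figures: (x_bar - mu0) / (s / sqrt(n))."""
    return (x_bar - mu0) / (s / sqrt(n))

t_cal = t_from_summary(x_bar=0.742, mu0=0.700, s=0.040, n=10)
k = 2.262  # t_{0.025, 9}, as quoted in the example

print(f"t_cal = {t_cal:.2f}")
print("reject H0" if abs(t_cal) > k else "do not reject H0")
```

The computed value agrees with the 3.32 given in the example and exceeds the critical value.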

Example 2: A sample of 26 bulbs gives a mean life of 990 hours with a standard deviation of 20 hours. The manufacturer
claims that the mean life of bulbs is 1000 hours. Is the sample not up to the standard?
$\mu = 1000$, $s = 20$, $\bar{x} = 990$
$n = 26$, i.e., d.f. $= n - 1 = 25$
1. $H_0: \mu = 1000$
$H_1: \mu \ne 1000$ and
$\alpha = 0.05$
2. The test statistic is
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
The critical value is $k = t_{\alpha/2,\, n-1} = t_{0.025,\, 25}$
3. The value of the test statistic using the sample is
$|t_{cal}| \approx 2.55$
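
The bulb example can be completed numerically as sketched below. The example leaves the critical value unevaluated; the figure 2.060 used here is the usual table entry for 25 degrees of freedom at the two-tailed 5% level and is an addition, not a value given in the example.

```python
from math import sqrt

def t_from_summary(x_bar: float, mu0: float, s: float, n: int) -> float:
    """t statistic from summary figures: (x_bar - mu0) / (s / sqrt(n))."""
    return (x_bar - mu0) / (s / sqrt(n))

t_cal = t_from_summary(x_bar=990, mu0=1000, s=20, n=26)
k = 2.060  # assumed table value of t_{0.025, 25}; not stated in the example

print(f"t_cal = {t_cal:.2f}")
print("reject H0" if abs(t_cal) > k else "do not reject H0")
```

With these figures |t_cal| is about 2.55, which exceeds 2.060, so the two-tailed test at the 5% level rejects the claim that the mean life is 1000 hours.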

INTRODUCTION
❑ The chi-square test is an important test amongst the several tests of significance
developed by statisticians.
❑ It was developed by Karl Pearson in 1900.
❑ The chi-square test is a non-parametric test: it does not assume any particular
distribution for the variable under study.
❑ This statistical test follows a specific distribution known as chi square distribution.
❑ In general the test we use to measure the differences between what is observed
and what is expected according to an assumed hypothesis is called the chi-square
test.
In particular, for 1 degree of freedom, the critical value is k = 3.84 for α = 0.05 and k = 6.63 for α = 0.01.
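
The critical values 3.84 and 6.63 apply to one degree of freedom, and they can be recovered without a chi-square table: a chi-square variable with 1 degree of freedom is the square of a standard normal variable, so its critical value is the square of the two-tailed normal critical value. The sketch below uses that relationship; the function name is illustrative.

```python
from statistics import NormalDist

def chi_square_critical_df1(alpha: float) -> float:
    """Critical value of chi-square with 1 degree of freedom.

    Uses the fact that chi-square(1) is the square of a standard normal,
    so P(chi2 > k) = alpha  <=>  P(|Z| > sqrt(k)) = alpha.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * z

for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: k = {chi_square_critical_df1(alpha):.2f}")
```

This also shows why 3.84 is simply 1.96 squared.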

Example 1: Suppose that out of 100 participants in an awareness program camp, there were 60 girls and
40 boys. These results are the observed frequencies and are denoted by O.

         Observed (O)   Expected (E)   O − E   (O − E)²
Boys          40             50         −10      100
Girls         60             50          10      100

The χ² value for girls and boys will be:
$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{100}{50} + \frac{100}{50} = 2 + 2 = 4 > 3.841$$
Therefore the hypothesis that there is no difference between the expected and observed
values is rejected at the 5% level of significance.
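
The calculation in this example is the sum of (O − E)²/E over the categories. A minimal sketch, with an illustrative function name:

```python
def chi_square(observed, expected) -> float:
    """Chi-square statistic: sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Boys and girls: observed 40 and 60, expected 50 each.
chi2_cal = chi_square([40, 60], [50, 50])
k = 3.841  # critical value for 1 degree of freedom at the 5% level

print(f"chi-square = {chi2_cal}")
print("reject H0" if chi2_cal > k else "do not reject H0")
```

The statistic equals 4, which exceeds 3.841, in line with the conclusion of the example.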

Example 2:

               Leaf Cutter Ants   Carpenter Ants   Black Ants   Total
Observed (O)         25                 18             17         60
Expected (E)         20                 20             20         60
O − E                 5                 −2             −3          0
(O − E)²/E          1.25               0.20           0.45      χ² = 1.90

H0: Lizards eat equal amounts of leaf cutter, carpenter and black ants.
HA: Lizards eat more of one species of ants than of the others.
Calculate degrees of freedom: number of categories − 1 = 3 − 1 = 2
At a level of significance of your choice (e.g. α = 0.05, i.e. 95% confidence),
look up the critical value on a chi-square distribution table.

Critical value from the chi-square table for α = 0.05 and 2 degrees of freedom: χ² = 5.991

Critical value: χ² = 5.991. Our calculated value: χ² = 1.90
If the critical value is greater than the calculated value, the null hypothesis is not
rejected: the differences between observed and expected frequencies can be attributed to chance.
5.991 > 1.90 ∴ We do not reject our null hypothesis.
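
The ant example can be reproduced as sketched below. For 2 degrees of freedom the chi-square upper-tail probability is exp(−x/2), so the critical value is −2 ln α and no table lookup is needed. That shortcut holds only for 2 degrees of freedom; the function names are illustrative.

```python
from math import log

def chi_square(observed, expected) -> float:
    """Chi-square statistic: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi_square_critical_df2(alpha: float) -> float:
    """Critical value for 2 degrees of freedom, where P(chi2 > x) = exp(-x / 2)."""
    return -2 * log(alpha)

observed = [25, 18, 17]   # leaf cutter, carpenter, black ants
expected = [20, 20, 20]   # equal amounts under H0

chi2_cal = chi_square(observed, expected)
k = chi_square_critical_df2(0.05)

print(f"chi-square = {chi2_cal:.2f}, critical value = {k:.3f}")
print("reject H0" if chi2_cal > k else "do not reject H0")
```

The statistic is 1.90 against a critical value of about 5.991, so the null hypothesis of equal consumption is not rejected.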
