Statistics Lab
Rodolfo Metulini
IMT Institute for Advanced Studies, Lucca, Italy

Lesson 3 - Point Estimate, Confidence Interval and Hypothesis
Tests - 16.01.2014
Introduction
We start from empirical data (one variable of length N)
read from an external file, and we treat it as the
population. From it we draw a sample of size n.
Suppose we have no information on the population (or, better, we
want to check whether, and how well, the sample represents the
population).
In other words, we want to make inference using the
information contained in the sample, in order to obtain an
estimate for the population.
That sample is only one of the many samples we could randomly draw from
the population (the sample space).
What instruments give us information about the population?
(1) Sample mean (point estimate) (2) Confidence interval (3)
Hypothesis tests
Sample space
In probability theory, the sample space of an experiment or random
trial is the set of all possible outcomes or results of that
experiment.
It is common to refer to a sample space by the labels S, Ω, or
U.
For example, for tossing two coins, the corresponding sample space
is {(head,head), (head,tail), (tail,head), (tail,tail)}, so that
its size is 4: dim(Ω) = 4. It means that we can obtain 4
different samples, each with its corresponding sample
mean.
In practice, we deal with only one sample, taken at random from
the sample space.
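As a minimal sketch of the two-coin example, the sample space can be enumerated directly. Coding head as 1 and tail as 0 is an assumption made here for illustration, so that each sample has a numeric mean; it is not part of the slides.

```python
from itertools import product

# Encode head as 1 and tail as 0 (an illustrative choice, not from the slides).
outcomes = {"head": 1, "tail": 0}

# The sample space is every ordered pair of outcomes for two tosses.
sample_space = list(product(outcomes, repeat=2))

# Each element of the sample space is one possible sample with its own mean.
sample_means = {s: sum(outcomes[face] for face in s) / len(s) for s in sample_space}

for sample, sample_mean in sample_means.items():
    print(sample, sample_mean)
```

Under this coding, (head,tail) and (tail,head) are distinct samples that happen to share the same mean.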
Point estimate
A point estimate lets us summarize the information contained
in the population (size N) through a single value
computed from n values.
The most widely used unbiased point estimator is the sample mean:
$$\hat{X}_n = \frac{1}{n} \sum_{i=1}^{n} x_i$$
Other point estimators are: (1) Sample Median (2) Sample Mode
(3) Geometric mean.
$$\text{Geometric Mean} = M_g = \left( \prod_{i=1}^{n} x_i \right)^{1/n} = \exp\left[ \frac{1}{n} \sum_{i=1}^{n} \ln x_i \right]$$
An example of what is not an estimator: the
sample mean computed after subsetting the sample by truncating it at a certain
value.
P.S. A naive definition of estimator: a quantity
computed using all the n observations in the sample.
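The point estimators listed above can be computed in a few lines. This is only a sketch on a hypothetical three-value sample; it evaluates the geometric mean through both of its forms (root of the product, exponential of the mean log) to show that they agree.

```python
import math
import statistics

# Hypothetical sample of positive values (the geometric mean needs x_i > 0).
x = [2.0, 8.0, 4.0]
n = len(x)

sample_mean = sum(x) / n
sample_median = statistics.median(x)

# Geometric mean as the n-th root of the product ...
gm_product = math.prod(x) ** (1 / n)
# ... and as the exponential of the mean of the logs.
gm_logs = math.exp(sum(math.log(v) for v in x) / n)

print(sample_mean, sample_median, gm_product, gm_logs)
```

The log form is usually preferred for long samples, because the raw product can overflow or underflow.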
Efficient estimators

The BLUE (Best Linear Unbiased Estimator) is defined as
follows:
1. it is a linear function of all the sample values
2. it is unbiased ($E(\hat{X}_n) = \theta$)
3. it has the smallest variance among all linear unbiased
estimators.
The sample mean is BLUE for the parameter µ.
Some estimators are biased but consistent: an estimator is
consistent when it converges to the true parameter value as n → ∞.
Point estimators - cases

Normal samples: $\hat{X}_n$ is the BLUE estimator for the µ parameter
(mean)
Bernoulli samples $f(x) = \rho^{x} (1 - \rho)^{1-x}$: $\hat{X}_n$ is an unbiased
estimator for the ρ parameter (frequency)
Poisson samples $f(x) = \frac{e^{-k} k^{x}}{x!}$: $\hat{X}_n$ is an unbiased estimator
for the k parameter (which represents both the mean and the variance of
the distribution)
Exponential samples $f(x) = \lambda e^{-\lambda x}$: $\frac{1}{\hat{X}_n}$ is an
estimator for the λ parameter (density at value 0); it is consistent,
although biased for finite n
Confidence interval theory
With point estimators we use only one value to infer
about the population.
With a confidence interval we define a minimum and a maximum
value between which we expect the population parameter to lie.
Formally, we need to calculate:
$$\hat{\mu}_1 = \hat{X}_n - z \cdot \frac{\sigma}{\sqrt{n}}$$
$$\hat{\mu}_2 = \hat{X}_n + z \cdot \frac{\sigma}{\sqrt{n}}$$
and we end up with the interval $\hat{\mu} = \{\hat{\mu}_1 ; \hat{\mu}_2\}$
Here: $\hat{X}_n$ is the sample mean; z is the upper (or lower) critical
value of the theoretical distribution; σ is the standard deviation of
the theoretical distribution; n is the sample size.
(See the graph)
Confidence interval theory - Gaussian
We will make some assumptions about what we might find in an
experiment and find the resulting confidence interval using a
normal distribution.
Let us assume that the sample mean is 5, the standard deviation in
the population is known and equal to 2, and the sample size is
n = 20. In this example we use a 95 per cent confidence
level and wish to find the confidence interval.
N.B. Here, since the confidence level is 95 per cent, the z (the critical
value) to consider is the quantile (i.e. qnorm) at which the CDF
equals 0.975.
We can also speak of α = 0.05, or 1 − α = 0.95, or
1 − α/2 = 0.975
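The Gaussian interval for the numbers on this slide can be sketched as follows. The slides work in R; this sketch assumes Python's standard library instead, where `NormalDist().inv_cdf` plays the role of R's `qnorm`.

```python
from math import sqrt
from statistics import NormalDist

sample_mean = 5.0   # from the slide
sigma = 2.0         # known population standard deviation, from the slide
n = 20              # sample size, from the slide
alpha = 0.05        # 95 per cent confidence level

# Critical value: the 0.975 quantile of the standard normal (qnorm(0.975) in R).
z = NormalDist().inv_cdf(1 - alpha / 2)

margin = z * sigma / sqrt(n)
lower, upper = sample_mean - margin, sample_mean + margin
print(round(z, 4), round(lower, 4), round(upper, 4))
```

With these inputs the interval is roughly 4.12 to 5.88.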
Confidence interval theory - T-student
We use the Student's t distribution when n is small and the population sd is
unknown. We need to use a sample
estimate of the standard deviation:
$$\hat{\sigma} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \hat{X}_n)^2}{n-1}}$$
The Student's t distribution is more spread out.
In simple words, since we do not know the population sd, we need
wider intervals (a cautious approach).
The only difference from the normal case is that we use the
command associated with the t distribution rather than the normal
distribution. Here we repeat the procedure above, but we
assume that we are working with a sample standard deviation
rather than an exact standard deviation.
N.B. The t distribution is characterized by its degrees of freedom. In
this test the degrees of freedom are equal to n − 1, because we use 1
estimate (1 constraint).
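A sketch of the same interval under the t distribution, reusing mean 5, standard deviation 2 and n = 20 but treating the standard deviation as a sample estimate. Python's standard library has no t quantile function, so the critical value is assumed here from a standard t table; in R it would come from `qt(0.975, df = 19)`.

```python
from math import sqrt

sample_mean = 5.0   # same illustrative values as the Gaussian case
sample_sd = 2.0     # now treated as a sample standard deviation
n = 20

# 0.975 quantile of the t distribution with n - 1 = 19 degrees of freedom,
# copied from a standard t table (qt(0.975, df = 19) in R).
t_crit = 2.093

margin = t_crit * sample_sd / sqrt(n)
lower, upper = sample_mean - margin, sample_mean + margin
print(round(lower, 3), round(upper, 3))
```

The interval is wider than the Gaussian one because 2.093 exceeds the normal critical value of about 1.96.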
Confidence interval theory - comparison of two means

In some cases we have an experiment called (for example)
case-control.
Let's imagine the population split in 2: one is the
treated group, the other is the untreated group.
Suppose we extract two samples from them with the aim of testing whether the
two samples come from populations with the same mean
parameter (is the treatment effective?)
The output of this test will be a confidence interval representing the
difference between the two means.
N.B. Here, the degrees of freedom of the t distribution are equal to
min(n1 , n2 ) − 1
Formulas
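Gaussian confidence interval:
$$\hat{\mu} = \{\hat{\mu}_1, \hat{\mu}_2\} = \hat{X}_n \pm z \cdot \frac{\sigma}{\sqrt{n}}$$

T - student confidence interval:
$$\hat{\mu} = \{\hat{\mu}_1, \hat{\mu}_2\} = \hat{X}_n \pm t_{n-1} \cdot \frac{\hat{\sigma}}{\sqrt{n}}$$

T-student confidence interval for two sample difference:
$$\hat{\mu}_{diff} = \{\hat{\mu}_{diff1}, \hat{\mu}_{diff2}\} = (\hat{X}_1 - \hat{X}_2) \pm t_{n-1} \cdot sd$$
where $sd = \sqrt{\frac{sd_1^2}{n_1} + \frac{sd_2^2}{n_2}}$ and $n = \min(n_1, n_2)$

Gaussian confidence interval for proportion (Bernoulli
distribution):
$$\hat{\rho} = \{\rho_1, \rho_2\} = \hat{f} \pm z \cdot sd$$
where $sd = \sqrt{\frac{\rho (1 - \rho)}{n}}$ (in practice ρ is replaced by the sample frequency $\hat{f}$)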
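The Gaussian interval for a proportion can be sketched like this. The counts are hypothetical, and the unknown ρ inside the standard deviation is replaced by the sample frequency, which is the usual practical assumption.

```python
from math import sqrt
from statistics import NormalDist

successes, n = 40, 100      # hypothetical counts, not from the slides
alpha = 0.05                # 95 per cent confidence level

f_hat = successes / n                    # sample frequency
sd = sqrt(f_hat * (1 - f_hat) / n)       # rho replaced by its estimate f_hat
z = NormalDist().inv_cdf(1 - alpha / 2)  # qnorm(0.975) in R

rho_1, rho_2 = f_hat - z * sd, f_hat + z * sd
print(round(rho_1, 4), round(rho_2, 4))
```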
Hypothesis testing
Researchers retain or reject a hypothesis based on measurements of
observed samples.
The decision is often based on a statistical mechanism called
hypothesis testing.
A type I error is the mishap of falsely rejecting a null hypothesis
when the null hypothesis is true.
The probability of committing a type I error is called the
significance level of the hypothesis test, and is denoted by the
Greek letter α (the same used in the confidence intervals).
We demonstrate the procedure of hypothesis testing in R first with
the intuitive critical value approach.
Then we discuss the popular (and very quick) p-value approach
as an alternative.
Hypothesis testing - lower tail

The null hypothesis of the lower tail test of the population mean
can be expressed as follows:
µ ≥ µ0 ; where µ0 is a hypothesized lower bound of the true
population mean µ.
Let us define the test statistic z in terms of the sample mean, the
sample size and the population standard deviation σ:
$$z = \frac{\hat{X}_n - \mu_0}{\sigma / \sqrt{n}}$$
Then the null hypothesis of the lower tail test is to be rejected if
$z \le z_{\alpha}$, where $z_{\alpha}$ is the 100(α) percentile of the standard normal
distribution.
Hypothesis testing - upper tail

The null hypothesis of the upper tail test of the population mean
can be expressed as follows:
µ ≤ µ0 ; where µ0 is a hypothesized upper bound of the true
population mean µ.
Let us define the test statistic z in terms of the sample mean, the
sample size and the population standard deviation σ:
$$z = \frac{\hat{X}_n - \mu_0}{\sigma / \sqrt{n}}$$
Then the null hypothesis of the upper tail test is to be rejected if
$z \ge z_{1-\alpha}$, where $z_{1-\alpha}$ is the 100(1 − α) percentile of the
standard normal distribution.
Hypothesis testing - two tailed

The null hypothesis of the two-tailed test of the population mean
can be expressed as follows:
µ = µ0 ; where µ0 is a hypothesized value of the true population
mean µ. Let us define the test statistic z in terms of the sample
mean, the sample size and the population standard deviation
σ:
$$z = \frac{\hat{X}_n - \mu_0}{\sigma / \sqrt{n}}$$
Then the null hypothesis of the two-tailed test is to be rejected if
$z \le z_{\alpha/2}$ or $z \ge z_{1-\alpha/2}$, where $z_{\alpha/2}$ is the 100(α/2) percentile of
the standard normal distribution.
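The three rejection rules for the z test (lower tail, upper tail, two-tailed) can be gathered into one small function. This is a sketch of the critical value approach only; the function name `z_test`, its arguments and the numbers in the usage lines are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n, alpha=0.05, tail="two"):
    """Return (z, reject) for a one-sample z test with known sigma."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    q = NormalDist().inv_cdf          # standard normal quantile function
    if tail == "lower":               # H0: mu >= mu0
        reject = z <= q(alpha)
    elif tail == "upper":             # H0: mu <= mu0
        reject = z >= q(1 - alpha)
    elif tail == "two":               # H0: mu == mu0
        reject = z <= q(alpha / 2) or z >= q(1 - alpha / 2)
    else:
        raise ValueError("tail must be 'lower', 'upper' or 'two'")
    return z, reject

# Hypothetical numbers: sample mean 5, sigma 2, n 20, tested against mu0 = 6.
z, reject_lower = z_test(5.0, 6.0, 2.0, 20, tail="lower")
_, reject_upper = z_test(5.0, 6.0, 2.0, 20, tail="upper")
_, reject_two = z_test(5.0, 6.0, 2.0, 20, tail="two")
print(round(z, 3), reject_lower, reject_upper, reject_two)
```

Here z is about -2.24, so the lower tail and two-tailed nulls are rejected at α = 0.05 while the upper tail null is retained.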
Hypothesis testing - lower tail with Unknown variance

The null hypothesis of the lower tail test of the population mean
can be expressed as follows:
µ ≥ µ0 ; where µ0 is a hypothesized lower bound of the true
population mean µ.
Let us define the test statistic t in terms of the sample mean, the
sample size and the sample standard deviation $\hat{\sigma}$:
$$t = \frac{\hat{X}_n - \mu_0}{\hat{\sigma} / \sqrt{n}}$$
Then the null hypothesis of the lower tail test is to be rejected if
$t \le t_{\alpha}$, where $t_{\alpha}$ is the 100(α) percentile of the Student t
distribution with n − 1 degrees of freedom.
Hypothesis testing - upper tail with Unknown variance

The null hypothesis of the upper tail test of the population mean
can be expressed as follows:
µ ≤ µ0 ; where µ0 is a hypothesized upper bound of the true
population mean µ.
Let us define the test statistic t in terms of the sample mean, the
sample size and the sample standard deviation $\hat{\sigma}$:
$$t = \frac{\hat{X}_n - \mu_0}{\hat{\sigma} / \sqrt{n}}$$
Then the null hypothesis of the upper tail test is to be rejected if
$t \ge t_{1-\alpha}$, where $t_{1-\alpha}$ is the 100(1 − α) percentile of the Student
t distribution with n − 1 degrees of freedom.
Hypothesis testing - two tailed with Unknown variance

The null hypothesis of the two-tailed test of the population mean
can be expressed as follows:
µ = µ0 ; where µ0 is a hypothesized value of the true population
mean µ. Let us define the test statistic t in terms of the sample
mean, the sample size and the sample standard deviation $\hat{\sigma}$:
$$t = \frac{\hat{X}_n - \mu_0}{\hat{\sigma} / \sqrt{n}}$$
Then the null hypothesis of the two-tailed test is to be rejected if
$t \le t_{\alpha/2}$ or $t \ge t_{1-\alpha/2}$, where $t_{\alpha/2}$ is the 100(α/2) percentile of
the Student t distribution with n − 1 degrees of freedom.
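A sketch of the two-tailed t test on raw data, using a hypothetical sample of five measurements. As the standard library offers no t quantiles, the critical value is assumed from a standard t table (R would give it via `qt(0.975, df = 4)`); by symmetry the lower critical value is its negative.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample of n = 5 measurements, tested against mu0 = 10.
sample = [9.8, 10.2, 10.4, 9.9, 10.7]
mu0 = 10.0
n = len(sample)

sigma_hat = stdev(sample)                      # divides by n - 1
t = (mean(sample) - mu0) / (sigma_hat / sqrt(n))

# 0.975 quantile of the t distribution with n - 1 = 4 degrees of freedom,
# copied from a standard t table (qt(0.975, df = 4) in R).
t_crit = 2.776

reject_two_tailed = t <= -t_crit or t >= t_crit
print(round(t, 3), reject_two_tailed)
```

The statistic is about 1.22, well inside the critical values, so the null hypothesis µ = 10 is retained for this sample.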
Lower Tail Test of Population Proportion

The null hypothesis of the lower tail test about population
proportion can be expressed as follows:
ρ ≥ ρ0 ; where ρ0 is a hypothesized lower bound of the true
population proportion ρ.
Let us define the test statistic z in terms of the sample proportion
and the sample size:
$$z = \frac{\hat{\rho} - \rho_0}{\sqrt{\frac{\rho_0 (1 - \rho_0)}{n}}}$$
Then the null hypothesis of the lower tail test is to be rejected if
$z \le z_{\alpha}$, where $z_{\alpha}$ is the 100(α) percentile of the standard normal
distribution.
Upper Tail Test of Population Proportion

The null hypothesis of the upper tail test about population
proportion can be expressed as follows:
ρ ≤ ρ0 ; where ρ0 is a hypothesized upper bound of the true
population proportion ρ.
Let us define the test statistic z in terms of the sample proportion
and the sample size:
$$z = \frac{\hat{\rho} - \rho_0}{\sqrt{\frac{\rho_0 (1 - \rho_0)}{n}}}$$
Then the null hypothesis of the upper tail test is to be rejected if
$z \ge z_{1-\alpha}$, where $z_{1-\alpha}$ is the 100(1 − α) percentile of the standard
normal distribution.
Two Tailed Test of Population Proportion

The null hypothesis of the two-tailed test about population
proportion can be expressed as follows:
ρ = ρ0 ; where ρ0 is a hypothesized value of the true population
proportion ρ.
Let us define the test statistic z in terms of the sample proportion
and the sample size:
$$z = \frac{\hat{\rho} - \rho_0}{\sqrt{\frac{\rho_0 (1 - \rho_0)}{n}}}$$
Then the null hypothesis of the two-tailed test is to be rejected if
$z \le z_{\alpha/2}$ or $z \ge z_{1-\alpha/2}$
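The proportion test can be sketched with hypothetical counts, here for the lower tail case. The last lines show the p-value approach for the same test, using the standard normal CDF (`pnorm` in R).

```python
from math import sqrt
from statistics import NormalDist

successes, n = 40, 100     # hypothetical counts, not from the slides
rho0 = 0.5                 # hypothesized proportion
alpha = 0.05

rho_hat = successes / n
z = (rho_hat - rho0) / sqrt(rho0 * (1 - rho0) / n)

z_alpha = NormalDist().inv_cdf(alpha)          # lower-tail critical value
reject_lower = z <= z_alpha

# p-value approach for the same lower-tail test: P(Z <= z) under H0.
p_value = NormalDist().cdf(z)
print(round(z, 3), round(z_alpha, 3), reject_lower, round(p_value, 4))
```

Both approaches agree: z is about -2, below the critical value of about -1.645, and the p-value of about 0.023 is below α.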
Sample size definition

The quality of a sample survey can be improved (worsened) by
increasing (decreasing) the sample size.
The formulas below provide the sample size needed for
an interval estimate at the (1 − α)
confidence level, with margin of error E and a planned parameter
estimate.
Here, $z_{1-\alpha/2}$ is the 100(1 − α/2) percentile of the standard normal
distribution.
For mean:
$$n = \frac{z_{1-\alpha/2}^2 \cdot \sigma^2}{E^2}$$
For proportion:
$$n = \frac{z_{1-\alpha/2}^2 \cdot \rho (1 - \rho)}{E^2}$$
Sample size definition - Exercises
Mean: Assume the population standard deviation σ of the
student height in the survey is 9.48. Find the sample size needed
to achieve a 1.2 centimeter margin of error at the 95 per cent
confidence level.
Since there are two tails of the normal distribution, the 95 per
cent confidence level would imply the 97.5th percentile of the
normal distribution at the upper tail. Therefore, z1−α/2 is
given by qnorm(.975).
Proportion: Using a 50 per cent planned proportion estimate,
find the sample size needed to achieve 5 per cent margin of
error for the female student survey at 95 per cent confidence
level.
Since there are two tails of the normal distribution, the 95 per
cent confidence level would imply the 97.5th percentile of the
normal distribution at the upper tail. Therefore, z1−α/2 is
given by qnorm(.975).
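One way to evaluate the two sample size formulas for the exercise values is sketched below. Rounding up to the next integer is an assumption added here, on the grounds that a sample size must be a whole number of observations.

```python
from math import ceil
from statistics import NormalDist

z = NormalDist().inv_cdf(0.975)     # qnorm(.975) in R

# Mean: sigma = 9.48, margin of error E = 1.2 (values from the exercise).
n_mean = z**2 * 9.48**2 / 1.2**2

# Proportion: planned estimate rho = 0.5, margin of error E = 0.05.
n_prop = z**2 * 0.5 * (1 - 0.5) / 0.05**2

# Round up, since a sample size must be a whole number of observations.
print(ceil(n_mean), ceil(n_prop))
```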
Homeworks
1: Confidence interval for the proportion. Suppose we have a
sample of n = 25 births, 15 of which are female. Define the
interval (at 99 per cent) for the proportion of females in the
population. HINT: Apply, with the proper functions in R, the
formula in slide 11.
2: Hypothesis test to compare two proportions. Suppose we have
two schools. Sampling from the first, n = 20 and the Hispanic
students are 8. Sampling from the second, n = 18 and the Hispanic
students are 4. Can we state (at 95 per cent) that the frequency of
Hispanics is the same in the two schools? N.B.: the test here is
two tailed.
The hypothesis test here is:
$$z = \frac{\hat{\rho}_1 - \hat{\rho}_2}{sd}$$
where
$$\rho = \frac{\hat{\rho}_1 n_1 + \hat{\rho}_2 n_2}{n_1 + n_2}, \qquad sd = \sqrt{\rho (1 - \rho) \left[ \frac{1}{n_1} + \frac{1}{n_2} \right]}$$
Charts - 1

Figure: Representation of the critical point for the upper tail hypothesis
test
Charts - 2

Figure: Representation of the critical point for the lower tail hypothesis
test
Charts - 3

Figure: Representation of the critical point for the two-tailed hypothesis
test
Charts - 4

Figure: Type I and Type II errors in hypothesis testing

Contenu connexe

Tendances

Statistical inference concept, procedure of hypothesis testing
Statistical inference   concept, procedure of hypothesis testingStatistical inference   concept, procedure of hypothesis testing
Statistical inference concept, procedure of hypothesis testingAmitaChaudhary19
 
Statistical inference
Statistical inferenceStatistical inference
Statistical inferenceJags Jagdish
 
Point estimation.pptx
Point estimation.pptxPoint estimation.pptx
Point estimation.pptxDrNidhiSinha
 
Powerpoint sampling distribution
Powerpoint sampling distributionPowerpoint sampling distribution
Powerpoint sampling distributionSusan McCourt
 
Maximum likelihood estimation
Maximum likelihood estimationMaximum likelihood estimation
Maximum likelihood estimationzihad164
 
Estimation and hypothesis testing 1 (graduate statistics2)
Estimation and hypothesis testing 1 (graduate statistics2)Estimation and hypothesis testing 1 (graduate statistics2)
Estimation and hypothesis testing 1 (graduate statistics2)Harve Abella
 
Probability basics and bayes' theorem
Probability basics and bayes' theoremProbability basics and bayes' theorem
Probability basics and bayes' theoremBalaji P
 
Estimation in statistics
Estimation in statisticsEstimation in statistics
Estimation in statisticsRabea Jamal
 
Chapter 6 part1- Introduction to Inference-Estimating with Confidence (Introd...
Chapter 6 part1- Introduction to Inference-Estimating with Confidence (Introd...Chapter 6 part1- Introduction to Inference-Estimating with Confidence (Introd...
Chapter 6 part1- Introduction to Inference-Estimating with Confidence (Introd...nszakir
 
Lecture2 hypothesis testing
Lecture2 hypothesis testingLecture2 hypothesis testing
Lecture2 hypothesis testingo_devinyak
 

Tendances (20)

Confidence interval
Confidence intervalConfidence interval
Confidence interval
 
Hypothesis testing
Hypothesis testingHypothesis testing
Hypothesis testing
 
Statistical inference concept, procedure of hypothesis testing
Statistical inference   concept, procedure of hypothesis testingStatistical inference   concept, procedure of hypothesis testing
Statistical inference concept, procedure of hypothesis testing
 
Hypothesis testing
Hypothesis testingHypothesis testing
Hypothesis testing
 
Hypothesis testing Part1
Hypothesis testing Part1Hypothesis testing Part1
Hypothesis testing Part1
 
HYPOTHESIS TESTING.ppt
HYPOTHESIS TESTING.pptHYPOTHESIS TESTING.ppt
HYPOTHESIS TESTING.ppt
 
Statistical inference
Statistical inferenceStatistical inference
Statistical inference
 
Testing Hypothesis
Testing HypothesisTesting Hypothesis
Testing Hypothesis
 
Point estimation.pptx
Point estimation.pptxPoint estimation.pptx
Point estimation.pptx
 
Point estimation
Point estimationPoint estimation
Point estimation
 
Sufficient statistics
Sufficient statisticsSufficient statistics
Sufficient statistics
 
Basic concepts of probability
Basic concepts of probabilityBasic concepts of probability
Basic concepts of probability
 
Powerpoint sampling distribution
Powerpoint sampling distributionPowerpoint sampling distribution
Powerpoint sampling distribution
 
Maximum likelihood estimation
Maximum likelihood estimationMaximum likelihood estimation
Maximum likelihood estimation
 
Estimation and hypothesis testing 1 (graduate statistics2)
Estimation and hypothesis testing 1 (graduate statistics2)Estimation and hypothesis testing 1 (graduate statistics2)
Estimation and hypothesis testing 1 (graduate statistics2)
 
Probability basics and bayes' theorem
Probability basics and bayes' theoremProbability basics and bayes' theorem
Probability basics and bayes' theorem
 
Estimation in statistics
Estimation in statisticsEstimation in statistics
Estimation in statistics
 
Chapter 6 part1- Introduction to Inference-Estimating with Confidence (Introd...
Chapter 6 part1- Introduction to Inference-Estimating with Confidence (Introd...Chapter 6 part1- Introduction to Inference-Estimating with Confidence (Introd...
Chapter 6 part1- Introduction to Inference-Estimating with Confidence (Introd...
 
Bayes Theorem
Bayes TheoremBayes Theorem
Bayes Theorem
 
Lecture2 hypothesis testing
Lecture2 hypothesis testingLecture2 hypothesis testing
Lecture2 hypothesis testing
 

En vedette

Point and Interval Estimation
Point and Interval EstimationPoint and Interval Estimation
Point and Interval EstimationShubham Mehta
 
Theory of estimation
Theory of estimationTheory of estimation
Theory of estimationTech_MX
 
Point estimate for a population proportion p
Point estimate for a population proportion pPoint estimate for a population proportion p
Point estimate for a population proportion pMuel Clamor
 
Chapter 7 – Confidence Intervals And Sample Size
Chapter 7 – Confidence Intervals And Sample SizeChapter 7 – Confidence Intervals And Sample Size
Chapter 7 – Confidence Intervals And Sample SizeRose Jenkins
 
Chapter 3 Confidence Interval
Chapter 3 Confidence IntervalChapter 3 Confidence Interval
Chapter 3 Confidence Intervalghalan
 
Lesson 05 chapter 8 hypothesis testing
Lesson 05 chapter 8 hypothesis testingLesson 05 chapter 8 hypothesis testing
Lesson 05 chapter 8 hypothesis testingNing Ding
 
Chemical reaction and balancing chemical equation
Chemical reaction and balancing chemical equationChemical reaction and balancing chemical equation
Chemical reaction and balancing chemical equationInternational advisers
 
Chemical naming jeopardy
Chemical naming jeopardyChemical naming jeopardy
Chemical naming jeopardyzehnerm2
 
L10 confidence intervals
L10 confidence intervalsL10 confidence intervals
L10 confidence intervalsLayal Fahad
 
Alkanes ==names of each member , naming and physical and chemical properties
Alkanes ==names of each member , naming and physical and chemical propertiesAlkanes ==names of each member , naming and physical and chemical properties
Alkanes ==names of each member , naming and physical and chemical propertiesMRSMPC
 
10 naming and formula writing 2012
10 naming and formula writing 201210 naming and formula writing 2012
10 naming and formula writing 2012mrtangextrahelp
 
Writing and Naming formula
Writing and Naming formulaWriting and Naming formula
Writing and Naming formulaJeric Lazo
 
Telesidang 4 bab_8_9_10stst
Telesidang 4 bab_8_9_10ststTelesidang 4 bab_8_9_10stst
Telesidang 4 bab_8_9_10ststNor Ihsan
 
CABT SHS Statistics & Probability - Estimation of Parameters (intro)
CABT SHS Statistics & Probability -  Estimation of Parameters (intro)CABT SHS Statistics & Probability -  Estimation of Parameters (intro)
CABT SHS Statistics & Probability - Estimation of Parameters (intro)Gilbert Joseph Abueg
 

En vedette (20)

Point and Interval Estimation
Point and Interval EstimationPoint and Interval Estimation
Point and Interval Estimation
 
Theory of estimation
Theory of estimationTheory of estimation
Theory of estimation
 
Point estimate for a population proportion p
Point estimate for a population proportion pPoint estimate for a population proportion p
Point estimate for a population proportion p
 
Chapter 7 – Confidence Intervals And Sample Size
Chapter 7 – Confidence Intervals And Sample SizeChapter 7 – Confidence Intervals And Sample Size
Chapter 7 – Confidence Intervals And Sample Size
 
Confidence Intervals
Confidence IntervalsConfidence Intervals
Confidence Intervals
 
Chapter 3 Confidence Interval
Chapter 3 Confidence IntervalChapter 3 Confidence Interval
Chapter 3 Confidence Interval
 
Lesson 05 chapter 8 hypothesis testing
Lesson 05 chapter 8 hypothesis testingLesson 05 chapter 8 hypothesis testing
Lesson 05 chapter 8 hypothesis testing
 
Chemical reaction and balancing chemical equation
Chemical reaction and balancing chemical equationChemical reaction and balancing chemical equation
Chemical reaction and balancing chemical equation
 
Chemical naming jeopardy
Chemical naming jeopardyChemical naming jeopardy
Chemical naming jeopardy
 
L10 confidence intervals
L10 confidence intervalsL10 confidence intervals
L10 confidence intervals
 
Alkanes ==names of each member , naming and physical and chemical properties
Alkanes ==names of each member , naming and physical and chemical propertiesAlkanes ==names of each member , naming and physical and chemical properties
Alkanes ==names of each member , naming and physical and chemical properties
 
RESEARCH METHODS LESSON 3
RESEARCH METHODS LESSON 3RESEARCH METHODS LESSON 3
RESEARCH METHODS LESSON 3
 
10 naming and formula writing 2012
10 naming and formula writing 201210 naming and formula writing 2012
10 naming and formula writing 2012
 
Chemical nomenclature 1
Chemical nomenclature 1Chemical nomenclature 1
Chemical nomenclature 1
 
Writing and Naming formula
Writing and Naming formulaWriting and Naming formula
Writing and Naming formula
 
Unit 12 Chemical Naming and Formulas
Unit 12 Chemical Naming and FormulasUnit 12 Chemical Naming and Formulas
Unit 12 Chemical Naming and Formulas
 
Telesidang 4 bab_8_9_10stst
Telesidang 4 bab_8_9_10ststTelesidang 4 bab_8_9_10stst
Telesidang 4 bab_8_9_10stst
 
Point Estimation
Point EstimationPoint Estimation
Point Estimation
 
biostatistics basic
biostatistics basic biostatistics basic
biostatistics basic
 
CABT SHS Statistics & Probability - Estimation of Parameters (intro)
CABT SHS Statistics & Probability -  Estimation of Parameters (intro)CABT SHS Statistics & Probability -  Estimation of Parameters (intro)
CABT SHS Statistics & Probability - Estimation of Parameters (intro)
 

Similaire à Point Estimate, Confidence Interval, Hypotesis tests

hypothesisTestPPT.pptx
hypothesisTestPPT.pptxhypothesisTestPPT.pptx
hypothesisTestPPT.pptxdangwalakash07
 
Descriptive Statistics Formula Sheet Sample Populatio.docx
Descriptive Statistics Formula Sheet    Sample Populatio.docxDescriptive Statistics Formula Sheet    Sample Populatio.docx
Descriptive Statistics Formula Sheet Sample Populatio.docxsimonithomas47935
 
C2 st lecture 11 the t-test handout
C2 st lecture 11   the t-test handoutC2 st lecture 11   the t-test handout
C2 st lecture 11 the t-test handoutfatima d
 
Lect w2 measures_of_location_and_spread
Lect w2 measures_of_location_and_spreadLect w2 measures_of_location_and_spread
Lect w2 measures_of_location_and_spreadRione Drevale
 
Application of Statistical and mathematical equations in Chemistry Part 2
Application of Statistical and mathematical equations in Chemistry Part 2Application of Statistical and mathematical equations in Chemistry Part 2
Application of Statistical and mathematical equations in Chemistry Part 2Awad Albalwi
 
Small Sampling Theory Presentation1
Small Sampling Theory Presentation1Small Sampling Theory Presentation1
Small Sampling Theory Presentation1jravish
 
Pertemuan 10 new - Komputasi Statistik.pptx
Pertemuan 10 new - Komputasi Statistik.pptxPertemuan 10 new - Komputasi Statistik.pptx
Pertemuan 10 new - Komputasi Statistik.pptxSANDIPALAGALANA
 
Statistics Applied to Biomedical Sciences
Statistics Applied to Biomedical SciencesStatistics Applied to Biomedical Sciences
Statistics Applied to Biomedical SciencesLuca Massarelli
 
C2 st lecture 10 basic statistics and the z test handout
C2 st lecture 10   basic statistics and the z test handoutC2 st lecture 10   basic statistics and the z test handout
C2 st lecture 10 basic statistics and the z test handoutfatima d
 
Sampling distribution.pptx
Sampling distribution.pptxSampling distribution.pptx
Sampling distribution.pptxssusera0e0e9
 
Categorical data analysis full lecture note PPT.pptx
Categorical data analysis full lecture note  PPT.pptxCategorical data analysis full lecture note  PPT.pptx
Categorical data analysis full lecture note PPT.pptxMinilikDerseh1
 

Similaire à Point Estimate, Confidence Interval, Hypotesis tests (20)

Talk 3
Talk 3Talk 3
Talk 3
 
Inferential statistics-estimation
Inferential statistics-estimationInferential statistics-estimation
Inferential statistics-estimation
 
hypothesisTestPPT.pptx
hypothesisTestPPT.pptxhypothesisTestPPT.pptx
hypothesisTestPPT.pptx
 
Descriptive Statistics Formula Sheet Sample Populatio.docx
Descriptive Statistics Formula Sheet    Sample Populatio.docxDescriptive Statistics Formula Sheet    Sample Populatio.docx
Descriptive Statistics Formula Sheet Sample Populatio.docx
 
U unit8 ksb
U unit8 ksbU unit8 ksb
U unit8 ksb
 
C2 st lecture 11 the t-test handout
C2 st lecture 11   the t-test handoutC2 st lecture 11   the t-test handout
C2 st lecture 11 the t-test handout
 
Lect w2 measures_of_location_and_spread
Lect w2 measures_of_location_and_spreadLect w2 measures_of_location_and_spread
Lect w2 measures_of_location_and_spread
 
Application of Statistical and mathematical equations in Chemistry Part 2
Application of Statistical and mathematical equations in Chemistry Part 2Application of Statistical and mathematical equations in Chemistry Part 2
Application of Statistical and mathematical equations in Chemistry Part 2
 
estimation
estimationestimation
estimation
 
Estimation
EstimationEstimation
Estimation
 
TEST OF SIGNIFICANCE.pptx
TEST OF SIGNIFICANCE.pptxTEST OF SIGNIFICANCE.pptx
TEST OF SIGNIFICANCE.pptx
 
Small Sampling Theory Presentation1
Small Sampling Theory Presentation1Small Sampling Theory Presentation1
Small Sampling Theory Presentation1
 
Pertemuan 10 new - Komputasi Statistik.pptx
Pertemuan 10 new - Komputasi Statistik.pptxPertemuan 10 new - Komputasi Statistik.pptx
Pertemuan 10 new - Komputasi Statistik.pptx
 
Qt notes by mj
Qt notes by mjQt notes by mj
Qt notes by mj
 
Statistics Applied to Biomedical Sciences
Statistics Applied to Biomedical SciencesStatistics Applied to Biomedical Sciences
Statistics Applied to Biomedical Sciences
 
C2 st lecture 10 basic statistics and the z test handout
C2 st lecture 10   basic statistics and the z test handoutC2 st lecture 10   basic statistics and the z test handout
C2 st lecture 10 basic statistics and the z test handout
 
Sampling distribution.pptx
Sampling distribution.pptxSampling distribution.pptx
Sampling distribution.pptx
 
Estimating a Population Mean
Estimating a Population Mean  Estimating a Population Mean
Estimating a Population Mean
 
Categorical data analysis full lecture note PPT.pptx
Categorical data analysis full lecture note  PPT.pptxCategorical data analysis full lecture note  PPT.pptx
Categorical data analysis full lecture note PPT.pptx
 
Statistical analysis by iswar
Statistical analysis by iswarStatistical analysis by iswar
Statistical analysis by iswar
 

Plus de University of Salerno

Modelling traffic flows with gravity models and mobile phone large data
Modelling traffic flows with gravity models and mobile phone large dataModelling traffic flows with gravity models and mobile phone large data
Modelling traffic flows with gravity models and mobile phone large dataUniversity of Salerno
 
Carpita metulini 111220_dssr_bari_version2
Carpita metulini 111220_dssr_bari_version2Carpita metulini 111220_dssr_bari_version2
Carpita metulini 111220_dssr_bari_version2University of Salerno
 
A strategy for the matching of mobile phone signals with census data
A strategy for the matching of mobile phone signals with census dataA strategy for the matching of mobile phone signals with census data
A strategy for the matching of mobile phone signals with census dataUniversity of Salerno
 
Detecting and classifying moments in basketball matches using sensor tracked ...
Detecting and classifying moments in basketball matches using sensor tracked ...Detecting and classifying moments in basketball matches using sensor tracked ...
Detecting and classifying moments in basketball matches using sensor tracked ...University of Salerno
 
BASKETBALL SPATIAL PERFORMANCE INDICATORS
BASKETBALL SPATIAL PERFORMANCE INDICATORSBASKETBALL SPATIAL PERFORMANCE INDICATORS
BASKETBALL SPATIAL PERFORMANCE INDICATORSUniversity of Salerno
 
Human activity spatio-temporal indicators using mobile phone data
Human activity spatio-temporal indicators using mobile phone dataHuman activity spatio-temporal indicators using mobile phone data
Human activity spatio-temporal indicators using mobile phone dataUniversity of Salerno
 
Players Movements and Team Performance
Players Movements and Team PerformancePlayers Movements and Team Performance
Players Movements and Team PerformanceUniversity of Salerno
 
Metulini, R., Manisera, M., Zuccolotto, P. (2017), Sensor Analytics in Basket...
Metulini, R., Manisera, M., Zuccolotto, P. (2017), Sensor Analytics in Basket...Metulini, R., Manisera, M., Zuccolotto, P. (2017), Sensor Analytics in Basket...
Metulini, R., Manisera, M., Zuccolotto, P. (2017), Sensor Analytics in Basket...University of Salerno
 
Metulini, R., Manisera, M., Zuccolotto, P. (2017), Space-Time Analysis of Mov...
Metulini, R., Manisera, M., Zuccolotto, P. (2017), Space-Time Analysis of Mov...Metulini, R., Manisera, M., Zuccolotto, P. (2017), Space-Time Analysis of Mov...
Metulini, R., Manisera, M., Zuccolotto, P. (2017), Space-Time Analysis of Mov...University of Salerno
 
A Spatial Filtering Zero-Inflated approach to the estimation of the Gravity M...
A Spatial Filtering Zero-Inflated approach to the estimation of the Gravity M...A Spatial Filtering Zero-Inflated approach to the estimation of the Gravity M...
A Spatial Filtering Zero-Inflated approach to the estimation of the Gravity M...University of Salerno
 
The Water Suitcase of Migrants: Assessing Virtual Water Fluxes Associated to ...
The Water Suitcase of Migrants: Assessing Virtual Water Fluxes Associated to ...The Water Suitcase of Migrants: Assessing Virtual Water Fluxes Associated to ...
The Water Suitcase of Migrants: Assessing Virtual Water Fluxes Associated to ...University of Salerno
 
The Worldwide Network of Virtual Water with Kriskogram
The Worldwide Network of Virtual Water with KriskogramThe Worldwide Network of Virtual Water with Kriskogram
The Worldwide Network of Virtual Water with KriskogramUniversity of Salerno
 

Point Estimate, Confidence Interval, Hypothesis Tests

  • 1. Statistics Lab. Rodolfo Metulini, IMT Institute for Advanced Studies, Lucca, Italy. Lesson 3 - Point Estimate, Confidence Interval and Hypothesis Tests - 16.01.2014
  • 2. Introduction. We start with empirical data (one variable of length N) read from an external file, and we treat it as the population. From it we draw a sample of size n. Suppose we have no information on the population (or, better, that we want to check whether and how well the sample represents the population). In other words, we want to make inference using the information contained in the sample, in order to obtain an estimate for the population. That sample is only one of the several samples we could randomly draw from the population (the sample space). What tools give us information about the population? (1) Sample mean (point estimation) (2) Confidence interval (3) Hypothesis tests
  • 3. Sample space. In probability theory, the sample space of an experiment or random trial is the set of all possible outcomes or results of that experiment. It is common to refer to a sample space by the labels S, Ω, or U. For example, for tossing two coins, the sample space is {(head,head), (head,tail), (tail,head), (tail,tail)}, so it has 4 elements: dim(Ω) = 4. This means that we can obtain 4 different samples, each with its corresponding sample mean. In practice, we deal with only one sample, taken at random from the sample space.
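The two-coin sample space from the slide above can be enumerated directly. This is only a sketch: the coding head = 1, tail = 0 is an assumption made here so that a sample mean can be computed, and Python is used purely for illustration.

```python
from itertools import product

# Assumed coding for illustration (not from the slides): head = 1, tail = 0.
outcomes = {"head": 1, "tail": 0}

# Every ordered pair of outcomes for two coin tosses.
sample_space = list(product(outcomes, repeat=2))

for sample in sample_space:
    values = [outcomes[face] for face in sample]
    # Print each possible sample with its sample mean.
    print(sample, sum(values) / len(values))
```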
  • 4. Point estimate. A point estimate lets us summarize the information contained in the population (dimension N) through a single value constructed using n values. The most used unbiased point estimator is the sample mean: $\hat{X}_n = \frac{\sum_{i=1}^{n} x_i}{n}$. Other point estimators are: (1) Sample median (2) Sample mode (3) Geometric mean. Geometric mean: $M_g = \left(\prod_{i=1}^{n} x_i\right)^{1/n} = \exp\left[\frac{1}{n}\sum_{i=1}^{n} \ln x_i\right]$. An example of what is not an estimator is the sample mean computed after subsetting the sample by truncating it at a certain value. P.S. A naive definition of estimator: a quantity computed using all n observations in the sample.
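As a quick check of the two formulas on the slide above, the following sketch computes the sample mean and the geometric mean, the latter both as the n-th root of the product and as the exponential of the mean log. The sample values are hypothetical and Python is used only for illustration.

```python
import math

# Hypothetical sample values (not from the slides); they must be positive
# for the geometric mean to be defined.
sample = [2.0, 4.0, 8.0]
n = len(sample)

# Sample mean: sum of the values divided by n.
sample_mean = sum(sample) / n

# Geometric mean, computed in the two equivalent ways.
gm_product = math.prod(sample) ** (1 / n)
gm_logs = math.exp(sum(math.log(x) for x in sample) / n)

print(sample_mean, gm_product, gm_logs)
```

Both geometric-mean expressions agree up to floating-point rounding, which is the identity the slide states.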
  • 5. Efficient estimators. The BLUE (Best Linear Unbiased Estimator) is defined as follows: 1. it is a linear function of all the sample values; 2. it is unbiased ($E(\hat{X}_n) = \theta$); 3. it has the smallest variance among all linear unbiased estimators. The sample mean is BLUE for the parameter $\mu$. Some estimators are biased but consistent: an estimator is consistent when it converges (in probability) to the true parameter as $n \to \infty$, so a biased estimator can still be consistent if its bias vanishes as n grows.
  • 6. Point estimators - cases. (1) Normal samples: $\hat{X}_n$ is the BLUE estimator for the parameter $\mu$ (mean). (2) Bernoulli samples, $f(x) = \rho^{x}(1-\rho)^{1-x}$: $\hat{X}_n$ is an unbiased estimator for the parameter $\rho$ (frequency). (3) Poisson samples, $f(x) = \frac{e^{-k}k^{x}}{x!}$: $\hat{X}_n$ is an unbiased estimator for the parameter $k$ (which represents both the mean and the variance of the distribution). (4) Exponential samples, $f(x) = \lambda e^{-\lambda x}$: $1/\hat{X}_n$ is an estimator for the parameter $\lambda$ (the density at value 0); it is consistent, although slightly biased in small samples.
  • 7. Confidence interval theory. With point estimators we use only one value to infer about the population. With a confidence interval we define a minimum and a maximum value between which we expect the population parameter to lie. Formally, we need to calculate $\mu_1 = \hat{X}_n - z \cdot \frac{\sigma}{\sqrt{n}}$ and $\mu_2 = \hat{X}_n + z \cdot \frac{\sigma}{\sqrt{n}}$, and we end up with the interval $\mu = \{\mu_1; \mu_2\}$. Here $\hat{X}_n$ is the sample mean; $z$ is the upper (or lower) critical value of the theoretical distribution; $\sigma$ is the standard deviation of the theoretical distribution; $n$ is the sample size. (See the graph)
  • 8. Confidence interval theory - Gaussian. We make some assumptions about what we might find in an experiment and compute the resulting confidence interval using a normal distribution. Assume that the sample mean is 5, the population standard deviation is known and equal to 2, and the sample size is n = 20. In the example below we use a 95 per cent confidence level and wish to find the confidence interval. N.B. Here, since the confidence level is 95 per cent, the z (the critical value) to consider is the quantile at which the CDF equals 0.975, i.e. qnorm(0.975) in R. We can also speak of $\alpha = 0.05$, or $1 - \alpha = 0.95$, or $1 - \alpha/2 = 0.975$.
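The numbers on the slide above (sample mean 5, known σ = 2, n = 20, 95 per cent level) can be plugged into the interval formula as follows. This is a sketch in Python rather than R: `NormalDist().inv_cdf` from the standard library plays the role of R's `qnorm`, and the function name `normal_ci` is a hypothetical helper introduced here.

```python
from math import sqrt
from statistics import NormalDist

def normal_ci(sample_mean, sigma, n, conf_level=0.95):
    """Confidence interval for the mean when the population sd is known."""
    alpha = 1 - conf_level
    # Critical value: the quantile where the standard normal CDF is 1 - alpha/2.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

lower, upper = normal_ci(sample_mean=5, sigma=2, n=20)
print(round(lower, 3), round(upper, 3))  # 4.123 5.877
```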
  • 9. Confidence interval theory - T-student. We use the Student t distribution when n is small and the population sd is unknown. We need to use a sample estimate of the standard deviation: $\hat{\sigma} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \hat{X}_n)^2}{n-1}}$. The Student t distribution is more spread out. In simple words, since we do not know the population sd, we need wider intervals (a cautious approach). The only difference from the normal case is that we use the command associated with the t distribution rather than the one for the normal distribution. Here we repeat the procedure above, but we assume that we are working with a sample standard deviation rather than an exact standard deviation. N.B. The t distribution is characterized by its degrees of freedom. Here the degrees of freedom are equal to $n - 1$, because we use 1 estimate (1 constraint).
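The sample standard deviation with the n − 1 denominator can be sketched as below. The data are hypothetical, and the t critical value itself is not computed here because the Python standard library has no t quantile function (in R it would come from `qt`).

```python
from math import sqrt
from statistics import stdev

# Hypothetical sample (not from the slides).
sample = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(sample)
sample_mean = sum(sample) / n

# Sample standard deviation with n - 1 in the denominator
# (one degree of freedom is used up by estimating the mean).
sd_manual = sqrt(sum((x - sample_mean) ** 2 for x in sample) / (n - 1))

# statistics.stdev uses the same n - 1 convention.
print(sd_manual, stdev(sample))
```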
  • 10. Confidence interval theory - comparison of two means. In some cases we have an experiment of the (for example) case-control kind. Imagine the population split in two: one part is the treated group, the other is the non-treated group. Suppose we extract one sample from each, with the aim of testing whether the two samples come from populations with the same mean parameter (is the treatment effective?). The output of this test is a confidence interval for the difference between the two means. N.B. Here, the degrees of freedom of the t distribution are equal to $\min(n_1, n_2) - 1$.
  • 11. Formulas. Gaussian confidence interval: $\mu = \{\mu_1, \mu_2\} = \hat{X}_n \pm z \cdot \frac{\sigma}{\sqrt{n}}$. T-student confidence interval: $\mu = \{\mu_1, \mu_2\} = \hat{X}_n \pm t_{n-1} \cdot \frac{\hat{\sigma}}{\sqrt{n}}$. T-student confidence interval for the difference of two samples: $\mu_{diff} = \{\mu_{diff1}, \mu_{diff2}\} = (\hat{X}_1 - \hat{X}_2) \pm t_{n-1} \cdot sd$, where $sd = \sqrt{\frac{sd_1^2}{n_1} + \frac{sd_2^2}{n_2}}$. Gaussian confidence interval for a proportion (Bernoulli distribution): $\rho = \{\rho_1, \rho_2\} = \hat{f} \pm z \cdot sd$, where $sd = \sqrt{\frac{\rho(1-\rho)}{n}}$.
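The Gaussian interval for a proportion can be sketched as follows. The counts are hypothetical, `proportion_ci` is a helper name introduced here, and plugging the observed frequency into the standard deviation in place of the unknown ρ is an assumption of this sketch. Python's `NormalDist().inv_cdf` stands in for R's `qnorm`.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, conf_level=0.95):
    """Normal-approximation confidence interval for a proportion."""
    p_hat = successes / n  # observed frequency
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    # Assumption: the unknown rho is replaced by the observed frequency.
    sd = sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * sd, p_hat + z * sd

# Hypothetical data: 40 successes out of 100 trials.
low, high = proportion_ci(successes=40, n=100)
print(round(low, 3), round(high, 3))
```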
  • 12. Hypothesis testing. Researchers retain or reject a hypothesis based on measurements of observed samples. The decision is often based on a statistical mechanism called hypothesis testing. A type I error is the mishap of falsely rejecting a null hypothesis when the null hypothesis is true. The probability of committing a type I error is called the significance level of the hypothesis test, and is denoted by the Greek letter $\alpha$ (the same used in the confidence intervals). We demonstrate the procedure of hypothesis testing in R first with the intuitive critical value approach. Then we discuss the popular (and very quick) p-value approach as an alternative.
  • 13. Hypothesis testing - lower tail. The null hypothesis of the lower tail test of the population mean can be expressed as follows: $\mu \ge \mu_0$, where $\mu_0$ is a hypothesized lower bound of the true population mean $\mu$. Let us define the test statistic z in terms of the sample mean, the sample size and the population standard deviation $\sigma$: $z = \frac{\hat{X}_n - \mu_0}{\sigma/\sqrt{n}}$. Then the null hypothesis of the lower tail test is to be rejected if $z \le z_{\alpha}$, where $z_{\alpha}$ is the $100(\alpha)$ percentile of the standard normal distribution.
  • 14. Hypothesis testing - upper tail. The null hypothesis of the upper tail test of the population mean can be expressed as follows: $\mu \le \mu_0$, where $\mu_0$ is a hypothesized upper bound of the true population mean $\mu$. Let us define the test statistic z in terms of the sample mean, the sample size and the population standard deviation $\sigma$: $z = \frac{\hat{X}_n - \mu_0}{\sigma/\sqrt{n}}$. Then the null hypothesis of the upper tail test is to be rejected if $z \ge z_{1-\alpha}$, where $z_{1-\alpha}$ is the $100(1-\alpha)$ percentile of the standard normal distribution.
  • 15. Hypothesis testing - two tailed. The null hypothesis of the two-tailed test of the population mean can be expressed as follows: $\mu = \mu_0$, where $\mu_0$ is a hypothesized value of the true population mean $\mu$. Let us define the test statistic z in terms of the sample mean, the sample size and the population standard deviation $\sigma$: $z = \frac{\hat{X}_n - \mu_0}{\sigma/\sqrt{n}}$. Then the null hypothesis of the two-tailed test is to be rejected if $z \le z_{\alpha/2}$ or $z \ge z_{1-\alpha/2}$, where $z_{\alpha/2}$ is the $100(\alpha/2)$ percentile of the standard normal distribution.
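The three z tests above (lower tail, upper tail, two tailed) differ only in the rejection region, so they can be sketched with one helper. The function name `z_test`, its arguments and the numbers in the usage line are all hypothetical; Python's `NormalDist` is used in place of R's `qnorm` and `pnorm`. The p-value is returned alongside the critical-value decision.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n, alpha=0.05, tail="two"):
    """Return (z, reject_null, p_value) for a test of the mean with known sigma."""
    std_normal = NormalDist()
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    if tail == "lower":      # H0: mu >= mu0
        reject = z <= std_normal.inv_cdf(alpha)
        p_value = std_normal.cdf(z)
    elif tail == "upper":    # H0: mu <= mu0
        reject = z >= std_normal.inv_cdf(1 - alpha)
        p_value = 1 - std_normal.cdf(z)
    elif tail == "two":      # H0: mu == mu0
        reject = abs(z) >= std_normal.inv_cdf(1 - alpha / 2)
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("tail must be 'lower', 'upper' or 'two'")
    return z, reject, p_value

# Hypothetical example: sample mean 9900 against mu0 = 10000.
z, reject, p_value = z_test(sample_mean=9900, mu0=10000, sigma=120, n=30, tail="lower")
print(round(z, 4), reject)
```

The critical-value rule and the p-value rule give the same decision: the null hypothesis is rejected exactly when the p-value is at most α.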
  • 16. Hypothesis testing - lower tail with unknown variance. The null hypothesis of the lower tail test of the population mean can be expressed as follows: $\mu \ge \mu_0$, where $\mu_0$ is a hypothesized lower bound of the true population mean $\mu$. Let us define the test statistic t in terms of the sample mean, the sample size and the sample standard deviation $\hat{\sigma}$: $t = \frac{\hat{X}_n - \mu_0}{\hat{\sigma}/\sqrt{n}}$. Then the null hypothesis of the lower tail test is to be rejected if $t \le t_{\alpha}$, where $t_{\alpha}$ is the $100(\alpha)$ percentile of the Student t distribution with $n - 1$ degrees of freedom.
  • 17. Hypothesis testing - upper tail with unknown variance. The null hypothesis of the upper tail test of the population mean can be expressed as follows: $\mu \le \mu_0$, where $\mu_0$ is a hypothesized upper bound of the true population mean $\mu$. Let us define the test statistic t in terms of the sample mean, the sample size and the sample standard deviation $\hat{\sigma}$: $t = \frac{\hat{X}_n - \mu_0}{\hat{\sigma}/\sqrt{n}}$. Then the null hypothesis of the upper tail test is to be rejected if $t \ge t_{1-\alpha}$, where $t_{1-\alpha}$ is the $100(1-\alpha)$ percentile of the Student t distribution with $n - 1$ degrees of freedom.
  • 18. Hypothesis testing - two tailed with unknown variance. The null hypothesis of the two-tailed test of the population mean can be expressed as follows: $\mu = \mu_0$, where $\mu_0$ is a hypothesized value of the true population mean $\mu$. Let us define the test statistic t in terms of the sample mean, the sample size and the sample standard deviation $\hat{\sigma}$: $t = \frac{\hat{X}_n - \mu_0}{\hat{\sigma}/\sqrt{n}}$. Then the null hypothesis of the two-tailed test is to be rejected if $t \le t_{\alpha/2}$ or $t \ge t_{1-\alpha/2}$, where $t_{\alpha/2}$ is the $100(\alpha/2)$ percentile of the Student t distribution with $n - 1$ degrees of freedom.
  • 19. Lower Tail Test of Population Proportion. The null hypothesis of the lower tail test about the population proportion can be expressed as follows: $\rho \ge \rho_0$, where $\rho_0$ is a hypothesized lower bound of the true population proportion $\rho$. Let us define the test statistic z in terms of the sample proportion $\hat{\rho}$ and the sample size: $z = \frac{\hat{\rho} - \rho_0}{\sqrt{\rho_0(1-\rho_0)/n}}$. Then the null hypothesis of the lower tail test is to be rejected if $z \le z_{\alpha}$, where $z_{\alpha}$ is the $100(\alpha)$ percentile of the standard normal distribution.
  • 20. Upper Tail Test of Population Proportion. The null hypothesis of the upper tail test about the population proportion can be expressed as follows: $\rho \le \rho_0$, where $\rho_0$ is a hypothesized upper bound of the true population proportion $\rho$. Let us define the test statistic z in terms of the sample proportion $\hat{\rho}$ and the sample size: $z = \frac{\hat{\rho} - \rho_0}{\sqrt{\rho_0(1-\rho_0)/n}}$. Then the null hypothesis of the upper tail test is to be rejected if $z \ge z_{1-\alpha}$, where $z_{1-\alpha}$ is the $100(1-\alpha)$ percentile of the standard normal distribution.
  • 21. Two Tailed Test of Population Proportion. The null hypothesis of the two-tailed test about the population proportion can be expressed as follows: $\rho = \rho_0$, where $\rho_0$ is a hypothesized value of the true population proportion. Let us define the test statistic z in terms of the sample proportion $\hat{\rho}$ and the sample size: $z = \frac{\hat{\rho} - \rho_0}{\sqrt{\rho_0(1-\rho_0)/n}}$. Then the null hypothesis of the two-tailed test is to be rejected if $z \le z_{\alpha/2}$ or $z \ge z_{1-\alpha/2}$.
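The proportion test statistic can be sketched as below, here for the lower tail case. The counts and the hypothesized value are hypothetical, `proportion_z` is a helper name introduced for illustration, and Python's `NormalDist().inv_cdf` stands in for R's `qnorm`.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z(successes, n, p0):
    """Test statistic for a one-sample proportion test against p0."""
    p_hat = successes / n
    # The standard error uses the hypothesized p0, not the observed p_hat.
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

alpha = 0.05
z = proportion_z(successes=85, n=148, p0=0.6)  # hypothetical counts
z_alpha = NormalDist().inv_cdf(alpha)

# Lower tail test: reject H0 (rho >= rho0) only if z <= z_alpha.
reject_lower = z <= z_alpha
print(round(z, 4), round(z_alpha, 4), reject_lower)
```

For the upper tail or two-tailed versions only the comparison changes, using the quantiles at 1 − α or at α/2 and 1 − α/2 respectively.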
  • 22. Sample size definition. The quality of a sample survey can be improved (worsened) by increasing (decreasing) the sample size. The formulas below give the sample size needed for an interval estimate at $(1 - \alpha)$ confidence level, with margin of error E and a planned parameter estimate. Here, $z_{1-\alpha/2}$ is the $100(1 - \alpha/2)$ percentile of the standard normal distribution. For the mean: $n = \frac{z_{1-\alpha/2}^2 \cdot \sigma^2}{E^2}$. For the proportion: $n = \frac{z_{1-\alpha/2}^2 \cdot \rho(1-\rho)}{E^2}$.
  • 23. Sample size definition - Exercises. Mean: Assume the population standard deviation $\sigma$ of the student height in the survey is 9.48. Find the sample size needed to achieve a 1.2 centimeters margin of error at 95 per cent confidence level. Since there are two tails of the normal distribution, the 95 per cent confidence level implies the 97.5th percentile of the normal distribution at the upper tail. Therefore, $z_{1-\alpha/2}$ is given by qnorm(.975). Proportion: Using a 50 per cent planned proportion estimate, find the sample size needed to achieve a 5 per cent margin of error for the female student survey at 95 per cent confidence level. Since there are two tails of the normal distribution, the 95 per cent confidence level implies the 97.5th percentile of the normal distribution at the upper tail. Therefore, $z_{1-\alpha/2}$ is given by qnorm(.975).
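The two exercises above can be worked through with the sample size formulas as follows. This is a sketch in Python rather than R: `NormalDist().inv_cdf(0.975)` plays the role of `qnorm(.975)`, the helper names are hypothetical, and rounding the result up to the next whole unit is an assumption made here, since a sample size must be an integer.

```python
from math import ceil
from statistics import NormalDist

def sample_size_mean(sigma, margin, conf_level=0.95):
    """n = z^2 * sigma^2 / E^2 for estimating a mean."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return (z * sigma / margin) ** 2

def sample_size_proportion(p_planned, margin, conf_level=0.95):
    """n = z^2 * p * (1 - p) / E^2 for estimating a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return z ** 2 * p_planned * (1 - p_planned) / margin ** 2

# Values from the exercises: sigma = 9.48 and E = 1.2; p = 0.5 and E = 0.05.
n_mean = ceil(sample_size_mean(sigma=9.48, margin=1.2))
n_prop = ceil(sample_size_proportion(p_planned=0.5, margin=0.05))
print(n_mean, n_prop)  # 240 385
```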
  • 24. Homework. 1: Confidence interval for the proportion. Suppose we have a sample of n = 25 births, 15 of which are female. Define the interval (at 99 per cent) for the proportion of females in the population. HINT: Apply the formula in slide 11 with the proper functions in R. 2: Hypothesis test to compare two proportions. Suppose we have two schools. Sampling from the first, n = 20 and the Hispanic students are 8. Sampling from the second, n = 18 and the Hispanic students are 4. Can we state (at 95 per cent) that the frequency of Hispanics is the same in the two schools? N.B.: the test here is two tailed. The test statistic is: $z = \frac{\hat{\rho}_1 - \hat{\rho}_2}{sd}$, where $\rho = \frac{\rho_1 n_1 + \rho_2 n_2}{n_1 + n_2}$ and $sd = \sqrt{\rho(1-\rho)\left[\frac{1}{n_1} + \frac{1}{n_2}\right]}$.
  • 25. Charts - 1. Figure: Representation of the critical point for the upper tail hypothesis test
  • 26. Charts - 2. Figure: Representation of the critical point for the lower tail hypothesis test
  • 27. Charts - 3. Figure: Representation of the critical point for the two-tailed hypothesis test
  • 28. Charts - 4. Figure: Type I and Type II errors in hypothesis testing