Introduction to the t-test

INTRODUCTION TO
THE T STATISTIC

Scaife (1976) Eye-spot Pattern Test
• N = 16 birds placed in two-
chamber box for 60 minutes
• Can move freely between:
• Chamber with eye-spot patterns
• Chamber without patterns
• H0: Eye-spot patterns have no
effect on birds’ behavior
• If true, then birds should wander
randomly between chambers,
averaging 30 min. in each
• H0: µplainside = 30 min
$z = \frac{M - \mu}{\sigma_M} = \frac{M - \mu}{\sigma / \sqrt{n}}$
$\sigma_M = \frac{\sigma}{\sqrt{n}}$

THE T-STATISTIC: AN
ALTERNATIVE TO Z

Rewind: Hypothesis Testing with z
• We begin with a known population
• (μ and σ)
• We state the null and alternative hypotheses and
select an alpha level
• We find the critical region
• Critical z-score which separates the set of unlikely
outcomes if the null is true
• Compute the z-score corresponding to the mean
of the sample we have obtained
• Make a decision about the null hypothesis

Remember the Basic Concepts
1. For a single sample, M should approximate µ
2. The standard error (σM) provides a measure of
how well a sample mean approximates the
population mean
3. We compare the obtained sample mean (M)
with the hypothesized population mean (µ) by
computing a z-score statistic
$z = \frac{M - \mu}{\sigma_M}$, where $\sigma_M = \frac{\sigma}{\sqrt{n}}$
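As a quick illustration of the z-score recipe above, here is a minimal Python sketch. The function names and the numbers plugged in (M = 33, μ = 30, σ = 6, n = 16) are hypothetical and chosen only to keep the arithmetic easy; they are not taken from the Scaife study.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean: sigma_M = sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


def z_score(sample_mean: float, mu: float, sigma: float, n: int) -> float:
    """z = (M - mu) / sigma_M. Requires a KNOWN population sigma."""
    return (sample_mean - mu) / standard_error(sigma, n)


# Hypothetical values, for illustration only.
z = z_score(sample_mean=33.0, mu=30.0, sigma=6.0, n=16)
print(round(z, 2))  # sigma_M = 6 / 4 = 1.5, so z = 3 / 1.5 = 2.0
```

Note that the function cannot be called without σ, which is exactly the limitation the t statistic addresses.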

 

Computing the z-Score
Things we need
Sample size (n)
Sample mean (M)
Population mean (μ)
Population standard
deviation (σ)
Without these, we
cannot calculate:
$z = \frac{M - \mu}{\sigma_M}$, where $\sigma_M = \frac{\sigma}{\sqrt{n}}$

The t Statistic
Used to test hypotheses about an unknown
population mean μ when the value of σ is unknown
$t = \frac{M - \mu}{s_M}$
$s_M = \frac{s}{\sqrt{n}} = \sqrt{\frac{s^2}{n}}$

Same structure as the z-
statistic, except this
uses the estimated
standard error

How Do We Estimate Standard Error?
Standard Error
When we know the population standard deviation (σ):
$\sigma_M = \frac{\sigma}{\sqrt{n}}$

Estimated Standard Error
When we do not know the population standard deviation (σ),
we can use the information we do have (the sample standard
deviation, s) to get an estimate of the standard error:
$s_M = \frac{s}{\sqrt{n}}$

Estimated Standard Error (sM)
Used as an estimate of σM when the value of σ is
unknown
Computed using s (or s²) and provides an estimate
of the standard distance between a sample mean M
and the population mean μ
$s_M = \frac{s}{\sqrt{n}} = \sqrt{\frac{s^2}{n}}$
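The two forms of the estimated standard error (using s or using s²) give the same number. A small Python check, assuming a made-up set of four scores that is not from the lecture:

```python
import math
import statistics

scores = [2, 4, 6, 8]  # hypothetical scores, for illustration only
n = len(scores)

s = statistics.stdev(scores)      # sample standard deviation (divides by n - 1)
s2 = statistics.variance(scores)  # sample variance (divides by n - 1)

se_from_sd = s / math.sqrt(n)     # s_M = s / sqrt(n)
se_from_var = math.sqrt(s2 / n)   # s_M = sqrt(s^2 / n)

print(round(se_from_sd, 3), round(se_from_var, 3))
```

Both `statistics.stdev` and `statistics.variance` use n - 1 in the denominator, which matches the sample statistics used in these slides.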


Estimated Standard Error (sM)
Two reasons for using variance and not SD:
1. s² is an unbiased statistic
• On average, provides an
accurate and unbiased
estimate of σ²
• Thus, the most accurate way
to estimate σM
2. From this point forward
we will be working with
formulas that require
variance and not the SD.
• For familiarity’s sake
$s_M = \frac{s}{\sqrt{n}} = \sqrt{\frac{s^2}{n}}$
$\text{estimated standard error} = \sqrt{\frac{\text{sample variance}}{\text{sample size}}}$

The t-Statistic
z-score
• Uses population variance
• Population parameters are
known
$z = \frac{M - \mu}{\sigma_M} = \frac{M - \mu}{\sqrt{\sigma^2 / n}}$

t-score
• Uses sample variance
• Population parameters are
not known
$t = \frac{M - \mu}{s_M} = \frac{M - \mu}{\sqrt{s^2 / n}}$

$t = \frac{M - \mu}{s_M} = \frac{\text{obtained difference}}{\text{standard error}}$
A large t value indicates that the obtained difference
between the data and the hypothesis is greater than
would be expected if the treatment has no effect

Degrees of Freedom (df)
Describes the number of scores in the sample
that are independent and free to vary
• If n = 3 and M = 5, then ΣX must equal what?
• M = ΣX ÷ n → 5 = ΣX ÷ 3, so ΣX = 15
• The first two scores have no restrictions
• They are independent values and could be any value
• The third score is restricted
• Can only be one value, based on the sum of the first
two scores (2 + 9 = 11)
• X3 MUST be 4
• The sample mean places a restriction on the
value of one score in the sample
X
2
9
? ← This score has no “freedom”
n = 3, M = 5
ΣX = ? → ΣX = 3 × 5 = 15
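The restriction on the last score can be written out directly. This is a minimal sketch; the helper name `last_score` is made up for illustration.

```python
def last_score(free_scores, n, mean):
    """Return the one score that is forced once M and the other n - 1 scores are fixed."""
    if len(free_scores) != n - 1:
        raise ValueError("expected exactly n - 1 free scores")
    total = n * mean                 # ΣX = n × M
    return total - sum(free_scores)  # whatever is left must be the final score


x3 = last_score([2, 9], n=3, mean=5)
print(x3)  # 4, because 15 - (2 + 9) = 4
```

Changing either of the first two scores changes the forced value, but never frees it: only n - 1 scores can vary.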

Degrees of Freedom and the t-Score
Describes the number of scores in the sample that are
independent and free to vary
• For a sample of n scores, the degrees of freedom (df) are
defined as df = n-1
• When we know the sample mean, only n-1 of the scores are free to
vary; the final score is restricted
• The greater the df:
• The better s² represents σ²
• The better the t-statistic represents the z-score
• The larger the sample (n), the better it represents the population
• The df associated with s² also describes how well t represents z

The t Distribution
Remember this slide from Chapter 7?
Sampling Distribution
• A distribution of statistics obtained by
selecting all the possible samples of
a specific size from a population
• Distribution of statistics
• vs.
• Distribution of scores

The t Distribution
The complete set of t values computed for every
possible random sample for a specific sample size
(n) or a specific degrees of freedom (df)
• The distribution of z-scores computed from sampling
distribution of the mean tends to be normally distributed
• We look up proportions in the unit normal table
• The t-distribution approximates a
normal distribution in the same way
the t-statistic approximates a
z-score (+/- critical values)
• Exact shape changes with df

“Family” of t Distributions
• As sample size increases, df increases, and the
t-distribution more closely approximates a normal
distribution
• More variability than the z-distribution, because sM
changes with each sample (unlike σM)
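One way to see the extra variability is a small simulation. The sketch below is an illustration under stated assumptions, not part of the lecture: it draws samples from a standard normal population (μ = 0, σ = 1), computes t for each, and counts how often |t| exceeds 1.96, the cutoff that marks the outer 5% of the normal distribution. The sample sizes, trial count, and seed are arbitrary choices.

```python
import math
import random


def simulate_t_values(n, trials, rng):
    """Return the t statistic for each of `trials` samples of size n drawn from N(0, 1)."""
    t_values = []
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = sum(sample) / n
        ss = sum((x - mean) ** 2 for x in sample)
        s_m = math.sqrt(ss / (n - 1) / n)  # estimated standard error, sqrt(s^2 / n)
        t_values.append((mean - 0.0) / s_m)  # true mu is 0 by construction
    return t_values


def tail_proportion(values, cutoff=1.96):
    """Proportion of values whose absolute value exceeds the cutoff."""
    return sum(abs(v) > cutoff for v in values) / len(values)


rng = random.Random(0)  # fixed seed so the run is repeatable
tail_df4 = tail_proportion(simulate_t_values(5, 10000, rng))    # df = 4
tail_df29 = tail_proportion(simulate_t_values(30, 10000, rng))  # df = 29
print(tail_df4, tail_df29)
```

With small df, noticeably more than 5% of the t values should land beyond ±1.96; as df grows the proportion should move toward 5%, which is the sense in which the t distribution approaches the normal.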

Finding Proportions and Probabilities
1. Determine if your test is one- or two-tailed
2. Determine alpha level
3. Calculate df (n - 1)
4. Locate appropriate critical value
• Use more stringent (smaller) df if exact value is not listed

Examples
n = 41, α = 0.05, two-tailed test
df = 40, critical t-values = +2.021 and -2.021
two-tailed: $t_{0.05}(40) = \pm 2.021$
n = 18, α = 0.01, one-tailed test (decrease)
df = 17, critical t-value = -2.567
one-tailed: $t_{0.01}(17) = -2.567$

Hypothesis Test with the t Statistic

When Do We Use the t-Test?
1. To determine the effect of treatment on a
population mean
2. In situations where the population mean is
unknown

Hypothesis Testing Using the t Statistic
We have a population of interest with an unknown
variance (σ²) and unknown change in µ
1. State H0 (no change, etc.) and H1
2. Set the alpha level and locate the critical region
• Determine the critical t-value(s)
• Note: You must find the value for df and use the t-
distribution table
3. Collect sample data and calculate sM and t
4. Make a decision
• Either “reject” or “fail to reject” H0

Example
• A psychologist has prepared an “Optimism Test” that is
administered yearly to graduating college seniors. The
test measures how each graduating class feels about its
future – the higher the score, the more optimistic the
class. Last year’s class had a mean score of μ = 15.
• A sample of n = 9 seniors from this year’s class was
selected and tested. The scores for these seniors are as
follows:
7 12 11 15 7 8 15 9 6
• On the basis of this sample, can the psychologist
conclude that this year’s class has a different level of
optimism from last year’s class?

Before we do anything, we must…
…choose a test statistic!
We want to evaluate a mean difference between M
and μ. We do not know the population standard
deviation
What statistic would I use?
(hint: it’s what we’ve been discussing this ENTIRE lecture)
t-Test! Correctamundo!

Step 1: State your hypotheses
• State the hypotheses
H0: There is no difference in mean Optimism Test scores
between this year’s class and last year’s class.
$H_0: \mu = 15$
H1: The mean score on the Optimism Test for this year’s class
is different from the mean score for last year’s class.
$H_1: \mu \neq 15$

Step 2: Locate the Critical Region
• Select an alpha level
$\alpha = 0.05$
• Determine df
$df = n - 1 = 9 - 1 = 8$
• Locate the critical region
• Is this a one- or two-tailed test?
On the basis of this sample, can the psychologist conclude that this
year’s class has a different level of optimism from last year’s class?
$t_{0.05}(8) = \pm 2.306$

Step 3: Calculate t
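• Calculate the sample mean:
$M = \frac{\Sigma X}{n} = \frac{90}{9} = 10$
• Calculate estimated standard error from the
sample data:
$SS = \Sigma X^2 - \frac{(\Sigma X)^2}{n} = 994 - \frac{(90)^2}{9} = 994 - \frac{8100}{9} = 994 - 900 = 94$
$s^2 = \frac{SS}{n - 1} = \frac{94}{8} = 11.75$
$s_M = \sqrt{\frac{s^2}{n}} = \sqrt{\frac{11.75}{9}} = 1.14$
• Calculate the t-statistic:
$t = \frac{M - \mu}{s_M} = \frac{10 - 15}{1.14} = -4.39$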
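The Step 3 arithmetic can be reproduced from the nine Optimism Test scores. This is a sketch using only the Python standard library; the variable names are illustrative.

```python
import math

scores = [7, 12, 11, 15, 7, 8, 15, 9, 6]  # this year's sample
mu = 15                                    # last year's mean (the H0 value)

n = len(scores)
M = sum(scores) / n                                      # sample mean
SS = sum(x ** 2 for x in scores) - sum(scores) ** 2 / n  # SS = ΣX² - (ΣX)² / n
s2 = SS / (n - 1)                                        # sample variance
s_M = math.sqrt(s2 / n)                                  # estimated standard error
t = (M - mu) / s_M                                       # t statistic

print(M, SS, s2, round(s_M, 2), round(t, 2))
```

Carrying full precision gives t ≈ -4.38. The -4.39 reported in this example comes from rounding sM to 1.14 before dividing (-5 ÷ 1.14). The decision about H0 is the same either way.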


Make a Decision Regarding H0
Our Computed t
$t(8) = -4.39$
Our Critical t (tcrit)
$t_{0.05}(8) = \pm 2.306$
• Our t value falls in the critical region
• Our obtained t (-4.39) has a larger absolute value than our critical t
(2.306)
• Reject the null hypothesis at the 0.05 level of significance
• We can conclude that there is a significant difference in
level of optimism between this year’s and last year’s
graduating class:
$t(8) = -4.39,\ p < 0.05$, two-tailed

Assumptions of the t-Test
1. The values in the sample must consist of
independent observations
2. The population sampled must be normal
• The t-test is robust against violation of this
assumption if the sample size is relatively large

Influence of n and s²
Sample size
• Sample size is inversely
related to the estimated
standard error.
• A large sample size
increases the likelihood of a
significant test
Variance
• Sample variance is directly
related to the estimated
standard error.
• A large variance decreases
the likelihood of a
significant test
$t = \frac{M - \mu}{s_M}$; $s_M = \sqrt{\frac{s^2}{n}}$
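To see both influences at once, hold the mean difference fixed and vary n and s². The numbers below are hypothetical and chosen so the square roots come out clean.

```python
import math


def t_statistic(mean_difference, s2, n):
    """t = (M - mu) / s_M, with s_M = sqrt(s^2 / n)."""
    s_m = math.sqrt(s2 / n)
    return mean_difference / s_m


# Hypothetical 5-point difference between M and mu in every case.
baseline = t_statistic(5, s2=100, n=25)         # s_M = 2, t = 2.5
larger_sample = t_statistic(5, s2=100, n=100)   # s_M = 1, t = 5.0
larger_variance = t_statistic(5, s2=400, n=25)  # s_M = 4, t = 1.25

print(baseline, larger_sample, larger_variance)
```

Quadrupling n doubles t, while quadrupling s² halves it, because both enter the standard error under a square root.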




Measuring the Effect Size for t



Cohen’s $d = \frac{M - \mu}{\sigma}$
Estimated $d = \frac{M - \mu}{s}$
Okay, so you found a significant effect. What does
that mean?
• You cannot assume that a “significant effect” means
there was a large effect of treatment.
• We must always measure the effect size for t
• Just like we did for the z, we calculate Cohen’s d for the t
• Difference: Now we use the s instead of the
(unknown) σ

Measuring the Effect Size for t
Estimated $d = \frac{M - \mu}{s} = \frac{\text{mean difference}}{\text{sample standard deviation}}$
(0 < d < 0.2) = Small effect size
(0.2 < d < 0.8) = Medium effect size
(d > 0.8) = Large effect size
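Applied to the Optimism Test example (M = 10, μ = 15, s² = 11.75), estimated d can be sketched as follows. Two details are assumptions of this sketch rather than something the slides state: the label is based on the absolute value of d, and a value falling exactly on a cutoff is assigned to the larger category.

```python
import math


def estimated_d(sample_mean, mu, s):
    """Estimated Cohen's d = (M - mu) / s."""
    return (sample_mean - mu) / s


def effect_size_label(d):
    """Label |d| with the cutoffs listed above (boundary handling is an assumption)."""
    size = abs(d)
    if size < 0.2:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"


s = math.sqrt(11.75)  # sample SD from the Optimism Test example
d = estimated_d(10, 15, s)
print(round(d, 2), effect_size_label(d))
```

The sign of d only records the direction of the difference; here it is negative because this year's sample mean is below 15.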

An Alternative Measure of Effect Size
By measuring the amount of variability that can be
attributed to the treatment, we obtain a measure of
the size of the treatment effect.
(0.01 < r ² < 0.09) = Small effect size
(0.09 < r ² < 0.25) = Medium effect size
(r ² > 0.25) = Large effect size
If r ² = 0.11, then the treatment has a medium effect;
11% of the variability in the sample is due to the treatment effect



$r^2 = \frac{t^2}{t^2 + df}$
The proportion of variance
accounted for by the treatment
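As an illustration, r² can be computed for the Optimism Test result reported earlier (t = -4.39, df = 8). The function name is made up for this sketch.

```python
def r_squared(t, df):
    """r^2 = t^2 / (t^2 + df): proportion of variance accounted for by the treatment."""
    return t ** 2 / (t ** 2 + df)


r2 = r_squared(-4.39, 8)
print(round(r2, 2))  # 19.27 / 27.27, roughly 0.71
```

Because t is squared, the sign of t does not matter. By the cutoffs above, a value near 0.71 would count as a large effect.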

Confidence Intervals
A range of values that estimates the unknown population
mean by estimating the t value
• Consists of an interval of values around a sample mean
• s = a reasonably accurate estimate of σ
• If we obtain an M = 86, we can be reasonably confident
that µ is around 86
• We can confidently estimate that the value of the parameter
should be located within that interval
$\mu = M \pm t_{crit}(s_M)$

• Every mean has a corresponding t-value
$t = \frac{M - \mu}{s_M}$
• HOWEVER, if we do not know µ,
we cannot compute t
$\mu = M \pm t_{crit}(s_M)$

What do I do? WHAT DO I DO?!?

We estimate the t value!
• If we have a sample with n = 9, then we know that df = 8
$\mu = M \pm t(s_M)$

For a hypothetical sample of n = 9
M = 13, sM = 1
1. Select a level of confidence (let’s use 80%)
2. Look up the t value associated with a two-tailed
proportion of 0.20 and a df = 8
tcrit = ±1.397
3. Calculate your confidence interval
$\mu = M \pm t_{crit}(s_M) = 13 \pm 1.397(1) = 13 \pm 1.397$
$13 + 1.397 = 14.397$
$13 - 1.397 = 11.603$
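The same interval in code, as a minimal sketch. The critical value 1.397 is typed in from the t table lookup in step 2 rather than computed, and the function name is illustrative.

```python
def confidence_interval(sample_mean, s_m, t_crit):
    """Return (lower, upper) for mu = M ± t_crit × s_M."""
    margin = t_crit * s_m
    return sample_mean - margin, sample_mean + margin


# Values from the hypothetical sample above: M = 13, s_M = 1, 80% confidence, df = 8.
lower, upper = confidence_interval(sample_mean=13, s_m=1, t_crit=1.397)
print(round(lower, 3), round(upper, 3))  # 11.603 14.397
```

A higher level of confidence means a larger critical t and therefore a wider interval; a larger sample shrinks sM and narrows it.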

Put EVERYTHING together…..
$t(8) = 3.00,\ p < .05,\ 80\%\ \text{CI}\ [11.603, 14.397]$

ONE-TAILED TESTS
Directional Hypotheses

Example
• A psychologist has prepared an “Optimism Test” that is
administered yearly to graduating college seniors. The
test measures how each graduating class feels about its
future – the higher the score, the more optimistic the
class. Last year’s class had a mean score of μ = 15.
• A sample of n = 9 seniors from this year’s class was
selected and tested. The scores for these seniors are as
follows:
7 12 11 15 7 8 15 9 6
• On the basis of this sample, can the psychologist
conclude that this year’s class has a higher level of
optimism than last year’s class?

Step 1: State your hypotheses
• State the hypotheses
H0: There is no difference in mean Optimism Test scores
between this year’s class and last year’s class.
$H_0: \mu_{optimism} \leq 15$
H1: The mean score on the Optimism Test for this year’s class
is greater than the mean score for last year’s class.
$H_1: \mu_{optimism} > 15$

Step 2: Locate the Critical Region
• Select an alpha level
$\alpha = 0.05$
• Determine df
$df = n - 1 = 9 - 1 = 8$
• Locate the critical region
• Is this a one- or two-tailed test?
On the basis of this sample, can the psychologist conclude that this
year’s class has a greater level of optimism than last year’s class?
$t_{0.05}(8) = +1.860$

Make a Decision Regarding H0
Our Computed t
$t(8) = -4.39$
Our Critical t (tcrit)
$t_{0.05}(8) = +1.860$
• Our t value does not fall in the critical region
• Our obtained t (-4.39) does not exceed our critical t (1.860)
• Fail to reject the null hypothesis at the 0.05 level of
significance
• We can conclude that there was no significant increase in
level of optimism between this year’s and last year’s
graduating class:
$t(8) = -4.39,\ p > 0.05$, one-tailed
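The two decisions in this lecture (two-tailed and one-tailed, same data) differ only in where the critical region sits. A small sketch of that rule follows; the function name and the `tail` labels are made up for illustration, and the critical values are taken from the t table as in Step 2.

```python
def in_critical_region(t_obtained, t_crit, tail="two"):
    """Return True if the obtained t falls in the critical region (reject H0)."""
    t_crit = abs(t_crit)
    if tail == "two":    # H1: mu differs from the H0 value
        return abs(t_obtained) > t_crit
    if tail == "upper":  # H1: mu is greater than the H0 value
        return t_obtained > t_crit
    if tail == "lower":  # H1: mu is less than the H0 value
        return t_obtained < -t_crit
    raise ValueError("tail must be 'two', 'upper', or 'lower'")


two_tailed = in_critical_region(-4.39, 2.306, tail="two")    # True: reject H0
one_tailed = in_critical_region(-4.39, 1.860, tail="upper")  # False: fail to reject H0
print(two_tailed, one_tailed)
```

The obtained t is large in absolute value but points in the opposite direction from the "greater than" hypothesis, so it cannot land in an upper-tail critical region.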
