25 Testing
1. Stat310 Testing
Hadley Wickham
Sunday, 19 April 2009
2. 1. Important question
2. Recap
3. More examples/practice
4. Choosing a cut-off
5. P value is a random variable too!
6. Next time
3. Final
Which would you prefer?
a) a 3 hour final
b) a 2 hour final
4. Recap
What is a null hypothesis? What is an
alternative hypothesis?
What is the opposite of rejecting the null
hypothesis? Why?
5. Testing jargon
No: Null hypothesis. Nothing is
happening. (Thing we want to disprove)
Yes: Alternative hypothesis. Something
interesting is happening.
6. Absence of
evidence is not
evidence of absence
7. The lady tasting tea
A thought experiment by R. A. Fisher
(famous early statistician, 1890-1962)
A lady at a tea party claims that she can
tell the difference between putting the
milk in first and second.
How can we be sure?
8. Experiment
8 cups. 4 milk first, 4 milk second.
Presented in random order.
What is the null hypothesis?
How many possible outcomes are there?
9. Your turn
What would the distribution of correct
responses be under the null hypothesis?
How many would she need to get correct
for us to be reasonably certain that she
really could tell the difference?
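One way to explore these questions is to enumerate the outcomes directly. The sketch below assumes (this is not stated on the slides) that the lady knows there are four cups of each kind and therefore names exactly four cups as "milk first", and that under the null hypothesis she picks those four at random. The names `CUPS`, `MILK_FIRST`, and `null_dist` are illustrative.

```python
from math import comb

CUPS = 8          # total cups presented
MILK_FIRST = 4    # cups that really had the milk poured first

# Number of distinct ways to choose which 4 of the 8 cups to call "milk first".
total = comb(CUPS, MILK_FIRST)

# k = number of milk-first cups she identifies correctly.
# Under the null (pure guessing) k follows a hypergeometric distribution.
null_dist = {
    k: comb(MILK_FIRST, k) * comb(CUPS - MILK_FIRST, MILK_FIRST - k) / total
    for k in range(MILK_FIRST + 1)
}

print(f"possible outcomes: {total}")
for k, p in null_dist.items():
    print(f"{k} correct: probability {p:.4f}")

# Probability of getting every cup right by guessing alone.
p_all_correct = null_dist[MILK_FIRST]
```

Under these assumptions there are 70 possible outcomes, and a perfect score happens by chance with probability 1/70 (about 0.014), while three or more correct happens with probability 17/70 (about 0.24).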
11. Another example
Xi ~ iid Normal(μx, 1)
Yi ~ iid Normal(μy, 1)
Do they have the same means?
12. 1. Write down null and alternative
hypotheses
2. Figure out good test statistic
(for this class, usually obvious)
3. Work out distribution under the null
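For the two-sample normal example, the three steps might be filled in as follows. This is a sketch, not the slides' own derivation: the standardized difference in sample means is one natural test statistic, and the sample sizes n_x and n_y and the independence of the two samples are assumptions introduced here.

```latex
H_0 : \mu_x = \mu_y \qquad H_1 : \mu_x \neq \mu_y

\bar{X} - \bar{Y} \sim \mathrm{Normal}\!\left(\mu_x - \mu_y,\ \frac{1}{n_x} + \frac{1}{n_y}\right)

Z = \frac{\bar{X} - \bar{Y}}{\sqrt{1/n_x + 1/n_y}} \sim \mathrm{Normal}(0, 1) \quad \text{under } H_0
```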
13. Experiment
x = 7.0 5.8 2.0 5.0 6.1 5.6 4.3 4.0 4.8 6.5
y = 6.2 4.0 5.8 5.9 5.7 6.0 6.2 5.7 5.4 5.8
(mean of x = 5.11, mean of y = 5.67)
Are the means of the underlying
distributions the same?
(True answer?)
14. 1. Compute test statistic
2. Compute p-value, by evaluating F at
the test-statistic
3. (Question: what is the distribution of
the p-value if the null hypothesis is
true?)
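Applied to the x and y data from the experiment slide, the first two steps could look like the sketch below. It assumes the test statistic is the standardized difference in sample means with known variance 1 (matching the Normal(μ, 1) model), that the two samples are independent, and that the test is two-sided; none of these choices is spelled out on the slides.

```python
from math import sqrt
from statistics import NormalDist, mean

x = [7.0, 5.8, 2.0, 5.0, 6.1, 5.6, 4.3, 4.0, 4.8, 6.5]
y = [6.2, 4.0, 5.8, 5.9, 5.7, 6.0, 6.2, 5.7, 5.4, 5.8]

# Step 1: test statistic.
# Assumption: each observation has known variance 1, so the difference in
# sample means has variance 1/len(x) + 1/len(y).
diff = mean(x) - mean(y)
se = sqrt(1 / len(x) + 1 / len(y))
z = diff / se

# Step 2: p-value, by evaluating the standard normal CDF F at the statistic.
# Two-sided: values at least as extreme as |z| in either direction.
F = NormalDist().cdf
p_value = 2 * F(-abs(z))

print(f"mean of x = {mean(x):.2f}, mean of y = {mean(y):.2f}")
print(f"z = {z:.3f}, p-value = {p_value:.3f}")
```

With these assumptions z is about -1.25 and the p-value is about 0.21.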
15. P-value
The p-value is the probability, under the
null hypothesis, of seeing a value equal to
or more extreme than the value we observed.
It measures the strength of evidence for
rejecting the null hypothesis: the smaller
the p-value, the stronger the evidence.
But we need a cut off to make a yes-no
decision. How do we choose that cut off?
16. Errors
What are the possible errors we can
make?
False positive. Choose alternative when
null is correct. (aka Type 1)
False negative. Choose null when
alternative is true. (aka Type 2)
17. Terminology
Probability of a false positive called α
Probability of a false negative called β
(1 - β is the power of the test)
How are the two related?
Usually care more about false positives.
Usually pick arbitrary cut-off of what?
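A small simulation can make the relationship between the two error rates concrete. The sketch below is illustrative only: it assumes a two-sided test on the difference in sample means with known variance 1, samples of size 10, and, for the "alternative is true" case, a hypothetical gap of 1 between the two means. The helper names `p_value` and `rejection_rate` are made up for this example.

```python
import random
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def p_value(x, y):
    """Two-sided p-value for equal means, assuming known variance 1."""
    diff = sum(x) / len(x) - sum(y) / len(y)
    z = diff / sqrt(1 / len(x) + 1 / len(y))
    return 2 * STD_NORMAL.cdf(-abs(z))

def rejection_rate(mu_x, mu_y, cutoff, n=10, reps=4000, seed=310):
    """Fraction of simulated experiments whose p-value is below the cut-off."""
    rng = random.Random(seed)  # fixed seed so every cut-off sees the same data
    rejections = 0
    for _ in range(reps):
        x = [rng.gauss(mu_x, 1) for _ in range(n)]
        y = [rng.gauss(mu_y, 1) for _ in range(n)]
        if p_value(x, y) < cutoff:
            rejections += 1
    return rejections / reps

results = {}
for cutoff in (0.01, 0.05, 0.10):
    # Null true (equal means): every rejection is a false positive.
    false_pos = rejection_rate(5.0, 5.0, cutoff)
    # Alternative true (hypothetical gap of 1): every non-rejection is a false negative.
    false_neg = 1 - rejection_rate(5.0, 6.0, cutoff)
    results[cutoff] = (false_pos, false_neg)
    print(f"cut-off {cutoff:.2f}: false positive rate {false_pos:.3f}, "
          f"false negative rate {false_neg:.3f}")
```

In this setup the false positive rate tracks the cut-off itself, and lowering the cut-off to reduce false positives raises the false negative rate. The conventional cut-off is 0.05.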
18. Testing overview
Write down null and alternative
hypotheses.
Compute test statistic.
Convert to p-value.
Compare p-value to alpha cut off.
20. [Figure: scatter plot of points labelled x and y. Vertical axis from 4.5 to 6.5, horizontal axis from 20 to 100. The y points sit mostly above the x points.]
26. [Figure: scatter plot of points labelled x and y. Vertical axis from 4.5 to 5.5, horizontal axis from 20 to 100. The x and y points are intermixed.]
27. [Figure: plot with vertical axis from −0.5 to 1.0 and horizontal axis from 20 to 100. The axis label is garbled in the source ("sameerence").]
Can you think of another test-
statistic based on this plot?
31. The rest of testing
For a given situation, need to know a good
test-statistic and the distribution under the
null.
Lots of standard cases, which you can
now derive, or look up in a book.
In a final, I will either explicitly ask you to
derive it, or I’ll give you the test statistic
and null distribution.
32. Next time
Graded tests back.
Information about the final.
(Incl. study session)
What you can do with statistics (stat405).
Other courses / Majoring in statistics.
Celebrate being done.