SlideShare une entreprise Scribd logo
1  sur  7
Télécharger pour lire hors ligne
These curves (and the Excel program on the right column of this
blog) can be used for type 1 and 2 errors, power, confidence intervals
and severity, but I’m just going to illustrate power.
EXAMPLE SET 1: illustration using the Excel program
The curves in the Excel example depict the hypotheses in test T+, the
one-sided Normal test of
H0: μ ≤12 (blue curve) against H1: μ >12 (red curve).
Null (or test) hypothesis H0: X is Normal (µ, σ /√n), and µ = 12.
Alternative H1: X is Normal (µ, σ /√n), and µ >12.

Each hypothesis, being statistical, assigns probabilities to the
different values of X, and thus to the different values of the sample
mean X . Those are the heights of the curves. Areas under the curve
are probabilities for events like { X > 12.4}.
EXAMPLE SET 1: Throughout, keep σ = 2, and n = 100. So σ /√n =
2/10. Note: σ /√n is the standard deviation of X , I will sometimes
abbreviate this as σx because it’s too hard to get an x-bar as a
subscript of σ. (Ignore the lines under the curves for now, just look at
the curves and the boxes on the left.)
Let’s specify the test rule: the test T+ will reject H0 whenever X > 12 +
2 σ /√n
T+ reject H0 if and only if { X > 12.4}.
For { X < 12.4}. “do not reject” (i.e., accept) H0 .
{ X > 12.4}. is the rejection region of the test.
This test has a probability of rejecting H0 under the assumption we
are in the world of the first curve, i.e., assuming μ =12, of .023 (with a
slight approximation). This is the probability of a type 1 error, alpha.
(1) P[{ X > 12.4}; μ =12] = .023
In the picture, this is shown as the turquoise area under the curve. It
is also written as the 4th entry in the turquoise box on the left. The
turquoise box lists: the population mean (as given in the null of the
test, which we keep at 12) the population SD, σ, which we keep at 2,
the sample size, n, which (for Example Set 1) we keep at 100. Below
that is alpha, which we keep at .023 for this example set.
The only thing to be changed is the alternative (red) curve. The point
is to see the shifts in the probability test T+ rejects the null under
different assumptions for the alternative. Even though the alternative
is still H1: μ >12, we will compute the probability of rejection under
different point values μ1 greater than 12.
(2) P[{ X > 12.4}; μ = μ1] = The Power of test T+ (to reject H0) when
the true value of μ is μ1
We can abbreviate this as POW( μ1 ).
_______________________________________________________
The grey box on the left shows the sample mean X . For Example set
1, we will not change this. We assume X just makes it to the cut-off
for rejection, i.e., 12.4.
X = 12.4.

The gray box also shows the corresponding p-value which is ~.023
_______________________________________________________
Example 1.1
Finally we get to the good part. The figure below shows the case
where the alternative is μ = 12.2. That is, it sets μ1 = 12.2 in the
bottom, peachy-colored, box on the left.
We would like to know
(2a) P[{ X > 12.4}; μ = 12.2] = The probability test T+ would reject H0
when the true value of μ is 12.2.
POW( 12.2).
What is it? It is the green area under the red curve, which is .16. Why
do we look at the area under the red curve, rather than under the blue
curve? Because to the right of “;” in (2a) is μ = 12.2. It’s asking for a
calculation assuming we are in the hypothesized red world. So
imagine you’re in the red curve world where μ = 12.2: the probability
{ X > 12.4} is .16.
Example 1.2
Now consider the alternative μ = 12.3.
(2b) P[{ X > 12.4}; μ = 12.3] = The probability test T+ would reject H0
when the true value of μ is 12.3.
=.3
= POW( 12.3).
Example 1.3
Now consider the alternative μ = 12.4.
(2c) P[{ X > 12.4}; μ = 12.4] = The probability test T+ would reject H0
when the true value of μ is 12.4.
=.5
= POW( 12.4).
Example 1.4
Now consider the alternative μ = 12.5.
(2d) P[{ X > 12.4}; μ = 12.5] = .7
= POW( 12.5).

Contenu connexe

Plus de jemille6

D. Mayo JSM slides v2.pdf
D. Mayo JSM slides v2.pdfD. Mayo JSM slides v2.pdf
D. Mayo JSM slides v2.pdfjemille6
 
reid-postJSM-DRC.pdf
reid-postJSM-DRC.pdfreid-postJSM-DRC.pdf
reid-postJSM-DRC.pdfjemille6
 
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022jemille6
 
Causal inference is not statistical inference
Causal inference is not statistical inferenceCausal inference is not statistical inference
Causal inference is not statistical inferencejemille6
 
What are questionable research practices?
What are questionable research practices?What are questionable research practices?
What are questionable research practices?jemille6
 
What's the question?
What's the question? What's the question?
What's the question? jemille6
 
The neglected importance of complexity in statistics and Metascience
The neglected importance of complexity in statistics and MetascienceThe neglected importance of complexity in statistics and Metascience
The neglected importance of complexity in statistics and Metasciencejemille6
 
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...jemille6
 
On Severity, the Weight of Evidence, and the Relationship Between the Two
On Severity, the Weight of Evidence, and the Relationship Between the TwoOn Severity, the Weight of Evidence, and the Relationship Between the Two
On Severity, the Weight of Evidence, and the Relationship Between the Twojemille6
 
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...jemille6
 
Comparing Frequentists and Bayesian Control of Multiple Testing
Comparing Frequentists and Bayesian Control of Multiple TestingComparing Frequentists and Bayesian Control of Multiple Testing
Comparing Frequentists and Bayesian Control of Multiple Testingjemille6
 
Good Data Dredging
Good Data DredgingGood Data Dredging
Good Data Dredgingjemille6
 
The Duality of Parameters and the Duality of Probability
The Duality of Parameters and the Duality of ProbabilityThe Duality of Parameters and the Duality of Probability
The Duality of Parameters and the Duality of Probabilityjemille6
 
Error Control and Severity
Error Control and SeverityError Control and Severity
Error Control and Severityjemille6
 
The Statistics Wars and Their Causalities (refs)
The Statistics Wars and Their Causalities (refs)The Statistics Wars and Their Causalities (refs)
The Statistics Wars and Their Causalities (refs)jemille6
 
The Statistics Wars and Their Casualties (w/refs)
The Statistics Wars and Their Casualties (w/refs)The Statistics Wars and Their Casualties (w/refs)
The Statistics Wars and Their Casualties (w/refs)jemille6
 
On the interpretation of the mathematical characteristics of statistical test...
On the interpretation of the mathematical characteristics of statistical test...On the interpretation of the mathematical characteristics of statistical test...
On the interpretation of the mathematical characteristics of statistical test...jemille6
 
The role of background assumptions in severity appraisal (
The role of background assumptions in severity appraisal (The role of background assumptions in severity appraisal (
The role of background assumptions in severity appraisal (jemille6
 
The two statistical cornerstones of replicability: addressing selective infer...
The two statistical cornerstones of replicability: addressing selective infer...The two statistical cornerstones of replicability: addressing selective infer...
The two statistical cornerstones of replicability: addressing selective infer...jemille6
 
The replication crisis: are P-values the problem and are Bayes factors the so...
The replication crisis: are P-values the problem and are Bayes factors the so...The replication crisis: are P-values the problem and are Bayes factors the so...
The replication crisis: are P-values the problem and are Bayes factors the so...jemille6
 

Plus de jemille6 (20)

D. Mayo JSM slides v2.pdf
D. Mayo JSM slides v2.pdfD. Mayo JSM slides v2.pdf
D. Mayo JSM slides v2.pdf
 
reid-postJSM-DRC.pdf
reid-postJSM-DRC.pdfreid-postJSM-DRC.pdf
reid-postJSM-DRC.pdf
 
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
Errors of the Error Gatekeepers: The case of Statistical Significance 2016-2022
 
Causal inference is not statistical inference
Causal inference is not statistical inferenceCausal inference is not statistical inference
Causal inference is not statistical inference
 
What are questionable research practices?
What are questionable research practices?What are questionable research practices?
What are questionable research practices?
 
What's the question?
What's the question? What's the question?
What's the question?
 
The neglected importance of complexity in statistics and Metascience
The neglected importance of complexity in statistics and MetascienceThe neglected importance of complexity in statistics and Metascience
The neglected importance of complexity in statistics and Metascience
 
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
Mathematically Elegant Answers to Research Questions No One is Asking (meta-a...
 
On Severity, the Weight of Evidence, and the Relationship Between the Two
On Severity, the Weight of Evidence, and the Relationship Between the TwoOn Severity, the Weight of Evidence, and the Relationship Between the Two
On Severity, the Weight of Evidence, and the Relationship Between the Two
 
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
Revisiting the Two Cultures in Statistical Modeling and Inference as they rel...
 
Comparing Frequentists and Bayesian Control of Multiple Testing
Comparing Frequentists and Bayesian Control of Multiple TestingComparing Frequentists and Bayesian Control of Multiple Testing
Comparing Frequentists and Bayesian Control of Multiple Testing
 
Good Data Dredging
Good Data DredgingGood Data Dredging
Good Data Dredging
 
The Duality of Parameters and the Duality of Probability
The Duality of Parameters and the Duality of ProbabilityThe Duality of Parameters and the Duality of Probability
The Duality of Parameters and the Duality of Probability
 
Error Control and Severity
Error Control and SeverityError Control and Severity
Error Control and Severity
 
The Statistics Wars and Their Causalities (refs)
The Statistics Wars and Their Causalities (refs)The Statistics Wars and Their Causalities (refs)
The Statistics Wars and Their Causalities (refs)
 
The Statistics Wars and Their Casualties (w/refs)
The Statistics Wars and Their Casualties (w/refs)The Statistics Wars and Their Casualties (w/refs)
The Statistics Wars and Their Casualties (w/refs)
 
On the interpretation of the mathematical characteristics of statistical test...
On the interpretation of the mathematical characteristics of statistical test...On the interpretation of the mathematical characteristics of statistical test...
On the interpretation of the mathematical characteristics of statistical test...
 
The role of background assumptions in severity appraisal (
The role of background assumptions in severity appraisal (The role of background assumptions in severity appraisal (
The role of background assumptions in severity appraisal (
 
The two statistical cornerstones of replicability: addressing selective infer...
The two statistical cornerstones of replicability: addressing selective infer...The two statistical cornerstones of replicability: addressing selective infer...
The two statistical cornerstones of replicability: addressing selective infer...
 
The replication crisis: are P-values the problem and are Bayes factors the so...
The replication crisis: are P-values the problem and are Bayes factors the so...The replication crisis: are P-values the problem and are Bayes factors the so...
The replication crisis: are P-values the problem and are Bayes factors the so...
 

Using Severity Excel Program for Power calculations

  • 1. These curves (and the Excel program on the right column of this blog) can be used for type 1 and 2 errors, power, confidence intervals and severity, but I’m just going to illustrate power.
  • 2. EXAMPLE SET 1: illustration using the Excel program The curves in the Excel example depict the hypotheses in test T+, the one-sided Normal test of H0: μ ≤12 (blue curve) against H1: μ >12 (red curve). Null (or test) hypothesis H0: X is Normal (µ, σ /√n), and µ = 12. Alternative H1: X is Normal (µ, σ /√n), and µ >12. Each hypothesis, being statistical, assigns probabilities to the different values of X, and thus to the different values of the sample mean X . Those are the heights of the curves. Areas under the curve are probabilities for events like { X > 12.4}. EXAMPLE SET 1: Throughout, keep σ = 2, and n = 100. So σ /√n = 2/10. Note: σ /√n is the standard deviation of X , I will sometimes abbreviate this as σx because it’s too hard to get an x-bar as a subscript of σ. (Ignore the lines under the curves for now, just look at the curves and the boxes on the left.) Let’s specify the test rule: the test T+ will reject H0 whenever X > 12 + 2 σ /√n T+ reject H0 if and only if { X > 12.4}. For { X < 12.4}. “do not reject” (i.e., accept) H0 . { X > 12.4}. is the rejection region of the test. This test has a probability of rejecting H0 under the assumption we are in the world of the first curve, i.e., assuming μ =12, of .023 (with a slight approximation). This is the probability of a type 1 error, alpha. (1) P[{ X > 12.4}; μ =12] = .023 In the picture, this is shown as the turquoise area under the curve. It is also written as the 4th entry in the turquoise box on the left. The turquoise box lists: the population mean (as given in the null of the test, which we keep at 12) the population SD, σ, which we keep at 2, the sample size, n, which (for Example Set 1) we keep at 100. Below that is alpha, which we keep at .023 for this example set.
  • 3. The only thing to be changed is the alternative (red) curve. The point is to see the shifts in the probability test T+ rejects the null under different assumptions for the alternative. Even though the alternative is still H1: μ >12, we will compute the probability of rejection under different point values μ1 greater than 12. (2) P[{ X > 12.4}; μ = μ1] = The Power of test T+ (to reject H0) when the true value of μ is μ1 We can abbreviate this as POW( μ1 ). _______________________________________________________ The grey box on the left shows the sample mean X . For Example set 1, we will not change this. We assume X just makes it to the cut-off for rejection, i.e., 12.4. X = 12.4. The gray box also shows the corresponding p-value which is ~.023 _______________________________________________________
  • 4. Example 1.1 Finally we get to the good part. The figure below shows the case where the alternative is μ = 12.2. That is, it sets μ1 = 12.2 in the bottom, peachy-colored, box on the left. We would like to know (2a) P[{ X > 12.4}; μ = 12.2] = The probability test T+ would reject H0 when the true value of μ is 12.2. POW( 12.2). What is it? It is the green area under the red curve, which is .16. Why do we look at the area under the red curve, rather than under the blue curve? Because to the right of “;” in (2a) is μ = 12.2. It’s asking for a calculation assuming we are in the hypothesized red world. So imagine you’re in the red curve world where μ = 12.2: the probability { X > 12.4} is .16.
  • 5. Example 1.2 Now consider the alternative μ = 12.3. (2b) P[{ X > 12.4}; μ = 12.3] = The probability test T+ would reject H0 when the true value of μ is 12.3. =.3 = POW( 12.3).
  • 6. Example 1.3 Now consider the alternative μ = 12.4. (2c) P[{ X > 12.4}; μ = 12.4] = The probability test T+ would reject H0 when the true value of μ is 12.4. =.5 = POW( 12.4).
  • 7. Example 1.4 Now consider the alternative μ = 12.5. (2d) P[{ X > 12.4}; μ = 12.5] = .7 = POW( 12.5).