Postgraduate Statistics Course for Evidence-Based Managers

Postgraduate Course
Evidence-Based Management
(Some) statistics
for managers who hate statistics

Postgraduate Course
Why do we need statistics?
1. What does my population look like?
2. Is there a difference?
3. Is there a model that ‘fits’?

Postgraduate Course
Some statistics
Some statistical terms
1. Sample vs population
2. Variables
3. Levels of measurement
4. Central tendency
5. Hypothesis
Some statistical models
6. Mean
7. Variance, standard deviation
8. Confidence intervals
9. Statistical significance
10. Statistical power
11. Effect sizes
12. Critical appraisal

Postgraduate Course
1. Sample vs population

Postgraduate Course
Sample vs population
We want to know about these
(population: N)
We have to work with these
(sample: n)
population mean: μ
selection
sample mean: X̄
fit?

Postgraduate Course
Law of large numbers
The larger the sample size (or the number of
observations), the more accurate the predictions of the
characteristics of the whole population, and the smaller
the expected deviation in comparisons of outcomes.
As a general principle it means that, in the long run,
the average (mean) of a large number of observations
will be close to (or: may be taken as the best estimate
of) the 'true mean' of the population.
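The principle can be illustrated with a small simulation. This is only a sketch under assumptions that are not in the slides: the population is taken to be normal with a mean of 100 and a standard deviation of 15, and the function name `sample_mean` is made up for illustration.

```python
import random
import statistics


def sample_mean(n, mu=100.0, sigma=15.0, seed=42):
    """Draw n observations from an assumed normal population and return their mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))


if __name__ == "__main__":
    # The sample mean tends to move closer to the assumed 'true mean' of 100
    # as the sample size grows.
    for n in (10, 100, 10_000, 100_000):
        print(n, round(sample_mean(n), 2))
```

With small samples the mean can land several points away from 100; with very large samples it typically sits within a fraction of a point.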

Postgraduate Course
Sample size: why does it matter?
• Law of large numbers: a reliable and accurate
representation of the population
• Statistical power: to prevent a type II error / false
negative

Don't confuse representativeness with reliability
The sample size has no direct relationship with
representativeness; even a large random sample can be
insufficiently representative.

Postgraduate Course
2. Variables

Postgraduate Course
Variables
Variable: anything that can be measured and can
differ across entities or time
Independent variable: predictor variable (value does
not depend on any other variables)
Dependent variable: outcome variable (value
depends on other variables)

Postgraduate Course
3. Level of measurement

Postgraduate Course
Level of measurement
Relationship between what is being measured and
the numbers that represent what is being measured.

Postgraduate Course
Level of measurement
Categorical: Nominal, Ordinal
Continuous: Interval, Ratio

Postgraduate Course
Nominal scale
Classification of categorical data. There is no order to the
values; they are just given a name ('nomen') or a number.
The numbers can't be used for calculations (you can't
calculate the mean of fruit), only for frequencies.
1 = Apples
2 = Oranges
3 = Pineapples
4 = Bananas
5 = Pears
6 = Mangoes

Postgraduate Course
Ordinal scale
Classification of categorical data. Values can be
rank-ordered, but the distances between the
values have no meaning. The numbers can
only be used to calculate a mode or a median.
1. Full Professor
2. Associate professor
3. Assistant professor
4. PhD
5. Master
6. Bachelor

Postgraduate Course
Interval scale
Classification of continuous data. Values can
be rank-ordered, and the distances between
the values have meaning. However, there is
no natural zero point.
1. John (1932)
2. Denise (1945)
3. Mary (1952)
4. Marc (1964)
5. Jeffrey (1978)
6. Sarah (1982)

Postgraduate Course
Ratio scale
Classification of continuous data. Values can
be rank-ordered, the distances between the
values have meaning, and there is a natural
zero point.
1. Jeffrey (192 cm)
2. John (187 cm)
3. Sarah (180 cm)
4. Marc (179 cm)
5. Mary (171 cm)
6. Denise (165 cm)

Postgraduate Course
Levels of measurement
                            Categorical         Continuous
                            Nominal   Ordinal   Interval   Ratio
Classification              Yes       Yes       Yes        Yes
Rank-order                  No        Yes       Yes        Yes
Fixed and equal intervals   No        No        Yes        Yes
Natural zero point          No        No        No         Yes

                            Nominal   Ordinal   Interval   Ratio
Mode                        Yes       Yes       Yes        Yes
Median                      No        Yes       Yes        Yes
Mean                        No        No        Yes        Yes

Postgraduate Course
Levels of measurement
Ordinal or interval? Can I calculate a mean?
Q3: Every organization is unique, hence the findings from scientific
research are not applicable.
☐ Strongly agree
☐ Somewhat agree
☐ Neither agree nor disagree
☐ Somewhat disagree
☐ Strongly disagree

Postgraduate Course
4. Central tendency
The aim is to find a single number that characterises the typical value of
the variable in the sample. Which one you use depends in part on the
level of measurement of the variable.

Postgraduate Course
Central tendency
Central tendency of a set of data / numbers
(what number is most representative of the dataset / population?)
7, 9, 9, 9, 10, 11, 11, 13, 13
• Mean = 10.2
• Median = 10
• Mode = 9

Postgraduate Course
Central tendency
Central tendency of a set of data / numbers
(what number is most representative of the dataset / population?)
3, 3, 3, 3, 3, 3, 100
• Mean = 16.9
• Median = 3
• Mode = 3
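The figures on the two central tendency slides can be checked with Python's standard `statistics` module. This is a minimal sketch; the variable names `first` and `second` and the helper `central_tendency` are illustrative and not part of the course material.

```python
import statistics

# The two datasets shown on the central tendency slides.
first = [7, 9, 9, 9, 10, 11, 11, 13, 13]
second = [3, 3, 3, 3, 3, 3, 100]


def central_tendency(values):
    """Return the mean (rounded to one decimal), median, and mode of a list."""
    return {
        "mean": round(statistics.mean(values), 1),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
    }


if __name__ == "__main__":
    print(central_tendency(first))
    print(central_tendency(second))
```

In the second dataset the single value of 100 pulls the mean far away from the typical value of 3, while the median and the mode are unaffected.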

Postgraduate Course
5. Hypothesis

Postgraduate Course
Hypothesis: falsifiability
"It is easy to obtain evidence in favor of virtually any theory,
but such 'corroboration' should count scientifically only if it
is the positive result of a genuinely 'risky' prediction, which
might conceivably have been false.
… A theory is scientific only if it is refutable
by a conceivable event. Every genuine test
of a scientific theory, then, is logically an
attempt to refute or to falsify it."
Karl Popper

Postgraduate Course
Hypothesis
• Null hypothesis (H0): Big Brother contestants and
members of the public will not differ in their scores on
personality disorder questionnaires.
• Alternative hypothesis (H1): Big Brother contestants will
score higher on personality disorder questionnaires
than members of the public.

Postgraduate Course
Hypothesis: type I vs type II error
              H0 is true             H0 is false
reject H0     type I error (α)       correct conclusion
accept H0     correct conclusion     type II error (β)

Postgraduate Course
Statistical models

Postgraduate Course
Statistical models: prediction
[Figure: outcomes ranging from likely to not likely]

Postgraduate Course
6. The mean
The most widely used statistical model
X̄ (sample) or μ (population)

Postgraduate Course
The mean
[Figure: number of friends per EBMgt lecturer]

Postgraduate Course
The mean
Assessing the fit of the mean
• Sum of squared errors (SS): $SS = (-1.6)^2 + (-0.6)^2 + (0.4)^2 + (0.4)^2 + (1.4)^2 = 5.2$
• Variance ($s^2$): $s^2 = \frac{SS}{N-1} = \frac{5.2}{4} = 1.3$
• Standard deviation ($s$): $s = \sqrt{s^2} = \sqrt{1.3} = 1.14$
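The calculation on this slide can be reproduced step by step. The sketch below assumes the underlying data are the values 1, 2, 3, 3, 4 (mean 2.6), which is an inference from the deviations shown and is not stated in the slides; the name `fit_of_mean` is illustrative.

```python
import math

# Assumed data: five values whose deviations from the mean (2.6)
# are -1.6, -0.6, 0.4, 0.4 and 1.4, matching the slide.
friends = [1, 2, 3, 3, 4]


def fit_of_mean(values):
    """Return mean, sum of squared errors, sample variance, and standard deviation."""
    mean = sum(values) / len(values)
    deviations = [x - mean for x in values]
    ss = sum(d ** 2 for d in deviations)   # sum of squared errors
    variance = ss / (len(values) - 1)      # divide by N - 1 for a sample
    sd = math.sqrt(variance)               # standard deviation
    return mean, ss, variance, sd


if __name__ == "__main__":
    mean, ss, variance, sd = fit_of_mean(friends)
    print(round(mean, 1), round(ss, 1), round(variance, 1), round(sd, 2))
```

Dividing by N - 1 rather than N gives the sample variance, which is the convention used on the slide.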

Postgraduate Course
7. Standard Deviation
The second most widely used statistical model
s (sample) or σ (population)

Postgraduate Course
Standard Deviation

Postgraduate Course
Standard Deviation
Which class would you
prefer to teach?
[Figure: IQ distributions of classes; axis ticks at 110, 130, and 170]

Postgraduate Course
Standard Deviation
[Figure: IQ distributions with s = 10, s = 20, and s = 60; axis ticks at 110, 130, and 170]

Postgraduate Course
Standard Deviation

Postgraduate Course
Standard Deviation
So, what does
"two standard deviations of the mean"
mean?

Postgraduate Course
8. Confidence intervals

Postgraduate Course
Confidence intervals
A confidence interval gives an estimated range
of values which is likely to include an unknown
population parameter (e.g. the mean).
Confidence intervals are usually calculated at a
confidence level of 95% (95% CI).

Postgraduate Course
When you see a 95% confidence interval for a
mean, think of it like this: if we'd collected 100
samples and calculated the mean and its confidence
interval for each sample, then for about 95 of these
samples the confidence interval would contain the
true population mean.

Postgraduate Course
1.96!
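The value 1.96 is the multiplier commonly used to build a 95% confidence interval around a mean: mean ± 1.96 × standard error. The sketch below illustrates this with a simulation. The population (normal, mean 100, standard deviation 15), the sample size of 50, and the function names `ci95` and `coverage` are assumptions for illustration only; for small samples a t-value would normally replace 1.96.

```python
import math
import random
import statistics

Z_95 = 1.96  # normal-distribution multiplier for a 95% confidence level


def ci95(sample):
    """Approximate 95% confidence interval for the mean of a sample."""
    mean = statistics.fmean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - Z_95 * standard_error, mean + Z_95 * standard_error


def coverage(n_samples=1000, n=50, mu=100.0, sigma=15.0, seed=1):
    """Share of simulated samples whose interval contains the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        low, high = ci95(sample)
        if low <= mu <= high:
            hits += 1
    return hits / n_samples


if __name__ == "__main__":
    print(coverage())  # expected to be close to 0.95
```

Each simulated sample gets its own interval; roughly 95 out of every 100 of those intervals should contain the assumed population mean of 100.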

Postgraduate Course

Postgraduate Course
[Figure: unemployment rate (%), 2008 vs 2009; vertical axis from 3.0 to 5.0]
"According to the federal
government, the
unemployment rate has
dropped from 4.3% to 3.8%."
95% CI = 3.5 to 4.1.
This means the
unemployment rate could
have increased from 4.0 to
4.1!

Postgraduate Course
When a point estimate (e.g. mean,
percentage) is given, always check:
• standard deviation
or
• confidence interval

Postgraduate Course
9. Statistical significance

Postgraduate Course
Statistical significance
Sir Ronald A. Fisher
1890 - 1962

Significance level (α) = the probability of incorrectly rejecting a true
null hypothesis (Type I error)
p = 0.05 / p = 0.01
(1 in 20 / 1 in 100)
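The "1 in 20" reading of the 0.05 level can be illustrated with a simulation in which the null hypothesis is true by construction. This is a sketch under assumptions that are not in the slides: both groups are drawn from the same normal population (mean 100, known standard deviation 15), a two-sided z-test is used, and the name `false_positive_rate` is made up.

```python
import math
import random
import statistics


def false_positive_rate(n_experiments=2000, n=30, sigma=15.0, seed=7):
    """Share of experiments that reject H0 although H0 is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_experiments):
        # Both groups come from the same population, so any difference is chance.
        group_a = [rng.gauss(100.0, sigma) for _ in range(n)]
        group_b = [rng.gauss(100.0, sigma) for _ in range(n)]
        difference = statistics.fmean(group_a) - statistics.fmean(group_b)
        standard_error = sigma * math.sqrt(2 / n)  # known-sigma z-test
        if abs(difference / standard_error) > 1.96:  # two-sided test at 0.05
            rejections += 1
    return rejections / n_experiments


if __name__ == "__main__":
    print(false_positive_rate())  # expected to be close to 0.05
```

Roughly one experiment in twenty comes out "significant" even though no real difference exists.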

Postgraduate Course

1. Is there a difference / an effect?
2. How certain is it that the difference / effect found is not a
chance finding?
[Figure: two distributions with sample means X̄0 and X̄1; axis ticks at 110 and 130]

Testing multiple hypotheses
When you test 20 different hypotheses (or independent
variables), there is a high chance that at least one will be
statistically significant.
Example:
Do apples, bacon, cheese, eggs, fish, garlic, hazelnuts, ice
cream, ketchup, lamb, melons, nuts, oranges, peanut butter,
roasted food, salt, tofu, vinegar, wine or yoghurt cause
cancer?
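How high that chance is can be worked out directly. The sketch below assumes the tests are independent and that every null hypothesis is true, so each test has a probability α of a false positive; the function name is illustrative.

```python
def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Chance of at least one significant result when all nulls are true.

    Assumes independent tests, each with false-positive probability alpha.
    """
    return 1 - (1 - alpha) ** n_tests


if __name__ == "__main__":
    for n_tests in (1, 5, 20):
        print(n_tests, round(prob_at_least_one_false_positive(n_tests), 2))
```

For 20 tests at α = 0.05 the result is about 0.64, so a "significant" food in a list of 20 is more likely than not to turn up by chance alone.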

Significance testing:
always prospective, never retrospective

Effect size
Statistically significant ≠ practically relevant

Postgraduate Course
10. Statistical power

Sample size    Effect size (significant increase in IQ)
4              10
25             4
100            2
10,000         0.2

Postgraduate Course
Statistical power
The statistical power: the probability of detecting a meaningful
effect, given sample size, significance level, and effect size.
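The pattern in the sample size table (the detectable effect shrinks with the square root of the sample size) can be approximated with a simple formula. This is a sketch, not the calculation behind the slide: it assumes a one-sample z-test at the two-sided 0.05 level and a standard deviation of 10 IQ points, a value chosen only because it roughly reproduces the table. The function name is illustrative.

```python
import math


def smallest_significant_increase(n, sd=10.0, z=1.96):
    """Smallest mean increase that reaches significance for sample size n.

    Assumes a one-sample z-test; sd=10 is an assumed value, not from the slides.
    """
    return z * sd / math.sqrt(n)


if __name__ == "__main__":
    for n in (4, 25, 100, 10_000):
        print(n, round(smallest_significant_increase(n), 2))
```

Under these assumptions the thresholds come out at roughly 9.8, 3.9, 2.0 and 0.2 IQ points, close to the 10, 4, 2 and 0.2 in the table.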

Postgraduate Course
Statistical power
Overpowered: sample size too large; even trivially small
effects become statistically significant
Underpowered: sample size too small; high
probability of making a Type II error

Postgraduate Course
11. Effect size

Postgraduate Course
Effect size
Effect size: a standardized measure of the
magnitude of effect, independent of
sample size
standardized > makes it possible to compare effect sizes
across different studies that have measured different
variables, or have used different scales of measurement

Postgraduate Course
Effect sizes
• Cohen's d
• Pearson's r
• Other:
  - Hedges' g
  - Glass's Δ
  - odds ratio (OR)
  - relative risk (RR)

Postgraduate Course
Effect sizes
• Cohen's d
Effect size based on means or distances
between/among means
Interpretation
.20 = small
.50 = moderate
.80 = large
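Cohen's d for two groups is the difference between the group means divided by a standard deviation. The sketch below uses the pooled standard deviation, which is one common variant; the function name `cohens_d` and the example numbers are made up for illustration.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Difference between two group means in units of the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (N - 1)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.fmean(group_a) - statistics.fmean(group_b)) / pooled_sd


if __name__ == "__main__":
    # Hypothetical scores: means 4 and 3, pooled standard deviation 2.
    print(cohens_d([2, 4, 6], [1, 3, 5]))
```

Because the difference is expressed in standard deviation units, the result does not depend on the scale the original scores were measured on.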

Postgraduate Course
Effect sizes
• Pearson's r
Effect size based on 'variance explained'
Interpretation
.10 = small (explains 1% of the total variance)
.30 = moderate (explains 9% of the total variance)
.50 = large (explains 25% of the total variance)
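The percentages of variance explained follow from squaring r. A minimal sketch, with an illustrative function name:

```python
def variance_explained(r):
    """Proportion of total variance explained by a correlation of size r."""
    return r ** 2


if __name__ == "__main__":
    for r in (0.10, 0.30, 0.50):
        print(f"r = {r:.2f} explains {variance_explained(r):.0%} of the variance")
```

Squaring shows why a correlation that sounds substantial, such as .30, accounts for less than a tenth of the variance.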

Postgraduate Course
12. Critical appraisal

Postgraduate Course
Critical appraisal
When you critically appraise a study, what characteristics
of the findings will you consider to determine its statistical
significance and magnitude?
• p-value
• confidence interval
• sample size / power
• effect size
• practical relevance
