2. Outline
• What Statistical Tests to Use?
• Correlation Tests
• t-Tests
• To play around with the data, please download
the file: Statistics-Inferential.xlsx
Download from https://goo.gl/eY8j6N or
http://www.filehosting.org/file/details/491184/Statistics-Inferential.xlsx
4. Decision on the Statistical Tests
• Depends on
The design of the research
• To see the relationship of the variables?
• To see if there are any changes in the participants
after certain treatment?
• Etc.
Can the results be generalized?
• Assumptions – conclusions – actions
6. Case: I couldn’t meet Ast today at 1.30 PM
• Assumptions
12-1 PM official lunch time in SWCU
Everybody needs lunch
Classes at FLL usually go from 11 AM – 1 PM then
from 2-4 PM
• Conclusions
Every lecturer in SWCU will have lunch at 12-1 PM
Every lecturer may teach 11 AM – 1 PM then from 2-4
PM
• Action
See Ast between 1-2 PM
7. But…
• Assumptions
Ast hates me for God knows what reasons
• Conclusions
He will not see me at all
• Action
That’s probably why he refuses to see me at 1.30
PM today.
8. How do you know your assumptions are right?
• It’s regulation/convention
But are you sure it’s regulated in SWCU and FLL?
• It’s what usually happens in SWCU and FLL
Offices are closed between 12-1 PM
Lecturers are seen at campus cafes having lunch
during 12-1 PM
Schedule of classes
• Where did your assumption go wrong?
How can you be so sure that Ast hates you?
9. What has Ast to do with ResMeth?
• Assumption must be correct, otherwise the
conclusion will not be correct
• What made your conclusion wrong in the
case of Ast?
Feelings, and not what NORMALLY happens, either by regulation or by convention, in the POPULATION (SWCU/FLL)
• Remember NORMAL DISTRIBUTION?
10. Looking back at previous meetings…
• The aim of doing quantitative research is to
generalize the results for the population
• Assumption
Population --> normal distribution
Sample --> normal distribution
• Conclusion
If my sample is normally distributed, I can expect to
generalize it to the population
• Action
My research recommendations can be applied in the
population
11. Parametric vs. Non-Parametric Tests
• Some statistical tests are parametric tests based on
the normal distribution
• A parametric test requires parametric data from one of the large catalogue of distributions that statisticians have described (regulation/convention)
• Parametric data --> certain assumptions must be true.
A parametric test on NON-parametric data --> inaccurate results
• It is very important to check the assumptions before deciding which statistical test is appropriate
13. How could 2 variables be related?
• Positively related --> one goes up, the other goes up
• Not related at all --> one stays the same no matter what the other does
• Negatively related --> one goes up, the other goes down
14. Correlational Tests
• Parametric Test
Pearson’s Product Moment Correlation
• Non-Parametric
Spearman’s Correlation Coefficient
Kendall’s tau (τ)
• To decide:
Check the assumptions
1 assumption violated --> non-parametric
15. What are the underlying assumptions?
1. Related pairs
2. Scale of measurements
3. Normality
4. Linearity
5. Homoscedasticity
Testing:
1 & 2 --> the design of the research
3-5 --> testable using graphs & tests
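The decision rule from the previous slide (one violated assumption means a non-parametric test) can be sketched as a small helper. This is only an illustration: the function name, the assumption keys, and the True/False values are made up for the example, and only the scale-of-measurement entry mirrors the gender example used in these slides.

```python
# Hypothetical helper: names and keys are illustrative, not from SPSS or Excel.
ASSUMPTIONS = ("related_pairs", "interval_or_ratio", "normality",
               "linearity", "homoscedasticity")


def choose_correlation_test(checks):
    """Apply the rule: any violated assumption --> non-parametric test.

    `checks` maps each assumption name to True (met) or False (violated).
    A missing entry is treated as violated, to stay on the cautious side.
    """
    violated = [name for name in ASSUMPTIONS if not checks.get(name, False)]
    if violated:
        return "non-parametric (Spearman's or Kendall's)", violated
    return "parametric (Pearson's)", violated


# Gender is nominal, so the scale-of-measurement assumption fails.
# The other values are placeholders for the sake of the example.
test, violated = choose_correlation_test({
    "related_pairs": True,
    "interval_or_ratio": False,
    "normality": True,
    "linearity": True,
    "homoscedasticity": True,
})
print(test, violated)
```

Assumptions 1 & 2 are answered from the research design, while 3-5 would be filled in after looking at the graphs and tests.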
16. Related Pairs
• Data must be collected from related pairs
• 1 score on one variable, 1 score on the other variable, both from the same participant
• E.g. Relationship between gender and
English competence
Arif has data for gender “male” and for English
competence “84 points”
17. Scale of Measurements
• Interval or ratio
• Do you still remember what they are?
Continuous
Not categorical
• E.g. Arif
Gender --> nominal (categorical)
Competence --> ratio (continuous)
• One assumption violated!
Go to non-parametric (Spearman’s or Kendall’s)
18. Warning!
• The literature differs
Coakes (2005) --> both variables must be continuous (interval)
Field (2009) --> interval, or one variable can be categorical (binary)
• I’m inclined to Coakes
The scatterplot when one variable is interval and the other is binary is not homoscedastic (I’ll show you later why this matters)
20. Normality & Linearity
• In SPSS (relatively easier)
Together with descriptive statistics report &
linearity
• Test by:
Graphic
Normality tests
21. Normality and Linearity
• Analyze | Descriptive Statistics | Explore
Select the variable you want to test
Statistics: tick
• Descriptives
Plots: tick
• Histogram
• Normality plots with tests
22. Normality
• From Kolmogorov-Smirnov (K-S) & Shapiro-Wilk (S-W)
Sig. < .05 --> significantly different from a normal distribution
Competence: sig. = .008 < .05 --> data not normal
Shapiro-Wilk is more powerful (so S-W may be significant when K-S is not)
23. Normality
• Graphic – Histogram
Not bell-shaped --> not normal
• Psst.. The normality line here is added as a guide.
How? Try right-clicking the graph & editing the content, then find the icon for it in the toolbar.
24. Normality
• Is your data normally distributed?
[Chart: histogram of the scores (x-axis values 46, 47, 52, 74, 79; y-axis 0 to 14), Series1 with a polynomial trendline, Poly. (Series1)]
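If you want a quick look at the shape without a chart, a text histogram is a rough substitute. The sketch below is not the K-S or S-W test and not the SPSS output. The scores are made up for illustration (they are NOT the data in Statistics-Inferential.xlsx), and the function name and bin width are assumptions.

```python
from collections import Counter


def text_histogram(scores, bin_width=10):
    """Print one row of '#' per bin and return the count for each bin start."""
    counts = Counter((s // bin_width) * bin_width for s in scores)
    for start in range(min(counts), max(counts) + bin_width, bin_width):
        # Counter returns 0 for bins with no scores, so empty bins still print.
        print(f"{start:3d}-{start + bin_width - 1:3d} | {'#' * counts[start]}")
    return dict(counts)


# Made-up whole-number scores, for illustration only.
scores = [46, 47, 52, 55, 58, 61, 63, 64, 66, 67, 68, 70, 71, 72, 74, 79]
bins = text_histogram(scores)
```

A roughly bell-shaped pile of '#' in the middle suggests the data may be normal. A lopsided or flat pile is a hint to look at the normality tests more carefully.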
25. Linearity
• How closely the data for each variable fall along a straight line
• MS Excel – not possible
• SPSS – yes!
See the plots that come with the test of normality
26. Homoscedasticity
• How your data cluster into certain areas when two variables are related
• To see if they have similar variance along the linear line
• Why is this important?
The differences between the data should not be too wide
Too wide --> not normal
27. Homoscedasticity
• MS Excel – not possible
• SPSS – yes!
Graphs | Legacy Dialogs | Scatter/Dot | Simple Scatter
Choose the two variables for the X axis and the Y axis
• Psst.. The linear line here is added as a guide.
How? Try right-clicking the graph & editing the content, then find the icon for it in the toolbar.
28. Homoscedasticity
Gender vs. Competence
• Heteroscedasticity
• Not normal
Competence vs. Graduation
• Homoscedasticity
• Maybe normal
Can’t do categorical variable!
Coakes wins!
29. Once you’ve done all of this assumption checking…
• Select the correlational test that fits the data
• Our correlational tests are bivariate correlations
Between 2 variables
• We’re not dealing with partial correlation (between 2 variables plus one or more controlling variables) --> later, when you’re more ‘grown up’ in statistics
30. How do we measure relationships?
• Pearson product-moment correlation (standardized measurement)
Symbol: r or R
Ranges from -1 to +1
To measure the size of the effect
• ± 0.1 --> small effect
• ± 0.3 --> medium effect
• ± 0.5 --> large effect
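To see what SPSS or Excel does behind the scenes, Pearson's r can be computed by hand. This is a minimal sketch: the function names are made up, the data are toy numbers, and the "negligible" label for values below 0.1 is my own addition to the small/medium/large cut-offs above.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two paired lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and have at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("r is undefined when a variable never changes")
    return sum_xy / math.sqrt(sum_xx * sum_yy)


def effect_size_label(r):
    """Label the size of r with the 0.1 / 0.3 / 0.5 cut-offs."""
    size = abs(r)
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    return "negligible"  # label assumed here, not on the slide


# Toy data: y goes up by exactly 2 for every step in x.
r = pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(r, effect_size_label(r))
```

Note that this gives only the size of the relationship. Whether it is significant still has to come from the Sig. column in SPSS.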
31. Pearson’s Correlation Coefficient
• Using MS Excel – Data | Data Analysis | Correlation
• Downsides
Only for Pearson’s, not Spearman’s or Kendall’s
No indicator of the significance of the relationship
Only the strength (the correlation coefficient)
• Output:
             Competence    Graduation
Competence   1
Graduation   0.954149422   1
32. Bivariate Correlation (Using SPSS)
• Analyze | Correlate | Bivariate
• Input the variables used in Variables
• Default: Pearson
• Options: Spearman and Kendall
• One- vs. two-tailed
One-tailed --> directional hypothesis (the more x, the more y)
Two-tailed --> not sure
33. Pearson’s Correlation Coefficient
• Interpretation of the result table
** --> significant correlation
r value --> Pearson Correlation value
Significant or not --> Sig. < .05
• What do these numbers mean?
34. Warning: Causality!!!
• Correlation result ≠ causality
• Third-variable problem
Maybe there is an influence of a third variable
• Direction of causality
No clear indication of which variable causes the other variable to change
35. Spearman’s Correlation Coefficient
• Non-parametric statistic
Not normal data distribution, etc.
Not interval data --> ordinal data
• Interpretation of the result table
** --> significant correlation
rs --> Correlation coefficient value
Significant or not --> Sig. < .05
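Spearman's rs is Pearson's correlation applied to the ranks of the scores instead of the scores themselves. The sketch below shows that idea with toy data. The function names are made up, and tied scores get the average of their ranks, which is the usual convention.

```python
import math


def average_ranks(values):
    """Rank from 1 (smallest) upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's rs: Pearson's r computed on the ranks of x and y."""
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    mean_x = sum(rank_x) / len(rank_x)
    mean_y = sum(rank_y) / len(rank_y)
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(rank_x, rank_y))
    sum_xx = sum((a - mean_x) ** 2 for a in rank_x)
    sum_yy = sum((b - mean_y) ** 2 for b in rank_y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Toy data: y always rises with x, but not in a straight line.
print(average_ranks([10, 20, 20, 30]))
print(spearman_rs([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]))
```

Because only the order of the scores matters, a curved but always-rising relationship still gets a perfect rs.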
36. Kendall’s tau (τ)
• Non-parametric statistic
For a small data set which, when ranked, has many scores with the same rank
More accurate generalization than Spearman’s
• Interpretation of the result table
** --> significant correlation
τ --> Correlation coefficient value
Significant or not --> Sig. < .05
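Kendall's tau compares every pair of participants and asks whether the two variables agree on who is higher. The sketch below uses the tau-b variant, which adjusts for tied scores. I am assuming tau-b is the variant meant here, and the function name and data are made up.

```python
import math
from itertools import combinations


def kendall_tau_b(x, y):
    """Kendall's tau-b for two paired lists, with a correction for ties."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue                # tied on both variables: not counted
        elif dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1         # both variables order the pair the same way
        else:
            discordant += 1         # the variables disagree about the order
    untied = concordant + discordant
    denominator = math.sqrt((untied + tied_x_only) * (untied + tied_y_only))
    return (concordant - discordant) / denominator


# Toy ranked data with tied scores on both variables.
print(kendall_tau_b([1, 2, 2, 3], [1, 2, 3, 3]))
```

With the toy data there are 4 concordant pairs, no discordant pairs, and one tie on each variable, which gives τ = 4 / 5 = .80.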
37. How to Report Correlation Coefficients
• Tell:
How big
Significance value
• Important Notes:
No zero before the decimal point for the correlation coefficient (for example, .87 NOT 0.87)
Different letters for different correlation coefficients (r, rs, or τ)
One-tailed tests must be reported as one-tailed
Standard criteria for the p value (probabilities): .05, .01 and .001
38. Example of Reports
• Pearson’s
There is a significant correlation between X variable and Y variable, r = .87, p (one-tailed) < .05
• Spearman’s
X variable is significantly correlated with Y variable, rs = .87 (p < .01)
• Kendall’s
There was a positive relationship between X variable and Y variable, τ = .47, p < .05
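The reporting rules (no leading zero, the right letter, the standard p criteria) are mechanical enough to sketch in code. This is only an illustration: the function names are made up, and the wording for a non-significant result ("p > .05") is an assumption.

```python
def format_coefficient(value):
    """Two decimals with no zero before the decimal point, e.g. .87 or -.47."""
    text = f"{abs(value):.2f}"
    if text.startswith("0"):
        text = text[1:]
    sign = "-" if value < 0 and text != ".00" else ""
    return sign + text


def p_label(p, one_tailed=False):
    """Report p against the strictest standard criterion it meets."""
    tail = " (one-tailed)" if one_tailed else ""
    for criterion, text in ((0.001, ".001"), (0.01, ".01"), (0.05, ".05")):
        if p < criterion:
            return f"p{tail} < {text}"
    return f"p{tail} > .05"  # assumed wording for a non-significant result


def report(symbol, value, p, one_tailed=False):
    """Build a string such as 'r = .87, p (one-tailed) < .05'."""
    return f"{symbol} = {format_coefficient(value)}, {p_label(p, one_tailed)}"


print(report("r", 0.87, 0.03, one_tailed=True))
print(report("rs", 0.87, 0.004))
print(report("τ", 0.47, 0.04))
```

The three printed lines correspond to the Pearson's, Spearman's, and Kendall's examples; the p values passed in are made up to land in the matching criterion.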
40. What is it for?
• Looking at the effect(s) of one variable on another
• By systematically changing some aspect of
that variable
• To compare two means of the data
41. Comparing 2 means of data
• Between-group, between-subjects or
independent design
DIFFERENT participants are assigned to different experimental manipulations
• A repeated-measures design
The SAME participants take part in different experimental manipulations at different points in time
42. Comparing 2 Means Using t-Tests
• Different participants (between-groups, between-subjects, or independent design)
Single-sample --> one sample compared to the population
E.g. test scores of a group in a semester compared to the previous group’s scores
Independent or two-sample --> two samples with different conditions
E.g. test scores of 2 groups with different teachers after a semester
• Same participants (repeated-measures design)
Paired- or dependent-sample --> two sets of scores from the same participants
E.g. the scores of a group before and after a semester
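The choice between the three t-tests comes down to two questions about the design. A tiny sketch, with a made-up function name and argument names:

```python
def choose_t_test(same_participants, compared_with_population_value=False):
    """Pick a t-test from two yes/no questions about the research design."""
    if same_participants:
        # Repeated-measures design: each participant has two scores.
        return "paired-samples t-test"
    if compared_with_population_value:
        # One sample against a known value such as a previous group's mean.
        return "single-sample t-test"
    # Two separate groups under different conditions.
    return "independent-samples t-test"


print(choose_t_test(same_participants=True))
print(choose_t_test(same_participants=False, compared_with_population_value=True))
print(choose_t_test(same_participants=False))
```

The three calls correspond to the before/after example, the comparison with a previous group's scores, and the two-teachers example.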
43. Assumptions of the t-tests
1. Scale of measurement – continuous (interval)
2. Random sampling
3. Normality
4. Additional for the independent t-test
1. Independence of groups – inclusion in one group only, and not the other group
2. Homogeneity of variance – Levene’s test (presented in SPSS results for the independent t-test)
44. Single Sample t-Test
• Comparing the mean of a data set with a set mean from other aggregate data
• MS Excel --> no!
• SPSS --> Analyze | Compare Means | One-Sample T Test
Input the Test Variable to be compared
Input the Test Value (aggregate data)
45. Single Sample t-Test: Results & Report
• Reporting:
There is no significant difference in graduation grade between this year’s participants and the previous year’s participants (t(19) = .493, p > .05), although this year’s participants have a slightly higher grade (Mean Difference = 1.4)
Significant --> sig. < .05
t positive --> this data > previous aggregate data
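Since Excel has no single-sample t-test, it may help to see how little arithmetic is involved. The sketch below computes t and df by hand. The scores and the test value are toy numbers (not the graduation data), the function name is made up, and the critical value 2.776 is the two-tailed .05 value for df = 4 from a standard t table.

```python
import math
import statistics


def one_sample_t(scores, test_value):
    """Return (t, df, mean difference) for a single-sample t-test."""
    n = len(scores)
    mean_difference = statistics.mean(scores) - test_value
    standard_error = statistics.stdev(scores) / math.sqrt(n)
    return mean_difference / standard_error, n - 1, mean_difference


# Toy data: five scores compared with a previous group's mean of 2.
t, df, mean_difference = one_sample_t([1, 2, 3, 4, 5], test_value=2)
T_CRITICAL = 2.776  # two-tailed, alpha = .05, df = 4 (from a t table)
print(f"t({df}) = {t:.3f}, mean difference = {mean_difference}")
print("significant" if abs(t) > T_CRITICAL else "not significant")
```

A positive t means the sample mean is above the test value. Whether that difference is significant depends on comparing |t| with the critical value (or on the Sig. column in SPSS).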
46. Using MS Excel for Other t-Tests
• Only for
Paired-sample T-Test
Independent T-Test
• Assuming equal variance
• Assuming non-equal variance
Reject or accept the null hypothesis: there is no difference of means in the two variables
47. Paired-Samples t-Test
• Comparing the means of the same group of participants under two conditions
• Samples --> two sets of data, but paired (from the same participants)
• E.g. The pre-test vs. post-test scores of a group of participants
• E.g. The scores of a group of participants after being taught using pictures vs. film
48. Paired-Sample t-Test in MSExcel
• H0 = there is no difference between the two sets of scores
• Data | Data Analysis | t-Test: Paired Two Sample for Means | Select Variable 1 & 2 | Select Output Range
• P(T<=t) two-tail < .05, or |t Stat| > t Critical two-tail = reject H0
What’s the result?
• t Stat is minus
The pre (competence) < the post (graduation)
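The paired t-test is a single-sample t-test on the differences between each participant's two scores. A minimal sketch with toy pre/post scores (not the competence and graduation data). The function name is made up, and 3.182 is the two-tailed .05 critical value for df = 3 from a standard t table.

```python
import math
import statistics


def paired_t(first, second):
    """Return (t, df) for a paired-samples t-test on two paired lists."""
    if len(first) != len(second):
        raise ValueError("each participant needs a score in both lists")
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    return statistics.mean(differences) / standard_error, n - 1


# Toy data: four participants, each with a pre-test and a post-test score.
pre = [1, 2, 3, 4]
post = [2, 4, 5, 8]
t, df = paired_t(pre, post)
T_CRITICAL = 3.182  # two-tailed, alpha = .05, df = 3 (from a t table)
print(f"t({df}) = {t:.3f}")
print("reject H0" if abs(t) > T_CRITICAL else "keep H0")
```

The minus sign on t only says that the first list (pre) has the smaller mean. The decision to reject H0 uses the size of t, ignoring the sign.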
49. Paired-Samples t-Test in SPSS
• Analyze | Compare
Means | Paired-
Samples T-Test |
Input the two
variables
50. Results
• Paired-Samples Statistics
• Paired-Samples Correlations
Pearson’s r and sig. (r --> see effect, significant --> < .05)
• Paired-Samples Test
Mean = difference of means between the two variables
t value = minus --> first variable has smaller mean
df = sample size – 1 (degrees of freedom)
Sig. = significant --> p < .05
51. Results
Pearson’s r
Significant --> sig. < .05
Correlation --> size of effect
Significant --> sig. < .05
t minus --> first variable has smaller mean
52. Reporting on Results
On average, the participants had significantly higher scores on graduation grade (M = 71.40, SE = 2.001) than on competence score (M = 67.95, SE = 2.328), t(19) = .00, p < .05, with a large effect (r = .954)
Legend
• M – mean
• SE – standard error
• t(19) – t value with df = 19
• r – effect size, from the formula shown on the slide (large effect)
53. Independent T-test
• Compare the means of two groups of participants in two different conditions
• The groups are independent of each other
MS Excel --> always assume unequal variances, or do F-Test Two-Sample for Variances to decide if they are equal/unequal, then choose the appropriate independent t-test
SPSS --> checked using Levene’s test in the results of the independent t-test
• E.g. the scores of two groups of participants after being taught using pictures vs. film
54. Independent T-test using MSExcel
• Data | Data Analysis | t-Test: Two-Sample Assuming Unequal Variances | Select Variable 1 & 2 (by group) | Select Output Range
• H0 = there is no difference between the two groups
• P(T<=t) two-tail < .05, or |t Stat| > t Critical two-tail = reject H0
What’s the result?
• t Stat is minus
Pictures group < film group
55. Independent T-test Using SPSS
• Analyze | Compare
Means |
Independent-
Samples t-Test |
Insert the test
variable & grouping
variable
56. Results
• Group Statistics
• Independent Samples Test
Homogeneity of variances using Levene’s test – should be NOT significant (groups are similar) --> sig. > .05 --> see the sig. of equal variances assumed (otherwise see equal variances not assumed)
Mean Difference = difference of means between groups
t value = minus --> first group has smaller mean
df = total sample size – 2 (degrees of freedom, equal variances assumed)
Sig. = significant --> p < .05
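For two separate groups, the "equal variances assumed" row can be reproduced by pooling the two variances. The sketch below shows that version only (not Levene's test and not the unequal-variances row). The data are toy numbers, the function name is made up, and 2.776 is the two-tailed .05 critical value for df = 4 from a standard t table.

```python
import math
import statistics


def independent_t(group1, group2):
    """Return (t, df) for an independent t-test assuming equal variances."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    # Pool the two sample variances, weighted by their degrees of freedom.
    pooled_variance = ((n1 - 1) * statistics.variance(group1)
                       + (n2 - 1) * statistics.variance(group2)) / df
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    t = (statistics.mean(group1) - statistics.mean(group2)) / standard_error
    return t, df


# Toy data: two small groups taught in different ways.
pictures = [1, 2, 3]
film = [2, 3, 4]
t, df = independent_t(pictures, film)
T_CRITICAL = 2.776  # two-tailed, alpha = .05, df = 4 (from a t table)
print(f"t({df}) = {t:.3f}")
print("reject H0" if abs(t) > T_CRITICAL else "keep H0")
```

With two groups the degrees of freedom are the total number of participants minus 2, which is why 20 participants in two groups of 10 are reported as t(18).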
57. Results
Levene’s Sig. > .05 --> groups are similar (good!) --> equal variances assumed
Significant --> sig. < .05
Mean Difference minus --> first group has smaller mean
58. Reporting on Results
• On average, participants that were taught using film had higher scores (M = 72, SE = 2.921) than those taught using pictures (M = 70.80, SE = 2.878). This difference was not significant, t(18) = -.773, p > .05.
• Legend – same as in dependent t-test
59. Confused?
• Ask now
• Ask me – F 505 by appointment
• Email me – neny@staff.uksw.edu
• Tweet me – @nenyish
• This presentation file is available at: