SPSS: Basic to Intermediate
Hiram Ting & Ernest Cyril de Run
16-17 May 2015, Kuching
Organized by Sarawak Research Society
Acknowledgement
Gratitude to Prof Ernest Cyril de Run and
Prof Thurasamy Ramayah for providing useful information
during the preparation of the workshop slides.
Content
Installation of SPSS
Introduction to SPSS
Understanding of Analysis
Preliminary Decision
Data Entry
Data Cleaning
Frequency
Cross-tabulation
Normality Test
Reliability Test
Validity Test
Handling Qualitative Data
Test of Independence
Test for Goodness of Fit
Test of Difference
• T-test
• ANOVA
Test of Relationship
• Pearson Correlation
• Linear Regressions
• Multiple Regressions
Factor Analysis
Presentation of Findings
Syntax
Preparation
• Install SPSS.
• Download workshop materials folder.
• Open SPSS Workshop 16-17 May 2015 file in the folder.
• Open SPSS to check whether it works as a full version.
Preparation
Hands-on Exercise
• Install SPSS (set-up)
• Click ‘OK’ for every step.
• Copy and paste license number, or
• Copy and paste crack files in your program folder.
Introduction to SPSS
What is SPSS?
• Statistical Package for the Social Sciences (SPSS) is a widely used
program for statistical analysis in social sciences. It is used by
market researchers, health researchers, survey companies,
government, education researchers, marketing organizations, data
miners and others. It is regarded as the first generation technique.
Introduction to SPSS
What is SPSS?
Statistics included in the software:
• Descriptive statistics: Cross tabulation, Frequencies, Descriptives,
Explore, Descriptive Ratio Statistics.
• Bivariate statistics: Means, t-test, ANOVA, Correlation.
• Prediction for numerical outcomes: Linear regression.
• Prediction for identifying groups: Factor analysis, cluster
analysis, Discriminant analysis.
• Non-parametric tests and others.
When is SPSS Useful
SPSS is useful for:
• Data entry
• Data cleaning
• Descriptive analysis and output
• Parametric and non-parametric test – tests of relationship and
difference
• Data division based on factors and groups
• Quantitative research with observed variables
• Qualitative research with coded themes
Understanding of Analysis
Before using SPSS, it is important to understand some of the
fundamental things in research and data analysis techniques.
• Types of data
• Levels of measurement
• Types of variable
• Key terms in research
• Types of analysis
• Missing values
Understanding of Analysis
Types of data
• Numeric
• String (Categorical)
Levels of measurement
• Nominal
• Ordinal
• Interval
• Ratio
• Continuous
Understanding of Analysis
Types of variable
• Independent
• Dependent
• Moderating
• Mediating
• Control
• Endogenous, exogenous
Understanding of Analysis
Key terms in research
• Theory
• Concept
• Construct
• Variable, item/indicator
• Model, framework
• Operational definition
Understanding of Analysis
• A theory is a set of systematically interrelated concepts, definitions, and
propositions that are advanced to explain and predict phenomena
(facts).
• A model is defined as a representation of a system that is
constructed to study some aspect of the system or the system as a
whole.
• Theory’s role is explanation whereas a model’s role is
representation.
• While theoretical framework is the theory on which the study is
based, conceptual framework is the operationalization of the theory.
It is the researcher’s own position on the problem and gives
direction to the study.
Understanding of Analysis
• A concept is a generally accepted collection of meanings or
characteristics associated with certain events, objects, conditions,
situations and behavior.
• A construct is an image or abstract idea specifically invented for a
given research and/or theory building purpose.
• A variable can be defined as any aspect of a theory that can vary or
change as part of the interaction within the theory.
• An operational definition is a definition stated in terms of specific
criteria for testing or measurement. Their characteristics and how
they are to be observed must be specified.
Understanding of Analysis
Types of Analysis
• Parametric
  - Normal distribution is assumed
• Non-parametric
  - Distribution free
• Types of variables involved
  - Univariate, bivariate, multivariate
Understanding of Analysis
Handling blank responses/missing values
• Initial screening
  - If the whole page is missing, discard the questionnaire.
  - If the whole section is missing, discard the questionnaire.
  - If important responses are missing (e.g. key questions using a single
    item), discard the questionnaire.
  - If straight-lining or an answering pattern is found, discard the
    questionnaire.
Understanding of Analysis
Handling blank responses/missing values
• Data cleaning
  - If more than 25% of a respondent's answers are missing, remove the
    observation. Hair et al. (2014) advocate a cut-off of less than 15%.
  - Using the midpoint of the scale.
  - Replacing blank responses with a value.
  - Mean of those responding or respondents.
  - Using Expectation-Maximization (EM).
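The removal and replacement rules above can be sketched in a few lines of Python. This is an illustration only, not what SPSS does internally: it assumes blanks are stored as `None`, uses the 25% cut-off from the slide, and fills the remaining blanks with the mean of those responding. The function names and the data are hypothetical.

```python
from statistics import mean

MISSING_LIMIT = 0.25  # cut-off from the slide; Hair et al. (2014) suggest 0.15


def missing_share(row):
    """Return the proportion of blank (None) answers in one response."""
    return sum(value is None for value in row) / len(row)


def clean(rows, limit=MISSING_LIMIT):
    """Drop cases above the limit, then fill remaining blanks with item means."""
    kept = [row for row in rows if missing_share(row) <= limit]
    n_items = len(kept[0])
    # Mean of those who did respond, computed item by item.
    item_means = [
        mean(row[i] for row in kept if row[i] is not None)
        for i in range(n_items)
    ]
    return [
        [item_means[i] if row[i] is None else row[i] for i in range(n_items)]
        for row in kept
    ]


# Hypothetical answers of four respondents to five Likert items.
responses = [
    [4, 5, 3, 4, 2],
    [3, None, 4, 4, 3],        # 20% missing: kept, blank replaced
    [None, None, 2, None, 5],  # 60% missing: removed
    [5, 3, 4, 2, 4],
]
cleaned = clean(responses)
```

Mean replacement is the simplest of the listed options; EM needs a statistical package and is not attempted here.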
Preliminary Decision
Instrument design
• Levels of measurement
• Types of scale
• Single or multiple items
• Positive or negative worded statements
• Structured, semi-structured or unstructured
• Wordings (e.g. double negatives, double-barrelled, culture-specific
terms, long complex questions)
Preliminary Decision
Distribution and collection of data
• Sampling technique
• Paper questionnaire, mail or online
• Interview or self-administered questionnaire
• Response rate: distributed, collected and usable copies
Preliminary Decision
In any report, the first thing normally reported is the response rate.
A low response rate raises questions about the representativeness of
the sample. A related concern is non-response: would the responses of
those who did not respond differ from the responses of those who did?
Preliminary Decision
Data analysis and interpretation
• Confidence level
• Significance level
• One-tailed or two-tailed
• Types of analytical method
• Hypothesis development and testing
Preliminary Decision
Addressing Errors
• Random sampling errors
• Systematic errors/non-sampling errors
  - Administrative errors: sample selection, administrator, data
    processing
  - Respondent errors: non-response, response bias (deliberate
    falsification, unconscious misinterpretation)
• Common method variance and social desirability
Preliminary Decision
Pre-test
• The purpose is to ensure the instrument is well designed, so that the
statements/questions are understood and responded to in the manner
for which they were developed.
• Using pilot study.
• Using debriefing or protocol method.
• Issue with sample size.
Data Entry
An Overview
• SPSS Data Editor
• SPSS Viewer (Output)
• Variable View
Includes Name, Type, Width, Decimals, Label, Values, Missing,
Columns, Align, Measure
• Data View
Data Entry
Rules for naming of variables
• Variable names:
• Must be unique (i.e. each variable must have a different name)
• Must begin with a letter (not a number)
• Cannot include full stops, spaces or symbols (! , ? * “)
• Cannot include words used as commands by SPSS (all, ne, eq, to, le,
lt, by, or, gt, and, not, ge, with)
• Cannot exceed 64 characters.
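As a rough illustration, the naming rules listed above can be written as a small Python checker. It encodes the rules exactly as stated on the slide; the function name, the treatment of the underscore as allowed, and the case-insensitive uniqueness check are assumptions made for this sketch, and SPSS's own validation may differ in detail.

```python
import re

# Reserved keywords listed on the slide.
RESERVED = {"all", "ne", "eq", "to", "le", "lt", "by", "or",
            "gt", "and", "not", "ge", "with"}


def name_problems(name, existing=()):
    """Return a list of rule violations for a proposed variable name."""
    problems = []
    if name.lower() in {other.lower() for other in existing}:
        problems.append("not unique")
    if not re.match(r"[A-Za-z]", name):
        problems.append("must begin with a letter")
    if re.search(r"[^A-Za-z0-9_]", name):
        problems.append("contains a space, full stop or symbol")
    if name.lower() in RESERVED:
        problems.append("reserved word")
    if len(name) > 64:
        problems.append("longer than 64 characters")
    return problems


# Hypothetical candidate names checked against one existing variable.
for candidate in ["GEN", "1stItem", "job sat", "by", "gen"]:
    print(candidate, name_problems(candidate, existing=["GEN"]))
```

An empty list means the name passes every rule on the slide.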
Data Entry
Hands-on Exercise
• Open SPSS.
• Open Questionnaire Sample.
• Begin with ‘Variable View’, fill up the first row with information
provided in Data Entry Exercise.
• Continue with the second and third rows.
• Continue with the fourth to sixth rows.
• Move to ‘Data View’, fill up the blanks with responses of five
respondents.
Data Cleaning
Hands-on Exercise
• Go to ‘Analyze’, click ‘Descriptive Statistics’ and ‘Frequencies’.
• Move every variable from left column to right column, click ‘OK’.
• Read the output and check.
• Addressing missing values using EM.
Useful Features
Hands-on Exercise
• Sort the data file
Go to ‘Data’, click ‘Sort Cases’, choose ‘Ascending’ or ‘Descending’
• Split the data file
Go to ‘Data’, click ‘Split File’ and ‘Compare Group’
• Select cases
Go to ‘Data’, click ‘Select Cases’, ‘If Condition is Satisfied’ and ‘If’. For
example, GEN = 1 to select only male respondents
Useful Features
Data Transformation
Reason for transformation
• to improve interpretation and compatibility with other data sets
• to enhance symmetry and stabilize spread
• to improve the linear relationship between the variables (standardized
score)
Data Transformation
Hands-on Exercise
• Recode
  - The purpose is to redefine categories of data.
  - Go to ‘Transform’, click ‘Recode into Different Variables’.
Data Transformation
Hands-on Exercise
• Compute
  - The purpose is to create a new variable.
  - Go to ‘Transform’, click ‘Compute Variable’.
Descriptive Analysis
• The purpose is to describe the distribution of the variable of interest.
• It includes Frequencies and Cross-tabulation for nominal or
categorical data, and Descriptives (Mean and Standard Deviation) for
continuous data.
Frequencies
• The purpose is to provide frequency counts. It is useful in presenting
respondents profile and categorical findings.
Hands-on Exercise
• Open Data Analysis Exercise
• Go to ‘Analyze’, click ‘Descriptive Statistics’ and ‘Frequencies’.
• Splitting dataset is useful when presenting findings based on
categories in separation. Go to ‘Data’, click ‘Split File’
Cross-tabulation
• The purpose is a joint frequency distribution of cases based on two or
more categorical variables.
• Chi-square will be explained in later slides.
Hands-on Exercise
• Go to ‘Analyze’, click ‘Descriptive Statistics’ and ‘Crosstabs’. Select
the variables for ‘Row’ and ‘Column’. In ‘Cells’, tick ‘Percentages’.
Cross-tabulation
Descriptives
• The purpose is to provide statistical summary of descriptive findings.
• ‘Kurtosis’ and ‘Skewness’ are useful to assess data distribution.
Hands-on Exercise
• Go to ‘Analyze’, click ‘Descriptive Statistics’ and ‘Descriptives’.
• Click ‘Option’, check ‘Mean’ and ‘Std. Deviation’.
Normality Test
• Parametric test assumes data is normally distributed.
• Assessing normality using Q-Q Plots.
• Hands-on Exercise: Go to ‘Analyze’, click ‘Descriptive Statistics’ and
‘Q-Q Plots’.
• Assessing normality using Explore.
• Hands-on Exercise: Go to ‘Analyze’, click ‘Descriptive Statistics’ and
‘Explore’.
• Assessing outliers using Scatterplot.
• Hands-on Exercise: Go to ‘Graphs’, and click ‘Legacy Dialogs’ and
‘Scatter/Dot’.
Normality Test
• Skewness value provides an indication of the symmetry of the
distribution. Kurtosis, on the other hand, provides information about
the ‘peakedness’ of the distribution.
• If the distribution is perfectly normal, you would obtain a skewness
and kurtosis value of 0 (rather an uncommon occurrence in the social
sciences).
• With reasonably large samples, skewness will not ‘make a
substantive difference in the analysis’ (Tabachnick & Fidell 2007, p.
80). Kurtosis can result in an underestimate of the variance, but this
risk is also reduced with a large sample (200+ cases: see Tabachnick
& Fidell 2007, p. 80).
Normality Test
General guideline
• From 5% Trimmed Mean, compare the original mean and the new
trimmed mean to assess whether extreme scores are having a strong
influence on the mean.
• Skewness and kurtosis values between -2 and +2 are considered
acceptable as an indication of a normal univariate distribution
(George & Mallery, 2010).
• For samples of more than 100, use the Kolmogorov-Smirnov test; for
samples of fewer than 100, use the Shapiro-Wilk test. A non-significant
result (Sig. value of more than .05) indicates normality.
• Shape of histogram, Q-Q plots and boxplot.
• Outliers appear as little circles with a number attached. Outliers are
cases with scores that are quite different from the remainder of the
sample, either much higher or much lower.
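To make the skewness and kurtosis guideline concrete, the sketch below computes both values from raw scores and checks them against the -2 to +2 range. It uses the plain moment-based formulas; SPSS applies a small-sample adjustment, so its output will differ slightly. The scores and names are hypothetical.

```python
from statistics import fmean


def skew_kurtosis(values):
    """Moment-based skewness and excess kurtosis (both 0 for a normal curve)."""
    n = len(values)
    centre = fmean(values)
    m2 = sum((v - centre) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - centre) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - centre) ** 4 for v in values) / n  # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3
    return skewness, excess_kurtosis


scores = [1, 2, 2, 3, 3, 3, 4, 4, 5]  # hypothetical, symmetric around 3
skewness, kurt = skew_kurtosis(scores)

# Guideline from the slide: both values between -2 and +2.
acceptable = -2 <= skewness <= 2 and -2 <= kurt <= 2
```

For these symmetric scores the skewness is zero and the kurtosis is slightly negative, meaning a somewhat flatter shape than the normal curve.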
Normality Test
Goodness of Measures
Reliability and Validity
Goodness of Measures
Reliability
• Typically, in any research we use a number of questions (sometimes
referred to as items) to measure a particular variable.
• Cronbach's Alpha is a measure of how well each individual item in a
scale correlates with the sum of the remaining items. It measures
consistency among individual items in a scale.
• Reliability refers to the degree of consistency, as Kerlinger (1986)
puts it; if a scale possesses a high reliability the scale is
homogeneous. According to Nunnally (1978) alpha values equal to or
greater than 0.70 are considered to be a sufficient condition. Thus, it
can be concluded that these measures possess sufficient reliability.
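For readers who want to see what lies behind the alpha value, here is a minimal sketch of the standard Cronbach's alpha formula, k/(k-1) × (1 − sum of item variances / variance of the total score), checked against the 0.70 threshold quoted above. The data are hypothetical and the function name is chosen for illustration only.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (no missing values)."""
    k = len(rows[0])
    items = list(zip(*rows))  # transpose into one column per item
    item_variance_sum = sum(variance(column) for column in items)
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical answers of five respondents to a three-item scale.
answers = [
    [4, 5, 4],
    [3, 3, 4],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 5],
]
alpha = cronbach_alpha(answers)
reliable = alpha >= 0.70  # Nunnally (1978) threshold cited on the slide
```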
Goodness of Measures
• Go to ‘Analyze’, click ‘Scale’ and ‘Reliability Analysis’.
Goodness of Measures
Several types of validity
• Content/Face validity
• Convergent validity
• Discriminant validity
• Criterion-related validity
Goodness of Measures
Content validity
• Content validity refers to the extent to which an instrument covers the
meanings included in the concept (Babbie, 1992). Researchers,
rather than by statistical testing, subjectively judge content validity
(Chow and Lui, 2001). The content validity of the proposed instrument
is at least sufficient because the instrument is carefully refined from a
proven instrument with an exhaustive literature review process (Chow
and Lui, 2001). This can also be tested during the pre-test by using
subjects who are qualified (academicians and practitioners) to rate
whether the content of each factor was well represented by the
measurement items (Saraph et al., 1989). As Nunnally (1967) put it
content validity depends on how well the researchers created
measurement items to cover the domain of the variable being
measured.
Goodness of Measures
Convergent validity
• According to Campbell and Fiske (1959) convergent validity refers to
all items measuring a construct actually loading on a single construct.
• The criteria used by Igbaria et al., 1995 to identify and interpret
factors were: each item should load 0.50 or greater on one factor and
0.35 or lower on the other factor.
• These results confirm that each of these constructs is unidimensional
and factorially distinct and that all items used to measure a particular
construct loaded on a single factor.
Goodness of Measures
Discriminant validity
• Discriminant validity refers to the extent to which measures of two
different constructs are relatively distinct, in that their correlation
is neither 0 nor 1 in absolute value (Campbell and Fiske, 1959).
Correlation analysis is used. If none of the factors are perfectly
correlated, that is, their correlation coefficients lie between 0 and 1,
we can conclude that discriminant validity has been established.
Goodness of Measures
Criterion-related validity
• Criterion-related validity refers to the extent to which the factors
measured are related to pre-specified criteria (Saraph et al., 1989).
It is also called nomological validity or external validity. We can
assess it by running a multiple regression analysis and looking at
the Multiple R value (correlation coefficient); the values we are
looking for are any values higher than 0.5.
Handling Qualitative Data
• If data is collected using qualitative methods, such as interview and
focus group, coding process is required to quantify the themes before
using SPSS to perform any analysis.
Handling Qualitative Data
Example: Beliefs about the use of Instagram
1. I enjoy using Instagram coz it is fun.
2. Taking and uploading pictures are what attracts me.
3. When I am bored, I play Instagram.
4. I am enthralled by its ease of use.
5. I find hashtag a useful function of Instagram.
6. It is easy to use, even my little brother is using it.
How many theme(s) can you identify?
Test of Independence
• Chi-square test is used when you wish to explore the relationship
between 2 categorical variables with each having 2 or more
categories.
• It is also a statistical method assessing the goodness of fit between a
set of observed values and those expected hypothetically. It is used
when the parameter to be tested is proportion and there is no
assumption of normality.
• If the level of significance is set at 0.05, then p-value of less than 0.05
means rejection of null hypothesis.
Test of Independence
Chi square test for independence
• Example: Is the proportion of male employees with high intention to
share information the same as the proportion of female employees with
high intention to share information?
  - For Gender, we have (1 = Male, 2 = Female), whereas for Level, we
    have (1 = Low, 2 = High).
  - As such, we will have a 2 × 2 contingency table.
Test of Independence
Chi square test for goodness of fit
• Example: A researcher would like to test the association between
cigarette smoking and lung cancer. After randomly selecting smokers,
it is found that 25 out of 65 heavy smokers are at high risk of
developing lung cancer while for light smokers the figure is 20 out of
124.
• H0: There is no association
Ha: There is an association
• Findings: χ² = 11.66; p-value = 0.0016
• Decision: Reject null hypothesis
• Conclusion: Smoking and lung cancer risk are associated
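The calculation behind this example can be sketched as follows. The code computes the uncorrected Pearson chi-square for a 2 × 2 table from the counts given above; it is an illustration rather than the procedure used to produce the slide's figures, so the result may differ slightly from the reported values (for instance if a different variant of the statistic was used). The function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for a 2 x 2 table.

    Layout:      outcome yes   outcome no
      group 1        a             b
      group 2        c             d
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper-tail probability has a closed form.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts from the example: 25 of 65 heavy smokers, 20 of 124 light smokers.
chi2, p_value = chi_square_2x2(25, 65 - 25, 20, 124 - 20)
reject_null = p_value < 0.05
```

The decision is the same as on the slide: the p-value is below 0.05, so the null hypothesis of no association is rejected.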
Test of Independence
Hands-on Exercise
Test for Independence/Relatedness
• Go to ‘Analyze’, click ‘Descriptive Statistics’ and ‘Crosstabs’. In
‘Statistics’, tick ‘Chi-square’.
Test for Goodness of Fit
• Go to ‘Analyze’, click ‘Nonparametric Tests’, ‘Legacy Dialogs’ and
‘Chi-square’.
Test of Difference
• Parametric Techniques: t-test; paired t-test, one-way ANOVA, two-
way ANOVA
• Non-Parametric Techniques: Mann-Whitney/Wilcoxon rank sum test;
Wilcoxon signed rank sum test, Kruskal Wallis; Friedman test
Test of Difference
Independent Sample T-test
• Comparing two populations/groups using Mean
Paired Samples T-test
• Comparing the Mean of two related populations/groups
One-way ANOVA
• Comparing the Mean of more than two populations/groups
Hands-on Exercise
• Go to ‘Analyze’, click ‘Compare Means’.
Test of Difference
How to interpret the findings.
• For the independent-samples t-test, if Levene's test is significant
(Sig. value less than 0.05), the variances of the two samples are
significantly different.
• For the paired-samples t-test, the two variables are significantly
correlated if the Sig. value is less than 0.05.
• If the t-test is significant (Sig. 2-tailed value less than 0.05), the
two samples differ significantly on the variable under investigation.
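The two rows of the independent-samples output correspond to two versions of the t statistic. The sketch below computes both from raw scores: the pooled version (equal variances assumed) and the Welch version (equal variances not assumed), which is the one to read when Levene's test is significant. The data and names are hypothetical, and the Sig. value is left to SPSS because the t distribution is not in Python's standard library.

```python
from math import sqrt
from statistics import mean, variance


def t_statistics(group_a, group_b):
    """Return (pooled t, Welch t) for two independent samples."""
    na, nb = len(group_a), len(group_b)
    va, vb = variance(group_a), variance(group_b)
    diff = mean(group_a) - mean(group_b)
    # Equal variances assumed: one pooled variance estimate.
    pooled_var = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    t_pooled = diff / sqrt(pooled_var * (1 / na + 1 / nb))
    # Equal variances not assumed: each group keeps its own variance.
    t_welch = diff / sqrt(va / na + vb / nb)
    return t_pooled, t_welch


# Hypothetical intention scores for two groups of five respondents.
male = [4, 5, 3, 4, 4]
female = [3, 2, 3, 4, 3]
t_pooled, t_welch = t_statistics(male, female)
```

With equal group sizes and equal variances, as in this example, the two versions coincide; they diverge when the groups differ in size or spread.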
Test of Difference
• Example: A one-way between-group ANOVA was conducted to test
whether intention to share differed by level of education.
Test of Difference
How to interpret the findings.
• There was a statistically significant difference at the p < 0.05 level
in intention scores for the 4 educational levels [F(3,188) = 2.728,
p = 0.045]. Despite reaching statistical significance, the actual
difference in mean scores between the groups was quite small. The
effect size, calculated using eta squared, was 0.04. Post-hoc
comparison using Duncan’s range test indicated that the mean
scores for Masters (M = 3.51, SD = 0.92) and First degree (M = 3.82,
SD = 0.62) were statistically different from PhD (M = 4.40, SD = 0.46).
Those with Diploma education (M = 3.89, SD = 0.47) did not differ
statistically from the PhD group.
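The effect size quoted above can be recovered from the reported F value and degrees of freedom, since eta squared (between-groups sum of squares over total sum of squares) can be rewritten in terms of F. A short sketch, with a hypothetical function name:

```python
def eta_squared_from_f(f_value, df_between, df_within):
    """Eta squared = SS_between / SS_total, rewritten in terms of F."""
    return f_value * df_between / (f_value * df_between + df_within)


# Values reported in the example: F(3, 188) = 2.728.
eta_sq = eta_squared_from_f(2.728, 3, 188)
print(round(eta_sq, 2))
```

This gives roughly 0.04, in line with the reported effect size.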
Test of Relationship
Correlation
• Correlation is used to denote association between two quantitative
variables, assuming that the association is linear. It provides
information about the strength and direction of relationship.
• Strength: 0.10-0.29 (small), 0.30-0.49 (medium), 0.50-1.00 (large)
Hands-on Exercise
• Go to ‘Analyze’, click ‘Correlate’ and ‘Bivariate’.
• Select ‘Pearson’ (for continuous data) and ‘One-tailed’.
• If the Sig. value is less than 0.05, then the two variables are
significantly correlated. Only then the strength and direction of
relationship are looked at.
Test of Relationship
How to present the findings.
• There was a strong positive correlation between intention to share
and actual sharing [r=0.76, n=192, p<0.01] with high levels of
intention associated with high levels of actual sharing.
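As an illustration of how the coefficient and its strength label are obtained, the sketch below computes Pearson's r from raw scores and classifies it with the bands given earlier. The data are hypothetical, and the "negligible" label for values below 0.10 is an assumption added for completeness.

```python
from math import sqrt
from statistics import fmean


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def strength(r):
    """Label the size of r using the bands on the slide."""
    size = abs(r)
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "small"
    return "negligible"  # label assumed; the slide gives no band below 0.10


# Hypothetical scores for five respondents.
intention = [2, 3, 3, 4, 5]
sharing = [1, 3, 2, 4, 5]
r = pearson_r(intention, sharing)
label = strength(r)
```

The sign of r gives the direction of the relationship and its absolute value gives the strength.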
Test of Relationship
Regressions
• Simple linear regression is used when we would like to see the impact
of a single independent variable on a dependent variable.
• Multiple linear regression is used when we would like to see the
impact of more than one independent variable on a dependent
variable.
• Multiple regression analysis is a statistical technique that can be used
to analyze the relationship between a single dependent variable
(continuous) and several independent variables (continuous or even
nominal). In the case of nominal independent variables, dummy
variables are introduced.
Test of Relationship
• In standard multiple regression, all of the independent variables are
entered into the regression equation at the same time
• Multiple R and R² measure the strength of the relationship between
the set of independent variables and the dependent variable. An F
test is used to determine if the relationship can be generalized to the
population represented by the sample.
• A t-test is used to evaluate the individual relationship between each
independent variable and the dependent variable.
Test of Relationship
Things to consider:
• Strong Theory (conceptual or theoretical)
• Measurement Error
The degree to which the variable is an accurate and consistent
measure of the concept being studied. If the error is high, then even
the best predictors may not be able to achieve sufficient predictive
accuracy.
• Specification error
Inclusion of irrelevant variables or the omission of relevant variables
from the set of independent variables.
Test of Relationship
Assumptions
• Normality
One of the basic assumptions is the normality which can be assessed
by plotting the histogram. If the histogram shows not much deviation
then we can assume the data follows a normal distribution.
Test of Relationship
• Normality of the error terms
The second assumption is that the error term must be normally
distributed. This can be assessed by looking at the normal P-P plot.
The points should be as close as possible to the diagonal line. If they
are, then we can assume that the error terms are normally distributed.
Test of Relationship
• Linearity
The third assumption is the relationship between the independent
variables and the dependent variable must be linear. This is
assessed by looking at the partial plots. The idea is to see if we can
draw a straight line on the scatter plot that is generated.
Test of Relationship
• Constant Variance (Homoscedasticity)
The fourth assumption is that the variance must be constant
(homoscedasticity) as opposed to not constant (heteroscedasticity).
Heteroscedasticity is generally observed when we see a consistent
pattern when we plot the studentized residual (SRESID) against the
standardized predicted value of Y (ZPRED).
Test of Relationship
• Multicollinearity
The fifth assumption concerns collinearity. It is a problem when the
independent variables are highly correlated with one another,
generally at r > 0.8 to 0.9, which is termed multicollinearity. To
assess this assumption we look at two indicators. The first is the
VIF and tolerance. A low tolerance value of < 0.1 results in a VIF
value of > 10, as VIF is 1/Tolerance. If the VIF is more than 10, we
can suspect a problem of multicollinearity.
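The relationship between tolerance and VIF can be shown in a couple of lines. The sketch assumes the usual definition of tolerance as 1 minus the R-squared obtained when one predictor is regressed on the other predictors; the R-squared value used here is hypothetical.

```python
def tolerance_and_vif(r_squared_j):
    """r_squared_j: R-squared from regressing predictor j on the other predictors."""
    tolerance = 1 - r_squared_j
    vif = 1 / tolerance
    return tolerance, vif


# Hypothetical predictor that is 92% explained by the other predictors.
tol, vif = tolerance_and_vif(0.92)
suspect = tol < 0.1 or vif > 10  # thresholds from the slide
```

Here the tolerance is about 0.08 and the VIF about 12.5, so multicollinearity would be suspected.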
Test of Relationship
• Multicollinearity
The second indicator is the condition index. If this value exceeds 30
we can also suspect the presence of multicollinearity. When the value
is more than 30 we should also look across the variance proportions
and see if we can spot two or more variables with a value of 0.9 and
above, excluding the constant. Only if there are two or more can we
conclude there is multicollinearity.
Test of Relationship
• Independence of the error term - Autocorrelation
This is an assumption that is particularly a problem with time series
data and not for cross sectional data. We assume that each
predicted value is independent, which means that the predicted
value is not related to any other prediction; that is, they are not
sequenced by any variable such as time. This can be assessed by
looking at the Durbin Watson value. If the D-W value is between 1.5
– 2.5 then we can assume there is no problem.
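For reference, the Durbin-Watson value is the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below computes it and applies the 1.5 to 2.5 range from the slide; the residuals are hypothetical and listed in case order.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals that alternate in sign from case to case.
residuals = [0.5, -0.3, 0.2, -0.4, 0.1, 0.3, -0.2]
dw = durbin_watson(residuals)
no_problem = 1.5 <= dw <= 2.5  # range from the slide
```

Values near 2 suggest independent errors. These strongly alternating residuals give a value above 2.5, so the check fails, which is the kind of pattern the guideline is meant to flag.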
Test of Relationship
• Outliers
These are values so extreme that they can influence the results of the
regression. Usually the threshold is set at ± 3 standard deviations.
Although this is the default, some researchers may set a threshold of
± 2.5 to get better predictive power. Outliers can be easily identified
by checking whether any cases are listed in the casewise diagnostics.
Test of Relationship
Hands-on Exercise
• Go to ‘Analyze’, click ‘Regression’ and ‘Linear’.
Descriptive Statistics
                                         Mean   Std. Deviation   N
HOW OFTEN R ATTENDS RELIGIOUS SERVICES   3.15   2.653            113
STRENGTH OF AFFILIATION                  2.12   1.084            113
HOW OFTEN DOES R PRAY                    2.90   1.575            113
The minimum ratio of valid cases to independent variables for multiple
regression is 5 to 1. With 113 valid cases and 2 independent variables,
the ratio for this analysis is 56.5 to 1, which satisfies the minimum
requirement.
Different authors tend to give different guidelines concerning the
number of cases required for multiple regression. Stevens (1996, p.
72) recommends that ‘for social science research, about 15
participants per predictor are needed for a reliable equation’.
Test of Relationship
ANOVA(b)
Model 1      Sum of Squares   df    Mean Square   F        Sig.
Regression   374.757          2     187.379       49.824   .000(a)
Residual     413.685          110   3.761
Total        788.442          112
a. Predictors: (Constant), HOW OFTEN DOES R PRAY, STRENGTH OF AFFILIATION
b. Dependent Variable: HOW OFTEN R ATTENDS RELIGIOUS SERVICES
The probability of the F statistic (49.824) for the
overall regression relationship is <0.001, less than or
equal to the level of significance of 0.05. We reject
the null hypothesis that there is no relationship
between the set of independent variables and the
dependent variable (R² = 0). We support the
research hypothesis that there is a statistically
significant relationship between the set of
independent variables and the dependent variable.
Test of Relationship
Model Summary
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .689(a)   .475       .466                1.939
a. Predictors: (Constant), HOW OFTEN DOES R PRAY, STRENGTH OF AFFILIATION
Look in the Model Summary box and check the value given under the
heading R Square. This tells you how much of the variance in the
dependent variable is explained by the model. The rule of thumb: a
correlation less than or equal to 0.20 is characterized as very weak;
greater than 0.20 and less than or equal to 0.40 is weak; greater than 0.40
and less than or equal to 0.60 is moderate; greater than 0.60 and less
than or equal to 0.80 is strong; and greater than 0.80 is very strong.
You will notice an Adjusted R Square value in the output. When a small
sample is involved, the R square value in the sample tends to be a rather
optimistic overestimation of the true value in the population (see
Tabachnick & Fidell 2007). The Adjusted R square statistic ‘corrects’ this
value to provide a better estimate of the true population value.
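The figures in the Model Summary and ANOVA output are linked by simple arithmetic. The sketch below recomputes R Square, Adjusted R Square, F and the standard error of the estimate from the sums of squares, the number of cases (113) and the number of predictors (2) reported in this example; the variable names are chosen for illustration.

```python
from math import sqrt

# Figures taken from the ANOVA output of this example.
ss_regression, ss_residual = 374.757, 413.685
n_cases, n_predictors = 113, 2

ss_total = ss_regression + ss_residual
df_residual = n_cases - n_predictors - 1

# Share of variance in the dependent variable explained by the model.
r_square = ss_regression / ss_total
# Correction for sample size and number of predictors.
adj_r_square = 1 - (1 - r_square) * (n_cases - 1) / df_residual
# Ratio of the regression mean square to the residual mean square.
f_value = (ss_regression / n_predictors) / (ss_residual / df_residual)
# Square root of the residual mean square.
std_error = sqrt(ss_residual / df_residual)

print(round(r_square, 3), round(adj_r_square, 3),
      round(f_value, 3), round(std_error, 3))
```

Rounded to three decimals, these reproduce the values shown in the output: .475, .466, 49.824 and 1.939.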
Test of Relationship
Coefficients(a)
                          Unstandardized Coefficients   Standardized Coefficients
Model 1                   B        Std. Error           Beta                        t        Sig.
(Constant)                7.167    .442                                             16.206   .000
STRENGTH OF AFFILIATION   -1.138   .194                 -.465                       -5.857   .000
HOW OFTEN DOES R PRAY     -.554    .134                 -.329                       -4.145   .000
a. Dependent Variable: HOW OFTEN R ATTENDS RELIGIOUS SERVICES
For the independent variable strength of affiliation, the
probability of the t statistic (-5.857) for the b
coefficient is <0.001 which is less than or equal to the
level of significance of 0.05. We reject the null
hypothesis that the slope associated with strength of
affiliation is equal to zero (b = 0) and conclude that
there is a statistically significant relationship between
strength of affiliation and frequency of attendance at
religious services.
Test of Relationship
The beta coefficient associated with strength of affiliation is negative, indicating
an inverse relationship in which higher numeric values for strength of affiliation
are associated with lower numeric values for frequency of attendance at
religious services.
To compare the different variables it is important that you look at the
standardised coefficients, not the unstandardised ones. ‘Standardised’ means
that these values for each of the different variables have been converted to the
same scale so that you can compare them. If you were interested in
constructing a regression equation, you would use the unstandardised
coefficient values listed as B.
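As a small illustration of that last point, the unstandardised B values from the Coefficients output can be put together into a prediction equation. The input codes used in the call are hypothetical, and the function name is chosen for this sketch.

```python
def predicted_attendance(strength_of_affiliation, prayer_frequency):
    """Regression equation built from the unstandardised B coefficients."""
    return 7.167 - 1.138 * strength_of_affiliation - 0.554 * prayer_frequency


# Hypothetical respondent coded 2 on strength of affiliation and 3 on prayer.
estimate = predicted_attendance(2, 3)
```

Because both coefficients are negative, higher codes on either predictor lower the predicted attendance code.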
Factor Analysis
• The purpose is to define the underlying structure in a data matrix;
analyze the structure of interrelationships among a large number of
variables by defining a set of common underlying dimensions called
factors.
• Factor analysis in SPSS is exploratory in function. The analysis is
driven by data, rather than theory.
• Sample size: preferably >100 cases or the ratio of 20:1
(case/variable).
Factor Analysis
• Important decisions include:
  - Correlation matrix: KMO and Bartlett’s test, Anti-image
  - Methods of extracting factors: Principal component
  - Latent root/eigenvalue criterion (> 1)
  - A priori criterion on the number of factors to be extracted
  - Percentage of variance explained (> 50%)
  - Rotation: Varimax, Promax
  - Loading significance (> 0.3 if 350 cases, > 0.5 if 120)
Factor Analysis
Hands-on Exercise
• Go to ‘Analyze’, click ‘Data Reduction’ and ‘Factor’.
Using Syntax
• Syntax in SPSS is the program language.
• If you need to repeat your analysis, you can save the command
language in a ‘Syntax’ file so that you can run an analysis at a later
date or to repeat various analyses.
• Whenever you run an analysis, you will notice that there is a Paste
button. When you click on the paste button, a syntax file will open with
the syntax for the analysis that you intended to do.
Hands-on Exercise
• Go to ‘File’, click ‘New/Open’ and ‘Syntax’.
Thank You
Next workshop
22 May : Advanced PLS-SEM
23-24 May : Advanced SPSS, Process
Join us at Sarawak Research Society on
Thank You
Hiram Ting, PhD
Email: hiramparousia@gmail.com
Ernest Cyril de Run, PhD
Email: drernest@feb.unimas.my

Workshop on SPSS: Basic to Intermediate Level

  • 1. SPSS: Basic to Intermediate Hiram Ting & Ernest Cyril de Run 16-17 May 2015, Kuching Organized by Sarawak Research Society
  • 2. Acknowledgement Gratitude to Prof Ernest Cyril de Run and Prof Thurasamy Ramayah for providing useful information during the preparation of the workshop slides.
  • 3. Content Installation of SPSS Introduction to SPSS Understanding of Analysis Preliminary Decision Data Entry Data Cleaning Frequency Cross-tabulation Normality Test Reliability Test Validity Test Handling Qualitative Data Test of Independence Test for Goodness of Fit Test of Difference • T-test • ANOVA Test of Relationship • Pearson Correlation • Linear Regressions • Multiple Regressions Factor Analysis Presentation of Findings Syntax
  • 4. Preparation • Install SPSS. • Download workshop materials folder. • Open SPSS Workshop 16-17 May 2015 file in the folder. • Open SPSS to check whether it works as a full version.
  • 5. Preparation Hands-on Exercise • Install SPSS (set-up) • Click ‘OK’ for every step. • Copy and paste license number, or • Copy and paste crack files in your program folder.
  • 6. Introduction to SPSS What is SPSS? • Statistical Package for the Social Sciences (SPSS) is a widely used program for statistical analysis in social sciences. It is used by market researchers, health researchers, survey companies, government, education researchers, marketing organizations, data miners and others. It is regarded as the first generation technique.
  • 7. Introduction to SPSS What is SPSS? Statistics included in the software: • Descriptive statistics: Cross tabulation, Frequencies, Descriptives, Explore, Descriptive Ratio Statistics. • Bivariate statistics: Means, t-test, ANOVA, Correlation. • Prediction for numerical outcomes: Linear regression. • Prediction for identifying groups: Factor analysis, cluster analysis, Discriminant analysis. • Non-parametric tests and others.
  • 8. When is SPSS Useful SPSS is useful for: • Data entry • Data cleaning • Descriptive analysis and output • Parametric and non-parametric test – tests of relationship and difference • Data division based on factors and groups • Quantitative research with observed variables • Qualitative research with coded themes
  • 10. Understanding of Analysis Before using SPSS, it is important to understand some of the fundamental things in research and data analysis techniques. • Types of data • Levels of measurement • Types of variable • Key terms in research • Types of analysis • Missing values
  • 11. Understanding of Analysis Types of data • Numeric • String (Categorical) Levels of measurement • Nominal • Ordinal • Interval • Ratio • Continuous
  • 13. Understanding of Analysis Types of variable • Independent • Dependent • Moderating • Mediating • Control • Endogenous, exogenous
  • 14. Understanding of Analysis A B C D G F E J1 J2 J3 J4 H I
  • 15. Understanding of Analysis Key terms in research • Theory • Concept • Construct • Variable, item/indicator • Model, framework • Operational definition
  • 16. Understanding of Analysis • A theory is a set of systematically interrelated concepts, definitions, and propositions that are advanced to explain and predict phenomena (facts). • A model is defined as a representation of a system that is constructed to study some aspect of the system or the system as a whole. • A theory’s role is explanation whereas a model’s role is representation. • While the theoretical framework is the theory on which the study is based, the conceptual framework is the operationalization of the theory. It is the researcher’s own position on the problem and gives direction to the study.
  • 17. Understanding of Analysis • A concept is a generally accepted collection of meanings or characteristics associated with certain events, objects, conditions, situations and behavior. • A construct is an image or abstract idea specifically invented for a given research and/or theory building purpose. • A variable can be defined as any aspect of a theory that can vary or change as part of the interaction within the theory. • An operational definition is a definition stated in terms of specific criteria for testing or measurement. Their characteristics and how they are to be observed must be specified.
  • 19. Understanding of Analysis Types of Analysis • Parametric: normal distribution is assumed • Non-parametric: distribution-free • Types of variables involved: univariate, bivariate, multivariate
  • 20. Understanding of Analysis Handling blank responses/missing values • Initial screening - If the whole page is missing, discard the questionnaire. - If the whole section is missing, discard the questionnaire. - If important responses are missing (e.g. key questions measured with a single item), discard the questionnaire. - If straight-lining or another answering pattern is found, discard the questionnaire.
  • 21. Understanding of Analysis Handling blank responses/missing values • Data cleaning - If > 25% of the responses are missing, remove the observation. Hair et al. (2014) advocate less than 15%. - Using the midpoint of the scale. - Replacing blank responses with a value. - Using the mean of those responding or respondents. - Using Expectation-Maximization (EM).
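The first and fourth data-cleaning rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the responses are invented, `None` stands for a blank answer, the function names are hypothetical, and the fill value is the item mean across the remaining respondents. It is not how SPSS itself handles missing values.

```python
def missing_share(row):
    """Fraction of items in one respondent's row that are blank (None)."""
    return sum(value is None for value in row) / len(row)


def clean(rows, max_missing=0.25):
    """Drop respondents above the threshold, then fill remaining blanks
    with the mean of those who answered that item."""
    kept = [list(row) for row in rows if missing_share(row) <= max_missing]
    for col in range(len(kept[0])):
        answered = [row[col] for row in kept if row[col] is not None]
        item_mean = sum(answered) / len(answered)
        for row in kept:
            if row[col] is None:
                row[col] = item_mean
    return kept


# Hypothetical answers on a 5-point scale (assumption, not workshop data).
responses = [
    [4, 5, 3, 4],
    [2, None, 3, 3],        # 25% missing: kept, blank filled with item mean
    [3, 1, 2, 5],
    [None, None, 1, None],  # 75% missing: removed
]
cleaned = clean(responses)
```

The sketch assumes at least one respondent survives the screening and that every item has at least one answer; a real data set would need checks for both.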
  • 22. Preliminary Decision Instrument design • Levels of measurement • Types of scale • Single or multiple items • Positively or negatively worded statements • Structured, semi-structured or unstructured • Wordings (e.g. double negatives, double-barrelled, culture-specific terms, long complex questions)
  • 23. Preliminary Decision Distribution and collection of data - Sampling technique - Paper questionnaire, mail or online - Interview or self-administered questionnaire - Response rate: distributed, collected and usable copies
  • 24. Preliminary Decision In any report, the first thing that is normally reported is the response rate. When the response rate is low, it raises questions about the representativeness of the sample. Another concern is the problem of non-response: would the responses of those who have not responded be different from those who responded?
  • 25. Preliminary Decision Data analysis and interpretation - Confidence level - Significance level - One-tailed or two-tailed - Types of analytical method - Hypothesis development and testing
  • 26. Preliminary Decision Addressing Errors - Random sampling errors - Systematic errors/non-sampling errors • Administrative errors: sample selection, administrator, data processing • Respondent errors: non-response, response bias (deliberate falsification, unconscious misinterpretation) - Common method variance and social desirability
  • 27. Preliminary Decision Pre-test - The purpose is to ensure the instrument is well designed, so that the statements/questions are understood and responded to in the manner for which they were developed. - Using pilot study. - Using debriefing or protocol method. - Issue with sample size.
  • 28. Data Entry An Overview • SPSS Data Editor • SPSS Viewer (Output) • Variable View Includes Name, Type, Width, Decimals, Label, Values, Missing, Columns, Align, Measure • Data View
  • 30. Data Entry Rules for naming of variables • Variable names: • Must be unique (i.e. each variable must have a different name) • Must begin with a letter (not a number) • Cannot include full stops, spaces or symbols (! , ? * “) • Cannot include words used as commands by SPSS (all, ne, eq, to, le, lt, by, or, gt, and, not, ge, with) • Cannot exceed 64 characters.
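The naming rules on this slide can be checked mechanically. The following Python sketch encodes only the rules as listed above; the function name is hypothetical, the underscore is assumed to be permitted, and names are assumed to be compared without regard to case. It is not an official SPSS validator.

```python
import re

# Reserved words exactly as listed on the slide.
RESERVED = {"all", "ne", "eq", "to", "le", "lt", "by", "or",
            "gt", "and", "not", "ge", "with"}


def name_problems(name, existing=()):
    """Return the list of rules a proposed variable name breaks."""
    problems = []
    if name.lower() in {other.lower() for other in existing}:
        problems.append("not unique")
    if not name[:1].isalpha():
        problems.append("must begin with a letter")
    # Assumption: letters, digits and underscores are the allowed characters.
    if not re.fullmatch(r"[A-Za-z0-9_]*", name):
        problems.append("contains a full stop, space or symbol")
    if name.lower() in RESERVED:
        problems.append("reserved word")
    if len(name) > 64:
        problems.append("longer than 64 characters")
    return problems
```

For example, `name_problems("GEN")` returns an empty list, while `name_problems("my var")` reports the space.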
  • 31. Data Entry Hands-on Exercise • Open SPSS. • Open Questionnaire Sample. • Begin with ‘Variable View’, fill up the first row with information provided in Data Entry Exercise. • Continue with the second and third rows. • Continue with the fourth to sixth rows. • Move to ‘Data View’, fill up the blanks with responses of five respondents.
  • 32. Data Cleaning Hands-on Exercise • Go to ‘Analyze’, click ‘Descriptive Statistics’ and ‘Frequencies’. • Move every variable from left column to right column, click ‘OK’. • Read the output and check. • Addressing missing values using EM.
  • 33. Useful Features Hands-on Exercise • Sort the data file Go to ‘Data’, click ‘Sort Cases’, choose ‘Ascending’ or ‘Descending’ • Split the data file Go to ‘Data’, click ‘Split File’ and ‘Compare Group’ • Select cases Go to ‘Data’, click ‘Select Cases’, ‘If Condition is Satisfied’ and ‘If’. For example, GEN = 1 to select only male respondents
  • 35. Data Transformation Reasons for transformation - To improve interpretation and compatibility with other data sets - To enhance symmetry and stabilize spread - To improve the linear relationship between the variables (standardized score)
  • 36. Data Transformation Hands-on Exercise • Recode - The purpose is to redefine categories of data. - Go to ‘Transform’, click ‘Recode into Different Variables’.
  • 37. Data Transformation Hands-on Exercise • Compute - The purpose is to create a new variable. - Go to ‘Transform’, click ‘Compute Variable’.
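What Recode and Compute do to the data can be illustrated outside SPSS. The Python sketch below assumes a 5-point scale, a negatively worded third item, and invented variable names (`int1`, `int2`, `int3`); it mirrors the idea of the two menu commands rather than reproducing them.

```python
def reverse_code(score, low=1, high=5):
    """Recode a negatively worded item so that a high score means agreement."""
    return low + high - score


def mean_score(items):
    """Compute a new variable as the mean of a respondent's item scores."""
    return sum(items) / len(items)


# Hypothetical respondent: three intention items, the third negatively worded.
int1, int2, int3 = 4, 5, 2
int3_r = reverse_code(int3)                    # recode into a different variable
intention = mean_score([int1, int2, int3_r])   # compute a new variable
```

Keeping the recoded value in a separate variable (`int3_r`) matches the ‘Recode into Different Variables’ option, which leaves the original answers untouched.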
  • 38. Descriptive Analysis • The purpose is to describe the distribution of the variable of interest. • It includes Frequencies and Cross-tabulation for nominal or categorical data, and Descriptives (Mean and Standard Deviation) for continuous data.
  • 39. Frequencies • The purpose is to provide frequency counts. It is useful in presenting respondents’ profiles and categorical findings. Hands-on Exercise • Open Data Analysis Exercise • Go to ‘Analyze’, click ‘Descriptive Statistics’ and ‘Frequencies’. • Splitting the dataset is useful when presenting findings separately for each category. Go to ‘Data’, click ‘Split File’
  • 41. Cross-tabulation • The purpose is to produce a joint frequency distribution of cases based on two or more categorical variables. • Chi-square will be explained in later slides. Hands-on Exercise • Go to ‘Analyze’, click ‘Descriptive Statistics’ and ‘Crosstabs’. Select the variables for ‘Row’ and ‘Column’. In ‘Cells’, click ‘Percentages’.
  • 43. Descriptives • The purpose is to provide a statistical summary of descriptive findings. • ‘Kurtosis’ and ‘Skewness’ are useful to assess data distribution. Hands-on Exercise • Go to ‘Analyze’, click ‘Descriptive Statistics’ and ‘Descriptives’. • Click ‘Options’, check ‘Mean’ and ‘Std. Deviation’.
  • 46. Normality Test • Parametric tests assume the data are normally distributed. • Assessing normality using Q-Q Plots. • Hands-on Exercise: Go to ‘Analyze’, click ‘Descriptive Statistics’ and ‘Q-Q Plots’. • Assessing normality using Explore. • Hands-on Exercise: Go to ‘Analyze’, click ‘Descriptive Statistics’ and ‘Explore’. • Assessing outliers using Scatterplot. • Hands-on Exercise: Go to ‘Graphs’, and click ‘Legacy Dialogs’ and ‘Scatter/Dot’.
  • 47. Normality Test • Skewness value provides an indication of the symmetry of the distribution. Kurtosis, on the other hand, provides information about the ‘peakedness’ of the distribution. • If the distribution is perfectly normal, you would obtain a skewness and kurtosis value of 0 (rather an uncommon occurrence in the social sciences). • With reasonably large samples, skewness will not ‘make a substantive difference in the analysis’ (Tabachnick & Fidell 2007, p. 80). Kurtosis can result in an underestimate of the variance, but this risk is also reduced with a large sample (200+ cases: see Tabachnick & Fidell 2007, p. 80).
  • 49. Normality Test General guideline • From the 5% Trimmed Mean, compare the original mean and the new trimmed mean to assess whether extreme scores are having a strong influence on the mean. • Values of skewness and kurtosis between -2 and +2 are considered acceptable as evidence of a normal univariate distribution (George & Mallery, 2010). • For samples of more than 100, use the Kolmogorov-Smirnov test; for samples of less than 100, use the Shapiro-Wilk test. A non-significant result (Sig. value of more than .05) indicates normality. • Shape of histogram, Q-Q plots and boxplot. • Outliers appear as little circles with a number attached. Outliers are cases with scores that are quite different from the remainder of the sample, either much higher or much lower.
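To see what the skewness and kurtosis guideline measures, the two statistics can be computed by hand. The Python sketch below uses the simple moment-based formulas, which is an assumption made for brevity: SPSS reports sample-adjusted versions, so its values differ slightly, especially in small samples. The function names and data are illustrative.

```python
def skewness_kurtosis(values):
    """Moment-based skewness and excess kurtosis (both 0 for a normal curve)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2 - 3
    return skew, kurt


def within_guideline(values, limit=2.0):
    """Apply the -2 to +2 rule of thumb to both statistics."""
    skew, kurt = skewness_kurtosis(values)
    return abs(skew) <= limit and abs(kurt) <= limit


symmetric = [1, 2, 3, 4, 5]        # hypothetical scores
with_outlier = [1] * 9 + [20]      # one extreme score pulls the tail out
```

Here `within_guideline(symmetric)` holds, while the single extreme score in `with_outlier` pushes the skewness beyond the limit.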
  • 53. Goodness of Measures Reliability • Typically, in any research we use a number of questions (sometimes referred to as items) to measure a particular variable. • Cronbach's Alpha is a measure of how well each individual item in a scale correlates with the sum of the remaining items. It measures consistency among individual items in a scale. • Reliability refers to the degree of consistency, as Kerlinger (1986) puts it; if a scale possesses a high reliability the scale is homogeneous. According to Nunnally (1978), alpha values equal to or greater than 0.70 are considered to be a sufficient condition. Measures that reach this value can therefore be concluded to possess sufficient reliability.
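For readers who want to see where the alpha value comes from, here is a minimal Python sketch of the usual variance-based formula, alpha = k/(k-1) x (1 - sum of item variances / variance of the total score). The item scores are invented and the function name is hypothetical; SPSS produces the same statistic through ‘Reliability Analysis’.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: one list of scores per item, respondents in the same order."""
    k = len(items)
    # Each respondent's total score across the k items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical 3-item scale answered by 5 respondents (assumption).
item1 = [4, 5, 3, 4, 2]
item2 = [4, 4, 3, 5, 2]
item3 = [5, 5, 2, 4, 1]
alpha = cronbach_alpha([item1, item2, item3])
sufficient = alpha >= 0.70   # the 0.70 threshold quoted above
```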
  • 54. Goodness of Measures • Go to ‘Analyze’, click ‘Scale’ and ‘Reliability Analysis’.
  • 55. Goodness of Measures Several types of validity • Content/Face validity • Convergent validity • Discriminant validity • Criterion-related validity
  • 57. Goodness of Measures Content validity • Content validity refers to the extent to which an instrument covers the meanings included in the concept (Babbie, 1992). Researchers, rather than by statistical testing, subjectively judge content validity (Chow and Lui, 2001). The content validity of the proposed instrument is at least sufficient because the instrument is carefully refined from a proven instrument with an exhaustive literature review process (Chow and Lui, 2001). This can also be tested during the pre-test by using subjects who are qualified (academicians and practitioners) to rate whether the content of each factor was well represented by the measurement items (Saraph et al., 1989). As Nunnally (1967) put it content validity depends on how well the researchers created measurement items to cover the domain of the variable being measured.
  • 58. Goodness of Measures Convergent validity • According to Campbell and Fiske (1959), convergent validity refers to all items measuring a construct actually loading on a single construct. • The criteria used by Igbaria et al. (1995) to identify and interpret factors were: each item should load 0.50 or greater on one factor and 0.35 or lower on the other factors. • Results that meet these criteria confirm that each of the constructs is unidimensional and factorially distinct and that all items used to measure a particular construct loaded on a single factor.
  • 60. Goodness of Measures Discriminant validity • Discriminant validity refers to the extent to which measures of two different constructs are relatively distinctive, such that their correlation values are neither an absolute value of 0 nor 1 (Campbell and Fiske, 1959). Correlation analysis is used. If the factors are not perfectly correlated, that is, their correlation coefficients range between 0 and 1, we can conclude that discriminant validity has been established.
  • 61. Goodness of Measures Criterion-related validity • Criterion-related validity refers to the extent to which the factors measured are related to pre-specified criteria (Saraph et al., 1989). This is also called nomological validity or external validity. We can also assess it by running a multiple regression analysis and looking at the Multiple R value (correlation coefficient); the values we are looking for are any values higher than 0.5.
  • 63. Handling Qualitative Data • If data are collected using qualitative methods, such as interviews and focus groups, a coding process is required to quantify the themes before using SPSS to perform any analysis.
  • 65. Handling Qualitative Data Example: Beliefs about the use of Instagram 1. I enjoy using Instagram coz it is fun. 2. Taking and uploading pictures are what attracts me. 3. When I am bored, I play Instagram. 4. I am enthralled by its ease of use. 5. I find hashtag a useful function of Instagram. 6. It is easy to use, even my little brother is using it. How many theme(s) can you identify?
  • 66. Test of Independence • Chi-square test is used when you wish to explore the relationship between 2 categorical variables with each having 2 or more categories. • It is also a statistical method assessing the goodness of fit between a set of observed values and those expected hypothetically. It is used when the parameter to be tested is proportion and there is no assumption of normality. • If the level of significance is set at 0.05, then p-value of less than 0.05 means rejection of null hypothesis.
  • 67. Test of Independence Chi square test for independence • Example: Is the proportion of male employees with high intention to share information the same as the proportion of female employees with high intention to share information? - For Gender, we have (1=Male/2=Female), whereas for Level, we have (1=Low/2=High). - As such, we will have a (2 x 2) contingency table.
  • 69. Test of Independence Chi square test for goodness of fit • Example: A researcher would like to test the association between cigarette smoking and lung cancer. After randomly selecting smokers, it is found that 25 out of 65 heavy smokers are at high risk of developing lung cancer while for light smokers the figure is 20 out of 124. • Ho: There is no association. Ha: There is an association. • Findings: χ² = 11.66; p-value = 0.0016 • Decision: Reject the null hypothesis • Conclusion: Smoking and lung cancer risk are associated
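The smoking example can be worked through by hand to show how the chi-square statistic is built from a 2 x 2 table. The Python sketch below assumes the Pearson statistic without continuity correction and lays the slide's counts out as rows for heavy and light smokers and columns for high risk and not high risk; the function name is hypothetical.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2 x 2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper-tail probability has a closed form.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts from the slide: 25 of 65 heavy smokers and 20 of 124 light smokers
# are at high risk; the remaining cells follow by subtraction.
observed = [[25, 65 - 25], [20, 124 - 20]]
chi2, p_value = chi_square_2x2(observed)
reject_null = p_value < 0.05
```

On these counts the uncorrected statistic works out to roughly 11.7 with a p-value well below 0.05, which leads to the same decision as the slide: reject the null hypothesis of no association.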
  • 71. Test of Independence Hands-on Exercise Test for Independence/Relatedness • Go to ‘Analyze’, click ‘Descriptive Statistics’ and ‘Crosstabs’. Test for Goodness of Fit • Go to ‘Analyze’, click ‘Nonparametric Tests’ and ‘Legacy Dialogs’.
  • 73. Test of Difference • Parametric Techniques: t-test, paired t-test, one-way ANOVA, two-way ANOVA • Non-Parametric Techniques: Mann-Whitney/Wilcoxon rank sum test, Wilcoxon signed rank sum test, Kruskal-Wallis test, Friedman test
  • 74. Test of Difference Independent Sample T-test • Comparing two populations/groups using Mean Paired Samples T-test • Comparing the Mean of two related populations/groups One-way ANOVA • Comparing the Mean of more than two populations/groups Hands-on Exercise • Go to ‘Analyze’, click ‘Compare Means’.
  • 76. Test of Difference How to interpret the findings. • For the independent t-test, if Levene’s test is significant (Sig. value is less than 0.05), the variances of the two samples are significantly different, so the ‘Equal variances not assumed’ row should be read. • For the paired t-test, the two variables are significantly correlated if the Sig. value in the Paired Samples Correlations table is less than 0.05. • If the t-test is significant (Sig. 2-tailed value is less than 0.05), the two samples differ significantly on the variable under investigation.
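To see what the two rows of the independent t-test output correspond to, the sketch below computes both versions of the t statistic in plain Python. It is an illustration only: the function name `independent_t` and the two small groups of scores are hypothetical, and p-values are left to SPSS.

```python
import math
from statistics import mean, variance

def independent_t(group_a, group_b):
    """Return (t, df) for the pooled-variance and the Welch versions of the t-test."""
    n_a, n_b = len(group_a), len(group_b)
    var_a, var_b = variance(group_a), variance(group_b)  # sample variances
    diff = mean(group_a) - mean(group_b)

    # Equal variances assumed: pooled variance, df = n_a + n_b - 2.
    df_pooled = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df_pooled
    t_pooled = diff / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))

    # Equal variances not assumed: Welch's t with Welch-Satterthwaite df.
    se2_a, se2_b = var_a / n_a, var_b / n_b
    t_welch = diff / math.sqrt(se2_a + se2_b)
    df_welch = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return (t_pooled, df_pooled), (t_welch, df_welch)

# Hypothetical scores for two groups (not workshop data).
scores_a = [1, 2, 3, 4, 5]
scores_b = [2, 4, 6, 8, 10]
pooled_result, welch_result = independent_t(scores_a, scores_b)
print("Equal variances assumed:     t = %.3f, df = %d" % pooled_result)
print("Equal variances not assumed: t = %.3f, df = %.2f" % welch_result)
```

With equal group sizes the two t values coincide and only the degrees of freedom differ, which is why the choice of row matters most when group sizes and variances are both unequal.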
  • 77. Test of Difference • Example: A one-way between-group ANOVA was conducted to test whether intention to share differed by level of education.
  • 81. Test of Difference How to interpret the findings. • There was a statistically significant difference at the p < 0.05 level in intention scores across the 4 educational levels [F(3, 188) = 2.728, p = 0.045]. Despite reaching statistical significance, the actual difference in mean scores between the groups was quite small. The effect size, calculated using eta squared, was 0.04. Post-hoc comparison using Duncan’s multiple range test indicated that the mean scores for Masters (M = 3.51, SD = 0.92) and First degree (M = 3.82, SD = 0.62) were statistically different from PhD (M = 4.40, SD = 0.46). Those with Diploma education (M = 3.89, SD = 0.47) did not differ statistically from the PhD group.
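The reported effect size can be checked from the F statistic and its degrees of freedom, since for a one-way ANOVA eta squared equals the between-groups sum of squares divided by the total sum of squares. The short Python sketch below (the function name `eta_squared` is hypothetical) uses only the values quoted in the write-up above.

```python
def eta_squared(f_value, df_between, df_within):
    """Eta squared for a one-way ANOVA, recovered from F and its degrees of freedom.

    SS_between / SS_total = (F * df_between) / (F * df_between + df_within)
    """
    return (f_value * df_between) / (f_value * df_between + df_within)

# F(3, 188) = 2.728, as reported in the write-up.
eta2 = eta_squared(2.728, 3, 188)
print(round(eta2, 2))
```

Rounded to two decimals this gives 0.04, in line with the reported value.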
  • 83. Test of Relationship Correlation • Correlation is used to denote the association between two quantitative variables, assuming that the association is linear. It provides information about the strength and direction of the relationship. • Strength: 0.10-0.29 (small), 0.30-0.49 (medium), 0.50-1.0 (large) Hands-on Exercise • Go to ‘Analyze’, click ‘Correlate’ and ‘Bivariate’. • Select ‘Pearson’ (for continuous data) and ‘One-tailed’. • If the Sig. value is less than 0.05, the two variables are significantly correlated. Only then are the strength and direction of the relationship examined.
  • 84. Test of Relationship How to present the findings. • There was a strong positive correlation between intention to share and actual sharing [r=0.76, n=192, p<0.01] with high levels of intention associated with high levels of actual sharing.
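For readers who want to see what the Pearson coefficient measures, here is a minimal Python sketch that computes r from paired scores and labels its strength using the small/medium/large bands given earlier. The function names and the sample scores are hypothetical, and the label "negligible" for values below 0.10 is an assumption added for completeness.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient for two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def strength_label(r):
    """Label the size of r; the sign only indicates direction."""
    size = abs(r)
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "small"
    return "negligible"  # assumed label, not from the slides

# Hypothetical paired scores (not workshop data).
intention = [2, 3, 3, 4, 5, 5]
sharing = [1, 3, 2, 4, 4, 5]
r = pearson_r(intention, sharing)
print(f"r = {r:.2f} ({strength_label(r)})")
```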
  • 86. Test of Relationship Regressions • Simple linear regression is used when we would like to see the impact of a single independent variable on a dependent variable. • Multiple linear regression is used when we would like to see the impact of more than one independent variable on a dependent variable. • Multiple regression analysis is a statistical technique that can be used to analyze the relationship between a single dependent variable (continuous) and several independent variables (continuous or even nominal). In the case of nominal independent variables, dummy variables are introduced.
  • 87. Test of Relationship • In standard multiple regression, all of the independent variables are entered into the regression equation at the same time. • Multiple R and R² measure the strength of the relationship between the set of independent variables and the dependent variable. An F test is used to determine whether the relationship can be generalized to the population represented by the sample. • A t-test is used to evaluate the individual relationship between each independent variable and the dependent variable.
  • 88. Test of Relationship Things to consider: • Strong theory (conceptual or theoretical) • Measurement error: the degree to which the variable is an accurate and consistent measure of the concept being studied. If the error is high, then even the best predictors may not achieve sufficient predictive accuracy. • Specification error: inclusion of irrelevant variables or omission of relevant variables from the set of independent variables.
  • 89. Test of Relationship Assumptions • Normality: The first basic assumption is normality, which can be assessed by plotting a histogram. If the histogram shows little deviation from the normal curve, we can assume the data follow a normal distribution.
  • 90. Test of Relationship • Normality of the error terms The second assumption is that the error term must be normally distributed. This can be assessed by looking at the normal P-P plot. The idea is that the points should be as close as possible to the diagonal line. If they are then we can assume that the error terms are normally distributed.
  • 91. Test of Relationship • Linearity The third assumption is the relationship between the independent variables and the dependent variable must be linear. This is assessed by looking at the partial plots. The idea is to see if we can draw a straight line on the scatter plot that is generated.
  • 92. Test of Relationship • Constant Variance (Homoscedasticity): The fourth assumption is that the variance of the residuals must be constant (homoscedasticity) as opposed to not constant (heteroscedasticity). Heteroscedasticity is generally indicated when a consistent pattern appears in the plot of the studentized residual (SRESID) against the standardized predicted value of Y (ZPRED).
  • 93. Test of Relationship • Multicollinearity: The fifth assumption concerns collinearity. A problem arises when the independent variables are highly correlated with one another, generally at r > 0.8 to 0.9, which is termed multicollinearity. To assess this assumption we look at two indicators. The first is the VIF and tolerance. A low tolerance value of < 0.1 results in a VIF value of > 10, as VIF is 1/Tolerance. If the VIF is more than 10, we can suspect a problem of multicollinearity.
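The link between tolerance and VIF can be illustrated with a few lines of Python. This is a sketch under a stated assumption: tolerance for a predictor is taken as 1 minus the R² obtained when that predictor is regressed on the other predictors, so with only two predictors that R² is simply their squared correlation. The function name and the correlation value are hypothetical.

```python
def tolerance_and_vif(r_squared_with_other_predictors):
    """Tolerance = 1 - R^2 of a predictor regressed on the others; VIF = 1 / tolerance."""
    tolerance = 1 - r_squared_with_other_predictors
    return tolerance, 1 / tolerance

# Two hypothetical predictors correlated at r = 0.95 (illustrative value only).
r_between_predictors = 0.95
tolerance, vif = tolerance_and_vif(r_between_predictors ** 2)
print(f"tolerance = {tolerance:.3f}, VIF = {vif:.2f}")
```

Here the tolerance falls just below 0.1 and the VIF just above 10, so both cut-offs flag the same problem.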
  • 94. Test of Relationship • Multicollinearity: The second value to look at is the condition index. If this value exceeds 30, we can also suspect the presence of multicollinearity. When the value is more than 30, we should also look across the variance proportions and see whether two or more variables have a value of 0.9 or above, excluding the constant. Only if there are two or more can we conclude that there is multicollinearity.
  • 95. Test of Relationship • Independence of the error term (Autocorrelation): This assumption is mainly a concern with time series data rather than cross-sectional data. We assume that each predicted value is independent, which means that the predicted value is not related to any other prediction; that is, they are not sequenced by any variable such as time. This can be assessed by looking at the Durbin-Watson value. If the D-W value is between 1.5 and 2.5, we can assume there is no problem.
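The Durbin-Watson value itself is simple to compute from the residuals taken in their original order: it is the sum of squared differences between successive residuals divided by the sum of squared residuals. The Python sketch below is illustrative only; the function names and the residuals are hypothetical, and the 1.5 to 2.5 band is the rule of thumb stated above.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals listed in observation order."""
    numerator = sum(
        (current - previous) ** 2
        for previous, current in zip(residuals, residuals[1:])
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

def within_rule_of_thumb(dw, low=1.5, high=2.5):
    """True when the value falls inside the band used on the slide."""
    return low <= dw <= high

# Hypothetical residuals (not workshop data).
residuals = [0.5, -0.3, 0.2, -0.6, 0.4, -0.1, 0.3, -0.4]
dw = durbin_watson(residuals)
print(f"D-W = {dw:.2f}, within 1.5-2.5: {within_rule_of_thumb(dw)}")
```

Values near 2 indicate little autocorrelation; values near 0 or 4 indicate strong positive or negative autocorrelation respectively.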
  • 96. Test of Relationship • Outliers: These are values so extreme that they can unduly influence the results of the regression. Usually the threshold is set at ± 3 standard deviations. Although this is the default, some researchers may set a threshold of ± 2.5 to get better predictive power. Outliers can be easily identified by checking whether the output contains a Casewise Diagnostics table.
  • 98. Test of Relationship Hands-on Exercise • Go to ‘Analyze’, click ‘Regression’ and ‘Linear’.
    Descriptive Statistics
    Variable                                  Mean    Std. Deviation    N
    HOW OFTEN R ATTENDS RELIGIOUS SERVICES    3.15    2.653             113
    STRENGTH OF AFFILIATION                   2.12    1.084             113
    HOW OFTEN DOES R PRAY                     2.90    1.575             113
    The minimum ratio of valid cases to independent variables for multiple regression is 5 to 1. With 113 valid cases and 2 independent variables, the ratio for this analysis is 56.5 to 1, which satisfies the minimum requirement. Different authors tend to give different guidelines concerning the number of cases required for multiple regression. Stevens (1996, p. 72) recommends that ‘for social science research, about 15 participants per predictor are needed for a reliable equation’.
  • 99. Test of Relationship
    ANOVA (Model 1)
    Source        Sum of Squares    df     Mean Square    F         Sig.
    Regression    374.757           2      187.379        49.824    .000
    Residual      413.685           110    3.761
    Total         788.442           112
    Predictors: (Constant), HOW OFTEN DOES R PRAY, STRENGTH OF AFFILIATION. Dependent Variable: HOW OFTEN R ATTENDS RELIGIOUS SERVICES.
    The probability of the F statistic (49.824) for the overall regression relationship is <0.001, less than or equal to the level of significance of 0.05. We reject the null hypothesis that there is no relationship between the set of independent variables and the dependent variable (R² = 0). We support the research hypothesis that there is a statistically significant relationship between the set of independent variables and the dependent variable.
  • 100. Test of Relationship
    Model Summary
    Model    R       R Square    Adjusted R Square    Std. Error of the Estimate
    1        .689    .475        .466                 1.939
    Predictors: (Constant), HOW OFTEN DOES R PRAY, STRENGTH OF AFFILIATION.
    Look in the Model Summary box and check the value given under the heading R Square. This tells you how much of the variance in the dependent variable is explained by the model. The rule of thumb: a correlation less than or equal to 0.20 is characterized as very weak; greater than 0.20 and less than or equal to 0.40 is weak; greater than 0.40 and less than or equal to 0.60 is moderate; greater than 0.60 and less than or equal to 0.80 is strong; and greater than 0.80 is very strong. You will notice an Adjusted R Square value in the output. When a small sample is involved, the R Square value in the sample tends to be a rather optimistic overestimation of the true value in the population (see Tabachnick & Fidell 2007). The Adjusted R Square statistic ‘corrects’ this value to provide a better estimate of the true population value.
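The Model Summary figures can be recovered from the sums of squares in the ANOVA table for this example, which makes the ‘correction’ behind Adjusted R Square concrete. The Python sketch below uses the sums of squares, the number of cases and the number of predictors from the output; the variable names are illustrative.

```python
import math

# Values taken from the ANOVA table for this example.
ss_regression = 374.757
ss_residual = 413.685
n_cases = 113
n_predictors = 2

ss_total = ss_regression + ss_residual
df_regression = n_predictors
df_residual = n_cases - n_predictors - 1

# R Square: share of the total variation explained by the model.
r_square = ss_regression / ss_total

# Adjusted R Square: penalises R Square for the number of predictors.
adjusted_r_square = 1 - (1 - r_square) * (n_cases - 1) / df_residual

# F: mean square for regression divided by mean square for residual.
f_statistic = (ss_regression / df_regression) / (ss_residual / df_residual)

# Std. Error of the Estimate: square root of the residual mean square.
std_error_estimate = math.sqrt(ss_residual / df_residual)

print(f"R Square = {r_square:.3f}")
print(f"Adjusted R Square = {adjusted_r_square:.3f}")
print(f"F = {f_statistic:.3f}")
print(f"Std. Error of the Estimate = {std_error_estimate:.3f}")
```

With 113 cases and only 2 predictors the adjustment is small; it grows as the sample shrinks or as predictors are added.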
  • 101. Test of Relationship
    Coefficients (Model 1)
    Variable                   B         Std. Error    Beta     t         Sig.
    (Constant)                 7.167     .442                   16.206    .000
    STRENGTH OF AFFILIATION    -1.138    .194          -.465    -5.857    .000
    HOW OFTEN DOES R PRAY      -.554     .134          -.329    -4.145    .000
    B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient. Dependent Variable: HOW OFTEN R ATTENDS RELIGIOUS SERVICES.
    For the independent variable strength of affiliation, the probability of the t statistic (-5.857) for the b coefficient is <0.001, which is less than or equal to the level of significance of 0.05. We reject the null hypothesis that the slope associated with strength of affiliation is equal to zero (b = 0) and conclude that there is a statistically significant relationship between strength of affiliation and frequency of attendance at religious services.
  • 102. Test of Relationship
    Coefficients (Model 1)
    Variable                   B         Std. Error    Beta     t         Sig.
    (Constant)                 7.167     .442                   16.206    .000
    STRENGTH OF AFFILIATION    -1.138    .194          -.465    -5.857    .000
    HOW OFTEN DOES R PRAY      -.554     .134          -.329    -4.145    .000
    B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient. Dependent Variable: HOW OFTEN R ATTENDS RELIGIOUS SERVICES.
    The beta coefficient associated with strength of affiliation is negative, indicating an inverse relationship in which higher numeric values for strength of affiliation are associated with lower numeric values for frequency of attendance at religious services. To compare the different variables, it is important to look at the standardised coefficients, not the unstandardised ones. ‘Standardised’ means that the values for each of the different variables have been converted to the same scale so that they can be compared. If you were interested in constructing a regression equation, you would use the unstandardised coefficient values listed as B.
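As a sketch of how the two kinds of coefficient are used, the Python below builds the regression equation from the unstandardised B values and then converts each B to its standardised Beta using the standard deviations from the Descriptive Statistics table (Beta = B × SD of predictor / SD of dependent variable). The variable and function names are hypothetical, and the predictor scores passed to the equation are made-up inputs.

```python
# Unstandardised coefficients (B) from the Coefficients table.
b_constant = 7.167
b = {"affiliation": -1.138, "prayer": -0.554}

# Standard deviations from the Descriptive Statistics table.
sd = {"attendance": 2.653, "affiliation": 1.084, "prayer": 1.575}

def predict_attendance(affiliation, prayer):
    """Regression equation built from the unstandardised B values."""
    return b_constant + b["affiliation"] * affiliation + b["prayer"] * prayer

# Standardised Beta = B * SD(predictor) / SD(dependent variable).
beta = {name: coef * sd[name] / sd["attendance"] for name, coef in b.items()}

# Hypothetical respondent: affiliation score 2, prayer score 3.
predicted = predict_attendance(2, 3)
print(f"predicted attendance = {predicted:.3f}")
for name, value in beta.items():
    print(f"Beta for {name} = {value:.3f}")
```

The converted values agree with the Beta column of the output to three decimals, and the larger absolute Beta for strength of affiliation marks it as the stronger of the two predictors.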
  • 103. Factor Analysis • The purpose is to define the underlying structure in a data matrix; analyze the structure of interrelationships among a large number of variables by defining a set of common underlying dimensions called factors. • Factor analysis in SPSS is exploratory in function. The analysis is driven by data, rather than theory. • Sample size: preferably >100 cases or the ratio of 20:1 (case/variable).
  • 104. Factor Analysis • Important decisions include: • Correlation matrix: KMO and Bartlett’s test, anti-image • Method of extracting factors: principal component • Latent root/eigenvalue criterion (>1) • A priori criterion on the number of factors to be extracted • Percentage of variance explained (>50%) • Rotation: Varimax, Promax • Loading significance (>0.3 if 350 cases, >0.5 if 120 cases)
  • 107. Factor Analysis Hands-on Exercise • Go to ‘Analyze’, click ‘Data Reduction’ and ‘Factor’.
  • 108. Using Syntax • Syntax is the command language of SPSS. • If you need to repeat your analysis, you can save the commands in a ‘Syntax’ file so that you can run the analysis at a later date or repeat various analyses. • Whenever you set up an analysis, you will notice a Paste button in the dialog box. When you click the Paste button, a syntax file opens containing the syntax for the analysis you intended to run. Hands-on Exercise • Go to ‘File’, click ‘New/Open’ and ‘Syntax’.
  • 111. Thank You Next workshop 22 May : Advanced PLS-SEM 23-24 May : Advanced SPSS, Process Join us at Sarawak Research Society on
  • 112. Thank You Hiram Ting, PhD Email: hiramparousia@gmail.com Ernest Cyril de Run, PhD Email: drernest@feb.unimas.my