SPSS & Quantitative Data
Analysis
Kulbir Singh Birak
∗ SPSS is a computer program for analysing
quantitative data.
∗ This can range from basic descriptive statistics such as
the mean, mode, median and range to powerful tests
of significance (So whether we accept or reject a
hypothesis).
∗ What the data looks like, and what that means if
anything.
What is SPSS?
∗ You can access SPSS on the vast majority of PCs at UCS: in these labs, on the Waterfront PCs and on the library PCs.
∗ Additionally, you can borrow a copy of SPSS to install on your own home PC or laptop. There are 16 copies in the library; you just need to borrow the disc and enter the licence code that comes with it. (Licences do expire; when yours does, you can borrow a new version of SPSS or obtain a new licence code.)
∗ Overnight loan only, or you can bring your laptop in and install it there and then.
∗ Windows version only; no Apple version.
SPSS Access
Overview
∗ Why do numbers matter in
research design?
∗ Numbers allow you to do
two basic things:
- Count how often
“something”
happens
- Count how big an
issue “something” is
Overview
∗ Once you can count the
extent (how often) and
nature (for quantitative
research a numerical
descriptor of an
attribute) you can
already do some pretty
important things. You
can answer questions
such as:
∗ How common is an issue?
∗ For instance, are black
children over-represented in
care? Are black adults over-
represented in psychiatric
hospital?
∗ How serious is a particular
issue? Or how is it distributed
within a sample?
∗ For instance, how serious are
the concerns about children in
families allocated a social
worker?
Overview
Once you can count stuff you can start to answer other
important and interesting questions, for instance:
∗ Students may often come to you with various questions about
SPSS and difficulties that they are having
∗ If you are lucky enough to catch them early on a lot of
unnecessary frustration and stress about analysing data can be
avoided.
∗ The most important thing a student can do before they even
consider methodology, methods or analysis is to have a clear
research question/aim and hypotheses in place that
conceptualise and operationalise the variables they wish to
study.
SPSS and Quantitative Data
Some Basic Definitions
A variable is the “thing” that you’re
interested in studying e.g. depression,
gender differences, social deprivation,
specific crime rates, levels of
emotionality (how emotional someone
is) or different types of food!
• Things like depression, gender differences,
social deprivation, specific crime rates, levels of
emotionality and food type, etc. are called
“variables” because they vary.
● Some people are more depressed than others
● Some people are men, and others are women
● Some Social policies may be more successful than
others
● We may see different crimes committed in different
contexts, areas
● Some people are less emotional than others
● Food types can range from pizza to hamburgers to
filet mignon, or might be Thai, Ethiopian, Polish or
American cuisine, etc., etc.
TO “CONCEPTUALISE” A VARIABLE MEANS
TO MAKE CLEAR WHAT YOU MEAN BY THE
VARIABLE….
• For example, for the variable “food type,”
you need to be clear about whether you
mean
• (1) vegetarian or meat, OR
• (2) breakfast, lunch or dinner foods, OR
• (3) Ethiopian, Thai or American foods, OR
• (4) something else!
TO “OPERATIONALISE” A VARIABLE IS
TO DECIDE HOW YOU WILL MEASURE
IT
• For example, if the variable you’re
interested in is depression:
● Will you ask people to rate themselves, and if
so, on what sort of a scale?
● Alternatively, will you measure depression by
facial expression? By some behaviour that you
observe? In some other way?
TO “OPERATIONALISE” A VARIABLE IS
TO DECIDE HOW YOU WILL MEASURE
IT
• If the variable you’re studying is intelligence & you
don’t think Exam scores are a good measure of
intelligence, what measure WILL you use?
• Asking these sorts of questions is completing the
process of “operationalising” your variables.
• Conceptualisation & Operationalisation are
necessary for a Quantitative approach
Exploratory
Exploratory research is undertaken when few or no previous studies exist. The aim is to look for patterns, hypotheses or ideas that can be tested and will form the basis for further research. Typical research techniques would include case studies, observation and reviews of previous related studies and data. Data from exploratory studies tends to be qualitative.
Descriptive
Expands on the exploratory. Descriptive research can be used to identify and classify the elements or characteristics of the subject, e.g. number of days youth offenders remained out of trouble. Quantitative techniques are most often used to collect, analyse and summarise data.
Causal/Relationship
Causal and Relationship research focuses on being able to predict/hypothesise cause and effect between observed behaviours, or relationships between aspects of behaviour/society/crime rates. The idea is that Causal and Relationship research is moving a step beyond descriptive research, and the quantitative data collected can be used and analysed in a manner that allows the researcher to infer a significant effect/difference or relationship.
TYPES OF QUANTITATIVE RESEARCH
Aims and Objectives
• The Quantitative approach sets out at the start of a study
with a research question and a hypothesis/prediction
• Hypotheses are formal statements of predictions derived
from evidence from earlier research and/or theory.
• The null hypothesis (H0) is a statement of ‘no
difference/effect/change’ between the variables
• The experimental hypothesis (H1) is a statement of
difference/relationships between variables
QUANTITATIVE DESIGNS AND
HYPOTHESES
• Experimental Hypothesis: Students who
study for tests in study groups will score
significantly better on their exams than
students who did not study in study groups
• Null Hypothesis: There will be no significant
difference in exam results between students
who do and do not study in study groups
EXAMPLE OF HYPOTHESIS
∗ This clarity in the question and hypothesis can make life
markedly easier for yourselves and the student in the long run.
∗ However, I appreciate that getting to this point is not always easy and, more often than not, will not be the case for you.
∗ So what I will be covering with you today is a brief introduction
to the SPSS interface and as to how we would go about doing
the initial basics of data entry and beginning to explore
descriptive data.
∗ If we can, I’ll also take you through examples of some basic significance testing (otherwise I’ll put them up so they are available)
∗ A light but important session.
∗ Going over the basics of how to input data, label your variables
so it is clear and how to create codebooks.
∗ It’s all about building up your confidence with the interface, and developing good practice.
∗ It’s about doing the basics so as to avoid confusion later on,
e.g. inputting the data correctly for different types of analysis.
Data Entry and Descriptives
∗ Hopefully you are familiar with the idea of descriptive data.
∗ As the name suggests they are what we use to describe
the data we have.
∗ There’s no point in knowing that the IQ scores between
two groups are significantly different if we don’t have a
way of describing the scores, and the difference.
∗ Measures of central tendency: Mean, mode, median etc.
∗ Measures of dispersion: Standard deviation etc.
Descriptive Stats
Levels of Measurement
∗ In 1946 Stevens proposed a theory of scales of
measurement.
∗ Nominal data (lowest level of measurement; categories only)
∗ Ordinal data (points are ranked, but the distances between them are unknown)
∗ Interval data (points on scale equal distance apart, no true zero)
∗ Ratio data (equal distance between points on scale, plus a true zero)
Nominal
∗ Provides the least exact information
∗ Participants are placed in categories
∗ Data that is categorical e.g. gender, colours, shoe type, play behaviour
∗ Variable must fit into one category
∗ Measure of frequency
∗ Numbers may be used but only as category labels
∗ Central tendency is described using the mode
∗ Data is represented using a frequency table or bar
chart
Examples: Nominal Data
∗ Type of Bicycle
∗ Mountain bike, road bike, chopper, folding, BMX.
∗ Ethnicity
∗ White British, Afro-Caribbean, Asian, Chinese,
other, etc. (note problems with these categories).
∗ Smoking status
∗ smoker, non-smoker
Ordinal
∗ Simplest true scale, orders measurements along a continuum
∗ Represent rank position in a group e.g. 1st, 2nd, 3rd … 10th
∗ No information on difference between positions
∗ Central tendency is described in terms of the median
∗ Dispersion can be measured using the range or inter-quartile
range (middle 50% of the distribution)
Ordinal Data
∗ A type of categorical data in which order is
important.
∗ Class of degree- 1st class, 2:1, 2:2, 3rd class, fail
∗ Degree of illness- none, mild, moderate,
acute, chronic.
∗ Opinion of students about stats classes-
Very unhappy, unhappy, neutral, happy, ecstatic!
Interval and ratio variables
∗ According to Fielding & Gilbert (2000) these are often used
interchangeably, and incorrectly by social scientists (pg15)
∗ Interval, ordered categories, no inherent concept of zero (Clark 2004),
we can calculate meaningful distance between categories, few real
examples of interval variables in social sciences (Fielding & Gilbert
2000:15)
∗ Ratio. A meaningful zero amount (e.g. income), possible to calculate
ratios so also has the interval property (e.g. someone earning £20,000
earns twice as much as someone who earns £10,000) (Fielding & Gilbert
2000:15)
∗ Difference between interval and ratio usually not important for statistical
analysis (Fielding & Gilbert 2000:15)
Interval variables- Examples
∗ Fahrenheit temperature scale- Zero is
arbitrary- 40 Degrees is not twice as hot as 20
degrees.
∗ IQ tests. No such thing as Zero IQ. 120 IQ not
twice as intelligent as 60.
∗ Question- Can we assume that attitudinal data
represents real, quantifiable measured
categories? (ie. That ‘very happy’ is twice as
happy as plain ‘happy’ or that ‘Very unhappy’
means no happiness at all). Statisticians not in
agreement on this.
Ratio variables-Examples
∗ Can be discrete or continuous data.
∗ The distance between any two adjacent units of
measurement (intervals) is the same and there is a
meaningful zero point (Papadopoulos, 2001)
∗ Income- someone earning £20,000 earns twice as much
as someone who earns £10,000.
∗ Height
∗ Unemployment rate- measured as the number of
jobseekers as a percentage of the labour force
(Papadopoulos, 2001).
∗ If you are still a little worried about your
understanding of Quantitative Data please see the
Key Information Handout in the Folder.
∗ By David Bowers (Learning Development)
∗ A reasonable summary of information about
quantitative data.
∗ Data types, appropriate measures of central tendency
etc.
Key Information Handout
∗ Everything we do today is about good practice.
∗ Following the steps today, and developing correct
inputting skills, will save you lots of problems and
heartache later.
∗ SPSS is fussy when it comes to the way data is
entered.
Importance of Good Practice
∗ As SPSS is a Quantitative Data analysis software you
often have to reduce information down to a
numerical state
∗ A Codebook allows you to keep a record of these
reductions and decisions
∗ A record of your own. Separate from SPSS.
∗ Electronic or on paper.
∗ A list of variables, full names, and how you have
coded data.
Codebook
∗ The codes you give data to allow SPSS to analyse it.
∗ You can’t enter text so some variables need to be
converted.
E.g.
∗ Gender: Female may become 1, Male may become 2.
∗ Relationship Status: Single may become 1, Married 2,
Divorced 3, Widowed 4…
Coding
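The coding idea above can be sketched in a few lines of Python. This is only an illustration of what a codebook records, not something SPSS itself requires: the dictionary name CODEBOOK, the variable name relstatus and the helper functions are all hypothetical, while the codes themselves are the ones suggested on the slide.

```python
# Hypothetical codebook: variable name -> {text response: numeric code}.
# The codes follow the slide's examples; the names are illustrative only.
CODEBOOK = {
    "gender": {"Female": 1, "Male": 2},
    "relstatus": {"Single": 1, "Married": 2, "Divorced": 3, "Widowed": 4},
}


def encode(variable, response):
    """Return the numeric code the codebook assigns to a text response."""
    return CODEBOOK[variable][response]


def decode(variable, code):
    """Look a numeric code back up to recover its text label."""
    for label, value in CODEBOOK[variable].items():
        if value == code:
            return label
    raise KeyError(f"{code!r} is not a code for {variable!r}")


# Two questionnaire answers reduced to the numbers that would be typed in.
raw_answers = [("gender", "Female"), ("relstatus", "Divorced")]
coded = [encode(variable, answer) for variable, answer in raw_answers]
print(coded)
```

Keeping the mapping in one place is the point of a codebook: anyone reading the numbers later can recover what each code meant.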
∗ SPSS is fussy when it comes to the names you give variables.
∗ Can’t give them a full description in the main view.
∗ So you can give detailed labels in the special variable view.
∗ Along with a codebook it helps keep the information clear.
Labelling
∗ Available on email that was circulated to you all
∗ File: Data Entry Exercise 1 - Optimism Data
∗ We’ll be creating a codebook, setting up SPSS
according to the codebook, and then entering the
data.
1st Exercise
∗ Good habits
∗ Create a new Folder on your Desktop
∗ Right-click on Desktop> New > Folder > “SPSS”
New Data Folder
∗ Start>All Programs>IBM SPSS Statistics 19.
∗ Depending on version may have a slightly different
name.
∗ GIVE IT TIME SPSS IS RENOWNED FOR TAKING AN
AGE TO OPEN UP – CLICKING AGAIN ONLY SLOWS IT
DOWN MORE AS IT’LL THEN TRY TO OPEN ANOTHER
SPSS WINDOW
Open SPSS
∗ Optimism Scale data from 4 participants
∗ Firstly, we are going to prepare a codebook
Coding Data
Optimism Hand-out
∗ Rules for naming of variables
∗ Variable names:
∗ must be unique (i.e. each variable in a data set must
have a different name);
∗ must begin with a letter (not a number);
∗ cannot include full stops, spaces or other characters (!, ?
* ");
∗ cannot include words used as commands by SPSS (all,
ne, eq, to, le, lt, by, or, gt, and, not, ge, with)
Coding Data
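As a rough check of the naming rules above, the sketch below tests a proposed variable name before it is typed into SPSS. It is a simplified illustration of the rules as listed on the slide, not SPSS's own validation: the function name name_problems is hypothetical, and treating the underscore as an allowed character is an assumption.

```python
import re

# Reserved words listed on the slide.
RESERVED = {"all", "ne", "eq", "to", "le", "lt", "by", "or",
            "gt", "and", "not", "ge", "with"}


def name_problems(name, existing=()):
    """Return the slide's naming rules that `name` breaks (empty list = fine)."""
    problems = []
    if name.lower() in {other.lower() for other in existing}:
        problems.append("not unique")
    if not re.match(r"[A-Za-z]", name):
        problems.append("does not begin with a letter")
    # Assumption: letters, digits and underscores are the allowed characters.
    if re.search(r"[^A-Za-z0-9_]", name):
        problems.append("contains a space, full stop or other symbol")
    if name.lower() in RESERVED:
        problems.append("is a reserved word")
    return problems


for candidate in ["op1", "1op", "op 1", "with"]:
    print(candidate, name_problems(candidate))
```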
Optimism scale items op1 to 4 Enter number circled 1 (strongly disagree)
to 5 (strongly agree)
Coding Data
∗ Now we have a codebook to keep things clear we can
set up SPSS so it is ready for the data.
∗ SPSS has 3 views: Data, Variable and Output.
∗ By switching to Variable we can define the variables
we need.
Creating a data file and inputting
data
Defining Variables
Variable View
Naming Variables
Decimals
Labels
Values
Enter the relevant
value and label as per
your codebook, then
click add. When all
have been entered,
click OK
Define the meaning
of the values used in
the codebook
(Gender) and click
add for each.
Values
Values
When entering Likert data always use the limits of the scale (1-5) even if you know that participants may not have entered some responses. You also need to decide whether you are going to enter just the range or every labelled point.
Values
∗ Data comes in different types.
∗ Categorical (Nominal in SPSS)
∗ Ordinal
∗ Scale/Interval (Scale in SPSS)
∗ Different types/measures suit different tests,
different measures of central tendency, different
forms of visualisation.
∗ Makes knowing what type of data you have KEY for
successful data analysis.
Measures
Measures
Scale refers to interval/ratio level of measurement. There is some debate about data type in relation to Likert data … for our purposes, leave this as Scale.
Nominal refers to categorical
Measures
∗ Now you have the variables set up ready for the data you can
start to enter the actual data
∗ Go to the Data View
Inputting Data According to the
Codebook
Saving the File
∗ You’ve saved the data so now it is ‘safe’
∗ You can have a play around with it and try a few
different things.
∗ Delete a case
∗ Insert a case between existing cases
∗ Delete a variable
∗ Insert a variable between existing variables
∗ Try during the workshop/at home so you get more
confident with SPSS.
Playing around with the data
∗ Available on LearnUCS.
∗ Different experimental designs require a different style of
inputting.
∗ The structure you use will be different between Repeated
(Within-Group) and Independent (Between-Group)
experimental designs.
∗ Use the wrong structure and the analysis will fall down. It
will be meaningless at best.
2nd Exercise: Inputting Repeated and Independent Measures
∗ So, to recap
∗ Repeated Measures. The same participants experience all
treatments/are in all the groups/conditions.
∗ If you wanted to investigate the effect of music on taking
an IQ test participants would experience the no music
condition, and the music condition.
∗ Hopefully with some counterbalancing.
Repeated Measures
∗ Again to recap.
∗ Participants are split. One group will experience one
treatment/be in one group/condition.
∗ Another group will experience the other.
∗ Each condition will have a unique, non-shared, set of
participants.
Independent
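The difference between the two data structures can be sketched as plain Python rows. This assumes the usual convention (one row per participant in both cases); the IQ scores, column names and condition codes are invented purely to show the shape of the data.

```python
# Invented IQ scores, purely to show the two layouts.

# Repeated measures: one row per participant, one column per condition,
# because every participant sits the test with and without music.
repeated = [
    {"id": 1, "iq_no_music": 102, "iq_music": 98},
    {"id": 2, "iq_no_music": 110, "iq_music": 113},
]

# Independent measures: one row per participant, a coded grouping
# variable (1 = no music, 2 = music) and a single score column.
independent = [
    {"id": 1, "condition": 1, "iq": 102},
    {"id": 2, "condition": 1, "iq": 110},
    {"id": 3, "condition": 2, "iq": 98},
    {"id": 4, "condition": 2, "iq": 113},
]

# Every participant in the repeated layout has a score in each condition.
all_have_both = all("iq_no_music" in row and "iq_music" in row for row in repeated)

# Each participant in the independent layout appears in exactly one condition.
ids = [row["id"] for row in independent]
each_appears_once = len(ids) == len(set(ids))

print(all_have_both, each_appears_once)
```

The same four scores take two rows in one design and four in the other, which is why entering data in the wrong structure makes the later analysis meaningless.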
∗ A quick trick to show you.
∗ Good for those who aren’t fond of a screen full of
numbers.
∗ If you have coded your variables correctly there is
a button you can press that will make the
numbers in your data view appear as the names
coded.
∗ For example the 1s and 2s for gender could appear as Female and Male.
Labelling Trick
∗ Data Entry Exercise 1 – Optimism Data Input
∗ Data Entry Exercise 2 – Repeated and Independent
∗ Extra Data Entry Exercises
∗ Exercise 3 – Giving electric shocks
∗ Exercise 4 – Shooting people
∗ We’ve gone through 1 and 2 here. Try them on your
own.
∗ 3 and 4 for extra practice.
∗ Make sure you are comfortable with data input,
coding and labelling.
Exercises
∗ The theory and step-by-step guide will be covered in the slides following immediately
below.
∗ If you complete the first exercise move onto exercise 2.
Descriptive Exercise 1: survey.sav
The data is from a survey of staff about stress and emotions.
Generate the frequencies for 1) marital status and 2) level of education
Descriptive Exercise 2: staffsurvey.sav
The data is from a staff survey with Likert scales for agreement and importance of factors.
Generate appropriate descriptive statistics to answer the following questions:
(a) What percentage of the staff in this organisation are permanent employees? (Use the
variable employstatus.)
(b) What is the average length of service for staff in the organisation? (Use the variable service.)
(c) What percentage of respondents would recommend the organisation to others as a good
place to work? (Use the variable recommend.)
Lab Exercises
∗ When you are trying to find your descriptive stats you
need to make sure you use the right ones.
∗ Certain types of data/measure, suit certain types of
measures of central tendency and dispersion.
∗ Use the wrong ones and your description of the
results will be confusing, wrong and won’t match
your inferential statistics.
Types of Variables & Descriptives
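One way to keep these pairings straight is a small lookup table. The sketch below simply restates the pairings given in these slides; the names RECOMMENDED and recommended_descriptives are hypothetical, and "scale" is used in the SPSS sense of interval/ratio data.

```python
# Pairings as given in these slides ("scale" = interval/ratio in SPSS).
RECOMMENDED = {
    "nominal": {
        "central_tendency": "mode",
        "dispersion": None,  # report frequencies/percentages instead
    },
    "ordinal": {
        "central_tendency": "median",
        "dispersion": "range or interquartile range",
    },
    "scale": {
        "central_tendency": "mean",
        "dispersion": "standard deviation",
    },
}


def recommended_descriptives(measure):
    """Return the descriptive statistics suggested for a level of measurement."""
    return RECOMMENDED[measure.lower()]


print(recommended_descriptives("Ordinal"))
```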
∗ Also known as Nominal variables in SPSS.
∗ Data that has been classified and categorised.
∗ So gender, a participant will belong to a particular
category of gender.
∗ Marital Status.
∗ Anything that you can create a discrete classification of. You can even take a scale variable like age, and force it into categories (under 18, 18 – 25, 26 – 35 etc.), making sure the categories do not overlap.
Categorical Variables
∗ Measure of Central tendency to use for Categorical
data is the mode.
∗ Frequency of occurrence or amount.
∗ So using gender as an example you would use the
mode.
∗ 2 of the sample might be male, and 8 female.
∗ Mode = Female.
∗ 20% male, 80% female
Categorical
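The gender example above can be reproduced with a few lines of Python. This is a sketch of the arithmetic only, using the slide's sample of 2 males and 8 females; it is not how SPSS computes its Frequencies table.

```python
from collections import Counter

# The slide's sample: 2 males and 8 females.
genders = ["Male"] * 2 + ["Female"] * 8

counts = Counter(genders)

# The mode is the category with the highest frequency.
mode, mode_count = counts.most_common(1)[0]

# Percentages are each frequency divided by the sample size.
percentages = {label: 100 * n / len(genders) for label, n in counts.items()}

print(mode, percentages)
```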
∗ In SPSS you should use the Frequency option when
you want the descriptive stats for a categorical
variable.
∗ Go to Descriptive Exercise 1 on LearnUCS.
Categorical and Frequency
∗ Save survey.sav to your SPSS folder on the Desktop
from LearnUCS
∗ Have a look at survey.sav questionnaire from
LearnUCS
∗ Open survey.sav dataset
Descriptive Exercise 1 - Survey
Survey Questionnaire
Frequencies
Frequency Output
∗ This is where graphs and the results from tests
(descriptive and inferential) will appear.
∗ Also notes about when you have saved and opened
files too.
∗ If you want to keep what is in the output you must
save it specifically.
∗ Saving the data/variable will not save what is in the
output, and vice versa.
Output pages
∗ Aside from Categorical measures we also have
∗ Ordinal
∗ Scale/Interval (sometimes known as ratio too)
∗ These are also generally known as continuous variables.
∗ Usually the mean or median is the measure of central tendency used, and the standard deviation, or standard error, the measure of dispersion.
Other measures
∗ Ranked or ordered data. Sometimes Likert scales.
∗ Has some similarity to categorical data (You might
consider grade brackets to be categories; A, B, C, D,
etc).
∗ But importantly they are ranked, so there is meaning
to the position. A is better than B, B better than C and
so on.
∗ The median is used here.
∗ Central point with an equal amount above/below.
Ordinal
∗ The median is used here.
∗ Central point with an equal amount above/below
∗ So if you had a collection of grades…
∗ 20 people had an A
∗ 10 had a B
∗ 10 had a C
∗ 10 had a D
∗ Then B would be the median grade, as 20 people had
higher, and 20 people had lower.
Ordinal
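The grade example can be checked by lining everyone up in rank order and reading off the middle of the line. The sketch below does exactly that with the slide's counts; the variable names are illustrative.

```python
# Grade counts from the slide, listed from best to worst.
grade_counts = [("A", 20), ("B", 10), ("C", 10), ("D", 10)]

# Line up all 50 people in rank order: 20 As, then 10 Bs, and so on.
ranked = [grade for grade, count in grade_counts for _ in range(count)]
n = len(ranked)

# With an even number of people the median sits between the two middle
# positions (the 25th and 26th here, i.e. indexes 24 and 25).
lower_middle = ranked[(n - 1) // 2]
upper_middle = ranked[n // 2]

print(n, lower_middle, upper_middle)
```

Both middle positions fall inside the B group, so B is the median grade.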
∗ Imagine we wished to find the median for the highest
educational level attained by a population
∗ In descriptive exercise 1 (survey) we would click on ‘Analyze’
Using Explore to See the Median
Select ‘Descriptive Statistics’ and then ‘Explore’ from the drop-down menus
Using Explore to See the Median
1. When the dialog box opens, move ‘highest educ completed’ from the left pane to the ‘Dependent List’ section
2. Click on ‘Statistics’ and choose ‘Outliers’ and ‘Continue’
3. Click on ‘Plots’ and choose ‘Histograms’ and ‘Normality Plots with tests’ and ‘Continue’
4. Click on ‘OK’
Using Explore to See the Median
The resulting ‘Output’ in the Output window will show you a number of descriptive stats.
We can see the median is 4 for the ‘highest educ completed’, which means ‘some additional training’ is the median for the highest education completed for the 439 participants who took part in the survey.
∗ Interval – a scale with artificial limits, no true zero, and
usually some form of cap.
∗ Intervals are of equal size.
∗ IQ scores for example.
∗ Ratio – has a true zero, constant intervals and potentially
little or no cap.
∗ So timing scores on a task for example.
∗ SPSS doesn’t really differentiate between the two.
∗ Basically if it is a form of score it is likely to be scale.
Scale
∗ The mean is the normal measure of central tendency,
and the measure of dispersion the standard deviation.
∗ So 5 people take a maths test.
∗ They score 10, 20, 18, 12 and 5.
∗ The average would be 13 (total/number of cases)
Scale
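The maths-test example can be worked through with Python's standard library. This is a sketch of the arithmetic, assuming the sample standard deviation (n − 1 in the denominator) is the one wanted; the variable names are illustrative.

```python
import statistics

# The five maths-test scores from the slide.
scores = [10, 20, 18, 12, 5]

# Mean: total (65) divided by the number of cases (5).
mean = statistics.mean(scores)

# Sample standard deviation: the squared deviations from the mean
# (9 + 49 + 25 + 1 + 64 = 148) are divided by n - 1 = 4, giving 37,
# and the square root of that is taken.
sd = statistics.stdev(scores)

print(mean, round(sd, 2))
```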
∗ In SPSS we just need the descriptive option, rather
than the frequency option.
∗ So for example if we wished to find the mean and
standard deviation for ‘age’, ‘total optimism’, ‘total
mastery’, ‘total perceived stress’ and ‘total perceived
control of internal states’ (PCOISS), for participants
who answered the survey we are using for exercise 1.
Scale Descriptives
Descriptives
Descriptives Output
∗ Sometimes information will be left out of a questionnaire,
or the value lost, but you will still need to conduct an
analysis.
∗ What happens if someone doesn’t fill in the age box on a
questionnaire?
∗ Rather than get rid of all their data you can use the
‘Exclude cases pairwise’ option.
∗ It excludes the case (person) only if they are missing the
data required for the specific analysis. They will still be
included in any of the analyses for which they have the
necessary information.
Missing Data
∗ Exclude cases listwise
∗ A more extreme option.
∗ If the participant is missing any data then this option
should remove them entirely from the analysis.
∗ A matter of judgement as to which to use.
Missing Data
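The difference between the two options can be sketched with a tiny invented data set in which one participant left the age box blank. The function names and the data are hypothetical; the sketch mirrors the idea of the two options rather than SPSS's internal handling.

```python
# Invented responses; None marks a missing answer.
cases = [
    {"id": 1, "age": 23, "optimism": 18},
    {"id": 2, "age": None, "optimism": 22},
    {"id": 3, "age": 31, "optimism": 15},
]


def mean_pairwise(cases, variable):
    """Use every case that has a value for this particular variable."""
    values = [case[variable] for case in cases if case[variable] is not None]
    return sum(values) / len(values)


def mean_listwise(cases, variable, all_variables):
    """Use only cases that have values for every variable in the analysis."""
    complete = [
        case for case in cases
        if all(case[name] is not None for name in all_variables)
    ]
    return sum(case[variable] for case in complete) / len(complete)


variables = ["age", "optimism"]
pairwise_optimism = mean_pairwise(cases, "optimism")
listwise_optimism = mean_listwise(cases, "optimism", variables)
print(pairwise_optimism, listwise_optimism)
```

Pairwise keeps participant 2's optimism score; listwise drops that participant from everything because their age is missing, so the two means differ.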
∗ Descriptive Exercise 1 – Survey
∗ Descriptive Exercise 2 – Staff Survey
Exercises
Adapted from Green, J. & D’Oliveira, M. (1999). Learning to use statistical tests in psychology. Buckingham, UK: Open University Press.
START
∗ Summarising Univariate Data? Descriptive statistics (mean, standard deviation, variance, etc)
∗ Relationships? Parametric: Pearson's r. Non-param: Spearman's r.
∗ Categorical & Frequency Data? Between: 1 or 2 sample Chi-square. Within: McNemar.
∗ Differences? How many Independent variables?
  ∗ One: How many experimental conditions?
    ∗ Two: Within or Between participants in each condition?
      ∗ Within. Parametric: Related t-test. Non-param: Wilcoxon.
      ∗ Between. Parametric: Unrelated t-test. Non-param: Mann Whitney.
    ∗ 3 or more: Within or Between participants in each condition?
      ∗ Within. Parametric: Oneway Within Ss (Repeated measures) ANOVA. Non-param: Friedman or Page’s L Trend Test.
      ∗ Between. Parametric: Oneway Between Group ANOVA. Non-param: Kruskal-Wallis or Jonckheere Trend Test.
  ∗ Two or more: Within or Between participants in each condition?
    ∗ Within: Factorial Within Subjects (Repeated Measures) ANOVA
    ∗ Between: Factorial Between Groups ANOVA
    ∗ Both True: Factorial Mixed Design (Split-Plot) ANOVA
Flowchart for choosing basic statistics
∗ Coolican, H. (2014). Research Methods and Statistics in Psychology (6th
ed.). Hove, UK: Psychology Press.
∗ A good introduction to the quantitative statistics incorporated in the social
sciences. A comprehensive coverage of the statistics covered in research
methods at this level in a clear and comprehensive format.
∗ Pallant, J. (2013). SPSS: Survival Manual (5th ed.). Maidenhead, UK: Open University Press
∗ A textbook that is of help with the statistical programme SPSS whatever
your level, as it takes you through the analysis in a step-by-step clear and
concise manner that allows you to learn while you put into practice.
∗ Field, A. (2013). Discovering Statistics Using IBM SPSS Statistics (4th ed.). London, UK: Sage
∗ An easy to engage with text that covers research methods and statistics in
a fashion that makes it easy to read and follow.
Recommended Reading
∗ You can use the below link to access the UCS library page that
has some useful videos showing how to use SPSS
∗ http://libguides.ucs.ac.uk/c.php?g=264784&p=1954991
∗ There is also a course that you can do (set up by Jen Versey our
Psychology technician and David Mullett from the library
support team)
∗ https://www.coursesites.com/webapps/Bb-sites-course-creation-BBLEA
383196_1
∗ There is always the IBM SPSS guide that you can access through
the help option in SPSS as a starting point.
Web Resources
Descriptive Statistics
∗ Descriptive statistics – are statistics that describe data. They
essentially summarise the data.
∗ They can be either numerical or graphic
∗ Numerical statistics come in 2 forms
∗ Measurement of central tendency
∗ Measurement of dispersion
Measure of Central Tendency
∗ There are three measures of central tendency; which one we use depends on our level of measurement. They are:
∗ Mean
∗ Arithmetic average/mean. Sum of all scores divided by the
number of scores
∗ Median
∗ The score that falls in the exact centre of the distribution
(middlemost score)
∗ Mode
∗ The most common/frequently occurring score
‘the mean’
∗ Formula for the mean is
$$\bar{x} = \frac{\sum x}{N}$$
$\bar{x}$ = the mean
$\sum$ = the sum of
$x$ = the scores
$N$ = the number of scores in set
∗ Advantages
∗ Powerful statistic used in estimating population parameters for significant
differences and correlations. Most sensitive, and works at an interval level.
∗ Disadvantages
∗ Can be overly sensitive causing it to easily distort due to outlier values
‘the median’
∗ The measure of central tendency for ordinal data
∗ Shorthand may be Guilford’s (1956) Mdn
∗ It is the central value of a set
∗ A formula used to find the position of the median in the ordered set is
$$k = \frac{N + 1}{2}$$
∗ For odd number data sets this will reveal the central number
∗ For even number data sets this will reveal the two points of data that the median
falls between
∗ When you have a number of values the same in the data set you can use the same
method although it is not strictly correct. However, luckily for us as social scientists
there are statistical packages that will take care of this for us
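The position formula can be tried out directly. In the sketch below, the function names are illustrative, and taking the midpoint of the two central values for an even-sized set is the usual convention rather than something stated on the slide.

```python
def median_position(n):
    """Position k = (N + 1) / 2 of the median in an ordered set of N values."""
    return (n + 1) / 2


def median(values):
    """Find the median by locating position k in the sorted values."""
    ordered = sorted(values)
    k = median_position(len(ordered))
    if k.is_integer():
        # Odd-sized set: k points straight at the central value.
        return ordered[int(k) - 1]
    # Even-sized set: k falls between two values; take their midpoint.
    lower = ordered[int(k) - 1]
    upper = ordered[int(k)]
    return (lower + upper) / 2


print(median_position(5), median([10, 20, 18, 12, 5]))
print(median_position(4), median([1, 3, 5, 7]))
```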
‘the mode’
∗ The measure of central tendency for nominal scale data. We are
unable to calculate mean and median with this type of data, but
we can see what occurred most often/highest frequency
∗ There can be two modes, which we call bi-modal
∗ Advantages
∗ Most typical, unaffected by extremes, can be more informative than mean
with discrete scales
∗ Disadvantages
∗ Does not account for differences between values, can’t be used in
estimates of population parameters, not all that useful for small sets of
data, for bi-modal two modal values reported, difficult to estimate
accurately when data grouped into class intervals
Measures of
Spread/Dispersion
High
Variability
Low
Variability
‘the range’
∗ Report of the top/highest value and the bottom/lowest value
∗ To calculate the range (the difference between them) you subtract the lower value from the higher value. For values measured in whole units, some textbooks then add 1 so that both end points are counted; SPSS reports the simple difference.
∗ Advantage
∗ Includes extremes, easy to calculate
∗ Disadvantages
∗ Can be distorted by extremes, can be unrepresentative of the distribution. Doesn’t tell us whether values are close to or spaced out from the mean
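A short sketch of the range calculation, borrowing the five maths-test scores used elsewhere in these slides. Both the simple difference and the add-1 version are shown so the two conventions can be compared; the variable names are illustrative.

```python
# The five maths-test scores used elsewhere in these slides.
scores = [10, 20, 18, 12, 5]

highest = max(scores)
lowest = min(scores)

# Simple range: highest value minus lowest value.
simple_range = highest - lowest

# Inclusive range for whole-number scores: add 1 so that both
# end points are counted.
inclusive_range = simple_range + 1

print(lowest, highest, simple_range, inclusive_range)
```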
‘the interquartile and semi-interquartile
range’
∗ The interquartile range allows us a better insight into how values fall in relation to the central tendency
∗ Instead of the full range, the interquartile range represents the distance between the central 50%, removing the bottom and top 25%. The values are known as the 1st and 3rd quartiles or the 25th and 75th percentiles
Interquartile range
3 3 4 5 6 8 10 13 14 16 19
Q1 = 4, M = 8, Q3 = 14
∗ The interquartile range is: Q3 – Q1
∗ Semi-interquartile is half of that: (Q3 – Q1) / 2
∗ Advantages
∗ Representative of central group of values, useful for ordinal data
∗ Disadvantages
∗ No account of extremes, inaccurate where there are large class
intervals
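The quartiles for the eleven example values can be computed with Python's standard library (Python 3.8 or later). This is a sketch under one particular quartile rule, the default "exclusive" method of statistics.quantiles; other packages, SPSS included, may use slightly different rules and give slightly different quartiles on other data sets.

```python
import statistics

# The eleven ordered values from the interquartile range slide.
values = [3, 3, 4, 5, 6, 8, 10, 13, 14, 16, 19]

# n=4 splits the data into quarters and returns the three cut points:
# the 1st quartile, the median and the 3rd quartile.
q1, med, q3 = statistics.quantiles(values, n=4)

iqr = q3 - q1        # distance spanned by the central 50% of values
semi_iqr = iqr / 2   # half of the interquartile range

print(q1, med, q3, iqr, semi_iqr)
```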
Standard deviation and variance
∗ These estimate from a sample how the values of a
population are distributed
∗ Standard deviation provides us with an average score
telling us how different the scores are from the mean
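∗ Formula for standard deviation (std, SD, stdev)
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \quad \text{or} \quad s = \sqrt{\frac{\sum d^2}{n - 1}}$$
where $d = x - \bar{x}$ is the deviation of each score from the mean and $n$ is the number of scores in the sample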

Spss introductory session data entry and descriptive stats

  • 1. SPSS & Quantitative Data Analysis Kulbir Singh Birak
  • 2. ∗ SPSS is a computer program for analysing quantitative data. ∗ This can range from basic descriptive statistics such as the mean, mode, median and range to powerful tests of significance (So whether we accept or reject a hypothesis). ∗ What the data looks like, and what that means if anything. What is SPSS?
  • 3. ∗ You can access SPSS on the vast majority of PCs at UCS: in these labs, the Waterfront PCs and the library PCs ∗ Additionally, if you wish, you can borrow a copy of SPSS to install on your own home PC or laptop. There are 16 copies in the library; you just need to borrow the disc and input the licence code that comes with it (licences do expire, and when they do you can borrow a new version of SPSS or obtain a new licence code) ∗ Overnight loan only, or you can bring your laptop in and install it there and then ∗ Windows version only, no Apple version SPSS Access
  • 4. Overview ∗ Why do numbers matter in research design? ∗ Numbers allow you to do two basic things: - Count how often “something” happens - Count how big an issue “something” is
  • 5. Overview ∗ Once you can count the extent (how often) and nature (for quantitative research a numerical descriptor of an attribute) you can already do some pretty important things. You can answer questions such as: ∗ How common is an issue? ∗ For instance, are black children over-represented in care? Are black adults over-represented in psychiatric hospital? ∗ How serious is a particular issue? Or how is it distributed within a sample? ∗ For instance, how serious are the concerns about children in families allocated a social worker?
  • 6. Overview Once you can count stuff you can start to answer other important and interesting questions, for instance:
  • 7. ∗ Students may often come to you with various questions about SPSS and difficulties that they are having ∗ If you are lucky enough to catch them early on a lot of unnecessary frustration and stress about analysing data can be avoided. ∗ The most important thing a student can do before they even consider methodology, methods or analysis is to have a clear research question/aim and hypotheses in place that conceptualise and operationalise the variables they wish to study. SPSS and Quantitative Data
  • 8. Some Basic Definitions A variable is the “thing” that you’re interested in studying e.g. depression, gender differences, social deprivation, specific crime rates, levels of emotionality (how emotional someone is) or different types of food!
  • 9. • Things like depression, gender differences, social deprivation, specific crime rates, levels of emotionality and food type, etc. are called “variables” because they vary. ● Some people are more depressed than others ● Some people are men, and others are women ● Some Social policies may be more successful than others ● We may see different crimes committed in different contexts, areas ● Some people are less emotional than others ● Food types can range from pizza to hamburgers to filet mignon, or might be Thai, Ethiopian, Polish or American cuisine, etc., etc.
  • 10. TO “CONCEPTUALISE” A VARIABLE MEANS TO MAKE CLEAR WHAT YOU MEAN BY THE VARIABLE…. • For example, for the variable “food type,” you need to be clear about whether you mean • (1) vegetarian or meat, OR • (2) breakfast, lunch or dinner foods, OR • (3) Ethiopian, Thai or American foods, OR • (4) something else!
  • 11. TO “OPERATIONALISE” A VARIABLE IS TO DECIDE HOW YOU WILL MEASURE IT • For example, if the variable you’re interested in is depression: ● Will you ask people to rate themselves, and if so, on what sort of a scale? ● Alternatively, will you measure depression by facial expression? By some behaviour that you observe? In some other way?
  • 12. TO “OPERATIONALISE” A VARIABLE IS TO DECIDE HOW YOU WILL MEASURE IT • If the variable you’re studying is intelligence & you don’t think Exam scores are a good measure of intelligence, what measure WILL you use? • Asking these sorts of questions is completing the process of “operationalising” your variables. • Conceptualisation & Operationalisation are necessary for a Quantitative approach
  • 13. TYPES OF QUANTITATIVE RESEARCH ∗ Exploratory: Exploratory research is undertaken when few or no previous studies exist. The aim is to look for patterns, hypotheses or ideas that can be tested and will form the basis for further research. Typical research techniques would include case studies, observation and reviews of previous related studies and data. Data from exploratory studies tends to be qualitative. ∗ Descriptive: Expands on the exploratory. Descriptive research can be used to identify and classify the elements or characteristics of the subject, e.g. number of days youth offenders remained out of trouble. Quantitative techniques are most often used to collect, analyse and summarise data. ∗ Causal/Relationship: Causal and Relationship research focuses on being able to predict/hypothesise cause and effect between observed behaviours, or relationships between aspects of behaviour/society/crime rates. The idea is that Causal and Relationship research is moving a step beyond descriptive research and the quantitative data collected can be used and analysed in a manner that allows the researcher to infer a significant effect/difference or relationship
  • 15. • The Quantitative approach sets out at the start of a study with a research question and a hypothesis/prediction • Hypotheses are formal statements of predictions derived from evidence from earlier research and/or theory. • The null hypothesis (H0) is a statement of ‘no difference/effect/change’ between the variables • The experimental hypothesis (H1) is a statement of difference/relationships between variables QUANTITATIVE DESIGNS AND HYPOTHESES
  • 16. • Experimental Hypothesis: Students who study for tests in study groups will score significantly better on their exams than students who did not study in study groups • Null Hypothesis: There will be no significant difference in exam results between students who do and do not study in study groups EXAMPLE OF HYPOTHESIS
  • 17. ∗ This clarity in the question and hypothesis can make life markedly easier for you and the student in the long run. ∗ However, I appreciate that this is not always easy to achieve, and more often than not it will not be the case for you. ∗ So what I will be covering with you today is a brief introduction to the SPSS interface, how to go about the initial basics of data entry, and how to begin exploring descriptive data. ∗ If we can, I’ll also take you through examples of some basic significance testing (otherwise I’ll make them available afterwards)
  • 18. ∗ A light but important session. ∗ Going over the basics of how to input data, label your variables so they are clear and how to create codebooks. ∗ It’s all about building up your confidence with the interface, and developing good practice. ∗ It’s about doing the basics so as to avoid confusion later on, e.g. inputting the data correctly for different types of analysis. Data Entry and Descriptives
  • 19. ∗ Hopefully you are familiar with the idea of descriptive data. ∗ As the name suggests, descriptive statistics are what we use to describe the data we have. ∗ There’s no point in knowing that the IQ scores of two groups are significantly different if we don’t have a way of describing the scores, and the difference. ∗ Measures of central tendency: Mean, mode, median etc. ∗ Measures of dispersion: Standard deviation etc. Descriptive Stats
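To make the measures listed on this slide concrete, here is a minimal Python sketch using the standard library's `statistics` module. The IQ scores are invented for illustration; in the session itself these figures would come from SPSS's descriptives output rather than from code.

```python
import statistics

# Invented IQ scores for one group (assumption, not data from the session)
iq_scores = [95, 100, 100, 105, 110, 120]

# Measures of central tendency
mean_iq = statistics.mean(iq_scores)
median_iq = statistics.median(iq_scores)
mode_iq = statistics.mode(iq_scores)

# Measures of dispersion
sd_iq = statistics.stdev(iq_scores)          # sample SD (divides by n - 1)
score_range = max(iq_scores) - min(iq_scores)

print(mean_iq, median_iq, mode_iq, round(sd_iq, 2), score_range)
```

Reporting a central value together with a measure of spread is what lets a reader judge how large a "significant" difference between two groups actually is.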
  • 20. Levels of Measurement ∗ In 1946 Stevens proposed a theory of scales of measurement. ∗ Nominal data (lowest level of measurement; categories only) ∗ Ordinal data (ordered, but the distances between points on the scale are unknown) ∗ Interval data (points on scale an equal distance apart, no true zero) ∗ Ratio data (equal distance between points on scale, plus a meaningful zero)
  • 21. Nominal ∗ Provides the least exact information ∗ Participants are placed in categories ∗ Data that is categorical e.g. gender, colours, shoe type, play behaviour ∗ Variable must fit into one category ∗ Measure of frequency ∗ Numbers may be used but only as category labels ∗ Central tendency is described using the mode ∗ Data is represented using a frequency table or bar chart
  • 22. Examples: Nominal Data ∗ Type of Bicycle ∗ Mountain bike, road bike, chopper, folding, BMX. ∗ Ethnicity ∗ White British, Afro-Caribbean, Asian, Chinese, other, etc. (note problems with these categories). ∗ Smoking status ∗ smoker, non-smoker
  • 23. Ordinal ∗ Simplest true scale, orders measurements along a continuum ∗ Represent rank position in a group e.g. 1st, 2nd, 3rd … 10th ∗ No information on difference between positions ∗ Central tendency is described in terms of the median ∗ Dispersion can be measured using the range or inter-quartile range (middle 50% of the distribution)
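The median and inter-quartile range mentioned for ordinal data can be sketched in Python as follows. The ratings are invented (coded 1 = very unhappy to 5 = ecstatic), and `statistics.quantiles` is used with its default "exclusive" method, which is an assumption here and may give slightly different quartiles from those SPSS reports.

```python
import statistics

# Invented ordinal ratings, coded 1 (very unhappy) to 5 (ecstatic)
ratings = [1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5]

median_rating = statistics.median(ratings)

# Quartiles split the ordered data into four parts;
# the IQR spans the middle 50% of the distribution.
q1, q2, q3 = statistics.quantiles(ratings, n=4)
iqr = q3 - q1

print(median_rating, q1, q3, iqr)
```

No mean is computed here because the distances between ordinal categories are not known to be equal.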
  • 24. Ordinal Data ∗ A type of categorical data in which order is important. ∗ Class of degree-1st class, 2:1, 2:2, 3rd class, fail ∗ Degree of illness- none, mild, moderate, acute, chronic. ∗ Opinion of students about stats classes- Very unhappy, unhappy, neutral, happy, ecstatic!
  • 25. Interval and ratio variables ∗ According to Fielding & Gilbert (2000) these are often used interchangeably, and incorrectly by social scientists (pg15) ∗ Interval, ordered categories, no inherent concept of zero (Clark 2004), we can calculate meaningful distance between categories, few real examples of interval variables in social sciences (Fielding & Gilbert 2000:15) ∗ Ratio. A meaningful zero amount (e.g. income), possible to calculate ratios so also has the interval property (e.g. someone earning £20,000 earns twice as much as someone who earns £10,000) (Fielding & Gilbert 2000:15) ∗ Difference between interval and ratio usually not important for statistical analysis (Fielding & Gilbert 2000:15)
  • 26. Interval variables- Examples ∗ Fahrenheit temperature scale- Zero is arbitrary- 40 degrees is not twice as hot as 20 degrees. ∗ IQ tests. No such thing as zero IQ. An IQ of 120 is not twice as intelligent as an IQ of 60. ∗ Question- Can we assume that attitudinal data represents real, quantifiable measured categories? (i.e. that ‘very happy’ is twice as happy as plain ‘happy’, or that ‘very unhappy’ means no happiness at all). Statisticians are not in agreement on this.
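The Fahrenheit example can be checked with a small Python sketch. It converts both temperatures to Kelvin, a scale that does have a true zero, and compares the ratios; the function name `fahrenheit_to_kelvin` is a hypothetical helper introduced only for this illustration.

```python
def fahrenheit_to_kelvin(deg_f):
    """Convert degrees Fahrenheit to Kelvin (a scale with a true zero)."""
    return (deg_f - 32) * 5 / 9 + 273.15


# Naive ratio on the interval scale: looks like "twice as hot"
ratio_fahrenheit = 40 / 20

# Ratio on a scale with a meaningful zero: only about 4% warmer
ratio_kelvin = fahrenheit_to_kelvin(40) / fahrenheit_to_kelvin(20)

print(ratio_fahrenheit, round(ratio_kelvin, 3))
```

Differences are still meaningful on an interval scale (40 °F is 20 degrees warmer than 20 °F); it is only ratios that break down when zero is arbitrary.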
  • 27. Ratio variables-Examples ∗ Can be discrete or continuous data. ∗ The distance between any two adjacent units of measurement (intervals) is the same and there is a meaningful zero point (Papadopoulos, 2001) ∗ Income- someone earning £20,000 earns twice as much as someone who earns £10,000. ∗ Height ∗ Unemployment rate- measured as the number of jobseekers as a percentage of the labour force (Papadopoulos, 2001).
  • 29. ∗ If you are still a little worried about your understanding of Quantitative Data please see the Key Information Handout in the Folder. ∗ By David Bowers (Learning Development) ∗ A reasonable summary of information about quantitative data. ∗ Data types, appropriate measures of central tendency etc. Key Information Handout
  • 30. ∗ Everything we do today is about good practice. ∗ Following the steps today, and developing correct inputting skills, will save you lots of problems and heartache later. ∗ SPSS is fussy when it comes to the way data is entered. Importance of Good Practice
  • 31. ∗ As SPSS is a Quantitative Data analysis software you often have to reduce information down to a numerical state ∗ A Codebook allows you to keep a record of these reductions and decisions ∗ A record of your own. Separate from SPSS. ∗ Electronic or on paper. ∗ A list of variables, full names, and how you have coded data. Codebook
  • 32. ∗ The codes you give data to allow SPSS to analyse it. ∗ You can’t enter text so some variables need to be converted. E.g. ∗ Gender: Female may become 1, Male may become 2. ∗ Relationship Status: Single may become 1, Married 2, Divorced 3, Widowed 4… Coding
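The coding step can be sketched outside SPSS as a simple lookup table. This is only an illustration in Python: the variable names `gender` and `relstatus` and the helper `encode` are hypothetical, while the numeric codes follow the slide's examples.

```python
# Hypothetical codebook: variable name -> {numeric code: label}.
# The codes follow the slide's examples; the variable names are illustrative.
CODEBOOK = {
    "gender": {1: "Female", 2: "Male"},
    "relstatus": {1: "Single", 2: "Married", 3: "Divorced", 4: "Widowed"},
}


def encode(variable, label):
    """Return the numeric code recorded in the codebook for a text label."""
    for code, text in CODEBOOK[variable].items():
        if text.lower() == label.strip().lower():
            return code
    raise ValueError(f"{label!r} is not in the codebook for {variable!r}")


raw_answers = ["Female", "male", "Female"]
coded = [encode("gender", answer) for answer in raw_answers]
print(coded)
```

Keeping the mapping in one place, separate from the data, is the same idea as keeping a codebook separate from SPSS.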
  • 33. ∗ SPSS is fussy when it comes to the names you give variables. ∗ Can’t give them a full description in the main view. ∗ So you can give detailed labels in the special variable view. ∗ Along with a codebook it helps keep the information clear. Labelling
  • 34. ∗ Available on email that was circulated to you all ∗ File: Data Entry Exercise 1 - Optimism Data ∗ We’ll be creating a codebook, setting up SPSS according to the codebook, and then entering the data. 1st Exercise
  • 35. ∗ Good habits ∗ Create a new Folder on your Desktop ∗ Right-click on Desktop> New > Folder > “SPSS” New Data Folder
  • 36. ∗ Start>All Programs>IBM SPSS Statistics 19. ∗ Depending on version may have a slightly different name. ∗ GIVE IT TIME SPSS IS RENOWNED FOR TAKING AN AGE TO OPEN UP – CLICKING AGAIN ONLY SLOWS IT DOWN MORE AS IT’LL THEN TRY TO OPEN ANOTHER SPSS WINDOW Open SPSS
  • 38. ∗ Optimism Scale data from 4 participants ∗ Firstly, we are going to prepare a codebook Coding Data
  • 40. ∗ Rules for naming of variables ∗ Variable names: ∗ must be unique (i.e. each variable in a data set must have a different name); ∗ must begin with a letter (not a number); ∗ cannot include full stops, spaces or other characters (!, ? * "); ∗ cannot include words used as commands by SPSS (all, ne, eq, to, le, lt, by, or, gt, and, not, ge, with) Coding Data
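As a rough sketch, the naming rules on this slide can be checked before any variables are typed into SPSS. The function `check_names` is hypothetical and mirrors only the slide's simplified rules, not SPSS's full naming specification; allowing underscores and treating names as case-insensitive are assumptions.

```python
import re

# Reserved words listed on the slide.
RESERVED = {"all", "ne", "eq", "to", "le", "lt", "by", "or",
            "gt", "and", "not", "ge", "with"}

# Assumption: after the first letter, only letters, digits and underscores.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def check_names(names):
    """Return a dict mapping each problematic name to the rule it breaks."""
    problems = {}
    seen = set()
    for name in names:
        key = name.lower()  # assumption: names are compared case-insensitively
        if not NAME_PATTERN.fullmatch(name):
            problems[name] = "must start with a letter and avoid spaces or punctuation"
        elif key in RESERVED:
            problems[name] = "is a reserved word"
        elif key in seen:
            problems[name] = "is not unique"
        seen.add(key)
    return problems


problems = check_names(["id", "sex", "op1", "1op", "op 2", "with", "sex"])
print(problems)
```

Here `id`, the first `sex` and `op1` pass, while `1op`, `op 2`, `with` and the repeated `sex` are flagged.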
  • 41. Optimism scale items op1 to 4 Enter number circled 1 (strongly disagree) to 5 (strongly agree) Coding Data
  • 42. ∗ Now we have a codebook to keep things clear we can set up SPSS so it is ready for the data. ∗ SPSS has 3 views: Data, Variable and Output. ∗ By switching to Variable we can define the variables we need. Creating a data file and inputting data
  • 48. Values Enter the relevant value and label as per your codebook, then click add. When all have been entered, click OK Define the meaning of the values used in the codebook (Gender) and click add for each.
  • 50. Values When entering Likert data always use the limits of the scale (1-5) even if you know that participants may not have entered some responses. You also need to decide whether you are going to just enter the range or every labelled point.
  • 52. ∗ Data comes in different types. ∗ Categorical (Nominal in SPSS) ∗ Ordinal ∗ Scale/Interval (Scale in SPSS) ∗ Different types/measures suit different tests, different measures of central tendency, different forms of visualisation. ∗ Makes knowing what type of data you have KEY for successful data analysis. Measures
  • 53. Measures ∗ Scale refers to the interval/ratio level of measurement. There is some debate about data type in relation to Likert data … for our purposes, leave this as Scale ∗ Nominal refers to categorical
  • 55. ∗ Now you have the variables set up ready for the data you can start to enter the actual data ∗ Go to the Data View Inputting Data According to the Codebook
  • 60. ∗ You’ve saved the data so now it is ‘safe’ ∗ You can have a play around with it and try a few different things. ∗ Delete a case ∗ Insert a case between existing cases ∗ Delete a variable ∗ Insert a variable between existing variables ∗ Try during the workshop/at home so you get more confident with SPSS. Playing around with the data
  • 61. ∗ Available on LearnUCS. ∗ Different experimental designs require a different style of inputting. ∗ The structure you use will be different between Repeated (Within-Group) and Independent (Between-Group) experimental designs. ∗ Use the wrong structure and the analysis will fall down. It will be meaningless at best. 2nd Exercise: Inputting Repeated and Independent Measures
  • 62. ∗ So, to recap ∗ Repeated Measures. The same participants experience all treatments/are in all the groups/conditions. ∗ If you wanted to investigate the effect of music on taking an IQ test participants would experience the no music condition, and the music condition. ∗ Hopefully with some counterbalancing. Repeated Measures
  • 65. ∗ Again to recap. ∗ Participants are split. One group will experience one treatment/be in one group/condition. ∗ Another group will experience the other. ∗ Each condition will have a unique, non-shared, set of participants. Independent
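The two input structures can be sketched as rows of data. This is an illustration only: the IQ scores are made up, the column names are hypothetical, and the rows are plain Python dictionaries rather than an SPSS data file.

```python
# Repeated measures: one row per participant, one column per condition.
repeated_rows = [
    {"id": 1, "iq_no_music": 102, "iq_music": 98},
    {"id": 2, "iq_no_music": 110, "iq_music": 111},
]

# Independent groups: one row per participant, a coded grouping column
# (1 = no music, 2 = music) and a single score column.
independent_rows = [
    {"id": 1, "condition": 1, "iq": 102},
    {"id": 2, "condition": 1, "iq": 110},
    {"id": 3, "condition": 2, "iq": 98},
    {"id": 4, "condition": 2, "iq": 111},
]


def scores_by_condition(rows):
    """Group independent-design scores by their coded condition."""
    groups = {}
    for row in rows:
        groups.setdefault(row["condition"], []).append(row["iq"])
    return groups


# In the repeated design each participant contributes a pair of scores.
pairs = [(row["iq_no_music"], row["iq_music"]) for row in repeated_rows]
groups = scores_by_condition(independent_rows)
print(pairs)
print(groups)
```

The point of the sketch is that a repeated design keeps both scores on one participant's row, whereas an independent design needs a grouping variable to say which condition each row belongs to.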
  • 72. ∗ A quick trick to show you. ∗ Good for those who aren’t fond of a screen full of numbers. ∗ If you have coded your variables correctly there is a button you can press that will make the numbers in your data view appear as the names coded. ∗ For example the 1’s and 2’s for gender could appear as Male and Female. Labelling Trick
  • 74. ∗ Data Entry Exercise 1 – Optimism Data Input ∗ Data Entry Exercise 2 – Repeated and Independent ∗ Extra Data Entry Exercises ∗ Exercise 3 – Giving electric shocks ∗ Exercise 4 – Shooting people ∗ We’ve gone through 1 and 2 here. Try them on your own. ∗ 3 and 4 for extra practice. ∗ Make sure you are comfortable with data input, coding and labelling. Exercises
  • 75. ∗ The theory and step-by-step guide will be covered in the slides following immediately below. ∗ If you complete the first exercise move onto exercise 2. Descriptive Exercise 1: survey.sav The data is from a survey of staff about stress and emotions. Generate the frequencies for 1) marital status and 2) level of education Descriptive Exercise 2: staffsurvey.sav The data is from a staff survey with likert scales for agreement and importance of factors. Generate appropriate descriptive statistics to answer the following questions: (a) What percentage of the staff in this organisation are permanent employees? (Use the variable employstatus.) (b) What is the average length of service for staff in the organisation? (Use the variable service.) (c) What percentage of respondents would recommend the organisation to others as a good place to work? (Use the variable recommend.) Lab Exercises
  • 77. ∗ When you are trying to find your descriptive stats you need to make sure you use the right ones. ∗ Certain types of data/measure, suit certain types of measures of central tendency and dispersion. ∗ Use the wrong ones and your description of the results will be confusing, wrong and won’t match your inferential statistics. Types of Variables & Descriptives
  • 78. ∗ Also known as Nominal variables in SPSS. ∗ Data that has been classified and categorised. ∗ For example gender: a participant will belong to a particular category of gender. ∗ Marital Status. ∗ Anything that you can create a discrete classification of. You can even take a scale variable like age, and force it into non-overlapping categories (18 and under, 19 – 25, 26 – 35 etc.). Categorical Variables
  • 79. ∗ Measure of Central tendency to use for Categorical data is the mode. ∗ Frequency of occurrence or amount. ∗ So using gender as an example you would use the mode. ∗ 2 of the sample might be male, and 8 female. ∗ Mode = Female. ∗ 20% male, 80% female Categorical
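The gender example above can be reproduced with a short frequency count. This is a sketch in Python rather than SPSS's Frequencies procedure; the list `gender` simply recreates the slide's sample of 2 males and 8 females.

```python
from collections import Counter

# The slide's example sample: 2 males and 8 females.
gender = ["Male"] * 2 + ["Female"] * 8

counts = Counter(gender)
mode_label, mode_count = counts.most_common(1)[0]
percentages = {label: 100 * n / len(gender) for label, n in counts.items()}

print(mode_label)
print(percentages)
```

The mode is the category with the highest count, and the percentages are each count divided by the sample size.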
  • 80. ∗ In SPSS you should use the Frequency option when you want the descriptive stats for a categorical variable. ∗ Go to Descriptive Exercise 1 on LearnUCS. Categorical and Frequency
  • 81. ∗ Save survey.sav to your SPSS folder on the Desktop from LearnUCS ∗ Have a look at survey.sav questionnaire from LearnUCS ∗ Open survey.sav dataset Descriptive Exercise 1 - Survey
  • 89. ∗ This is where graphs and the results from tests (descriptive and inferential) will appear. ∗ Also notes about when you have saved and opened files too. ∗ If you want to keep what is in the output you must save it specifically. ∗ Saving the data/variable will not save what is in the output, and vice versa. Output pages
  • 90. ∗ Aside from Categorical measures we also have ∗ Ordinal ∗ Scale/Interval (sometimes known as ratio too) ∗ These are also generally known as continuous variables. ∗ Usually the mean or median are the measures of central tendency used, and the standard deviation, or error, the measure of dispersion. Other measures
  • 91. ∗ Ranked or ordered data. Sometimes Likert scales. ∗ Has some similarity to categorical data (You might consider grade brackets to be categories; A, B, C, D, etc). ∗ But importantly they are ranked, so there is meaning to the position. A is better than B, B better than C and so on. ∗ The median is used here. ∗ Central point with an equal amount above/below. Ordinal
  • 92. ∗ The median is used here. ∗ Central point with an equal amount above/below ∗ So if you had a collection of grades… ∗ 20 people had an A ∗ 10 had a B ∗ 10 had a C ∗ 10 had a D ∗ Then B would be the median grade, as 20 people had higher, and 20 people had lower. Ordinal
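The grade example can be checked by walking through the ranked categories until the middle case is reached. This is a minimal sketch; `median_category` is a hypothetical helper, and for an even number of cases it uses the lower of the two middle positions, which is a simplification.

```python
# Grades ordered from best to worst; the counts come from the slide's example.
GRADE_ORDER = ["A", "B", "C", "D"]
grade_counts = {"A": 20, "B": 10, "C": 10, "D": 10}


def median_category(order, counts):
    """Return the category that contains the middle case of the ranked data."""
    total = sum(counts[category] for category in order)
    if total == 0:
        raise ValueError("no observations")
    middle = (total + 1) // 2  # lower middle position when total is even
    running = 0
    for category in order:
        running += counts[category]
        if running >= middle:
            return category


median_grade = median_category(GRADE_ORDER, grade_counts)
print(median_grade)
```

With 50 cases the middle position is the 25th, which falls among the ten B grades (positions 21 to 30).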
  • 93. ∗ Imagine we wished to find the median for the highest educational level attained by a population ∗ In descriptive exercise 1 (survey) we would click on ‘Analyze’ Using Explore to See the Median
  • 94. Using Explore to See the Median Select ‘Descriptive Statistics’ and then ‘Explore’ from the Drop- down menus
  • 95. Using Explore to See the Median 1. When the below box opens move ‘highest educ completed’ from the left pane to the ‘Dependent List’ section 2. Click on ‘Statistics’ and choose ‘Outliers’ and ‘Continue’ 3. Click on ‘Plots’ and choose ‘Histograms’ and ‘Normality Plots with tests’ and ‘Continue’ 4. Click on ‘OK’
  • 96. Using Explore to See the Median The resulting ‘Output’ in the Output window will show you a number of descriptive stats. We can see the median is 4 for the ‘highest educ completed’ which means ‘some additional training’ is the median for the highest education completed for 439 participants who took part in the survey.
  • 97. ∗ Interval – a scale with artificial limits, no true zero, and usually some form of cap. ∗ Intervals are of equal size. ∗ IQ scores for example. ∗ Ratio – has a true zero, constant intervals and potentially little or no cap. ∗ So timing scores on a task for example. ∗ SPSS doesn’t really differentiate between the two. ∗ Basically if it is a form of score it is likely to be scale. Scale
  • 98. ∗ The mean is the normal measure of central tendency, and the measure of dispersion the standard deviation. ∗ So 5 people take a maths test. ∗ They score 10, 20, 18, 12 and 5. ∗ The average would be 13 (total/number of cases) Scale
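The maths test example can be verified with Python's standard `statistics` module. This is a sketch alongside the slide rather than the SPSS Descriptives procedure; `stdev` here is the sample standard deviation, which divides by n - 1.

```python
import statistics

scores = [10, 20, 18, 12, 5]

mean_score = statistics.mean(scores)   # sum of scores / number of scores
sd_score = statistics.stdev(scores)    # sample standard deviation (n - 1)

print(mean_score)
print(round(sd_score, 2))
```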
  • 99. ∗ In SPSS we just need the descriptive option, rather than the frequency option. ∗ So for example if we wished to find the mean and standard deviation for ‘age’, ‘total optimism’, ‘total mastery’, ‘total perceived stress’ and ‘total perceived control of internal states’ (PCOISS), for participants who answered the survey we are using for exercise 1. Scale Descriptives
  • 104. ∗ Sometimes information will be left out of a questionnaire, or the value lost, but you will still need to conduct an analysis. ∗ What happens if someone doesn’t fill in the age box on a questionnaire? ∗ Rather than get rid of all their data you can use the ‘Exclude cases pairwise’ option. ∗ It excludes the case (person) only if they are missing the data required for the specific analysis. They will still be included in any of the analyses for which they have the necessary information. Missing Data
  • 105. ∗ Exclude cases listwise ∗ A more extreme option. ∗ If the participant is missing any data then this option should remove them entirely from the analysis. ∗ A matter of judgement as to which to use. Missing Data
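The difference between the two options can be sketched with a few made-up cases. The data and the function names `mean_pairwise` and `mean_listwise` are hypothetical; they imitate the idea of the two options, not SPSS's internal behaviour.

```python
import statistics

# Made-up cases; None marks a missing answer.
cases = [
    {"age": 23, "optimism": 18},
    {"age": None, "optimism": 22},
    {"age": 31, "optimism": 20},
    {"age": 45, "optimism": None},
]
variables = ["age", "optimism"]


def mean_pairwise(cases, variable):
    """Use every case that has a value for this particular variable."""
    values = [case[variable] for case in cases if case[variable] is not None]
    return statistics.mean(values)


def mean_listwise(cases, variable, variables):
    """Use only cases with complete data on all listed variables."""
    complete = [case for case in cases
                if all(case[name] is not None for name in variables)]
    return statistics.mean([case[variable] for case in complete])


print(mean_pairwise(cases, "optimism"), mean_listwise(cases, "optimism", variables))
print(mean_pairwise(cases, "age"), mean_listwise(cases, "age", variables))
```

Pairwise exclusion keeps three cases for each variable here, while listwise exclusion keeps only the two fully complete cases, so the means differ.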
  • 107. ∗ Descriptive Exercise 1 – Survey ∗ Descriptive Exercise 2 – Staff Survey Exercises
  • 108. Flowchart for choosing basic statistics (adapted from Green, J. & D’Oliveira, M. (1999). Learning to use statistical tests in psychology. Buckingham, UK: Open University Press.) ∗ START ∗ Summarising univariate data? Descriptive statistics (mean, standard deviation, variance, etc) ∗ Categorical & frequency data? Between: 1 or 2 sample Chi-square; Within: McNemar ∗ Relationships? Parametric: Pearson's r; Non-param: Spearman's r ∗ Differences? How many independent variables? ∗ One independent variable: how many experimental conditions? Then within or between participants in each condition? ∗ Two conditions, Within: Parametric: Related t-test; Non-param: Wilcoxon ∗ Two conditions, Between: Parametric: Unrelated t-test; Non-param: Mann Whitney ∗ 3 or more conditions, Within: Parametric: Oneway Within Ss (Repeated measures) ANOVA; Non-param: Friedman or Page’s L Trend Test ∗ 3 or more conditions, Between: Parametric: Oneway Between Group ANOVA; Non-param: Kruskal-Wallis or Jonckheere Trend Test ∗ Two or more independent variables: within or between participants in each condition? ∗ Within: Factorial Within Subjects (Repeated Measures) ANOVA ∗ Between: Factorial Between Groups ANOVA ∗ Both: Factorial Mixed Design (Split-Plot) ANOVA
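The "Differences" branch of the flowchart for a single independent variable can be sketched as a lookup. The function `choose_difference_test` and its argument names are hypothetical; the test names are those on the flowchart, and the sketch does not cover the other branches.

```python
# (number of conditions, design) -> (parametric test, non-parametric test)
DIFFERENCE_TESTS = {
    ("two", "within"): ("Related t-test", "Wilcoxon"),
    ("two", "between"): ("Unrelated t-test", "Mann Whitney"),
    ("three_or_more", "within"): (
        "Oneway Within Ss (Repeated measures) ANOVA",
        "Friedman or Page's L Trend Test",
    ),
    ("three_or_more", "between"): (
        "Oneway Between Group ANOVA",
        "Kruskal-Wallis or Jonckheere Trend Test",
    ),
}


def choose_difference_test(n_conditions, design, parametric=True):
    """Pick a test of differences for one independent variable."""
    if n_conditions < 2:
        raise ValueError("a test of differences needs at least two conditions")
    if design not in ("within", "between"):
        raise ValueError("design must be 'within' or 'between'")
    size = "two" if n_conditions == 2 else "three_or_more"
    parametric_test, nonparametric_test = DIFFERENCE_TESTS[(size, design)]
    return parametric_test if parametric else nonparametric_test


print(choose_difference_test(2, "within"))
print(choose_difference_test(3, "between", parametric=False))
```

A table like this only encodes the route through the chart; deciding whether the data meet parametric assumptions is still a judgement made before the lookup.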
  • 109. ∗ Coolican, H. (2014). Research Methods and Statistics in Psychology (6th ed.). Hove, UK: Psychology Press. ∗ A good introduction to the quantitative statistics incorporated in the social sciences. A comprehensive coverage of the statistics covered in research methods at this level in a clear and comprehensive format. ∗ Pallant, J. (2013). SPSS: Survival Manual (5th ed.). Maidenhead, UK: Open University Press ∗ A textbook that is of help with the statistical programme SPSS whatever your level, as it takes you through the analysis in a step-by-step clear and concise manner that allows you to learn while you put into practice. ∗ Field, A. (2013). Discovering Statistics Using IBM SPSS Statistics (4th ed.). London, UK: Sage ∗ An easy to engage with text that covers research methods and statistics in a fashion that makes it easy to read and follow. Recommended Reading
  • 110. ∗ You can use the below link to access the UCS library page that has some useful videos showing how to use SPSS ∗ http://libguides.ucs.ac.uk/c.php?g=264784&p=1954991 ∗ There is also a course that you can do (set up by Jen Versey our Psychology technician and David Mullett from the library support team) ∗ https://www.coursesites.com/webapps/Bb-sites-course-creation-BBLEA 383196_1 ∗ There is always the IBM SPSS guide that you can access through the help option in SPSS as a starting point. Web Resources
  • 111. Descriptive Statistics ∗ Descriptive statistics – are statistics that describe data. They essentially summarise the data. ∗ They can be either numerical or graphic ∗ Numerical statistics come in 2 forms ∗ Measurement of central tendency ∗ Measurement of dispersion
  • 112. Measure of Central Tendency ∗ There are three measures of central tendency; which one we use depends on our level of measurement. They are: ∗ Mean ∗ Arithmetic average/mean. Sum of all scores divided by the number of scores ∗ Median ∗ The score that falls in the exact centre of the distribution (middlemost score) ∗ Mode ∗ The most common/frequently occurring score
  • 113. ‘the mean’ ∗ Formula for the mean is $\bar{x} = \frac{\sum x}{N}$, where $\bar{x}$ = the mean, $\sum$ = the sum of, $x$ = the scores, $N$ = the number of scores in the set ∗ Advantages ∗ Powerful statistic used in estimating population parameters for significant differences and correlations. Most sensitive, and works at an interval level. ∗ Disadvantages ∗ Can be overly sensitive causing it to easily distort due to outlier values
  • 114. ‘the median’ ∗ The measure of central tendency for ordinal data ∗ Shorthand may be Guilford’s (1956) Mdn ∗ It is the central value of a set ∗ A formula used to find the position of the median in the ordered data is $k = \frac{N + 1}{2}$ ∗ For odd number data sets this will reveal the central number ∗ For even number data sets this will reveal the two points of data that the median falls between ∗ When you have a number of values the same in the data set you can use the same method although it is not strictly correct. However, luckily for us as social scientists there are statistical packages that will take care of this for us
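The position formula can be tried on a small odd and a small even data set. This is a sketch with made-up scores; taking the average of the two middle values for an even set is the usual convention and is an addition to what the slide states.

```python
import math


def median_position(n):
    """Position k = (N + 1) / 2 in the ordered data, counting from 1."""
    return (n + 1) / 2


def median(values):
    """Median via the position formula; averages the two middle values if needed."""
    ordered = sorted(values)
    k = median_position(len(ordered))
    lower = ordered[math.floor(k) - 1]  # value at or just below position k
    upper = ordered[math.ceil(k) - 1]   # value at or just above position k
    return (lower + upper) / 2


odd_scores = [5, 10, 12, 18, 20]        # N = 5, so k = 3
even_scores = [5, 10, 12, 18, 20, 24]   # N = 6, so k = 3.5

print(median_position(len(odd_scores)), median(odd_scores))
print(median_position(len(even_scores)), median(even_scores))
```

For the odd set k lands exactly on the third value; for the even set k = 3.5 falls between the third and fourth values.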
  • 115. ‘the mode’ ∗ The measure of central tendency for nominal scale data. We are unable to calculate mean and median with this type of data, but we can see what occurred most often/highest frequency ∗ There can be two modes, which we call bi-modal ∗ Advantages ∗ Most typical, unaffected by extremes, can be more informative than mean with discrete scales ∗ Disadvantages ∗ Does not account for differences between values, can’t be used in estimates of population parameters, not all that useful for small sets of data, for bi-modal two modal values reported, difficult to estimate accurately when data grouped into class intervals
  • 117. ‘the range’ ∗ Report of the top/highest value and the bottom/lowest value ∗ To calculate the range (the difference between them) you subtract the lower value from the higher value and add 1 ∗ Advantage ∗ Includes extremes, easy to calculate ∗ Disadvantages ∗ Can be distorted by extremes, can be unrepresentative of the distribution. Doesn’t tell us whether values are close to or spaced out from the mean
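A short sketch of the calculation, reusing the five maths test scores from the earlier Scale slide. The "add 1" step follows this slide's definition; many packages report the plain difference instead, so both are shown.

```python
scores = [10, 20, 18, 12, 5]

simple_range = max(scores) - min(scores)   # highest minus lowest
inclusive_range = simple_range + 1         # the slide's version, with 1 added

print(min(scores), max(scores), simple_range, inclusive_range)
```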
  • 118. ‘the interquartile and semi-interquartile range’ ∗ The interquartile range allows us a better insight into how values fall in relation to the central tendency ∗ Instead of the full range, the interquartile range represents the distance between the central 50%, removing the bottom and top 25%. The values are known as the 1st and 3rd quartiles or the 25th and 75th percentiles
  • 119. Interquartile range ∗ Example data (ordered): 3 3 4 5 6 8 10 13 14 16 19, with Q1 = 4, M (median) = 8 and Q3 = 14 ∗ The interquartile range is: $Q_3 - Q_1$ ∗ Semi-interquartile is half of that: $\frac{Q_3 - Q_1}{2}$ ∗ Advantages ∗ Representative of central group of values, useful for ordinal data ∗ Disadvantages ∗ No account of extremes, inaccurate where there are large class intervals
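The slide's eleven values can be run through Python's standard `statistics.quantiles` as a check. This is a sketch, not SPSS output: quartile conventions differ between packages, so other data sets may give slightly different quartiles elsewhere.

```python
import statistics

data = [3, 3, 4, 5, 6, 8, 10, 13, 14, 16, 19]

# The default "exclusive" method places the quartiles at positions
# (n + 1)/4, 2(n + 1)/4 and 3(n + 1)/4 of the ordered data.
q1, med, q3 = statistics.quantiles(data, n=4)

iqr = q3 - q1
semi_iqr = iqr / 2

print(q1, med, q3, iqr, semi_iqr)
```

With eleven values those positions are the 3rd, 6th and 9th scores, so no interpolation is needed.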
  • 120. Standard deviation and variance ∗ These estimate from a sample how the values of a population are distributed ∗ Standard deviation provides us with an average score telling us how different the scores are from the mean ∗ Formula for standard deviation (std, SD, stdev): $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$ or $s = \sqrt{\frac{\sum d^2}{n - 1}}$, where $d = x - \bar{x}$ is each score's deviation from the mean
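As a worked illustration of the deviation form of the formula, the five maths test scores from the earlier Scale slide (10, 20, 18, 12 and 5) can be taken through each step by hand. The choice of these scores is only for illustration.

```latex
\begin{aligned}
\bar{x} &= \frac{10 + 20 + 18 + 12 + 5}{5} = 13 \\
d = x - \bar{x} &: \quad -3,\ 7,\ 5,\ -1,\ -8 \\
\sum d^2 &= 9 + 49 + 25 + 1 + 64 = 148 \\
s^2 &= \frac{148}{5 - 1} = 37 \\
s &= \sqrt{37} \approx 6.08
\end{aligned}
```

The variance is the value before the square root is taken (37), and the standard deviation puts it back into the original units of the scores.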

Editor's Notes

  1. Statistical Package/Program for Social Sciences Or Something Painful Students Suffer
  2. Imp – Can use mode on categories as well but you lose important information in regards to the ranking.