2. Summarising Univariate Data
Measures of Centre
Mean: the arithmetic average of all the values in a
data set
Mean = sum of all data values ÷ number of data values
Median: the middle value in an ordered data set
Mode: the most frequently occurring value in a data set
Outliers: extreme values that don’t fit in with the data
set
The mean is affected by outliers whereas the mode
and median aren’t
The median is the best measure of centre when you
are working with skewed data or data that contains
outliers (e.g. house prices)
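
To illustrate how an outlier affects each measure of centre, here is a minimal Python sketch. The house prices are made-up values (in thousands of dollars) chosen only for illustration, and the variable names are hypothetical.

```python
from statistics import mean, median, mode

# Hypothetical house prices in thousands of dollars (illustrative only).
prices = [300, 320, 320, 350, 380]

# The same data with one extreme value added as an outlier.
prices_with_outlier = prices + [2500]

# Measures of centre without the outlier.
mean_before = mean(prices)        # 334
median_before = median(prices)    # 320
mode_before = mode(prices)        # 320

# Measures of centre with the outlier included.
mean_after = mean(prices_with_outlier)      # 695
median_after = median(prices_with_outlier)  # 335.0
mode_after = mode(prices_with_outlier)      # 320

print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
print("mode:  ", mode_before, "->", mode_after)
```

In this example the single outlier more than doubles the mean, while the median moves only slightly and the mode does not change.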
K McMullen 2012
3. Summarising Univariate Data
Range: The difference between the largest and
smallest value
MAX value minus MIN value
Is affected by outliers
Interquartile range (IQR): represents the spread
of the middle 50%
IQR= Upper quartile (Q3) minus Lower quartile
(Q1)
Upper quartile: the median of the upper half of the
ordered data
Lower quartile: the median of the lower half of the
ordered data
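
The range and IQR can be sketched in Python as follows. This is one possible approach, assuming the "median of each half" method for quartiles, with the median left out of both halves when the number of values is odd. Other textbooks and software use slightly different quartile conventions, so results may differ. The function names and data are hypothetical.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: when the number of values is odd, the overall median
    is excluded from both halves.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[half + 1:] if len(data) % 2 else data[half:]
    return median(lower), median(upper)

def data_range(values):
    """Range = MAX value minus MIN value."""
    return max(values) - min(values)

def iqr(values):
    """IQR = Q3 minus Q1, the spread of the middle 50%."""
    q1, q3 = quartiles(values)
    return q3 - q1

# Hypothetical data set; 40 is an outlier.
scores = [2, 4, 5, 7, 8, 10, 12, 15, 40]
scores_without_outlier = scores[:-1]

print(data_range(scores_without_outlier), data_range(scores))  # 13 38
print(iqr(scores_without_outlier), iqr(scores))                # 6.5 9.0
```

Here the outlier nearly triples the range, while the IQR changes far less.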
4. Summarising Univariate Data
Standard Deviation (s): measures the spread of
the distribution of data around the mean
Easily calculated using CAS
Most useful when the data are approximately normally
distributed
The range, IQR and standard deviation all
measure the spread of data
The larger the spread the more varied the data
set is
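
A CAS does this calculation automatically, but the steps can be sketched in Python. This sketch assumes that s is the sample standard deviation (dividing by n - 1), which is the usual meaning of the symbol s. The data sets and the function name are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def sample_std_dev(values):
    """Sample standard deviation s (assumed: divide by n - 1)."""
    centre = mean(values)
    # Squared distance of each value from the mean.
    squared_deviations = [(x - centre) ** 2 for x in values]
    return sqrt(sum(squared_deviations) / (len(values) - 1))

# Two hypothetical data sets with the same mean (50) but different spread.
tight = [48, 49, 50, 51, 52]
wide = [30, 40, 50, 60, 70]

s_tight = sample_std_dev(tight)
s_wide = sample_std_dev(wide)

# The hand-written version should agree with the standard library.
print(round(s_tight, 2), round(stdev(tight), 2))
print(round(s_wide, 2), round(stdev(wide), 2))
```

Both data sets have the same centre, but the second has the larger standard deviation because its values lie further from the mean.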
5. Summarising Univariate Data
Five number summary: the five number summary
identifies the important components of
a box plot
The summary must be stated in this order:
Minimum value
Lower quartile
Median
Upper quartile
Maximum
All 5 of these components must be displayed on a
box plot
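
As a rough sketch, the five number summary can be computed in Python and returned in the required order. It assumes the quartiles are the medians of the lower and upper halves of the ordered data, with the median excluded from both halves when the number of values is odd. A CAS or other software may use a different quartile convention. The function name and data are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) in that order.

    Assumption: Q1 and Q3 are the medians of the lower and upper
    halves, excluding the overall median when the count is odd.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[half + 1:] if len(data) % 2 else data[half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Hypothetical data set, given unordered on purpose.
scores = [12, 2, 40, 7, 5, 15, 8, 4, 10]

summary = five_number_summary(scores)
print(summary)  # (2, 4.5, 8, 13.5, 40)
```

Each of the five values printed corresponds to one feature of the box plot: the ends of the whiskers, the edges of the box, and the line inside the box.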