Making sense of data - Learning Lab slides
- 1. Making sense of data
A “Learning Lab” for
Third Sector Organisations
Facilitated by: Ian J Seath
© 2019 Copyright ISC Ltd.
- 3. I help organisations with…
Continuous Improvement
Strategy & Planning
Project Management
Process Improvement
Leadership & People
Development
- 4.
Some basic principles for displaying data
How to select appropriate types of chart to analyse and present data
Decide how big a sample to choose in order to be confident in the results
Why an “average” could be very misleading
A simple chart to identify priorities and achieve focus
Data – Information – Knowledge – Wisdom
Tables
Reports
Charts
Infographics
Dashboards
Interactive & Scrollable Charts
Data Stories & Visualisations
- 5. MOST PEOPLE HATE MATHS!
“A Mathematician is a device for turning coffee into theorems.”
Paul Erdos (Hungarian Mathematician)
- 6. An aeroplane flies round the four sides of a 100 mile square
It flies at 100 mph on side 1, 200 mph on side 2, 300 mph on side 3 and 400 mph on side 4.
What is its average speed?
[Diagram: a 100 mile square with its four sides labelled 100, 200, 300 and 400 m.p.h.]
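A quick way to check the aeroplane puzzle is to work from total distance and total time rather than averaging the four speeds. The sketch below does that; the variable names are illustrative only.

```python
# Average speed is total distance divided by total time,
# not the mean of the four speeds.
side_miles = 100
speeds_mph = [100, 200, 300, 400]

total_distance = side_miles * len(speeds_mph)            # miles flown
total_time = sum(side_miles / s for s in speeds_mph)     # hours taken
average_speed = total_distance / total_time
naive_mean = sum(speeds_mph) / len(speeds_mph)

print(f"Average speed: {average_speed:.0f} mph")
print(f"Mean of the four speeds: {naive_mean:.0f} mph")
```

The slow first side takes a full hour, so it pulls the true average well below the simple mean of the speeds.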
- 7. The Golden Rules of Measurement
No measurement without recording
No recording without analysis
No analysis without action
- 9. In which of these is it easier to identify the
funding type with the biggest percentage
increase?
Restricted funding has increased from £47,250 to
£63,210 whereas unrestricted funding increased
from £30,150 to £46,430 last year.
Restricted funding has increased from £47,000 to
£63,000 whereas unrestricted funding increased
from £30,000 to £46,000 last year.
- 10. Badly presented data makes it hard to
understand & improve performance
The second example is much easier on the eye, and it lets you
do a bit of mental arithmetic: an increase of £16,000 in
restricted funds is about a third (33%) and an increase of
£16,000 in unrestricted funds is about half (50%)
Very few people need absolutely precise numbers
(Actuaries, Accountants, Scientists and Engineers are
common exceptions)
So, for most management information, rounded data will be
easier to handle
- 11. Tip 1: Round to “2 effective digits”
Applied with common sense, rounding to two
effective digits usually makes numbers easier
to cope with and to get a quicker understanding
of what’s going on
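As a rough sketch of the tip, the helper below rounds a number to its two leading digits and compares the percentage increases before and after rounding. It assumes "two effective digits" can be treated as two significant digits, which matches the figures on these slides; `round_effective` and `pct_increase` are hypothetical names, and half-up rounding is used so that, for example, 68.5 becomes 69.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_effective(value, digits=2):
    """Round value to the given number of leading digits (half-up)."""
    d = Decimal(str(value))
    if d == 0:
        return 0.0
    # Exponent of the last digit to keep, e.g. 3 (thousands) for 47250.
    exponent = d.adjusted() - (digits - 1)
    quantum = Decimal(1).scaleb(exponent)
    return float(d.quantize(quantum, rounding=ROUND_HALF_UP))

def pct_increase(old, new):
    """Percentage increase from old to new."""
    return 100 * (new - old) / old

# Funding figures from the earlier slide: (last year, this year).
funding = {"Restricted": (47250, 63210), "Unrestricted": (30150, 46430)}

for name, (last_year, this_year) in funding.items():
    r_last, r_this = round_effective(last_year), round_effective(this_year)
    print(f"{name}: {r_last:,.0f} -> {r_this:,.0f}, "
          f"exact increase {pct_increase(last_year, this_year):.0f}%, "
          f"rounded increase {pct_increase(r_last, r_this):.0f}%")
```

The rounded figures give nearly the same percentages as the exact ones, which is the point of the tip: little is lost for management purposes.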
[Chart: Restricted & Unrestricted Funding. Income (£), Last year vs This year. Restricted: 47,000 and 63,000; Unrestricted: 30,000 and 46,000]
- 12. Which is easier to read?
A
| | Income (£) | Surplus (£) |
|---|---|---|
| 2019 | 250,000 | 24,000 |
| 2018 | 220,000 | 20,000 |
| 2017 | 180,000 | 16,000 |
| 2016 | 140,000 | 10,000 |
| 2015 | 100,000 | 6,500 |
B
| | 2015 | 2016 | 2017 | 2018 | 2019 |
|---|---|---|---|---|---|
| Income (£) | 100,000 | 140,000 | 180,000 | 220,000 | 250,000 |
| Surplus (£) | 6,500 | 10,000 | 16,000 | 20,000 | 24,000 |
- 13. Tip 2: Columns of data are almost
always easier to read than rows
Put the latest data, or the biggest numbers, at
the top of the table
N.B. You may not be able to do this if the data is
time-based
Columns of data allow the eye to scan up and
down more easily
- 14. How would you improve this?
| Project spending (£k) | Q1 | Q2 | Q3 | Q4 |
|---|---|---|---|---|
| Project A | 34.4 | 32.1 | 27.7 | 32.2 |
| Project B | 148.6 | 139.6 | 144.3 | 166.5 |
| Project C | 305.7 | 284.4 | 245.3 | 377.8 |
| Project D | 25.8 | 29.2 | 24.9 | 27.8 |
| Project E | 256.7 | 242.1 | 212.9 | 243.0 |
| Project F | 68.5 | 73.3 | 67.9 | 84.6 |
- 15. Better?
| Project spending (£k) | Q1 | Q2 | Q3 | Q4 | Total |
|---|---|---|---|---|---|
| Project C | 310 | 280 | 250 | 380 | 1220 |
| Project E | 260 | 240 | 210 | 240 | 950 |
| Project B | 150 | 140 | 140 | 170 | 600 |
| Project F | 69 | 73 | 68 | 85 | 295 |
| Project A | 34 | 32 | 28 | 32 | 126 |
| Project D | 26 | 29 | 25 | 28 | 108 |
| Total | 849 | 794 | 721 | 935 | 3299 |
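The improvements in this table (row and column totals, biggest project first) can be reproduced with a few lines of code. This is a minimal sketch using the rounded figures from the slide; the dictionary layout is an assumption made for illustration.

```python
# Rounded quarterly spending (£k) per project, as shown in the table above.
spending = {
    "Project A": [34, 32, 28, 32],
    "Project B": [150, 140, 140, 170],
    "Project C": [310, 280, 250, 380],
    "Project D": [26, 29, 25, 28],
    "Project E": [260, 240, 210, 240],
    "Project F": [69, 73, 68, 85],
}

# Total per project, then rank projects from biggest to smallest.
row_totals = {name: sum(values) for name, values in spending.items()}
ranked = sorted(row_totals, key=row_totals.get, reverse=True)

# Total per quarter (one column at a time) and the overall total.
column_totals = [sum(col) for col in zip(*spending.values())]
grand_total = sum(column_totals)

for name in ranked:
    print(f"{name:<10}", *spending[name], row_totals[name])
print(f"{'Total':<10}", *column_totals, grand_total)
```

Sorting by the row total puts the projects that matter most at the top, in line with Tip 2.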
- 16. Which chart(s) would you use?
[Three charts of Project Spending (£k): one by project with a series for each quarter (axis 0 to 400); one by quarter with a series for each project (axis 0 to 1000); one by quarter with a series for each project (axis 0 to 400)]
- 17. What can you conclude from this?
| # of Beneficiary requests | Q1 | Q2 | Q3 | Q4 |
|---|---|---|---|---|
| Support Type A | 370 | 350 | 320 | 350 |
| Support Type B | 160 | 150 | 150 | 180 |
| Support Type C | 47 | 51 | 46 | 63 |
| Support Type D | 42 | 40 | 36 | 40 |
What else would you want to know?
- 18. Better, plus charts?
| # of Beneficiary requests | Q1 | Q2 | Q3 | Q4 | Total | Average |
|---|---|---|---|---|---|---|
| Support Type A | 370 | 350 | 320 | 350 | 1390 | 348 |
| Support Type B | 160 | 150 | 150 | 180 | 640 | 160 |
| Support Type C | 47 | 51 | 46 | 63 | 207 | 52 |
| Support Type D | 42 | 40 | 36 | 40 | 158 | 40 |
| Total | 619 | 591 | 552 | 633 | 2395 | |
| Average | 155 | 148 | 138 | 158 | | |
[Two charts of Beneficiary Requests: one by quarter with a series for each support type; one by support type with a series for each quarter]
- 19. Tip 3: When to use Tables for data
Use tables when you have ten, or fewer data
points, or if you need people to see the exact
numerical values in your results
Round the data to two effective digits unless
readers need the precise numbers
- 20. Charts or Tables? - Summary
Charts Tables
Fewer than 6 data points
7 – 10 data points
More than 10 data points
Need to see individual values
Need to show trends over time
Need to show the distribution / variation in data
More than 1 independent variable
- 21. HOW MUCH DATA IS
ENOUGH?
“Anecdotes are not statistics.”
- 22. Sampling
In many cases, we obtain data through sampling; often
because it is simply not possible to measure every single
item, or to log every activity, transaction or contact
The purpose of sampling is to collect an unbiased subset
which will give you a manageable amount of data
When you take samples, they should be representative
(statistically valid and reliable) and economic to collect
(quick and cost-effective)
- 23. Population vs. Sample
Beneficiary Satisfaction
Unhappy Happy
If we surveyed every single
beneficiary over a year to find
out how happy they were with
our support, this is what we
might find.
- 24. Contact Centre’s sample of 10 people
Customer Satisfaction
Unhappy Happy
How happy are our
beneficiaries according
to our staff?
- 25. Volunteers’ sample of 10 people
Customer Satisfaction
Unhappy Happy
How happy are our
beneficiaries according
to our volunteers?
- 26. Validation: depends on sample size
Customer Satisfaction
Unhappy Happy
Your ability to validate
beneficiary satisfaction data
depends on sample size.
If you pick too small a sample
you could, purely by chance,
find very different results and
draw the wrong conclusions.
- 28. If you have 6000 beneficiaries per year
6000 beneficiaries per year is 500 beneficiaries/month
For a Confidence Interval of + or - 3, sample about 75 beneficiaries/month
You might, therefore, say if 83% of
beneficiaries are ‘Happy’:
“We are 95% confident that between
80 and 86% of beneficiaries are Happy”
You can also work out
the CI for a known
sample size
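The figures on this slide are consistent with the standard sample-size formula for a proportion with a finite population correction. The sketch below assumes that formula (the slide does not say which calculator was used); `sample_size` is a hypothetical helper name.

```python
def sample_size(population, margin, z=1.96, p=0.5):
    """Sample size for estimating a proportion in a finite population.

    margin is the +/- figure as a fraction (0.03 for + or - 3 points).
    z=1.96 corresponds to a 95% Confidence Level.
    p=0.5 is the most cautious assumption about the true proportion.
    """
    # Sample size for a very large population.
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # Reduce it to allow for the population being finite.
    return n0 / (1 + (n0 - 1) / population)

n_year = sample_size(population=6000, margin=0.03)
print(f"Per year: about {n_year:.0f}; per month: about {n_year / 12:.1f}")
```

Under these assumptions the yearly sample comes out at roughly 900, which is in line with the 75 beneficiaries per month quoted above.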
- 29. Terms you need to understand
Confidence Interval (Margin of Error)
The plus-or-minus figure usually reported in
newspaper or television opinion poll results
If you pick a CI of 5 and 83% of your sample picks
‘Happy’, you can be “sure” that between 78% and 88% of the
entire population would have picked ‘Happy’
Confidence Level
Tells you how “sure” you can be that the population
would pick an answer within the Confidence Interval
A 95% CL is most commonly used and means, for the
example above, you can be 95% sure that the true
population figure is between 78% and 88%
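To see how the plus-or-minus figure depends on sample size, the sketch below uses the usual normal approximation for a proportion. This is an assumed method, not one stated on the slides, and the sample size of 200 is a hypothetical value chosen because it gives a margin of about 5 points.

```python
import math

def margin_of_error(p, n, population=None, z=1.96):
    """Approximate +/- margin (as a fraction) for an observed proportion p."""
    se = math.sqrt(p * (1 - p) / n)
    if population is not None:
        # Finite population correction: sampling a large share of a
        # small population narrows the interval.
        se *= math.sqrt((population - n) / (population - 1))
    return z * se

p, n = 0.83, 200   # 83% 'Happy' in a hypothetical sample of 200
moe = margin_of_error(p, n)
print(f"{p:.0%} +/- {moe:.1%}, i.e. between {p - moe:.0%} and {p + moe:.0%}")
```

Quadrupling the sample size only halves the margin, which is why very tight intervals become expensive to collect.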
- 30. Tip 4: Sampling guidelines
With static populations (e.g. customers, staff), use
random sampling; for example using Random Number
Tables to decide what (and when) to sample
Random sampling means that every unit in a population will have
an equal probability of being chosen in the sample
With time-based data, collect data in sub-groups of 5
values, equally spaced in time (e.g. services are
delivered, or transactions are carried out continuously
over a period of time – call handling in a contact centre)
If it is not feasible to take sub-groups, take individual values at
regular intervals; e.g. every 10th or 100th
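Both guidelines can be sketched in code, with a pseudo-random number generator standing in for Random Number Tables. The population of 500 record IDs and the sample size of 75 are hypothetical values used only for illustration.

```python
import random

# Hypothetical population: 500 beneficiary record IDs.
population = list(range(1, 501))

# Simple random sample: every record has the same chance of selection.
rng = random.Random(2019)   # fixed seed so the draw can be repeated
random_sample = rng.sample(population, k=75)

# Systematic sample for time-ordered data: every 10th record,
# starting at a randomly chosen position within the first interval.
interval = 10
start = rng.randrange(interval)
systematic_sample = population[start::interval]

print(len(random_sample), "records chosen at random")
print(len(systematic_sample), "records chosen at regular intervals")
```

Choosing the starting position at random avoids always sampling the same point in a repeating cycle, such as the first call of each hour.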
- 33. Do you know what “average” means?
The length of time (in days) taken for 10 grant
applications to be processed was recorded
What was the average time it took
(from application received to completion)?
| Grant | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Days | 6 | 6.5 | 7 | 7 | 7 | 7.5 | 8 | 8 | 10 | 13 |
- 34. Mean, Median and Mode
Arithmetic Mean - the sum of values divided by the number of
values, often called “the average” (8.0 in our example)
Median - the middle value when all the values are arranged in order
[or the mean of the two middle values if there is an even number in
the list] (7.25 in our example)
Mode - the most frequently occurring value (7 in our example)
If the data is distributed symmetrically, the Mean = the Median; a Mean well away from the Median suggests skewed data
The Median and Mode are much less affected by extreme values in a set of
data than the Mean
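The three averages for the ten grant applications can be checked with Python's standard `statistics` module. The second half of the sketch drops the slowest application to show how much a single extreme value moves each measure.

```python
import statistics

# Processing time (days) for the 10 grant applications.
days = [6, 6.5, 7, 7, 7, 7.5, 8, 8, 10, 13]

print("Mean:  ", statistics.mean(days))     # 8.0
print("Median:", statistics.median(days))   # 7.25
print("Mode:  ", statistics.mode(days))     # 7

# Remove the slowest (13-day) application and recalculate.
without_slowest = sorted(days)[:-1]
print("Mean without the 13-day case:  ", round(statistics.mean(without_slowest), 2))
print("Median without the 13-day case:", statistics.median(without_slowest))
```

The Mean falls by more than half a day when the 13-day case is removed, while the Median moves by a quarter of a day.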
- 35. Which “average” would you use & why?
[Two histograms, A and B, each titled “Time to respond to Grant Application (Days)”, N = 33. Horizontal axis: Time to respond (Days), 1 to 10; vertical axis: No. of cases, 0 to 12]
- 36. Some more questions…
Which one would you want to be held accountable for managing?
Where would you set a Service Level Agreement?
[Two histograms, each titled “Time to respond to Grant Application (Days)”, N = 33. Horizontal axis: Time to respond (Days), 1 to 10; vertical axis: No. of cases, 0 to 12]
| | Mean | Median |
|---|---|---|
| Left chart | 3.6 | 3 |
| Right chart | 4.7 | 5 |
- 37. You also need to understand Variation
[Four histogram shapes: Bell-shaped, Skewed, Plateau, Bi-modal]
- 38. What a Histogram might tell you
Bell-shaped - a symmetrically shaped distribution which typically
represents data randomly distributed, but clustered around a central
value
Positive or negative skews - where the average value of the whole
set of data is to the left (-) or right (+) of the central value. Look out
for specification limits at the boundaries of the distribution which
might be causing data to be dropped from the population. More
extreme shapes are also known as “precipices”
Bimodal - where there are two peaks. Usually indicates that two sets of
data (e.g. two teams or locations) with different Means have been
mixed
Plateau - occurs where several sets of data have been mixed (e.g.
from a number of customers/locations/groups)
- 41. Tip 5: When to use Graphs for data
Use graphs when you have more than ten
data points, or if you want to show people
“the big picture”, not detailed data
Use graphs when you need to show trends,
over time
Don’t clutter a graph with too many different
sets of data; it’s usually better to split the data
into separate graphs
- 42. Pie Charts
The data points in a Pie
Chart are displayed as a
percentage of the whole
pie
Good for: showing
proportions, at a glance
Not good for: showing
trends or comparisons
over time
- 43. Bar Charts
In Bar Charts, categories are
typically organised along the
horizontal axis and values up
the vertical axis
Bar Charts illustrate
comparisons among individual
items and may be “stacked” or
“100% stacked”
Good for: showing quantities of
responses in different
categories; often best when
sorted into biggest to smallest
Not good for: showing trends
over time (use a Line Graph)
- 44. Histograms
In Histograms, a variable (e.g.
Time) is displayed along the
horizontal axis and frequency up
the vertical axis
Good for: showing the variation
in a set of data and to help
decide if the Mean or Median
are the best choice of average
to quote
Not good for: showing variations
over time
N.B. Excel also calls these “Bar
Charts”
- 46. Pareto Analysis
80% of problems or errors are often due to only 20% of the
causes (The “Vital Few”)
The remaining 80% of causes account for only 20% of the
problems or errors (The “Trivial Many”)
Also known as the 80:20 rule
[Diagram: 20% of Causes (the “Vital Few”) linked to 80% of Problem Occurrences; 80% of Causes (the “Trivial Many”) linked to the remaining 20%]
- 47. Pareto Diagram
A Pareto Diagram is a
particular type of Bar Chart
Category data is presented in
decreasing size, from left to
right and a Cumulative % line
is also drawn
Good for: showing the 80:20
Rule – highlighting the few
categories that account for the
majority of performance or
issues
Not good for: showing data
over time (but sometimes worth
showing “before” and “after”)
- 48. Example: Sources of Income
| | Inc. (£k) | Cum. £k | Inc. (%) | Cum. % |
|---|---|---|---|---|
| National Lottery | 241 | 241 | 47% | 47% |
| Trusts & Foundations | 110 | 351 | 21% | 69% |
| Fundraising events | 44 | 395 | 9% | 77% |
| Local Authority Grant | 31 | 426 | 6% | 83% |
| 1-off donations | 28 | 454 | 5% | 89% |
| Commercial sponsors | 18 | 472 | 4% | 92% |
| Regular individual donations | 15 | 487 | 3% | 95% |
| Merchandise | 13 | 500 | 3% | 98% |
| Major donors | 7 | 507 | 1% | 99% |
| Legacies | 5 | 512 | 1% | 100% |
| Total | 512 | | | |
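The cumulative columns in this example can be rebuilt from the income figures alone. The sketch below sorts the sources from biggest to smallest, accumulates the totals and picks out the sources needed to reach 80% of income; the 80% threshold follows the 80:20 rule, and the variable names are illustrative.

```python
# Income by source (£k), from the table above.
income_k = {
    "National Lottery": 241,
    "Trusts & Foundations": 110,
    "Fundraising events": 44,
    "Local Authority Grant": 31,
    "1-off donations": 28,
    "Commercial sponsors": 18,
    "Regular individual donations": 15,
    "Merchandise": 13,
    "Major donors": 7,
    "Legacies": 5,
}

total = sum(income_k.values())
ranked = sorted(income_k.items(), key=lambda item: item[1], reverse=True)

rows = []          # (source, income, cumulative income, income %, cumulative %)
vital_few = []     # sources needed to reach 80% of total income
cumulative = 0
for source, amount in ranked:
    if cumulative < 0.8 * total:
        vital_few.append(source)
    cumulative += amount
    rows.append((source, amount, cumulative,
                 round(100 * amount / total),
                 round(100 * cumulative / total)))

for row in rows:
    print("{:<30}{:>5}{:>6}{:>5}%{:>5}%".format(*row))
print("Vital few:", ", ".join(vital_few))
```

Here four of the ten sources are enough to pass 80% of income, so the split is closer to 80:40 than 80:20, but the focus it gives is the same.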
- 51. Line Graphs
In a Line Graph, time data is
distributed evenly along the
horizontal axis, and all value
data is distributed up the vertical
axis
Good for: showing how results
have changed over time (trends)
Not good for: comparing lots of
different sets of results (too
many lines make it hard to see
what's going on)
N.B. Excel enables you to
overlay a statistically derived
trend line
- 52. What can you conclude from this data?
Is weekly caseload increasing, decreasing, or not changing?
| Week | New Cases | Week | New Cases |
|---|---|---|---|
| 1 | 35 | 11 | 45 |
| 2 | 79 | 12 | 37 |
| 3 | 125 | 13 | 102 |
| 4 | 85 | 14 | 47 |
| 5 | 60 | 15 | 52 |
| 6 | 3 | 16 | 16 |
| 7 | 138 | 17 | 9 |
| 8 | 120 | 18 | 86 |
| 9 | 116 | 19 | 60 |
| 10 | 40 | 20 | 66 |
- 53. 4 Week Moving Average
Weekly caseload is decreasing
- 54. 4 Week Moving Average
[Same table of weekly New Cases, weeks 1 to 20, as on slide 52]
Quick & dirty:
Weeks 1-10 average = 80
Weeks 11-20 average = 52
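A 4 week moving average and the "quick & dirty" comparison can both be calculated directly from the weekly figures. The sketch assumes a trailing window, so each value is the mean of that week and the three before it; the slides do not state how the window was aligned.

```python
# New cases per week, weeks 1 to 20.
new_cases = [35, 79, 125, 85, 60, 3, 138, 120, 116, 40,
             45, 37, 102, 47, 52, 16, 9, 86, 60, 66]

def moving_average(values, window=4):
    """Mean of each run of `window` consecutive values."""
    return [sum(values[i - window:i]) / window
            for i in range(window, len(values) + 1)]

ma4 = moving_average(new_cases)
for week, value in enumerate(ma4, start=4):
    print(f"Week {week:>2}: {value:6.1f}")

# Quick & dirty: compare the mean of the first and second ten weeks.
first_half = sum(new_cases[:10]) / 10
second_half = sum(new_cases[10:]) / 10
print(f"Weeks 1-10 average = {first_half:.0f}")
print(f"Weeks 11-20 average = {second_half:.0f}")
```

Averaging over four weeks smooths out swings such as week 6 (3 cases) followed by week 7 (138 cases), which makes the underlying direction easier to see.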
- 56.
ian.seath@improvement-skills.co.uk
07850 728506
@ianjseath
uk.linkedin.com/in/ianjseath
Prepared for Measuring the Good and Coalition for Efficiency by
Ian J Seath
Improvement Skills Consulting Ltd.
www.improvement-skills.co.uk
Editor's notes
- Time to fly the square = 1 + .5 + .33 + .25 hours
= 2.08 hours
400 miles / 2.08 hours = 192 mph
- 63-47 = 16
16/47 is close to 16/48, which is a third (33%)
46-30 = 16
16/30 is close to 15/30, which is a half (50%)
- Mean = 8 days
- 73% in 5 days (left)
76% in 5 days (right)