3. Copyright BBC 2008 - Confidential
Clustering Users - Process
Data Preparation
The data set consisted of sampled users who
had at least 2 streaming play events. Genres
and formats were represented as flags (1 –
yes/0 – no).
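The preparation step above can be sketched as follows. This is a minimal illustration, not the original SAS data step: the function name, the (user, genre) event layout and the sample events are all assumptions made for the example.

```python
from collections import Counter

def build_flag_matrix(play_events, min_plays=2):
    """Turn (user_id, genre) play events into 1/0 genre flags per user.

    Only users with at least `min_plays` play events are kept.
    """
    events = list(play_events)
    plays_per_user = Counter(user for user, _ in events)
    genres = sorted({genre for _, genre in events})
    seen = {}
    for user, genre in events:
        if plays_per_user[user] >= min_plays:
            seen.setdefault(user, set()).add(genre)
    # One flag per genre: 1 = yes (user played it), 0 = no.
    rows = {user: [1 if g in user_genres else 0 for g in genres]
            for user, user_genres in seen.items()}
    return genres, rows

# Hypothetical events, not taken from the iPlayer sample.
events = [("u1", "soaps"), ("u1", "comedy"), ("u2", "children"),
          ("u3", "comedy"), ("u3", "comedy")]
genres, rows = build_flag_matrix(events)
print(genres)  # ['children', 'comedy', 'soaps']
print(rows)    # {'u1': [0, 1, 1], 'u3': [0, 1, 0]}
```

User u2 is dropped because it has only one play event.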
Cluster Method
Various options were considered for clustering,
but after experimentation a two-stage
approach was adopted. This involved using K-means
to define a relatively large number of
clusters (25-50) and then applying a hierarchical
method to refine the groups.
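A rough sketch of the two-stage idea, assuming plain Lloyd's K-means for the first stage and Ward's criterion applied to the K-means centroids (weighted by cluster size) for the second. The function names, parameters and random sample data are illustrative only; the original work was done in SAS with its own options.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k, iterations=20, seed=0):
    """Stage 1: Lloyd's algorithm. Returns centroids and sizes of non-empty clusters."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            groups[nearest].append(p)
        # Recompute each centroid as the mean of its group; keep the old one if empty.
        centroids = [[sum(col) / len(g) for col in zip(*g)] if g else c
                     for g, c in zip(groups, centroids)]
    kept = [(c, len(g)) for c, g in zip(centroids, groups) if g]
    return [c for c, _ in kept], [n for _, n in kept]

def ward_merge(centroids, sizes, n_final):
    """Stage 2: repeatedly merge the pair of clusters with the smallest Ward cost."""
    clusters = [([i], list(c), n) for i, (c, n) in enumerate(zip(centroids, sizes))]
    while len(clusters) > n_final:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                _, ca, na = clusters[a]
                _, cb, nb = clusters[b]
                cost = na * nb / (na + nb) * sq_dist(ca, cb)
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        ma, ca, na = clusters[a]
        mb, cb, nb = clusters[b]
        centre = [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)]
        merged = (ma + mb, centre, na + nb)
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)] + [merged]
    return clusters  # list of (stage-1 cluster ids, centroid, size)

# Hypothetical binary flag data standing in for the user-by-genre matrix.
rng = random.Random(1)
points = [[rng.randint(0, 1) for _ in range(6)] for _ in range(60)]
centroids, sizes = k_means(points, k=8)
final = ward_merge(centroids, sizes, n_final=3)
for members, _, size in final:
    print(sorted(members), size)
```

Because the second stage only compares the stage-1 centroids, its cost depends on the number of K-means clusters rather than on the number of users.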
Issues
• Data reduction techniques were not effective
at producing standardised variables for inputs
• Hierarchical clustering took a long time to
process for relatively small data sets (c.3,000
users)
• Main genre and formats alone were
considered not to produce enough
differentiation
Solution 1
K-Means cluster of the main genre. Default options, 50
cluster solution. Sub genre as inputs for 4,200 users
with 2+ play events.
The output from K-means was used as the input to the
hierarchical technique to overcome excessive processing
times. Several methods were tried (SAS,
2008a), including average linkage, single linkage, Ward's, and
maximum likelihood (ML), to find the clearest cluster
solution. A dendrogram was produced for each method;
ML and Ward's were the most effective, giving 7- and 6-cluster
solutions respectively.
Profiling suggested that clusters included programme
specific groupings (Ashes-to-Ashes and Torchwood),
genres (art / media / culture / documentary, soaps,
comedy and children’s) and a general group.
Solution 2
Same process as Solution 1, but with the number of
clusters increased to 11, extending the classification to include Top
Gear and Science / Nature / Technology, and splitting
comedy between sketch shows and sitcoms.
Solution 3
Re-ran the initial clustering as K-medoids using an
option to minimise the mean absolute difference between the
data and the cluster medians (SAS, 2008b). This was
expected to produce more robust results for binary
data. The results are shown over the page.
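As an illustration of clustering around medians rather than means, the sketch below assigns each user to the nearest centre by absolute (L1) distance and updates each centre to the per-variable median of its members. It is an assumption about what the SAS option does in spirit, not a reproduction of it; the names and sample data are hypothetical.

```python
import random
from statistics import median_low

def l1_dist(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def k_medians(points, k, iterations=20, seed=0):
    """Cluster by minimising absolute differences to per-variable cluster medians."""
    rng = random.Random(seed)
    centres = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centre by L1 distance.
        labels = [min(range(k), key=lambda i: l1_dist(p, centres[i]))
                  for p in points]
        # Update step: median of each variable; median_low keeps flags at 0 or 1.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centres[i] = [median_low(col) for col in zip(*members)]
    return centres, labels

# Hypothetical binary flag data.
rng = random.Random(2)
points = [[rng.randint(0, 1) for _ in range(5)] for _ in range(40)]
centres, labels = k_medians(points, k=4)
print(centres)
```

With 1/0 flags the median of a variable is simply the majority value in the cluster, so every centre is itself a valid flag pattern, which is one reason a median-based method may suit binary data.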
4. Copyright BBC 2008 - Confidential
Clustering Users – Final Dendrogram
using Ward's Method
Source: iPlayer Sample – 1st/2nd March
[Figure: dendrogram with a marked cut-off; cluster labels 1 to 10]
5. Copyright BBC 2008 - Confidential
Clustering Users - Results II
Source: iPlayer Sample - 1st/2nd March
Index values by cluster (columns 1 to 10 are cluster numbers)
Rank | Programme                                 |    1 |    2 |    3 |    4 |    5 |    6 |    7 |    8 |    9 |   10
   1 | EastEnders                                |   66 |   71 |   39 |   59 |    6 |   63 |   47 |  519 |   78 |   52
   2 | Friday Night with Jonathan Ross Series 14 |   46 |   42 |  110 |  123 |  133 |  127 |   82 |   78 |  105 |   57
   5 | Ashes to Ashes                            |   20 |    7 |  131 |    0 |    1 |    0 |    0 |    0 |  892 |  146
   6 | That Mitchell and Webb Look Series 2      |   39 |   44 |  189 |  518 |   10 |    0 |  108 |    0 |  109 |   80
   7 | Torchwood Series 2                        |   72 |   53 |  157 |    0 |    2 |    0 |    0 |    0 |   57 | 1119
   9 | Little Britain Series 2                   |  127 |   99 |  209 |  475 |   14 |    0 |  107 |    0 |   71 |   58
  10 | Top Gear Series 8                         |   95 |   62 |  150 |  101 |   21 |  786 |   25 |    0 |   74 |   52
  11 | Top Gear Series 10                        |  101 |   73 |  176 |   82 |   22 |  771 |   29 |    0 |   77 |   57
  12 | Life in Cold Blood                        |  139 |   67 |  158 |    0 |    5 |  110 |  878 |    0 |   96 |   66
  18 | Arthur Series 12                          |  522 |  731 |   10 |    0 |    5 |    0 |    0 |    0 |   35 |    4
  20 | Doctors Series 9                          |   60 |   74 |   46 |   37 |    6 |   29 |   35 |  546 |   93 |   66
  23 | Horizon                                   |  114 |   43 |  314 |    0 |    8 |  114 |  913 |    0 |   69 |   41
  26 | The Pingu Show                            |  393 |  765 |   25 |    0 |    1 |    0 |    0 |    0 |   52 |    0
  27 | ChuckleVision Series 17                   |  466 |  739 |   22 |    0 |    4 |    0 |    0 |    0 |   46 |    0
  28 | The Armstrong and Miller Show             |   63 |   40 |  212 |  527 |    9 |    0 |  101 |    0 |  102 |   69
  29 | Bizarre ER                                |  108 |   86 |  141 |   58 |  138 |   63 |   82 |   86 |   59 |  150
  30 | Spartacus                                 |  133 |   37 | 1887 |    0 |   67 |  133 |   69 |   38 |   14 |   16
  37 | In The Night Garden Series 1              |  285 |  811 |    0 |    0 |    0 |    0 |    0 |    0 |   48 |    0
  50 | The Secret World of Haute Couture         |   81 |   56 |  419 |   86 |  136 |  101 |  131 |   51 |   68 |   28
  62 | The Hard Sell                             |  134 |   51 | 1213 |    0 |  110 |   93 |   84 |   32 |   93 |    8
  70 | Escape to the Country Series 6            |  121 |   15 |   98 |    0 |  298 |    0 |    3 |    0 |   33 |   39
  71 | Homes Under the Hammer Series 10          |  115 |   29 |   57 |    0 |  300 |    0 |    6 |    0 |   19 |   42
  79 | Rogue Traders Series 6                    |  162 |   64 |  134 |    0 |  261 |    0 |   24 |    0 |   64 |   20
 100 | The One Show                              |  127 |   82 |  126 |   22 |  224 |   48 |   44 |   41 |   24 |   41
Cluster
Genre-specific clusters
1 – Family (Children & General)
2 – Children’s
3 – Art, Media and Culture
4 – Comedy (Sketch Shows)
5 – Magazine & Lifestyle
7 – Science & Nature
8 – Soaps
Programme-specific clusters
6 – Top Gear
9 – Ashes to Ashes
10 – Torchwood
Index values are used to show
the relative number of plays
against the cluster size. A value
of 100 is average while over
100 shows a greater propensity
than expected.
Key Index
500+
400-499
300-399
200-299
150-199
149 or less
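One plausible way to compute an index of this kind is sketched below. The exact formula is not given on the slide, so this assumes index = 100 × (cluster's share of a programme's plays) ÷ (cluster's share of all users); the function names and the sample figures are hypothetical. The bands are the ones listed in the key above.

```python
def play_index(plays_by_cluster, cluster_sizes):
    """Index each cluster's share of plays against its share of users (100 = average)."""
    total_plays = sum(plays_by_cluster)
    total_users = sum(cluster_sizes)
    return [round(100 * plays * total_users / (total_plays * size))
            for plays, size in zip(plays_by_cluster, cluster_sizes)]

def index_band(value):
    """Map an index value to the band used in the key."""
    for floor, label in [(500, "500+"), (400, "400-499"), (300, "300-399"),
                         (200, "200-299"), (150, "150-199")]:
        if value >= floor:
            return label
    return "149 or less"

# Hypothetical plays of one programme across three clusters, not BBC data.
plays = [10, 30, 60]
sizes = [100, 100, 200]
indices = play_index(plays, sizes)
print(indices)                           # [40, 120, 120]
print([index_band(v) for v in indices])  # ['149 or less', '149 or less', '149 or less']
```

In this example the first cluster holds 25% of users but only 10% of plays, so it indexes at 40, well below the average of 100.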
Editor's Notes
SLIDE FIVE: LINEAR TV AND BBC iPLAYER (CURRENTLY IN DRAFT, WILL BE ADDED TO PRESENTATION)
The scale has been adjusted to show how three individual programmes perform on BBC iPlayer relative to their linear TV rankings. We picked these three as a selection of mainstream, breakthrough and niche programmes.
Early indications are that programmes which skew young and niche may perform ‘disproportionately well’ on iPlayer.
Mainstream: EastEnders (BBC One) receives a lot of iPlayer requests, but iPlayer users make up only 2% of its whole audience.
Breakthrough: Gavin & Stacey (BBC Three) receives 240,000 requests on iPlayer, making up 7% of its total audience.
Niche: MI High attracts a relatively small BARB audience on CBBC but does extremely well, proportionately speaking, on iPlayer, which accounts for 17% of its audience.