8. True rating
• Based on the law of large numbers, average
rating using this movies dataset can work very
well.
• However, by using a small sample, average
rating would significantly differ from the true
rating.
10. Kendall tau
• Statistic used to measure the degree of
similarity between two rankings.
• Practical use:
– Compare how close the top-10 results produced
by Google and Bing are.
11. A
B
C
D
E
C
D
A
B
E
Rank 1 Rank 2
T = 6 - 4 / ½ (5) (5-1) = 0.2
Dissimilarity goes from 1 to -1 where 1 means the two rankings are
the same and -1 means one ranking is the reverse of the other