statistics
1. Intro to Research in Information Studies. Inferential Statistics: Standard Error of the Mean; Significance; Inferential tests you can use
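2. Do you speak the language?

$$
t = \frac{\bar{X}_A - \bar{X}_B}
{\sqrt{\left[\dfrac{\left(\sum X_A^2 - \dfrac{(\sum X_A)^2}{n_1}\right) + \left(\sum X_B^2 - \dfrac{(\sum X_B)^2}{n_2}\right)}{(n_1 - 1) + (n_2 - 1)}\right] \times \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}
$$

3. Don’t Panic! The same formula, annotated: the numerator, $\bar{X}_A - \bar{X}_B$, is the difference between means. Compare the bracketed term, sums of squares divided by $(n_1 - 1) + (n_2 - 1)$, with the SD formula.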
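The formula on these two slides appears to be the pooled-variance t for two independent groups, written with raw sums. A minimal Python sketch under that assumption follows; the function name `pooled_t` and the example numbers are illustrative, not taken from the slides.

```python
import math


def pooled_t(a, b):
    """Independent-samples t in the raw-score (sum of squares) form."""
    n1, n2 = len(a), len(b)
    mean_a = sum(a) / n1
    mean_b = sum(b) / n2
    # Sum of squared deviations per group: sum(X^2) - (sum X)^2 / n
    ss_a = sum(x * x for x in a) - sum(a) ** 2 / n1
    ss_b = sum(x * x for x in b) - sum(b) ** 2 / n2
    # Pooled variance: combined sums of squares over combined d.f.
    pooled_var = (ss_a + ss_b) / ((n1 - 1) + (n2 - 1))
    # Standard error of the difference between the two means
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean_a - mean_b) / se_diff


# Made-up scores for two small groups
t_value = pooled_t([1, 2, 3], [4, 5, 6])
print(round(t_value, 3))
```

The sign of t only reflects which group is listed first; the degrees of freedom for the t-table lookup are (n1 - 1) + (n2 - 1).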
11. If we recalculate the variance with the 60 instead of the 5 in the data…
12. If we include a large outlier, note the increase in SD. Like the mean, the standard deviation uses every piece of data and is therefore sensitive to extreme values.
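To see the effect in numbers, here is a small Python sketch. The scores are hypothetical (the slide's own data set is not reproduced here); only the idea of swapping a 5 for a 60 comes from the slides.

```python
import statistics

# Hypothetical scores, not the data from the slides
scores = [5, 7, 8, 9, 10]
with_outlier = [60, 7, 8, 9, 10]  # the 5 replaced by 60

sd_before = statistics.stdev(scores)
sd_after = statistics.stdev(with_outlier)

# The median barely moves, while the SD grows by an order of magnitude
median_before = statistics.median(scores)
median_after = statistics.median(with_outlier)

print(f"SD: {sd_before:.2f} -> {sd_after:.2f}")
print(f"Median: {median_before} -> {median_after}")
```

`statistics.stdev` uses the n - 1 divisor, which is assumed here to match the sample SD used in the course.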
13. Two sets of data can have the same mean but different standard deviations. The bigger the SD, the more s-p-r-e-a-d out are the data.
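A quick illustration in Python, using two invented data sets that share a mean of 50:

```python
import statistics

# Invented data: both sets have mean 50
tight = [48, 49, 50, 51, 52]
wide = [10, 30, 50, 70, 90]

for name, data in (("tight", tight), ("wide", wide)):
    mean = statistics.mean(data)
    sd = statistics.stdev(data)  # sample SD (n - 1 divisor)
    print(f"{name}: mean = {mean}, SD = {sd:.2f}")
```

The means are identical, so only the SD reveals how differently the two sets are spread.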
15. Summary
Measures of Central Tendency
• Mode: Most frequent observation. Use with nominal data.
• Median: ‘Middle’ of data. Use with ordinal data or when data contain outliers.
• Mean: ‘Average’. Use with interval and ratio data if no outliers.
Measures of Dispersion
• Range: Dependent on two extreme values.
• Interquartile Range: More useful than range. Often used with median.
• Variance / Standard Deviation: Same conditions as mean. With mean, provides excellent summary of data.
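All six summary measures can be computed with Python's standard `statistics` module. This is a sketch only: the helper name `summarise` and the data are made up, and quartile conventions differ between textbooks and packages, so the interquartile range may not match a hand calculation exactly.

```python
import statistics


def summarise(data):
    """Return the central tendency and dispersion measures from the summary."""
    # quantiles(n=4) returns the three quartile cut points (Q1, Q2, Q3)
    q1, _, q3 = statistics.quantiles(data, n=4)
    return {
        "mode": statistics.mode(data),
        "median": statistics.median(data),
        "mean": statistics.mean(data),
        "range": max(data) - min(data),
        "iqr": q3 - q1,
        "variance": statistics.variance(data),  # sample variance (n - 1)
        "sd": statistics.stdev(data),            # sample SD (n - 1)
    }


# Made-up error counts
summary = summarise([2, 3, 3, 4, 5, 6, 7])
for name, value in summary.items():
    print(name, value)
```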
16. Deviation units: z scores. Any data point can be expressed in terms of its distance from the mean in SD units: z = (X - mean) / SD. A positive z score implies a value above the mean; a negative z score implies a value below the mean. [Andrew Dillon: Move this to later in the course, after distributions?]
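As a small Python sketch of the conversion (the function name `z_score` and the scores are illustrative, and the sample SD with the n - 1 divisor is assumed):

```python
import statistics


def z_score(x, data):
    """Distance of x from the mean of data, in SD units."""
    return (x - statistics.mean(data)) / statistics.stdev(data)


# Made-up test scores with a mean of 5
scores = [2, 4, 4, 4, 5, 5, 7, 9]

print(z_score(9, scores))  # positive: above the mean
print(z_score(2, scores))  # negative: below the mean
print(z_score(5, scores))  # zero: exactly at the mean
```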
20. Graphing data: the histogram. One axis shows the categories of data we are studying, e.g., task, interface, or user group. The other axis (here, number of errors) shows the frequency of occurrence for the measure of interest, e.g., errors, time, scores on a test. The graph gives an instant summary of the data: check spread, similarity, outliers, etc.
23. The Normal Curve. NB: position of the measures of central tendency: mean, median and mode all sit at the centre of the curve. 50% of scores fall below the mean. [Figure: normal curve, frequency (f) on the vertical axis.]
24. Positively skewed distribution. Note how the measures of central tendency now separate, and note the direction of the change: the mode moves left of the other two and the mean stays highest, indicating that most scores fall below the mean. [Figure: frequency (f) curve with mode, median and mean marked from left to right.]
25. Negatively skewed distribution. Here higher values are more common, which increases the value of the mode. [Figure: frequency (f) curve with mean, median and mode marked from left to right.]
27. Bimodal. Will occur in situations where there might be distinct groups being tested, e.g., novices and experts. Note how each mode is itself part of a normal distribution (more later). [Figure: frequency (f) curve with two modes; mean and median lie between them.]
28. Standard deviations and the normal curve. 68% of observations fall within ± 1 s.d. of the mean; approximately 95% fall within ± 2 s.d. [Figure: normal curve, frequency (f) on the vertical axis, with the mean and 1 s.d. steps marked on either side.]
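These percentages can be checked against the standard normal curve with `statistics.NormalDist` (Python 3.8+). A minimal sketch; the helper name `share_within` is made up for illustration.

```python
from statistics import NormalDist


def share_within(k):
    """Proportion of a normal distribution lying within +/- k s.d. of the mean."""
    std_normal = NormalDist(mu=0, sigma=1)
    return std_normal.cdf(k) - std_normal.cdf(-k)


print(f"within 1 s.d.: {share_within(1):.1%}")
print(f"within 2 s.d.: {share_within(2):.1%}")
```

The exact figure for ± 2 s.d. is slightly above 95%, which is why the slide marks it as approximate.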
46. The distribution of the means forms a smaller (narrower) normal distribution about the true mean. [Figure: distribution of sample means; horizontal axis from 2 to 18.]
47. True for skewed distributions too. [Figure: skewed frequency (f) distribution with its mean marked, together with a plot of means from samples.]
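A short simulation can make this concrete. The sketch below assumes a positively skewed (exponential) population and samples of 30; these choices, the seeds, and the helper name `sample_means` are illustrative, not from the slides.

```python
import random
import statistics


def sample_means(population, sample_size, n_samples, seed=0):
    """Draw repeated random samples and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.choices(population, k=sample_size))
        for _ in range(n_samples)
    ]


# A positively skewed population (exponential), fixed seed for repeatability
pop_rng = random.Random(1)
population = [pop_rng.expovariate(1.0) for _ in range(10_000)]

means = sample_means(population, sample_size=30, n_samples=2_000)

print("population mean:", round(statistics.mean(population), 3))
print("mean of sample means:", round(statistics.mean(means), 3))
print("population SD:", round(statistics.stdev(population), 3))
print("SD of sample means:", round(statistics.stdev(means), 3))
```

The sample means cluster around the population mean with a much smaller spread than the raw scores; that spread is the standard error of the mean.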
59. SE of difference between means. This lets us set up confidence limits for the difference between the two means.
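The slide's own formula did not survive in this transcript. One common form, assumed here, combines the two standard errors as the square root of the sum of their squares. A Python sketch with made-up data and illustrative function names:

```python
import math
import statistics


def se_of_difference(a, b):
    """SE of the difference between two independent sample means (assumed form)."""
    se_a = statistics.stdev(a) / math.sqrt(len(a))
    se_b = statistics.stdev(b) / math.sqrt(len(b))
    return math.sqrt(se_a ** 2 + se_b ** 2)


def difference_limits(a, b, k=1.0):
    """Confidence limits: difference between means +/- k standard errors."""
    diff = statistics.mean(a) - statistics.mean(b)
    margin = k * se_of_difference(a, b)
    return diff - margin, diff + margin


# Made-up scores for two groups
group_a = [1, 2, 3]
group_b = [4, 5, 6]
low, high = difference_limits(group_a, group_b, k=1.0)
print(round(low, 3), round(high, 3))
```

With equal group sizes this gives the same standard error as a pooled-variance calculation; with unequal sizes the two can differ.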
67. T-test: From t-tables, we can see that this value of t exceeds the tabled t value (with 5 d.f.) at the p = .10 level, so we are confident at the 90% level that our new interface leads to improvement.
68. T-test: Sample mean = 79.17, SE mean = 5.38. Thus we can still talk in confidence intervals, e.g., we are 68% confident the mean of the population = 79.17 ± 5.38.
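The interval can be written out explicitly. In this Python sketch the two numbers come from the slide, while the function name `confidence_limits` and the multiplier `k` are illustrative; k = 1 corresponds to the roughly 68% band of the normal curve.

```python
def confidence_limits(sample_mean, se_mean, k=1.0):
    """Return (lower, upper) limits: sample mean +/- k standard errors."""
    return sample_mean - k * se_mean, sample_mean + k * se_mean


# Values from the slide: sample mean 79.17, SE of the mean 5.38
low, high = confidence_limits(79.17, 5.38, k=1.0)
print(f"68% limits: {low:.2f} to {high:.2f}")
```

For a higher confidence level, a larger multiplier is needed; with a small sample it would normally be read from the t-table for the relevant d.f. rather than from the normal curve.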