FK6163



Explore & Summarise

     Dr Azmi Mohd Tamil
  Dept of Community Health
Universiti Kebangsaan Malaysia

                        ©drtamil@gmail.com 2012
Introduction



The method of exploring and summarising data differs according to the types of variables.

Dependent/Independent

Independent variables: food intake, frequency of exercise

Dependent variable: obesity

Explore

• Exploration is the first step in the analytic process.
• It explores the characteristics of the data.
• It screens for errors so that they can be corrected.
• It looks for distribution patterns: normal distribution or not.
• Data that are not normally distributed may require transformation before further analysis using parametric methods,
• or may need analysis using non-parametric techniques.

Data Screening
• By running frequencies, we may detect inappropriate responses.
• How many in the audience have 15 children and are currently pregnant with the 16th?

PARITY

Parity   Frequency   Percent
1               67      30.7
2               44      20.2
3               36      16.5
4               22      10.1
5               21       9.6
6                8       3.7
7                3       1.4
8                7       3.2
9                5       2.3
10               3       1.4
11               1        .5
15               1        .5
Total          218     100.0

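The frequency screening described above can be sketched in Python. The counts are copied from the PARITY table; the plausibility ceiling `MAX_PLAUSIBLE_PARITY` and the function names are assumptions made for illustration, not part of the original analysis.

```python
from collections import Counter

# Frequencies copied from the PARITY table (parity value -> number of women).
parity_counts = Counter({1: 67, 2: 44, 3: 36, 4: 22, 5: 21, 6: 8,
                         7: 3, 8: 7, 9: 5, 10: 3, 11: 1, 15: 1})

# Assumed ceiling for a believable answer; choose a limit that suits the study.
MAX_PLAUSIBLE_PARITY = 12


def frequency_table(counts):
    """Return (value, frequency, percent) rows sorted by value."""
    total = sum(counts.values())
    return [(value, n, round(100 * n / total, 1))
            for value, n in sorted(counts.items())]


def suspicious_values(counts, upper=MAX_PLAUSIBLE_PARITY):
    """List the values that exceed the assumed plausibility ceiling."""
    return sorted(value for value in counts if value > upper)


table = frequency_table(parity_counts)
suspects = suspicious_values(parity_counts)

for value, n, pct in table:
    print(f"{value:>3} {n:>5} {pct:>6}")
print("Check against the questionnaire:", suspects)
```

A flagged value is only a prompt to go back to the source record; it is not proof of an error.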
Data Screening

• See whether the data make sense or not.
• E.g. parity 10 but age only 25.

Data Screening

• By looking at measures of central tendency and range, we can also detect abnormal values for quantitative data.

Descriptive Statistics

                         N   Minimum   Maximum    Mean   Std. Deviation
Pre-pregnancy weight   184        32       484   53.05            33.37
Valid N (listwise)     184

Interpreting the Box Plot
Parts of the box plot, from top to bottom:
• Outlier
• Largest non-outlier
• Upper quartile
• Median
• Lower quartile
• Smallest non-outlier
• Outlier

The whiskers extend up to 1.5 times the box width (the interquartile range) from both ends of the box and end at an observed value. Three times the box width marks the boundary between "mild" and "extreme" outliers: "mild" = closed dots, "extreme" = open dots.

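The box plot rule above can be sketched as a small calculation. The sample weights are hypothetical (only the value 484 echoes the maximum in the descriptive table), and the quartile method (`method="inclusive"`) is an assumption; statistical packages differ slightly in how they compute quartiles.

```python
from statistics import quantiles


def outlier_fences(data):
    """Return the mild (1.5 x box width) and extreme (3 x box width) fences."""
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    box_width = q3 - q1  # interquartile range
    return {
        "mild": (q1 - 1.5 * box_width, q3 + 1.5 * box_width),
        "extreme": (q1 - 3.0 * box_width, q3 + 3.0 * box_width),
    }


def classify(value, fences):
    """Label a value as 'extreme', 'mild' or 'not an outlier'."""
    low_extreme, high_extreme = fences["extreme"]
    low_mild, high_mild = fences["mild"]
    if value < low_extreme or value > high_extreme:
        return "extreme"
    if value < low_mild or value > high_mild:
        return "mild"
    return "not an outlier"


# Hypothetical pre-pregnancy weights in kg; 484 mimics a data entry error.
weights = [45, 48, 50, 52, 53, 55, 57, 60, 62, 484]
fences = outlier_fences(weights)
labels = {w: classify(w, fences) for w in weights}
print(fences)
print(labels)
```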
Data Screening

• We can also make use of graphical tools such as the box plot to detect wrong data entry.

[Figure: box plot of pre-pregnancy weight (N = 184), vertical axis from 0 to 600; the flagged cases are labelled 73, 181, 211, 198 and 141.]

Data Cleaning

• Identify the extreme or wrong values.
• Check them against the original data source, i.e. the questionnaire.
• If a value is incorrect, make the necessary correction.
• Corrections must be made before transformation, recoding and analysis.

Parameters of Data Distribution

• Mean – the central value of the data
• Standard deviation – a measure of how the data scatter around the mean
• Symmetry (skewness) – the degree to which the data pile up on one side of the mean
• Kurtosis – how peaked or flat the distribution is compared with the normal curve

Normal distribution

• The Normal distribution is represented by a family of curves defined uniquely by two parameters, which are the mean and the standard deviation of the population.
• The curves are always symmetrically bell shaped, but the extent to which the bell is compressed or flattened out depends on the standard deviation of the population.
• However, the mere fact that a curve is bell shaped does not mean that it represents a Normal distribution, because other distributions may have a similar sort of shape.

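For reference, the family of curves described above can be written as a density in which the two parameters appear explicitly. The symbols are notation added here, not taken from the slides: \mu is the population mean and \sigma the population standard deviation.

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left( -\frac{(x-\mu)^{2}}{2\sigma^{2}} \right)
```

Changing \mu shifts the bell along the axis; changing \sigma compresses or flattens it.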
Normal distribution

• If the observations follow a Normal distribution, a range covered by one standard deviation above the mean and one standard deviation below it (± 1 SD) includes about 68.3% of the observations;
• a range of two standard deviations above and two below (± 2 SD) includes about 95.4% of the observations; and
• a range of three standard deviations above and three below (± 3 SD) includes about 99.7% of the observations.

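These three percentages can be checked numerically. The sketch below uses the standard identity that the probability of falling within k standard deviations of the mean equals erf(k/√2); the function name is made up for the example.

```python
import math


def coverage_within(k):
    """Probability that a Normal observation lies within k SD of the mean."""
    return math.erf(k / math.sqrt(2))


coverage = {k: coverage_within(k) for k in (1, 2, 3)}
for k, p in coverage.items():
    print(f"within ± {k} SD: {100 * p:.1f}%")
```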
Normality

• Why bother with normality?
• Because it dictates the type of analysis that you can run on the data.

Normality-Why?
Parametric

Variable 1                 Variable 2                 Criteria                        Test
Qualitative, dichotomous   Quantitative               Normally distributed data       Student's t test
Qualitative, polytomous    Quantitative               Normally distributed data       ANOVA
Quantitative               Quantitative               Repeated measurement of the     Paired t test
                                                      same individual & item (e.g.
                                                      Hb level before & after
                                                      treatment). Normally
                                                      distributed data
Quantitative, continuous   Quantitative, continuous   Normally distributed data       Pearson correlation
                                                                                      & linear regression

Normality-Why?
Non-parametric

Variable 1                 Variable 2                 Criteria                        Test
Qualitative, dichotomous   Quantitative               Data not normally distributed   Wilcoxon rank-sum test
                                                                                      (Mann-Whitney U test)
Qualitative, polytomous    Quantitative               Data not normally distributed   Kruskal-Wallis one-way
                                                                                      ANOVA test
Quantitative               Quantitative               Repeated measurement of the     Wilcoxon signed-rank test
                                                      same individual & item
Quantitative, continuous   Quantitative, continuous   Data not normally distributed   Spearman/Kendall rank
or ordinal                                                                            correlation

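The two tables above amount to a lookup from the design and the normality of the data to a test. A minimal sketch of that lookup follows; the key labels (`"dichotomous"`, `"polytomous"`, `"paired"`, `"quantitative"`) and the function name are illustrative choices, not an established API.

```python
# (predictor type, outcome type) -> (parametric test, non-parametric test),
# transcribed from the two tables; the labels are hypothetical shorthand.
TESTS = {
    ("dichotomous", "quantitative"): ("Student's t test", "Mann-Whitney U test"),
    ("polytomous", "quantitative"): ("ANOVA", "Kruskal-Wallis test"),
    ("paired", "quantitative"): ("Paired t test", "Wilcoxon signed-rank test"),
    ("quantitative", "quantitative"): ("Pearson correlation",
                                       "Spearman rank correlation"),
}


def choose_test(predictor, outcome, normally_distributed):
    """Pick the parametric test when the data are normal, else its counterpart."""
    parametric, non_parametric = TESTS[(predictor, outcome)]
    return parametric if normally_distributed else non_parametric


print(choose_test("dichotomous", "quantitative", True))
print(choose_test("paired", "quantitative", False))
```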
Normality-How?

• Explored graphically
  • Histogram
  • Stem & leaf
  • Box plot
  • Normal probability plot
  • Detrended normal plot
• Explored statistically
  • Kolmogorov-Smirnov statistic, with Lilliefors significance level, and the Shapiro-Wilk statistic
  • Skewness (0)
  • Kurtosis (0)
    – positive (+): leptokurtic
    – zero (0): mesokurtic
    – negative (-): platykurtic

Kolmogorov-Smirnov

• In the 1930s, Andrei Nikolaevich Kolmogorov (1903-1987) and N.V. Smirnov (his student) came up with an approach for comparing distributions that did not make use of parameters.
• This is known as the Kolmogorov-Smirnov test.

Skewness

• A distribution skewed to the right indicates the presence of large extreme values.
• A distribution skewed to the left indicates the presence of small extreme values.

Kurtosis

• For symmetrical distributions only.
• Describes the shape of the curve.
• Mesokurtic - average shaped
• Leptokurtic - narrow & slim
• Platykurtic - flat & wide

Skewness & Kurtosis

• Skewness usually falls between -3 and 3.
• The acceptable range for normality is skewness lying between -1 and 1.
• Normality should not be judged on skewness alone; the kurtosis measures the "peakedness" of the bell curve (see Fig. 4).
• Likewise, the acceptable range for normality is kurtosis lying between -1 and 1.

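A rough sketch of how skewness and kurtosis can be computed and checked against the ±1 rule of thumb is given below. It uses the simple moment-based formulas, which is an assumption: statistical packages usually apply small-sample corrections, so their output can differ slightly. The function names are made up for the example.

```python
def central_moment(data, k):
    """k-th central moment: the average of (x - mean)^k."""
    m = sum(data) / len(data)
    return sum((x - m) ** k for x in data) / len(data)


def skewness(data):
    """Moment-based skewness; 0 for a perfectly symmetrical sample."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def excess_kurtosis(data):
    """Moment-based kurtosis minus 3, so a Normal curve scores about 0."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3


def within_rule_of_thumb(skew, kurt, limit=1.0):
    """True when both values lie strictly between -limit and +limit."""
    return abs(skew) < limit and abs(kurt) < limit


symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 10]
print(skewness(symmetric), excess_kurtosis(symmetric))
print(skewness(right_skewed))
```

Positive excess kurtosis corresponds to a leptokurtic curve and negative to a platykurtic one.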
Normality - Examples
Graphically

[Figure: histogram of Height, horizontal axis from 140.0 to 167.5, vertical axis (frequency) from 0 to 60; Std. Dev = 5.26, Mean = 151.6, N = 218.]

Q-Q Plot

• This plot compares the quantiles of a data distribution with the quantiles of a standardised theoretical distribution from a specified family of distributions (in this case, the normal distribution).
• If the distributional shapes differ, then the points will plot along a curve instead of a line.
• The interest here is the central portion of the line: severe deviations there mean non-normality. Deviations at the "ends" of the curve signify the existence of outliers.

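The points of a normal Q-Q plot can be computed without any plotting library. The sketch below pairs each sorted observation with the standard normal quantile at the plotting position (i - 0.5)/n. That plotting position is an assumption; packages use several slightly different conventions. The sample values are illustrative.

```python
from statistics import NormalDist


def qq_points(data):
    """Return (observed value, expected standard normal quantile) pairs."""
    xs = sorted(data)
    n = len(xs)
    standard_normal = NormalDist()  # mean 0, SD 1
    return [(x, standard_normal.inv_cdf((i - 0.5) / n))
            for i, x in enumerate(xs, start=1)]


sample = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
          32, 35, 37, 38, 41, 43, 44, 46, 53, 58]
points = qq_points(sample)
for observed, expected in points:
    print(f"{observed:>4} {expected:6.2f}")
```

Plotting the expected quantiles against the observed values gives the Q-Q plot; points close to a straight line suggest a normal shape.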
Normality - Examples
Graphically

[Figure: Normal Q-Q plot of Height (expected normal value, -3 to 3, against observed value, 130 to 170) and detrended Normal Q-Q plot of Height (deviation from normal, -0.2 to 0.6, against observed value, 130 to 170).]

Normal distribution
Mean = median = mode

Normality - Examples
Statistically

Descriptives (Height)

                                      Statistic   Std. Error
Mean                                     151.65         .356
95% Confidence       Lower Bound         150.94
Interval for Mean    Upper Bound         152.35
5% Trimmed Mean                          151.59
Median                                   151.50
Variance                                 27.649
Std. Deviation                            5.258
Minimum                                     139
Maximum                                     168
Range                                        29
Interquartile Range                        8.00
Skewness                                   .148         .165
Kurtosis                                   .061         .328

Tests of Normality

            Kolmogorov-Smirnov (a)
            Statistic     df     Sig.
Height           .060    218     .052
a. Lilliefors Significance Correction

• Normal distribution: mean = median = mode
• Skewness & kurtosis within ± 1
• p > 0.05, so normal distribution
• Shapiro-Wilk: only if the sample size is less than 100.

K-S Test

• The K-S test is very sensitive to the sample size of the data.
• For small samples (n < 20, say), the likelihood of getting p < 0.05 is low.
• For large samples (n > 100), a slight deviation from normality is enough for the distribution to be reported as not normal.

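To make the test less of a black box, the sketch below computes only the K-S statistic D: the largest gap between the empirical cumulative distribution and a normal curve fitted with the sample mean and SD. It does not compute the Lilliefors-corrected p-value, which needs special tables or simulation. The sample and the function name are illustrative assumptions.

```python
from statistics import NormalDist, mean, stdev


def ks_statistic(data):
    """Largest distance between the empirical CDF and the fitted normal CDF."""
    xs = sorted(data)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = fitted.cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x; check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


sample = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
          32, 35, 37, 38, 41, 43, 44, 46, 53, 58]
d_statistic = ks_statistic(sample)
print(f"D = {d_statistic:.3f}")
```

The same D is judged more harshly as n grows, which is why large samples are so readily reported as not normal.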
Guide to deciding on normality

Normality
Transformation

[Figure: Normal Q-Q plot of PARITY (expected normal value against observed value, 0 to 16) and Normal Q-Q plot of LN_PARIT (expected normal value against observed value, -0.5 to 3.0).]

TYPES OF TRANSFORMATIONS




• Square root
• Logarithm
• Inverse
• Reflect and square root
• Reflect and logarithm
• Reflect and inverse

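A minimal sketch of these six transformations follows. Two details are assumptions rather than something the slides state: the logarithm is taken as the natural log, and "reflect" is implemented in the common way of subtracting each value from the largest value plus one, so that the reflected values start at 1.

```python
import math


def reflect(values):
    """Mirror the data: new value = (largest value + 1) - old value."""
    pivot = max(values) + 1
    return [pivot - v for v in values]


def transform(values, kind):
    """Apply a square root, natural log or inverse transformation."""
    functions = {
        "sqrt": math.sqrt,
        "log": math.log,            # assumes every value is > 0
        "inverse": lambda v: 1 / v,  # assumes no value is 0
    }
    return [functions[kind](v) for v in values]


scores = [1, 2, 5]
print(transform([1, 4, 9], "sqrt"))
print(reflect(scores))
print(transform(reflect(scores), "inverse"))
```

Reflection is applied first when the data are skewed to the left, because the three basic transformations pull in a long right tail.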
Summarise

• Summarise a large set of data by a few meaningful numbers.
• Single variable analysis
  • For the purpose of describing the data
  • Example: in one year, what kind of cases are treated by the Psychiatric Dept?
  • Tables & diagrams are usually used to describe the data
  • For numerical data, measures of central tendency & spread are usually used

Frequency Table

Race       F          %
Malay      760     95.84%
Chinese      5      0.63%
Indian       0      0.00%
Others      28      3.53%
TOTAL      793    100.00%

• Illustrates the frequency observed for each category.

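The percentage column of such a table is each frequency divided by the total. A short sketch using the counts from the table above (the function name is made up for the example):

```python
race_counts = {"Malay": 760, "Chinese": 5, "Indian": 0, "Others": 28}


def percentage_table(counts):
    """Return each category's share of the total, in percent (2 decimals)."""
    total = sum(counts.values())
    return {category: round(100 * n / total, 2)
            for category, n in counts.items()}


percentages = percentage_table(race_counts)
for category, pct in percentages.items():
    print(f"{category:<8} {race_counts[category]:>4} {pct:>7.2f}%")
```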
Frequency Distribution Table

• With more than 20 observations, the data are best presented as a frequency distribution table.
• The columns are divided into class & frequency.
• The modal class can be determined using such tables.

Age          Count         %
0-0.99          25     3.26%
1-4.99          78    10.18%
5-14.99        140    18.28%
15-24.99       126    16.45%
25-34.99       112    14.62%
35-44.99        90    11.75%
45-54.99        66     8.62%
55-64.99        60     7.83%
65-74.99        50     6.53%
75-84.99        16     2.09%
85+              3     0.39%
TOTAL          766

Measurement of Central Tendency & Spread

Measures of Central Tendency

• Mean
• Mode
• Median

Measures of Variability


• Standard deviation
• Interquartile range
• Skewness & kurtosis

Mean

• The average of the data collected.
• To calculate the mean, add up the observed values and divide by the number of them.
• A major disadvantage of the mean is that it is sensitive to outlying points.

Mean: Example


• 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
• Total of x = 648
• n = 20
• Mean = 648/20 = 32.4

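The same calculation can be reproduced with Python's standard library, which is a convenient way to check hand arithmetic:

```python
from statistics import mean

values = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
          32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

total = sum(values)        # total of x
n = len(values)            # number of observations
average = mean(values)     # total divided by n

print(total, n, average)
```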
Measures of variation - standard deviation

• Tells us how closely the scores in a dataset cluster around the mean. A large SD indicates more varied scores.
• It is a summary measure of the differences of each observation from the mean.
• If the differences themselves were added up, the positive would exactly balance the negative and so their sum would be zero.
• Consequently the squares of the differences are added; the sum is divided by n - 1 to give the variance, and the square root of the variance is the standard deviation.

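Written as a formula, the sample standard deviation described above is the square root of the sum of squared differences from the mean divided by n - 1. The symbols are notation added here: x_i is an observation, \bar{x} the sample mean and n the number of observations.

```latex
s = \sqrt{\frac{\sum_{i=1}^{n} \left( x_i - \bar{x} \right)^{2}}{n - 1}}
```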
sd: Example

• 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
• Mean = 32.4; n = 20
• Total of (x-mean)^2 = 3050.8
• Variance = 3050.8/19 = 160.5684
• sd = 160.5684^0.5 = 12.67

    x   (x-mean)^2        x   (x-mean)^2
   12       416.16       32         0.16
   13       376.36       35         6.76
   17       237.16       37        21.16
   21       129.96       38        31.36
   24        70.56       41        73.96
   24        70.56       43       112.36
   26        40.96       44       134.56
   27        29.16       46       184.96
   27        29.16       53       424.36
   30         5.76       58       655.36
TOTAL       1405.8    TOTAL         1645

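The worked example can be checked step by step in Python. `statistics.variance` and `statistics.stdev` use the n - 1 divisor, matching the calculation on the slide.

```python
from statistics import mean, stdev, variance

values = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
          32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

m = mean(values)
sum_of_squares = sum((x - m) ** 2 for x in values)  # total of (x-mean)^2
sample_variance = variance(values)                  # sum of squares / (n - 1)
sd = stdev(values)                                  # square root of the variance

print(round(sum_of_squares, 1), round(sample_variance, 4), round(sd, 2))
```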
Median

• The ranked value that lies in the middle of the data.
• The point which has the property that half the data are greater than it, and half the data are less than it.
• If n is even, average the (n/2)th and the (n/2 + 1)th ranked observations.
• "Robust" to outliers.

Median: Example

• 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
• Position (20 + 1)/2 = 10.5, i.e. between the 10th value, which is 30, and the 11th, which is 32.
• Therefore the median is (30 + 32)/2 = 31.

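A quick check of the median, both by hand-style indexing and with the standard library:

```python
from statistics import median

values = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
          32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

ranked = sorted(values)
n = len(ranked)
# n is even, so average the (n/2)th and (n/2 + 1)th ranked values.
by_hand = (ranked[n // 2 - 1] + ranked[n // 2]) / 2
by_library = median(values)

print(by_hand, by_library)
```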
Measures of variation - quartiles

• The range is very susceptible to what are known as outliers.
• A more robust approach is to divide the distribution of the data into four, and find the points below which 25%, 50% and 75% of the distribution lie. These are known as quartiles, and the median is the second quartile.

Quartiles

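• 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
• 25th percentile: 24, from (24 + 24)/2, the 5th and 6th values
• 50th percentile: 31, from (30 + 32)/2, which is the median
• 75th percentile: 42.5, between the 15th value (41) and the 16th (43): position 0.75 × (20 + 1) = 15.75, so 41 + 0.75 × (43 - 41) = 42.5
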
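The quartiles can be reproduced with `statistics.quantiles`. Its default method places the p-th quantile at position p × (n + 1) in the ranked data and interpolates between neighbours; other packages and methods can give slightly different quartiles, so treat the choice of method as an assumption.

```python
from statistics import quantiles

values = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
          32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

q1, q2, q3 = quantiles(values, n=4)  # 25th, 50th and 75th percentiles
iqr = q3 - q1                        # interquartile range

print(q1, q2, q3, iqr)
```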
Mode

• The most frequently occurring number. E.g. 3, 13, 13, 20, 22, 25: mode = 13.
• It is usually more informative to quote the mode accompanied by the percentage of times it happened; e.g., the mode is 13 with 33% of the occurrences.

Mode: Example

• 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
• Modes are 24 (10%) & 27 (10%)

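When a dataset has more than one mode, `statistics.multimode` returns all of them. The sketch also reports the share of observations each mode accounts for:

```python
from statistics import multimode

values = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
          32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

modes = multimode(values)
# Percentage of all observations that equal each mode.
shares = {m: 100 * values.count(m) / len(values) for m in modes}

print(modes, shares)
```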
Mean or Median?

• Which measure of central tendency should we use?
• If the distribution is normal, the mean and SD are the measures to present; otherwise the median and IQR are more appropriate.

• Not normal distribution: use median & IQR
• Normal distribution: use mean & SD

Presentation



Qualitative & Quantitative Data: Charts & Tables

Presentation




Qualitative Data




Graphing Categorical Data: Univariate Data

Categorical data
• Tabulating data: the summary table
• Graphing data: pie charts, bar charts, Pareto diagram

[Figure: example charts for four categories (Stocks, Bonds, Savings, CD): a horizontal bar chart with an axis from 0 to 50 and a Pareto diagram with axes from 0 to 45 and from 0 to 120.]

Bar Chart
[Figure: bar chart of type of work, vertical axis in percent from 0 to 80: housewife 69, office work 11, field work 20.]

Pie Chart


[Figure: pie chart with slices labelled Malay, Chinese and Others.]

Tabulating and Graphing Bivariate Categorical Data

• Contingency tables:

Table 1: Contingency table of pregnancy induced hypertension and SGA

Count
                                      SGA
                               Normal     SGA    Total
Pregnancy induced     No          103      94      197
hypertension          Yes           5      16       21
Total                             108     110      218

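The margins of a contingency table, and the row percentages that make the two groups comparable, can be derived from the four cell counts. The counts below are copied from Table 1; the variable and function names are illustrative.

```python
# (pregnancy induced hypertension, SGA status) -> count, copied from Table 1.
cells = {
    ("No", "Normal"): 103, ("No", "SGA"): 94,
    ("Yes", "Normal"): 5, ("Yes", "SGA"): 16,
}


def margins(cells):
    """Return row totals, column totals and the grand total."""
    rows, cols = {}, {}
    for (row, col), n in cells.items():
        rows[row] = rows.get(row, 0) + n
        cols[col] = cols.get(col, 0) + n
    return rows, cols, sum(cells.values())


row_totals, col_totals, grand_total = margins(cells)
# Percentage of SGA babies within each hypertension group.
sga_percent = {row: round(100 * cells[(row, "SGA")] / row_totals[row], 1)
               for row in row_totals}

print(row_totals, col_totals, grand_total, sga_percent)
```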
Tabulating and Graphing Bivariate Categorical Data

• Side-by-side charts

[Figure: side-by-side bar chart of count (0 to 120) by pregnancy induced hypertension (No, Yes), with separate bars for SGA status (Normal, SGA); the bar labels shown are 103, 94 and 16.]

Presentation




Quantitative Data




Tabulating and Graphing Numerical Data

Numerical data: 41, 24, 32, 26, 27, 27, 30, 24, 38, 21

• Ordered array: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41
• Stem and leaf display:
    2 | 144677
    3 | 028
    4 | 1
• Frequency distributions and cumulative distributions: tables, histograms, polygons, ogive

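The stem and leaf display for the ten values can be generated with a few lines of Python: the tens digit is the stem and the units digit is the leaf. The function names are made up for the example, and the sketch assumes non-negative whole numbers.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group the units digits (leaves) under their tens digits (stems)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        stems[stem].append(leaf)
    return dict(stems)


def render(stems):
    """Format each stem with its leaves, e.g. '2 | 144677'."""
    return [f"{stem} | {''.join(str(leaf) for leaf in leaves)}"
            for stem, leaves in sorted(stems.items())]


data = [41, 24, 32, 26, 27, 27, 30, 24, 38, 21]
display = render(stem_and_leaf(data))
print("\n".join(display))
```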
Tabulating Numerical Data:
           Frequency Distributions
4 Sort raw data in ascending order:
  12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

4 Find range: 58 - 12 = 46
4 Select number of classes: 5 (usually between 5 and 15)
4 Compute class interval (width): 10 (46/5 = 9.2, then round up)
4 Determine class boundaries (limits): 10, 20, 30, 40, 50, 60
4 Compute class midpoints: 14.95, 24.95, 34.95, 44.95, 54.95
  (midpoint of the stated class limits, e.g. (10.0 + 19.9)/2 = 14.95)

4 Count observations & assign to classes
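The tabulation steps can be sketched in Python as follows. This is only an illustration: the variable names are hypothetical, and starting the first class at a multiple of the width is an assumption that happens to reproduce the slide's boundaries.

```python
import math

data = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

n_classes = 5                               # chosen by the analyst (usually 5 to 15)
data_range = max(data) - min(data)          # 58 - 12 = 46
width = math.ceil(data_range / n_classes)   # 46 / 5 = 9.2, rounded up to 10

# Assumption: the first class starts at the multiple of the width
# at or below the smallest observation.
start = (min(data) // width) * width
boundaries = [start + k * width for k in range(n_classes + 1)]

# Midpoints follow the stated class limits (10.0 - 19.9, 20.0 - 29.9, ...),
# i.e. (lower + upper) / 2 with upper = lower + width - 0.1.
midpoints = [round((lo + (lo + width - 0.1)) / 2, 2) for lo in boundaries[:-1]]

print(data_range, width, boundaries, midpoints)
```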
Frequency Distributions and Percentage Distributions

Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

   Class       Midpoint   Freq     %
10.0 - 19.9     14.95       3     15%
20.0 - 29.9     24.95       6     30%
30.0 - 39.9     34.95       5     25%
40.0 - 49.9     44.95       4     20%
50.0 - 59.9     54.95       2     10%
  TOTAL                    20    100%
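The frequency and percentage columns can be checked by counting the observations that fall in each class. The sketch below assumes each class covers values from its lower limit up to, but not including, the next lower limit; the variable names are illustrative only.

```python
data = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

lower_limits = [10, 20, 30, 40, 50]
width = 10

# Count the observations in each class [lower, lower + width).
counts = [sum(1 for x in data if lo <= x < lo + width) for lo in lower_limits]

# Convert each count to a percentage of all observations.
percents = [100 * c / len(data) for c in counts]

for lo, c, p in zip(lower_limits, counts, percents):
    print(f"{lo:.1f} - {lo + width - 0.1:.1f}  {c:2d}  {p:.0f}%")
```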
Graphing Numerical Data:
           The Histogram
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

[Histogram: Frequency (y-axis, 0 to 7) against Age (x-axis, labelled at class midpoints 14.95, 24.95, 34.95, 44.95, 54.95); bar heights 3, 6, 5, 4, 2; bars span the class boundaries, with no gaps between bars]
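As a rough stand-in for the chart, the same frequencies can be printed as a text histogram. This is a minimal sketch with one row per class, not a replacement for a proper plot; the row layout is an arbitrary choice.

```python
midpoints = [14.95, 24.95, 34.95, 44.95, 54.95]
freqs = [3, 6, 5, 4, 2]

# One row per class: midpoint, a bar of '#' characters, then the frequency.
rows = [f"{mp:5.2f} | {'#' * f} {f}" for mp, f in zip(midpoints, freqs)]
print("\n".join(rows))
```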
Graphing Numerical Data:
          The Frequency Polygon
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

[Frequency polygon: frequency (y-axis, 0 to 7) plotted at the class midpoints 14.95, 24.95, 34.95, 44.95, 54.95]
Calculate Measures of Central Tendency & Spread

4 We can use the frequency distribution table
 to calculate:
 •   Mean
 •   Standard Deviation
 •   Median
 •   Mode
Mean

\[ \bar{X} = \frac{\sum f \cdot mp}{n} \]

where f is the class frequency, mp the class midpoint and n the total number of observations.

   Class       Midpoint   Freq   freq x m.p.
10.0 - 19.9     14.95       3       44.85
20.0 - 29.9     24.95       6      149.70
30.0 - 39.9     34.95       5      174.75
40.0 - 49.9     44.95       4      179.80
50.0 - 59.9     54.95       2      109.90
  TOTAL                    20      659.00

4 Mean = 659/20 = 32.95
4 Compare with 32.4 from direct calculation.
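The grouped mean and the direct mean can be compared with a few lines of Python. This is a sketch using the slide's midpoints and frequencies; the variable names are illustrative.

```python
data = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]
midpoints = [14.95, 24.95, 34.95, 44.95, 54.95]
freqs = [3, 6, 5, 4, 2]

n = sum(freqs)

# Grouped mean: sum of (frequency x midpoint) divided by n.
sum_fmp = sum(f * mp for f, mp in zip(freqs, midpoints))
grouped_mean = sum_fmp / n

# Direct mean from the raw observations, for comparison.
direct_mean = sum(data) / len(data)

print(round(grouped_mean, 2), round(direct_mean, 2))
```

The two values differ because the grouped calculation treats every observation in a class as if it sat at the class midpoint.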
Standard deviation

\[ s = \sqrt{\frac{\sum f \cdot mp^2 - \frac{\left(\sum f \cdot mp\right)^2}{n}}{n - 1}} \]

   Class       Mid Point   Freq   f.m.p.    f.mp^2
10.0 - 19.9     14.95        3     44.85     670.51
20.0 - 29.9     24.95        6    149.70    3735.02
30.0 - 39.9     34.95        5    174.75    6107.51
40.0 - 49.9     44.95        4    179.80    8082.01
50.0 - 59.9     54.95        2    109.90    6039.01
  TOTAL                     20    659.00   24634.05

s^2 = (24634.05 - (659^2/20)) / 19
s^2 = (24634.05 - 21714.05) / 19
s^2 = 2920.00/19
s^2 = 153.68
s = 12.4

4 Compare with 12.67 from direct calculation.
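The grouped standard deviation can be verified in Python. This sketch uses the same midpoints and frequencies as the table, and `statistics.stdev` from the standard library for the direct sample standard deviation.

```python
import math
import statistics

data = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]
midpoints = [14.95, 24.95, 34.95, 44.95, 54.95]
freqs = [3, 6, 5, 4, 2]

n = sum(freqs)
sum_fmp = sum(f * mp for f, mp in zip(freqs, midpoints))        # sum of f x mp
sum_fmp2 = sum(f * mp ** 2 for f, mp in zip(freqs, midpoints))  # sum of f x mp^2

# Grouped sample variance and standard deviation (divisor n - 1).
s2 = (sum_fmp2 - sum_fmp ** 2 / n) / (n - 1)
s = math.sqrt(s2)

# Sample standard deviation from the raw observations, for comparison.
direct_s = statistics.stdev(data)

print(round(s2, 2), round(s, 1), round(direct_s, 2))
```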
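Median

\[ \text{Median} = L_1 + i \cdot \frac{\frac{n+1}{2} - f_1}{f_{med}} \]

   Class       Freq
10.0 - 19.9      3
20.0 - 29.9      6
30.0 - 39.9      5     median class
40.0 - 49.9      4
50.0 - 59.9      2
  TOTAL         20

4 L1 = lower boundary of the median class = 29.95
4 i = class width = 10
4 f1 = cumulative freq of the classes listed above (before) the median class = 3 + 6 = 9
4 fmed = freq of the median class = 5
4 Median = 29.95 + 10((21/2) - 9)/5
4        = 29.95 + 15/5 = 32.95
4 From direct calculation, median = 31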
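The grouped median formula can be sketched as a small function. The name `grouped_median` is hypothetical, and the sketch keeps the slide's (n + 1)/2 position rather than the n/2 used in some textbooks; the lower class boundaries 9.95, 19.95, ... are taken as halfway between adjacent class limits.

```python
import statistics

def grouped_median(lower_boundaries, freqs, width):
    """Median from a frequency table: L1 + i * ((n + 1)/2 - f1) / fmed."""
    n = sum(freqs)
    position = (n + 1) / 2
    cumulative = 0  # f1: cumulative frequency before the current class
    for lb, f in zip(lower_boundaries, freqs):
        if cumulative + f >= position:
            # This class holds the median position.
            return lb + width * (position - cumulative) / f
        cumulative += f
    raise ValueError("frequency table is empty")

data = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]
lower_boundaries = [9.95, 19.95, 29.95, 39.95, 49.95]
freqs = [3, 6, 5, 4, 2]

median_grouped = grouped_median(lower_boundaries, freqs, 10)
median_direct = statistics.median(data)
print(round(median_grouped, 2), median_direct)
```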
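Mode

\[ \text{Mode} = L_1 + i \cdot \frac{Diff_1}{Diff_1 + Diff_2} \]

   Class       Freq
10.0 - 19.9      3
20.0 - 29.9      6     mode class
30.0 - 39.9      5
40.0 - 49.9      4
50.0 - 59.9      2
  TOTAL         20

4 L1 = lower boundary of the mode class = 19.95; i = class width = 10
4 Diff1 = freq of mode class - freq of preceding class = 6 - 3 = 3
4 Diff2 = freq of mode class - freq of following class = 6 - 5 = 1
4 Mode = 19.95 + 10(3/(3+1)) = 27.45
4 Compare with modes of 24 & 27 from direct calculation.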
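The grouped mode can be sketched in the same way. The function name `grouped_mode` is hypothetical, and the code assumes a single modal class with a non-zero frequency; a missing neighbouring class is treated as having frequency 0, which is an assumption not stated on the slide.

```python
import statistics

def grouped_mode(lower_boundaries, freqs, width):
    """Mode from a frequency table: L1 + i * Diff1 / (Diff1 + Diff2)."""
    k = max(range(len(freqs)), key=lambda j: freqs[j])   # index of the modal class
    prev_f = freqs[k - 1] if k > 0 else 0
    next_f = freqs[k + 1] if k < len(freqs) - 1 else 0
    diff1 = freqs[k] - prev_f   # modal class minus preceding class
    diff2 = freqs[k] - next_f   # modal class minus following class
    return lower_boundaries[k] + width * diff1 / (diff1 + diff2)

data = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]
lower_boundaries = [9.95, 19.95, 29.95, 39.95, 49.95]
freqs = [3, 6, 5, 4, 2]

mode_grouped = grouped_mode(lower_boundaries, freqs, 10)
modes_direct = statistics.multimode(data)   # every most-frequent value (Python 3.8+)
print(round(mode_grouped, 2), modes_direct)
```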
Graphing Bivariate Numerical Data (Scatter Plot)

[Figure: scatter plot]

Linear Regression Line

[Figure: scatter plot with fitted linear regression line]
Survival Function

[Figure: survival curve; y-axis Cum Survival (0.0 to 1.2), x-axis DURATION (0 to 7); legend: Survival Function, Censored]
Principles of Graphical Excellence
4 Presents data in a way that provides
  substance, statistics and design
4 Communicates complex ideas with clarity,
  precision and efficiency
4 Gives the largest number of ideas in the
  most efficient manner
4 Almost always involves several
  dimensions
4 Tells the truth about the data

Call Girls Jayanagar Just Call 7001305949 Top Class Call Girl Service Availablenarwatsonia7
 
Glomerular Filtration and determinants of glomerular filtration .pptx
Glomerular Filtration and  determinants of glomerular filtration .pptxGlomerular Filtration and  determinants of glomerular filtration .pptx
Glomerular Filtration and determinants of glomerular filtration .pptxDr.Nusrat Tariq
 
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment Booking
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment BookingCall Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment Booking
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment BookingNehru place Escorts
 
Book Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbers
Book Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbersBook Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbers
Book Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbersnarwatsonia7
 
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...narwatsonia7
 
call girls in munirka DELHI 🔝 >༒9540349809 🔝 genuine Escort Service 🔝✔️✔️
call girls in munirka  DELHI 🔝 >༒9540349809 🔝 genuine Escort Service 🔝✔️✔️call girls in munirka  DELHI 🔝 >༒9540349809 🔝 genuine Escort Service 🔝✔️✔️
call girls in munirka DELHI 🔝 >༒9540349809 🔝 genuine Escort Service 🔝✔️✔️saminamagar
 

Dernier (20)

Call Girls Electronic City Just Call 7001305949 Top Class Call Girl Service A...
Call Girls Electronic City Just Call 7001305949 Top Class Call Girl Service A...Call Girls Electronic City Just Call 7001305949 Top Class Call Girl Service A...
Call Girls Electronic City Just Call 7001305949 Top Class Call Girl Service A...
 
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...
Call Girls Service in Bommanahalli - 7001305949 with real photos and phone nu...
 
Call Girls Hosur Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hosur Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Hosur Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hosur Just Call 7001305949 Top Class Call Girl Service Available
 
Asthma Review - GINA guidelines summary 2024
Asthma Review - GINA guidelines summary 2024Asthma Review - GINA guidelines summary 2024
Asthma Review - GINA guidelines summary 2024
 
Russian Call Girls Gunjur Mugalur Road : 7001305949 High Profile Model Escort...
Russian Call Girls Gunjur Mugalur Road : 7001305949 High Profile Model Escort...Russian Call Girls Gunjur Mugalur Road : 7001305949 High Profile Model Escort...
Russian Call Girls Gunjur Mugalur Road : 7001305949 High Profile Model Escort...
 
Call Girls Kanakapura Road Just Call 7001305949 Top Class Call Girl Service A...
Call Girls Kanakapura Road Just Call 7001305949 Top Class Call Girl Service A...Call Girls Kanakapura Road Just Call 7001305949 Top Class Call Girl Service A...
Call Girls Kanakapura Road Just Call 7001305949 Top Class Call Girl Service A...
 
Kolkata Call Girls Services 9907093804 @24x7 High Class Babes Here Call Now
Kolkata Call Girls Services 9907093804 @24x7 High Class Babes Here Call NowKolkata Call Girls Services 9907093804 @24x7 High Class Babes Here Call Now
Kolkata Call Girls Services 9907093804 @24x7 High Class Babes Here Call Now
 
97111 47426 Call Girls In Delhi MUNIRKAA
97111 47426 Call Girls In Delhi MUNIRKAA97111 47426 Call Girls In Delhi MUNIRKAA
97111 47426 Call Girls In Delhi MUNIRKAA
 
VIP Call Girls Lucknow Nandini 7001305949 Independent Escort Service Lucknow
VIP Call Girls Lucknow Nandini 7001305949 Independent Escort Service LucknowVIP Call Girls Lucknow Nandini 7001305949 Independent Escort Service Lucknow
VIP Call Girls Lucknow Nandini 7001305949 Independent Escort Service Lucknow
 
Call Girls In Andheri East Call 9920874524 Book Hot And Sexy Girls
Call Girls In Andheri East Call 9920874524 Book Hot And Sexy GirlsCall Girls In Andheri East Call 9920874524 Book Hot And Sexy Girls
Call Girls In Andheri East Call 9920874524 Book Hot And Sexy Girls
 
Pharmaceutical Marketting: Unit-5, Pricing
Pharmaceutical Marketting: Unit-5, PricingPharmaceutical Marketting: Unit-5, Pricing
Pharmaceutical Marketting: Unit-5, Pricing
 
Housewife Call Girls Hsr Layout - Call 7001305949 Rs-3500 with A/C Room Cash ...
Housewife Call Girls Hsr Layout - Call 7001305949 Rs-3500 with A/C Room Cash ...Housewife Call Girls Hsr Layout - Call 7001305949 Rs-3500 with A/C Room Cash ...
Housewife Call Girls Hsr Layout - Call 7001305949 Rs-3500 with A/C Room Cash ...
 
Call Girls Hebbal Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hebbal Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Hebbal Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Hebbal Just Call 7001305949 Top Class Call Girl Service Available
 
Call Girls ITPL Just Call 7001305949 Top Class Call Girl Service Available
Call Girls ITPL Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls ITPL Just Call 7001305949 Top Class Call Girl Service Available
Call Girls ITPL Just Call 7001305949 Top Class Call Girl Service Available
 
Call Girls Jayanagar Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Jayanagar Just Call 7001305949 Top Class Call Girl Service AvailableCall Girls Jayanagar Just Call 7001305949 Top Class Call Girl Service Available
Call Girls Jayanagar Just Call 7001305949 Top Class Call Girl Service Available
 
Glomerular Filtration and determinants of glomerular filtration .pptx
Glomerular Filtration and  determinants of glomerular filtration .pptxGlomerular Filtration and  determinants of glomerular filtration .pptx
Glomerular Filtration and determinants of glomerular filtration .pptx
 
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment Booking
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment BookingCall Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment Booking
Call Girls Service Nandiambakkam | 7001305949 At Low Cost Cash Payment Booking
 
Book Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbers
Book Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbersBook Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbers
Book Call Girls in Kasavanahalli - 7001305949 with real photos and phone numbers
 
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...
Russian Call Girls Chickpet - 7001305949 Booking and charges genuine rate for...
 
call girls in munirka DELHI 🔝 >༒9540349809 🔝 genuine Escort Service 🔝✔️✔️
call girls in munirka  DELHI 🔝 >༒9540349809 🔝 genuine Escort Service 🔝✔️✔️call girls in munirka  DELHI 🔝 >༒9540349809 🔝 genuine Escort Service 🔝✔️✔️
call girls in munirka DELHI 🔝 >༒9540349809 🔝 genuine Escort Service 🔝✔️✔️
 

Exploring & Summarizing Data Methods

  • 1. FK6163: Explore & Summarise. Dr Azmi Mohd Tamil, Dept of Community Health, Universiti Kebangsaan Malaysia. ©drtamil@gmail.com 2012
  • 2. Introduction: The method of exploring and summarising data differs according to the types of variables.
  • 3. Dependent/Independent: The independent variables (food intake, frequency of exercise) act on the dependent variable (obesity).
  • 5. Explore
      • It is the first step in the analytic process
      • to explore the characteristics of the data
      • to screen for errors and correct them
      • to look for distribution patterns: normal distribution or not
      • The data may require transformation before further analysis using parametric methods
      • Or may need analysis using non-parametric techniques
  • 6. Data Screening
      • By running frequencies, we may detect inappropriate responses.
      • How many in the audience have 15 children and are currently pregnant with the 16th?

      PARITY
      Parity   Frequency   Percent
      1            67        30.7
      2            44        20.2
      3            36        16.5
      4            22        10.1
      5            21         9.6
      6             8         3.7
      7             3         1.4
      8             7         3.2
      9             5         2.3
      10            3         1.4
      11            1          .5
      15            1          .5
      Total       218       100.0
  • 7. Data Screening: See whether the data make sense or not, e.g. parity 10 but age only 25.
  • 10. Data Screening: By looking at measures of central tendency and range, we can also detect abnormal values for quantitative data. Here the maximum of 484 is implausible for a pre-pregnancy weight.

      Descriptive Statistics
                               N    Minimum   Maximum   Mean    Std. Deviation
      Pre-pregnancy weight    184      32       484     53.05       33.37
      Valid N (listwise)      184
  • 11. Interpreting the Box Plot
      • Labels, from top to bottom: outlier, largest non-outlier, upper quartile, median, lower quartile, smallest non-outlier, outlier.
      • The whiskers extend to 1.5 times the box width from both ends of the box and end at an observed value.
      • Three times the box width marks the boundary between "mild" and "extreme" outliers.
      • "mild" = closed dots; "extreme" = open dots.
  • 12. Data Screening: We can also make use of graphical tools such as the box plot to detect wrong data entry. [Figure: box plot of pre-pregnancy weight, N = 184, vertical axis 0 to 600; labelled cases 73, 181, 211, 198 and 141.]
  • 13. Data Cleaning
      • Identify the extreme/wrong values.
      • Check with the original data source, i.e. the questionnaire.
      • If incorrect, do the necessary correction.
      • Correction must be done before transformation, recoding and analysis.
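
The box-plot rule from slide 11 can be tried on the parity frequencies from slide 6. The following is only a sketch: the function names `expand` and `flag_outliers` are illustrative, and it assumes the quartiles computed by Python's `statistics.quantiles`, which may differ slightly from those of the package used for the slides.

```python
from statistics import quantiles

# Parity frequency table as listed on the Data Screening slide: value -> count.
PARITY_COUNTS = {1: 67, 2: 44, 3: 36, 4: 22, 5: 21, 6: 8,
                 7: 3, 8: 7, 9: 5, 10: 3, 11: 1, 15: 1}


def expand(counts):
    """Turn a {value: count} table back into a sorted list of observations."""
    return sorted(v for v, c in counts.items() for _ in range(c))


def flag_outliers(values):
    """Split observations into mild and extreme outliers using box-plot fences.

    Assumption: quartiles come from statistics.quantiles (default method).
    """
    q1, _, q3 = quantiles(values, n=4)
    box = q3 - q1  # box width (interquartile range)
    inner = (q1 - 1.5 * box, q3 + 1.5 * box)  # whisker limits
    outer = (q1 - 3.0 * box, q3 + 3.0 * box)  # mild/extreme boundary
    mild = [v for v in values
            if outer[0] <= v < inner[0] or inner[1] < v <= outer[1]]
    extreme = [v for v in values if v < outer[0] or v > outer[1]]
    return mild, extreme


parity = expand(PARITY_COUNTS)
mild, extreme = flag_outliers(parity)
print("n =", len(parity))
print("mild outliers:", sorted(set(mild)))
print("extreme outliers:", sorted(set(extreme)))
```

A flagged value is only a candidate error: as slide 13 says, it still has to be checked against the original questionnaire before it is corrected.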
  • 14. Parameters of Data Distribution
      • Mean: the central value of the data
      • Standard deviation: a measure of how the data scatter around the mean
      • Symmetry (skewness): the degree to which the data pile up on one side of the mean
      • Kurtosis: how peaked or flat the curve is, i.e. how the data are spread between the centre and the tails
  • 15. Normal distribution
      • The Normal distribution is represented by a family of curves defined uniquely by two parameters, which are the mean and the standard deviation of the population.
      • The curves are always symmetrically bell shaped, but the extent to which the bell is compressed or flattened out depends on the standard deviation of the population.
      • However, the mere fact that a curve is bell shaped does not mean that it represents a Normal distribution, because other distributions may have a similar sort of shape.
  • 16. Normal distribution
      • If the observations follow a Normal distribution, a range covered by one standard deviation above the mean and one standard deviation below it includes about 68.3% of the observations;
      • a range of two standard deviations above and two below (± 2 sd) about 95.4% of the observations; and
      • of three standard deviations above and three below (± 3 sd) about 99.7% of the observations.
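
The three percentages on slide 16 can be reproduced from the Normal cumulative distribution function. This is a small sketch using `statistics.NormalDist` from the Python standard library; the helper name `coverage` is an illustrative choice.

```python
from statistics import NormalDist


def coverage(k, mean=0.0, sd=1.0):
    """Proportion of a Normal(mean, sd) population lying within k sd of the mean."""
    dist = NormalDist(mean, sd)
    return dist.cdf(mean + k * sd) - dist.cdf(mean - k * sd)


for k in (1, 2, 3):
    print(f"within ±{k} sd: {coverage(k):.1%}")
```

The proportions do not depend on the particular mean and standard deviation, which is why the same figures apply to every member of the Normal family.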
  • 17. Normality: Why bother with normality? Because it dictates the type of analysis that you can run on the data.
  • 18. Normality: Why? Parametric tests
      • Qualitative (dichotomous) vs quantitative; normally distributed data: Student's t test
      • Qualitative (polytomous, i.e. more than two categories) vs quantitative; normally distributed data: ANOVA
      • Quantitative vs quantitative; repeated measurement of the same individual & item (e.g. Hb level before & after treatment); normally distributed data: paired t test
      • Quantitative (continuous) vs quantitative (continuous); normally distributed data: Pearson correlation & linear regression
  • 19. Normality: Why? Non-parametric tests
      • Qualitative (dichotomous) vs quantitative; data not normally distributed: Wilcoxon rank sum test or Mann-Whitney U test
      • Qualitative (polytomous) vs quantitative; data not normally distributed: Kruskal-Wallis one-way ANOVA test
      • Quantitative vs quantitative; repeated measurement of the same individual & item: Wilcoxon signed rank test
      • Quantitative (continuous/ordinal) vs quantitative (continuous); data not normally distributed: Spearman/Kendall rank correlation
  • 20. Normality: How?
      • Explored graphically
          • Histogram
          • Stem & leaf
          • Box plot
          • Normal probability plot
          • Detrended normal plot
      • Explored statistically
          • Kolmogorov-Smirnov statistic, with Lilliefors significance level, and the Shapiro-Wilk statistic
          • Skewness (0)
          • Kurtosis (0): positive = leptokurtic, 0 = mesokurtic, negative = platykurtic
  • 21. Kolmogorov-Smirnov
      • In the 1930s, Andrei Nikolaevich Kolmogorov (1903-1987) and N.V. Smirnov (his student) developed an approach for comparing distributions that did not make use of parameters.
      • This is known as the Kolmogorov-Smirnov test.
  • 22. Skewness
      • Skewed to the right indicates the presence of large extreme values.
      • Skewed to the left indicates the presence of small extreme values.
  • 23. Kurtosis
      • For symmetrical distributions only.
      • Describes the shape of the curve.
      • Mesokurtic: average shaped
      • Leptokurtic: narrow & slim
      • Platykurtic: flat & wide
  • 24. Skewness & Kurtosis
      • Skewness ranges from -3 to 3.
      • The acceptable range for normality is skewness lying between -1 and 1.
      • Normality should not be based on skewness alone; the kurtosis measures the "peakedness" of the bell curve (see Fig. 4).
      • Likewise, the acceptable range for normality is kurtosis lying between -1 and 1.
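
As an illustration of the ±1 rule, skewness and kurtosis can be computed from the central moments. The sketch below uses the 20 observations from the worked examples later in this deck. It assumes the simple moment-based estimators (kurtosis reported as excess over the Normal value of 3); statistical packages often apply small-sample corrections, so their output may differ slightly. The function name `shape_statistics` is illustrative.

```python
DATA = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]


def shape_statistics(values):
    """Return (skewness, excess kurtosis) from the central moments.

    Assumption: plain moment estimators without small-sample correction.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0           # 0 for a Normal curve
    return skewness, excess_kurtosis


skewness, excess_kurtosis = shape_statistics(DATA)
acceptable = -1 <= skewness <= 1 and -1 <= excess_kurtosis <= 1
print(f"skewness = {skewness:.3f}, excess kurtosis = {excess_kurtosis:.3f}")
print("within the ±1 guideline:", acceptable)
```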
  • 26. Normality: Examples, Graphically. [Figure: histogram of Height; Mean = 151.6, Std. Dev = 5.26, N = 218.]
  • 27. Q-Q Plot
      • This plot compares the quantiles of a data distribution with the quantiles of a standardised theoretical distribution from a specified family of distributions (in this case, the normal distribution).
      • If the distributional shapes differ, then the points will plot along a curve instead of a line.
      • Note that the interest here is the central portion of the line: severe deviations there mean non-normality. Deviations at the "ends" of the curve signify the existence of outliers.
  • 28. Normality: Examples, Graphically. [Figures: Normal Q-Q plot of Height (expected normal against observed value) and detrended Normal Q-Q plot of Height (deviation from normal against observed value).]
  • 29. Normal distribution: mean = median = mode.
  • 30. Normality: Examples, Statistically

      Descriptives: Height
                                          Statistic   Std. Error
      Mean                                 151.65       .356
      95% Confidence Interval for Mean
          Lower Bound                      150.94
          Upper Bound                      152.35
      5% Trimmed Mean                      151.59
      Median                               151.50
      Variance                              27.649
      Std. Deviation                         5.258
      Minimum                              139
      Maximum                              168
      Range                                 29
      Interquartile Range                    8.00
      Skewness                               .148       .165
      Kurtosis                               .061       .328

      Tests of Normality: Kolmogorov-Smirnov (with Lilliefors Significance Correction)
                  Statistic    df    Sig.
      Height        .060       218   .052

      • Normal distribution: mean = median = mode.
      • Skewness & kurtosis within ±1.
      • p > 0.05, so normal distribution.
      • Shapiro-Wilk: only if the sample size is less than 100.
  • 32. K-S Test
      • Very sensitive to the sample size of the data.
      • For small samples (n < 20, say), the likelihood of getting p < 0.05 is low.
      • For large samples (n > 100), even a slight deviation from normality will be reported as a non-normal distribution.
  • 33. Guide to deciding on normality
  • 34. Normality Transformation. [Figures: Normal Q-Q plot of PARITY and Normal Q-Q plot of LN_PARIT (expected normal against observed value).]
  • 35. Types of Transformations
      • Square root
      • Logarithm
      • Inverse
      • Reflect and square root
      • Reflect and logarithm
      • Reflect and inverse
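
To see what a logarithm transformation does to a right-skewed variable, the sketch below rebuilds the parity data from the frequency table on slide 6 and compares the skewness before and after taking natural logs (the LN_PARIT variable on slide 34 suggests the same transformation). The moment-based skewness estimator and the function names are assumptions made for this illustration.

```python
from math import log

# Parity frequency table from the Data Screening slide: value -> count.
PARITY_COUNTS = {1: 67, 2: 44, 3: 36, 4: 22, 5: 21, 6: 8,
                 7: 3, 8: 7, 9: 5, 10: 3, 11: 1, 15: 1}


def expand(counts):
    """Turn a {value: count} table back into a list of observations."""
    return [v for v, c in counts.items() for _ in range(c)]


def skewness(values):
    """Moment-based skewness (assumption: no small-sample correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


parity = expand(PARITY_COUNTS)
# Every parity value here is at least 1, so the log is defined;
# data containing zeros would need a constant added first.
ln_parity = [log(v) for v in parity]

skew_raw = skewness(parity)
skew_ln = skewness(ln_parity)
print(f"skewness of parity:     {skew_raw:.2f}")
print(f"skewness of ln(parity): {skew_ln:.2f}")
```

The raw variable lies outside the ±1 guideline for skewness, while the transformed one falls inside it.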
  • 36. Summarise
      • Summarise a large set of data by a few meaningful numbers.
      • Single variable analysis
          • For the purpose of describing the data
          • Example: in one year, what kinds of cases are treated by the Psychiatric Dept?
          • Tables & diagrams are usually used to describe the data
          • For numerical data, measures of central tendency & spread are usually used
  • 37. Frequency Table: illustrates the frequency observed for each category.

      Race        F        %
      Malay      760     95.84%
      Chinese      5      0.63%
      Indian       0      0.00%
      Others      28      3.53%
      TOTAL      793    100.00%
  • 38. Frequency Distribution Table
      • More than 20 observations are best presented as a frequency distribution table.
      • Columns are divided into class & frequency.
      • The modal class can be determined using such tables.

      Age          No.       %
      0-0.99        25     3.26%
      1-4.99        78    10.18%
      5-14.99      140    18.28%
      15-24.99     126    16.45%
      25-34.99     112    14.62%
      35-44.99      90    11.75%
      45-54.99      66     8.62%
      55-64.99      60     7.83%
      65-74.99      50     6.53%
      75-84.99      16     2.09%
      85+            3     0.39%
      TOTAL        766
  • 39. Measurement of Central Tendency & Spread
  • 40. Measures of Central Tendency: mean, mode, median.
  • 41. Measures of Variability: standard deviation, inter-quartiles, skewness & kurtosis.
  • 42. Mean
      • The average of the data collected.
      • To calculate the mean, add up the observed values and divide by the number of them.
      • A major disadvantage of the mean is that it is sensitive to outlying points.
  • 43. Mean: Example
      • 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
      • Total of x = 648
      • n = 20
      • Mean = 648/20 = 32.4
  • 44. Measures of variation: standard deviation
      • Tells us how closely the scores in a dataset cluster around the mean. A large S.D. indicates more varied scores.
      • A summary measure of the differences of each observation from the mean.
      • If the differences themselves were added up, the positive would exactly balance the negative and so their sum would be zero.
      • Consequently the squares of the differences are added.
  • 46. sd: Example
      • 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
      • Mean = 32.4; n = 20
      • Total of (x-mean)^2 = 3050.8
      • Variance = 3050.8/19 = 160.5684
      • sd = 160.5684^0.5 = 12.67

       x     (x-mean)^2         x     (x-mean)^2
      12       416.16          32        0.16
      13       376.36          35        6.76
      17       237.16          37       21.16
      21       129.96          38       31.36
      24        70.56          41       73.96
      24        70.56          43      112.36
      26        40.96          44      134.56
      27        29.16          46      184.96
      27        29.16          53      424.36
      30         5.76          58      655.36
      TOTAL   1405.80          TOTAL  1645.00
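
The arithmetic on slides 43 and 46 can be checked with a few lines of Python. This is a sketch using the standard `statistics` module, whose `variance` and `stdev` functions divide by n - 1 as the slide does.

```python
from statistics import mean, stdev, variance

DATA = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

data_mean = mean(DATA)                               # 648 / 20
sum_sq_dev = sum((x - data_mean) ** 2 for x in DATA)  # total of (x-mean)^2
sample_variance = variance(DATA)                     # sum_sq_dev / (n - 1)
sample_sd = stdev(DATA)                              # square root of the variance

print(f"mean      = {data_mean}")
print(f"sum of squared deviations = {sum_sq_dev:.1f}")
print(f"variance  = {sample_variance:.4f}")
print(f"sd        = {sample_sd:.2f}")
```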
  • 47. Median
      • The ranked value that lies in the middle of the data.
      • The point which has the property that half the data are greater than it, and half the data are less than it.
      • If n is even, average the (n/2)th and the (n/2 + 1)th observations.
      • "Robust" to outliers.
  • 48. Median: Example
      • 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
      • (20+1)/2 = 10.5, so the median lies between the 10th value, which is 30, and the 11th, which is 32.
      • Therefore the median is (30 + 32)/2 = 31.
  • 49. Measures of variation: quartiles
      • The range is very susceptible to what are known as outliers.
      • A more robust approach is to divide the distribution of the data into four, and find the points below which are 25%, 50% and 75% of the distribution. These are known as quartiles, and the median is the second quartile.
  • 50. Quartiles
      • 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
      • 25th percentile: (24+24)/2 = 24
      • 50th percentile: (30+32)/2 = 31, which is the median
      • 75th percentile: (41+43)/2 = 42; interpolating at position 0.75(n+1) = 15.75 instead gives 41 + 0.75(43 - 41) = 42.5
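
Different quartile conventions give slightly different answers, which the 75th percentile above illustrates. The sketch below compares two of them: the median of each half of the ordered data (the helper `quartiles_by_halves` is an illustrative name) and the interpolation at position p(n + 1) used by the default method of `statistics.quantiles`.

```python
from statistics import median, quantiles

DATA = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]


def quartiles_by_halves(values):
    """Quartiles as the medians of the lower half, the whole, and the upper half.

    Assumes at least two observations; for odd n the middle value
    is left out of both halves.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower, upper = ordered[:half], ordered[-half:]
    return median(lower), median(ordered), median(upper)


q1, q2, q3 = quartiles_by_halves(DATA)
print("median of halves:       ", q1, q2, q3)

# Default 'exclusive' method: interpolates at position p * (n + 1).
print("interpolated quartiles: ", quantiles(DATA, n=4))
print("interquartile range (halves):", q3 - q1)
```

Whichever convention is chosen, it should be stated and used consistently when reporting the median and IQR.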
  • 51. Mode
      • The most frequently occurring number, e.g. 3, 13, 13, 20, 22, 25: mode = 13.
      • It is usually more informative to quote the mode accompanied by the percentage of times it happened; e.g., the mode is 13 with 33% of the occurrences.
  • 52. Mode: Example
      • 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
      • Modes are 24 (10%) & 27 (10%).
  • 53. Mean or Median?
      • Which measure of central tendency should we use?
      • If the distribution is normal, present the mean and SD; otherwise the median and IQR are more appropriate.
  • 54. Not normal distribution: use median & IQR. Normal distribution: use mean & SD.
  • 55. Presentation: Qualitative & Quantitative Data, Charts & Tables
  • 56. Presentation: Qualitative Data
  • 57. Graphing Categorical Data: Univariate Data
      • Tabulating data: the summary table
      • Graphing data: pie charts, bar charts, Pareto diagram
      [Figures: example pie chart, bar chart and Pareto diagram for the categories Stocks, Bonds, Savings and CD.]
  • 58. Bar Chart. [Figure: bar chart of percent by type of work; categories Housewife, Office work and Field work; bar labels 69, 20 and 11.]
  • 59. Pie Chart. [Figure: pie chart with segments Malay, Chinese and Others.]
  • 60. Tabulating and Graphing Bivariate Categorical Data
      • Contingency tables

      Table 1: Contingency table of pregnancy induced hypertension and SGA (count)
                                          SGA
      Pregnancy induced hypertension   Normal    SGA    Total
      No                                 103      94     197
      Yes                                  5      16      21
      Total                              108     110     218
  • 61. Tabulating and Graphing Bivariate Categorical Data
      • Side by side charts
      [Figure: side-by-side bar chart of count by pregnancy induced hypertension (No, Yes), with separate bars for Normal and SGA; bar labels 103, 94 and 16.]
  • 62. Presentation: Quantitative Data
  • 63. Tabulating and Graphing Numerical Data
      • Numerical data: 41, 24, 32, 26, 27, 27, 30, 24, 38, 21
      • Ordered array: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41
      • Stem and leaf display: 2 | 144677, 3 | 028, 4 | 1
      • Frequency distributions and cumulative distributions: tables, histograms, polygons, ogive
      [Figures: example histogram, frequency polygon and ogive.]
  • 64. Tabulating Numerical Data: Frequency Distributions
      • Sort raw data in ascending order: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
      • Find range: 58 - 12 = 46
      • Select number of classes: 5 (usually between 5 and 15)
      • Compute class interval (width): 10 (46/5 = 9.2, then round up)
      • Determine class boundaries (limits): 10, 20, 30, 40, 50, 60
      • Compute class midpoints: 14.95, 24.95, 34.95, 44.95, 54.95
      • Count observations & assign to classes
  • 65. Frequency Distributions and Percentage Distributions
      Data in ordered array: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

      Class          Midpoint   Freq     %
      10.0 - 19.9     14.95       3     15%
      20.0 - 29.9     24.95       6     30%
      30.0 - 39.9     34.95       5     25%
      40.0 - 49.9     44.95       4     20%
      50.0 - 59.9     54.95       2     10%
      TOTAL                      20    100%
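
The tabulation steps on slides 64 and 65 can be sketched as follows. The constants `LOWER`, `WIDTH` and `N_CLASSES` simply encode the choices made on the slide (first class starting at 10, width 10, five classes); the variable names are illustrative.

```python
DATA = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

LOWER = 10      # lower limit of the first class
WIDTH = 10      # class interval: range 46 / 5 classes = 9.2, rounded up
N_CLASSES = 5

# Count the observations falling into each class.
counts = [0] * N_CLASSES
for x in DATA:
    counts[(x - LOWER) // WIDTH] += 1

percents = [100 * c / len(DATA) for c in counts]

print("Class          Midpoint  Freq     %")
for i, (c, p) in enumerate(zip(counts, percents)):
    low = LOWER + i * WIDTH
    high = low + WIDTH - 0.1           # classes are written as 10.0 - 19.9, etc.
    midpoint = (low + high) / 2
    print(f"{low:4.1f} - {high:4.1f}   {midpoint:6.2f}   {c:3d}   {p:4.0f}%")
print(f"TOTAL                   {sum(counts):3d}   {sum(percents):4.0f}%")
```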
  • 66. Graphing Numerical Data: The Histogram
      Data in ordered array: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
      [Figure: histogram of age with frequencies 3, 6, 5, 4 and 2 at class midpoints 14.95, 24.95, 34.95, 44.95 and 54.95; bars meet at the class boundaries, with no gaps between bars.]
  • 67. Graphing Numerical Data: The Frequency Polygon
      Data in ordered array: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
      [Figure: frequency polygon plotted at class midpoints 14.95, 24.95, 34.95, 44.95 and 54.95.]
  • 68. Calculate Measures of Central Tendency & Spread: we can use the frequency distribution table to calculate the mean, standard deviation, median and mode.
  • 69. Mean: $\bar{X} = \dfrac{\sum f \cdot mp}{n}$
      • Mean = 659/20 = 32.95
      • Compare with 32.4 from direct calculation.

      Class          Midpoint   Freq   freq x m.p.
      10.0 - 19.9     14.95       3       44.85
      20.0 - 29.9     24.95       6      149.70
      30.0 - 39.9     34.95       5      174.75
      40.0 - 49.9     44.95       4      179.80
      50.0 - 59.9     54.95       2      109.90
      TOTAL                      20      659.00
  • 70. Standard deviation: $s = \sqrt{\dfrac{\sum f \cdot mp^2 - \frac{\left(\sum f \cdot mp\right)^2}{n}}{n - 1}}$
      • s^2 = (24634.05 - (659^2/20))/19
      • s^2 = 2920.00/19
      • s^2 = 153.68
      • s = 12.4
      • Compare with 12.67 from direct measurement.

      Class          Midpoint   Freq    f.mp     f.mp^2
      10.0 - 19.9     14.95       3     44.85     670.51
      20.0 - 29.9     24.95       6    149.70    3735.02
      30.0 - 39.9     34.95       5    174.75    6107.51
      40.0 - 49.9     44.95       4    179.80    8082.01
      50.0 - 59.9     54.95       2    109.90    6039.01
      TOTAL                      20    659.00   24634.05
  • 71. Median: $\text{Median} = L_1 + i \cdot \dfrac{(n+1)/2 - f_1}{f_{med}}$
      • L1 = lower boundary of the median class; i = class width; f1 = cumulative frequency of the classes before the median class; fmed = frequency of the median class
      • 29.95 + 10((21/2) - 9)/5
      • 29.95 + 15/5 = 32.95
      • From direct calculation, median = 31.

      Class          Freq
      10.0 - 19.9      3
      20.0 - 29.9      6
      30.0 - 39.9      5   (median class)
      40.0 - 49.9      4
      50.0 - 59.9      2
      TOTAL           20
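  • 72. Mode = L1 + i * (Diff1/(Diff1 + Diff2))
      • Diff1 = modal class frequency minus the frequency of the class before it (6 - 3 = 3); Diff2 = modal class frequency minus the frequency of the class after it (6 - 5 = 1)
      • = 19.95 + 10(3/(3+1))
      • = 27.45
      • Compare with modes of 24 & 27 from direct calculation.

      Class          Freq
      10.0 - 19.9      3
      20.0 - 29.9      6   (modal class)
      30.0 - 39.9      5
      40.0 - 49.9      4
      50.0 - 59.9      2
      TOTAL           20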
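
The four grouped-data estimates on slides 69 to 72 can be reproduced from the frequency table alone. The sketch below follows the formulas as given on the slides; the variable names, and the lower class boundaries 9.95, 19.95 and so on (inferred from the 29.95 and 19.95 used on the slides), are assumptions of this illustration.

```python
from math import sqrt

MIDPOINTS = [14.95, 24.95, 34.95, 44.95, 54.95]
FREQS = [3, 6, 5, 4, 2]
LOWER_BOUNDS = [9.95, 19.95, 29.95, 39.95, 49.95]  # assumed class boundaries
WIDTH = 10

n = sum(FREQS)
sum_fm = sum(f * m for f, m in zip(FREQS, MIDPOINTS))       # sum of f.mp
sum_fm2 = sum(f * m * m for f, m in zip(FREQS, MIDPOINTS))  # sum of f.mp^2

grouped_mean = sum_fm / n
grouped_var = (sum_fm2 - sum_fm ** 2 / n) / (n - 1)
grouped_sd = sqrt(grouped_var)

# Median: find the class containing position (n + 1) / 2.
target = (n + 1) / 2
cumulative = 0  # cumulative frequency before the current class (f1)
for bound, f in zip(LOWER_BOUNDS, FREQS):
    if cumulative + f >= target:
        grouped_median = bound + WIDTH * (target - cumulative) / f
        break
    cumulative += f

# Mode: interpolate inside the class with the highest frequency.
k = FREQS.index(max(FREQS))
before = FREQS[k - 1] if k > 0 else 0
after = FREQS[k + 1] if k < len(FREQS) - 1 else 0
diff1, diff2 = FREQS[k] - before, FREQS[k] - after
grouped_mode = LOWER_BOUNDS[k] + WIDTH * diff1 / (diff1 + diff2)

print(f"mean   = {grouped_mean:.2f}")
print(f"sd     = {grouped_sd:.2f}")
print(f"median = {grouped_median:.2f}")
print(f"mode   = {grouped_mode:.2f}")
```

These are approximations, since each observation is treated as if it sat at its class midpoint, which is why they differ from the values obtained directly from the raw data.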
  • 73. Graphing Bivariate Numerical Data (Scatter Plot)
  • 74. Linear Regression Line
  • 75. Survival Function. [Figure: cumulative survival (0.0 to 1.2) against DURATION (0 to 7), showing the survival function with censored cases marked.]
  • 76. Principles of Graphical Excellence
      • Presents data in a way that provides substance, statistics and design
      • Communicates complex ideas with clarity, precision and efficiency
      • Gives the largest number of ideas in the most efficient manner
      • Almost always involves several dimensions
      • Tells the truth about the data