28-08-2012




Nonparametric methods
Session XVIII

Chi-square Test for homogeneity/independence
Goodness of fit Tests: Chi-square & K-S


A newspaper turned to a survey firm in order to investigate whether there is any
relationship between the level of education and the frequency of its readership.

                                      Level of Education
Frequency of      Did not pass    Passed only    Graduate    Post-graduate or
readership        Xth grade       Xth grade                  professional degree holder
Never                  22              12            8              3
Sometimes              25              10           12              6
Almost always          18              15            9              5

Is there any significant evidence for dependence between the level of education
and readership?

Chi-square test of Independence
  • Independence or homogeneity
  • Useful to gauge dependence between qualitative or categorical variables
  • Find the expected frequency of each cell under independence (null hypothesis).
  • Expected frequency in each cell should be at least 5; merge groups otherwise.

Expected frequency in the (1,1) cell:
  $f_e = n\,P[X = 1, Y = 1]$
  $\;\;\;\; = n\,P[X = 1]\,P[Y = 1]$   (if independent)
  $\;\;\;\; = n \times \dfrac{\text{first row total}}{n} \times \dfrac{\text{first column total}}{n}$

In general:
  $f_e = \dfrac{\text{row total} \times \text{column total}}{\text{grand total}}$

Test statistic is
  $\chi^2 = \sum \dfrac{(f_o - f_e)^2}{f_e} = \left(\sum \dfrac{f_o^2}{f_e}\right) - n$
which has a chi-square distribution with $(r-1) \times (c-1)$ d.f. under $H_0$,
where r and c are the no. of groups in row and column respectively.


Degrees of freedom
  • Joint distribution has (rc-1) parameters (joint probabilities) in general.
  • Under independence between categorical variables, the joint distribution is
    completely specified by (r-1) + (c-1) parameters (marginal probabilities).
  • So “degrees of freedom” eaten by “independence” is given by
    (rc-1)-(r-1)-(c-1) = (r-1)*(c-1)
  • This is indeed the degrees of freedom associated with the test (chi-square
    distribution of the test statistic when H0 is true)
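
The expected-frequency formula and the test statistic above can be sketched in a few lines of Python. This is only an illustration using the newspaper survey counts; the variable names and the helper `chi2_sf_even_df` are assumptions introduced here, and the helper relies on the closed form of the chi-square tail probability that holds for even degrees of freedom only.

```python
import math

# Observed counts from the newspaper survey.
# Rows: Never, Sometimes, Almost always. Columns: <X, X, graduate, PG.
observed = [
    [22, 12, 8, 3],
    [25, 10, 12, 6],
    [18, 15, 9, 5],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency under independence: row total * column total / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pair every observed cell with its expected cell.
cells = [(fo, fe)
         for obs_row, exp_row in zip(observed, expected)
         for fo, fe in zip(obs_row, exp_row)]

# The two equivalent forms of the test statistic.
chi_sq = sum((fo - fe) ** 2 / fe for fo, fe in cells)
chi_sq_alt = sum(fo ** 2 / fe for fo, fe in cells) - n

# (r - 1) * (c - 1) degrees of freedom.
df = (len(observed) - 1) * (len(observed[0]) - 1)


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df % 2 != 0:
        raise ValueError("this closed form needs an even number of d.f.")
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k)
                                 for k in range(df // 2))


p_value = chi2_sf_even_df(chi_sq, df)
print(f"chi-square = {chi_sq:.4f}, d.f. = {df}, p-value = {p_value:.6f}")
```

The two forms of the statistic should agree up to rounding, and a p-value this far above the usual significance levels gives no significant evidence of dependence between education level and readership.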




Observed Frequency Table
                            Level of Education
Freq. of readership     <X         X       graduate     PG      row-total
Never                   22        12          8          3         45
Sometimes               25        10         12          6         53
Almost always           18        15          9          5         47
column total            65        37         29         14        145

Expected Frequency Table
                            Level of Education
Freq. of readership     <X         X       graduate     PG      row-total
Never                 20.17     11.48       9.00       4.34        45
Sometimes             23.76     13.52      10.60       5.12        53
Almost always         21.07     11.99       9.40       4.54        47
column total            65        37         29         14        145

Cell contributions $(f_o - f_e)^2 / f_e$
                            Level of Education
Freq. of readership     <X           X         graduate      PG
Never                0.165576    0.023299     0.111111    0.416256
Sometimes            0.064862    0.918325     0.184906    0.152282
Almost always        0.447034    0.753886     0.017021    0.04705

test statistic     3.3016
d.f.               6
p-value            0.770151


The following win-loss table shows India’s performance against four top teams in
one-day games played in India. Does it show any significant evidence that India
does not play equally well against strong teams at home?

                Win     Loss
Australia        9        8
Pakistan         4       10
S.Africa         9        6
W.Indies         9       17
Total           31       41

H0: π1 = π2 = π3 = π4

Pooled estimate of proportion is 31/72.
Expected no. of wins at home against Aus (under H0) is 17*31/72.
Same argument in new cap!
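
The slide leaves the cricket calculation to the reader. A minimal sketch of it, assuming the same chi-square machinery is applied to the 4 × 2 win-loss table, might look like the following. The dictionary layout and variable names are illustrative, and the tail probability uses the closed form that holds specifically for 3 degrees of freedom.

```python
import math

# (wins, losses) against each team in home one-day games, from the slide.
results = {
    "Australia": (9, 8),
    "Pakistan": (4, 10),
    "S.Africa": (9, 6),
    "W.Indies": (9, 17),
}

total_wins = sum(w for w, _ in results.values())
total_games = sum(w + l for w, l in results.values())
pooled = total_wins / total_games  # 31/72, the common win probability under H0

chi_sq = 0.0
expected_wins = {}
for team, (wins, losses) in results.items():
    games = wins + losses
    e_win = games * pooled          # e.g. 17 * 31/72 for Australia
    e_loss = games * (1 - pooled)
    expected_wins[team] = e_win
    chi_sq += (wins - e_win) ** 2 / e_win + (losses - e_loss) ** 2 / e_loss

# (r - 1) * (c - 1) with r = 4 teams and c = 2 outcomes.
df = (len(results) - 1) * (2 - 1)

# Upper-tail probability of a chi-square variable with exactly 3 d.f.
half = chi_sq / 2
p_value = (math.erfc(math.sqrt(half))
           + 2 * math.sqrt(half / math.pi) * math.exp(-half))

print(f"chi-square = {chi_sq:.3f}, d.f. = {df}, p-value = {p_value:.3f}")
```

With these counts the statistic stays well below 7.815, the 5% critical value for 3 d.f., so under this sketch the table would not give significant evidence against H0.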








Validating Distributional assumptions

Tests for Goodness of Fit:
  • Chi-square Test
  • Kolmogorov-Smirnov Test


Breakdown in vacation: re-visited

Q. Is Poisson distribution justified?

#breakdown    #of months    prob      f-expected
     0             3        0.082       4.9251
     1            14        0.205      12.31275
     2            16        0.257      15.39094
     3            13        0.214      12.82578
     4             9        0.134       8.016113
     5             2        0.067       4.008057
     6             2        0.028       1.670024
     7             1        0.014       0.851239
   Total          60        1.000      60
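
The slide does not state the Poisson rate. The sketch below assumes it is estimated by the sample mean (150 breakdowns over 60 months, i.e. 2.5 per month) and that the last cell absorbs the whole upper tail, P(X ≥ 7). Under those two assumptions it reproduces the probabilities and expected frequencies in the table. The function and variable names are illustrative.

```python
import math

breakdowns = [0, 1, 2, 3, 4, 5, 6, 7]
months = [3, 14, 16, 13, 9, 2, 2, 1]  # observed number of months

n = sum(months)

# Assumption: the Poisson rate is estimated by the sample mean.
rate = sum(x * f for x, f in zip(breakdowns, months)) / n


def poisson_pmf(k, lam):
    """P[X = k] for a Poisson variable with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


# Exact Poisson probabilities for 0..6 breakdowns.
probs = [poisson_pmf(k, rate) for k in breakdowns[:-1]]
# Assumption: the last cell is "7 or more", so the probabilities sum to 1.
probs.append(1 - sum(probs))

expected = [n * p for p in probs]

for x, f, p, fe in zip(breakdowns, months, probs, expected):
    print(f"{x:>2}  {f:>3}  {p:.3f}  {fe:.6f}")
```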




   x      f-observed (fo)    prob     f-expected (fe)    (fo-fe)^2/fe
   0            3            0.082       4.9251            0.752474
   1           14            0.205      12.31275           0.231209
   2           16            0.257      15.39094           0.024102
   3           13            0.214      12.82578           0.002367
   4            9            0.134       8.016113          0.120761
 >= 5           5            0.109       6.529319          0.358202
 Total         60            1.000      60                 1.489115

p-value    0.828568

Degrees of freedom = 6-1-1 = 4

Conclusion: yes, Poisson distribution seems very reasonable


Chi-square goodness of fit test

Test statistic is
  $\chi^2 = \sum \dfrac{(f_o - f_e)^2}{f_e} = \left(\sum \dfrac{f_o^2}{f_e}\right) - n$
which has a chi-square distribution with (k - 1) d.f. under $H_0$, where k is the no. of cells.
 • The data need to be discrete or grouped
 • Expected frequency in each cell must be at least 5
 • If some of the parameters of the null hypothesized distribution are not
   specified, they may be estimated from the data. However, the d.f. need to be
   further reduced by the no. of parameters being estimated
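
The goodness-of-fit calculation for the breakdown data can be sketched as follows. The sketch assumes the Poisson rate is estimated by the sample mean (which is why one extra degree of freedom is lost) and merges the cells for 5, 6 and 7 breakdowns into a single ">= 5" cell, as in the table. Variable names are illustrative, and the p-value uses the closed form valid for even degrees of freedom.

```python
import math

observed = [3, 14, 16, 13, 9, 2, 2, 1]  # months with 0, 1, ..., 7 breakdowns
n = sum(observed)

# Assumption: rate estimated from the data as the sample mean.
rate = sum(x * f for x, f in enumerate(observed)) / n

# Poisson probabilities for the cells 0..4.
pmf = [math.exp(-rate) * rate ** k / math.factorial(k) for k in range(5)]

# Merge the sparse cells 5, 6, 7 into one ">= 5" cell.
obs_cells = observed[:5] + [sum(observed[5:])]
exp_cells = [n * p for p in pmf] + [n * (1 - sum(pmf))]

chi_sq = sum((fo - fe) ** 2 / fe for fo, fe in zip(obs_cells, exp_cells))

k = len(obs_cells)        # number of cells after merging
estimated_params = 1      # the Poisson rate
df = k - 1 - estimated_params

# Upper-tail chi-square probability; this closed form needs an even df.
half = chi_sq / 2
p_value = math.exp(-half) * sum(half ** j / math.factorial(j)
                                for j in range(df // 2))

print(f"chi-square = {chi_sq:.6f}, d.f. = {df}, p-value = {p_value:.6f}")
```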




Goodness of Fit Tests
• To check for the validity of distributional assumption (discrete or continuous)
   – EXAMPLES
      • Population normal (to use t-tests)
      • breakdown in vacation (Poisson distribution)
• Based on the difference between expected and observed (relative) frequencies
• Always 1-tailed test (reject for large value of the TS)


Kolmogorov-Smirnov Test

Test statistic is $\max |F_e - F_o|$, where $F_e$ is the expected and $F_o$ the
observed cumulative distribution.

[Figure: expected and observed F(x) plotted against x]

• Based on comparing (expected) cumulative probability with (observed)
  cumulative rel. freq.
• Cut-off values for the T.S. can be looked up from Appendix Table 8
• The data need not be grouped; for ungrouped data need to consider the
  cumulative relative frequency only at the data points
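
For ungrouped data, the last bullet can be sketched as a small function. The sample and the Uniform(0, 1) null distribution below are hypothetical, made up purely for illustration, and the name `ks_statistic` is an assumption. Because the observed cumulative relative frequency jumps at each data point, the sketch compares the expected cumulative probability with its value on both sides of the jump.

```python
def ks_statistic(sample, cdf):
    """Largest gap between the expected CDF and the observed cumulative rel. freq."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        fe = cdf(x)
        # The observed cumulative rel. freq. jumps from (i-1)/n to i/n at x,
        # so both sides of the jump are compared with the expected value.
        d = max(d, abs(i / n - fe), abs(fe - (i - 1) / n))
    return d


def uniform_cdf(x):
    """CDF of the Uniform(0, 1) distribution (hypothetical null hypothesis)."""
    return min(max(x, 0.0), 1.0)


# Hypothetical sample, not from the lecture.
sample = [0.7, 0.1, 0.4]
d_n = ks_statistic(sample, uniform_cdf)
print(f"D_n = {d_n:.4f}")
```

The resulting value would then be compared with the tabulated cut-off for the chosen significance level and sample size.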








Re-do checking for Poisson Distribution

   x      f-observed    F-observed    F-expected    difference
   0           3         0.05           0.082        0.032085
   1          14         0.283333       0.287        0.003964
   2          16         0.55           0.544        0.006187
   3          13         0.766667       0.758        0.009091
   4           9         0.916667       0.891        0.025489
   5           2         0.95           0.958        0.007979
   6           2         0.983333       0.986        0.002479
   7           1         1              0.996        0.004247
 Total        60                        TS           0.032085

At 20% level of significance, the C.R. is $D_n > \dfrac{1.07}{\sqrt{60}} = 0.1381$
So fail to reject $H_0$ at 20% level of significance


Comparison between Chi-square and K-S as G.O.F. Tests
• Both can be applied for grouped or ungrouped data; however, natural choice
  – Chi-square test for Grouped data
  – K-S test for ungrouped data
• K-S is applicable for small sample size also
• Chi-square test can be also applied for
  – qualitative random variables
  – when parameters are not specified
• K-S test is a nonparametric procedure
• Chi-square test is more powerful
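
The K-S check on the breakdown data can be sketched as below. It assumes the same Poisson rate as before (the sample mean, 2.5) and takes the 20% cut-off constant 1.07 from the slide. Variable names are illustrative.

```python
import math

observed = [3, 14, 16, 13, 9, 2, 2, 1]  # months with 0, 1, ..., 7 breakdowns
n = sum(observed)

# Assumption: Poisson rate estimated by the sample mean.
rate = sum(x * f for x, f in enumerate(observed)) / n

cum_obs = 0.0   # observed cumulative relative frequency
cum_exp = 0.0   # expected cumulative probability under the Poisson model
differences = []
for k, f in enumerate(observed):
    cum_obs += f / n
    cum_exp += math.exp(-rate) * rate ** k / math.factorial(k)
    differences.append(abs(cum_obs - cum_exp))

d_n = max(differences)            # test statistic
critical = 1.07 / math.sqrt(n)    # 20% level cut-off, constant from the slide
reject = d_n > critical

print(f"D_n = {d_n:.6f}, cut-off = {critical:.4f}, reject H0: {reject}")
```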




