Correlation Between Continuous
& Categorical Variables
CIVL 7012/8012
Association between variables
• Continuous and continuous variable
– Pearson’s correlation coefficient
• Categorical and categorical variable
– Chi-square test
– Cramer’s V
– Bonferroni correction
• Categorical and Continuous variable
– Point biserial correlation
Association between Continuous
Variables
• Compute Pearson’s correlation coefficient (r) and judge its magnitude |r|
– |r| < 0.3: weak correlation
– 0.3 ≤ |r| < 0.7: moderate correlation
– |r| ≥ 0.7: high correlation
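The rule of thumb above can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code: the function names `pearson_r` and `strength`, the sample data, and the treatment of the exact boundary values 0.3 and 0.7 are all assumptions.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def strength(r):
    """Label the magnitude of r with the rule-of-thumb thresholds.

    Assumption: a value exactly at 0.3 counts as moderate, exactly at 0.7 as high.
    """
    magnitude = abs(r)
    if magnitude < 0.3:
        return "weak"
    if magnitude < 0.7:
        return "moderate"
    return "high"

# Hypothetical data, for illustration only.
speed = [30, 40, 50, 60, 70]
stopping_distance = [12, 20, 33, 48, 70]
r = pearson_r(speed, stopping_distance)
print(round(r, 3), strength(r))
```

The sign of r gives the direction of the relationship; only its magnitude is compared against the thresholds.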
Association between categorical
variables
• Pearson’s correlation coefficient cannot be
applied.
• What are some of the methods?
• How do we compute them?
• What conclusion can we draw?
Set up hypothesis
• Null hypothesis: Assumes that there is no
association between the two variables.
• Alternative hypothesis: Assumes that there is
an association between the two variables.
Categorical variable
Observed        Male   Female
Married          456      516
Widowed           58      123
Divorced         142      172
Separated         29       50
Never married    188      207
Example: two categorical variables, marital status and gender
Question: How do we measure the degree of association?
Since these are categorical variables, Pearson’s correlation coefficient will not work.
Reference: https://peterstatistics.com
Pearson Chi-square test for independence
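• Calculate expected values
Expected        Male       Female
Married         437.1747   534.8253
Widowed         81.40804   99.59196
Divorced        141.2272   172.7728
Separated       35.53168   43.46832
Never married   177.6584   217.3416
\[ E_{i,j} = \frac{R_i \times C_j}{N} \]
where R_i is the total of row i, C_j is the total of column j, and N is the grand total.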
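As a rough sketch, the expected counts can be reproduced from the observed table by multiplying each row total by each column total and dividing by the grand total. The function name `expected_counts` and the dictionary layout are assumptions made for this illustration.

```python
# Observed counts from the example: (Male, Female) per marital status.
observed = {
    "Married": (456, 516),
    "Widowed": (58, 123),
    "Divorced": (142, 172),
    "Separated": (29, 50),
    "Never married": (188, 207),
}

def expected_counts(table):
    """Expected count per cell under independence: row total * column total / N."""
    rows = list(table)
    row_totals = [sum(table[r]) for r in rows]
    col_totals = [sum(col) for col in zip(*table.values())]
    n = sum(row_totals)
    return {
        r: tuple(rt * ct / n for ct in col_totals)
        for r, rt in zip(rows, row_totals)
    }

expected = expected_counts(observed)
for status, (male, female) in expected.items():
    print(f"{status:<14}{male:>10.4f}{female:>10.4f}")
```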
Calculate chi-sq contribution for each cell
(O-E)^2/E       Male       Female
Married         0.810646   0.662634
Widowed         6.730738   5.501811
Divorced        0.004229   0.003457
Separated       1.2007     0.981471
Never married   0.601988   0.492074
\[ \chi^2 = \sum_{i,j} \frac{(O_{i,j} - E_{i,j})^2}{E_{i,j}} \]
Pearson Chi-square value (sum of all cells): 16.98975
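The per-cell contributions and their sum can be checked with a short sketch like the following. The function name `chi_square_statistic` and the list-of-lists layout are assumptions for illustration; rows follow the order of the table (Married, Widowed, Divorced, Separated, Never married) and columns are Male, Female.

```python
observed = [
    [456, 516],  # Married
    [58, 123],   # Widowed
    [142, 172],  # Divorced
    [29, 50],    # Separated
    [188, 207],  # Never married
]

def chi_square_statistic(obs):
    """Sum of (O - E)^2 / E over all cells, with E = row total * column total / N."""
    row_totals = [sum(row) for row in obs]
    col_totals = [sum(col) for col in zip(*obs)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(obs):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            stat += (o - e) ** 2 / e
    return stat

chi2 = chi_square_statistic(observed)
print(round(chi2, 5))
```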
Degrees of freedom and significance
• Degrees of freedom = (r-1) *(c-1)
– In this example: (5-1)*(2-1) = 4
• Significance: for chi-square = 16.98975 with 4 degrees of freedom, the p-value is 0.00194
• Since p < 0.05, reject the null hypothesis
• Conclusion: there is an association between the
two variables.
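The p-value can be reproduced with the standard library alone. The sketch below relies on the closed-form upper tail of the chi-square distribution, which holds only for an even number of degrees of freedom (as in this example, where df = 4); the function name `chi2_sf_even_df` is an assumption made for this illustration.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df.

    For df = 2m the tail is exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    term = 1.0   # (x/2)^k / k! for k = 0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

n_rows, n_cols = 5, 2
df = (n_rows - 1) * (n_cols - 1)
p_value = chi2_sf_even_df(16.98975, df)
print(df, round(p_value, 5))
```

With SciPy installed, `scipy.stats.chi2.sf(16.98975, 4)` gives the same tail probability and works for any degrees of freedom.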
Cramer’s V (1)
\[ V = \sqrt{\frac{\chi^2}{n\,(q - 1)}} \]
• n = total number of observations
• q = min (# of rows, # of columns)
• Cramer’s V interpretation
– 0: The variables are not associated
– 1: The variables are perfectly associated
– 0.25: The variables are weakly associated
– 0.75: The variables are moderately associated
Cramer’s V (2)
• In this case: V ≈ 0.094, which is close to 0
– The association is very weak (practically negligible), even though the chi-square test is statistically significant
Observed        Male   Female   Total
Married          456      516     972
Widowed           58      123     181
Divorced         142      172     314
Separated         29       50      79
Never married    188      207     395
Total            873     1068    1941
Pearson Chi-square value: 16.98975
# of rows (r): 5
# of cols (c): 2
q: 2
Cramer's V: 0.093558
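A minimal sketch of the Cramer's V calculation for this example follows. The function name `cramers_v` and its argument names are assumptions made for illustration.

```python
import math

def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V = sqrt(chi2 / (n * (q - 1))), with q = min(rows, cols)."""
    q = min(n_rows, n_cols)
    return math.sqrt(chi2 / (n * (q - 1)))

v = cramers_v(chi2=16.98975, n=1941, n_rows=5, n_cols=2)
print(round(v, 6))
```

With a sample of 1941 observations, a small departure from independence is enough to make the chi-square test significant, which is why an effect-size measure such as V is reported alongside the p-value.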
Bonferroni correction
Adjusted residuals
Adjusted residual   Male       Female
Married             1.717883   -1.71788
Widowed             -3.67295   3.672949
Divorced            0.095753   -0.09575
Separated           -1.50823   1.508229
Never married       1.172004   -1.172
\[ \text{Adjusted residual}_{i,j} = \frac{O_{i,j} - E_{i,j}}{\sqrt{E_{i,j}\left(1 - \frac{R_i}{n}\right)\left(1 - \frac{C_j}{n}\right)}} \]
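Significance level: 0.05
# of tests: 10 (one per cell)
Adjusted significance level: 0.05 / 10 = 0.005
Only the widowed cells (male and female) show a significant association: their |adjusted residual| of about 3.67 exceeds the two-sided critical value of about 2.81 at the 0.005 level.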
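The post-hoc step can be sketched as follows: compute the adjusted residual of every cell, convert it to a two-sided p-value under a standard normal reference distribution, and compare that with the Bonferroni-adjusted level. The function names `adjusted_residuals` and `two_sided_p` are assumptions made for this illustration.

```python
import math

labels = ["Married", "Widowed", "Divorced", "Separated", "Never married"]
observed = [[456, 516], [58, 123], [142, 172], [29, 50], [188, 207]]

def adjusted_residuals(obs):
    """(O - E) / sqrt(E * (1 - R_i/n) * (1 - C_j/n)) for every cell."""
    row_totals = [sum(row) for row in obs]
    col_totals = [sum(col) for col in zip(*obs)]
    n = sum(row_totals)
    result = []
    for i, row in enumerate(obs):
        out = []
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            se = math.sqrt(e * (1 - row_totals[i] / n) * (1 - col_totals[j] / n))
            out.append((o - e) / se)
        result.append(out)
    return result

def two_sided_p(z):
    """Two-sided p-value of z under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

alpha = 0.05
n_tests = sum(len(row) for row in observed)  # one test per cell: 10
adjusted_alpha = alpha / n_tests             # Bonferroni: 0.005

residuals = adjusted_residuals(observed)
significant = [
    label
    for label, row in zip(labels, residuals)
    if any(two_sided_p(z) < adjusted_alpha for z in row)
]
print(significant)
```

In a table with two columns, the two residuals in a row are equal in magnitude and opposite in sign, so a row is either significant in both cells or in neither.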
Correlation between continuous and
categorical variables
• Point Biserial correlation
– product-moment correlation in which one variable
is continuous and the other variable is binary
(dichotomous)
– Categorical variable does not need to have
ordering
– Assumption: continuous data within each group
created by the binary variable are normally
distributed with equal variances and possibly
different means
Point Biserial correlation
• Suppose you want to find the correlation
between
– a continuous random variable Y and
– a binary random variable X which takes the values
zero and one.
• Assume that n paired observations (Y_k, X_k), k
= 1, 2, …, n are available.
– If the common product-moment correlation r is
calculated from these data, the resulting
correlation is called the point-biserial correlation.
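To illustrate that the point-biserial correlation is simply Pearson's r computed with a 0/1 variable, the sketch below calculates it both ways: directly as a product-moment correlation and through a group-mean form. The function names and the sample data are hypothetical, and the group-mean form shown is one common variant (it uses the population standard deviation of Y, with divisor n).

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def point_biserial(y, x):
    """Pearson's r between continuous y and binary x coded as 0/1."""
    if set(x) - {0, 1}:
        raise ValueError("x must contain only 0 and 1")
    return pearson_r(x, y)

def point_biserial_group_means(y, x):
    """Equivalent form: (mean1 - mean0) / s_Y * sqrt(n1 * n0 / n^2)."""
    y1 = [v for v, g in zip(y, x) if g == 1]
    y0 = [v for v, g in zip(y, x) if g == 0]
    n, n1, n0 = len(y), len(y1), len(y0)
    mean_y = sum(y) / n
    s_y = math.sqrt(sum((v - mean_y) ** 2 for v in y) / n)  # divisor n
    return (sum(y1) / n1 - sum(y0) / n0) / s_y * math.sqrt(n1 * n0 / n ** 2)

# Hypothetical data: y is continuous, x is a 0/1 group indicator.
y = [4.1, 5.0, 3.8, 6.2, 7.1, 6.8, 5.9]
x = [0, 0, 0, 1, 1, 1, 1]
print(round(point_biserial(y, x), 4), round(point_biserial_group_means(y, x), 4))
```

With SciPy installed, `scipy.stats.pointbiserialr(x, y)` returns the same coefficient together with a p-value.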
Point Biserial correlation
• Point biserial correlation is defined by
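One commonly used closed form is given here as an assumption about which variant the lecture intends. It is algebraically identical to Pearson's r computed with X coded as 0 and 1:

```latex
r_{pb} = \frac{\bar{Y}_1 - \bar{Y}_0}{s_Y} \sqrt{\frac{n_1 n_0}{n^2}}
```

Here \bar{Y}_1 and \bar{Y}_0 are the means of Y in the groups X = 1 and X = 0, n_1 and n_0 are the group sizes, n = n_1 + n_0, and s_Y is the standard deviation of all n values of Y computed with divisor n. If the sample standard deviation (divisor n - 1) is used instead, the factor under the square root becomes n_1 n_0 / (n (n - 1)).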
Hypothesis test
