Ce diaporama a bien été signalé.
Nous utilisons votre profil LinkedIn et vos données d’activité pour vous proposer des publicités personnalisées et pertinentes. Vous pouvez changer vos préférences de publicités à tout moment.

correlation and regression

correlation and regression

  • Soyez le premier à commenter

correlation and regression

  1. 1. 1 Correlation and Regression BY UNSA SHAKIR
  2. 2. 2 Correlation and Regression Correlation describes the strength of a linear relationship between two variables Regression tells us how to draw the straight line described by the correlation
  3. 3. 3 Correlation and Regression • For example: A sociologist may be interested in the relationship between education and self-esteem or Income and Number of Children in a family. Independent Variables Education Family Income Dependent Variables Self-Esteem Number of Children
  4. 4. 4 Correlation and Regression • For example: • May expect: As education increases, self-esteem increases (positive relationship). • May expect: As family income increases, the number of children in families declines (negative relationship). Independent Variables Education Family Income Dependent Variables Self-Esteem Number of Children + -
  5. 5. 5 Correlation
  6. 6. 6 Correlation • Correlation is a statistical technique used to determine the degree to which two variables are related • A correlation is a relationship between two variables. The data can be represented by the ordered pairs (x, y) where x is the independent (or explanatory) variable, and y is the dependent (or response) variable.
  7. 7. 7 Correlation x 1 2 3 4 5 y – 4 – 2 – 1 0 2 A scatter plot can be used to determine whether a linear (straight line) correlation exists between two variables. x 2 4 –2 – 4 y 2 6 Example:
  8. 8. 8 Linear Correlation x y Negative Linear Correlation x y No Correlation x y Positive Linear Correlation x y Nonlinear Correlation As x increases, y tends to decrease. As x increases, y tends to increase.
  9. 9. 9 Correlation Coefficient • It is also called Pearson's correlation or product moment correlation coefficient • The correlation coefficient is a measure of the strength and the direction of a linear relationship between two variables. The symbol r represents the sample correlation coefficient. The formula for r is       2 22 2 . n xy x y r n x x n y y           
  10. 10. 10 The sign of r denotes the nature of association while the value of r denotes the strength of association.
  11. 11. 11 If the sign is +ve this means the relation is direct (an increase in one variable is associated with an increase in the other variable and a decrease in one variable is associated with a decrease in the other variable). While if the sign is -ve this means an inverse or indirect relationship (which means an increase in one variable is associated with a decrease in the other).
  12. 12. 12 The value of r ranges between ( -1) and ( +1) The value of r denotes the strength of the association as illustrated by the following diagram. -1 10-0.25-0.75 0.750.25 strong strongintermediate intermediateweak weak no relation perfect correlation perfect correlation Directindirect
  13. 13. 13 If r = Zero this means no association or correlation between the two variables. If 0 < r < 0.25 = weak correlation. If 0.25 ≤ r < 0.75 = intermediate correlation. If 0.75 ≤ r < 1 = strong correlation. If r = l = perfect correlation.
  14. 14. 14 Linear Correlation x y Strong negative correlation x y Weak positive correlation x y Strong positive correlation x y Nonlinear Correlation r = 0.91 r = 0.88 r = 0.42 r = 0.07
  15. 15. 15 Calculating a Correlation Coefficient       2 22 2 . n xy x y r n x x n y y            1. Find the sum of the x-values. 2. Find the sum of the y-values. Calculating a Correlation Coefficient In Words In Symbols x y xy3. Multiply each x-value by its corresponding y-value and find the sum.
  16. 16. 16 Calculating a Correlation Coefficient Calculating a Correlation Coefficient In Words In Symbols 2 x 2 y 4. Square each x-value and find the sum. 5. Square each y-value and find the sum. 6. Use these five sums to calculate the correlation coefficient.
  17. 17. 17 Correlation Coefficient x y xy x2 y2 1 – 3 – 3 1 9 2 – 1 – 2 4 1 3 0 0 9 0 4 1 4 16 1 5 2 10 25 4 Example: Calculate the correlation coefficient r for the following data. 15x  1y   9xy  2 55x  2 15y 
  18. 18. 18 Correlation Coefficient       2 22 2 n xy x y r n x x n y y            Example: Calculate the correlation coefficient r for the following data.     22 5(9) 15 1 5(55) 15 5(15) 1       60 50 74  0.986 There is a strong positive linear correlation between x and y.
  19. 19. 19 Correlation Coefficient Hours, x 0 1 2 3 3 5 5 5 6 7 7 10 Test score, y 96 85 82 74 95 68 76 84 58 65 75 50 Example: The following data represents the number of hours, 12 different students watched television during the weekend and the scores of each student who took a test the following Monday. a.) Display the scatter plot. b.) Calculate the correlation coefficient r.
  20. 20. 20 Correlation Coefficient 100 x y Hours watching TV Testscore 80 60 40 20 2 4 6 8 10 Hours, x 0 1 2 3 3 5 5 5 6 7 7 10 Test score, y 96 85 82 74 95 68 76 84 58 65 75 50
  21. 21. 21 Correlation Coefficient Hours, x 0 1 2 3 3 5 5 5 6 7 7 10 Test score, y 96 85 82 74 95 68 76 84 58 65 75 50 xy 0 85 16 4 222 28 5 34 0 38 0 420 348 45 5 52 5 50 0 x2 0 1 4 9 9 25 25 25 36 49 49 10 0 y2 921 6 722 5 67 24 547 6 90 25 46 24 57 76 705 6 336 4 42 25 56 25 25 00 Example continued: 54x  908y  3724xy  2 332x  2 70836y 
  22. 22. 22 Correlation Coefficient Example continued:       2 22 2 n xy x y r n x x n y y                22 12(3724) 54 908 12(332) 54 12(70836) 908     0.831  • There is a strong negative linear correlation. • As the number of hours spent watching TV increases, the test scores tend to decrease.
  23. 23. 23 Example: A sample of 6 children was selected, data about their age in years and weight in kilograms was recorded as shown in the following table . It is required to find the correlation between age and weight. Weight (Kg) Age (years) serial No 1271 862 1283 1054 1165 1396
  24. 24. 24 Y2X2xy Weight (Kg) (y) Age (year) (x) Serial n. 14449841271 643648862 14464961283 10025501054 12136661165 169811171396 ∑y2= 742 ∑x2= 291 ∑xy= 461 ∑y= 66 ∑x= 41 Total
  25. 25. 25 r = 0.759 strong direct correlation                 6 (66) 742. 6 (41) 291 6 6641 461 r 22
  26. 26. 26 EXAMPLE: Relationship betweenAnxiety and Test Scores Anxiety (X) Test score (Y) X2 Y2 XY 10 2 100 4 20 8 3 64 9 24 2 9 4 81 18 1 7 1 49 7 5 6 25 36 30 6 5 36 25 30 ∑X = 32 ∑Y = 32 ∑X2 = 230 ∑Y2 = 204 ∑XY=129
  27. 27. 27 Calculating Correlation Coefficient    94. )200)(356( 1024774 32)204(632)230(6 )32)(32()129)(6( 22      r r = - 0.94 Indirect strong correlation
  28. 28. 28 Example Tree Height Trunk Diameter y x xy y2 x2 35 8 280 1225 64 49 9 441 2401 81 27 7 189 729 49 33 6 198 1089 36 60 13 780 3600 169 21 7 147 441 49 45 11 495 2025 121 51 12 612 2601 144 Σ =321 Σ =73 Σ =3142 Σ =14111 Σ =713
  29. 29. 29 13 0 10 20 30 40 50 60 70 0 2 4 6 8 10 12 14 2 Trunk Diameter, x Tree Height, y Example • r = 0.886 → relatively strong positive linear association between x and y
  30. 30. 30
  31. 31. 31 Regression
  32. 32. 32 Regression Analyses • Regression technique is concerned with predicting some variables by knowing others • The process of predicting variable Y using variable X
  33. 33. 33 20 Types of Regression Models Positive Linear Relationship Negative Linear Relationship Relationship NOT Linear No Relationship
  34. 34. 34 Regression Uses a variable (x) to predict some outcome variable (y) Tells you how values in y change as a function of changes in values of x
  35. 35. 35 The regression line makes the sum of the squares of the residuals smaller than for any other line Regression minimizes residuals 80 100 120 140 160 180 200 220 60 70 80 90 100 110 120 Wt (kg) SBP(mmHg)
  36. 36. 36 By using the least squares method (a procedure that minimizes the vertical deviations of plotted points surrounding a straight line) we are able to construct a best fitting straight line to the scatter diagram points and then formulate a regression equation in the form of:         n x)( x n yx xy b 2 2 1 )xb(xyyˆ  bXayˆ  Regression equation describes the regression line mathematically by showing Intercept and Slope
  37. 37. 37 Correlation and Regression • The statistics equation for a line: Y = a + bx Where: Y = the line’s position on the vertical axis at any point (estimated value of dependent variable) X = the line’s position on the horizontal axis at any point (value of the independent variable for which you want an estimate of Y) b = the slope of the line (called the coefficient) a = the intercept with the Y axis, where X equals zero ^ ^
  38. 38. 38 Linear Equations Y Y = bX + a a = Y-intercept X Change in Y Change in X b = Slope
  39. 39. 39 Exercise A sample of 6 persons was selected the value of their age ( x variable) and their weight is demonstrated in the following table. Find the regression equation and what is the predicted weight when age is 8.5 years. Weight (y)Age (x)Serial no. 12 8 12 10 11 13 7 6 8 5 6 9 1 2 3 4 5 6
  40. 40. 40 Answer Y2X2xyWeight (y)Age (x)Serial no. 144 64 144 100 121 169 49 36 64 25 36 81 84 48 96 50 66 117 12 8 12 10 11 13 7 6 8 5 6 9 1 2 3 4 5 6 7422914616641Total
  41. 41. 41 6.83 6 41 x  11 6 66 y 92.0 6 )41( 291 6 6641 461 2     b Regression equation 6.83)0.9(x11yˆ (x) 
  42. 42. 42 0.92x4.675yˆ (x)  12.50Kg8.5*0.924.675yˆ (8.5)  Kg58.117.5*0.924.675yˆ (7.5) 
  43. 43. 43 we create a regression line by plotting two estimated values for y against their X component, then extending the line right and left.
  44. 44. 44 Regression Line Example: a.) Find the equation of the regression line. b.) Use the equation to find the expected value when value of x is 2.3 x y xy x2 y2 1 – 3 – 3 1 9 2 – 1 – 2 4 1 3 0 0 9 0 4 1 4 16 1 5 2 10 25 4 15x  1y   9xy  2 55x  2 15y 
  45. 45. 45 Regression Line 2 x y 1 1 2 3 1 2 3 4 5     22 n xy x y m n x x             2 5(9) 15 1 5(55) 15     60 50  1.2
  46. 46. 46 Regression Line Example: The following data represents the number of hours 12 different students watched television during the weekend and the scores of each student who took a test the following Monday. a.) Find the equation of the regression line. b.) Use the equation to find the expected test score for a student who watches 9 hours of TV.
  47. 47. 47 Regression Line Hours, x 0 1 2 3 3 5 5 5 6 7 7 10 Test score, y 96 85 82 74 95 68 76 84 58 65 75 50 xy 0 85 164 222 285 340 380 420 348 455 525 500 x2 0 1 4 9 9 25 25 25 36 49 49 100 y2 9216 722 5 672 4 547 6 902 5 462 4 577 6 705 6 336 4 422 5 562 5 250 0 54x  908y  3724xy  2 332x  2 70836y 
  48. 48. 48 • Find the correlation between age and blood pressure using simple and Spearman's correlation coefficients, and comment. • Find the regression equation? • What is the predicted blood pressure for a man aging 25 years? Exercise
  49. 49. 49 x2xyyxSerial 4002400120201 18495504128432 39698883141633 6763276126264 28097102134535 9613968128316 33647888136587 21166072132468 33648120140589 4900100801447010
  50. 50. 50 x2xyyxSerial 211658881284611 280972081365312 360087601466013 40024801242014 396990091436315 184955901304316 67632241242617 36122991211918 96139061263119 52928291232320 416781144862630852Total
  51. 51. 51         n x)( x n yx xy b 2 2 1 4547.0 20 852 41678 20 2630852 114486 2     = =112.13 + 0.4547 x for age 25 B.P = 112.13 + 0.4547 * 25=123.49 = 123.5 mm hg yˆ

    Soyez le premier à commenter

    Identifiez-vous pour voir les commentaires

  • YeshvanthKrishna

    Oct. 18, 2019
  • VibhaVerma8

    Oct. 21, 2019
  • sanjeebKumarswain

    Oct. 24, 2019
  • NehaYadav399

    Nov. 17, 2019
  • ChetanChauhan123

    Nov. 18, 2019
  • ShakshiiNaithani

    Dec. 31, 2019
  • MuhammadKhizar19

    May. 13, 2020
  • GlyzzaLuzon

    Aug. 6, 2020
  • HalemaNasralah

    Aug. 23, 2020
  • IndraniBhadra

    Sep. 27, 2020
  • PalakSrivastava23

    Oct. 28, 2020
  • ShraddhaSharma67

    Dec. 10, 2020
  • rajiyabegum2

    Feb. 26, 2021
  • DeepikaRavi15

    Mar. 13, 2021
  • arextgeth

    Apr. 15, 2021
  • KetevanGentschLalias

    May. 4, 2021
  • AnkitaVerma146

    May. 26, 2021
  • NehaSrivastava206

    Jun. 16, 2021
  • ShifnaJamal

    Jul. 25, 2021
  • RohanDivate3

    Jul. 26, 2021

correlation and regression

Vues

Nombre de vues

2 989

Sur Slideshare

0

À partir des intégrations

0

Nombre d'intégrations

2

Actions

Téléchargements

158

Partages

0

Commentaires

0

Mentions J'aime

20

×