INTRODUCTION TO
STATISTICAL THEORY FOR
SCIENTIST
CORRELATION AND REGRESSION
• A common question is: “Are two or more variables linearly
  related? If so, how strong is the relationship?”

• The numerical measure used to determine whether two or more
  variables are linearly related, and how strong the relationship
  is, is called the CORRELATION COEFFICIENT.

• There are two types of relationship: SIMPLE RELATIONSHIP
  and MULTIPLE RELATIONSHIP.
• CORRELATION: statistical method used to determine whether a
  linear relationship between variables exists.

• REGRESSION: statistical method used to describe the nature of the
  relationship between variables: positive or negative, linear or
  nonlinear.

• SIMPLE REGRESSION: there are two variables, an independent
  (explanatory) variable and a dependent (response) variable.

• MULTIPLE REGRESSION: two or more independent variables are
  used to predict one dependent variable.

• POSITIVE RELATIONSHIP: both variables increase or decrease at the
  same time.

• NEGATIVE RELATIONSHIP: as one variable increases, the other
  variable decreases, and vice versa.
Scatter plots and Correlation
• To study the relationship between two variables, data must first
  be collected. Example: the relationship between the number of
  hours of study and exam grades.

• The independent variable is the one that can be controlled or
  manipulated; the dependent variable cannot.

• The dependent and independent variables can be plotted on a graph
  called a scatter plot.

• The independent variable x is plotted on the horizontal axis and
  the dependent variable y on the vertical axis.

• A scatter plot is a visual way to show the relationship between two
  variables.
A SCATTER PLOT is a graph of the ordered pairs (x, y) of numbers
consisting of the independent variable x and the dependent
variable y.

Company   Cars (in ten thousands)   Revenue (in $ billions)
A         63                        7
B         29                        3.9
C         20.8                      2.1
D         19.1                      2.8
E         13.4                      1.4
F         8.5                       1.5
Correlation
• The correlation described here is the Pearson product moment
  correlation coefficient (PPMC), developed by Karl Pearson.

  The correlation coefficient computed from the sample data
  measures the strength and direction of a linear relationship
  between two quantitative variables. The symbol for the sample
  correlation coefficient is r; the symbol for the population
  correlation coefficient is ρ (rho).

• The correlation coefficient ranges from -1 to +1.
• A value close to +1 indicates a strong positive linear
  relationship, while a value close to -1 indicates a strong
  negative linear relationship.
• A value of r close to zero indicates that there is no linear
  relationship between the variables, or only a weak one.
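
As an illustration of the points above, here is a minimal sketch that computes r for the cars/revenue table shown earlier. It assumes the usual computational formula for the sample Pearson coefficient; the function name `pearson_r` and the variable names are chosen for this example and are not part of the original slides.

```python
from math import sqrt

# Data from the scatter plot table:
# cars sold (in ten thousands) and revenue (in billions) for companies A-F.
cars = [63, 29, 20.8, 19.1, 13.4, 8.5]
revenue = [7, 3.9, 2.1, 2.8, 1.4, 1.5]


def pearson_r(x, y):
    """Sample Pearson correlation coefficient (computational formula)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r = pearson_r(cars, revenue)
print(f"r = {r:.3f}")  # r = 0.982
```

The result is close to +1, which matches the strong positive linear pattern that the table suggests.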
Regression
• We previously tested the significance of the correlation
  coefficient. If the correlation is significant, the next step is
  to determine the equation of the regression line.

• LINE OF BEST FIT: "best fit" means that the sum of the squares of
  the vertical distances from each point to the line is at a minimum.

• A best-fit line is needed because values of y will be predicted
  from values of x; the closer the points are to the line, the
  better the prediction will be.
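
The least-squares idea can be sketched as follows, using the cars/revenue data from the scatter plot table. This is an illustrative sketch under the standard least-squares formulas; the helper names `fit_line` and `sum_squared_errors` are hypothetical, and the regression line is written as y' = a + bx.

```python
# Data from the scatter plot table:
# cars sold (in ten thousands) and revenue (in billions).
cars = [63, 29, 20.8, 19.1, 13.4, 8.5]
revenue = [7, 3.9, 2.1, 2.8, 1.4, 1.5]


def fit_line(x, y):
    """Return (intercept a, slope b) of the least-squares line y' = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_errors(x, y, intercept, slope):
    """Sum of squared vertical distances from each point to the line."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


a, b = fit_line(cars, revenue)
best = sum_squared_errors(cars, revenue, a, b)
# Any other line, here one with a slightly larger slope, fits worse.
other = sum_squared_errors(cars, revenue, a, b + 0.01)

print(f"y' = {a:.3f} + {b:.3f}x")
print(best < other)
```

Changing the slope or intercept away from the fitted values always increases the sum of squared vertical distances, which is what "best fit" means here.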
• MARGINAL CHANGE: the magnitude of the change in one variable
  when the other variable changes by exactly 1 unit.

• See example 10-9: the slope of the regression line is 0.106, which
  means that for each increase of 10,000 cars, the value of y changes
  by 0.106 unit ($106 million) on average.

• EXTRAPOLATION: making predictions beyond the bounds of the data.

• When predictions are made, they are based on present conditions or
  on the premise that present trends will continue.

• OUTLIER: a point that seems out of place when compared with the
  other points.

• Some of these points can affect the equation of the regression
  line; such points are called influential points or influential
  observations.
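
To make marginal change and extrapolation concrete, here is a small sketch that fits the line to the cars/revenue table and flags any prediction whose x value lies outside the observed range. The `predict` helper and its return format are assumptions made for this example, not part of the original slides.

```python
# Data from the scatter plot table:
# cars sold (in ten thousands) and revenue (in billions).
cars = [63, 29, 20.8, 19.1, 13.4, 8.5]
revenue = [7, 3.9, 2.1, 2.8, 1.4, 1.5]

# Least-squares slope and intercept of y' = a + b*x.
n = len(cars)
mean_x = sum(cars) / n
mean_y = sum(revenue) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(cars, revenue))
         / sum((x - mean_x) ** 2 for x in cars))
intercept = mean_y - slope * mean_x


def predict(x_new):
    """Return (predicted y, True if x_new lies outside the observed x range)."""
    is_extrapolation = x_new < min(cars) or x_new > max(cars)
    return intercept + slope * x_new, is_extrapolation


# Marginal change: raising x by 1 unit (10,000 cars) changes y' by the slope.
y_40, _ = predict(40)
y_41, _ = predict(41)
marginal_change = y_41 - y_40

# x = 100 lies beyond the largest observed value (63), so it is an extrapolation.
y_100, beyond_data = predict(100)

print(f"slope = {slope:.3f}, marginal change = {marginal_change:.3f}")
print(f"prediction at x = 100: {y_100:.2f} (extrapolation: {beyond_data})")
```

A flagged prediction is not necessarily wrong, but it rests on the premise that the trend seen in the data continues outside the range that was actually observed.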
Coefficient of determination

         x   1    2   3    4    5
         y   10   8   12   16   20
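
The slides that defined the coefficient of determination did not survive as text, so the sketch below assumes the standard definition: r² is the explained variation divided by the total variation, where the explained variation is measured from the regression line to the mean of y. The variable names are chosen for this example.

```python
# Data from the table above.
x = [1, 2, 3, 4, 5]
y = [10, 8, 12, 16, 20]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares regression line y' = a + b*x.
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * xi for xi in x]

# Total variation: squared distances of the observed y values from their mean.
total_variation = sum((yi - mean_y) ** 2 for yi in y)
# Explained variation: squared distances of the predicted values from the mean.
explained_variation = sum((pi - mean_y) ** 2 for pi in predicted)
# Unexplained variation: squared distances of the observed from the predicted.
unexplained_variation = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))

r_squared = explained_variation / total_variation

print(f"y' = {intercept:.1f} + {slope:.1f}x")
print(f"r^2 = {r_squared:.3f}")
```

For this data the regression line accounts for roughly 84.5% of the variation in y; the remainder is unexplained variation.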