Factors influencing the Human Development Index (HDI) using Multiple Linear Regression
ADITYA PANUGANTI, 1202062944, Industrial Engineering
Year of data: 2008
Source: UN Development Programme Database
Objective and dataset description
To find which of the following variables have an effect on the Human Development Index (HDI)
Fitting the full model without interaction terms
The regression equation for the full model is:
	y = 0.0596 + 0.00440 LIF + 0.000007 GDP - 0.000748 GRO + 0.0158 SCH + 0.0080 GEN + 0.0159 EXP - 0.000004 GNI + 0.000003 MAT - 0.000051 HOM - 0.000540 MOR + 0.000176 LIT - 0.0185 DEP + 0.0023 CON1 - 0.0117 CON2 - 0.0100 CON3 + 0.00431 CON4 - 0.0268 CON5
The coefficients of this equation are difficult to interpret, so the regression coefficients were standardized using unit normal scaling of the regressors.
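A minimal sketch of unit normal scaling, assuming each numerical regressor is centred on its sample mean and divided by its sample standard deviation. The function name and the GDP figures are hypothetical and are not taken from the UNDP dataset.

```python
from statistics import mean, stdev


def unit_normal_scale(values):
    """Return (x - mean) / s for each x, with s the sample standard deviation."""
    centre = mean(values)
    spread = stdev(values)  # n - 1 in the denominator
    return [(x - centre) / spread for x in values]


# Hypothetical GDP-per-capita values, for illustration only.
gdp = [1200.0, 5400.0, 9800.0, 23000.0, 41000.0]
gdp_scaled = unit_normal_scale(gdp)

# After scaling, the column has mean 0 and standard deviation 1, so a
# coefficient reads as "change in y per one standard deviation of x".
print([round(z, 3) for z in gdp_scaled])
```

DEP and CON1 to CON5 carry the same coefficients before and after standardization, which suggests the indicator variables were left unscaled.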
Fitting the full model after standardization
The regression equation is:
	y = 0.684 + 0.0404 LIF + 0.100 GDP - 0.0117 GRO + 0.0408 SCH + 0.00136 GEN + 0.0443 EXP - 0.0627 GNI + 0.00089 MAT - 0.00068 HOM - 0.0196 MOR + 0.00259 LIT - 0.0185 DEP + 0.0023 CON1 - 0.0117 CON2 - 0.0100 CON3 + 0.00431 CON4 - 0.0268 CON5
Model statistics: R-Sq = 98.5%   R-Sq(adj) = 98.2%
Analysis of Variance (ANOVA):
	Source            DF       SS        MS         F        P
	Regression        17    2.21784   0.13046   325.49   0.000
	Residual Error    84    0.03367   0.00040
	Total            101    2.25150
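The model statistics follow from the ANOVA table by simple arithmetic. The sketch below recomputes them from the rounded table entries, so small differences in the last digit (for example in F) are expected.

```python
# Values copied from the ANOVA table of the standardized full model.
ss_regression, df_regression = 2.21784, 17
ss_residual, df_residual = 0.03367, 84
ss_total, df_total = 2.25150, 101

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F compares explained variance per regressor with residual variance.
f_statistic = ms_regression / ms_residual

# R-Sq uses raw sums of squares; R-Sq(adj) uses mean squares instead,
# which penalizes the model for the number of regressors.
r_sq = 1 - ss_residual / ss_total
r_sq_adj = 1 - (ss_residual / df_residual) / (ss_total / df_total)

print(f"MS regression = {ms_regression:.5f}")
print(f"MS residual   = {ms_residual:.5f}")
print(f"F             = {f_statistic:.2f}")
print(f"R-Sq = {100 * r_sq:.1f}%   R-Sq(adj) = {100 * r_sq_adj:.1f}%")
```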
Signs of multicollinearity
Inference from Variance Inflation Factors (VIFs):
	VIF of GDP = 560.116 and VIF of GNI = 533.109 (indicating severe multicollinearity)
	VIF of EXP = 18.368 and VIF of GRO = 16.456 (just over 10, indicating multicollinearity)
Inference from the correlation matrix:
	          LIF      GDP      GRO      SCH      GEN      EXP      GNI      MAT
	GDP     0.595
	GRO     0.719    0.630
	SCH     0.603    0.553    0.776
	GEN    -0.677   -0.705   -0.758   -0.743
	EXP     0.692    0.636    0.956    0.774   -0.798
	GNI     0.584    0.999    0.618    0.539   -0.688    0.620
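The VIFs and the correlation matrix tell the same story. The VIF of regressor j is 1/(1 - Rj²), where Rj² comes from regressing it on all the other regressors. Rj² can never be smaller than the squared correlation with any single regressor, so each pairwise correlation gives a lower bound on the VIF. A small sketch (the function name is illustrative):

```python
def vif_lower_bound(r):
    """Smallest possible VIF for a regressor that correlates r with another one."""
    return 1.0 / (1.0 - r * r)


# Pairwise correlations read from the correlation matrix.
pairs = {("GDP", "GNI"): 0.999, ("EXP", "GRO"): 0.956}

for (first, second), r in pairs.items():
    bound = vif_lower_bound(r)
    print(f"{first}-{second}: r = {r}, VIF of each is at least {bound:.2f}")
```

The bounds come out at roughly 500 for GDP and GNI and roughly 11.6 for EXP and GRO, consistent with the reported VIFs of 560.116, 533.109, 18.368 and 16.456.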
No change in the R-Sq and R-Sq(adj) statistics before and after dropping the variable from the model: R-Sq = 98.5%   R-Sq(adj) = 98.2%
To confirm multicollinearity between EXP and GRO, a further analysis was done using Principal Component Analysis. The condition number was found to be λmax/λmin = 7.8001/0.0327 = 238.53 (>100, indicating moderate multicollinearity)
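A short sketch of the condition number check, using only the two eigenvalues quoted above. The cut-off of 100 is the one used in these slides; the cut-off of 1000 for "severe" is a commonly quoted rule of thumb and is an assumption here.

```python
def condition_number(eigenvalues):
    """Ratio of the largest to the smallest eigenvalue of the correlation matrix."""
    return max(eigenvalues) / min(eigenvalues)


def classify(kappa):
    """Rule-of-thumb reading of a condition number (1000 cut-off is assumed)."""
    if kappa < 100:
        return "no serious multicollinearity"
    if kappa <= 1000:
        return "moderate to strong multicollinearity"
    return "severe multicollinearity"


# Only the largest and smallest eigenvalues are reported on the slide.
kappa = condition_number([7.8001, 0.0327])
print(f"condition number = {kappa:.2f}: {classify(kappa)}")
```

Because the eigenvalues are rounded, the recomputed ratio can differ from the quoted 238.53 in the last digit.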
Indicator interactions
Considered interaction terms between DEP and the other numerical variables: 24 variables in all, including all the interaction terms.
S = 0.0220704   R-Sq = 98.3%   R-Sq(adj) = 97.8%   R-Sq(pred) = 96.80%
Residual plots:
Outliers and Influential points
Other outliers in graph
Fitted the model for each of the datapoints 45, 50 and 80 in turn and checked for any change in the summary statistics. These points are outliers, but they contribute no notable leverage and are not influential, and R-Sq does not change much, so they are left in the model.
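The distinction between an outlier, a leverage point and an influential point can be sketched on a toy example. This is not the deck's analysis: it assumes a simple one-regressor fit on hypothetical data, with the common cut-offs |studentized residual| > 2 for outliers, leverage > 2p/n for leverage points and Cook's distance > 1 for influential points.

```python
from statistics import mean


def influence_measures(x, y):
    """Leverage, studentized residual and Cook's distance for y = b0 + b1*x."""
    n, p = len(x), 2  # p counts the intercept and the slope
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e * e for e in residuals) / (n - p)
    rows = []
    for xi, e in zip(x, residuals):
        h = 1.0 / n + (xi - x_bar) ** 2 / sxx    # leverage (hat diagonal)
        r = e / (mse * (1.0 - h)) ** 0.5         # internally studentized residual
        cook = r * r * h / (p * (1.0 - h))       # Cook's distance
        rows.append((h, r, cook))
    return rows


# Hypothetical data: an exact line with one shifted point near the centre of x.
x = [float(i) for i in range(1, 11)]
y = [2.0 * xi for xi in x]
y[4] += 3.0

h, r, cook = influence_measures(x, y)[4]
print(f"leverage = {h:.3f} (cut-off {2 * 2 / len(x):.1f})")
print(f"studentized residual = {r:.2f}, Cook's distance = {cook:.3f}")
```

The shifted point has a large studentized residual but low leverage and a Cook's distance below 1, which is the same pattern described for points 45, 50 and 80.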
Residual plots after taking off the outliers and influential points
To confirm this, a Box-Cox analysis was used, which showed that a transformation of the response ‘y’ is needed.
Residual plots after transformation
Some outliers can be seen in the normal probability plot.
Outliers and Influential points
Residual plots after taking off the outliers and influential points
No need for any transformation: Box-Cox suggests λ = 1
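Why λ = 1 means "no transformation" can be seen from the Box-Cox formula itself. A minimal sketch, assuming the standard one-parameter form of the transformation; the response values are hypothetical.

```python
import math


def box_cox(y, lam):
    """Box-Cox power transformation of a strictly positive response y."""
    if y <= 0:
        raise ValueError("Box-Cox requires a strictly positive response")
    if lam == 0:
        return math.log(y)          # limiting case as lam approaches 0
    return (y ** lam - 1.0) / lam


# Hypothetical HDI-like responses, for illustration only.
responses = [0.35, 0.52, 0.68, 0.81, 0.94]

for lam in (1.0, 0.5, 0.0):
    transformed = [round(box_cox(y, lam), 4) for y in responses]
    print(f"lambda = {lam}: {transformed}")
```

With λ = 1 the transformation reduces to y - 1, a pure shift that leaves the shape of the response unchanged, so the model can be fitted to the response as it is.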
Variable selection and Model building
Fit the selected model
Regression equation:
	y2 = 0.476 - 0.0164 GEN + 0.0403 GRO + 0.0422 LIF + 0.0557 GDP + 0.0449 SCH - 0.0181 CON2 - 0.0388 MOR + 0.0523 GDP_D + 0.0289 CON5 + 0.0412 MOR_D - 0.0476 HOM_D
Detected multicollinearity using Principal Component Analysis: condition number = 134.837 (>100, moderate multicollinearity)
Linear dependency equation (dependency between the variables in the equation):
	0.107 GRO + 0.337 LIF + 0.798 MOR - 0.467 MOR_D
The correlation matrix showed that the variable MOR is highly correlated with LIF and MOR_D. Dropping MOR removed the multicollinearity from the model: condition number = 39.04617 (<100, no multicollinearity)
Residual plots after dropping MOR
No need for any transformation: Box-Cox suggests λ = 1
Model validation
Considered 118 countries for modelling: 102 as estimation data and 16 as prediction data
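A sketch of how such a split and an out-of-sample check might look. The slides do not say how the 16 prediction countries were chosen, so the random split, the seed and both helper names are assumptions, and the country IDs are placeholders.

```python
import random


def split_estimation_prediction(items, n_prediction, seed=0):
    """Randomly hold out n_prediction items; the rest are used for estimation."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    return shuffled[n_prediction:], shuffled[:n_prediction]


def out_of_sample_r_squared(actual, predicted):
    """1 - (prediction error sum of squares) / (total sum of squares)."""
    centre = sum(actual) / len(actual)
    ss_error = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_total = sum((a - centre) ** 2 for a in actual)
    return 1.0 - ss_error / ss_total


country_ids = list(range(1, 119))  # placeholder IDs for the 118 countries
estimation, prediction = split_estimation_prediction(country_ids, 16)
print(len(estimation), len(prediction))
```

The model is fitted on the estimation set only, then `out_of_sample_r_squared` is applied to the observed and predicted responses of the held-out countries.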
Conclusion
The reduced model has a better R-Sq than the actual model, and most of the variables in the model are significant (low p-value).
The following variables were found to be significant:
• Gender inequality index
• Combined gross enrolment
• Life expectancy at birth
• GDP
• Mean schooling years
• Countries in continent 2
• GDP & intensity of deprivation
• Under 5 mortality rate & intensity of deprivation
• Homicide rate & intensity of deprivation
Possible improvements
• More datapoints
• Ridge regression, to reduce the effect of multicollinearity on the coefficient estimates
• Robust regression, to give less weight to outlying datapoints so that they can be retained in the model
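How ridge regression helps with multicollinearity can be sketched with the eigenvalues reported earlier for the correlation matrix (7.8001 and 0.0327). Ridge replaces X'X by X'X + kI, which shifts every eigenvalue up by k. The values of k below are hypothetical, and the example only illustrates the mechanism; it is not a fitted ridge model.

```python
def ridge_condition_number(lambda_max, lambda_min, k):
    """Condition number of X'X + k*I, given the extreme eigenvalues of X'X."""
    return (lambda_max + k) / (lambda_min + k)


lambda_max, lambda_min = 7.8001, 0.0327  # eigenvalues quoted in the slides

for k in (0.0, 0.01, 0.05, 0.1):  # hypothetical ridge parameters
    kappa = ridge_condition_number(lambda_max, lambda_min, k)
    print(f"k = {k:<5} condition number = {kappa:.1f}")
```

Even a small k pulls the condition number below 100, at the cost of some bias in the coefficients, so k is usually chosen as small as stability allows.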
