MODEL SPECIFICATION FOR
MULTIPLE REGRESSION
Ryan Herzog, Ph.D.
Gonzaga University
ECON 355 – Regression Analysis
MEASURES OF FIT
• R-squared
• Standard Error of the Regression
R-SQUARED
• The regression R² measures the fraction of the variance of Y that is
explained by X; it is unitless and ranges between zero (no fit) and one
(perfect fit)
• The R-squared of a single regression tells us little on its own; we need
at least two regressions so that we can compare them
• All else equal we want to be able to explain a higher share of variance in Y
• Stata:
• Use California school dataset
• Regress test scores on class size. The R-squared is 0.0512. This means that class size
explains about 5% of the variance in test scores.
• Regress test scores on expenditure per student. What is the R-squared? How would
you interpret it?
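The definition above can be sketched in a few lines of Python. This is only an illustration on made-up numbers, not the California school dataset and not Stata output; the names `simple_ols` and `r_squared` are hypothetical helpers. It computes R-squared as one minus the ratio of the sum of squared residuals to the total sum of squares.

```python
# Synthetic data for illustration only (NOT the California school dataset).
class_size = [15.0, 17.0, 18.0, 20.0, 21.0, 23.0, 25.0]
test_score = [680.0, 655.0, 672.0, 648.0, 665.0, 640.0, 652.0]

def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def r_squared(x, y):
    """R-squared = 1 - SSR/TSS: the share of the variance of y explained by x."""
    a, b = simple_ols(x, y)
    y_bar = sum(y) / len(y)
    ssr = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))  # unexplained part
    tss = sum((yi - y_bar) ** 2 for yi in y)                      # total variation
    return 1.0 - ssr / tss

r2 = r_squared(class_size, test_score)
print(f"R-squared = {r2:.3f}")
```

Because both sums are in squared units of Y, their ratio is unitless, which is why R-squared always falls between zero and one.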
INTERPRETING R-SQUARED
• An increase in R-squared does not necessarily mean that an added variable
is statistically significant
• A high R-squared does not mean that the regressors are a true cause of the
dependent variable
• A high R-squared does not mean that the coefficients on the regressors are
true
• A low R-squared does not mean that the coefficients on the regressors are
wrong
• A high R-squared does not necessarily mean that you have the most
appropriate set of regressors, nor does a low R-squared necessarily mean
that you have an inappropriate set of regressors.
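The first point can be checked with a small simulation. The sketch below uses synthetic data and a hypothetical helper `ols_r_squared` that solves the normal equations directly; it is not how Stata computes anything. It adds a regressor that is pure noise by construction and shows that R-squared still does not fall, because OLS can always set the new coefficient to zero and do no worse.

```python
import random

def ols_r_squared(columns, y):
    """R-squared from regressing y on a constant plus the given regressor columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(k)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for p in range(k):
        pivot = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[pivot] = A[pivot], A[p]
        for r in range(k):
            if r != p:
                f = A[r][p] / A[p][p]
                A[r] = [a - f * b for a, b in zip(A[r], A[p])]
    beta = [A[r][k] / A[r][r] for r in range(k)]
    y_bar = sum(y) / n
    ssr = sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2 for i in range(n))
    tss = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ssr / tss

random.seed(0)
n = 200
class_size = [random.uniform(14, 26) for _ in range(n)]
noise_var = [random.gauss(0, 1) for _ in range(n)]   # unrelated to y by construction
test_score = [700 - 2.0 * s + random.gauss(0, 15) for s in class_size]

r2_base = ols_r_squared([class_size], test_score)
r2_plus = ols_r_squared([class_size, noise_var], test_score)
print(f"class size only:  {r2_base:.4f}")
print(f"plus pure noise:  {r2_plus:.4f}")
```

The second R-squared is at least as large as the first even though the added variable has no relationship with the outcome, so a rise in R-squared alone says nothing about significance.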
R-SQUARED RULES
• R-squared can be compared across regressions only when they have:
• The same dependent variable
• The same number of independent variables
• Stata:
• generate ltestscr=ln(testscr)
• Generate a table with the regressions below (use outreg)
• Regress test score on class size and expenditure per student
• Regress natural log of test score on class size
• Regress test score on class size and average district income
• Regress expenditure per student on average district income
• Regress natural log of test score on average district income
• Regress natural log of test score on class size and expenditure per student
STANDARD ERROR OF THE REGRESSION
• The SER is a measure of the spread of the observations around the regression line
measured in the units of the dependent variable
• SER is an estimator of the standard deviation of the regression error u_i
• All else equal we want to have a smaller spread of the observations around the
regression line
• In Stata SER is called root MSE (mean squared error)
• In the regression of test score on class size the SER is about 18.6. This means that the
standard deviation of the regression residuals around the regression line is 18.6 points.
• We can use SER to compare models
• What are the SERs for the rest of the regressions you have run? What does that mean?
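As a rough illustration of what Stata reports as root MSE, the sketch below computes the SER for a one-regressor model on made-up numbers (not the California school dataset). The function name `ser` is hypothetical. It divides the sum of squared residuals by n - 2, since two coefficients are estimated, and takes the square root.

```python
import math

# Synthetic data for illustration only (not the California school dataset).
class_size = [15.0, 17.0, 18.0, 20.0, 21.0, 23.0, 25.0]
test_score = [680.0, 655.0, 672.0, 648.0, 665.0, 640.0, 652.0]

def ser(x, y):
    """Standard error of the regression for y = a + b*x (one regressor)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    intercept = y_bar - slope * x_bar
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    # Two coefficients are estimated, so divide by n - 2 degrees of freedom.
    return math.sqrt(ssr / (n - 2))

ser_points = ser(class_size, test_score)
# Rescaling the dependent variable rescales the SER by the same factor.
ser_tenths = ser(class_size, [10 * s for s in test_score])
print(f"SER in points:         {ser_points:.2f}")
print(f"SER in tenths of a pt: {ser_tenths:.2f}")
```

Unlike R-squared, the SER carries the units of the dependent variable, so it is only meaningful to compare SERs across models that share the same dependent variable.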
CAUSAL EFFECTS AND IDEALIZED
EXPERIMENTS
• Most of our questions concern causal relationships among variables, e.g.
does lower class size lead to higher test scores?
• Causality means that a specific action leads to a specific measurable
consequence
• The best way to measure a causal effect is by conducting an experiment
• In a randomized controlled experiment there is both a control group and a
treatment group. Assignment to a group happens randomly
• We would like to be able to show that the only systematic reason for
differences in outcomes between the treatment and control groups is the
treatment itself
• In practice, it is usually not possible to perform an ideal experiment. The
ideal experiment nevertheless gives us a benchmark.
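Why random assignment matters can be seen in a small simulation. Everything here is hypothetical: the true effect of 5 points, the unobserved "ability" term, and the helper `diff_in_means` are assumptions made for illustration. The same outcome equation is used twice, once with treatment assigned by coin flip and once with treatment taken up only by high-ability units.

```python
import random

random.seed(1)
TRUE_EFFECT = 5.0   # hypothetical gain in test score from the treatment
N = 20000

def mean(values):
    return sum(values) / len(values)

def diff_in_means(assign):
    """Average outcome of the treated minus the controls under an assignment rule."""
    treated, control = [], []
    for _ in range(N):
        ability = random.gauss(0, 10)        # unobserved determinant of the outcome
        t = assign(ability)                  # True if the unit receives the treatment
        score = 650 + ability + TRUE_EFFECT * t + random.gauss(0, 5)
        (treated if t else control).append(score)
    return mean(treated) - mean(control)

# Random assignment: treatment is independent of ability.
randomized = diff_in_means(lambda ability: random.random() < 0.5)
# Self-selection: only high-ability units take the treatment.
selected = diff_in_means(lambda ability: ability > 0)

print(f"true effect:            {TRUE_EFFECT:.2f}")
print(f"randomized experiment:  {randomized:.2f}")
print(f"self-selected groups:   {selected:.2f}")
```

With random assignment the two groups differ systematically only in the treatment, so the difference in means lands near the true effect. With self-selection the difference also picks up the ability gap between the groups and overstates the effect.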
NONRANDOM SAMPLE EXAMPLE - 1
• In 1936 the Literary Digest polled a “random” sample of households chosen
from telephone records and automobile registrations
• In 1936 many households did not have cars or telephones, and those that
did tended to be richer – and were also more likely to be Republican
• The results of the poll indicated that Landon (the Republican presidential
candidate) would defeat the incumbent (Roosevelt) by a landslide – 57% to
43% in the 1936 election
• Roosevelt ended up winning by 59% to 41%
• Do you think surveys conducted using social media might have a similar
problem with bias?
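The mechanism behind the failed poll can be sketched with invented numbers. The three probabilities below are assumptions chosen only for illustration and are not historical estimates: a minority of households are "listed" (have a phone or a car), and listed households lean toward Landon more than unlisted ones.

```python
import random

random.seed(2)
N = 100000
P_LISTED = 0.40           # hypothetical share of households with a phone or car
P_LANDON_LISTED = 0.57    # hypothetical Landon support among listed households
P_LANDON_UNLISTED = 0.30  # hypothetical Landon support among all other households

# Each household is a pair: (is it listed?, does it support Landon?)
population = []
for _ in range(N):
    listed = random.random() < P_LISTED
    p_landon = P_LANDON_LISTED if listed else P_LANDON_UNLISTED
    population.append((listed, random.random() < p_landon))

def landon_share(households):
    """Fraction of the given households that support Landon."""
    return sum(supports for _, supports in households) / len(households)

true_share = landon_share(population)
poll_share = landon_share([h for h in population if h[0]])  # listed households only

print(f"Landon share, all households:    {true_share:.3f}")
print(f"Landon share, listed households: {poll_share:.3f}")
```

However many listed households are polled, the estimate converges to their own level of support rather than the population's, so a larger sample does not repair a biased sampling frame.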
NONRANDOM SAMPLE EXAMPLE - 2
• Some mutual funds simply track the market; others are actively managed by full-time
professionals.
• Do the latter mutual funds outperform the former?
• One way to answer the question is to use historical data on funds currently available
for purchase. However, this means that the worst-performing funds, which have since
closed, would not be represented.
• The sample is selected based on the value of the dependent variable, returns, because
funds with the lowest returns are eliminated
• The mean return of all funds would then be lower than the mean return of those still in
existence. This is also called a survivorship bias.
• When corrected for survivorship bias it turns out actively managed funds do not
outperform the market
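Survivorship bias is easy to reproduce with simulated returns. The return distribution and the closure cutoff below are hypothetical values picked for illustration, not estimates for real funds. Funds whose return falls below the cutoff are assumed to close and so disappear from the sample of funds still available today.

```python
import random

random.seed(3)
N_FUNDS = 5000
CUTOFF = -0.05   # hypothetical: funds returning less than -5% are closed

# Hypothetical annual returns for all funds that ever existed.
returns = [random.gauss(0.06, 0.10) for _ in range(N_FUNDS)]
# Only funds at or above the cutoff are still available for purchase.
survivors = [r for r in returns if r >= CUTOFF]

mean_all = sum(returns) / len(returns)
mean_survivors = sum(survivors) / len(survivors)

print(f"funds that ever existed: {len(returns)}, mean return {mean_all:.4f}")
print(f"funds still in existence: {len(survivors)}, mean return {mean_survivors:.4f}")
```

Because the sample is selected on the dependent variable itself, the mean among survivors is mechanically higher than the mean over all funds.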
NONRANDOM SAMPLE EXAMPLE - 3
• Can we learn how class size affects test scores from a sample that includes
only districts where average class size is above 20 students?
• What is the average height of a GU student measured outside of a
basketball locker room?
