DIFFERENCE IN MEANS AND
REGRESSIONS WITH BINARY
INDEPENDENT VARIABLES
ECON 355 – Regression Analysis
Ryan Herzog, Ph.D.
IN THIS TOPIC
• Categorical variables
• Mean-comparison tests
• Difference-in-differences estimation technique
CATEGORICAL VARIABLES
• Up until now we have been dealing with continuous variables, e.g. the price of a house or
the size of a house
• Categorical variables are different: they are usually described by a word, not a number. We
can deal with them by grouping observations into categories.
• For example, if I ask whether you are a cat or a dog person and record the answers in a
spreadsheet, I will be dealing with the words “cat” and “dog”, not numbers
CLASS DATA COLLECTION
• Please use the following link to input data about yourself:
- Cat person/dog person
- Coffee consumption
https://docs.google.com/spreadsheets/d/1R7cPm92FeYuaxAQRLINh1_zCLL_VlEZwj_K9CGz1zPA/edit?usp=sharing
• Question: do cat and dog lovers consume the same amount of caffeine?
DIFFERENCE IN MEANS
• To test if there is a difference in means between two groups, we need to find the t-stat:
• Find the means and the difference between them
• Divide the difference by the standard error:
$SE = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
• The null hypothesis is that the means of the two groups are equal, i.e. cat and dog
lovers consume the same amount of caffeine
• The alternative is that they do not consume the same amount of caffeine (the means of
the two groups are different)
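The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `two_sample_t` and the coffee numbers are made up for the example and are not the class data.

```python
import math
from statistics import mean, variance

def two_sample_t(group1, group2):
    """Return (difference in means, standard error, t-statistic)."""
    n1, n2 = len(group1), len(group2)
    diff = mean(group1) - mean(group2)
    # variance() is the sample variance s^2 (it divides by n - 1)
    se = math.sqrt(variance(group1) / n1 + variance(group2) / n2)
    return diff, se, diff / se

# Hypothetical cups of coffee per day (made-up numbers, not class data)
cat_people = [2, 3, 1, 4, 2]
dog_people = [1, 2, 2, 1, 0]

diff, se, t = two_sample_t(cat_people, dog_people)
print(f"difference = {diff:.2f}, SE = {se:.3f}, t = {t:.2f}")
```

The t-stat is then compared with a critical value from the t table: a large absolute value is evidence against the null hypothesis of equal means.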
BINARY VARIABLES
• To record categorical variables we will use dummy (binary) variables, which take only the values 1 and 0
• If I have two groups that are mutually exclusive, meaning each observation
can belong to only one group and never to both at the same time, I will assign “1” to the first
group and “0” to the other.
• For example, cat lovers can be coded as 1, and dog lovers will then be coded as 0.
• Stata: use the TeachingRatings.dta dataset
• Which variables are continuous and which ones are categorical?
DIY
• Work in Stata
• We need to find the difference in mean course evaluations between male
and female professors
• bys female: summarize course_eval
• Use the formula for t-stat to test if the means between the two groups are equal
REGRESSION WITH A BINARY INDEPENDENT VARIABLE
• When we have a binary variable on the right-hand side we are effectively comparing
means between two groups: the included group (coded 1) and the excluded group (coded 0).
Stata: reg course_eval female
• The interpretation of beta changes
• It is no longer “when X increases by 1 unit” but rather “For the included group (what
is it?) the dependent variable is on average beta higher or lower than for the excluded
group (what is it?)”
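To see why the regression is a comparison of means, here is a minimal Python sketch that fits the least-squares line by hand. The helper `ols_simple` and the evaluation scores are hypothetical and are not taken from TeachingRatings.dta.

```python
from statistics import mean

def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical evaluation scores (made-up, not from TeachingRatings.dta)
female = [1, 1, 1, 0, 0, 0, 0]
course_eval = [3.8, 4.0, 3.9, 4.2, 4.1, 3.9, 4.2]

intercept, slope = ols_simple(female, course_eval)
mean_excluded = mean([e for f, e in zip(female, course_eval) if f == 0])
mean_included = mean([e for f, e in zip(female, course_eval) if f == 1])

print(f"intercept = {intercept:.3f}, mean of excluded group = {mean_excluded:.3f}")
print(f"slope = {slope:.3f}, difference in means = {mean_included - mean_excluded:.3f}")
```

With a single binary regressor, the constant equals the mean of the excluded group and beta equals the difference between the two group means.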
LET’S TRY MORE EXAMPLES
• Regress course evaluations on the following binary variables
• Minority (equal to 1 if the professor represents a teaching minority, 0 otherwise)
• One credit (equal to 1 if the course is a 1-credit course, 0 otherwise)
• nnenglish (equal to 1 if the professor’s native language is not English, 0 otherwise)
• Intro (equal to 1 if the course is introductory, 0 otherwise)
• Are the relationships statistically significant?
• If yes, please interpret them
CONDUCTING A MEAN-COMPARISON TEST (T-TEST) IN
STATA
• We can also find the same answer by running a t-test in Stata
• Statistics => summaries, tables, and tests => classical tests of hypotheses => t-test (mean-comparison test)
1. Run a mean-comparison test of teaching evaluations by gender
2. Run a mean-comparison test of teaching evaluations by any other categorical
variable
CREATING A BINARY VARIABLE IN STATA
• Please use the “binarydata_stata” dataset
• Library – whether a family member owned a library card when the respondent was 14
• Urban – the respondent lived in an urban area at the 2002 interview
• Government – the respondent works for the government
• To create a binary variable for “library”:
gen libraryd=0
replace libraryd=1 if library=="yes"
• Do earnings of those whose family owned a library card differ from the earnings of
those whose family did not? If yes, by how much?
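The same recoding idea can be sketched in Python. The helper `to_dummy` and the records are hypothetical and only loosely mirror the “library” variable; the values are made up.

```python
from statistics import mean

def to_dummy(value, one="yes", zero="no"):
    """Map a text answer to 1/0; anything else (blank, typo) stays missing."""
    cleaned = value.strip().lower() if isinstance(value, str) else None
    if cleaned == one:
        return 1
    if cleaned == zero:
        return 0
    return None

# Hypothetical records: (library answer, earnings). Made-up values.
records = [("yes", 52000), ("no", 41000), ("Yes ", 48000), ("", 39000), ("no", 43000)]

coded = [(to_dummy(answer), earnings) for answer, earnings in records]
with_card = [earnings for d, earnings in coded if d == 1]
without_card = [earnings for d, earnings in coded if d == 0]

gap = mean(with_card) - mean(without_card)
print(f"mean earnings gap (card minus no card) = {gap}")
```

One design choice to note: this sketch leaves blank answers missing, whereas starting from a variable that is 0 everywhere and switching "yes" to 1 would code blank answers as 0. Which behaviour is appropriate depends on how the data were recorded.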
DIY
• Please convert the government variable into a binary variable
• By conducting a mean-comparison test in Stata please pick the correct interpretation of
the test result
DIFFERENCE-IN-DIFFERENCES ESTIMATION TECHNIQUE
• Can support a causal interpretation, provided the two groups would have followed parallel trends without the treatment
• Needs a treatment group and a control group, each observed before and after the treatment
• We can use difference-in-differences, for example, when a new policy is implemented
at the local level
• Example: Card and Krueger (2000).
- Control group: fast food stores in Eastern Pennsylvania
- Treatment group: fast food stores in New Jersey
- Treatment: increase in the minimum wage in New Jersey on April 1, 1992
- Compare employment growth in Eastern Pennsylvania and New Jersey before and after
the treatment
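The arithmetic behind the technique is short enough to sketch in Python. The function name `diff_in_diff` and the numbers are hypothetical; they are not the Card and Krueger estimates.

```python
def diff_in_diff(treat_before, treat_after, control_before, control_after):
    """Change in the treatment group minus change in the control group."""
    change_treat = treat_after - treat_before
    change_control = control_after - control_before
    return change_treat - change_control

# Hypothetical average employment per store (made-up numbers)
estimate = diff_in_diff(
    treat_before=10.0, treat_after=12.0,      # treated region
    control_before=15.0, control_after=14.0,  # comparison region
)
print(f"difference-in-differences estimate = {estimate}")
```

The control group's change stands in for what would have happened to the treatment group without the policy, so subtracting it removes the common time trend.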
PAIRED T-TEST
• The paired t-test is used to determine whether the mean of a dependent variable (e.g.,
weight, anxiety level, salary, reaction time, etc.) is the same in two related groups (e.g.,
the same participants measured at two different "time points" or under
two different "conditions").
• Example 1: to understand whether there was a difference in managers' salaries before and after
undertaking a PhD
• Your dependent variable would be "salary", and your two related groups would be the two
different "time points"
• Example 2: to understand whether there was a difference in smokers' daily cigarette consumption 6 weeks
after wearing nicotine patches compared with wearing patches that did not contain nicotine,
known as a "placebo"
• Your dependent variable would be "daily cigarette consumption", and your two related groups
would be the two different "conditions" participants were exposed to; that is, cigarette
consumption values after wearing "nicotine patches" (the treatment group) compared to after
wearing the "placebo" (the control group).
• Specifically, you use a paired t-test to determine whether the mean difference between
the paired observations is statistically significantly different from zero.
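A paired t-test works on the within-pair differences, which the following Python sketch makes explicit. The function name `paired_t` and the before/after values are hypothetical.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """Return (mean difference, standard error, t-statistic) for paired data."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for b, a in zip(before, after)]
    d_bar = mean(diffs)
    # stdev() is the sample standard deviation of the differences
    se = stdev(diffs) / math.sqrt(len(diffs))
    return d_bar, se, d_bar / se

# Hypothetical salaries (in thousands) for the same five managers
salary_before = [50, 60, 55, 70, 65]
salary_after = [54, 63, 60, 72, 71]

d_bar, se, t = paired_t(salary_before, salary_after)
print(f"mean difference = {d_bar:.2f}, SE = {se:.3f}, t = {t:.2f}")
```

Because each observation is compared with itself, the test uses one column of differences rather than two separate group variances, which is what distinguishes it from the two-sample test.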
DIFF-N-DIFF CONTINUED. ANOTHER EXAMPLE
• Richardson and Troost (2009). Different monetary policies across Federal Reserve districts
• Mississippi is divided between the 6th and 8th Federal Reserve districts
• During the Great Depression the Atlanta Fed (6th district) increased lending by
30-40% to rescue banks from bankruptcy; the St. Louis Fed (8th district), if anything, cut lending by
10% (laissez-faire)
• Treatment group – Mississippi banks in the 6th Federal Reserve district
• Control group – Mississippi banks in the 8th Federal Reserve district
• Treatment – monetary policy during the Great Depression
MONETARY POLICY DURING GREAT DEPRESSION CONT’D
• Use banks.xlsx
• The district 6 and district 8 variables give the number of banks in each district at a point in
time
• Use the filter in Excel to find the number of banks on the first of July each year
• What is the difference in the number of banks between 1929 and 1933 in the 6th district?
• What is the difference in the number of banks between 1929 and 1933 in the 8th district?
• What is the difference of the two differences?
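For readers who prefer code to the Excel filter, the exercise can be sketched in Python as follows. The rows, the tuple layout and the function name `banks_on_july_first` are assumptions made for illustration; the counts are made up and are not the values in banks.xlsx.

```python
from datetime import date

# Hypothetical rows: (date, banks in District 6, banks in District 8)
rows = [
    (date(1929, 7, 1), 140, 170),
    (date(1929, 12, 1), 138, 168),
    (date(1933, 1, 1), 108, 115),
    (date(1933, 7, 1), 105, 110),
]

def banks_on_july_first(rows):
    """Keep only the July 1 observations, keyed by year."""
    return {d.year: (b6, b8) for d, b6, b8 in rows if (d.month, d.day) == (7, 1)}

snapshot = banks_on_july_first(rows)
change_6 = snapshot[1933][0] - snapshot[1929][0]  # treatment district
change_8 = snapshot[1933][1] - snapshot[1929][1]  # control district
did = change_6 - change_8

print(f"District 6 change = {change_6}, District 8 change = {change_8}, diff-in-diff = {did}")
```

A positive difference of the two differences would mean the 6th district lost fewer banks than the 8th over the same period.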
DIFF-N-DIFF WATER SUPPLY AND CHOLERA EXAMPLE
• John Snow (1855) described the relationship between water supply and cholera deaths
in London over time
• In 1849 both the Southwark and Vauxhall Company and the Lambeth Company drew
water for South London from the contaminated Thames in central London
• In 1852 the Lambeth Company started drawing water from an uncontaminated source
upstream.
• What would we expect to happen in this case?
• What is the control group? Treatment group? Treatment?
DIFF-N-DIFF WATER SUPPLY AND CHOLERA EXAMPLE
CONT’D
• Use the Cholera_deaths Excel dataset
• To conduct the diff-n-diff analysis here what should you do? What is the conclusion?
• In Stata: statistics => summaries, tables, and tests => classical tests of hypotheses => t-test (mean-comparison test) => paired test => by group
• What is the conclusion based on the statistical significance of the test?
REVIEW
1. What is the difference between categorical variables and continuous variables? Please give an example
of each
2. To run a regression with a categorical variable as an independent variable what do we need to do?
3. What is the difference in the interpretation of a regression with a continuous independent variable and a
regression with a categorical independent variable?
4. How do we interpret the constant in a regression with a categorical independent variable? Please give
an example
5. What do we mean by “omitted group” when including a categorical variable in a regression? Please give
an example
6. What does conducting a mean-comparison test allow us to do?
7. How do we conduct a mean-comparison test in Stata? When do we conduct a two-sample t-test and
when do we conduct a paired t-test?
8. What do we need to have in order to conduct a difference-in-differences analysis?
9. Intuitively, how do we conduct a difference-in-differences analysis?
