Regression
APAM E4990
Modeling Social Data
Jake Hofman
Columbia University
February 24, 2017
Jake Hofman (Columbia University) Regression February 24, 2017 1 / 6
Definition
“The primary goal in a regression analysis is to
understand, as far as possible with the available data,
how the conditional distribution of the response varies
across subpopulations determined by the possible
values of the predictor or predictors.”
-“Applied Regression Including Computing and Graphics”
Cook & Weisberg (1999)
Goals
Describe
Provide a compact summary of outcomes under different conditions
Never “false”, but may be wasteful or misleading
Predict
Make forecasts for future outcomes or unobserved conditions
Varying degrees of success, often room for improvement
Explain
Account for associations between predictors and outcomes
Difficult to establish causality in observational studies
See “Regression Analysis: A Constructive Critique”, Berk (2004)
Goals
Models should be flexible enough to describe observed phenomena
but simple enough to generalize to future observations
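The trade-off between flexibility and generalization can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not from the lecture: the synthetic sine-shaped data, the noise level, and the choice of polynomial degrees 0, 3, and 10 as "too simple", "moderate", and "very flexible" models.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """Synthetic data (hypothetical): a smooth curve plus Gaussian noise."""
    x = rng.uniform(-1.0, 1.0, size=n)
    y = np.sin(np.pi * x) + rng.normal(scale=0.3, size=n)
    return x, y

x_train, y_train = make_data(30)     # observed data used for fitting
x_test, y_test = make_data(1000)     # stand-in for "future observations"

def fit_and_score(degree):
    """Fit a polynomial by least squares; return (train MSE, test MSE)."""
    coefs = np.polyfit(x_train, y_train, deg=degree)
    train_mse = np.mean((np.polyval(coefs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coefs, x_test) - y_test) ** 2)
    return train_mse, test_mse

results = {d: fit_and_score(d) for d in (0, 3, 10)}
for degree, (train_mse, test_mse) in results.items():
    print(f"degree {degree:2d}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")
```

Training error can only go down as the degree grows, because each model contains the simpler ones. Error on the held-out points typically stops improving, and often gets worse, once the model starts fitting noise.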
Examples¹
“[…] of some response y varies across subpopulations determined by the possible values of the predictor or predictors” (Cook and Weisberg, 1999: 27). That is, interest centers on the distribution of the response variable Y conditioning on one or more predictors X.
This definition includes a wide variety of elementary procedures easily implemented in R. (See, for example, Maindonald and Braun, 2007: Chapter 2.) For example, consider Figures 1.1 and 1.2. The first shows the distribution of SAT scores for recent applicants to a major university, who self-identify as “Hispanic.” The second shows the distribution of SAT scores for recent applicants to that same university, who self-identify as “Asian.”
Fig. 1.1. Distribution of SAT scores for Hispanic applicants.
Fig. 1.2. Distribution of SAT scores for Asian applicants. (Histogram; x-axis: SAT score, y-axis: frequency.)
It is clear that the two distributions differ substantially. The Asian distribution is shifted to the right, leading to a distribution with a higher mean (1227 compared to 1072), a smaller standard deviation (170 compared to […]) and greater skewing. A comparative description of the two histograms constitutes a proper regression analysis. Using various summary statistics, some key features of the two displays are compared and contrasted ([…]
Should one be especially interested in a comparison of the means, one could proceed descriptively with a conventional least squares regression analysis as a special case. That is, for each observation i, one could let
\hat{y}_i = \beta_0 + \beta_1 x_i, \qquad (1.1)
where the response variable y is each applicant’s SAT score, x is an indicator […]
¹ “Statistical Learning from a Regression Perspective”, Berk (2008)
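The comparison of means written as a least squares regression can be checked numerically. The sketch below assumes that x is a 0/1 indicator of group membership (the excerpt is cut off before it says how x is coded) and uses small made-up score arrays, not the data behind Figs. 1.1 and 1.2. Under that assumption the fitted intercept equals the mean of the x = 0 group and the fitted slope equals the difference between the two group means.

```python
import numpy as np

# Made-up SAT-like scores for two groups (hypothetical, for illustration only).
scores_group0 = np.array([900.0, 1000.0, 1050.0, 1100.0, 1200.0])
scores_group1 = np.array([1100.0, 1200.0, 1250.0, 1300.0, 1400.0])

# Response y and indicator x (0 for the first group, 1 for the second).
y = np.concatenate([scores_group0, scores_group1])
x = np.concatenate([np.zeros(len(scores_group0)), np.ones(len(scores_group1))])

# Least squares fit of y_hat_i = beta0 + beta1 * x_i.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
beta0, beta1 = beta

mean0 = scores_group0.mean()
mean1 = scores_group1.mean()
print(f"beta0 = {beta0:.1f}, mean of group 0 = {mean0:.1f}")
print(f"beta1 = {beta1:.1f}, difference in means = {mean1 - mean0:.1f}")
```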
Examples¹
Fig. 1.4. SAT scores by family income. (Plot titled “SAT Score by Household Income”; x-axis: income, bounded at $100,000; y-axis: SAT score.)
¹ “Statistical Learning from a Regression Perspective”, Berk (2008)
Examples¹
Fig. 1.5. Freshman GPA on SAT holding high school GPA constant. (Plot; axis labels: Freshman GPA, High School GPA.)
¹ “Statistical Learning from a Regression Perspective”, Berk (2008)
Framework
• Specify the outcome and predictors, along with the form of
the model relating them
• Define a loss function that quantifies how close a model’s
predictions are to observed outcomes
• Develop an algorithm to fit the model to the observations by
minimizing this loss
• Assess model performance and interpret results
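The four steps above can be sketched end to end for the simplest case. The specific choices here are assumptions for illustration and are not prescribed by the slides: a one-predictor linear model, mean squared error as the loss, plain gradient descent as the fitting algorithm, a held-out split for assessment, and synthetic data with arbitrary hypothetical coefficients.

```python
import numpy as np

rng = np.random.default_rng(1)

# Step 1: specify the outcome y, the predictor x, and the model form
# y_hat = beta[0] + beta[1] * x. The data are synthetic (hypothetical).
n = 200
x = rng.uniform(0.0, 10.0, size=n)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=n)
train, test = slice(0, 150), slice(150, None)

def predict(beta, x):
    return beta[0] + beta[1] * x

# Step 2: define a loss that measures how close predictions are to outcomes.
def squared_loss(beta, x, y):
    return np.mean((predict(beta, x) - y) ** 2)

# Step 3: fit the model by minimizing the loss with gradient descent.
def fit(x, y, learning_rate=0.01, steps=5000):
    beta = np.zeros(2)
    for _ in range(steps):
        residual = predict(beta, x) - y
        gradient = 2.0 * np.array([residual.mean(), (residual * x).mean()])
        beta -= learning_rate * gradient
    return beta

beta_hat = fit(x[train], y[train])

# Step 4: assess performance on held-out data against a constant baseline
# that always predicts the training mean, then interpret the coefficients.
test_mse = squared_loss(beta_hat, x[test], y[test])
baseline_mse = np.mean((y[train].mean() - y[test]) ** 2)
print(f"intercept {beta_hat[0]:.2f}, slope {beta_hat[1]:.2f}")
print(f"test MSE {test_mse:.2f} vs. baseline MSE {baseline_mse:.2f}")
```

For squared loss with a linear model the minimizer also has a closed form, so gradient descent is used here only to make the "algorithm" step explicit. The learning rate is tuned to this data's scale and would need adjusting for other predictors.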
Jake Hofman (Columbia University) Regression February 24, 2017 6 / 6

More Related Content

What's hot

Hierarchical clustering and topology for psychometric validation
Hierarchical clustering and topology for psychometric validationHierarchical clustering and topology for psychometric validation
Hierarchical clustering and topology for psychometric validationColleen Farrelly
 
Logistic regression: topological and geometric considerations
Logistic regression: topological and geometric considerationsLogistic regression: topological and geometric considerations
Logistic regression: topological and geometric considerationsColleen Farrelly
 
1.2 Data Classification
1.2 Data Classification1.2 Data Classification
1.2 Data Classificationleblance
 
Strong Heredity Models in High Dimensional Data
Strong Heredity Models in High Dimensional DataStrong Heredity Models in High Dimensional Data
Strong Heredity Models in High Dimensional Datasahirbhatnagar
 
1645 track2 brandenburger_lempola
1645 track2 brandenburger_lempola1645 track2 brandenburger_lempola
1645 track2 brandenburger_lempolaRising Media, Inc.
 
Data-analytic sins in property-based molecular design
Data-analytic sins in property-based molecular design Data-analytic sins in property-based molecular design
Data-analytic sins in property-based molecular design Peter Kenny
 
Elementary Statistics Picturing the World ch01.1
Elementary Statistics Picturing the World ch01.1Elementary Statistics Picturing the World ch01.1
Elementary Statistics Picturing the World ch01.1Debra Wallace
 
Detecting Dif Between Conventional And Computerized Adaptive Testing.Ppt
Detecting Dif Between Conventional And Computerized Adaptive Testing.PptDetecting Dif Between Conventional And Computerized Adaptive Testing.Ppt
Detecting Dif Between Conventional And Computerized Adaptive Testing.Pptbarthriley
 
Benchmarking the Effectiveness of Associating Chains of Links for Exploratory...
Benchmarking the Effectiveness of Associating Chains of Links for Exploratory...Benchmarking the Effectiveness of Associating Chains of Links for Exploratory...
Benchmarking the Effectiveness of Associating Chains of Links for Exploratory...Laurens De Vocht
 
Discriminant analysis
Discriminant analysisDiscriminant analysis
Discriminant analysisWansuklangk
 
Advanced Methods of Statistical Analysis used in Animal Breeding.
Advanced Methods of Statistical Analysis used in Animal Breeding.Advanced Methods of Statistical Analysis used in Animal Breeding.
Advanced Methods of Statistical Analysis used in Animal Breeding.DrBarada Mohanty
 
Fitting and understanding Multilevel Models-Andrew Gelman
 Fitting and understanding Multilevel Models-Andrew Gelman  Fitting and understanding Multilevel Models-Andrew Gelman
Fitting and understanding Multilevel Models-Andrew Gelman Deepak Kumar
 
Statistical Methods to Handle Missing Data
Statistical Methods to Handle Missing DataStatistical Methods to Handle Missing Data
Statistical Methods to Handle Missing DataTianfan Song
 
Introduction to random forest and gradient boosting methods a lecture
Introduction to random forest and gradient boosting methods   a lectureIntroduction to random forest and gradient boosting methods   a lecture
Introduction to random forest and gradient boosting methods a lectureShreyas S K
 
Detecting Attributes and Covariates Interaction in Discrete Choice Model
Detecting Attributes and Covariates Interaction in Discrete Choice ModelDetecting Attributes and Covariates Interaction in Discrete Choice Model
Detecting Attributes and Covariates Interaction in Discrete Choice Modelkosby2000
 

What's hot (20)

Mixed models
Mixed modelsMixed models
Mixed models
 
Hierarchical clustering and topology for psychometric validation
Hierarchical clustering and topology for psychometric validationHierarchical clustering and topology for psychometric validation
Hierarchical clustering and topology for psychometric validation
 
Logistic regression: topological and geometric considerations
Logistic regression: topological and geometric considerationsLogistic regression: topological and geometric considerations
Logistic regression: topological and geometric considerations
 
1.2 Data Classification
1.2 Data Classification1.2 Data Classification
1.2 Data Classification
 
Strong Heredity Models in High Dimensional Data
Strong Heredity Models in High Dimensional DataStrong Heredity Models in High Dimensional Data
Strong Heredity Models in High Dimensional Data
 
1645 track2 brandenburger_lempola
1645 track2 brandenburger_lempola1645 track2 brandenburger_lempola
1645 track2 brandenburger_lempola
 
Combined queries
Combined queriesCombined queries
Combined queries
 
Data-analytic sins in property-based molecular design
Data-analytic sins in property-based molecular design Data-analytic sins in property-based molecular design
Data-analytic sins in property-based molecular design
 
A1 m deon
A1 m deonA1 m deon
A1 m deon
 
Elementary Statistics Picturing the World ch01.1
Elementary Statistics Picturing the World ch01.1Elementary Statistics Picturing the World ch01.1
Elementary Statistics Picturing the World ch01.1
 
Detecting Dif Between Conventional And Computerized Adaptive Testing.Ppt
Detecting Dif Between Conventional And Computerized Adaptive Testing.PptDetecting Dif Between Conventional And Computerized Adaptive Testing.Ppt
Detecting Dif Between Conventional And Computerized Adaptive Testing.Ppt
 
Benchmarking the Effectiveness of Associating Chains of Links for Exploratory...
Benchmarking the Effectiveness of Associating Chains of Links for Exploratory...Benchmarking the Effectiveness of Associating Chains of Links for Exploratory...
Benchmarking the Effectiveness of Associating Chains of Links for Exploratory...
 
Discriminant analysis
Discriminant analysisDiscriminant analysis
Discriminant analysis
 
Advanced Methods of Statistical Analysis used in Animal Breeding.
Advanced Methods of Statistical Analysis used in Animal Breeding.Advanced Methods of Statistical Analysis used in Animal Breeding.
Advanced Methods of Statistical Analysis used in Animal Breeding.
 
Multicollinearity
MulticollinearityMulticollinearity
Multicollinearity
 
Fitting and understanding Multilevel Models-Andrew Gelman
 Fitting and understanding Multilevel Models-Andrew Gelman  Fitting and understanding Multilevel Models-Andrew Gelman
Fitting and understanding Multilevel Models-Andrew Gelman
 
Discriminant analysis
Discriminant analysisDiscriminant analysis
Discriminant analysis
 
Statistical Methods to Handle Missing Data
Statistical Methods to Handle Missing DataStatistical Methods to Handle Missing Data
Statistical Methods to Handle Missing Data
 
Introduction to random forest and gradient boosting methods a lecture
Introduction to random forest and gradient boosting methods   a lectureIntroduction to random forest and gradient boosting methods   a lecture
Introduction to random forest and gradient boosting methods a lecture
 
Detecting Attributes and Covariates Interaction in Discrete Choice Model
Detecting Attributes and Covariates Interaction in Discrete Choice ModelDetecting Attributes and Covariates Interaction in Discrete Choice Model
Detecting Attributes and Covariates Interaction in Discrete Choice Model
 

Viewers also liked

Modeling Social Data, Lecture 3: Data manipulation in R
Modeling Social Data, Lecture 3: Data manipulation in RModeling Social Data, Lecture 3: Data manipulation in R
Modeling Social Data, Lecture 3: Data manipulation in Rjakehofman
 
Modeling Social Data, Lecture 1: Overview
Modeling Social Data, Lecture 1: OverviewModeling Social Data, Lecture 1: Overview
Modeling Social Data, Lecture 1: Overviewjakehofman
 
Modeling Social Data, Lecture 2: Introduction to Counting
Modeling Social Data, Lecture 2: Introduction to CountingModeling Social Data, Lecture 2: Introduction to Counting
Modeling Social Data, Lecture 2: Introduction to Countingjakehofman
 
Modeling Social Data, Lecture 4: Counting at Scale
Modeling Social Data, Lecture 4: Counting at ScaleModeling Social Data, Lecture 4: Counting at Scale
Modeling Social Data, Lecture 4: Counting at Scalejakehofman
 
Computational Social Science, Lecture 10: Online Experiments
Computational Social Science, Lecture 10: Online ExperimentsComputational Social Science, Lecture 10: Online Experiments
Computational Social Science, Lecture 10: Online Experimentsjakehofman
 
Computational Social Science, Lecture 08: Counting Fast, Part II
Computational Social Science, Lecture 08: Counting Fast, Part IIComputational Social Science, Lecture 08: Counting Fast, Part II
Computational Social Science, Lecture 08: Counting Fast, Part IIjakehofman
 
Computational Social Science, Lecture 11: Regression
Computational Social Science, Lecture 11: RegressionComputational Social Science, Lecture 11: Regression
Computational Social Science, Lecture 11: Regressionjakehofman
 
Computational Social Science, Lecture 07: Counting Fast, Part I
Computational Social Science, Lecture 07: Counting Fast, Part IComputational Social Science, Lecture 07: Counting Fast, Part I
Computational Social Science, Lecture 07: Counting Fast, Part Ijakehofman
 
Computational Social Science, Lecture 13: Classification
Computational Social Science, Lecture 13: ClassificationComputational Social Science, Lecture 13: Classification
Computational Social Science, Lecture 13: Classificationjakehofman
 
Computational Social Science, Lecture 09: Data Wrangling
Computational Social Science, Lecture 09: Data WranglingComputational Social Science, Lecture 09: Data Wrangling
Computational Social Science, Lecture 09: Data Wranglingjakehofman
 
Computational Social Science, Lecture 06: Networks, Part II
Computational Social Science, Lecture 06: Networks, Part IIComputational Social Science, Lecture 06: Networks, Part II
Computational Social Science, Lecture 06: Networks, Part IIjakehofman
 
Computational Social Science, Lecture 05: Networks, Part I
Computational Social Science, Lecture 05: Networks, Part IComputational Social Science, Lecture 05: Networks, Part I
Computational Social Science, Lecture 05: Networks, Part Ijakehofman
 
Computational Social Science, Lecture 03: Counting at Scale, Part I
Computational Social Science, Lecture 03: Counting at Scale, Part IComputational Social Science, Lecture 03: Counting at Scale, Part I
Computational Social Science, Lecture 03: Counting at Scale, Part Ijakehofman
 
Computational Social Science, Lecture 04: Counting at Scale, Part II
Computational Social Science, Lecture 04: Counting at Scale, Part IIComputational Social Science, Lecture 04: Counting at Scale, Part II
Computational Social Science, Lecture 04: Counting at Scale, Part IIjakehofman
 
Computational Social Science, Lecture 02: An Introduction to Counting
Computational Social Science, Lecture 02: An Introduction to CountingComputational Social Science, Lecture 02: An Introduction to Counting
Computational Social Science, Lecture 02: An Introduction to Countingjakehofman
 
Chapter 10 be wto_agriculture
Chapter 10 be wto_agricultureChapter 10 be wto_agriculture
Chapter 10 be wto_agricultureHo Cao Viet
 
Capoeira power point- IKT
Capoeira power point- IKTCapoeira power point- IKT
Capoeira power point- IKTlanag
 

Viewers also liked (20)

Modeling Social Data, Lecture 3: Data manipulation in R
Modeling Social Data, Lecture 3: Data manipulation in RModeling Social Data, Lecture 3: Data manipulation in R
Modeling Social Data, Lecture 3: Data manipulation in R
 
Modeling Social Data, Lecture 1: Overview
Modeling Social Data, Lecture 1: OverviewModeling Social Data, Lecture 1: Overview
Modeling Social Data, Lecture 1: Overview
 
Modeling Social Data, Lecture 2: Introduction to Counting
Modeling Social Data, Lecture 2: Introduction to CountingModeling Social Data, Lecture 2: Introduction to Counting
Modeling Social Data, Lecture 2: Introduction to Counting
 
Modeling Social Data, Lecture 4: Counting at Scale
Modeling Social Data, Lecture 4: Counting at ScaleModeling Social Data, Lecture 4: Counting at Scale
Modeling Social Data, Lecture 4: Counting at Scale
 
Computational Social Science, Lecture 10: Online Experiments
Computational Social Science, Lecture 10: Online ExperimentsComputational Social Science, Lecture 10: Online Experiments
Computational Social Science, Lecture 10: Online Experiments
 
Computational Social Science, Lecture 08: Counting Fast, Part II
Computational Social Science, Lecture 08: Counting Fast, Part IIComputational Social Science, Lecture 08: Counting Fast, Part II
Computational Social Science, Lecture 08: Counting Fast, Part II
 
Computational Social Science, Lecture 11: Regression
Computational Social Science, Lecture 11: RegressionComputational Social Science, Lecture 11: Regression
Computational Social Science, Lecture 11: Regression
 
Computational Social Science, Lecture 07: Counting Fast, Part I
Computational Social Science, Lecture 07: Counting Fast, Part IComputational Social Science, Lecture 07: Counting Fast, Part I
Computational Social Science, Lecture 07: Counting Fast, Part I
 
Computational Social Science, Lecture 13: Classification
Computational Social Science, Lecture 13: ClassificationComputational Social Science, Lecture 13: Classification
Computational Social Science, Lecture 13: Classification
 
Computational Social Science, Lecture 09: Data Wrangling
Computational Social Science, Lecture 09: Data WranglingComputational Social Science, Lecture 09: Data Wrangling
Computational Social Science, Lecture 09: Data Wrangling
 
Computational Social Science, Lecture 06: Networks, Part II
Computational Social Science, Lecture 06: Networks, Part IIComputational Social Science, Lecture 06: Networks, Part II
Computational Social Science, Lecture 06: Networks, Part II
 
Computational Social Science, Lecture 05: Networks, Part I
Computational Social Science, Lecture 05: Networks, Part IComputational Social Science, Lecture 05: Networks, Part I
Computational Social Science, Lecture 05: Networks, Part I
 
Computational Social Science, Lecture 03: Counting at Scale, Part I
Computational Social Science, Lecture 03: Counting at Scale, Part IComputational Social Science, Lecture 03: Counting at Scale, Part I
Computational Social Science, Lecture 03: Counting at Scale, Part I
 
Computational Social Science, Lecture 04: Counting at Scale, Part II
Computational Social Science, Lecture 04: Counting at Scale, Part IIComputational Social Science, Lecture 04: Counting at Scale, Part II
Computational Social Science, Lecture 04: Counting at Scale, Part II
 
Computational Social Science, Lecture 02: An Introduction to Counting
Computational Social Science, Lecture 02: An Introduction to CountingComputational Social Science, Lecture 02: An Introduction to Counting
Computational Social Science, Lecture 02: An Introduction to Counting
 
Chapter 10 be wto_agriculture
Chapter 10 be wto_agricultureChapter 10 be wto_agriculture
Chapter 10 be wto_agriculture
 
Capoeira power point- IKT
Capoeira power point- IKTCapoeira power point- IKT
Capoeira power point- IKT
 
Death2
Death2Death2
Death2
 
Bx Lite
Bx LiteBx Lite
Bx Lite
 
La ecologia social de hoy
La ecologia social de hoyLa ecologia social de hoy
La ecologia social de hoy
 

Similar to Modeling Social Data, Lecture 6: Regression, Part 1

Module5.slp
Module5.slpModule5.slp
Module5.slpGimylin
 
Module5.slp
Module5.slpModule5.slp
Module5.slpGimylin
 
NON-PARAMETRIC TESTS by Prajakta Sawant
NON-PARAMETRIC TESTS by Prajakta SawantNON-PARAMETRIC TESTS by Prajakta Sawant
NON-PARAMETRIC TESTS by Prajakta SawantPRAJAKTASAWANT33
 
Module5.slp
Module5.slpModule5.slp
Module5.slpGimylin
 
Correlation: Bivariate Data and Scatter Plot
Correlation: Bivariate Data and Scatter PlotCorrelation: Bivariate Data and Scatter Plot
Correlation: Bivariate Data and Scatter PlotDenzelMontuya1
 
Assignment 1case study 6.1.jpgAssignment 1case study 6.1-1.docx
Assignment 1case study 6.1.jpgAssignment 1case study 6.1-1.docxAssignment 1case study 6.1.jpgAssignment 1case study 6.1-1.docx
Assignment 1case study 6.1.jpgAssignment 1case study 6.1-1.docxdeanmtaylor1545
 
Evaluation Of A Correlation Analysis Essay
Evaluation Of A Correlation Analysis EssayEvaluation Of A Correlation Analysis Essay
Evaluation Of A Correlation Analysis EssayCrystal Alvarez
 
Predicting an Applicant Status Using Principal Component, Discriminant and Lo...
Predicting an Applicant Status Using Principal Component, Discriminant and Lo...Predicting an Applicant Status Using Principal Component, Discriminant and Lo...
Predicting an Applicant Status Using Principal Component, Discriminant and Lo...inventionjournals
 
1. Consider the following partially completed computer printout fo.docx
1. Consider the following partially completed computer printout fo.docx1. Consider the following partially completed computer printout fo.docx
1. Consider the following partially completed computer printout fo.docxjackiewalcutt
 
Chapter6Chapter Guides.pdfIBM SPSS for Introductory Sta.docx
Chapter6Chapter Guides.pdfIBM SPSS for Introductory Sta.docxChapter6Chapter Guides.pdfIBM SPSS for Introductory Sta.docx
Chapter6Chapter Guides.pdfIBM SPSS for Introductory Sta.docxtiffanyd4
 
Add slides
Add slidesAdd slides
Add slidesRupa D
 
Data Description-Numerical Measure-Chap003 2 2.ppt
Data Description-Numerical Measure-Chap003 2 2.pptData Description-Numerical Measure-Chap003 2 2.ppt
Data Description-Numerical Measure-Chap003 2 2.pptArkoKesha
 
Running head DATA ANALYSIS1DATA ANALYSIS 7Dat.docx
Running head DATA ANALYSIS1DATA ANALYSIS 7Dat.docxRunning head DATA ANALYSIS1DATA ANALYSIS 7Dat.docx
Running head DATA ANALYSIS1DATA ANALYSIS 7Dat.docxhealdkathaleen
 
A. Section 7.1 Use Table 7-7 to complete t.docx
A. Section 7.1 Use Table 7-7 to complete t.docxA. Section 7.1 Use Table 7-7 to complete t.docx
A. Section 7.1 Use Table 7-7 to complete t.docxdaniahendric
 
A. Section 7.1 Use Table 7-7 to complete t.docx
A. Section 7.1 Use Table 7-7 to complete t.docxA. Section 7.1 Use Table 7-7 to complete t.docx
A. Section 7.1 Use Table 7-7 to complete t.docxSALU18
 
Estimating ambiguity preferences and perceptions in multiple prior models: Ev...
Estimating ambiguity preferences and perceptions in multiple prior models: Ev...Estimating ambiguity preferences and perceptions in multiple prior models: Ev...
Estimating ambiguity preferences and perceptions in multiple prior models: Ev...Nicha Tatsaneeyapan
 
Ordinal logistic regression
Ordinal logistic regression Ordinal logistic regression
Ordinal logistic regression Dr Athar Khan
 

Similar to Modeling Social Data, Lecture 6: Regression, Part 1 (20)

Ekonometrika
EkonometrikaEkonometrika
Ekonometrika
 
Module5.slp
Module5.slpModule5.slp
Module5.slp
 
Module5.slp
Module5.slpModule5.slp
Module5.slp
 
NON-PARAMETRIC TESTS by Prajakta Sawant
NON-PARAMETRIC TESTS by Prajakta SawantNON-PARAMETRIC TESTS by Prajakta Sawant
NON-PARAMETRIC TESTS by Prajakta Sawant
 
Module5.slp
Module5.slpModule5.slp
Module5.slp
 
Correlation: Bivariate Data and Scatter Plot
Correlation: Bivariate Data and Scatter PlotCorrelation: Bivariate Data and Scatter Plot
Correlation: Bivariate Data and Scatter Plot
 
exercises.pdf
exercises.pdfexercises.pdf
exercises.pdf
 
Assignment 1case study 6.1.jpgAssignment 1case study 6.1-1.docx
Assignment 1case study 6.1.jpgAssignment 1case study 6.1-1.docxAssignment 1case study 6.1.jpgAssignment 1case study 6.1-1.docx
Assignment 1case study 6.1.jpgAssignment 1case study 6.1-1.docx
 
Evaluation Of A Correlation Analysis Essay
Evaluation Of A Correlation Analysis EssayEvaluation Of A Correlation Analysis Essay
Evaluation Of A Correlation Analysis Essay
 
Predicting an Applicant Status Using Principal Component, Discriminant and Lo...
Predicting an Applicant Status Using Principal Component, Discriminant and Lo...Predicting an Applicant Status Using Principal Component, Discriminant and Lo...
Predicting an Applicant Status Using Principal Component, Discriminant and Lo...
 
1. Consider the following partially completed computer printout fo.docx
1. Consider the following partially completed computer printout fo.docx1. Consider the following partially completed computer printout fo.docx
1. Consider the following partially completed computer printout fo.docx
 
Chapter6Chapter Guides.pdfIBM SPSS for Introductory Sta.docx
Chapter6Chapter Guides.pdfIBM SPSS for Introductory Sta.docxChapter6Chapter Guides.pdfIBM SPSS for Introductory Sta.docx
Chapter6Chapter Guides.pdfIBM SPSS for Introductory Sta.docx
 
Add slides
Add slidesAdd slides
Add slides
 
Data Description-Numerical Measure-Chap003 2 2.ppt
Data Description-Numerical Measure-Chap003 2 2.pptData Description-Numerical Measure-Chap003 2 2.ppt
Data Description-Numerical Measure-Chap003 2 2.ppt
 
Running head DATA ANALYSIS1DATA ANALYSIS 7Dat.docx
Running head DATA ANALYSIS1DATA ANALYSIS 7Dat.docxRunning head DATA ANALYSIS1DATA ANALYSIS 7Dat.docx
Running head DATA ANALYSIS1DATA ANALYSIS 7Dat.docx
 
A. Section 7.1 Use Table 7-7 to complete t.docx
A. Section 7.1 Use Table 7-7 to complete t.docxA. Section 7.1 Use Table 7-7 to complete t.docx
A. Section 7.1 Use Table 7-7 to complete t.docx
 
A. Section 7.1 Use Table 7-7 to complete t.docx
A. Section 7.1 Use Table 7-7 to complete t.docxA. Section 7.1 Use Table 7-7 to complete t.docx
A. Section 7.1 Use Table 7-7 to complete t.docx
 
Estimating ambiguity preferences and perceptions in multiple prior models: Ev...
Estimating ambiguity preferences and perceptions in multiple prior models: Ev...Estimating ambiguity preferences and perceptions in multiple prior models: Ev...
Estimating ambiguity preferences and perceptions in multiple prior models: Ev...
 
Correlation.pptx
Correlation.pptxCorrelation.pptx
Correlation.pptx
 
Ordinal logistic regression
Ordinal logistic regression Ordinal logistic regression
Ordinal logistic regression
 

More from jakehofman

Modeling Social Data, Lecture 12: Causality & Experiments, Part 2
Modeling Social Data, Lecture 12: Causality & Experiments, Part 2Modeling Social Data, Lecture 12: Causality & Experiments, Part 2
Modeling Social Data, Lecture 12: Causality & Experiments, Part 2jakehofman
 
Modeling Social Data, Lecture 11: Causality and Experiments, Part 1
Modeling Social Data, Lecture 11: Causality and Experiments, Part 1Modeling Social Data, Lecture 11: Causality and Experiments, Part 1
Modeling Social Data, Lecture 11: Causality and Experiments, Part 1jakehofman
 
Modeling Social Data, Lecture 10: Networks
Modeling Social Data, Lecture 10: NetworksModeling Social Data, Lecture 10: Networks
Modeling Social Data, Lecture 10: Networksjakehofman
 
Modeling Social Data, Lecture 8: Classification
Modeling Social Data, Lecture 8: ClassificationModeling Social Data, Lecture 8: Classification
Modeling Social Data, Lecture 8: Classificationjakehofman
 
Modeling Social Data, Lecture 8: Recommendation Systems
Modeling Social Data, Lecture 8: Recommendation SystemsModeling Social Data, Lecture 8: Recommendation Systems
Modeling Social Data, Lecture 8: Recommendation Systemsjakehofman
 
Modeling Social Data, Lecture 6: Classification with Naive Bayes
Modeling Social Data, Lecture 6: Classification with Naive BayesModeling Social Data, Lecture 6: Classification with Naive Bayes
Modeling Social Data, Lecture 6: Classification with Naive Bayesjakehofman
 
Modeling Social Data, Lecture 3: Counting at Scale
Modeling Social Data, Lecture 3: Counting at ScaleModeling Social Data, Lecture 3: Counting at Scale
Modeling Social Data, Lecture 3: Counting at Scalejakehofman
 
Modeling Social Data, Lecture 2: Introduction to Counting
Modeling Social Data, Lecture 2: Introduction to CountingModeling Social Data, Lecture 2: Introduction to Counting
Modeling Social Data, Lecture 2: Introduction to Countingjakehofman
 
Modeling Social Data, Lecture 1: Case Studies
Modeling Social Data, Lecture 1: Case StudiesModeling Social Data, Lecture 1: Case Studies
Modeling Social Data, Lecture 1: Case Studiesjakehofman
 
NYC Data Science Meetup: Computational Social Science
NYC Data Science Meetup: Computational Social ScienceNYC Data Science Meetup: Computational Social Science
NYC Data Science Meetup: Computational Social Sciencejakehofman
 
Technical Tricks of Vowpal Wabbit
Technical Tricks of Vowpal WabbitTechnical Tricks of Vowpal Wabbit
Technical Tricks of Vowpal Wabbitjakehofman
 
Data-driven modeling: Lecture 10
Data-driven modeling: Lecture 10Data-driven modeling: Lecture 10
Data-driven modeling: Lecture 10jakehofman
 
Data-driven modeling: Lecture 09
Data-driven modeling: Lecture 09Data-driven modeling: Lecture 09
Data-driven modeling: Lecture 09jakehofman
 
Using Data to Understand the Brain
Using Data to Understand the BrainUsing Data to Understand the Brain
Using Data to Understand the Brainjakehofman
 

More from jakehofman (14)

Modeling Social Data, Lecture 12: Causality & Experiments, Part 2
Modeling Social Data, Lecture 12: Causality & Experiments, Part 2Modeling Social Data, Lecture 12: Causality & Experiments, Part 2
Modeling Social Data, Lecture 12: Causality & Experiments, Part 2
 
Modeling Social Data, Lecture 11: Causality and Experiments, Part 1
Modeling Social Data, Lecture 11: Causality and Experiments, Part 1Modeling Social Data, Lecture 11: Causality and Experiments, Part 1
Modeling Social Data, Lecture 11: Causality and Experiments, Part 1
 
Modeling Social Data, Lecture 10: Networks
Modeling Social Data, Lecture 6: Regression, Part 1

  • 1. Regression APAM E4990 Modeling Social Data Jake Hofman Columbia University February 24, 2017 Jake Hofman (Columbia University) Regression February 24, 2017 1 / 6
  • 2. Definition ?
  • 3. Definition
  • 4. Definition “The primary goal in a regression analysis is to understand, as far as possible with the available data, how the conditional distribution of the response varies across subpopulations determined by the possible values of the predictor or predictors.” -“Applied Regression Including Computing and Graphics” Cook & Weisberg (1999)
  • 5. Goals. Describe: provide a compact summary of outcomes under different conditions. Predict: make forecasts for future outcomes or unobserved conditions. Explain: account for associations between predictors and outcomes.
  • 6. Goals. Describe: provide a compact summary of outcomes under different conditions; never “false”, but may be wasteful or misleading. Predict: make forecasts for future outcomes or unobserved conditions; varying degrees of success, often room for improvement. Explain: account for associations between predictors and outcomes; difficult to establish causality in observational studies. See “Regression Analysis: A Constructive Critique”, Berk (2004).
  • 7. Goals. Models should be flexible enough to describe observed phenomena but simple enough to generalize to future observations.
  • 8. Examples. From “Statistical Learning from a Regression Perspective”, Berk (2008), Section 1.2, “Setting the Regression Context”: “… of some response y varies across subpopulations determined by the possible values of the predictor or predictors” (Cook and Weisberg, 1999: 27). That is, interest centers on the distribution of the response variable Y conditional on one or more predictors X. This definition includes a wide variety of elementary procedures easily implemented in R. (See, for example, Maindonald and Braun, 2007: Chapter 2.) For example, consider Figures 1.1 and 1.2. The first shows the distribution of SAT scores for recent applicants to a major university, who self-identify as “Hispanic.” The second shows the distribution of SAT scores for recent applicants to that same university, who self-identify as “Asian.” [Fig. 1.1. Distribution of SAT scores for Hispanic applicants.] It is clear that the two distributions differ substantially. The Asian distribution is shifted to the right, leading to a distribution with a higher mean (1227 compared to 1072), a smaller standard deviation (170 compared to …), and greater skewing. A comparative description of the two histograms constitutes a proper regression analysis. Using various summary statistics, some key features of the two displays are compared and contrasted (… Should one be especially interested in a comparison of the means, one could proceed descriptively with a conventional least squares regression analysis as a special case. That is, for each observation i, one could let $\hat{y}_i = \beta_0 + \beta_1 x_i$ (1.1), where the response variable y is each applicant’s SAT score, x is an indicator … [Fig. 1.2. Distribution of SAT scores for Asian applicants; histogram titled “SAT Scores for Asian Applicants”, x-axis: SAT Score, y-axis: Frequency.]
  • 9. Examples. [Fig. 1.4. SAT scores by family income; plot titled “SAT Score by Household Income”, income bounded at $100,000, x-axis: Income, y-axis: SAT Score.] Source: “Statistical Learning from a Regression Perspective”, Berk (2008).
  • 10. Examples. [Fig. 1.5. Freshman GPA on SAT holding high school GPA constant; axis labels: Freshman GPA, High School GPA.] Source: “Statistical Learning from a Regression Perspective”, Berk (2008).
  • 11. Framework. (1) Specify the outcome and predictors, along with the form of the model relating them. (2) Define a loss function that quantifies how close a model’s predictions are to observed outcomes. (3) Develop an algorithm to fit the model to the observations by minimizing this loss. (4) Assess model performance and interpret results.
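
The four framework steps can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the lecture: the data are synthetic, and the names `predict`, `squared_loss`, and `fit_least_squares` are made up for this example. It uses a 0/1 indicator as the single predictor, so the fitted intercept is the mean of one group and the fitted slope is the difference in group means, in the spirit of equation (1.1) on slide 8.

```python
import numpy as np

rng = np.random.default_rng(0)

# Step 1: specify the outcome y, the predictor x, and the model form
# y_hat = beta0 + beta1 * x. The data are synthetic (invented for this sketch):
# x is a 0/1 group indicator and y is a noisy score.
n = 200
x = rng.integers(0, 2, size=n)
y = 1000.0 + 150.0 * x + rng.normal(0.0, 100.0, size=n)


def predict(beta, x):
    """Linear model with an intercept and one slope."""
    return beta[0] + beta[1] * x


# Step 2: define a loss, here the mean squared error of the predictions.
def squared_loss(beta, x, y):
    return np.mean((y - predict(beta, x)) ** 2)


# Step 3: fit the model by minimizing the squared loss (ordinary least squares).
def fit_least_squares(x, y):
    X = np.column_stack([np.ones_like(x, dtype=float), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


# Step 4: assess performance on held-out data and interpret the coefficients.
train = np.arange(n) < 150
test = ~train
beta = fit_least_squares(x[train], y[train])
baseline = np.array([y[train].mean(), 0.0])  # predicts the overall mean, ignores x

print("intercept (mean of group x=0):", beta[0])
print("slope (difference in group means):", beta[1])
print("held-out loss, fitted model:", squared_loss(beta, x[test], y[test]))
print("held-out loss, mean-only baseline:", squared_loss(baseline, x[test], y[test]))
```

Comparing against a mean-only baseline on held-out observations is one simple way to check whether the extra flexibility of the model generalizes; other loss functions or fitting algorithms would slot into steps 2 and 3 without changing the overall structure.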