 Purpose  – Determine if one or more IVs
  can predict a DV
 Examples:
  • Does your height (IV) predict how much money
    you will spend (DV)?
  • Does the number of store managers (IV) predict how
    often the machine will break down (DV)?
  • Do the number of clicks (IV1) and the number of
    comments (IV2) on the blog predict the size of
    revenue (DV)?
Research Question                      Inferential Statistics
Compare means of 2 numeric variables   T test
Relate 2 categorical variables         Pearson Chi Square
Relate 2 numeric variables             Pearson Correlation r
Use 1+ IVs to explain 1 numeric DV     Regression
 Correlation     tells us how X relates to Y (in
  the past)
 Simple Regression tells us how X
  predicts Y (in the future)
    • E.g., Does AvgDailyClicks predict
     DirectSalesRevenue?
   Multiple Regression tells us how X1, X2,
    X3, … predict Y
    • E.g., Do NumberBlogAuthors & AvgDailyClicks
     predict SponsorRevenue?
 The relationship between the Xs and Y is
  linear
 If you have 2 or more Xs, they are not
  perfectly correlated with each other
 Xs are not correlated with external
  variables
 Independence – Any two observations
  should be independent from each other.
 Errors are normally distributed
 And a few others
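
The second assumption above (two Xs must not be perfectly correlated) can be checked before running a regression. The following is a minimal sketch, assuming small hypothetical predictor lists named x1, x2 and x3; the helper pearson_r is written here for illustration and is not part of the original slides.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation r between two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: x2 is an exact linear function of x1 (x2 = 2*x1 + 1),
# so the two predictors carry the same information.
x1 = [1, 2, 3, 4, 5]
x2 = [3, 5, 7, 9, 11]
# Hypothetical data: x3 is related to x1, but not perfectly.
x3 = [2, 1, 4, 3, 5]

r_12 = pearson_r(x1, x2)  # perfectly correlated pair: cannot be used together
r_13 = pearson_r(x1, x3)  # imperfectly correlated pair: acceptable
print(round(r_12, 3), round(r_13, 3))
```

An r of exactly 1 or -1 between two Xs signals that one of them should be dropped before fitting.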
 Example: Does Number of Stupid
  Customers predict Self Checkout Error
  Rate?
 When we use X to predict Y:
  • X = the predictor = the independent variable (IV)
  • Y = the predicted value = the dependent variable (the
    value of Y depends on the predictor X) (DV)
  • You’re basically building a linear model between X and
    Y:
   Y = Constant + B*X + error
   Y = Constant + B*X + error
   Y = 1 + 2*X
   [Figure: graph of the line Y = 1 + 2*X, with Constant = 1 and Slope B = 2. Source: Wikipedia]
Who is the best fitting model?
    (Hint: Not Kate Moss)
Line that’s closest to all dots
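
One common way to define the "line that's closest to all dots" is ordinary least squares, which picks the constant and slope that minimize the sum of squared vertical distances. The sketch below assumes that definition and uses made-up dots scattered around Y = 1 + 2*X; the function name fit_line and the data are illustrative only.

```python
def fit_line(xs, ys):
    """Return (constant, slope) of the least-squares line through the dots."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope B: how X and Y vary together, relative to how much X varies
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # Constant: the line must pass through the point (mean_x, mean_y)
    constant = mean_y - slope * mean_x
    return constant, slope

# Hypothetical dots scattered around the line Y = 1 + 2*X
xs = [0, 1, 2, 3, 4]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

constant, slope = fit_line(xs, ys)
print(f"Y = {constant:.2f} + {slope:.2f}*X")
```

Statistical packages perform the same calculation and also report significance tests for the constant and the slope.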
Goodness of Fit (R2): How well does the line fit the data?
(How well does Kate fit the average woman?)
[Figure: scatterplot with a regression line, labeled with the constant and Slope B]
Distances to regression line = error
Good fit = small errors
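
R2 can be computed directly from the errors: it is 1 minus the ratio of the squared errors around the line to the squared deviations of Y around its own mean. The sketch below assumes that standard definition; the function r_squared and the two small data sets are hypothetical, and the line Y = 1 + 2*X is taken as given rather than fitted.

```python
def r_squared(xs, ys, constant, slope):
    """Share of the variation in ys explained by the line constant + slope*x."""
    mean_y = sum(ys) / len(ys)
    # Errors: squared vertical distances from each dot to the line
    sse = sum((y - (constant + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Total variation: squared distances from each dot to the mean of Y
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1 - sse / sst

xs = [0, 1, 2, 3, 4]
ys_on_line = [1, 3, 5, 7, 9]          # dots exactly on Y = 1 + 2*X
ys_noisy = [1.1, 2.9, 5.2, 6.8, 9.0]  # dots near the same line

r2_perfect = r_squared(xs, ys_on_line, constant=1, slope=2)
r2_noisy = r_squared(xs, ys_noisy, constant=1, slope=2)
print(r2_perfect, round(r2_noisy, 4))
```

Dots that sit exactly on the line leave no error, so R2 is 1; the noisier the dots, the further R2 falls below 1.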
   Y = Constant + B*X + error
   DirectSalesRevenue = 19.466 - .003*AvgDailyClicks + error
   • Constant (19.466) is significantly greater than zero
   • Slope (-.003) is significantly less than zero
   • Goodness of Fit (R2): Model explains 59% of the variation in DirectSalesRevenue
The number of average daily clicks
significantly predicted direct sales revenue, b
= -.003, t(39) = 14.72, p < .001. The number of
average daily clicks also explained a
significant proportion of variance in direct
sales revenue, R2 = .59, F(1, 38) = 42.64, p <
.001. These findings suggest that websites
with more average daily clicks tend to have
lower direct sales revenue.
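
Once the model is estimated, predicting Y for a new X is plain arithmetic. The sketch below plugs in the constant (19.466) and slope (-.003) reported on the regression output slide; the function name and the click counts are hypothetical, and the slides do not state the units of revenue.

```python
# Coefficients as reported on the slide: Y = 19.466 - .003*AvgDailyClicks
CONSTANT = 19.466
SLOPE = -0.003

def predict_direct_sales_revenue(avg_daily_clicks):
    """Predicted DirectSalesRevenue for a given AvgDailyClicks (error term omitted)."""
    return CONSTANT + SLOPE * avg_daily_clicks

# Hypothetical click counts, chosen only to illustrate the negative slope
for clicks in (0, 1000, 2000):
    print(clicks, round(predict_direct_sales_revenue(clicks), 3))
```

Because the slope is negative, each additional 1000 average daily clicks lowers the predicted revenue by 3 units.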
Y = 200X (R2 = 45%)
Given any X, we can predict a value of Y; the model explains 45% of the variance in Y
   Assumptions: Xs are somewhat independent; Y values are
    independent; Y values are normally distributed; errors are normally
    distributed; X Y relations are linear; no outliers
     • Example: Time series data are NOT independent – stock price today depends on
       stock price yesterday which depends on stock price the day before, etc.
   Multiple regression is just an extension of single regression
     • Use multiple Xs (e.g., both AvgDailyClicks and NumberAuthors) to
       predict Y
     • When you have a condition (e.g., customer choice depends on gender;
       brand awareness depends on comm. channel; number of applications
       depends on program of study), you need to create an interaction term
       (covered next class)
   When an X is categorical (e.g., whether the blog host is Google or
    WordPress): Code X in numbers – e.g., 0 is Google, 1 is WordPress
   When Y is categorical (e.g., whether the blog won the Outstanding
    Blog Award): Code Y in numbers – e.g. 0 is No, 1 is Yes, and use
    Logistic Regression
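
The coding schemes described above amount to a lookup from category labels to numbers. The following is a minimal sketch, assuming a hypothetical list of blog records; the field names host, avg_daily_clicks and won_award are invented for illustration.

```python
# Coding schemes from the slide: 0 is Google, 1 is WordPress; 0 is No, 1 is Yes
HOST_CODES = {"Google": 0, "WordPress": 1}
AWARD_CODES = {"No": 0, "Yes": 1}

# Hypothetical blog records
blogs = [
    {"host": "Google", "avg_daily_clicks": 120, "won_award": "No"},
    {"host": "WordPress", "avg_daily_clicks": 340, "won_award": "Yes"},
    {"host": "WordPress", "avg_daily_clicks": 95, "won_award": "No"},
]

# Each row holds the numeric Xs: (coded host, average daily clicks)
x_rows = [(HOST_CODES[b["host"]], b["avg_daily_clicks"]) for b in blogs]
# The coded categorical Y, which would call for Logistic Regression
y_values = [AWARD_CODES[b["won_award"]] for b in blogs]

print(x_rows)
print(y_values)
```

With a 0/1 coded X, the slope is read as the difference in predicted Y between the two categories.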
   What is your Y (the value you want to predict)?
   Is your Y categorical? If so, do you need Logistic Regression?
    See the instructor for help
   What is your X (your predictor variable)? How many Xs do
    you have?
   Are any of your Xs categorical? If so, do you have a coding
    scheme?
   Do you have a condition? (e.g., customer choice depends
    on gender; brand awareness depends on comm. channel;
    number of applications depends on program of study) If so,
    see the instructor for help
Research Question                      Inferential Statistics
Compare means of 2 numeric variables   T test
Relate 2 numeric variables             Pearson Correlation r
Relate 2 categorical variables         Pearson Chi Square
Use 1+ IVs to explain 1 numeric DV     Regression

S6 w2 linear regression


Editor's notes

  1. This slide is self-explanatory. Make sure you can recognize a research question that can be answered by simple linear regression; they are all predictive in nature.
  2. This is a review slide. Again, this table shows you where regression sits in the world of inferential statistics.
  3. Inferential statistics are more powerful when they can help us predict the future
  4. It’s entirely possible that the relationship between two variables is NOT linear. Check your scatterplots first to make sure the relationship looks somewhat linear; otherwise the simple linear regression method should NOT be used.
     Perfectly correlated Xs: suppose you want to see if gender (IV1) and race (IV2) predict spending (DV), but in your sample all men are Caucasian and all women are African American (perfect correlation between gender and race). Then you will NOT be able to run the regression.
     Xs correlated with external variables: suppose you want to see if eating ice cream (IV) causes people to go to the beach more often (DV). You will probably find a positive relationship. However, the IV correlates with an external variable (temperature) that causes variance in your DV (it determines whether people go to the beach or not). In this case running a regression would not make sense.
     Independence: suppose you interview 20 men and 30 women, but it turns out that 2 of the 20 men are the same person interviewed twice. Then the “independence” principle is violated.
     Errors: after the regression model is built, if you can still see a recognizable pattern in the errors, then the model is not good enough. The model should capture the trend of the data completely and leave behind completely random errors.
  5. This is the example used in Individual Assignment 6
  6. To understand regression, you first need to understand how a straight line is expressed mathematically. (1) All straight lines can be expressed in terms of a constant and a slope. (2) We use y = 2x + 1 as an example.
  7. Regression is like what Ralph Lauren and Armani do every day: finding the runway model that fits best. (Note: Kate Moss is one of the most prototypical runway models.)
  8. Like Kate Moss (or other runway models), the regression line represents an idealized version of the real world. The reality is the messy data we collected (aka the dots). The line is an idealized model that best represents the messy-data reality. A good model, in the world of statistics, is close to reality. The goal is to minimize the difference between the model and the reality. When a regression model represents the real world well, the errors (distances from the dots to the line) are minimal and the goodness-of-fit measure, R square, is large.
  9. According to this definition of “goodness of fit”, Kate Moss is a really bad model (poor goodness of fit, large errors; R square would be very small). Your goal is to do better than Kate Moss!
  10. In this case, Average Daily Clicks significantly predicted Direct Sales Revenue. However, the beta coefficient (the slope) is negative, which means that the more clicks, the lower the revenue. The model fit is pretty good.
  11. Get the total df value (39 in this case) from the ANOVA table
  12. So with regression you get a model, which is a line with a linear function (Y = BX). This means that given any X we can predict the value of Y. For example, suppose we would like to see if number of clicks (X) predicts revenue (Y), and we get the regression line Y = 200X with an R square of 45%. This means that given any number of clicks we can predict the expected revenue level, with the model explaining 45% of the variance in revenue. Perhaps this regression model is based on data of 5-20 clicks. But because the line can be extended indefinitely toward the upper right corner of the graph, the model also predicts that when we get 1000 clicks, our revenue will be $200,000, although that is an extrapolation far beyond the observed data.
  13. There is a LOT more to regression than what we discussed. We covered the basic concepts and you’re not expected to know more than that. However, this slide gives you some ideas about other considerations when running regression.
  14. Here’s some food for thought for your group project.
  15. To summarize, we have discussed 4 different kinds of inferential statistics in this course: T test, Correlation, Chi square, and Regression. How do you know which test is appropriate for your project? Use this summary table to decide. Many students ask which test is better than the others. This question is like asking whether a pregnancy test is better than a DNA test; it’s impossible to answer without knowing your objective. Some people also wonder if we can use more than one test in a research study. Of course! We take the same approach to other "research questions" in our lives. For example, if you want to know if you're pregnant, you get a pregnancy test. If you want to know if you're diabetic, you get a blood test. If you want to know who's the father of your child, you get a DNA test! If you need answers to all 3 questions, you order all 3 tests! Again, it's all about your research question!