ANOVA and Linear Regression
Analysis of Variance (ANOVA)
BUS B272 Unit 1
Analysis of Variance
The analysis of variance (ANOVA) is a procedure that tests whether differences exist among the means of two or more populations. The technique analyzes the variance of the data to determine whether we can infer that the population means differ.
Topics
One-way (single-factor) analysis of variance
ANOVA assumptions
F test for difference among k means
General Experimental Setting
The investigator controls one or more independent variables, called treatments or factors.
Each treatment contains two or more levels (or categories/classifications).
The investigator observes the effects on the dependent variable, i.e. the response to the different levels of the independent variable.
Experimental design: the plan used to test the hypothesis.
Completely Randomized Design
Experimental units (subjects) are assigned randomly to treatments; subjects are assumed homogeneous.
There is only one factor or independent variable, with two or more treatment levels.
Analyzed by one-way analysis of variance (one-way ANOVA).
Randomized Design Example (figure not reproduced)
One-way Analysis of Variance F Test
Evaluates the difference among the mean responses of two or more (k) populations, e.g. several types of tires, oven temperature settings, or different types of marketing strategies.
Assumptions of ANOVA
Samples are randomly and independently drawn
This condition must be met
Populations are normally distributed
F test is robust to moderate departure from normality
Populations have equal variances
Hypotheses of One-Way ANOVA
H0: μ1 = μ2 = … = μk
All population means are equal; there is no treatment effect (no variation in means among groups).
H1: Not all μj are equal
At least one population mean is different (others may be the same); there is a treatment effect. This does not mean that all population means are different.
One-way ANOVA (No Treatment Effect): the null hypothesis is true
One-way ANOVA (Treatment Effect Present): the null hypothesis is NOT true
One-way ANOVA (Partition of Total Variation)
Total variation, SS(Total) = variation due to treatment, SST + variation due to random sampling, SSE
ANOVA set-up
Total Variation
$SS(Total) = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (X_{ij} - \bar{\bar{X}})^2$
$X_{ij}$: the i-th observation in group j
$n_j$: the number of observations in group j
n: the total number of observations in all groups
k: the number of groups
$\bar{\bar{X}}$: the overall or grand mean
Among-Treatments Variation
Variation due to differences among groups:
$SST = \sum_{j=1}^{k} n_j (\bar{X}_j - \bar{\bar{X}})^2$, where $\bar{X}_j$ is the sample mean of group j and $\bar{\bar{X}}$ is the grand mean
$MST = SST / (k - 1)$
Within-Treatment Variation
Summing the variation within each treatment and then adding over all treatments:
$SSE = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$, where $\bar{X}_j$ is the sample mean of group j
$MSE = SSE / (n - k)$
For k = 2, MSE is the pooled variance in the t-test.
For 2 groups, use the t-test; the F test is more limited.
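One-way ANOVA F Test Statistic
Test statistic: F = MST / MSE
MST is the mean square among treatments (the between-group estimate of variance).
MSE is the mean square within treatments (the error estimate of variance).
Degrees of freedom: df1 = k - 1, df2 = n - k

One-way ANOVA Summary Table
Source of variation    df       SS          MS                    F
Among treatments       k - 1    SST         MST = SST / (k - 1)   MST / MSE
Within treatments      n - k    SSE         MSE = SSE / (n - k)
Total                  n - 1    SS(Total)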
Features of One-way ANOVA F Statistic
The F statistic is the ratio of the among-treatments estimate of variance to the within-treatments estimate of variance.
The ratio can never be negative.
df1 = k - 1 will typically be small.
df2 = n - k will typically be large.
The ratio should be close to 1 if the null hypothesis is true.
One-way ANOVA F Test Example
As production manager, you want to see if three filling machines have different mean filling times. You assign 15 similarly trained and experienced workers, five per machine, to the machines. At the 0.05 significance level, is there a difference in mean filling times?

Machine 1    Machine 2    Machine 3
25.40        23.40        20.00
26.31        21.80        22.20
24.10        23.50        19.75
23.74        22.75        20.60
25.10        21.60        20.40
One-way ANOVA Example: Scatter Diagram (filling time in seconds, on a scale from 19 to 27, plotted for each machine)
One-way ANOVA Example Computations
Summary Table
Source of variation    df             SS         MS         F
Among treatments       3 - 1 = 2      47.1640    23.5820    MST/MSE = 25.602
Within treatments      15 - 3 = 12    11.0532    0.9211
Total                  15 - 1 = 14    58.2172
One-way ANOVA Example Solution
H0: μ1 = μ2 = μ3
H1: Not all the means are equal
α = 0.05; df1 = 2, df2 = 12
Critical value: 3.89
Test statistic: F = MST / MSE = 23.5820 / 0.9211 = 25.602
Decision: Reject H0 at α = 0.05, since 25.602 > 3.89.
Conclusion: There is evidence to believe that at least one μi differs from the rest.
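The summary table for the filling-machine example can be reproduced with a few lines of Python. This is a minimal sketch using only the data and the critical value quoted on the slides; the variable names are illustrative and not part of the original deck.

```python
# One-way ANOVA for the filling-machine example (times in seconds).
groups = {
    "Machine 1": [25.40, 26.31, 24.10, 23.74, 25.10],
    "Machine 2": [23.40, 21.80, 23.50, 22.75, 21.60],
    "Machine 3": [20.00, 22.20, 19.75, 20.60, 20.40],
}

k = len(groups)                                  # number of groups
n = sum(len(v) for v in groups.values())         # total number of observations
grand_mean = sum(sum(v) for v in groups.values()) / n

# Among-treatments variation: group size times squared distance
# of each group mean from the grand mean.
sst = sum(len(v) * (sum(v) / len(v) - grand_mean) ** 2 for v in groups.values())

# Within-treatments variation: squared distance of each observation
# from its own group mean.
sse = sum((x - sum(v) / len(v)) ** 2 for v in groups.values() for x in v)

mst = sst / (k - 1)
mse = sse / (n - k)
f_stat = mst / mse

F_CRITICAL = 3.89  # critical value quoted on the slide for df = (2, 12), alpha = 0.05

print(f"SST = {sst:.4f}, SSE = {sse:.4f}, SS(Total) = {sst + sse:.4f}")
print(f"MST = {mst:.4f}, MSE = {mse:.4f}, F = {f_stat:.3f}")
print("Reject H0" if f_stat > F_CRITICAL else "Do not reject H0")
```

The critical value is taken from the slide rather than computed, so the sketch needs no statistics library.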
Computer Application
To obtain the Microsoft Excel output for this example, first enter the data into separate columns of an Excel file (one column per group), then follow the commands: Tools / Data Analysis / Anova: Single Factor
Computer Output using Data Analysis of Excel
Exercise 1
The manager of a large department store wants to test whether the average size of customer transactions differs among four types of payment: Visa card, company card, cash or cheque. If there are differences in the average customer transaction size among the four types of payment, the manager will further investigate which types of payment give rise to higher transaction volumes and will then design an appropriate promotional programme. A random sample of 54 customer transactions using various types of payment was drawn during the past two months, and sample statistics were obtained from the sampled data.
Test whether differences in average customer transaction size exist among the four types of payment at the 0.05 level of significance.
Exercise 1
One factor is involved, i.e. the type of payment. Under this factor, there are k = 4 treatments (or factor levels), which represent the four types of payment: Visa card, company card, cash and cheque. The experimental units are customer transactions.
Exercise 1
Since the test statistic of 39.16 is greater than the critical value of 2.80, reject H0. At the 0.05 level of significance, there is evidence that the average customer transaction sizes differ among the four types of payment.
Can ANOVA be replaced by the t-test?
The t-test detects a difference between two population means, μ1 and μ2.
Multiple t-tests are required for more than two population means.
Conducting multiple tests increases the probability of making a Type I error.
E.g. when comparing 6 population means using ANOVA at the 5% significance level, there is a 5% chance that we reject the null hypothesis when it is true.
If we use t-tests instead, we need to perform 15 tests (one per pair of means). If the same 5% significance level is set for each test and the tests are treated as independent, the chance of at least one Type I error will be 1 - (1 - 0.05)^15 = 0.54.
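The arithmetic behind the 0.54 figure can be checked directly. The sketch below assumes, as the slide's formula does, that the pairwise tests behave as if they were independent; the function name is illustrative.

```python
from math import comb


def familywise_error(num_means, alpha):
    """Return (number of pairwise tests, chance of at least one Type I error).

    Assumes every pairwise test is run at level alpha and that the tests
    are independent, which is the simplification used on the slide.
    """
    num_tests = comb(num_means, 2)           # one t-test per pair of means
    return num_tests, 1 - (1 - alpha) ** num_tests


tests, error_rate = familywise_error(6, 0.05)
print(f"{tests} tests, overall Type I error chance = {error_rate:.2f}")
```

Changing `num_means` shows how quickly the overall error rate grows with the number of groups, which is the motivation for using a single F test.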
Linear Regression
Linear Regression
Origin of regression
Determining the simple linear regression equation
Assessing the fitness of the model
Correlation analysis
Estimation and prediction
Assumptions of regression and correlation
Origin of Regression
“Regression”, from a Latin root meaning “going back”, is a series of statistical methods used in studying the relationship between two variables. The methods were first employed by Francis Galton in 1877. Galton was interested in studying the relationship between a father’s height and his son’s height. Using these methods, he found that sons’ heights regress toward the overall mean, and the method has been called “regression” ever since.
Linear Regression Analysis
Linear regression analysis is used primarily to model and describe linear relationships among variables and to provide predictions.
It predicts the value of a dependent (response) variable based on the value of at least one independent (explanatory) variable.
It expresses statistically the effect of the independent variables on the dependent variable.
Types of Regression Models (four scatter-plot panels): Positive Linear Relationship; Negative Linear Relationship; Relationship NOT Linear; No Relationship
Simple Linear Regression Model
The relationship between two variables, say X and Y, is described by a linear function. The change in the variable Y (called the dependent or response variable) is associated with the change in the other variable X (called the independent or explanatory variable). We explore the dependency of Y on X.
Why Regression?
The larger the sum of squared deviations of the points from a line, the poorer the estimate.
X    1    2      3      4
Y    2    2.5    2.5    5
Linear Relationship
We wish to study whether there is any association between two quantitative variables, say X and Y.
If Y tends to increase as X increases: a positive relationship.
If Y tends to decrease as X increases: a negative relationship.
If the corresponding magnitude of increase or decrease follows a specific proportion, the relationship identified is said to be a linear one.
Scatter Diagram
A scatter diagram is a graph plotted for all X-Y pairs of the sample data. By viewing a scatter diagram, one can determine whether a relationship exists between the two variables. It can also suggest the likely mathematical form of that relationship, which allows one to judge initially and intuitively whether or not a linear relationship exists between the two variables involved.
Example
The level of air pollution at Kwun Tong and the total number of consultations relating to respiratory diseases in a public clinic in the area were recorded on 14 randomly selected days during a specific time period.
Population Linear Regression
The population regression line is a straight line that describes the dependence of the average value (conditional mean) of one variable on the other.
Yi = β0 + β1Xi + εi
Yi: dependent (response) variable
Xi: independent (explanatory) variable
β0: population Y intercept
β1: population slope coefficient
εi: random error
β0 + β1Xi: population regression line (conditional mean of Y given X)
Population Linear Regression (continued)
Observed value of Y = conditional mean + random error, where the random error εi is the vertical discrepancy (residual) for point i.
Least Squares Method
The line fitted by least squares is the one that makes the sum of squares of all the vertical discrepancies (residuals) as small as possible, i.e. it minimizes Σ(Yi - Ŷi)², the sum of squared residuals.
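To make the least squares idea concrete, the sketch below uses the four points from the "Why Regression?" slide and compares the sum of squared residuals of the least squares line with that of one alternative line. The alternative (the line through the first and last points) is a hypothetical choice made only for comparison.

```python
# Four (X, Y) points from the "Why Regression?" slide.
xs = [1, 2, 3, 4]
ys = [2, 2.5, 2.5, 5]


def sum_sq_residuals(b0, b1):
    """Sum of squared vertical distances between the points and the line b0 + b1*x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least squares slope and intercept.
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum((x - x_bar) ** 2 for x in xs)
b0 = y_bar - b1 * x_bar

sse_least_squares = sum_sq_residuals(b0, b1)

# Hypothetical alternative: the line through (1, 2) and (4, 5), i.e. y = 1 + x.
sse_alternative = sum_sq_residuals(1.0, 1.0)

print(f"least squares line: y = {b0:.2f} + {b1:.2f}x, SSE = {sse_least_squares:.2f}")
print(f"alternative line:   y = 1.00 + 1.00x, SSE = {sse_alternative:.2f}")
```

Any other choice of intercept and slope can be passed to `sum_sq_residuals`; none gives a smaller value than the least squares line.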
Sample Linear Regression
The sample regression line is formed by the point estimates of β0 and β1, i.e., b0 and b1. It provides an estimate of the population regression line as well as a predicted value of Y.
Ŷi = b0 + b1Xi: sample regression line (fitted regression line or predicted value)
b0: sample Y intercept
b1: sample coefficient of slope
ei = Yi - Ŷi: residual
Sample Linear Regression (continued)
b0 and b1 are obtained by finding the specific values of b0 and b1 that minimize the sum of the squared residuals.
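Coefficients of Sample Linear Regression
For Ŷi = b0 + b1Xi:
$b_1 = \dfrac{\sum X_i Y_i - (\sum X_i)(\sum Y_i)/n}{\sum X_i^2 - (\sum X_i)^2/n}$
$b_0 = \bar{Y} - b_1 \bar{X}$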
Interpretation of the Slope and the Intercept
β0 is the average value of Y when the value of X is zero.
β1 measures the change in the average value of Y as a result of a one-unit change in X.
Interpretation of the Slope and the Intercept (continued)
b0 is the estimated average value of Y when the value of X is zero.
b1 is the estimated change in the average value of Y as a result of a one-unit change in X.
Example 1: Simple Linear Regression
Suppose that you want to examine the linear dependency of the annual sales of stores on their size in square feet. Sample data for seven stores were obtained. Find the equation of the straight line that fits the data best.

Store    Square Feet    Annual Sales ($1000)
1        1,726          3,681
2        1,542          3,395
3        2,816          6,653
4        5,555          9,543
5        1,292          3,318
6        2,208          5,563
7        1,313          3,760
Example 1: Scatter Diagram (Excel output, not reproduced)
Computation of Regression Coefficient

Store    Square Feet (X)    Annual Sales, $1000 (Y)    X²            Y²             XY
1        1,726              3,681                      2,979,076     13,549,761     6,353,406
2        1,542              3,395                      2,377,764     11,526,025     5,235,090
3        2,816              6,653                      7,929,856     44,262,409     18,734,848
4        5,555              9,543                      30,858,025    91,068,849     53,011,365
5        1,292              3,318                      1,669,264     11,009,124     4,286,856
6        2,208              5,563                      4,875,264     30,946,969     12,283,104
7        1,313              3,760                      1,723,969     14,137,600     4,936,880
Total    16,452             35,913                     52,413,218    216,500,737    104,841,549
Example 1: Equation for the Sample Regression Line
Ŷi = 1636.415 + 1.487Xi
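The coefficients above can be recovered from the column totals of the computation table. This is a sketch of the usual shortcut formulas for simple linear regression; the variable names are illustrative.

```python
# Column totals from the computation table for the seven stores.
n = 7
sum_x = 16_452           # total square feet
sum_y = 35_913           # total annual sales ($1000)
sum_xy = 104_841_549     # total of X*Y
sum_x2 = 52_413_218      # total of X squared

# Corrected sums of cross-products and squares.
ss_xy = sum_xy - sum_x * sum_y / n
ss_xx = sum_x2 - sum_x ** 2 / n

b1 = ss_xy / ss_xx                   # slope
b0 = sum_y / n - b1 * sum_x / n      # intercept: Y-bar minus b1 times X-bar

print(f"b1 = {b1:.3f}, b0 = {b0:.3f}")
```

Rounded to three decimals, the result agrees with the sample regression line quoted on the slide.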
Example 1: Interpretation of Results
The slope of 1.487 means that for each increase of one unit in X, we predict the average of Y to increase by an estimated 1.487 units. Since X is measured in square feet and Y in thousands of dollars, the model estimates that for each increase of one square foot in the size of the store, the expected annual sales increase by $1,487.
Predicting Annual Sales Based on Square Footage
Suppose that we would like to use the fitted model to predict the average annual sales for a store with 4,000 square feet.
Interpolation versus Extrapolation
When using the regression line for prediction, it is not appropriate to make predictions beyond the relevant range of the independent variable (in the previous example, 1,292 to 5,555 square feet). That is, we may interpolate within the relevant range of X values, but we SHOULD NOT extrapolate beyond it. For example, it is not appropriate to predict the average annual sales for a store with 7,000 square feet, since this is beyond the range of X values.
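One way to respect the interpolation rule in practice is to let the prediction function refuse values outside the observed range of X. The sketch below refits the line from the seven-store data, so it uses unrounded coefficients and its predictions can differ slightly from those obtained by plugging in the rounded 1636.415 and 1.487. The function name `predict_sales` is hypothetical.

```python
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]   # in $1000

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(annual_sales) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, annual_sales))
      / sum((x - x_bar) ** 2 for x in square_feet))
b0 = y_bar - b1 * x_bar


def predict_sales(size):
    """Predicted average annual sales ($1000) for a store of the given size.

    Raises ValueError when the size lies outside the observed range of X,
    because the fitted line should not be used for extrapolation.
    """
    lowest, highest = min(square_feet), max(square_feet)
    if not lowest <= size <= highest:
        raise ValueError(f"{size} sq ft is outside the relevant range {lowest} to {highest}")
    return b0 + b1 * size


print(f"4,000 sq ft: {predict_sales(4000):.2f} ($1000)")
try:
    predict_sales(7000)
except ValueError as err:
    print("Refused:", err)
```

Raising an error is one possible design; returning the prediction together with a warning flag would serve equally well.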
Causal Relationship?
In general, when a relationship between X and Y is identified using regression analysis, we say that ‘X is associated with Y’ rather than ‘X causes Y’. We cannot claim that two variables are related by cause and effect just because there is a statistical relationship between the two. In fact, you cannot infer a causal relationship from statistics alone.
For example, the prices of dog food and of houses may well be positively correlated over time. When you collect data on the price of dog food and the price of houses over time, you might infer that they have a positive relationship, but can you conclude that an increase in the price of dog food directly causes the price of houses to increase too? It might be that an inflationary force is influencing both, so that they move in the same general direction over time.
Computer Application
Import the data into two adjacent columns in an Excel file and then click Tools / Data Analysis / Regression (see pages 624-5 for a detailed description).
Example 1: Computer Output
Exercise 2
Consider the example about the level of air pollution at Kwun Tong and the total number of consultations that relate to respiratory diseases in a public clinic in the area. The corresponding data were given as follows:
Exercise 2
(a) Determine the sample regression line to predict the number of consultations from the level of pollution.
(b) Interpret the coefficients.
Solution:
Exercise 2
For b1: for each additional unit of pollution level, the number of consultations increases, on average, by 0.456701074.
For b0: no meaningful interpretation can be made, as the range of X does not include zero.
Assessing the simple linear regression model
After we have set up a linear regression model, we wish to assess the fitness of the model. That is, we wish to find out how well the model fits the given data. For a good fit, the data as a whole should be quite close to the regression line, so that the independent variable can be used to predict the value of the dependent variable with high accuracy. To examine how well the independent variable predicts the dependent variable, we need to develop several measures of variation.
Measure of Variation: The Sum of Squares
SS(Total) = SSR + SSE
Total sample variability = explained variability + unexplained variability
Measure of Variation: The Sum of Squares (continued)
SS(Total) = total sum of squares: measures the variation of the Yi values around their mean Ȳ.
SSR = regression sum of squares: explained variation attributable to the relationship between X and Y.
SSE = error sum of squares: variation attributable to factors other than the relationship between X and Y (unexplained variation).
Measure of Variation: The Sum of Squares (continued)
SS(Total) = Σ(Yi - Ȳ)²
SSR = Σ(Ŷi - Ȳ)²
SSE = Σ(Yi - Ŷi)²
Standard Error of Estimate
The standard deviation of the variation of observations around the regression line:
sε = √(SSE / (n - 2))
Standard Error of Estimate
The smallest value that sε can assume is 0, which occurs when SSE = 0, that is, when all the points fall on the regression line. Thus, when sε is small, the fit is excellent, and the linear regression model is likely to be an effective analytical and forecasting tool. When sε is large, the regression model is a poor one and is of little value.
The Coefficient of Determination (r² or R²)
By themselves, SSR, SSE and SS(Total) provide little that can be directly interpreted. A simple ratio of SSR to SS(Total) provides a measure of the usefulness of the regression equation:
r² = SSR / SS(Total)
It measures the proportion of variation in Y that is explained by the independent variable X in the regression model.
Coefficients of Determination (r²): four scatter plots with fitted lines Ŷi = b0 + b1Xi, illustrating r² = 1 (two panels), r² = 0.8 and r² = 0
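For the seven-store data of Example 1, the three sums of squares, the standard error of estimate and the coefficient of determination can be computed as follows. This is a sketch with illustrative variable names; none of these values appears in the surviving slide text, so they are computed from the data and not quoted from the deck.

```python
from math import sqrt

square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]   # in $1000

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(annual_sales) / n

# Least squares fit.
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, annual_sales))
ss_xx = sum((x - x_bar) ** 2 for x in square_feet)
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * x for x in square_feet]

# The three measures of variation.
ss_total = sum((y - y_bar) ** 2 for y in annual_sales)
ssr = sum((y_hat - y_bar) ** 2 for y_hat in fitted)
sse = sum((y - y_hat) ** 2 for y, y_hat in zip(annual_sales, fitted))

std_error = sqrt(sse / (n - 2))     # standard error of estimate
r_squared = ssr / ss_total          # coefficient of determination

print(f"SS(Total) = {ss_total:.1f}, SSR = {ssr:.1f}, SSE = {sse:.1f}")
print(f"standard error of estimate = {std_error:.2f}, r^2 = {r_squared:.4f}")
```

The check `ss_total == ssr + sse` holds up to floating-point rounding, which mirrors the partition SS(Total) = SSR + SSE.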
Coefficient of Correlation
The coefficient of correlation measures the strength of association (linear relationship) between two numerical variables.
It is only concerned with the strength of the relationship.
No causal effect is implied.
Coefficient of Correlation (continued)
The population correlation coefficient is denoted by ρ (rho). The sample correlation coefficient is denoted by r. It is an estimate of ρ and is used to measure the strength of the linear relationship in the sample observations.
Sample of Observations from Various r Values: five scatter plots illustrating r = -1, r = -0.6, r = 0, r = 0.6 and r = 1
Features of ρ and r
Unit free
Range between -1 and 1
The closer to -1, the stronger the negative linear relationship
The closer to 1, the stronger the positive linear relationship
The closer to 0, the weaker the linear relationship
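As an illustration of these features, the sample correlation coefficient for the seven-store data can be computed with the standard product-moment formula. The slide's own formula did not survive extraction, so the formula used here is an assumption about what was shown; the variable names are illustrative.

```python
from math import sqrt

square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]   # in $1000

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(annual_sales) / n

ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, annual_sales))
ss_xx = sum((x - x_bar) ** 2 for x in square_feet)
ss_yy = sum((y - y_bar) ** 2 for y in annual_sales)

# Product-moment correlation: unit free and always between -1 and 1.
r = ss_xy / sqrt(ss_xx * ss_yy)

print(f"r = {r:.4f}, r squared = {r ** 2:.4f}")
```

In simple linear regression the square of r equals the coefficient of determination, and r carries the same sign as the slope b1.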
Inference about the Slope
There is also a more systematic way to assess model fitness: performing a hypothesis test on the slope of the regression line. If the two variables involved are not at all linearly related, the scatter diagram shows that the slope of the regression line will be zero.
Inference about the Slope
Hence, we can determine whether a significant relationship between the variables X and Y exists by testing whether β1 (the true slope) is equal to zero.
H0: β1 = 0 (There is no linear relationship)
H1: β1 ≠ 0 (There is a linear relationship)
If H0 is rejected, there is evidence to believe that a linear relationship exists between X and Y.
The standard error of the slope
The estimated standard error of b1:
$s_{b_1} = \dfrac{s_\varepsilon}{\sqrt{\sum (X_i - \bar{X})^2}}$
Inference about the Slope: t Test
t test for a population slope: is there a linear dependency of Y on X?
Null and alternative hypotheses:
H0: β1 = 0 (no linear dependency)
H1: β1 ≠ 0 (linear dependency)
Test statistic: $t = \dfrac{b_1 - \beta_1}{s_{b_1}}$, with df = n - 2
Example: Store Sales
Data for seven stores:

Store    Square Feet    Annual Sales ($000)
1        1,726          3,681
2        1,542          3,395
3        2,816          6,653
4        5,555          9,543
5        1,292          3,318
6        2,208          5,563
7        1,313          3,760

Estimated regression equation: Ŷi = 1636.415 + 1.487Xi
The slope of this model is 1.487. Does the square footage of a store affect its annual sales?
H0: β1 = 0
H1: β1 ≠ 0
α = 0.05; df = 7 - 2 = 5
Test statistic: t = b1 / s_b1 = 1.487 / 0.165 = 9.01
Inferences about the Slope: t Test Example
Critical values: ±2.5706 (rejection regions of area 0.025 in each tail)
Decision: Reject H0
Conclusion: At the 5% level of significance, there is evidence that square footage is associated with annual sales.
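The t test for the slope can be reproduced from the seven-store data. The sketch below assumes the usual formula for the standard error of the slope (standard error of estimate divided by the square root of the sum of squared deviations of X) and takes the critical value 2.5706 from the slide; the variable names are illustrative.

```python
from math import sqrt

square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]   # in $1000

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(annual_sales) / n

ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, annual_sales))
ss_xx = sum((x - x_bar) ** 2 for x in square_feet)
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar

# Standard error of estimate, then standard error of the slope.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(square_feet, annual_sales))
std_error = sqrt(sse / (n - 2))
se_slope = std_error / sqrt(ss_xx)

# Test statistic under H0: beta1 = 0.
t_stat = b1 / se_slope

T_CRITICAL = 2.5706   # two-tailed critical value from the slide (alpha = 0.05, df = 5)

print(f"s_b1 = {se_slope:.4f}, t = {t_stat:.2f}")
print("Reject H0" if abs(t_stat) > T_CRITICAL else "Do not reject H0")
```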
Inferences about the Slope
Two-tailed test: H0: β1 = 0 (no linear relationship); H1: β1 ≠ 0 (a linear relationship)
Right-tailed test: H0: β1 ≤ 0 (no positive linear relationship); H1: β1 > 0 (a positive linear relationship)
Left-tailed test: H0: β1 ≥ 0 (no negative linear relationship); H1: β1 < 0 (a negative linear relationship)
Exercise 3
Consider the data of Exercise 2 about the level of air pollution at Kwun Tong and the total number of consultations that relate to respiratory diseases in a public clinic in the area. Test at the 5% level of significance to determine whether the level of air pollution and the total number of consultations are positively linearly related.
Solution:
α = 0.05; df = 14 - 2 = 12
Computer Output (the p-value shown in the output is for a two-tailed test)
Exercise 3
Critical value: 1.7823 (rejection region of area 0.05 in the right tail)
Decision: Reject H0
Conclusion: At the 5% level of significance, there is evidence to believe that the level of air pollution and the total number of consultations are positively linearly related.
Estimation of Mean Values
You have seen how we can assess the model fitness. If the model fits satisfactorily, we can use it to forecast and estimate values of the dependent variable. We can obtain a point prediction of Y for a given value of X using the linear regression line. An interval estimate for the particular value of Y, or for the average of Y, at a given value of X can also be computed if desired.
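Estimation of Mean Values
Confidence interval estimate for the mean of Y given a particular value $X_g$:
$\hat{Y} \pm t_{n-2}\, s_\varepsilon \sqrt{\dfrac{1}{n} + \dfrac{(X_g - \bar{X})^2}{\sum (X_i - \bar{X})^2}}$
The size of the interval varies according to the distance of $X_g$ from the mean $\bar{X}$.
$s_\varepsilon$ is the standard error of the estimate.
$t_{n-2}$ is the t value from the table with df = n - 2.
Prediction of Individual Values
Prediction interval for an individual response Y at a particular value $X_g$:
$\hat{Y} \pm t_{n-2}\, s_\varepsilon \sqrt{1 + \dfrac{1}{n} + \dfrac{(X_g - \bar{X})^2}{\sum (X_i - \bar{X})^2}}$
The addition of one under the square root increases the width of the interval from that for the mean of Y.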
Interval Estimates for Different Values of X: figure showing the fitted line Ŷi = b0 + b1Xi with the confidence interval for the mean of Y given X and the wider prediction interval for an individual Yi
Example: Stores Sales
Data for seven stores:

Store    Square Feet    Annual Sales ($000)
1        1,726          3,681
2        1,542          3,395
3        2,816          6,653
4        5,555          9,543
5        1,292          3,318
6        2,208          5,563
7        1,313          3,760

Predict the annual sales for a store with 2,000 square feet.
Regression model obtained: Ŷi = 1636.415 + 1.487Xi
Estimation of Mean Values: Example
Confidence interval estimate for the mean of Y
Find the 95% confidence interval for the average annual sales of a 2,000 square-foot store.
Predicted sales: Ŷi = 1636.415 + 1.487Xi = 4609.68 ($000) for Xi = 2,000 (computed with unrounded coefficients)
t(n-2) = t5 = 2.571; X̄ = 2350.29
Prediction Interval for Y: Example
Prediction interval for an individual Y
Find the 95% prediction interval for the annual sales of a 2,000 square-foot store.
Predicted sales: Ŷi = 1636.415 + 1.487Xi = 4609.68 ($000) for Xi = 2,000 (computed with unrounded coefficients)
t(n-2) = t5 = 2.571; X̄ = 2350.29
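Both intervals for the 2,000 square-foot store can be worked out from the data. The sketch below assumes the standard interval formulas for simple linear regression and uses the tabulated t value 2.5706 (quoted as 2.571 on the slide); the function name `interval_half_widths` is hypothetical.

```python
from math import sqrt

square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]   # in $000

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(annual_sales) / n

ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, annual_sales))
ss_xx = sum((x - x_bar) ** 2 for x in square_feet)
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar

sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(square_feet, annual_sales))
std_error = sqrt(sse / (n - 2))

T_VALUE = 2.5706   # t value for 95% confidence with df = n - 2 = 5


def interval_half_widths(x_given):
    """Return (half-width of CI for the mean of Y, half-width of PI for an individual Y)."""
    distance_term = 1 / n + (x_given - x_bar) ** 2 / ss_xx
    ci = T_VALUE * std_error * sqrt(distance_term)
    pi = T_VALUE * std_error * sqrt(1 + distance_term)   # the extra 1 widens the interval
    return ci, pi


x_given = 2000
y_hat = b0 + b1 * x_given
ci_half, pi_half = interval_half_widths(x_given)

print(f"predicted sales = {y_hat:.2f} ($000)")
print(f"95% CI for mean sales:       {y_hat - ci_half:.2f} to {y_hat + ci_half:.2f}")
print(f"95% PI for individual sales: {y_hat - pi_half:.2f} to {y_hat + pi_half:.2f}")
```

Calling `interval_half_widths` with values further from the mean store size shows both intervals widening, as described above.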
Computer Application
Commands: Tools / Data Analysis Plus / Prediction Interval
Computer Output
Linear Regression Assumptions
1. Normality: Y values are normally distributed for each X; the probability distribution of the error is normal
2. Homoscedasticity (constant variance)
3. Independence of errors
Variation of Errors around the Regression Line
For each X value, the “spread” or variance around the regression line is the same.
Figure: error distributions f(e) of equal spread drawn at X1 and X2 around the sample regression line
Multiple Regression
Introduction
Multiple regression extends the simple linear regression model to allow for any fixed number of independent variables. That is, the number of independent variables can be more than one.
Multiple Linear Regression
We make use of the computer printout to:
Assess the model: How well does it fit the data? Is it useful? Are any required conditions violated?
Employ the model: interpreting the coefficients, making predictions using the prediction equation, and estimating the expected value of the dependent variable.
Model and Required Conditions
We allow for k independent variables to potentially be related to the dependent variable:
y = β0 + β1x1 + β2x2 + … + βkxk + ε
y: dependent variable
x1, …, xk: independent variables
β0, β1, …, βk: regression coefficients
ε: random error variable
Multiple Regression for k = 2, Graphical Demonstration
The simple linear regression model allows for one independent variable x: y = β0 + β1x + ε
The multiple linear regression model allows for more than one independent variable: y = β0 + β1x1 + β2x2 + ε
Figure: the line y = β0 + β1x compared with the plane y = β0 + β1x1 + β2x2 drawn over the x1 and x2 axes
Required conditions for the error variable
The error ε is normally distributed.
The mean is equal to zero and the standard deviation is constant (σε) for all values of y.
The errors are independent.
Estimating the Coefficients and Assessing the Model
The procedure used to perform multiple regression analysis:
Assess the model fitness using statistics obtained from the sample.
If the model assessment indicates a good fit to the data, use it to interpret the coefficients and generate predictions.
Estimating the Coefficients and Assessing the Model, Example
Diagram: profitability, measured by Margin (%), is related to factors grouped as Competition, Market awareness, Customers, Community and Physical. The independent variables are:
Number: number of hotel/motel rooms within 3 miles from the site
Nearest: number of miles to closest competition
Office space: office space in nearby community
Enrollment: enrollment in nearby university or college (in thousands)
Income: median household income of nearby area (in $thousands)
Distance: distance to the downtown core (in miles)
Estimating the Coefficients and Assessing the Model, Example
Data were collected from 100 randomly selected inns that belong to La Quinta, and the following suggested model was fitted (data file Xm18-01):
Margin = β0 + β1Rooms + β2Nearest + β3Office + β4College + β5Income + β6Disttwn
Regression Analysis, Excel Output
Margin = 38.14 - 0.0076 Number + 1.65 Nearest + 0.020 Office Space + 0.21 Enrollment + 0.41 Income - 0.23 Distance
This is the sample regression equation (sometimes called the prediction equation).
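Using the prediction equation amounts to plugging a site's characteristics into the fitted coefficients. In the sketch below the coefficients are those of the sample regression equation; the site values are hypothetical numbers invented for illustration and are assumed to be in the same units as the data file.

```python
# Coefficients of the sample regression equation from the Excel output.
INTERCEPT = 38.14
COEFFICIENTS = {
    "Number": -0.0076,
    "Nearest": 1.65,
    "Office Space": 0.020,
    "Enrollment": 0.21,
    "Income": 0.41,
    "Distance": -0.23,
}


def predict_margin(site):
    """Predicted operating margin (%) for a site described by the six variables."""
    return INTERCEPT + sum(COEFFICIENTS[name] * site[name] for name in COEFFICIENTS)


# Hypothetical site, for illustration only (not taken from the data set).
hypothetical_site = {
    "Number": 3000,
    "Nearest": 1.0,
    "Office Space": 500,
    "Enrollment": 10,
    "Income": 35,
    "Distance": 5,
}

margin = predict_margin(hypothetical_site)
print(f"predicted margin = {margin:.2f}%")
```

Each coefficient is the estimated change in margin for a one-unit increase in that variable with the others held fixed, so changing a single entry of `hypothetical_site` shifts the prediction by the coefficient times the change.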
Model Assessment
The model is assessed using two tools: the coefficient of determination and the F-test of the analysis of variance. The standard error of estimate is used in building both tools.
Standard Error of Estimate
The standard deviation of the error is estimated by the standard error of estimate:
sε = √(SSE / (n - k - 1))
The magnitude of sε is judged by comparing it to the mean value of y.
Standard Error of Estimate
From the printout, sε = 5.51. Compared with the mean value of y, sε does not seem particularly small.
Question: Can we conclude that the model does not fit the data well?
Coefficient of Determination
The definition is: r² = SSR / SS(Total) = 1 - SSE / SS(Total)
From the printout, r² = 0.5251: 52.51% of the variation in operating margin is explained by the six independent variables, and 47.49% remains unexplained.
Testing the Validity of the Model
To test the validity of the model, the following question is asked: is there at least one independent variable linearly related to the dependent variable?
To answer the question we test the hypotheses
H0: β1 = β2 = … = βk = 0
H1: At least one βi is not equal to zero.
If at least one βi is not equal to zero, the model has some validity or usefulness.
Testing the Validity of the La Quinta Inns Regression Model
The hypotheses are tested by an ANOVA procedure (the Excel output):

Source        df           SS           MS                         F
Regression    k            SSR          MSR = SSR / k              MSR / MSE
Residual      n - k - 1    SSE          MSE = SSE / (n - k - 1)
Total         n - 1        SS(Total)
Testing the Validity of the La Quinta Inns Regression Model
[Total variation in y] SS(Total) = SSR + SSE.
A large F results from a large SSR. That implies much of the variation in y is explained by the regression model; the model is useful, and thus the null hypothesis should be rejected. Therefore, the rejection region is F > Fα, k, n-k-1, while the test statistic is F = MSR / MSE.
Testing the Validity of the La Quinta Inns Regression Model
Fα, k, n-k-1 = F0.05, 6, 100-6-1 = 2.17
F = 17.14 > 2.17
Conclusion: There is sufficient evidence to reject the null hypothesis in favor of the alternative hypothesis. At least one of the βi is not equal to zero. Thus, at least one independent variable is linearly related to y. This linear regression model is valid.
Also, the p-value (Significance F) = 0.0000, so the null hypothesis is rejected.
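The F statistic can also be recovered from the coefficient of determination alone, because dividing both MSR and MSE by SS(Total) gives F = (r²/k) / ((1 - r²)/(n - k - 1)). The sketch below applies this identity to the values quoted from the printout; the function name is illustrative, and the critical value is the one quoted on the slide.

```python
def f_from_r_squared(r_squared, n, k):
    """F statistic for overall model validity, computed from r squared.

    Uses F = (SSR / k) / (SSE / (n - k - 1)) with SSR and SSE expressed
    as shares of SS(Total): SSR share = r squared, SSE share = 1 - r squared.
    """
    return (r_squared / k) / ((1 - r_squared) / (n - k - 1))


R_SQUARED = 0.5251    # from the printout
N_INNS = 100          # sample size
K_VARIABLES = 6       # number of independent variables
F_CRITICAL = 2.17     # critical value quoted on the slide

f_stat = f_from_r_squared(R_SQUARED, N_INNS, K_VARIABLES)

print(f"F = {f_stat:.2f}")
print("Reject H0" if f_stat > F_CRITICAL else "Do not reject H0")
```

This shows why a moderate r² of about 0.53 can still give a highly significant F when the sample is large relative to the number of variables.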

Metabical - Marketing Case Study
 
Chapter 1 Marketing Research Malhotra
Chapter 1 Marketing Research MalhotraChapter 1 Marketing Research Malhotra
Chapter 1 Marketing Research Malhotra
 
BB Chapter Seven : Post Purchase Processes, Customer Satisfaction and Loyalty
BB Chapter Seven : Post Purchase Processes, Customer Satisfaction and LoyaltyBB Chapter Seven : Post Purchase Processes, Customer Satisfaction and Loyalty
BB Chapter Seven : Post Purchase Processes, Customer Satisfaction and Loyalty
 
Case base presentation life in the fast lane
Case base presentation   life in the fast laneCase base presentation   life in the fast lane
Case base presentation life in the fast lane
 
Once upon a farm media plan
Once upon a farm media planOnce upon a farm media plan
Once upon a farm media plan
 
Malhotra12
Malhotra12Malhotra12
Malhotra12
 
Marketing Myopia
Marketing MyopiaMarketing Myopia
Marketing Myopia
 
Statr sessions 9 to 10
Statr sessions 9 to 10Statr sessions 9 to 10
Statr sessions 9 to 10
 

Similaire à Bus b272 f unit 1

Anova by Hazilah Mohd Amin
Anova by Hazilah Mohd AminAnova by Hazilah Mohd Amin
Anova by Hazilah Mohd AminHazilahMohd
 
Mba2216 week 11 data analysis part 02
Mba2216 week 11 data analysis part 02Mba2216 week 11 data analysis part 02
Mba2216 week 11 data analysis part 02Stephen Ong
 
Mb0040 statistics for management
Mb0040   statistics for managementMb0040   statistics for management
Mb0040 statistics for managementsmumbahelp
 
Chi-square, Yates, Fisher & McNemar
Chi-square, Yates, Fisher & McNemarChi-square, Yates, Fisher & McNemar
Chi-square, Yates, Fisher & McNemarAzmi Mohd Tamil
 
In the t test for independent groups, ____.we estimate µ1 µ2.docx
In the t test for independent groups, ____.we estimate µ1 µ2.docxIn the t test for independent groups, ____.we estimate µ1 µ2.docx
In the t test for independent groups, ____.we estimate µ1 µ2.docxbradburgess22840
 
Some study materials
Some study materialsSome study materials
Some study materialsSatishH5
 
SURE Model_Panel data.pptx
SURE Model_Panel data.pptxSURE Model_Panel data.pptx
SURE Model_Panel data.pptxGeetaShreeprabha
 
Logistic regression with SPSS examples
Logistic regression with SPSS examplesLogistic regression with SPSS examples
Logistic regression with SPSS examplesGaurav Kamboj
 
Ch7 Analysis of Variance (ANOVA)
Ch7 Analysis of Variance (ANOVA)Ch7 Analysis of Variance (ANOVA)
Ch7 Analysis of Variance (ANOVA)Farhan Alfin
 
In a left-tailed test comparing two means with variances unknown b.docx
In a left-tailed test comparing two means with variances unknown b.docxIn a left-tailed test comparing two means with variances unknown b.docx
In a left-tailed test comparing two means with variances unknown b.docxbradburgess22840
 
Analysis of Variance
Analysis of Variance Analysis of Variance
Analysis of Variance jyothimonc
 
ForecastingBUS255 GoalsBy the end of this chapter, y.docx
ForecastingBUS255 GoalsBy the end of this chapter, y.docxForecastingBUS255 GoalsBy the end of this chapter, y.docx
ForecastingBUS255 GoalsBy the end of this chapter, y.docxbudbarber38650
 
linear regression PDF.pdf
linear regression PDF.pdflinear regression PDF.pdf
linear regression PDF.pdfJoshuaLau29
 
Analysis of variance ppt @ bec doms
Analysis of variance ppt @ bec domsAnalysis of variance ppt @ bec doms
Analysis of variance ppt @ bec domsBabasab Patil
 
Calculating Analysis of Variance (ANOVA) and Post Hoc Analyses Follo.docx
Calculating Analysis of Variance (ANOVA) and Post Hoc Analyses Follo.docxCalculating Analysis of Variance (ANOVA) and Post Hoc Analyses Follo.docx
Calculating Analysis of Variance (ANOVA) and Post Hoc Analyses Follo.docxaman341480
 
correlation.pptx
correlation.pptxcorrelation.pptx
correlation.pptxSmHasiv
 

Similaire à Bus b272 f unit 1 (20)

Anova by Hazilah Mohd Amin
Anova by Hazilah Mohd AminAnova by Hazilah Mohd Amin
Anova by Hazilah Mohd Amin
 
Mba2216 week 11 data analysis part 02
Mba2216 week 11 data analysis part 02Mba2216 week 11 data analysis part 02
Mba2216 week 11 data analysis part 02
 
One way anova
One way anovaOne way anova
One way anova
 
Mb0040 statistics for management
Mb0040   statistics for managementMb0040   statistics for management
Mb0040 statistics for management
 
Chi-square, Yates, Fisher & McNemar
Chi-square, Yates, Fisher & McNemarChi-square, Yates, Fisher & McNemar
Chi-square, Yates, Fisher & McNemar
 
Data analysis
Data analysisData analysis
Data analysis
 
In the t test for independent groups, ____.we estimate µ1 µ2.docx
In the t test for independent groups, ____.we estimate µ1 µ2.docxIn the t test for independent groups, ____.we estimate µ1 µ2.docx
In the t test for independent groups, ____.we estimate µ1 µ2.docx
 
Some study materials
Some study materialsSome study materials
Some study materials
 
SURE Model_Panel data.pptx
SURE Model_Panel data.pptxSURE Model_Panel data.pptx
SURE Model_Panel data.pptx
 
Logistic regression with SPSS examples
Logistic regression with SPSS examplesLogistic regression with SPSS examples
Logistic regression with SPSS examples
 
Ch7 Analysis of Variance (ANOVA)
Ch7 Analysis of Variance (ANOVA)Ch7 Analysis of Variance (ANOVA)
Ch7 Analysis of Variance (ANOVA)
 
In a left-tailed test comparing two means with variances unknown b.docx
In a left-tailed test comparing two means with variances unknown b.docxIn a left-tailed test comparing two means with variances unknown b.docx
In a left-tailed test comparing two means with variances unknown b.docx
 
Analysis of Variance
Analysis of Variance Analysis of Variance
Analysis of Variance
 
ForecastingBUS255 GoalsBy the end of this chapter, y.docx
ForecastingBUS255 GoalsBy the end of this chapter, y.docxForecastingBUS255 GoalsBy the end of this chapter, y.docx
ForecastingBUS255 GoalsBy the end of this chapter, y.docx
 
Factorial Experiments
Factorial ExperimentsFactorial Experiments
Factorial Experiments
 
linear regression PDF.pdf
linear regression PDF.pdflinear regression PDF.pdf
linear regression PDF.pdf
 
Analysis of variance ppt @ bec doms
Analysis of variance ppt @ bec domsAnalysis of variance ppt @ bec doms
Analysis of variance ppt @ bec doms
 
Calculating Analysis of Variance (ANOVA) and Post Hoc Analyses Follo.docx
Calculating Analysis of Variance (ANOVA) and Post Hoc Analyses Follo.docxCalculating Analysis of Variance (ANOVA) and Post Hoc Analyses Follo.docx
Calculating Analysis of Variance (ANOVA) and Post Hoc Analyses Follo.docx
 
correlation.pptx
correlation.pptxcorrelation.pptx
correlation.pptx
 
Twoway.ppt
Twoway.pptTwoway.ppt
Twoway.ppt
 

Dernier

An Overview of the Odoo 17 Discuss App.pptx
An Overview of the Odoo 17 Discuss App.pptxAn Overview of the Odoo 17 Discuss App.pptx
An Overview of the Odoo 17 Discuss App.pptxCeline George
 
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文中 央社
 
size separation d pharm 1st year pharmaceutics
size separation d pharm 1st year pharmaceuticssize separation d pharm 1st year pharmaceutics
size separation d pharm 1st year pharmaceuticspragatimahajan3
 
philosophy and it's principles based on the life
philosophy and it's principles based on the lifephilosophy and it's principles based on the life
philosophy and it's principles based on the lifeNitinDeodare
 
INU_CAPSTONEDESIGN_비밀번호486_업로드용 발표자료.pdf
INU_CAPSTONEDESIGN_비밀번호486_업로드용 발표자료.pdfINU_CAPSTONEDESIGN_비밀번호486_업로드용 발표자료.pdf
INU_CAPSTONEDESIGN_비밀번호486_업로드용 발표자료.pdfbu07226
 
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjj
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjjStl Algorithms in C++ jjjjjjjjjjjjjjjjjj
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjjMohammed Sikander
 
ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...
ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...
ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...Nguyen Thanh Tu Collection
 
Danh sách HSG Bộ môn cấp trường - Cấp THPT.pdf
Danh sách HSG Bộ môn cấp trường - Cấp THPT.pdfDanh sách HSG Bộ môn cấp trường - Cấp THPT.pdf
Danh sách HSG Bộ môn cấp trường - Cấp THPT.pdfQucHHunhnh
 
Envelope of Discrepancy in Orthodontics: Enhancing Precision in Treatment
 Envelope of Discrepancy in Orthodontics: Enhancing Precision in Treatment Envelope of Discrepancy in Orthodontics: Enhancing Precision in Treatment
Envelope of Discrepancy in Orthodontics: Enhancing Precision in Treatmentsaipooja36
 
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽中 央社
 
Navigating the Misinformation Minefield: The Role of Higher Education in the ...
Navigating the Misinformation Minefield: The Role of Higher Education in the ...Navigating the Misinformation Minefield: The Role of Higher Education in the ...
Navigating the Misinformation Minefield: The Role of Higher Education in the ...Mark Carrigan
 
slides CapTechTalks Webinar May 2024 Alexander Perry.pptx
slides CapTechTalks Webinar May 2024 Alexander Perry.pptxslides CapTechTalks Webinar May 2024 Alexander Perry.pptx
slides CapTechTalks Webinar May 2024 Alexander Perry.pptxCapitolTechU
 
Features of Video Calls in the Discuss Module in Odoo 17
Features of Video Calls in the Discuss Module in Odoo 17Features of Video Calls in the Discuss Module in Odoo 17
Features of Video Calls in the Discuss Module in Odoo 17Celine George
 
How to Manage Notification Preferences in the Odoo 17
How to Manage Notification Preferences in the Odoo 17How to Manage Notification Preferences in the Odoo 17
How to Manage Notification Preferences in the Odoo 17Celine George
 
2024_Student Session 2_ Set Plan Preparation.pptx
2024_Student Session 2_ Set Plan Preparation.pptx2024_Student Session 2_ Set Plan Preparation.pptx
2024_Student Session 2_ Set Plan Preparation.pptxmansk2
 
Discover the Dark Web .pdf InfosecTrain
Discover the Dark Web .pdf  InfosecTrainDiscover the Dark Web .pdf  InfosecTrain
Discover the Dark Web .pdf InfosecTraininfosec train
 
Application of Matrices in real life. Presentation on application of matrices
Application of Matrices in real life. Presentation on application of matricesApplication of Matrices in real life. Presentation on application of matrices
Application of Matrices in real life. Presentation on application of matricesRased Khan
 
Post Exam Fun(da) Intra UEM General Quiz 2024 - Prelims q&a.pdf
Post Exam Fun(da) Intra UEM General Quiz 2024 - Prelims q&a.pdfPost Exam Fun(da) Intra UEM General Quiz 2024 - Prelims q&a.pdf
Post Exam Fun(da) Intra UEM General Quiz 2024 - Prelims q&a.pdfPragya - UEM Kolkata Quiz Club
 
Dementia (Alzheimer & vasular dementia).
Dementia (Alzheimer & vasular dementia).Dementia (Alzheimer & vasular dementia).
Dementia (Alzheimer & vasular dementia).Mohamed Rizk Khodair
 

Dernier (20)

An Overview of the Odoo 17 Discuss App.pptx
An Overview of the Odoo 17 Discuss App.pptxAn Overview of the Odoo 17 Discuss App.pptx
An Overview of the Odoo 17 Discuss App.pptx
 
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文
 
size separation d pharm 1st year pharmaceutics
size separation d pharm 1st year pharmaceuticssize separation d pharm 1st year pharmaceutics
size separation d pharm 1st year pharmaceutics
 
philosophy and it's principles based on the life
philosophy and it's principles based on the lifephilosophy and it's principles based on the life
philosophy and it's principles based on the life
 
INU_CAPSTONEDESIGN_비밀번호486_업로드용 발표자료.pdf
INU_CAPSTONEDESIGN_비밀번호486_업로드용 발표자료.pdfINU_CAPSTONEDESIGN_비밀번호486_업로드용 발표자료.pdf
INU_CAPSTONEDESIGN_비밀번호486_업로드용 발표자료.pdf
 
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjj
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjjStl Algorithms in C++ jjjjjjjjjjjjjjjjjj
Stl Algorithms in C++ jjjjjjjjjjjjjjjjjj
 
ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...
ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...
ĐỀ THAM KHẢO KÌ THI TUYỂN SINH VÀO LỚP 10 MÔN TIẾNG ANH FORM 50 CÂU TRẮC NGHI...
 
Danh sách HSG Bộ môn cấp trường - Cấp THPT.pdf
Danh sách HSG Bộ môn cấp trường - Cấp THPT.pdfDanh sách HSG Bộ môn cấp trường - Cấp THPT.pdf
Danh sách HSG Bộ môn cấp trường - Cấp THPT.pdf
 
Envelope of Discrepancy in Orthodontics: Enhancing Precision in Treatment
 Envelope of Discrepancy in Orthodontics: Enhancing Precision in Treatment Envelope of Discrepancy in Orthodontics: Enhancing Precision in Treatment
Envelope of Discrepancy in Orthodontics: Enhancing Precision in Treatment
 
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽會考英聽
 
Navigating the Misinformation Minefield: The Role of Higher Education in the ...
Navigating the Misinformation Minefield: The Role of Higher Education in the ...Navigating the Misinformation Minefield: The Role of Higher Education in the ...
Navigating the Misinformation Minefield: The Role of Higher Education in the ...
 
slides CapTechTalks Webinar May 2024 Alexander Perry.pptx
slides CapTechTalks Webinar May 2024 Alexander Perry.pptxslides CapTechTalks Webinar May 2024 Alexander Perry.pptx
slides CapTechTalks Webinar May 2024 Alexander Perry.pptx
 
Post Exam Fun(da) Intra UEM General Quiz - Finals.pdf
Post Exam Fun(da) Intra UEM General Quiz - Finals.pdfPost Exam Fun(da) Intra UEM General Quiz - Finals.pdf
Post Exam Fun(da) Intra UEM General Quiz - Finals.pdf
 
Features of Video Calls in the Discuss Module in Odoo 17
Features of Video Calls in the Discuss Module in Odoo 17Features of Video Calls in the Discuss Module in Odoo 17
Features of Video Calls in the Discuss Module in Odoo 17
 
How to Manage Notification Preferences in the Odoo 17
How to Manage Notification Preferences in the Odoo 17How to Manage Notification Preferences in the Odoo 17
How to Manage Notification Preferences in the Odoo 17
 
2024_Student Session 2_ Set Plan Preparation.pptx
2024_Student Session 2_ Set Plan Preparation.pptx2024_Student Session 2_ Set Plan Preparation.pptx
2024_Student Session 2_ Set Plan Preparation.pptx
 
Discover the Dark Web .pdf InfosecTrain
Discover the Dark Web .pdf  InfosecTrainDiscover the Dark Web .pdf  InfosecTrain
Discover the Dark Web .pdf InfosecTrain
 
Application of Matrices in real life. Presentation on application of matrices
Application of Matrices in real life. Presentation on application of matricesApplication of Matrices in real life. Presentation on application of matrices
Application of Matrices in real life. Presentation on application of matrices
 
Post Exam Fun(da) Intra UEM General Quiz 2024 - Prelims q&a.pdf
Post Exam Fun(da) Intra UEM General Quiz 2024 - Prelims q&a.pdfPost Exam Fun(da) Intra UEM General Quiz 2024 - Prelims q&a.pdf
Post Exam Fun(da) Intra UEM General Quiz 2024 - Prelims q&a.pdf
 
Dementia (Alzheimer & vasular dementia).
Dementia (Alzheimer & vasular dementia).Dementia (Alzheimer & vasular dementia).
Dementia (Alzheimer & vasular dementia).
 

Bus b272 f unit 1

  • 3. BUS B272 Unit 1 Analysis of Variance: The Analysis of Variance (ANOVA) is a procedure that tests whether differences exist between two or more populations. The technique analyzes the variance of the data to determine whether we can infer that the populations differ.
  • 4. BUS B272 Unit 1 Topics: One-way (single-factor) analysis of variance; ANOVA assumptions; F test for differences among k means.
  • 5. BUS B272 Unit 1 General Experimental Setting: The investigator controls one or more independent variables, called treatments or factors. Each treatment contains two or more levels (or categories/classifications). Observe the effects on the dependent variable, i.e. the response to different levels of the independent variable. Experimental design: the plan used to test the hypothesis.
  • 6. BUS B272 Unit 1 Completely Randomized Design: Experimental units (subjects) are assigned randomly to treatments. Subjects are assumed homogeneous. There is only one factor or independent variable, with two or more treatment levels. Analyzed by one-way analysis of variance (one-way ANOVA).
  • 7. BUS B272 Unit 1 Randomized Design Example   
  • 8. BUS B272 Unit 1 One-way Analysis of Variance F Test: Evaluate the difference among the mean responses of 2 or more (k) populations, e.g. several types of tires, oven temperature settings, or different types of marketing strategies.
  • 12. F test is robust to moderate departure from normality
  • 13. Populations have equal variances. (Assumptions of ANOVA)
  • 14. BUS B272 Unit 1 Hypotheses of One-Way ANOVA: $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$. All population means are equal; no treatment effect (no variation in means among groups). $H_1$: Not all $\mu_j$ are equal. At least one population mean is different (others may be the same!); there is a treatment effect. This does not mean that all population means are different.
  • 15. BUS B272 Unit 1 One-way ANOVA (No Treatment Effect) The Null Hypothesis is True
  • 16. BUS B272 Unit 1 One-way ANOVA (Treatment Effect Present) The Null Hypothesis is NOT True
  • 17. BUS B272 Unit 1 One-way ANOVA (Partition of Total Variation): Total Variation SS(Total) = Variation Due to Treatment SST + Variation Due to Random Sampling SSE
  • 18. BUS B272 Unit 1 ANOVA set-up
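  • 19. BUS B272 Unit 1 Total Variation: $SS(Total) = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (X_{ij} - \bar{\bar{X}})^2$, where $X_{ij}$ is the i-th observation in group j, $n_j$ is the number of observations in group j, n is the total number of observations in all groups, k is the number of groups, and $\bar{\bar{X}}$ is the overall or grand mean.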
  • 20. BUS B272 Unit 1 Total Variation (continued)
  • 21. BUS B272 Unit 1 Among-Treatments Variation: Variation Due to Differences Among Groups
  • 22. BUS B272 Unit 1 Among-Treatments Variation (continued)
  • 23. BUS B272 Unit 1 Within-Treatment Variation: Summing the variation within each treatment and then adding over all treatments.
  • 24. BUS B272 Unit 1 Within-Treatment Variation (continued)
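The formulas on the variation slides did not survive in the transcript. As a sketch, the standard one-way ANOVA definitions consistent with the symbols defined above are given below. The group-mean symbol $\bar{X}_j$ is an assumed notation, not taken from the slides.

```latex
\begin{aligned}
SS(Total) &= \sum_{j=1}^{k} \sum_{i=1}^{n_j} \left( X_{ij} - \bar{\bar{X}} \right)^2 \\
SST &= \sum_{j=1}^{k} n_j \left( \bar{X}_j - \bar{\bar{X}} \right)^2,
  & MST &= \frac{SST}{k-1} \\
SSE &= \sum_{j=1}^{k} \sum_{i=1}^{n_j} \left( X_{ij} - \bar{X}_j \right)^2,
  & MSE &= \frac{SSE}{n-k} \\
SS(Total) &= SST + SSE
\end{aligned}
```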
  • 26. For 2 groups, use the t-test; the F test is more limited. For k = 2, MSE is the pooled variance used in the t-test.
  • 27. BUS B272 Unit 1 One-way ANOVA F Test Statistic: Test statistic: $F = MST / MSE$, where MST is the mean square among (between) treatments and MSE is the mean square within treatments (error). Degrees of freedom: $df_1 = k - 1$ and $df_2 = n - k$.
  • 28. BUS B272 Unit 1 One-way ANOVA Summary Table
  • 29. BUS B272 Unit 1 Features of One-way ANOVA F Statistic: The F statistic is the ratio of the among-treatments estimate of variance to the within-treatments estimate of variance. The ratio can never be negative. $df_1 = k - 1$ will typically be small; $df_2 = n - k$ will typically be large. The ratio should be close to 1 if the null hypothesis is true.
  • 30. BUS B272 Unit 1 One-way ANOVA F Test Example: As production manager, you want to see if three filling machines have different mean filling times. You assign 15 similarly trained and experienced workers, five per machine, to the machines. At the 0.05 significance level, is there a difference in mean filling times?
      Machine 1   Machine 2   Machine 3
      25.40       23.40       20.00
      26.31       21.80       22.20
      24.10       23.50       19.75
      23.74       22.75       20.60
      25.10       21.60       20.40
  • 31. BUS B272 Unit 1 One-way ANOVA Example: Scatter Diagram [Figure: filling time in seconds (vertical axis, 19 to 27) plotted for Machine 1, Machine 2 and Machine 3]
  • 32. BUS B272 Unit 1 One-way ANOVA Example Computations
      Machine 1   Machine 2   Machine 3
      25.40       23.40       20.00
      26.31       21.80       22.20
      24.10       23.50       19.75
      23.74       22.75       20.60
      25.10       21.60       20.40
  • 34. BUS B272 Unit 1 Summary Table
      Source of variation         df            SS        MS        F
      Among treatments            3 - 1 = 2     47.1640   23.5820   MST/MSE = 25.602
      Within treatments (error)   15 - 3 = 12   11.0532   0.9211
      Total                       15 - 1 = 14   58.2172
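The summary table can be reproduced from the raw filling times. The following is a minimal pure-Python sketch; the function name `one_way_anova` and its return layout are illustrative choices, not part of the slides.

```python
# One-way ANOVA by hand for the filling-machine example (no third-party libraries).
groups = {
    "Machine 1": [25.40, 26.31, 24.10, 23.74, 25.10],
    "Machine 2": [23.40, 21.80, 23.50, 22.75, 21.60],
    "Machine 3": [20.00, 22.20, 19.75, 20.60, 20.40],
}


def one_way_anova(samples):
    """Return (SST, SSE, df1, df2, MST, MSE, F) for a list of samples."""
    k = len(samples)                       # number of groups
    n = sum(len(s) for s in samples)       # total number of observations
    grand_mean = sum(sum(s) for s in samples) / n
    means = [sum(s) / len(s) for s in samples]
    # Among-treatments variation: group size times squared distance of group mean from grand mean.
    sst = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    # Within-treatment variation: squared distance of each observation from its own group mean.
    sse = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
    df1, df2 = k - 1, n - k
    mst, mse = sst / df1, sse / df2
    return sst, sse, df1, df2, mst, mse, mst / mse


sst, sse, df1, df2, mst, mse, f_stat = one_way_anova(list(groups.values()))
print(f"SST={sst:.4f} SSE={sse:.4f} df=({df1}, {df2}) MST={mst:.4f} MSE={mse:.4f} F={f_stat:.3f}")
```

The printed values should agree with the summary table up to rounding.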
  • 35. BUS B272 Unit 1 One-way ANOVA Example Solution: $H_0: \mu_1 = \mu_2 = \mu_3$; $H_1$: Not all the means are equal. $\alpha = 0.05$, $df_1 = 2$, $df_2 = 12$. Critical value: 3.89. Test statistic: $F = MST/MSE = 25.602$. Decision: Reject $H_0$ at $\alpha = 0.05$. Conclusion: There is evidence to believe that at least one $\mu_i$ differs from the rest.
  • 36. BUS B272 Unit 1 Computer Application: To obtain the Microsoft Excel computer output on the previous page, first enter the data into k columns (one column per group) in an Excel file, then follow the commands: Tools / Data Analysis / Anova: Single Factor
  • 37. BUS B272 Unit 1 Computer Output using Data Analysis of Excel
  • 38. BUS B272 Unit 1 Exercise 1: The manager of a large department store wants to test whether the average size of customer transactions differs among four types of payment: Visa card, company card, cash or cheque. If there are differences in the average customer transaction size among the four types of payment, the manager will further investigate which types of payment give rise to higher transaction volumes and will then design an appropriate promotional programme. A random sample of 54 customer transactions using various types of payment was drawn during the past two months. The sample statistics obtained from the sampled data are as follows: [Table: sample statistics by type of payment] Test whether differences in average customer transaction size exist among the four types of payment at a 0.05 level of significance.
  • 39. BUS B272 Unit 1 Exercise 1: One factor is involved, i.e. the type of payment. Under this factor, there are k = 4 treatments (or factor levels), which represent the four types of payment: Visa card, company card, cash and cheque. The experimental units are customer transactions.
  • 40. BUS B272 Unit 1 Exercise 1: Since the test statistic of 39.16 is greater than the critical value of 2.80, reject $H_0$. At the 0.05 level of significance, there is evidence that the average customer transaction sizes differ significantly among the four types of payment.
  • 41. BUS B272 Unit 1 Can ANOVA be replaced by the t-test? The t-test examines any difference between two population means $\mu_1$ and $\mu_2$. Multiple t-tests are required for more than two population means, and conducting multiple tests increases the probability of making Type I errors. E.g. to compare 6 population means: if we use ANOVA with a significance level of 5%, there is a 5% chance that we reject the null hypothesis when it is true. If we use t-tests, we need to perform 15 tests, and if the same 5% significance level is set for each, the chance of at least one Type I error will be $1 - (1 - 0.05)^{15} \approx 0.54$.
  • 43. BUS B272 Unit 1 Linear Regression: Origin of regression; Determining the simple linear regression equation; Assessing the fitness of the model; Correlation analysis; Estimation and prediction; Assumptions of regression and correlation
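The arithmetic in the multiple t-test argument can be checked with a short sketch. It assumes, as the slide's formula does, that the pairwise tests are independent; the function name `familywise_error` is illustrative.

```python
from math import comb


def familywise_error(num_means, alpha=0.05):
    """Return (number of pairwise tests, chance of at least one Type I error).

    Assumes the pairwise tests are independent, which is the assumption
    behind the formula 1 - (1 - alpha) ** tests.
    """
    tests = comb(num_means, 2)            # every pair of means needs its own t-test
    return tests, 1 - (1 - alpha) ** tests


tests, rate = familywise_error(6)
print(f"{tests} tests, familywise Type I error rate = {rate:.2f}")
```

With 6 means this gives 15 pairwise tests and an error rate of roughly 0.54, compared with 0.05 for a single ANOVA F test.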
  • 44. BUS B272 Unit 1 Origin of Regression: “Regression”, from a Latin root meaning “going back”, is a series of statistical methods used in studying the relationship between two variables; it was first employed by Francis Galton in 1877. Galton was interested in the relationship between a father’s height and his son’s height. Using this method, he found that sons’ heights regress toward the overall mean, and the method has since been called “regression”.
  • 45. BUS B272 Unit 1 Linear Regression Analysis: Linear regression analysis is used primarily to model and describe linear relationships among variables and to provide predictions. It predicts the value of a dependent (response) variable based on the value of at least one independent (explanatory) variable, and expresses statistically the effect of the independent variables on the dependent variable.
  • 46. BUS B272 Unit 1 Types of Regression Models Positive Linear Relationship Relationship NOT Linear Negative Linear Relationship No Relationship
  • 47. BUS B272 Unit 1 Simple Linear Regression Model: The relationship between two variables, say X and Y, is described by a linear function. The change in the variable Y (called the dependent or response variable) is associated with the change in the other variable X (called the independent or explanatory variable). Explore the dependency of Y on X.
  • 48. BUS B272 Unit 1 Why Regression? The larger the sum of squares of the residuals, the poorer the estimate. Data (X, Y): (1, 2), (2, 2.5), (3, 2.5), (4, 5).
  • 49. BUS B272 Unit 1 Linear Relationship: We wish to study whether there is any association between two quantitative variables, say X and Y. If ‘Y tends to increase as X increases’, there is a positive relationship. If ‘Y tends to decrease as X increases’, there is a negative relationship. If the corresponding magnitude of increase or decrease follows a specific proportion, the relationship identified is said to be a linear one.
  • 50. BUS B272 Unit 1 Scatter Diagram: A scatter diagram is a graph of all X-Y pairs in the sample data. By viewing a scatter diagram, one can determine whether a relationship exists between the two variables. It can also suggest the likely mathematical form of that relationship, allowing one to judge initially and intuitively whether or not a linear relationship exists between the two variables involved.
  • 51. BUS B272 Unit 1 Example The level of air pollution at Kwun Tong and the total number of consultations relating to respiratory diseases in a public clinic in the area were recorded during a specific time period on 14 randomly selected days.
  • 52. BUS B272 Unit 1 Population Linear Regression: The population regression line is a straight line that describes the dependence of the average value (conditional mean) of one variable on the other: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, where $Y_i$ is the dependent (response) variable, $\beta_0$ is the population Y intercept, $\beta_1$ is the population slope coefficient, $X_i$ is the independent (explanatory) variable, $\varepsilon_i$ is the random error, and $\mu_{Y|X} = \beta_0 + \beta_1 X_i$ is the population regression line (conditional mean).
  • 53. BUS B272 Unit 1 Population Linear Regression (continued) [Figure: Y plotted against X, showing an observed value of Y, $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, the conditional mean $\mu_{Y|X} = \beta_0 + \beta_1 X_i$, and the random error $\varepsilon_i$ (vertical discrepancy or residual for point i)]
  • 54. BUS B272 Unit 1 Least Squares Method: The line fitted by least squares is the one that makes the sum of squares of all the vertical discrepancies (residuals) as small as possible, i.e. it minimizes $\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$, which is the sum of squared residuals.
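To illustrate the least squares idea, the sketch below fits a line to the four points from the "Why Regression?" slide and compares its sum of squared residuals with two arbitrary alternative lines. The alternative lines and the function names are illustrative assumptions, not from the slides.

```python
# Four (X, Y) points from the "Why Regression?" slide.
points = [(1, 2.0), (2, 2.5), (3, 2.5), (4, 5.0)]


def sum_squared_residuals(b0, b1, data):
    """Sum of squared vertical distances between the data and the line b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in data)


def least_squares(data):
    """Return (intercept, slope) of the least squares line."""
    n = len(data)
    x_bar = sum(x for x, _ in data) / n
    y_bar = sum(y for _, y in data) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in data)
    sxx = sum((x - x_bar) ** 2 for x, _ in data)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


b0, b1 = least_squares(points)
best = sum_squared_residuals(b0, b1, points)
flat = sum_squared_residuals(3.0, 0.0, points)      # horizontal line at the mean of Y
steeper = sum_squared_residuals(0.5, 1.0, points)   # an arbitrary hand-drawn alternative
print(f"least squares line: Y = {b0:.2f} + {b1:.2f} X, sum of squares = {best:.2f}")
print(f"horizontal line: {flat:.2f}, alternative line: {steeper:.2f}")
```

Any other straight line through these points gives a sum of squared residuals at least as large as that of the fitted line.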
  • 55. BUS B272 Unit 1 Sample Linear Regression: The sample regression line is formed by the point estimates of $\beta_0$ and $\beta_1$, i.e., $b_0$ and $b_1$. It provides an estimate of the population regression line as well as a predicted value of Y: $Y_i = b_0 + b_1 X_i + e_i$, where $b_0$ is the sample Y intercept, $b_1$ is the sample slope coefficient, $e_i$ is the residual, and $\hat{Y}_i = b_0 + b_1 X_i$ is the sample regression line (fitted regression line or predicted value).
  • 56. BUS B272 Unit 1 Sample Linear Regression (continued): $b_0$ and $b_1$ are obtained by finding the specific values of $b_0$ and $b_1$ that minimize the sum of the squared residuals, $\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$.
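  • 57. BUS B272 Unit 1 Coefficients of Sample Linear Regression: For $\hat{Y}_i = b_0 + b_1 X_i$: $b_1 = \dfrac{\sum X_i Y_i - n \bar{X} \bar{Y}}{\sum X_i^2 - n \bar{X}^2}$ and $b_0 = \bar{Y} - b_1 \bar{X}$.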
  • 58. BUS B272 Unit 1 Interpretation of the Slope and the Intercept: $\beta_0$ is the average value of Y when the value of X is zero. $\beta_1$ measures the change in the average value of Y as a result of a one-unit change in X.
  • 59. BUS B272 Unit 1 Interpretation of the Slope and the Intercept (continued): $b_0$ is the estimated average value of Y when the value of X is zero. $b_1$ is the estimated change in the average value of Y as a result of a one-unit change in X.
  • 60. BUS B272 Unit 1 Example 1: Simple Linear Regression: Suppose that you want to examine the linear dependency of the annual sales of seven stores on their size in square footage. Sample data for seven stores were obtained. Find the equation of the straight line that fits the data best.
      Store   Square Feet   Annual Sales ($1000)
      1       1,726         3,681
      2       1,542         3,395
      3       2,816         6,653
      4       5,555         9,543
      5       1,292         3,318
      6       2,208         5,563
      7       1,313         3,760
  • 61. BUS B272 Unit 1 Example 1 : Scatter Diagram Excel Output
  • 62. BUS B272 Unit 1 Computation of Regression Coefficient
      Store   Square Feet (X)   Annual Sales in $1000 (Y)   X^2          Y^2           XY
      1       1,726             3,681                       2,979,076    13,549,761    6,353,406
      2       1,542             3,395                       2,377,764    11,526,025    5,235,090
      3       2,816             6,653                       7,929,856    44,262,409    18,734,848
      4       5,555             9,543                       30,858,025   91,068,849    53,011,365
      5       1,292             3,318                       1,669,264    11,009,124    4,286,856
      6       2,208             5,563                       4,875,264    30,946,969    12,283,104
      7       1,313             3,760                       1,723,969    14,137,600    4,936,880
      Total   16,452            35,913                      52,413,218   216,500,737   104,841,549
  • 63. BUS B272 Unit 1 Computation of Regression Coefficient
  • 64. BUS B272 Unit 1 Example 1: Equation for the Sample Regression Line: $\hat{Y}_i = 1636.415 + 1.487 X_i$
  • 65. BUS B272 Unit 1 Example 1: Interpretation of Results: The slope of 1.487 means that for each increase of one unit in X, we predict the average of Y to increase by an estimated 1.487 units. Because sales are measured in thousands of dollars, the model estimates that for each increase of one square foot in the size of the store, the expected annual sales increase by 1.487 thousand dollars, i.e. 1,487 dollars.
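The coefficients of the sample regression line can be recomputed from the seven stores' data. This is a minimal sketch using the sums-of-products form of the least squares formulas; the function name `fit_line` is an illustrative choice.

```python
# Example 1 data: store size in square feet and annual sales in $1000.
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]


def fit_line(xs, ys):
    """Return (b0, b1) of the least squares line Y-hat = b0 + b1 * X."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b1 = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = fit_line(square_feet, annual_sales)
print(f"Y-hat = {b0:.3f} + {b1:.3f} X")
```

The result should match the equation on the slide after rounding to three decimal places.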
  • 66. BUS B272 Unit 1 Predicting Annual Sales Based on Square Footage Suppose that we would like to use the fitted model to predict the average annual sales for a store with 4,000 square feet.
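The prediction itself is not shown in the transcript. As a worked sketch, substituting X = 4,000 into the fitted line from Example 1 (with the rounded coefficients shown on the slide) gives:

```latex
\hat{Y} = 1636.415 + 1.487 \times 4000 = 7584.415 \quad \text{(in \$1000)}
```

That is, predicted average annual sales of about 7,584,415 dollars. Using unrounded coefficients would change the last few digits slightly.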
  • 67. BUS B272 Unit 1 Interpolation versus Extrapolation: When using a regression line for prediction, it is not appropriate to make predictions beyond the relevant range of the independent variable (in the previous example, from 1,292 to 5,555 square feet). That is, we may interpolate within the relevant range of X values, but we SHOULD NOT extrapolate beyond it. For example, it is not appropriate to predict the average annual sales for a store with 7,000 square feet, since this is beyond the range of X values, i.e. 1,292 to 5,555.
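One way to enforce the interpolation rule in practice is to refuse predictions outside the observed range of X. The sketch below uses the rounded coefficients and the range from Example 1; the function name `predict_sales` and the decision to raise an error are illustrative assumptions.

```python
# Fitted line and relevant range from Example 1 (sales in $1000, size in square feet).
B0, B1 = 1636.415, 1.487
X_MIN, X_MAX = 1292, 5555


def predict_sales(square_feet):
    """Predict average annual sales in $1000, but only inside the relevant range of X."""
    if not X_MIN <= square_feet <= X_MAX:
        raise ValueError(
            f"{square_feet} sq ft is outside the relevant range "
            f"[{X_MIN}, {X_MAX}]; extrapolation is not appropriate"
        )
    return B0 + B1 * square_feet


print(predict_sales(4000))       # interpolation: allowed
try:
    predict_sales(7000)          # extrapolation: rejected
except ValueError as err:
    print(err)
```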
  • 68. BUS B272 Unit 1 Causal Relationship? In general, when there is a relationship identified between X and Y using regression analysis, we usually would say that ‘X is associated with Y’ instead of saying ‘X causes Y’. We cannot claim that two variables are related by cause and effect just because there is a statistical relationship between the two. In fact, you cannot infer a causal relationship from statistics alone.
  • 69. BUS B272 Unit 1 For example, the prices of dog food and houses may well be positively correlated over time. When you collect data on the price of dog food and the price of houses over time, you might infer that they have a positive relationship, but can you conclude that an increase in the price of dog food directly causes the price of houses to increase too? It might be that an inflationary force is influencing both, so that they move in the same general direction over time.
  • 70. BUS B272 Unit 1 Computer Application: Import the data into two adjacent columns in an Excel file and then click Tools / Data Analysis / Regression (see pages 624-5 for a detailed description).
  • 71. BUS B272 Unit 1 Example 1: Computer Output
  • 72. BUS B272 Unit 1 Exercise 2 Consider the example about the level of air pollution at Kwun Tong and the total number of consultations that relate to respiratory diseases in a public clinic in the area. The corresponding data were given as follows:
  • 73. BUS B272 Unit 1 Exercise 2 (a) Determine the sample regression line to predict the number of consultations by the level of pollution. (b) Interpret the coefficients. Solution:
  • 74. BUS B272 Unit 1 Exercise 2 For b1: for each additional unit increase in pollution level, the number of consultations increases, on average, by 0.456701074. No meaningful interpretation of b0 can be made, as the range of x does not include zero.
  • 75. BUS B272 Unit 1 Assessing the simple linear regression model After we have set up a linear regression model, we often wish to assess its fit. That is, we wish to find out how well the model fits the given data. For a good fit, the data as a whole should be quite close to the regression line, so that the independent variable can be used to predict the value of the dependent variable with high accuracy. To examine how well the independent variable predicts the dependent variable, we need to develop several measures of variation.
  • 76. BUS B272 Unit 1 Measure of Variation: The Sum of Squares Total Sample Variability = Explained Variability + Unexplained Variability, that is, SS(Total) = SSR + SSE
  • 77. BUS B272 Unit 1 Measure of Variation: The Sum of Squares (continued) SS(Total) = total sum of squares: measures the variation of the Yi values around their mean Ȳ. SSR = regression sum of squares: explained variation attributable to the relationship between X and Y. SSE = error sum of squares: variation attributable to factors other than the relationship between X and Y (unexplained variation).
  • 78. BUS B272 Unit 1 Measure of Variation: The Sum of Squares (continued) \( SS(\text{Total}) = \sum (Y_i - \bar{Y})^2 \), \( SSR = \sum (\hat{Y}_i - \bar{Y})^2 \), \( SSE = \sum (Y_i - \hat{Y}_i)^2 \). [Figure: a scatter point Yi at Xi, the fitted value Ŷi on the regression line, and the mean Ȳ, showing the total, explained and unexplained deviations.]
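As a rough check of the decomposition SS(Total) = SSR + SSE, the three sums can be computed for the seven-store data used elsewhere in these slides. This is a minimal sketch using plain Python; the variable names are illustrative.

```python
# Seven-store sample from the slides: square feet (X) and annual sales in $000 (Y).
x = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
y = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope and intercept.
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

# The three sums of squares.
ss_total = sum((yi - y_bar) ** 2 for yi in y)          # total variation
ssr = sum((fi - y_bar) ** 2 for fi in y_hat)           # explained variation
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained variation

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
print(f"SS(Total) = {ss_total:.1f}, SSR = {ssr:.1f}, SSE = {sse:.1f}")
```

Up to floating-point rounding, `ssr + sse` reproduces `ss_total`.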
  • 80. BUS B272 Unit 1 Standard Error of Estimate The standard deviation of the variation of observations around the regression line.
  • 81. BUS B272 Unit 1 Standard Error of Estimate The smallest value that sε can assume is 0, which occurs when SSE = 0, that is, when all the points fall on the regression line. Thus, when sε is small, the fit is excellent, and the linear regression model is likely to be an effective analytical and forecasting tool. When sε is large, the regression model is a poor one and is of little value.
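The formula itself did not survive in this transcript. For simple linear regression, the standard error of estimate is usually defined as follows (stated here as the standard textbook form, with \( s_\varepsilon \) as the assumed symbol):

```latex
\[
  s_\varepsilon = \sqrt{\frac{SSE}{n-2}}
\]
```

The divisor n - 2 reflects the two estimated coefficients, b0 and b1.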
  • 82. BUS B272 Unit 1 The Coefficient of Determination (r² or R²) By themselves, SSR, SSE and SS(Total) provide little that can be directly interpreted. A simple ratio of SSR and SS(Total) provides a measure of the usefulness of the regression equation. It measures the proportion of variation in Y that is explained by the independent variable X in the regression model.
  • 83. BUS B272 Unit 1 Coefficients of Determination (r²) [Figure: four scatter plots of Y against X with the fitted line Ŷi = b0 + b1Xi, for r² = 1 (two panels), r² = 0.8 and r² = 0.]
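Written out, the ratio described above takes the following form; the second equality follows from SS(Total) = SSR + SSE:

```latex
\[
  r^2 = \frac{SSR}{SS(\text{Total})} = 1 - \frac{SSE}{SS(\text{Total})},
  \qquad 0 \le r^2 \le 1
\]
```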
  • 84. BUS B272 Unit 1 Coefficient of Correlation The coefficient of correlation is used to measure the strength of association (linear relationship) between two numerical variables. It is only concerned with the strength of the relationship. No causal effect is implied.
  • 85. BUS B272 Unit 1 Coefficient of Correlation (continued) The population correlation coefficient is denoted by ρ (rho). The sample correlation coefficient is denoted by r. It is an estimate of ρ and is used to measure the strength of the linear relationship in the sample observations.
  • 86. BUS B272 Unit 1 Coefficient of Correlation
  • 87. BUS B272 Unit 1 Sample of Observations from Various r Values [Figure: five scatter plots of Y against X, for r = -1, r = -0.6, r = 0, r = 0.6 and r = 1.]
  • 88. BUS B272 Unit 1 Features of ρ and r Unit free. Range between -1 and 1. The closer to -1, the stronger the negative linear relationship. The closer to 1, the stronger the positive linear relationship. The closer to 0, the weaker the linear relationship.
  • 89. BUS B272 Unit 1 Inference about the Slope There is also a more systematic way to assess model fit, namely to perform a hypothesis test on the slope of the regression line. If the two variables involved are not at all linearly related, one could observe from the scatter diagram that the slope of the regression line will be zero.
  • 90. BUS B272 Unit 1 Inference about the Slope Hence, we can determine whether a significant relationship between the variables X and Y exists by testing whether β1 (the true slope) is equal to zero. H0: β1 = 0 (There is no linear relationship) H1: β1 ≠ 0 (There is a linear relationship) If H0 is rejected, there is evidence to believe that a linear relationship exists between X and Y.
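The "unit free" and "range" features can be illustrated numerically. The sketch below computes the sample correlation for the seven-store data from its usual definition and then rescales X from square feet to square metres; the helper name `correlation` is illustrative.

```python
import math

# Seven-store sample from the slides: square feet (X) and annual sales in $000 (Y).
x = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
y = [3681, 3395, 6653, 9543, 3318, 5563, 3760]


def correlation(xs, ys):
    """Sample correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    sxx = sum((a - x_bar) ** 2 for a in xs)
    syy = sum((b - y_bar) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


r = correlation(x, y)

# Unit free: converting square feet to square metres leaves r unchanged.
x_m2 = [xi * 0.092903 for xi in x]
r_rescaled = correlation(x_m2, y)

print(f"r = {r:.4f}, r^2 = {r * r:.4f}, r after rescaling X = {r_rescaled:.4f}")
```

In simple linear regression, the square of r equals the coefficient of determination, and r carries the sign of the slope b1.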
  • 91. BUS B272 Unit 1 The standard error of the slope The estimated standard error of b1.
  • 92. BUS B272 Unit 1 Inference about the Slope: t Test t test for a population slope: is there a linear dependency of Y on X? Null and alternative hypotheses: H0: β1 = 0 (no linear dependency), H1: β1 ≠ 0 (linear dependency). Test statistic:
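The formulas on these two slides were images and are not preserved. In their usual textbook form (an assumption about what the slides showed), the estimated standard error of the slope and the t statistic are:

```latex
\[
  s_{b_1} = \frac{s_\varepsilon}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}},
  \qquad
  t = \frac{b_1 - \beta_1}{s_{b_1}},
  \qquad \text{d.f.} = n - 2
\]
```

Under H0: β1 = 0, the statistic reduces to \( t = b_1 / s_{b_1} \).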
  • 93. BUS B272 Unit 1 Example: Store Sales Data for Seven Stores:
        Store   Square Feet   Annual Sales ($000)
        1       1,726         3,681
        2       1,542         3,395
        3       2,816         6,653
        4       5,555         9,543
        5       1,292         3,318
        6       2,208         5,563
        7       1,313         3,760
    Estimated Regression Equation: Ŷi = 1636.415 + 1.487Xi The slope of this model is 1.487. Is square footage of the store affecting its annual sales?
  • 94. BUS B272 Unit 1 H0: β1 = 0; H1: β1 ≠ 0; α = 0.05; df = 7 - 2 = 5. Test Statistic:
  • 95. BUS B272 Unit 1 Inferences about the Slope: t Test Example Critical Value(s): ±2.5706 (two-tailed test, 0.025 in each rejection region). Decision: Reject H0. Conclusion: At the 5% level of significance, there is evidence that square footage is associated with annual sales.
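The test statistic for this example can be recomputed from the seven data points. The sketch below assumes the usual formulas for the standard error of estimate and of the slope, and takes the critical value 2.5706 from the slide rather than from a statistics library.

```python
import math

# Seven-store sample from the slides: square feet (X) and annual sales in $000 (Y).
x = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
y = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = sxy / sxx            # sample slope
b0 = y_bar - b1 * x_bar   # sample intercept

sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_e = math.sqrt(sse / (n - 2))   # standard error of estimate
s_b1 = s_e / math.sqrt(sxx)      # estimated standard error of the slope
t_stat = (b1 - 0) / s_b1         # test statistic under H0: beta1 = 0

T_CRIT = 2.5706  # two-tailed critical value, alpha = 0.05, df = 5 (from the slide)
reject_h0 = abs(t_stat) > T_CRIT

print(f"s_e = {s_e:.2f}, s_b1 = {s_b1:.4f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```

The computed t statistic is far beyond the critical value, which agrees with the decision to reject H0.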
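  • 96. BUS B272 Unit 1 Inferences about the Slope Two-tailed test: H0: β1 = 0 (No linear relationship) vs. H1: β1 ≠ 0 (A linear relationship). Upper-tailed test: H0: β1 ≤ 0 (No positive linear relationship) vs. H1: β1 > 0 (A positive linear relationship). Lower-tailed test: H0: β1 ≥ 0 (No negative linear relationship) vs. H1: β1 < 0 (A negative linear relationship).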
  • 97. BUS B272 Unit 1 Exercise 3 Consider the data of Exercise 2 about the level of air pollution at Kwun Tong and the total number of consultations that relate to respiratory diseases in a public clinic in the area. Test at the 5% level of significance to determine whether level of air pollution and the total number of consultations are positively linearly related.
  • 98. BUS B272 Unit 1 Solution: α = 0.05; df = 14 - 2 = 12
  • 99. BUS B272 Unit 1 Exercise 3
  • 100. BUS B272 Unit 1 Computer Output For two-tailed test
  • 101. BUS B272 Unit 1 Exercise 3 Critical Value(s): 1.7823 (one-tailed test, 0.05 in the upper rejection region). Decision: Reject H0. Conclusion: At the 5% level of significance, there is evidence to believe that level of air pollution and total number of consultations are positively linearly related.
  • 102. BUS B272 Unit 1 Estimation of Mean Values You have seen how we can assess the model fit. If the model fits satisfactorily, we can use it to forecast and estimate values of the dependent variable. We can obtain a point prediction of Y for a given value of X using the linear regression line. A confidence interval for a particular value of Y, or for the average of Y, at a given value of X can also be computed if desired.
  • 103. BUS B272 Unit 1 Estimation of Mean Values Confidence interval estimate for μY|X=Xi, the mean of Y given a particular Xi. The size of the interval varies according to the distance of Xi away from the mean, X̄. The interval uses the standard error of the estimate and the t value from the table with df = n - 2.
  • 104. BUS B272 Unit 1 Prediction of Individual Values Prediction interval for an individual response Yi at a particular Xi. The addition of one under the square root increases the width of the interval from that for the mean of Y.
  • 105. BUS B272 Unit 1 Interval Estimates for Different Values of X [Figure: the fitted line Ŷi = b0 + b1Xi plotted as Y against X, with the narrower confidence interval band for the mean of Y given X and the wider prediction interval band for an individual Yi.]
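The interval formulas were images on the slides and are not preserved. In their usual textbook form (an assumption about what was shown), with \( t_{n-2} \) the critical value and \( s_\varepsilon \) the standard error of the estimate, they read:

```latex
\[
  \text{Confidence interval for the mean of } Y:\quad
  \hat{Y}_i \pm t_{n-2}\, s_\varepsilon
  \sqrt{\frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{j=1}^{n} (X_j - \bar{X})^2}}
\]
\[
  \text{Prediction interval for an individual } Y_i:\quad
  \hat{Y}_i \pm t_{n-2}\, s_\varepsilon
  \sqrt{1 + \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{j=1}^{n} (X_j - \bar{X})^2}}
\]
```

The only difference is the extra 1 under the square root, which is why the prediction interval is always wider.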
  • 106. BUS B272 Unit 1 Example: Stores Sales Data for seven stores: Predict the annual sales for a store with 2,000 square feet.
        Store   Square Feet   Annual Sales ($000)
        1       1,726         3,681
        2       1,542         3,395
        3       2,816         6,653
        4       5,555         9,543
        5       1,292         3,318
        6       2,208         5,563
        7       1,313         3,760
    Regression Model Obtained: Ŷi = 1636.415 + 1.487Xi
  • 107. BUS B272 Unit 1 Estimation of Mean Values: Example Confidence Interval Estimate for μY|X=Xi: Find the 95% confidence interval for the average annual sales for a 2,000 square-foot store. Predicted Sales Ŷi = 1636.415 + 1.487Xi = 4609.68 ($000); tn-2 = t5 = 2.571; X̄ = 2350.29
  • 108. BUS B272 Unit 1 Prediction Interval for Y : Example Prediction Interval for Individual Y: Find the 95% prediction interval for the annual sales of a 2,000 square-foot store. Predicted Sales Ŷi = 1636.415 + 1.487Xi = 4609.68 ($000); tn-2 = t5 = 2.571; X̄ = 2350.29
  • 109. BUS B272 Unit 1 Computer Application Commands: Tools/Data Analysis Plus/Prediction Interval.
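Both intervals for the 2,000 square-foot store can be sketched as below, assuming the usual interval formulas and taking the critical value 2.5706 from the earlier slide. The code uses the unrounded least-squares coefficients, which is why the point prediction comes out near 4609.68 rather than the 4610.415 that the rounded equation would give.

```python
import math

# Seven-store sample from the slides: square feet (X) and annual sales in $000 (Y).
x = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
y = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_e = math.sqrt(sse / (n - 2))  # standard error of estimate

T_CRIT = 2.5706  # t value for 95% confidence with df = 5 (from the slides)
x_new = 2000     # store size of interest, in square feet

y_hat = b0 + b1 * x_new
# Term shared by both intervals: grows as x_new moves away from the mean of X.
distance_term = 1 / n + (x_new - x_bar) ** 2 / sxx
ci_half = T_CRIT * s_e * math.sqrt(distance_term)      # mean of Y given X
pi_half = T_CRIT * s_e * math.sqrt(1 + distance_term)  # individual Y given X

print(f"Point prediction: {y_hat:.2f} ($000)")
print(f"95% CI for the mean: {y_hat - ci_half:.2f} to {y_hat + ci_half:.2f}")
print(f"95% PI for one store: {y_hat - pi_half:.2f} to {y_hat + pi_half:.2f}")
```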
  • 110. BUS B272 Unit 1 Computer Output
  • 111. BUS B272 Unit 1 Linear Regression Assumptions 1. Normality Y values are normally distributed for each X Probability distribution of error is normal 2. Homoscedasticity (Constant Variance) 3. Independence of Errors
  • 113. Variation of Errors around the Regression Line For each X value, the “spread” or variance around the regression line is the same. [Figure: error distributions f(e) of equal spread drawn at X1 and X2 along the sample regression line, with axes X and Y.]
  • 115. BUS B272 Unit 1 Introduction Extension of the simple linear regression model to allow for any fixed number of independent variables. That is, the number of independent variables could be more than one.
  • 116. BUS B272 Unit 1 Multiple Linear Regression We make use of the computer printout to: (1) Assess the model: how well does it fit the data? Is it useful? Are any required conditions violated? (2) Employ the model: interpreting the coefficients, making predictions using the prediction equation, and estimating the expected value of the dependent variable.
  • 117. BUS B272 Unit 1 Model and Required Conditions We allow for k independent variables to potentially be related to the dependent variable: y = β0 + β1x1 + β2x2 + … + βkxk + ε, where y is the dependent variable, x1, …, xk are the independent variables, β0, …, βk are the regression coefficients, and ε is the random error variable.
  • 118. BUS B272 Unit 1 Multiple Regression for k = 2, Graphical Demonstration The simple linear regression model allows for one independent variable, x: y = β0 + β1x + ε. The multiple linear regression model allows for more than one independent variable: y = β0 + β1x1 + β2x2 + ε. [Figure: the line y = β0 + β1x in two dimensions extended to the plane y = β0 + β1x1 + β2x2 over the axes x1 and x2.]
  • 119. BUS B272 Unit 1 Required conditions for the error variable The error ε is normally distributed. Its mean is equal to zero and its standard deviation is constant (σε) for all values of y. The errors are independent.
  • 121. Assess the model fitness using statistics obtained from the sample.
  • 123. BUS B272 Unit 1 Estimating the Coefficients and Assessing the Model, Example Dependent variable: Margin (%), a measure of profitability. The factors considered are competition, market awareness, customers, community and physical location. Independent variables: Number = number of hotel/motel rooms within 3 miles of the site; Nearest = number of miles to the closest competition; Office space = office space in the nearby community; Enrollment = enrollment in nearby university or college (in thousands); Income = median household income of the nearby area (in $thousands); Distance = distance to the downtown core (in miles).
  • 124. BUS B272 Unit 1 Estimating the Coefficients and Assessing the Model, Example Data were collected from 100 randomly selected inns that belong to La Quinta (data file Xm18-01), and the following suggested model was fitted: Margin = β0 + β1Rooms + β2Nearest + β3Office + β4College + β5Income + β6Disttwn
  • 125. BUS B272 Unit 1 Regression Analysis, Excel Output Margin = 38.14 - 0.0076 Number + 1.65 Nearest + 0.020 Office Space + 0.21 Enrollment + 0.41 Income - 0.23 Distance This is the sample regression equation (sometimes called the prediction equation).
  • 126. BUS B272 Unit 1 Model Assessment The model is assessed using two tools: the coefficient of determination and the F-test of the analysis of variance. The standard error of estimate is used in building both tools.
  • 127. BUS B272 Unit 1 Standard Error of Estimate The standard deviation of the error is estimated by the Standard Error of Estimate, sε. The magnitude of sε is judged by comparing it to the mean value of y.
  • 128. BUS B272 Unit 1 Standard Error of Estimate From the printout, sε = 5.51. Calculating the mean value of y and comparing, it seems that sε is not particularly small. Question: Can we conclude that the model does not fit the data well?
  • 129. BUS B272 Unit 1 Coefficient of Determination The definition is: R² = SSR / SS(Total), as in simple regression. From the printout, R² = 0.5251, so 52.51% of the variation in operating margin is explained by the six independent variables; 47.49% remains unexplained.
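The formula on this slide was an image and is not preserved. With k independent variables, the standard error of estimate is usually written as follows (the standard textbook form, stated here as an assumption about what the slide showed):

```latex
\[
  s_\varepsilon = \sqrt{\frac{SSE}{n - k - 1}}
\]
```

With k = 1 this reduces to the simple-regression divisor n - 2.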
  • 130. BUS B272 Unit 1 Testing the Validity of the Model For testing the validity of the model, the following question is asked: Is there at least one independent variable linearly related to the dependent variable? To answer the question we test the hypotheses H0: β1 = β2 = … = βk = 0 and H1: At least one βi is not equal to zero. If at least one βi is not equal to zero, the model has some validity or usefulness.
  • 131. BUS B272 Unit 1 Testing the Validity of the La Quinta Inns Regression Model The hypotheses are tested by an ANOVA procedure (the Excel output):
        Source       d.f.         SS          MS                        F
        Regression   k            SSR         MSR = SSR / k             MSR / MSE
        Residual     n - k - 1    SSE         MSE = SSE / (n - k - 1)
        Total        n - 1        SS(Total)
  • 132. BUS B272 Unit 1 Testing the Validity of the La Quinta Inns Regression Model [Total variation in y] SS(Total) = SSR + SSE. A large F results from a large SSR. That implies that much of the variation in y is explained by the regression model; the model is useful, and thus the null hypothesis should be rejected. Therefore, the rejection region is F > Fα, k, n-k-1, while the test statistic is F = MSR / MSE.
  • 133. BUS B272 Unit 1 Testing the Validity of the La Quinta Inns Regression Model Fα, k, n-k-1 = F0.05, 6, 100-6-1 = 2.17 and F = 17.14 > 2.17. Also, the p-value (Significance F) = 0.0000, so we reject the null hypothesis. Conclusion: There is sufficient evidence to reject the null hypothesis in favor of the alternative hypothesis. At least one of the βi is not equal to zero. Thus, at least one independent variable is linearly related to y. This linear regression model is valid.
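The F statistic quoted here can be cross-checked against the reported R² = 0.5251. Dividing both MSR and MSE by SS(Total) gives F = (R²/k) / ((1 - R²)/(n - k - 1)). The sketch below uses only figures quoted on the slides; the function name `f_statistic` is illustrative.

```python
def f_statistic(r_squared, n, k):
    """Overall F statistic, F = MSR / MSE, rewritten in terms of R^2."""
    msr_share = r_squared / k                  # SSR / k, as a fraction of SS(Total)
    mse_share = (1 - r_squared) / (n - k - 1)  # SSE / (n - k - 1), same scaling
    return msr_share / mse_share


F_CRIT = 2.17  # F(0.05; 6, 93) as quoted on the slide

f_stat = f_statistic(0.5251, n=100, k=6)
print(f"F = {f_stat:.2f}, reject H0: {f_stat > F_CRIT}")
```

The result agrees with the F = 17.14 read from the Excel output, up to rounding of R².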
  • 134. BUS B272 Unit 1 Interpreting the Coefficients b0 = 38.14. This is the intercept, the value of y when all the variables take the value zero. Since the data ranges of the independent variables do not cover the value zero, do not interpret the intercept. b1 = -0.0076. In this model, for each additional room within 3 miles of the La Quinta inn, the operating margin decreases on average by 0.0076% (assuming the other variables are held constant).
  • 135. BUS B272 Unit 1 Interpreting the Coefficients b2 = 1.65. In this model, for each additional mile between the nearest competitor and a La Quinta inn, the operating margin increases on average by 1.65% when the other variables are held constant. b3 = 0.020. For each additional 1,000 sq-ft of office space, the operating margin increases on average by 0.02% when the other variables are held constant. b4 = 0.21. For each additional thousand students, the operating margin increases on average by 0.21% when the other variables are held constant.
  • 136. BUS B272 Unit 1 Interpreting the Coefficients b5 = 0.41. For each additional $1,000 increase in median household income, the operating margin increases on average by 0.41% when the other variables are held constant. b6 = -0.23. For each additional mile to the downtown center, the operating margin decreases on average by 0.23% when the other variables are held constant.
  • 137. BUS B272 Unit 1 Testing the Coefficients The hypotheses for each βi are H0: βi = 0 and H1: βi ≠ 0. The test statistic is read from the Excel printout, with d.f. = n - k - 1.
  • 138. BUS B272 Unit 1 Using the Linear Regression Equation The model can be used for making predictions by producing a prediction interval estimate for a particular value of y, for a given set of values of xi, or by producing a confidence interval estimate for the expected value of y, for a given set of values of xi. The model can also be used to learn about relationships between the independent variables xi and the dependent variable y, by interpreting the coefficients bi.
  • 139. BUS B272 Unit 1 La Quinta Inns, Predictions (data file Xm18-01) Predict the average operating margin of an inn at a site with the following characteristics: 3,815 rooms within 3 miles, closest competitor 0.9 miles away, 476,000 sq-ft of office space, 24,500 college students, $35,000 median household income, 11.2 miles away from the downtown center. MARGIN = 38.14 - 0.0076(3815) + 1.65(0.9) + 0.020(476) + 0.21(24.5) + 0.41(35) - 0.23(11.2) = 37.1%
  • 140. BUS B272 Unit 1 La Quinta Inns, Predictions Interval estimates by Excel (Data Analysis Plus): It is predicted, with 95% confidence, that the operating margin will lie between 25.4% and 48.8%. It is estimated that the average operating margin of all sites that fit this category falls between 33% and 41.2%. Both intervals suggest that the given site would not be profitable (less than 50%).
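The point prediction above is just the sample regression equation evaluated at the site's characteristics. A minimal sketch, using the rounded coefficients printed on the slides and illustrative dictionary keys:

```python
# Sample regression coefficients as printed on the slides (rounded).
INTERCEPT = 38.14
coefficients = {
    "Number": -0.0076,      # rooms within 3 miles
    "Nearest": 1.65,        # miles to closest competitor
    "Office space": 0.020,  # thousands of sq-ft
    "Enrollment": 0.21,     # thousands of students
    "Income": 0.41,         # median household income, $ thousands
    "Distance": -0.23,      # miles to downtown
}

# Characteristics of the site to be evaluated, in the same units.
site = {
    "Number": 3815,
    "Nearest": 0.9,
    "Office space": 476,
    "Enrollment": 24.5,
    "Income": 35,
    "Distance": 11.2,
}

margin = INTERCEPT + sum(coefficients[name] * site[name] for name in coefficients)
print(f"Predicted operating margin: {margin:.1f}%")
```

Note that office space, enrollment and income must be entered in thousands, matching the units in which the model was fitted.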