NG BB 36 Simple Linear Regression
1. UNCLASSIFIED / FOUO
National Guard
Black Belt Training
Module 36
Simple Linear Regression
This material is not for general distribution, and its contents should not be quoted, extracted for publication, or otherwise
copied or distributed without prior coordination with the Department of the Army, ATTN: ETF.
2. UNCLASSIFIED / FOUO
CPI Roadmap – Analyze
8-STEP PROCESS
1. Validate the Problem
2. Identify Performance Gaps
3. Set Improvement Targets
4. Determine Root Cause
5. Develop Counter-Measures
6. See Counter-Measures Through
7. Confirm Results & Process
8. Standardize Successful Processes
Define, Measure, Analyze, Improve, Control
ACTIVITIES
• Identify Potential Root Causes
• Reduce List of Potential Root Causes
• Confirm Root Cause to Output Relationship
• Estimate Impact of Root Causes on Key Outputs
• Prioritize Root Causes
• Complete Analyze Tollgate
TOOLS
• Value Stream Analysis
• Process Constraint ID
• Takt Time Analysis
• Cause and Effect Analysis
• Brainstorming
• 5 Whys
• Affinity Diagram
• Pareto
• Cause and Effect Matrix
• FMEA
• Hypothesis Tests
• ANOVA
• Chi Square
• Simple and Multiple Regression
Note: Activities and tools vary by project. Lists provided here are not necessarily all-inclusive.
3. UNCLASSIFIED / FOUO
Learning Objectives
Terminology and data requirements for conducting a
regression analysis
Interpretation and use of scatter plots
Interpretation and use of correlation coefficients
The difference between correlation and causation
How to generate, interpret, and use regression
equations
4. UNCLASSIFIED / FOUO
Application Examples
Administrative – A financial analyst wants to predict
the cash needed to support growth and increases in
training
Market/Customer Research – The main exchange
wants to determine how to predict a customer’s
buying decision from demographics and product
characteristics
Hospitality – The MWR Guest House wants to see if
there is a relationship between room service delays
and order size
5. UNCLASSIFIED / FOUO
When Should I Use Regression?
                             Independent Variable (X)
Dependent Variable (Y)       Continuous             Attribute
Continuous                   Regression             ANOVA
Attribute                    Logistic Regression    Chi-Square (χ²) Test
The tool depends on the data type. Regression is typically used with a continuous
input and a continuous response but can also be used with count or categorical
inputs and outputs.
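The tool choice described above can be read as a simple lookup on the data types of X and Y. A minimal Python sketch follows; the function name and the string labels are illustrative assumptions and are not part of Minitab.

```python
# Lookup of the analysis tool suggested by the slide,
# keyed by (data type of X, data type of Y).
# The keys and the function name are illustrative assumptions.
TOOL_BY_DATA_TYPE = {
    ("continuous", "continuous"): "Regression",
    ("attribute", "continuous"): "ANOVA",
    ("continuous", "attribute"): "Logistic Regression",
    ("attribute", "attribute"): "Chi-Square Test",
}


def suggest_tool(x_type: str, y_type: str) -> str:
    """Return the tool suggested for the given X and Y data types."""
    return TOOL_BY_DATA_TYPE[(x_type.lower(), y_type.lower())]


print(suggest_tool("continuous", "continuous"))
```

This covers only the typical cases in the grid; as the note above says, regression can also be used with count or categorical data.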
6. UNCLASSIFIED / FOUO
General Strategy for Regression Modeling
Planning and Data Collection
• What variables?
• How will I get the data?
• How much data do I need?
Initial Analysis and Reduction of Variables
• What input variables have the biggest effect on the response variable?
• What are some candidate prediction models?
Select and Refine Models
• What is the best model?
Validate Model
• How well does the model predict new observations?
7. UNCLASSIFIED / FOUO
Regression Terminology
Types of Variables
Input Variables (Xs)
These are also called predictor variables or independent variables
Best if the variables are continuous, but can be count or categorical
Output Variables (Ys)
These are also called response variables or dependent variables (what we’re trying to predict)
Best if the variables are continuous, but can be count or categorical
[Diagram: inputs X1, X2, X3 and Error feed a Process or Product, which produces the output Y]
8. UNCLASSIFIED / FOUO
Visualize the Data – A Good Start!
Scatter Plot: A graph showing a relationship (or correlation)
between two factors or variables
Lets you “see” patterns in data
Supports or refutes theories about the data
Helps create or refine hypotheses
Predicts effects under other circumstances (be careful
extending predictions beyond the range of data used)
Be Careful
Correlation does not
guarantee causation!
9. UNCLASSIFIED / FOUO
Correlation vs. Causation
Correlation by itself does not imply a cause and
effect relationship!
Other examples?
[Figure: example scatter plots pairing Average life expectancy with # divorces/10,000, and Gas mileage with Price of automobiles; annotated “Lurking variables!”]
When is it correct to infer causation?
10. UNCLASSIFIED / FOUO
Example: Mortgage Estimates
A Belt is trying to reduce the call length for military
clients calling for a good faith estimate on a VA loan
The Belt thinks that there is a relationship between
broker experience and call length, and creates a
scatter plot to visualize the relationship
11. UNCLASSIFIED / FOUO
Example: Mortgage Estimate Scatter Plot
Hypothesis:
Brokers with more experience can provide
estimates in a shorter time.
[Scatter plot: Call Length (Y axis, 20 to 60) vs. Broker Experience (X axis, 10 to 30)]
Does it look like a relationship exists between Broker Experience and Call Length?
12. UNCLASSIFIED / FOUO
Scatter Plot - Structure
[Scatter plot: Call Length on the Y axis (the result) vs. Broker Experience on the X axis (the suspected influence); each point is one pair of data]
Paired Data?
To use a scatter plot, you must have measured two factors for a single observation or item (ex: for a
given measurement, you need to know both the call length and the broker’s experience). You have to
make sure that the data “pair-up” properly in Minitab, or the diagram will be meaningless.
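As a rough illustration of what pairing up means, the sketch below keeps the two measurements for each observation together in one record. The numbers are made-up placeholders, not data from the mortgage example, and the function name is hypothetical.

```python
# Each observation carries BOTH measurements, so X and Y cannot drift out of step.
# The values below are illustrative placeholders only.
def pair_up(x_values, y_values):
    """Pair X and Y by position; stop if any observation lacks a measurement."""
    if len(x_values) != len(y_values):
        raise ValueError("X and Y must have one value each per observation")
    return list(zip(x_values, y_values))


broker_experience = [12, 18, 25]  # X: suspected influence
call_length = [48, 39, 27]        # Y: result

for experience, length in pair_up(broker_experience, call_length):
    print(f"experience={experience}, call length={length}")
```

In a worksheet, the equivalent check is that each row holds the X value and the Y value for the same call.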
13. UNCLASSIFIED / FOUO
Input, Process, Output Context
PREDICTOR MEASURES (X): Input and Process
RESULTS MEASURES (Y): Output
Input (X)
• Arrival Time
• Accuracy
• Cost
• Key Specs
Process (X)
• Time Per Task
• In-Process Errors
• Labor Hours
• Exceptions
Output (Y)
• Customer Satisfaction
• Total Defects
• Cycle Time
• Cost
X Axis – Independent Variable
Y Axis – Dependent Variable
14. UNCLASSIFIED / FOUO
Scatter Plots
[Figure: four example scatter plots labeled No Correlation, Negative, Curvilinear, and Positive]
See how one factor relates to changes in another
Develop and/or verify hypotheses
Judge strength of relationship by width or tightness of
scatter
Don’t assume a causal relationship!
15. UNCLASSIFIED / FOUO
Exercise: Interpreting Scatter Plots
1. As a team, review assigned Scatter Plots – see next pages
2. What kind of correlation do you see? (Name)
3. What does it mean?
4. What can you conclude?
5. What data might this represent? (Example)
16. UNCLASSIFIED / FOUO
Example One
17. UNCLASSIFIED / FOUO
Example Two
18. UNCLASSIFIED / FOUO
Example Three
19. UNCLASSIFIED / FOUO
Minitab Example: Scatter Plot
Next, we will work through a Minitab example using
data collected at the Anthony’s Pizza company
The Belt suspects that the customers have to wait too
long on days when there are many deliveries to make
at Anthony’s Pizza
20. UNCLASSIFIED / FOUO
Minitab Example: Pizza Scatter Plot
A month of data was collected, and stored in the
Minitab file Regression-Pizza.mtw
22. UNCLASSIFIED / FOUO
Pizza Scatter Plot (Cont.)
When you click on Scatterplots,
this is the first dialog box that
comes up
3. Select the Simple Scatterplot
4. Click on OK to move to the
next dialog box
23. UNCLASSIFIED / FOUO
Pizza Scatter Plot (Cont.)
5. Double click on
C5 Wait Time to enter it
as the Y variable, then
double click on
C6 Deliveries to enter it
as the X variable
6. Edit dialog box options
(Optional)
7. Click OK
24. UNCLASSIFIED / FOUO
Pizza Scatter Plot (Cont.)
Does it look like the number of Deliveries
influences the customer’s Wait Time?
[Scatterplot of Wait Time vs Deliveries: Wait Time (Y axis, 35 to 55) vs. Deliveries (X axis, 10 to 35)]
25. UNCLASSIFIED / FOUO
Pizza Scatter Plot (Cont.)
Note: Hold your cursor over any
point on the Scatterplot and Minitab will identify the
Row, X-Value and Y-Value for that point
26. UNCLASSIFIED / FOUO
Correlation Coefficients (r & r²)
Numbers that indicate the strength of the correlation between two factors
r - strength and the direction of the relationship
Also called Pearson’s Correlation Coefficient
r² - percentage of variation in Y attributable to the independent variable X
Adds precision to a person’s visual judgment about
correlation
Test the power of your hypothesis
How much influence does this factor have?
Are there other, more important, “vital few” causes?
27. UNCLASSIFIED / FOUO
Interpreting Correlation Coefficients
r falls on or between -1 and 1
Calculate in Minitab
Figures below -0.65 and above 0.65 indicate a meaningful correlation
1 = “Perfect” positive correlation
-1 = “Perfect” negative correlation
Use to calculate r²
[Figure: example scatter plots with r = 0 and r = -0.8]
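For readers who want to see what sits behind the number Minitab reports, here is a minimal Python sketch of the textbook Pearson formula together with the ±0.65 rule of thumb from this slide. The function names and the toy data are assumptions made for illustration; this is not Minitab's code.

```python
import math


def pearson_r(xs, ys):
    """Textbook Pearson correlation coefficient for paired observations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def is_meaningful(r, threshold=0.65):
    """Apply the slide's rule of thumb: below -0.65 or above 0.65."""
    return abs(r) > threshold


# Toy data (hypothetical): Y falls by exactly 2 for every unit of X.
toy_x = [1, 2, 3, 4, 5]
toy_y = [10, 8, 6, 4, 2]
r = pearson_r(toy_x, toy_y)
print(round(r, 3), is_meaningful(r))
```

The sketch does not guard against a variable with no variation, where the denominator would be zero.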
28. UNCLASSIFIED / FOUO
Pearson Correlation Coefficient (r) – Mortgage
Betty Black Belt used the scatter plot to get a visual
picture of the relationship between broker experience
and call length
Now she uses the Pearson Correlation Coefficient, r,
to quantify the strength of the relationship
[Scatter plot: Call Length vs. Broker Experience, annotated r = -0.896 (a strong negative correlation)]
29. UNCLASSIFIED / FOUO
Exercise: Correlation
The scatter plot shows that the customers are waiting
longer when Anthony’s Pizza has to make more
deliveries
Next, the Belt wants to quantify the strength of that
relationship
To do that, we will calculate the Pearson Correlation
Coefficient, r
31. UNCLASSIFIED / FOUO
Correlation Input Window
2. Double click on C5 Wait
Time and C6 Deliveries
to add them to the
Variables box
3. Uncheck the box,
Display p-values
4. Click OK
32. UNCLASSIFIED / FOUO
Correlation Coefficient
Since r, the Pearson correlation, is 0.970, there is a meaningful
correlation between the wait time and number of deliveries
33. UNCLASSIFIED / FOUO
Interpreting Coefficients – r²
First, we obtained r from the Correlation analysis
Next, in Regression, we will look at r² to see how good our model (regression equation) is
r²: Compute by multiplying r x r (Pearson correlation squared)
Example: With an r value of .970, in the Pizza example, the team computed r²:
.970 x .970 = .941 or 94.1%
So, 94% of the variation in wait time is explained by the variability in deliveries
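The same arithmetic as a short Python sketch, using the two r values quoted in this module (0.970 for the Pizza example and -0.896 for the Mortgage example). The function name is an assumption for illustration.

```python
def r_squared(r):
    """Square the Pearson correlation to get the share of variation explained."""
    return r * r


for label, r in [("Pizza", 0.970), ("Mortgage", -0.896)]:
    print(f"{label}: r = {r:+.3f}, r^2 = {r_squared(r):.3f} ({r_squared(r):.1%})")
```

Squaring removes the sign, so r² shows how strong the linear relationship is but not its direction.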
34. UNCLASSIFIED / FOUO
Regression Analysis
Regression Analysis is used in conjunction with
Correlation and Scatter Plots to predict future
performance using past results
While Correlation shows how much linear relationship
exists between two variables, Regression defines the
relationship more precisely
Use this tool when there is existing data over a
defined range
Regression analysis is a tool that uses data on
relevant variables to develop a prediction equation, or
model
35. UNCLASSIFIED / FOUO
Linear Regression
In Simple Linear Regression, a single variable “X” is used to define/predict “Y”
e.g., Wait Time = B1 + (B2) x (Deliveries) + (error)
Simple Regression Equation: Y = B1 + (B2) x (X) + ε
where B1 is the Y intercept, B2 = Slope = Δy / Δx, and ε is the error
[Figure: straight line on X-Y axes showing the slope as the change in y over the change in x]
36. UNCLASSIFIED / FOUO
Exercise: Regression
Since the Pearson Correlation (r) was .970, we know
that there is a strong positive correlation between the
number of deliveries and the wait time
Next, the Belt would like to get an equation to predict
how long the customers will be waiting
37. UNCLASSIFIED / FOUO
Regression (Cont.)
1. Choose Stat>Regression>Fitted Line Plot
38. UNCLASSIFIED / FOUO
Fitted Line Input Window
2. Double click on
C5 Wait Time to enter it as
the Response (Y) variable
3. Double click on
C6 Deliveries to enter it as
the Predictor (X) variable
4. Make sure Linear is checked
for the type of Regression
5.Edit dialog box options
(Optional)
6. Click OK
39. UNCLASSIFIED / FOUO
Pizza Regression Plot
Fitted Line Plot
Wait Time = 32.05 + 0.5825 Deliveries
S = 1.11885
R-Sq = 94.1%
R-Sq(adj) = 93.9%
[Plot: Wait Time (Y axis, 35 to 55) vs. Deliveries (X axis, 10 to 35) with the fitted line]
40. UNCLASSIFIED / FOUO
Regression Analysis Results – Session Window
Prediction Equation (Regression Model)
R-Sq is the amount of variation in the data explained by the model.
Notice that 94.1% = .970 x .970 (that is, 0.941). R-Sq is the square of the Pearson correlation from the previous analysis.
41. UNCLASSIFIED / FOUO
Using the Prediction Equation
If we have 20 deliveries to make, how long will the
customer have to wait for their order?
Based on our 30 minute guarantee, how acceptable is
our performance?
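One way to work the first question is to plug 20 deliveries into the fitted equation from the Pizza Regression Plot (Wait Time = 32.05 + 0.5825 Deliveries). The Python sketch below does that; the names are illustrative, and treating Wait Time as minutes is an assumption based on the 30 minute guarantee.

```python
# Coefficients taken from the fitted line plot in this module.
INTERCEPT = 32.05          # B1
SLOPE = 0.5825             # B2, additional wait per delivery
GUARANTEE_MINUTES = 30     # assumes Wait Time is measured in minutes


def predict_wait_time(deliveries):
    """Predicted wait time from the fitted line."""
    return INTERCEPT + SLOPE * deliveries


wait = predict_wait_time(20)
print(f"Predicted wait for 20 deliveries: {wait:.2f}")  # 43.70
print("Within guarantee:", wait <= GUARANTEE_MINUTES)    # False
```

Note that 20 deliveries lies inside the range shown on the plot; as cautioned earlier, predictions outside the range of the data should be treated carefully.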
42. UNCLASSIFIED / FOUO
Method of “Least Squares”
Regression – Technical Note
Fitted Line Plot: Wait Time = 32.05 + 0.5825 Deliveries
[Plot: Wait Time vs. Deliveries. Ŷ is the “fitted” observation (the line); Y is the true observation (the data point)]
Minitab will find the “best fitting” line for us. How does it do that?
• We want to have as little difference as possible between the true observations and the fitted line
• Minitab minimizes the sum of the squared vertical distances between the fitted and true observations
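The idea can be sketched with the standard closed-form least-squares formulas for a straight line. This is a textbook illustration in Python, not a description of Minitab's internal routine; the function names and the toy data are assumptions.

```python
def fit_least_squares(xs, ys):
    """Return (intercept, slope) of the line minimizing the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_errors(xs, ys, intercept, slope):
    """Sum of squared vertical distances between observations and the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Toy data (hypothetical), only to show the mechanics.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 10]
b1, b2 = fit_least_squares(xs, ys)
best = sum_squared_errors(xs, ys, b1, b2)
nudged = sum_squared_errors(xs, ys, b1, b2 + 0.1)  # any other line does worse
print(f"intercept={b1:.2f}, slope={b2:.2f}, SSE={best:.3f}, nudged SSE={nudged:.3f}")
```

Changing the slope or intercept away from the fitted values always increases the sum of squared errors, which is what “best fitting” means here.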
43. UNCLASSIFIED / FOUO
Multiple Regression
Use this when you want to consider more than one
predictor variable
The benefit is that you might need more predictors to
create an accurate model
In the case of our Anthony’s Pizza example, we may
want to look at the impact that incorrect orders,
damaged pizzas, and cold pizzas have on wait time
44. UNCLASSIFIED / FOUO
Individual Exercise: Pizza
As an Anthony’s Pizza Belt, you suspect that the number of
pizza defects increases when more pizzas are ordered.
You want to visualize the data and quantify the relationship
Use the Minitab file Pizza Exercise.mtw data to
investigate the relationship between “Total Pizzas” and
“Defects”
Create a scatter plot
Determine correlation
Create a fitted line plot
Determine the prediction equation
How many defects do we usually have when 50 pizzas are
on order? What do you think of this model?
45. UNCLASSIFIED / FOUO
Another Exercise: Absentee Rate
The human resources director of a chain of fast-food
restaurants studied the absentee rate of employees.
Whenever employees called in sick, or simply did not
show up, the restaurant manager had to find
replacements in a hurry, or else work short-handed
The director had data on the number of absences per
100 employees per week (Y) and the average number
of months’ experience at the restaurant (X) for 10
restaurants in the chain. The director expected that
long-term employees would be more reliable and
absent less often
46. UNCLASSIFIED / FOUO
Absentee Rate
1. Open a blank Minitab worksheet and input the data
2. Create a scatter plot and decide whether a straight line is a reasonable model
3. Conduct a regression analysis and get the linear prediction equation
4. Predict the number of absences for employees with 19.5 months of experience
Experience   Absences
18.1         31.5
20.0         33.1
20.8         27.4
21.5         24.5
22.0         27.0
22.4         27.8
22.9         23.3
24.0         24.7
25.4         16.9
27.3         18.1
47. UNCLASSIFIED / FOUO
Takeaways
Start with a visual tool – create a scatter plot
Determine the Pearson correlation coefficient, r, to
determine the strength of the relationship
Remember that correlation does not guarantee
causation!
Create and interpret the Regression Plot
Use the prediction equation
Validate the prediction model’s r-squared using new
data (not part of the data set used in creating the
prediction equation)
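The last takeaway can be sketched as follows: keep the fitted equation fixed and compute r-squared on observations that were not used to fit it. The Python below uses the Pizza equation from this module; the new observations are made-up placeholders, the function name is hypothetical, and computing r-squared as 1 minus the ratio of residual to total sum of squares on the new data is one common convention, not necessarily what Minitab reports.

```python
def r_squared_on_new_data(xs_new, ys_new, intercept, slope):
    """R-squared of a fixed prediction equation evaluated on new observations."""
    mean_y = sum(ys_new) / len(ys_new)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs_new, ys_new))
    ss_tot = sum((y - mean_y) ** 2 for y in ys_new)
    return 1 - ss_res / ss_tot


# Fitted equation from the Pizza example: Wait Time = 32.05 + 0.5825 Deliveries
INTERCEPT, SLOPE = 32.05, 0.5825

# Hypothetical new observations, NOT from the original data set.
new_deliveries = [12, 20, 28]
new_wait_times = [39.5, 43.0, 48.9]

print(f"{r_squared_on_new_data(new_deliveries, new_wait_times, INTERCEPT, SLOPE):.3f}")
```

A value on new data that is much lower than the original R-Sq would suggest the model does not predict new observations as well as it fit the old ones.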
48. UNCLASSIFIED / FOUO
What other comments or questions
do you have?