This presentation shares my experience of using Measurement System Analysis on a Lean Six Sigma project and, most importantly, of interpreting the statistical results from Minitab.
2. AGENDA
Objective
What is MSA
Continuous vs. Attribute MSA
Project charter
MSA
Purpose of Attribute Agreement Analysis
Prepare the study
Collect study results
Prepare Minitab data
Run Minitab tool
Attribute Agreement Analysis - Graphs
Fleiss Kappa Statistics
Minitab results – interpretation of all 4 levels of analysis
Kendall’s correlation coefficient
3. Objective
- Share the experience of using Attribute Agreement Analysis tool on a L6S project
- Present data and analysis that were not included in the GB report out
- Learn how to conduct a real Gage R&R study, use Minitab, and interpret statistical results
4. Q: What is MSA?
A: It’s a set of techniques that allow us to answer the question: If I use this GAGE to measure, how much can I trust the
measurements I get?
A measurement system analysis (MSA) evaluates the test method, measuring instruments, and the entire process
of obtaining measurements to ensure the integrity of data used for analysis and to understand the implications of
measurement error for decisions made about a product or process.
MSA is an important element of Six Sigma methodology and of other quality management systems.
Factors affecting measurement systems:
1. Equipment: measuring instrument, calibration, etc.
2. People: operators, training, education, skill, care
3. Process: test method, specification, etc.
4. Samples: materials, items to be tested, sampling plan, etc.
5. Environment: temperature, humidity, conditioning
6. Management: training programs, metrology system, support people, support of quality management systems
5. Continuous vs. attribute MSA
Depending on the type of data, different analyses can be used. The data type can be continuous (time, money, weight, height, length, temperature, etc.) or attribute (count data, Yes or No, Good or Bad, etc.).
When the measure is continuous data, a measurement device (gage, gauge) is also involved. In this situation, a Gage R&R study is done, where R&R stands for repeatability and reproducibility.
A Gage R&R is used to estimate the total variation, the part-to-part variation, and the variation due to the measurement system.
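In equation form (the standard Gage R&R variance decomposition; the notation is ours, not from the original deck):

\sigma^2_{\text{total}} = \sigma^2_{\text{part-to-part}} + \sigma^2_{\text{measurement}}, \qquad
\sigma^2_{\text{measurement}} = \sigma^2_{\text{repeatability}} + \sigma^2_{\text{reproducibility}}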
When the measure is attribute, an Attribute Gage R&R (also called Attribute Agreement Analysis) is used to estimate the total variation.
In the transactional world, most data is attribute and most ‘gages’ are people. When the gage is a person, the term ‘appraiser’ or ‘operator’ is used.
6. Project Charter – GB project
Project Title:
- Improve EMEA VO invoicing reconciliation processes
Project Definition:
- Reduce by 20% the time spent on manual checks and invoice fallouts by improving the reconciliation processes for each of the service providers by end of July 2012
- Standardize the reconciliation process across routes to market by end of July 2012
- Routes to market in scope: EMEA Volume Indirect and Volume Direct Operations
Primary Metric / Goal:
- Reduce time spent on reconciliation process activities by 20%, from 95 h/week to 76 h/week (358 h/month to 286 h/month)
Secondary Metric / Goal:
- Reduce the number of invoice fallouts at service providers by 20%, from 693 fallouts/month to 554 fallouts/month
7. Measurement System Analysis – GB project
Data Integrity Review
The time in hours per week to complete the invoice reconciliation process was measured manually by the L1 team, for each activity step performed in the process, on a daily basis. A human-judgment measurement step is involved.
The monthly count of missing invoices at service providers is an accurate measurement, as it is pulled directly from an Access database that compares SAP billing reports with service provider billing reports.
MSA Consideration
No MSA was conducted for the monthly missing invoices at service providers, as the data comes directly from the internal company and service provider systems.
MSA was considered for the time to complete the reconciliation process, to determine whether it is an accurate measurement.
An Attribute Agreement Analysis was completed to confirm whether the desired accuracy level is met.
The desired accuracy was 95 percent.
8. Purpose of Attribute Agreement Analysis
In the ‘Improve EMEA VO invoice reconciliation processes’ project, the primary metric was the time spent on overall reconciliation activities, so the data type is continuous.
The data was collected from L1 support team feedback, without a measuring system such as a systematic time stamp or a stopwatch. So how do we ensure that we can trust the data, and the decisions based on it, when the data rests on a person’s judgment rather than an objective instrument? One method that works very well is called Attribute Agreement Analysis.
Attribute agreement is a method of comparing the responses made by appraisers (operators) when judging the
characteristics of interest. There are four possible levels of analysis of the responses:
1. Appraisers against themselves - repeatability
2. Appraiser against other appraiser - reproducibility
3. Appraiser against a standard (if one exists)
4. Overall appraiser capability
The present Six Sigma project case study helps explain the tool and how to interpret its results.
9. Prepare the study
To prepare the study you need to establish the number of sample parts, the number of repeated readings, and the
number of operators that will be used. In our case the following was used.
The standard was set by the measurements taken by the expert, the expert being the person owning the process (Alina Lisineanu).
The 3 operators (appraisers) were:
Appraiser 1 - the original reconciliation processor => the person performing the process steps on a day-to-day basis (high level of experience)
Appraiser 2 - the second reconciliation processor => the person occasionally performing the process steps (medium to high level of experience)
Appraiser 3 - the third reconciliation processor => the person trained on the process steps but performing them for the first time (low level of experience)
The samples were established as the main process steps involving manual work. Each sample was measured 3 times, each time on a different day.
Each appraiser was asked to measure the average time taken to perform each of the 8 process steps and rate it using the following options:
a) less than 15 min
b) between 15 - 30 min
c) between 30 - 45 min
d) between 45 - 60 min
e) between 60 - 90 min
f) more than 90 min
10. Collect Study Results
You need to keep track of the study results before moving on to Minitab. It’s important to know how to arrange the data in Minitab so that the tool’s results are the right ones.
The way we did it in the Six Sigma project was to collect each appraiser’s responses for each trial in an Excel file. You can see below how that was done.
Table 1: Trial No. 1
Sample | Date      | Process step                             | Appraiser 1 | Appraiser 2 | Appraiser 3 | Standard
1      | 2-Jan-12  | Format SP billing reports                | d | d | d | d
2      | 3-Jan-12  | Format SAP billing reports               | b | b | b | b
3      | 3-Jan-12  | Upload SP and SAP reports in database    | b | b | b | b
4      | 5-Jan-12  | Run all queries                          | a | a | a | a
5      | 2-Jan-12  | Manually review reconciliation report    | e | e | f | e
6      | 5-Jan-12  | Retrigger failed documents               | b | b | c | b
7      | 6-Jan-12  | Identify business and ask for correction | b | c | c | b
8      | 4-Jan-12  | Operations support (per request)         | a | a | a | a

Table 2: Trial No. 2
Sample | Date      | Process step                             | Appraiser 1 | Appraiser 2 | Appraiser 3 | Standard
1      | 10-Jan-12 | Format SP billing reports                | d | d | d | d
2      | 9-Jan-12  | Format SAP billing reports               | b | b | b | b
3      | 9-Jan-12  | Upload SP and SAP reports in database    | b | b | b | b
4      | 12-Jan-12 | Run all queries                          | a | a | a | a
5      | 11-Jan-12 | Manually review reconciliation report    | e | e | f | e
6      | 11-Jan-12 | Retrigger failed documents               | b | b | d | b
7      | 10-Jan-12 | Identify business and ask for correction | b | c | c | b
8      | 13-Jan-12 | Operations support (per request)         | a | a | a | a

Table 3: Trial No. 3
Sample | Date      | Process step                             | Appraiser 1 | Appraiser 2 | Appraiser 3 | Standard
1      | 19-Jan-12 | Format SP billing reports                | d | d | d | d
2      | 19-Jan-12 | Format SAP billing reports               | b | b | b | b
3      | 15-Jan-12 | Upload SP and SAP reports in database    | b | b | b | b
4      | 16-Jan-12 | Run all queries                          | a | a | a | a
5      | 16-Jan-12 | Manually review reconciliation report    | e | e | d | e
6      | 18-Jan-12 | Retrigger failed documents               | b | b | d | b
7      | 17-Jan-12 | Identify business and ask for correction | b | c | c | b
8      | 17-Jan-12 | Operations support (per request)         | a | a | a | a
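Before opening Minitab, the raw agreement percentages can be sanity-checked directly from the three tables. The sketch below is ours (Python, with the ratings hard-coded from Tables 1-3); it reproduces the assessment-agreement figures discussed on the following slides.

# Raw agreement percentages from Tables 1-3 (a sanity check on the study
# data; Minitab reports the same assessment-agreement figures).
standard = "dbbaebba"          # expert ratings for samples 1..8
ratings = {                    # ratings[appraiser] = one string per trial
    "Appraiser 1": ["dbbaebba", "dbbaebba", "dbbaebba"],
    "Appraiser 2": ["dbbaebca", "dbbaebca", "dbbaebca"],
    "Appraiser 3": ["dbbafcca", "dbbafdca", "dbbaddca"],
}
n = len(standard)

for name, trials in ratings.items():
    per_sample = list(zip(*trials))                    # 3 trial ratings per sample
    within = sum(len(set(s)) == 1 for s in per_sample) # all trials agree
    vs_std = sum(len(set(s)) == 1 and s[0] == ref
                 for s, ref in zip(per_sample, standard))
    print(f"{name}: within {100 * within / n:.1f}%, "
          f"vs standard {100 * vs_std / n:.1f}%")

# "Between appraisers" / "all vs standard": all nine ratings per sample
all_trials = [t for trials in ratings.values() for t in trials]
per_sample = list(zip(*all_trials))
between = sum(len(set(s)) == 1 for s in per_sample)
all_vs = sum(len(set(s)) == 1 and s[0] == ref
             for s, ref in zip(per_sample, standard))
print(f"Between appraisers: {100 * between / n:.1f}%, "
      f"all appraisers vs standard: {100 * all_vs / n:.1f}%")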
11. Prepare Minitab data
The next step was to arrange the data from the 3 tables in a way that lets us run the Attribute Agreement Analysis tool in Minitab.
It’s easier to consolidate the data in Excel and then copy-paste it into Minitab; the sketch below shows the stacked layout.
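The consolidated layout has one row per (sample, appraiser, trial) rating, with the known standard in its own column. A minimal sketch of the consolidation step (pandas; the column names are illustrative, not required by Minitab):

import pandas as pd

# One row per (sample, appraiser, trial). Only sample 1 is shown here;
# the full study has 8 samples x 3 appraisers x 3 trials = 72 rows.
rows = [
    (1, "Appraiser 1", 1, "d", "d"),
    (1, "Appraiser 1", 2, "d", "d"),
    (1, "Appraiser 1", 3, "d", "d"),
    (1, "Appraiser 2", 1, "d", "d"),
    # ... remaining rows from Tables 1-3 follow the same pattern
]
df = pd.DataFrame(rows, columns=["Sample", "Appraiser", "Trial",
                                 "Rating", "Standard"])
print(df)
# These columns can be copy-pasted into a Minitab worksheet and mapped to
# the Attribute / Samples / Appraisers (and known standard) dialog fields.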
12. Run Minitab tool
As mentioned previously, you can copy-paste the data into Minitab or enter it there directly, whichever works best for you.
Here are the instructions to run an Attribute Agreement Analysis in Minitab (Stat > Quality Tools > Attribute Agreement Analysis):
- If your spreadsheet is arranged with all responses in one column, the sample labels in another column, and the appraiser labels in a third, use the Attribute column / Samples / Appraisers section of the dialog.
- Check the box indicating that the categories of the attribute data are ordered if the data is a ranking in degree, numerical or verbal (like 1 through 10, or A, B, C, D, etc.).
13. Attribute Agreement Analysis - Graphs
There are three vertical lines, one for each appraiser. The blue dot shows the level of agreement within their own assessments (left graph) and against the standard (right graph).
The Original Recon Processor, for example, never changed his mind; he always assessed the average time spent per process step the same way (100%). He is also 100% in agreement with the expert, which reflects his high level of experience with the recon processes.
The 2nd Recon Processor, like the original one, never changed his mind and always assessed the average time spent per process step the same way (100%). On the other hand, he is only 87.5% in agreement with the expert, due to the fact that he performs the process only occasionally, acting as the back-up person.
The 3rd Recon Processor is fairly consistent too, though not as consistent as the first two, being 75% in agreement with himself. Against the expert, however, he agrees only 62.5% of the time. The case of the 3rd processor can be explained by his being a newcomer who follows the training documentation to perform the process steps; an improvement in the training and documentation is therefore desired in order to increase the assessment agreement.
14. Fleiss’ Kappa Statistics
Next we move on to interpret the session window statistics from Minitab, but before that here’s a brief explanation
of the Kappa Statistics.
The basis for the kappa statistic is a comparison to random chance. Imagine flipping a coin to make a quality decision on a process; that is random chance. Kappa compares the results gathered through the study with the possibility that those results could have been generated randomly, as if by flipping a coin or rolling a die.
Kappa ranges from -1 to +1, with a value of 0 indicating random chance. If kappa = 1, there is perfect agreement. If kappa = 0, the agreement is the same as would be expected by chance. The stronger the agreement, the higher the value of kappa; negative values occur when agreement is weaker than expected by chance.
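In formula terms (the standard definition, not shown in the original deck): with observed agreement P_o and chance agreement P_e,

\kappa = \frac{P_o - P_e}{1 - P_e}

so kappa is 1 when observed agreement is perfect (P_o = 1) and 0 when it is no better than chance (P_o = P_e).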
The hypotheses regarding kappa go as follows:
H0: The agreement within appraisers is due to chance
H1: The agreement within appraisers is not due to chance
The p-value gives the likelihood of obtaining the sample, with its kappa statistic, if the null hypothesis is true. If the p-value is less than or equal to a predetermined level of significance (alpha level), reject the null hypothesis and conclude in favor of the alternative hypothesis.
Alpha = 0.05 for a 95% level of confidence
15. Minitab results – within appraisers
Did each appraiser rate the average time per process step consistently across the trials?
Appraisers 1 and 2 agreed 100% with themselves across their three trials. Appraiser 3, however, agreed only 75% with himself between the trials.
Appraiser 1 didn’t choose ratings c or f, so kappa could not be calculated for those options. Overall, appraiser 1 agreed with himself 100%, so the kappa for the ‘within’ portion is 1, which indicates perfect agreement. A p-value < 0.05 means we can reject the null hypothesis and conclude that the agreement within the appraiser is not due to chance.
Appraiser 2 didn’t choose rating f, so kappa could not be calculated for that option. Overall, appraiser 2 agreed with himself 100%, so the kappa for the ‘within’ portion is 1, which indicates perfect agreement. Again, the p-value < 0.05 means we reject the null hypothesis and conclude the agreement within the appraiser is not due to chance.
Appraiser 3 didn’t choose rating e, so kappa could not be calculated for that option. Overall, appraiser 3 agreed with himself 75%, so the kappa for the ‘within’ portion is 0.78571, which indicates strong agreement. The p-value < 0.05 again lets us reject the null hypothesis and conclude the agreement within the appraiser is not due to chance.
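Appraiser 3’s within-appraiser kappa of 0.78571 can be reproduced outside Minitab by treating the three trials as three “raters” of the 8 samples and computing Fleiss’ kappa. A sketch using statsmodels (the trial data comes from Tables 1-3; this is our reconstruction, not Minitab output):

import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Appraiser 3's ratings, one string per trial (samples 1..8, Tables 1-3)
trials = ["dbbafcca", "dbbafdca", "dbbaddca"]

# Rows = samples, columns = trials; letters coded a=0 ... f=5
data = np.array([["abcdef".index(t[s]) for t in trials] for s in range(8)])

table, _ = aggregate_raters(data)            # category counts per sample
print(fleiss_kappa(table, method="fleiss"))  # ~0.785714 (Minitab: 0.78571)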
16. Minitab results – each appraiser vs standard
How did each appraiser rate the average time per process step against the standard?
Appraiser 1 agrees 100% of the time with the standard, while appraisers 2 and 3 agree only 87.5% and 62.5%, respectively, with the known standard.
For appraiser 1, all responses are in perfect agreement with the standard, so the kappa for the ‘vs standard’ portion is 1, except for the two options (c and f) for which kappa cannot be computed. The p-value is < 0.05.
For appraiser 2, not all responses are in perfect agreement with the standard, but the overall kappa for the ‘vs standard’ portion is 0.82418, which indicates strong agreement. There is a negative value for option c, indicating that the result for that option is worse than what would be expected by chance. The p-value for this option (0.06280 > 0.05) means that we fail to reject the null hypothesis for it; however, the overall p-value for appraiser 2 is < 0.05.
For appraiser 3, not all responses are in perfect agreement with the standard, and the overall kappa for the ‘vs standard’ portion is 0.49634, which indicates poor agreement; improvement is required. There are negative values for options c and e, indicating results worse than what would be expected by chance. The p-values for these two options are higher than 0.05, which means we fail to reject the null hypothesis for them; however, the overall p-value for appraiser 3 is < 0.05.
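The vs-standard kappas can be approached the same way: pair each of the 24 trial ratings (8 samples x 3 trials) with the standard and treat the appraiser and the standard as two “raters” of 24 items. For appraiser 2 this pooling reproduces the session-window value of 0.82418; note this is our reconstruction of the calculation, not Minitab’s documented formula:

import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

standard = "dbbaebba"                          # expert ratings, samples 1..8
trials = ["dbbaebca", "dbbaebca", "dbbaebca"]  # appraiser 2, trials 1-3

# 24 items, two "raters" each: the appraiser's rating and the standard
pairs = np.array([["abcdef".index(t[s]), "abcdef".index(standard[s])]
                  for t in trials for s in range(8)])
table, _ = aggregate_raters(pairs)
print(fleiss_kappa(table, method="fleiss"))    # ~0.82418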
17. Minitab results – between appraisers
How did each appraiser rate the average time per process step against the other appraisers?
All appraisers agree with each other on 62.5% of the samples in this study; the 95% confidence interval for this agreement rate runs from 24.49% to 91.48%.
The overall kappa for all 3 appraisers is 0.73217, which indicates good agreement, and the p-value < 0.05 means the result is not due to chance. The agreement between the appraisers is perfect when a process step was rated with option a or f, good when they chose option b or d, and improvement is required for options c and e.
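The 24.49% to 91.48% interval is consistent with an exact (Clopper-Pearson) binomial confidence interval for 5 agreements out of 8 samples. A sketch with scipy (our reconstruction of the interval Minitab reports):

from scipy.stats import beta

x, n, alpha = 5, 8, 0.05                   # 5 of 8 samples in full agreement
lower = beta.ppf(alpha / 2, x, n - x + 1)  # Clopper-Pearson exact bounds
upper = beta.ppf(1 - alpha / 2, x + 1, n - x)
print(f"{100 * lower:.2f}% to {100 * upper:.2f}%")  # ~24.49% to ~91.48%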
18. Minitab results – all appraisers vs standard
How did all appraisers rate the average time per process step against the standard?
All appraisers agree with the standard on 62.5% of the samples in this study; the 95% confidence interval for this agreement rate runs from 24.49% to 91.48%.
The overall kappa for all appraisers is 0.77351, which indicates good agreement, and the p-value < 0.05 means the result is not due to chance. The agreement of all appraisers against the standard is perfect when a process step was rated with option a, and good when they chose options b, d, or e. Kappa cannot be computed for options c and f, as these responses do not appear in the standard.
19. Kendall’s correlation coefficient
If the data are ordered, Minitab computes Kendall’s coefficient of concordance for the within- and between-appraiser analyses; if a standard is also known, it computes Kendall’s correlation coefficient for the vs-standard analyses. The correlation coefficient can range from -1 to 1: positive values indicate positive association, negative values indicate negative association, and the higher the magnitude, the stronger the association.
Within Appraisers – Kendall’s Coefficient of Concordance
Appraiser   | Coef    | Chi-Sq  | DF | P
Appraiser 1 | 1.00000 | 21.0000 | 7  | 0.0038
Appraiser 2 | 1.00000 | 21.0000 | 7  | 0.0038
Appraiser 3 | 0.98194 | 20.6208 | 7  | 0.0044

Between Appraisers – Kendall’s Coefficient of Concordance
Coef     | Chi-Sq  | DF | P
0.948595 | 59.7615 | 7  | 0.0000

Each Appraiser vs Standard – Kendall’s Correlation Coefficient
Appraiser   | Coef    | SE Coef  | Z       | P
Appraiser 1 | 1.00000 | 0.166667 | 5.92857 | 0.0000
Appraiser 2 | 0.93541 | 0.166667 | 5.54106 | 0.0000
Appraiser 3 | 0.86947 | 0.166667 | 5.14540 | 0.0000

All Appraisers vs Standard – Kendall’s Correlation Coefficient
Coef     | SE Coef   | Z       | P
0.934962 | 0.0962250 | 9.67517 | 0.0000

The Kendall’s coefficient for all 4 levels of analysis is either 1, indicating perfect agreement, or very close to 1, indicating very good agreement. Using the kappa statistics, we found previously that appraisers 2 and 3 did not always apply certain ratings with absolute consistency. However, the Kendall’s coefficients indicate that these rating discrepancies are not major; that is, the appraisers did not seriously misclassify the average time per process step.
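Because the six rating options are ordered, a quick ordinal cross-check can also be done outside Minitab: code the ratings a=1 through f=6 and compute Kendall’s rank correlation against the standard. A sketch with scipy (scipy’s kendalltau returns tau-b, which is related to but not identical to Minitab’s coefficient; for appraiser 1, whose ratings match the standard exactly, both equal 1):

from scipy.stats import kendalltau

standard = "dbbaebba" * 3    # expert ratings, repeated for the 3 trials
appraiser1 = "dbbaebba" * 3  # appraiser 1 rated identically in every trial

def to_ranks(s):
    return ["abcdef".index(c) + 1 for c in s]   # a=1 ... f=6

tau, p = kendalltau(to_ranks(appraiser1), to_ranks(standard))
print(tau, p)   # tau = 1.0: perfect ordinal agreement with the standard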