The presentation covers important topics such as an introduction to Six Sigma (DMAIC) and the basics of statistics: data, sample & population, data representation, central tendency, data distribution, variance, etc.
8. Let’s discuss -
• Basics of Statistics
• Data & Its Types
• Population & Sampling
• Confidence Level & Interval
• Average Mean, Median & Mode
• Data Presentation
• Variance & Standard Deviation
• Six Sigma – DMAIC & DMADV
• DMAIC – an introduction
• Common Quality Tools
9. Statistics -
The science of collecting,
organizing, presenting,
analyzing & interpreting
Numeric data to assist in
making more effective
decisions.
10. Statistics -
DESCRIPTIVE – To
summarize & describe ‘data’
or ‘information’ collected
through an experiment, a
survey or historical record.
INFERENTIAL – Techniques
by which decisions about a
population are made based
only on an observed sample.
Probability, hypothesis testing,
correlation & regression
analysis are used.
11. Data -
Collection of RAW figures like
numbers, symbols,
characters, images, audio,
video etc. representing
information.
It gives the status of the past
activities and enables us to
make decisions.
Data must be interpreted by a
human or machine to derive
its meaning.
12. Data -
QUALITATIVE DATA –
NOMINAL (with no
inherent order)
ORDINAL (with an ordered
series)
BINARY (with two options)
QUANTITATIVE DATA –
DISCRETE (Counted)
CONTINUOUS
(Measured)
13. • Population is the collection of all
individuals or items under
consideration in a statistical study
• Sample is that part of the
population from which
information is collected.
• Sample is a subset of Population.
• Sample should be a true
representative of population.
• Sample can inform us about the
population
Population & Sample
14. 2 Types of Sampling –
• Probability Sample
• Simple Random Sampling
• Systematic Sampling
• Stratified Sampling etc.
• Non Probability Sample
• Convenience Sampling
• Quota Sampling
• Purposive Sampling etc.
Sampling
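The three probability sampling methods listed above can be sketched in a few lines. This is a minimal illustration, not a survey-grade implementation: the population of 100 records, the "zone" field and the function names are all hypothetical.

```python
import random

# Hypothetical population: 100 customer records, 40 in zone "A" and 60 in zone "B".
population = [{"id": i, "zone": "A" if i < 40 else "B"} for i in range(100)]
rng = random.Random(42)  # fixed seed so the sketch is repeatable


def simple_random_sample(items, k):
    """Simple random sampling: every item has the same chance of selection."""
    return rng.sample(items, k)


def systematic_sample(items, k):
    """Systematic sampling: every (N // k)-th item after a random start."""
    step = len(items) // k
    start = rng.randrange(step)
    return items[start::step][:k]


def stratified_sample(items, k, key):
    """Stratified sampling: sample each stratum in proportion to its size."""
    strata = {}
    for item in items:
        strata.setdefault(item[key], []).append(item)
    sample = []
    for members in strata.values():
        share = round(k * len(members) / len(items))
        sample.extend(rng.sample(members, share))
    return sample


srs = simple_random_sample(population, 10)
systematic = systematic_sample(population, 10)
stratified = stratified_sample(population, 10, "zone")
```

With these assumed numbers the stratified sample always contains 4 records from zone A and 6 from zone B, mirroring the 40/60 split of the population.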
17. Confidence Level–
Probability that the interval estimated
from a sample for a statistical parameter
(e.g. arithmetic mean) contains the TRUE
value for the Population
A survey asked 2,000 Indians over 18 years
whether they were in favor of the smoking
ban in restaurants. Overall, 75% of the
respondents answered 'yes'. The confidence
level for the survey had been set at 95%,
and the margin of error was set to 2%.
It means that if the survey were repeated many
times, about 95% of the resulting intervals would
contain the true population value.
Confidence Level & Confidence Interval
Confidence Interval–
Sample estimate +/- the Margin Of Error
Range around the sample value of a
statistical parameter (e.g. arithmetic mean)
that is expected to contain the value
for the entire Population
A survey asked 2,000 Indians over 18 years
whether they were in favor of the smoking ban in
restaurants. Overall, 75% of the respondents
answered 'yes'. The confidence level for the
survey had been set at 95%, and the margin of
error was set to 2%.
It means that the responses for the entire
population are expected to lie between 73%
(75%-2%) and 77% (75%+2%).
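The 2% margin of error in the survey example can be checked with the usual normal-approximation formula for a proportion. The sketch below assumes a simple random sample and uses z = 1.96 for a 95% confidence level; the variable names are illustrative.

```python
import math

n = 2000       # respondents (from the survey example)
p_hat = 0.75   # share answering 'yes'
z = 1.96       # standard normal critical value for a 95% confidence level

# Normal-approximation margin of error for a sample proportion
margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - margin, p_hat + margin

print(f"margin of error = {margin:.3f}")
print(f"95% interval = {lower:.3f} to {upper:.3f}")
```

The computed margin comes out at roughly 0.019, which rounds to the 2% quoted in the example and gives the 73% to 77% interval.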
18. • Classes of equal size into
which the range of a variable is
divided.
• Used to represent data
through Frequency Tables,
Graphs, Histograms, Bell
Curves etc.
Data Presentation – CLASS INTERVAL
Class Interval (Scores) | Frequency (occurrence) | Cumulative Frequency
0-20 | |
21-40 | |
41-60 | |
61-80 | |
81-100 | |
19. • Number of occurrences
• To summarize large volume
of data
• Used to represent data
through Graphs,
Histograms, Bell Curves etc.
Data Presentation - FREQUENCY
Class Interval (Scores) | Frequency (occurrence) | Cumulative Frequency
0-20 | 0 | 0
21-40 | 0 | 0
41-60 | 4 | 4
61-80 | 10 | 14
81-100 | 13 | 27
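A frequency table like the one above can be built by counting scores per class interval and keeping a running total. The raw scores below are hypothetical; they were chosen only so that the counts match the table.

```python
from itertools import accumulate

# Hypothetical raw test scores (27 values), chosen to reproduce the table above.
scores = [45, 52, 58, 60,
          61, 64, 66, 68, 70, 72, 75, 77, 79, 80,
          81, 83, 85, 86, 88, 90, 91, 93, 95, 96, 98, 99, 100]

classes = [(0, 20), (21, 40), (41, 60), (61, 80), (81, 100)]

# Frequency: number of scores falling inside each class interval (bounds inclusive)
frequencies = [sum(1 for s in scores if lo <= s <= hi) for lo, hi in classes]

# Cumulative frequency: running total of the frequencies
cumulative = list(accumulate(frequencies))

for (lo, hi), f, c in zip(classes, frequencies, cumulative):
    print(f"{lo}-{hi} | {f} | {c}")
```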
20. • Number expressing the
central or typical value in
a set of data.
• MEAN = Sum of values /
Count of values (n)
• Also called Arithmetic
Mean
• Denoted by X̄ (X-bar)
Average Mean, Median & Mode
• MEDIAN represents the middle
number in a given
sequence of numbers
when it is ordered by rank.
• MEDIAN position = (n+1)/2 for odd n;
for even n, the average of the values
at positions n/2 and (n/2)+1
• MODE is the most frequent value in a
set of data.
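The three measures can be computed with Python's standard `statistics` module. The score lists are hypothetical and only illustrate the odd-count and even-count median rules.

```python
import statistics

scores_odd = [62, 75, 75, 81, 90]   # hypothetical scores, n = 5
scores_even = [62, 75, 81, 90]      # hypothetical scores, n = 4

mean = statistics.mean(scores_odd)             # sum of values / count of values
median_odd = statistics.median(scores_odd)     # value at position (n+1)/2 = 3
median_even = statistics.median(scores_even)   # average of the values at positions 2 and 3
mode = statistics.mode(scores_odd)             # most frequent value

print(mean, median_odd, median_even, mode)
```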
21. • Presents grouped data with
rectangular bars
• Lengths proportional to the
values that they represent
Data Presentation – Bar Charts
[Bar chart "TEST SCORES": Scores % (0 to 100) for A to T]
22. • Diagram consisting of
rectangles whose area is
proportional to the
frequency of a variable.
• Width is equal to the class
interval
Data Presentation – Histogram
23. • Shows how the values of a
variable are distributed around
the mean value
• The most common bell curve is
the Normal Distribution
Data Presentation – Bell Curve
24. • Difference between expected and
actual output.
• Variance is everywhere: every
process has variance.
• The lower the Variance, the better the
process efficiency.
• Always aim to control the
variance.
• Sum of squared differences from
the mean divided by the total
sample count.
• Used to calculate Standard
Deviation
Variance
25. • Shows the variation in data
• If the data is close together,
standard deviation will be small
• Denoted by the Greek letter SIGMA (σ)
Standard Deviation
26. Normal Distribution & Standard Deviation
1 Sigma - 68%
2 Sigma - 95%
3 Sigma - 99.73 %
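The 68% / 95% / 99.73% figures follow from the normal distribution: the share of values within k standard deviations of the mean equals erf(k/√2). A short check using only the standard library (the helper name `share_within` is illustrative):

```python
import math


def share_within(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"{k} Sigma - {share_within(k):.2%}")
```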
27. Class Interval | Mid Number (M) | Frequency (F) | (FM) | MEAN (X̄) | Deviation from Mean (M – X̄) | Squared Difference from Mean (X²) | (FX²)
0-10 | 5 | 2 | 10 | 49.33 | -44.3 | 1962.49 | 3924.98
11-20 | 15 | 6 | 90 | 49.33 | -34.3 | 1176.49 | 7058.94
21-30 | 25 | 12 | 300 | 49.33 | -24.3 | 590.49 | 7085.88
31-40 | 35 | 25 | 875 | 49.33 | -14.3 | 204.49 | 5112.25
41-50 | 45 | 34 | 1530 | 49.33 | -4.3 | 18.49 | 628.66
51-60 | 55 | 30 | 1650 | 49.33 | 5.7 | 32.49 | 974.7
61-70 | 65 | 20 | 1300 | 49.33 | 15.7 | 246.49 | 4929.8
71-80 | 75 | 15 | 1125 | 49.33 | 25.7 | 660.49 | 9907.35
81-90 | 85 | 5 | 425 | 49.33 | 35.7 | 1274.49 | 6372.45
91-100 | 95 | 1 | 95 | 49.33 | 45.7 | 2088.49 | 2088.49
Total | | n=150 | 7400 | | | | 48083.5
Calculating Standard Deviation -
Average Mean = Sum of FM / Sum of Frequency (F) = 7400/150 = 49.33
Variance = Sum of FX² / n = 48083.5/150 = 320.56
Standard Deviation (Sigma) = √Variance = √320.56 = 17.9
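The grouped-data calculation can be reproduced from the class midpoints and frequencies in the table. This sketch keeps the mean unrounded, so the sum of squared differences differs slightly from the table's 48083.5 (which uses deviations rounded to one decimal), but the result is still about 17.9.

```python
import math

# Class midpoints (M) and frequencies (F) taken from the table above
midpoints = [5, 15, 25, 35, 45, 55, 65, 75, 85, 95]
frequencies = [2, 6, 12, 25, 34, 30, 20, 15, 5, 1]

n = sum(frequencies)                                               # 150
mean = sum(f * m for f, m in zip(frequencies, midpoints)) / n      # sum of FM / n

# Variance: frequency-weighted squared differences from the mean, divided by n
variance = sum(f * (m - mean) ** 2 for f, m in zip(frequencies, midpoints)) / n
std_dev = math.sqrt(variance)

print(f"mean = {mean:.2f}, variance = {variance:.2f}, sigma = {std_dev:.1f}")
```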
31. Process Capability is a measurable
property of a process relative to its specification.
Expressed as a process capability index
(e.g., Cpk or Cpm)
Two parts of process capability are:
1) Measure the variability of the output of a
process
2) Compare that variability with a
specification or product tolerance.
Process Capability
Cp = (USL – LSL) / 6σ
Cpk = min(USL – Mean, Mean – LSL) / 3σ
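In the usual textbook convention, (USL – LSL)/6σ is labeled Cp and measures spread only, while Cpk also accounts for how well the process is centered between the limits. The sketch below shows both; the function names and the example numbers (limits of ±30, mean 2, sigma 10) are assumptions chosen for illustration.

```python
def cp(usl, lsl, sigma):
    """Potential capability: specification width compared with the 6-sigma process spread."""
    return (usl - lsl) / (6 * sigma)


def cpk(usl, lsl, mean, sigma):
    """Actual capability: distance from the mean to the nearer limit, in units of 3 sigma."""
    return min(usl - mean, mean - lsl) / (3 * sigma)


# Assumed example: limits at -30 and +30, process mean 2, standard deviation 10
print(f"Cp  = {cp(30, -30, 10):.2f}")
print(f"Cpk = {cpk(30, -30, 2, 10):.2f}")
```

When the process is perfectly centered the two indices are equal; the further the mean drifts from the center, the lower Cpk falls below Cp.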
32. FOCUS Of Six Sigma
Which one should be
controlled???
Input
OR
Output
‘Y’
OUTPUT
EFFECT
RESULT
Dependent
Symptom
Monitor
‘X’
INPUT
CAUSE
FACTOR
Independent
Problem
Infuse
41. • VOC to be further transformed
into CTQs.
• EG. A requirement of the
customer is that the tiffins
should be delivered on time.
Thus for the customer, Delivery
is Critical to the Quality
(CTQ).
• TOOLS USED –
• VOICE OF CUSTOMER/CTQ
TREE
• AFFINITY DIAGRAM
DEFINE – IDENTIFY PROJECT CTQ
42. DEFINE – #CTQ TREE
Delivery
On Time –
Within +/- 30
min of Lunch
Time
43. • BUSINESS CASE – WHY?
• PROBLEM STATEMENT –
WHAT?
• GOAL STATEMENT – WHEN?
• ROLES – WHO?
• TOOLS USED –
• STAKEHOLDER ANALYSIS
• GANTT CHARTS
DEFINE – DEVELOP PROJECT CHARTER
45. • Clearly define In & Out of
Scope Items
• TOOLS USED –
• SIPOC
• STRATIFICATION ANALYSIS
DEFINE – IDENTIFY PROJECT SCOPE
46. DEFINE – #STRATIFICATION ANALYSIS
  | Is | Is Not | Distinctions
Geography | INDIRAPURAM | VAISHALI & VASUNDHARA | VAISHALI & VASUNDHARA are subcontracted
Output | Delivery time | Mixups, Hygiene, Temperature |
Customer | Lower and Middle Income | Higher Income | Premium service for higher income group
Time | After Jul’18 | Before Jul’18 | Increased employees in Jul’18
51. • Collect Data for the identified CTQ
• Translate CTQ to the measurable
output Y
• Eg for delivery –
• No. of deliveries delivered during
shipping window
• Time taken to travel from DABBAWALA
to customer’s place
• Actual delivery time perceived by the
customer
• MEASUREMENT SYSTEM
ASSESSMENT
• TOOLS USED –
• BRAINSTORMING
• CHECK SHEETS
MEASURE– DATA COLLECTION, PROJECT Y & MSA
53. • What is the definition of a
defect?
• What is customer’s
requirement on delivery time?
EG –
For Delivery
-Capture Target Delivery Time
-Get the allowed specification
limits on Y
MEASURE– DEFINE PERFORMANCE STANDARDS
54. [Figure: Visualize customer requirements. Axis in MINUTES from 45 early (EARLY DELIVERY) to 45 late (LATE DELIVERY)]
0 minutes: This is the target delivery time
LSL (30 minutes early): Customer does not want earlier than this
USL (30 minutes late): Customer does not want later than this
The customer tolerance window is 30 minutes on either side
MEASURE– DEFINE PERFORMANCE STANDARDS
56. ANALYZE – ESTABLISH PROCESS CAPABILITY
• What are the chances of your
process creating defects?
• Measure Variation in the
current process
• TOOLS USED –
• HISTOGRAM
• BOX PLOT
• STANDARD DEVIATION
58. [Histogram: Measure the process output. X-axis in MINUTES from 45 early (EARLY DELIVERY) to 45 late (LATE DELIVERY); Y-axis: NUMBER OF DATA POINTS]
Dabbawala has an
average delivery time (μ) of
2 minutes late and a
standard deviation (σ) of
10 minutes
ANALYZE – #HISTOGRAM
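Given the mean of 2 minutes late and the standard deviation of 10 minutes, the chance of a delivery falling outside the customer's ±30 minute window can be estimated. This sketch assumes delivery times are normally distributed, which the slides suggest but do not state; it uses `statistics.NormalDist` from the standard library.

```python
from statistics import NormalDist

# Delivery time offset in minutes (negative = early, positive = late),
# assumed to follow a normal distribution with the slide's mean and sigma.
delivery = NormalDist(mu=2, sigma=10)
lsl, usl = -30, 30   # customer tolerance window of 30 minutes on either side

p_early = delivery.cdf(lsl)       # probability of arriving more than 30 minutes early
p_late = 1 - delivery.cdf(usl)    # probability of arriving more than 30 minutes late
p_defect = p_early + p_late

print(f"P(defect) = {p_defect:.4%}")
```

Under these assumptions roughly 0.3% of deliveries miss the window, and most of those misses are late rather than early because the mean sits 2 minutes past the target.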
59. [Figure: Measure the Process Output. Axis in MINUTES from 45 early (EARLY DELIVERY) to 45 late (LATE DELIVERY), with LSL and USL marked]
Z = Customer tolerance / Process standard deviation
Z = 30 minutes / 10 minutes
Z = 3
ANALYZE – #STANDARD DEVIATION
60. ANALYZE – #SIGMA LEVEL
• Z OR SIGMA LEVEL
determines the
process capability.
• Z value = 3 × Cpk
• Z can be calculated as
(USL – Mean) / Standard
Deviation
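A small sketch of the Z calculation for the delivery example. Taking the distance to the nearer specification limit is a common generalization of the (USL – Mean) formula and is an assumption here, as is the function name `sigma_level`.

```python
def sigma_level(usl, lsl, mean, sigma):
    """Z: distance from the process mean to the nearer spec limit, in standard deviations."""
    return min(usl - mean, mean - lsl) / sigma


# Centered process: tolerance of 30 minutes and sigma of 10 minutes
z_centered = sigma_level(usl=30, lsl=-30, mean=0, sigma=10)

# Same process with the observed mean of 2 minutes late
z_actual = sigma_level(usl=30, lsl=-30, mean=2, sigma=10)

print(z_centered, z_actual, z_actual / 3)   # the last value is Cpk = Z / 3
```

The centered case reproduces Z = 3; with the mean 2 minutes late, the USL side is closer and Z drops to about 2.8.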
61. [Figure, two panels, each showing the process distribution against Target, LSL and USL]
Excessive Variation in Process: Reduce Spread
Process Off Target: Center Process
ANALYZE – DEFINE PERFORMANCE OBJECTIVE
• Check the actual
problem of your
process –
• Shift the target OR
reduce the variance
• TOOLS USED –
• DATA DISTRIBUTION
62. ANALYZE – IDENTIFY SOURCES OF VARIANCE
• Identify possible
factors leading to the
problem
• TOOLS USED –
• FISHBONE OR
CAUSE & EFFECT
OR ISHIKAWA
• FMEA
• BRAINSTORMING
• PARETO
63. ANALYZE – #FISHBONE
Delivery
Time
MACHINE MOTHER NATURE
MAN METHODS
Poor
dispatching
Delivery person gets lost
Delivery person does
not show up
Poor handling of large orders
Run out of storage space
on vehicles
sacks
Don’t know
routes
High turnover
Get wrong
Did not
understand
labels
No
teamwork
No training
Unreliable bikes
Delivery persons own junk
Can’t locate
employees homes
Not on std routes
Did not
understand
labels
Too few
delivery
persons
Uneven distribution of
delivery loads
No money for
repairs
per person
Too few delivery
persons
Too many orders
Large items
difficult to carry
in bus /bikes
Too many
MATERIALS
Sacks
too small
Too much traffic
Weather
Bus service
unreliable in peak
hours
Parking space
problem
64. ANALYZE – #PARETO
• 80-20 RULE
• Identify VITAL few – CUT
OFF level to be decided
by the team on the basis
of
• Process Knowledge
• Resource Availability
• TOOLS USED –
• FISHBONE OR CAUSE
& EFFECT OR
ISHIKAWA
• FMEA
• BRAINSTORMING
• PARETO
[Pareto chart categories: Others, Location, Order Size, Parking, Traffic, Label]
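A Pareto analysis ranks the causes by how often they occur and keeps adding causes until the chosen cut-off is reached. In the sketch below the category names come from the chart, but the counts are hypothetical and the 80% cut-off is only the conventional starting point; as noted above, the team decides the actual level.

```python
# Hypothetical counts of delivery problems per cause (category names from the chart)
counts = {"Label": 42, "Traffic": 27, "Parking": 12,
          "Order Size": 9, "Location": 6, "Others": 4}
cutoff = 0.80   # assumed cut-off level; to be decided by the team

total = sum(counts.values())
ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)

vital_few = []
running = 0
for cause, count in ranked:
    if running / total >= cutoff:
        break                      # cut-off already reached by the earlier causes
    vital_few.append(cause)
    running += count
    print(f"{cause}: {count} ({running / total:.0%} cumulative)")

print("Vital few:", vital_few)
```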
66. IMPROVE – SCREEN & VALIDATE SOLUTIONS
• Screen all possible
solutions
• Discover causal
relationship & identify
the most effective ones
• TOOLS USED –
• POKA YOKE
• 5S
• SCATTER PLOTS
• REGRESSION
ANALYSIS
[Chart label: DELIVERY TIME SPAN]
67. IMPROVE – #SCATTER PLOT
• TWO dimensional data
visualization
• Dots represent values for
two different variables
• Also called Correlation
Plots
• Discover causal
relationship & identify the
most effective ones
• TOOLS USED –
• POKA YOKE
• 5S
• SCATTER PLOTS
• REGRESSION ANALYSIS
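The strength of the relationship seen in a scatter plot can be summarized by the Pearson correlation coefficient, and a least-squares line gives a simple regression. The data below is hypothetical (lunches carried per person against minutes of delay). A strong correlation alone does not prove a causal link; it only flags a factor worth validating.

```python
import math

# Hypothetical paired observations
lunches_per_person = [10, 15, 20, 25, 30]   # X: input factor
delay_minutes = [2, 5, 9, 14, 20]           # Y: output

n = len(lunches_per_person)
mean_x = sum(lunches_per_person) / n
mean_y = sum(delay_minutes) / n

# Sums of squares and cross-products around the means
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lunches_per_person, delay_minutes))
sxx = sum((x - mean_x) ** 2 for x in lunches_per_person)
syy = sum((y - mean_y) ** 2 for y in delay_minutes)

r = sxy / math.sqrt(sxx * syy)        # Pearson correlation coefficient
slope = sxy / sxx                     # least-squares regression slope
intercept = mean_y - slope * mean_x   # least-squares regression intercept

print(f"r = {r:.3f}, delay = {slope:.2f} * lunches + {intercept:.2f}")
```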
71. CONTROL – ESTABLISH NEW PROCESS CAPABILITY
• Gauge process
performance and
establish new Process
Capability
[Figure: new process output. Axis in MINUTES from 45 early (EARLY DELIVERY) to 45 late (LATE DELIVERY), with LSL and USL marked]
Z = 4.5
72. CONTROL – IMPLEMENT PROCESS CONTROL
• Control Plan should be in
place to ensure
sustained improvement
• Use Of Control Charts
• Documentation of
Control Plan
• Standardization
• TOOLS USED –
• CONTROL CHARTS
• RUN CHARTS
• PROCESS FLOW
• SOP
[Chart: No. of lunches per person, axis from 0 to 30]
73. CONTROL – #RUN CHARTS
• Line Graph of data
plotted over TIME
• Shows Trend and
Pattern in a process
• Shows how the
process is performing
• Unlike Control Charts,
RUN CHARTS cannot
say if a process is
stable
74. CONTROL – #CONTROL CHARTS
• Line Graph of data
plotted over TIME
• Shows Trend and
Pattern in a process
• Central Line for
Average with Upper
Control Limit & Lower
Control Limit helps to
gauge whether the process is
stable
• Two Types – X-BAR &
R CHARTS
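Control limits for X-bar and R charts are usually computed from subgroup averages and ranges using tabulated constants. The sketch below uses the standard constants for subgroups of five (A2 = 0.577, D3 = 0, D4 = 2.114); the delivery-time data is hypothetical.

```python
# Hypothetical delivery-time offsets in minutes, five deliveries sampled per day
subgroups = [
    [1, -2, 3, 0, 2],
    [4, 1, -1, 2, 3],
    [0, 2, 5, -3, 1],
    [2, 3, 1, 4, 0],
]
A2, D3, D4 = 0.577, 0.0, 2.114   # standard control chart constants for subgroup size 5

xbars = [sum(g) / len(g) for g in subgroups]    # subgroup averages
ranges = [max(g) - min(g) for g in subgroups]   # subgroup ranges

xbarbar = sum(xbars) / len(xbars)   # central line of the X-bar chart
rbar = sum(ranges) / len(ranges)    # central line of the R chart

ucl_x, lcl_x = xbarbar + A2 * rbar, xbarbar - A2 * rbar
ucl_r, lcl_r = D4 * rbar, D3 * rbar

print(f"X-bar chart: CL={xbarbar:.2f}, UCL={ucl_x:.2f}, LCL={lcl_x:.2f}")
print(f"R chart:     CL={rbar:.2f}, UCL={ucl_r:.2f}, LCL={lcl_r:.2f}")
```

Subgroup averages or ranges that fall outside these limits would suggest the process is not stable. In practice more subgroups are collected than the four shown here before fixing the limits.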