2. Cluster analysis
A class of techniques used to classify objects or
cases into relatively homogeneous groups
called clusters. Also known as classification
analysis or numerical taxonomy.
Example: Clustering of respondents on
variables such as quality consciousness (var1) and
price sensitivity (var2).
It requires no prior information about group membership in the sample.
3. Uses of Cluster Analysis
• Segmenting the market (benefits sought)
• Understanding buyer behavior
• Assessing new product opportunities (brands or
markets)
• Selecting test markets (grouping cities)
• Reducing data (working with clusters instead of individual cases)
4. Steps
• Formulate the problem: select relevant
variables measured on an interval scale.
• Select a distance measure: how similar or
different are the objects?
For example, Euclidean distance.
• Select a clustering procedure.
• Interpret and profile the clusters.
• Assess the reliability of the clustering.
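The distance-measure step can be illustrated with a short Python sketch. The two respondents and their scores on var1 (quality consciousness) and var2 (price sensitivity) are invented for illustration; they are not taken from the data behind these slides.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two cases measured on the same variables."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical scores: (var1 quality consciousness, var2 price sensitivity)
respondent_a = (6, 2)
respondent_b = (3, 6)

print(euclidean(respondent_a, respondent_b))  # 5.0
```

The smaller the distance, the more alike the two respondents are on the clustering variables.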
6. Steps in SPSS
1. Select ANALYZE from the SPSS menu bar.
2. Click CLASSIFY and then HIERARCHICAL
CLUSTER.
3. Move the clustering variables into the VARIABLES box.
4. In the CLUSTER box check CASES. In the DISPLAY box check
STATISTICS and PLOTS.
5. Click on STATISTICS. In the pop-up window check
agglomeration schedule. In cluster
membership
8. Agglomeration Schedule
• “Stage”: the first line is stage 1, which leaves 19 clusters.
• “Clusters Combined”: respondents 14 and 16 are combined
at this stage.
• “Coefficients”: the Euclidean distance between the two
respondents being combined.
• “Stage Cluster First Appears”: the stage at
which a cluster being combined was first formed. An entry of 1 at stage 6
indicates that respondent 14 was first grouped at stage 1.
• “Next Stage”: the stage at which another case or cluster is
combined with this one. The number is 6, so at stage 6
respondent 10 is combined with the cluster containing 14 to form a single cluster.
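How such a schedule comes about can be sketched in a few lines of Python. This is a minimal illustration, not the SPSS procedure: it assumes single linkage (the slides do not say which linkage method was used), the five two-variable cases are invented, and the function name `agglomeration_schedule` is hypothetical.

```python
import math
from itertools import combinations

def agglomeration_schedule(cases):
    """Single-linkage agglomeration: one (stage, a, b, coefficient) row per merge.

    Clusters are named after their lowest case number, loosely mirroring
    the "Clusters Combined" column.
    """
    clusters = {i + 1: [i + 1] for i in range(len(cases))}  # 1-based case ids

    def linkage(c1, c2):
        # Single linkage: distance between the two closest members
        return min(math.dist(cases[p - 1], cases[q - 1])
                   for p in clusters[c1] for q in clusters[c2])

    schedule = []
    stage = 1
    while len(clusters) > 1:
        a, b = min(combinations(sorted(clusters), 2),
                   key=lambda pair: linkage(*pair))
        schedule.append((stage, a, b, round(linkage(a, b), 3)))
        clusters[a] += clusters.pop(b)  # merged cluster keeps the lower id
        stage += 1
    return schedule

# Hypothetical cases measured on two variables
cases = [(1, 1), (1, 2), (5, 5), (7, 5), (10, 2)]
schedule = agglomeration_schedule(cases)
for row in schedule:
    print(row)
```

With n cases there are n - 1 stages, which is why 20 respondents give a schedule of 19 stages.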
9. Icicle plot
• Columns correspond to the objects being clustered, 1
through 20.
• Rows correspond to the number of clusters.
• The figure is read from bottom to top.
• At first every case is considered its own cluster, giving 20 initial clusters.
• In the first step the two closest objects are combined, resulting
in 19 clusters: respondents 14 and 16 are combined, marked by the X’s.
• Row 18 corresponds to 18 clusters: respondents 6 and 7 are
combined. At this point 16 clusters are individual respondents and two contain two
respondents each.
• Each subsequent step leads to a new cluster.
10. Dendrogram
• Read from left to right.
• Vertical lines represent clusters that are joined
together.
• The position of the line represents the distance at
which the clusters were joined.
• In the early stages the distances are similar, so the order of joining is hard to tell apart;
as the distances increase, the cluster structure becomes clear.
11. Deciding the Clusters
• Practical, theoretical, or conceptual
considerations guide the choice of the number of
clusters.
• In hierarchical clustering, the distances at
which clusters are combined are one criterion. The value in the
“Coefficients” column suddenly more than
doubles between stages 17 (three clusters)
and 18 (two clusters), which points to a three-cluster solution.
The same jump can be seen in the last two
stages of the dendrogram.
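The "sudden jump" criterion can be sketched in Python as follows. The function name `suggest_cluster_count` and the coefficient values are hypothetical; the real coefficients come from the agglomeration schedule. The function looks for the largest relative increase between consecutive stages and reports how many clusters exist just before that jump.

```python
def suggest_cluster_count(coefficients, n_cases):
    """Number of clusters present just before the largest relative jump.

    coefficients[i] is the merge distance at stage i + 1; after stage s,
    n_cases - s clusters remain.
    """
    if len(coefficients) < 2:
        raise ValueError("need at least two stages to compare")
    best_stage, best_ratio = 1, 0.0
    for stage in range(1, len(coefficients)):
        prev, curr = coefficients[stage - 1], coefficients[stage]
        if prev > 0 and curr / prev > best_ratio:
            best_stage, best_ratio = stage, curr / prev
    return n_cases - best_stage

# Hypothetical coefficients for 6 cases (5 stages), not the values from the slides
coefficients = [1.0, 1.5, 2.0, 2.5, 6.0]
print(suggest_cluster_count(coefficients, n_cases=6))  # 2
```

With 20 cases and the largest jump between stages 17 and 18, the same rule returns 20 - 17 = 3 clusters. It is only one input alongside the practical and conceptual considerations.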
12. Interpreting and profiling the clusters
• Cluster 1: high values on V1 (shopping is fun) and V3 (I
combine shopping with eating out), and a low value on V5 (I
don’t care about shopping). Cluster 1 can be labeled “fun-loving
and concerned shoppers”. It consists of respondents
(cases) 1, 3, 6, 7, 8, 12, 15 and 17.
• Cluster 2 is just the opposite, with low values on V1 and V3 and
a high value on V5, so it can be labeled “apathetic shoppers”. It
consists of cases 2, 5, 9, 11, 13 and 20.
• Cluster 3 has high values on V2 (shopping upsets budget), V4 (I
try to get best buys) and V6 (comparing saves money), so it
can be labeled “economical shoppers”. It consists of cases 4,
10, 14, 16, 18 and 19.
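Profiling amounts to comparing the mean of each variable within each cluster. A minimal Python sketch, assuming invented ratings for four cases on V1, V3 and V5 (the raw data are not reproduced in these slides) and a hypothetical helper `cluster_centroids`:

```python
from statistics import mean

def cluster_centroids(data, membership):
    """Mean of every variable within each cluster.

    data: {case_id: [v1, v3, v5]}, membership: {case_id: cluster_label}
    """
    grouped = {}
    for case_id, label in membership.items():
        grouped.setdefault(label, []).append(data[case_id])
    return {label: [mean(col) for col in zip(*rows)]
            for label, rows in grouped.items()}

# Hypothetical ratings on V1, V3, V5 (7-point scale) for four cases
data = {1: [6, 7, 2], 3: [6, 5, 2], 2: [2, 1, 6], 5: [2, 3, 6]}
membership = {1: 1, 3: 1, 2: 2, 5: 2}

centroids = cluster_centroids(data, membership)
print(centroids)
```

A cluster with high means on V1 and V3 and a low mean on V5 would be read as the "fun-loving" profile; the reverse pattern as the "apathetic" one.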
14. • The initial cluster centers are the values of three
randomly selected cases. Each case is assigned to the
nearest classification cluster center.
• The results also display the cluster membership and
the distance between each case and its classification
center.
• Cluster 1 of the hierarchical clustering is the same as cluster
3 of the non-hierarchical clustering.
• Cluster 3 of the hierarchical clustering is the same as cluster
1 of the non-hierarchical clustering.
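The assignment step of non-hierarchical clustering can be sketched as follows. This shows a single assignment pass only, not the full iterative procedure; the cases, the centers, and the name `assign_to_nearest` are assumptions made for illustration.

```python
import math

def assign_to_nearest(cases, centers):
    """Return (cluster_number, distance) for each case; clusters are numbered from 1."""
    result = []
    for case in cases:
        distances = [math.dist(case, center) for center in centers]
        nearest = min(range(len(centers)), key=distances.__getitem__)
        result.append((nearest + 1, distances[nearest]))
    return result

# Hypothetical two-variable cases and three initial centers
cases = [(1, 1), (2, 1), (8, 8), (9, 9), (1, 9)]
centers = [(1, 1), (9, 9), (1, 9)]

assignments = assign_to_nearest(cases, centers)
for case, (cluster, distance) in zip(cases, assignments):
    print(case, cluster, round(distance, 3))
```

Each output row pairs a case with its cluster membership and its distance from the classification center, the two quantities the results table reports.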
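15. • The distances between the final cluster centers
indicate that the pairs of clusters are well
separated.
• A univariate F test for each clustering variable is
presented. It is only descriptive, since the clusters were formed to maximize the differences among cases in different clusters.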
17. • The AIC is at a minimum (97.594) for a three-cluster
solution. A comparison of cluster centroids
shows that cluster 1 (two-step)
corresponds to cluster 2 (hierarchical) and cluster
2 (two-step) corresponds to cluster
3 (hierarchical).
• The agreement between the results supports the validity of the
clustering.
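Because cluster numbers are arbitrary labels, comparing two solutions means matching clusters by the cases they share. A minimal Python sketch with invented memberships for six cases; the relabelling mirrors the correspondence described above, and `match_labels` and `agreement` are hypothetical helper names.

```python
from collections import Counter

def match_labels(labels_a, labels_b):
    """For each cluster in solution A, find the cluster in B sharing the most cases."""
    overlap = Counter(zip(labels_a, labels_b))
    mapping = {}
    for (a, b), _count in overlap.most_common():
        mapping.setdefault(a, b)  # first hit per A-label has the largest overlap
    return mapping

def agreement(labels_a, labels_b):
    """Share of cases whose B-label equals the B-cluster matched to their A-label."""
    mapping = match_labels(labels_a, labels_b)
    hits = sum(mapping[a] == b for a, b in zip(labels_a, labels_b))
    return hits / len(labels_a)

# Hypothetical memberships for six cases under two methods
hierarchical = [1, 1, 2, 2, 3, 3]
two_step = [3, 3, 1, 1, 2, 2]

print(match_labels(hierarchical, two_step))  # {1: 3, 2: 1, 3: 2}
print(agreement(hierarchical, two_step))     # 1.0
```

An agreement of 1.0 means the two methods put exactly the same cases together, which is the sense in which identical results support the validity of the solution.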