This document discusses exploratory factor analysis (EFA). EFA is used to identify underlying factors that explain the pattern of correlations within a set of observed variables. The document outlines the steps of EFA, including testing assumptions, constructing a correlation matrix, determining the number of factors, rotating factors, and interpreting the factor loadings. It provides an example of running EFA on a dataset with 11 physical performance and anthropometric variables from 21 participants. The analysis extracts 3 factors that explain over 80% of the total variance.
1. AN INTRODUCTION TO EXPLORATORY
FACTOR ANALYSIS
A Presentation by
• RAKESH KUMAR
• MUKESH CHANDRA BISHT (PhD Scholar, LNIPE)
2. “When you CAN MEASURE what you are speaking about and
express it in numbers, you know something about it; but
when you CANNOT express it in numbers your knowledge is
of a meagre and unsatisfactory kind.”
Measurement is necessary.
LORD KELVIN, British Scientist
3. FIRST NOTABLE MENTION
Charles Edward Spearman is known for his
seminal work on testing and measuring HUMAN
INTELLIGENCE using FACTOR ANALYSIS
in the early 1900s.
CHARLES EDWARD SPEARMAN
(BRITISH PSYCHOLOGIST)
4. A factor is a linear combination of variables. It
is a construct that is not directly observed but
that needs to be inferred from the input
variables.
5. • Variable reduction technique
• Reduces a set of variables to a small number of
latent (unobservable) factors.
• Factor analysis is a correlational method used to find and
describe the underlying factors driving data values for a
large set of variables.
6. SIMPLE PATH DIAGRAM FOR A FACTOR ANALYSIS MODEL
• F1 and F2 are two common factors. Y1, Y2, Y3, Y4, and Y5 are observed
variables, possibly 5 subtests or measures of other observations such as
responses to items on a survey.
• e1, e2, e3, e4, and e5 represent residuals or unique factors, which are assumed
to be uncorrelated with each other.
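The path diagram can also be written as a set of equations. The sketch below is one common way to write it; the loading symbols \lambda_{i1} and \lambda_{i2} are notation introduced here for illustration (they do not appear on the slide), and it assumes each observed variable may load on both common factors.

```latex
\begin{aligned}
Y_i &= \lambda_{i1} F_1 + \lambda_{i2} F_2 + e_i, \qquad i = 1, \dots, 5, \\
\operatorname{Cov}(e_i, e_j) &= 0 \quad \text{for } i \neq j .
\end{aligned}
```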
8. Problem formulation
Testing the Assumptions
Construction of Correlation Matrix
Method of Factor Analysis
Determination of Number of Factors
Rotation of Factors
Interpretation of Factors
9. 1. No outliers in the data set.
2. Normality of the data set.
3. Adequate sample size.
4. Multicollinearity and singularity among the
variables do not exist.
5. Homoscedasticity between the variables is not
required, because factor analysis is a linear function
of the measured variables.
6. Relationships among the variables should be linear.
7. Data should be metric in nature, i.e. on an interval or
ratio scale.
The KMO test is used to check sampling adequacy.
10. Bartlett test of sphericity
It tests the null hypothesis that all correlations between
the variables are zero.
Equivalently, it tests whether the correlation matrix is an identity
matrix.
If it is an identity matrix, factor analysis is
inappropriate.
Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy
This test checks the adequacy of the data for running the factor
analysis. The value of KMO ranges from 0 to 1. The larger the value
of KMO, the more adequate the sample is for running the factor
analysis. Kaiser recommends treating values greater than 0.5 as
acceptable.
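Both checks can be computed directly from a correlation matrix. The following is a minimal sketch, assuming NumPy is available and using the standard textbook formulas (Bartlett's chi-square from the log-determinant, KMO from the partial correlations obtained from the inverse matrix). The function names and the small 3 x 3 example matrix are hypothetical and are not part of the presentation; SPSS may differ in small computational details.

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Chi-square statistic and degrees of freedom of Bartlett's test of
    sphericity for a p x p correlation matrix R based on n observations."""
    R = np.asarray(R, dtype=float)
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)          # log of the determinant of R
    chi_square = -(n - 1 - (2 * p + 5) / 6) * logdet
    df = p * (p - 1) // 2
    return chi_square, df

def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    R = np.asarray(R, dtype=float)
    inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)           # partial correlations (off-diagonal)
    off = ~np.eye(R.shape[0], dtype=bool)     # mask selecting off-diagonal cells
    r2 = np.sum(R[off] ** 2)
    a2 = np.sum(partial[off] ** 2)
    return r2 / (r2 + a2)

# Hypothetical example: three variables that all correlate 0.5, n = 21.
R = np.array([[1.0, 0.5, 0.5],
              [0.5, 1.0, 0.5],
              [0.5, 0.5, 1.0]])
chi_square, df = bartlett_sphericity(R, n=21)
print(round(chi_square, 3), df, round(kmo(R), 3))
```

To turn the chi-square value into a significance level, it is compared against a chi-square distribution with `df` degrees of freedom, for example with `scipy.stats.chi2.sf(chi_square, df)` if SciPy is installed. For an identity matrix the statistic is zero, which is the case in which factor analysis is inappropriate.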
12. • Analyses the pattern of correlations between variables in the
correlation matrix
• Which variables tend to correlate highly together?
• If variables are highly correlated, it is likely that they represent the
same underlying dimension
Factor analysis pinpoints the clusters of high correlations between
variables and assigns a factor to each cluster
13.
        Q1      Q2      Q3      Q4      Q5      Q6
Q1   1.000
Q2    .987   1.000
Q3    .801    .765   1.000
Q4   -.003   -.088    .000   1.000
Q5   -.051    .044    .213    .968   1.000
Q6   -.190   -.111    .102    .789    .864   1.000
• Q1-3 correlate strongly with each other and hardly at all with Q4-6
• Q4-6 correlate strongly with each other and hardly at all with Q1-3
• Two factors!
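The "clusters of high correlations" idea can be illustrated with a few lines of plain Python. This is only a sketch of the intuition, not factor analysis itself: it links any two variables whose absolute correlation reaches a cutoff and reports the resulting groups. The cutoff of 0.7 and the function name are assumptions made for this illustration.

```python
names = ["Q1", "Q2", "Q3", "Q4", "Q5", "Q6"]

# Lower triangle of the correlation matrix shown on the slide.
lower = [
    [1],
    [.987, 1],
    [.801, .765, 1],
    [-.003, -.088, 0, 1],
    [-.051, .044, .213, .968, 1],
    [-.190, -.111, .102, .789, .864, 1],
]

def correlation_clusters(names, lower, threshold=0.7):
    """Group variables connected by |r| >= threshold (connected components)."""
    parent = list(range(len(names)))

    def find(i):
        # Follow parent links until the representative of the group is reached.
        while parent[i] != i:
            i = parent[i]
        return i

    for i, row in enumerate(lower):
        for j in range(i):                      # strictly below the diagonal
            if abs(row[j]) >= threshold:
                parent[find(i)] = find(j)       # merge the two groups

    groups = {}
    for i, name in enumerate(names):
        groups.setdefault(find(i), []).append(name)
    return list(groups.values())

clusters = correlation_clusters(names, lower)
print(clusters)  # [['Q1', 'Q2', 'Q3'], ['Q4', 'Q5', 'Q6']]
```

Two groups come out, matching the slide's reading of the matrix.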
15. Method of Factor Analysis
(A) Principal component analysis
• Provides a unique solution, so that the original data can be
reconstructed from the results
• It considers the total variance among the variables, that is, the
unique as well as the common variance.
• In this method, the factor explaining the maximum variance is
extracted first.
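The principal component method described above can be sketched as an eigen-decomposition of the correlation matrix. The example below assumes NumPy; the function name and the small 3 x 3 matrix are hypothetical. Components are sorted so that the one explaining the most variance comes first, and each loading is an eigenvector entry scaled by the square root of its eigenvalue.

```python
import numpy as np

def principal_components(R):
    """Eigenvalues (largest first) and component loadings of a correlation matrix."""
    R = np.asarray(R, dtype=float)
    eigenvalues, eigenvectors = np.linalg.eigh(R)   # eigh returns ascending order
    order = np.argsort(eigenvalues)[::-1]           # re-sort: largest eigenvalue first
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    # Loading = eigenvector entry * sqrt(eigenvalue); clip guards tiny negative values.
    loadings = eigenvectors * np.sqrt(np.clip(eigenvalues, 0, None))
    return eigenvalues, loadings

# Hypothetical example: three variables that all correlate 0.5.
R = np.array([[1.0, 0.5, 0.5],
              [0.5, 1.0, 0.5],
              [0.5, 0.5, 1.0]])
eigenvalues, loadings = principal_components(R)
print(np.round(eigenvalues, 3))
print(np.round(loadings @ loadings.T, 3))   # all components kept: reproduces R
```

When every component is retained, the loadings reproduce the correlation matrix exactly, which is the sense in which the solution can be reconstructed from the results. The signs of a loading column are arbitrary, so a whole column may come out flipped.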
16. (B) Common factor analysis
Uses an estimate of common variance among the original
variables to generate the factor solution.
Because of this, the number of factors will always be less than
the number of original variables.
Other methods include:
Unweighted least squares, Generalized least squares, Maximum
likelihood, Principal axis factoring, Alpha factoring, and Image
factoring.
17. Variance of a variable
Common variance: variance that a variable shares with other variables in a matrix
Specific variance: variance unique to the variable itself
Error variance: variance due to measurement error or some random, unknown source
When searching for the factors underlying the relationships between a
set of variables, we are interested in detecting and explaining the
common variance
Total variance = common variance + specific variance + error variance
19. Determination of Number of Factors
EIGENVALUE
• The eigenvalue for a given factor measures the variance in all the
variables that is accounted for by that factor.
• It is the amount of variance explained by a factor. It is also called the
characteristic root.
Kaiser-Guttman Criterion
This criterion states that the number of factors to extract
equals the number of factors with an eigenvalue of 1 or
greater.
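As a small worked example, the criterion can be applied to the eigenvalues reported in the Total Variance Explained table of this presentation. The sketch below is plain Python; the function name is hypothetical. Because the analysis is run on a correlation matrix, each standardized variable contributes a variance of 1, so an eigenvalue divided by the number of variables gives the share of total variance that factor explains.

```python
def kaiser_guttman(eigenvalues, cutoff=1.0):
    """Number of factors whose eigenvalue is at least the cutoff."""
    return sum(1 for value in eigenvalues if value >= cutoff)

# Initial eigenvalues as printed in the Total Variance Explained table (11 variables).
eigenvalues = [5.429, 2.157, 1.241, .595, .421, .367, .243, .216, .180, .137, .013]

n_factors = kaiser_guttman(eigenvalues)

# Share of total variance per factor: eigenvalue / number of variables.
percent_of_variance = [100 * value / len(eigenvalues) for value in eigenvalues]

print(n_factors)                                     # 3
print(round(sum(percent_of_variance[:n_factors]), 1))  # about 80.2
```

The percentages differ from the table only in the third decimal place, because the eigenvalues used here are already rounded.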
20. The Scree Plot
The scree plot provides a visual of the
total variance associated with each factor.
The steep slope shows the large factors.
The gradual trailing off (scree) shows the rest of the factors,
usually with eigenvalues lower than 1.
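Where no plotting tool is at hand, the shape of a scree plot can be approximated in text. The following sketch, in plain Python with a hypothetical function name, draws one bar per factor using the eigenvalues reported later in this presentation and marks those at or above 1.

```python
def text_scree(eigenvalues, width=40):
    """Return a text bar chart of eigenvalues, one line per factor."""
    top = max(eigenvalues)
    lines = []
    for k, value in enumerate(eigenvalues, start=1):
        bar = "#" * round(width * value / top)        # bar length scaled to the largest
        marker = " <- eigenvalue >= 1" if value >= 1 else ""
        lines.append(f"{k:2d} {value:6.3f} {bar}{marker}")
    return "\n".join(lines)

# Initial eigenvalues as printed in the Total Variance Explained table.
eigenvalues = [5.429, 2.157, 1.241, .595, .421, .367, .243, .216, .180, .137, .013]
scree = text_scree(eigenvalues)
print(scree)
```

The first three bars drop steeply and the remaining ones trail off, which is the pattern the slide describes.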
23. Rotation of Factors
• Maximizes high item loadings and minimizes low item loadings,
thereby producing a more interpretable and simplified solution.
• Two common rotation techniques are orthogonal rotation and
oblique rotation.
Rotation
Orthogonal: Varimax, Quartimax, Equamax
Oblique: Direct Oblimin, Promax
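To make orthogonal rotation concrete, here is a minimal sketch of a varimax rotation, assuming NumPy. It uses a widely circulated SVD-based iteration and does not apply Kaiser normalization, so its output should not be expected to match the SPSS rotated matrix digit for digit. The function name and its parameters are hypothetical; the input is the unrotated component matrix reported later in this presentation.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Rotate a loading matrix orthogonally towards the varimax criterion.
    Returns the rotated loadings and the rotation matrix."""
    A = np.asarray(loadings, dtype=float)
    p, k = A.shape
    T = np.eye(k)                      # start from no rotation
    previous = 0.0
    for _ in range(max_iter):
        L = A @ T
        # Gradient of the varimax criterion with respect to the rotation.
        G = A.T @ (L ** 3 - L @ np.diag(np.sum(L ** 2, axis=0)) / p)
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt                     # closest orthogonal matrix to the gradient
        current = s.sum()
        if current < previous * (1 + tol):
            break                      # no meaningful improvement: stop
        previous = current
    return A @ T, T

# Unrotated component matrix from the example analysis (11 variables, 3 components).
unrotated = np.array([
    [.814, -.179, .020], [-.682, .587, -.136], [-.469, .108, .808],
    [.549, -.694, -.230], [.731, -.484, .053], [.700, .650, .050],
    [.647, .663, -.159], [.762, .396, -.087], [.863, .088, -.051],
    [.835, .138, .199], [.560, -.082, .660],
])
rotated, T = varimax(unrotated)
print(np.round(rotated, 3))
```

Because the rotation matrix is orthogonal, the factors stay uncorrelated and each variable's communality (its row sum of squared loadings) is unchanged; only the distribution of the loadings across factors changes.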
27. Factor Loading
• It can be defined as the correlation coefficient between the variable
and the factor.
• The squared factor loading of a variable indicates the proportion of
that variable's variance explained by the factor. A factor loading of
0.7 is considered to be sufficient.
28. COMMUNALITY
• The communality is the amount of variance each variable in the
analysis shares with other variables.
• It is the squared multiple correlation for the variable as the dependent
variable, using the factors as predictors, and is denoted by h².
• The value of communality may be considered an indicator of the
reliability of a variable.
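A short calculation shows how squared loadings and communality relate when the factors are orthogonal. The sketch below is plain Python with a hypothetical function name; the loadings for Weight are taken from the rotated component matrix reported later in this presentation.

```python
def communality(loadings):
    """Sum of squared loadings of one variable across orthogonal factors (h squared)."""
    return sum(value * value for value in loadings)

# A loading of 0.7 means the factor explains about 49% of the variable's variance.
share_at_cutoff = 0.7 ** 2

# Rotated loadings of Weight on components 1, 2 and 3.
weight_loadings = [.954, .010, .079]
weight_h2 = communality(weight_loadings)

print(round(share_at_cutoff, 2))  # 0.49
print(round(weight_h2, 3))        # 0.916
```

Read this way, the three extracted components together account for roughly 92% of the variance in Weight, almost all of it through the first component.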
38. Click on Rotation
Select VARIMAX Rotation
Click on Continue
Click on OK
39. Descriptive Statistics
Variable                            Mean   Std. Deviation   Analysis N
Standing Broad Jump             212.3810         15.45793           21
Shuttle Run                      10.2514           .51167           21
Fifty Meter Dash                  7.6938           .80880           21
Twelve Meter run and walk      2488.9524        222.46696           21
Anaerobic capacity               39.9071         12.70207           21
Weight                           37.8095          7.67215           21
Height                          148.3810         10.18566           21
Leg Length                       76.3333          5.18009           21
Calf Girth                       28.5238          1.99045           21
Thigh Girth                      40.5238          3.51595           21
Shoulder Width                   38.1429          4.43041           21
40. Correlation Matrix
Variable                          (1)    (2)    (3)    (4)    (5)    (6)    (7)    (8)    (9)   (10)   (11)
(1) Standing Broad Jump         1.000
(2) Shuttle Run                 -.651  1.000
(3) Fifty Meter Dash            -.359   .277  1.000
(4) Twelve Meter run and walk    .539  -.691  -.492  1.000
(5) Anaerobic capacity           .608  -.709  -.322   .686  1.000
(6) Weight                       .469  -.087  -.231  -.045   .255  1.000
(7) Height                       .416  -.048  -.358   .010   .142   .947  1.000
(8) Leg Length                   .513  -.321  -.354   .151   .292   .687   .675  1.000
(9) Calf Girth                   .606  -.495  -.400   .366   .602   .577   .522   .739  1.000
(10) Thigh Girth                 .584  -.515  -.186   .269   .589   .632   .543   .646   .773  1.000
(11) Shoulder Width              .455  -.483   .128   .279   .410   .405   .244   .322   .377   .451  1.000
41. KMO and Bartlett's Test
Kaiser-Meyer-Olkin Measure of Sampling Adequacy: .687
Bartlett's Test of Sphericity
Approx. Chi-Square: 165.579
df: 55
Sig.: .000
Since the KMO value is greater than 0.5, the sample
taken in the study is adequate for running the factor analysis.
Since the significance value of Bartlett's test of
sphericity is less than 0.05, the null hypothesis that all
correlations between the variables are 0 is rejected. The
correlation matrix is therefore not an identity matrix, which is what
factor analysis requires.
42. Total Variance Explained
             Initial Eigenvalues                  Extraction Sums of Squared Loadings  Rotation Sums of Squared Loadings
Component    Total  % of Variance  Cumulative %   Total  % of Variance  Cumulative %   Total  % of Variance  Cumulative %
1            5.429         49.355        49.355   5.429         49.355        49.355   3.890         35.364        35.364
2            2.157         19.608        68.963   2.157         19.608        68.963   3.692         33.559        68.924
3            1.241         11.285        80.247   1.241         11.285        80.247   1.246         11.324        80.247
4             .595          5.407        85.654
5             .421          3.831        89.485
6             .367          3.336        92.821
7             .243          2.214        95.035
8             .216          1.967        97.001
9             .180          1.637        98.638
10            .137          1.241        99.880
11            .013           .120       100.000
Extraction Method: Principal Component Analysis.
Total column: we are looking for an eigenvalue above 1.0.
Cumulative % column: cumulative percent of variance explained.
43. These three factors are extracted because each has an eigenvalue
greater than 1.
44. Factor loadings of all the variables on each of the three factors are
shown here. Since this is an unrotated factor solution, some of the
variables may load on more than one factor. To
avoid this situation, the factors are rotated using the varimax
rotation technique.
Unrotated Component Matrix
                               Component
Variable                            1       2       3
Standing Broad Jump              .814   -.179    .020
Shuttle Run                     -.682    .587   -.136
Fifty Meter Dash                -.469    .108    .808
Twelve Meter run and walk        .549   -.694   -.230
Anaerobic capacity               .731   -.484    .053
Weight                           .700    .650    .050
Height                           .647    .663   -.159
Leg Length                       .762    .396   -.087
Calf Girth                       .863    .088   -.051
Thigh Girth                      .835    .138    .199
Shoulder Width                   .560   -.082    .660
Extraction Method: Principal Component Analysis.
a. 3 components extracted.
45. Rotated Component Matrix
                               Component
Variable                            1       2       3
Standing Broad Jump              .469    .689   -.003
Shuttle Run                     -.091   -.901   -.090
Fifty Meter Dash                -.292   -.356    .820
Twelve Meter run and walk       -.069    .868   -.279
Anaerobic capacity               .200    .855    .012
Weight                           .954    .010    .079
Height                           .930   -.047   -.128
Leg Length                       .828    .230   -.074
Calf Girth                       .690    .524   -.058
Thigh Girth                      .696    .483    .194
Shoulder Width                   .332    .479    .646
Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 4 iterations.
After varimax rotation, the factors have largely non-overlapping variables. A
factor loading above 0.7 indicates that the factor extracts sufficient
variance from that variable. Thus, every variable with a loading of 0.7
or more on a particular factor is identified with that factor.
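The 0.7 rule can be applied mechanically to the rotated component matrix. The sketch below is plain Python; the function name is hypothetical, and the absolute value is used so that a strong negative loading (such as Shuttle Run on component 2) also counts. It also checks that the column sums of squared loadings agree with the rotation sums of squared loadings in the Total Variance Explained table.

```python
# Rotated loadings (components 1, 2, 3) from the Rotated Component Matrix.
rotated = {
    "Standing Broad Jump": (.469, .689, -.003),
    "Shuttle Run": (-.091, -.901, -.090),
    "Fifty Meter Dash": (-.292, -.356, .820),
    "Twelve Meter run and walk": (-.069, .868, -.279),
    "Anaerobic capacity": (.200, .855, .012),
    "Weight": (.954, .010, .079),
    "Height": (.930, -.047, -.128),
    "Leg Length": (.828, .230, -.074),
    "Calf Girth": (.690, .524, -.058),
    "Thigh Girth": (.696, .483, .194),
    "Shoulder Width": (.332, .479, .646),
}

def assign_to_factors(rotated, cutoff=0.7):
    """Map each component number to the variables with |loading| >= cutoff."""
    factors = {}
    for name, row in rotated.items():
        for k, loading in enumerate(row, start=1):
            if abs(loading) >= cutoff:
                factors.setdefault(k, []).append(name)
    return factors

factors = assign_to_factors(rotated)

# Column sums of squared loadings: variance explained by each rotated component.
sums_of_squares = [sum(row[k] ** 2 for row in rotated.values()) for k in range(3)]

for k in sorted(factors):
    print(k, factors[k])
print([round(value, 2) for value in sums_of_squares])
```

With this cutoff, Weight, Height and Leg Length fall on component 1; Shuttle Run, Twelve Meter run and walk, and Anaerobic capacity on component 2; and Fifty Meter Dash on component 3. Standing Broad Jump, Calf Girth, Thigh Girth and Shoulder Width fall just below 0.7 on every component and stay unassigned under a strict reading of the rule.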
46. Name each factor as per your wish
PHYSICAL
Shuttle Run
Fifty Meter Dash
Twelve Meter run and
walk
ANTHROPOMETRIC
Weight
Height
Leg Length