Johnny Aqm Presentation
1. Effect of Number of Categories and Category
Boundaries on Recovery of Latent Linear
Correlations from Optimally Weighted
Categorical Data
Johnny Lin
Advisor: Peter Bentler
November 19, 2008
2. Outline
Introduction
LINEALS
Forming a Hypothesis
Method
Description
Simulation
Analysis
Results
Main Effects
Interactions
4. Introducing LINEALS
A Method of Optimal Scaling
Algorithm
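An iterative process that minimizes $\sum_{j=1}^{m} \sum_{l=1}^{m} (\eta_{jl}^2 - r_{jl}^2)$, where $\eta_{jl}^2$ is the correlation ratio and the difference $\eta_{jl}^2 - r_{jl}^2$ is a measure of nonlinearity.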
Developed by Jan de Leeuw and implemented by Patrick Mair.
Assumption
That bi-linearization is possible. No assumption of normality.
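To make the loss on this slide concrete, the following is a minimal sketch of a single term, $\eta^2 - r^2$, for one pair of variables. It reads $\eta^2$ as the correlation ratio (between-category variance of the conditional means over total variance), which is an assumption about the slide's notation. The function names are hypothetical, and the sketch evaluates the term at fixed category scores only; it does not perform the iterative rescaling that LINEALS carries out.

```python
from collections import defaultdict

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def correlation_ratio(cats, ys):
    """eta^2 of ys on cats: between-category sum of squares of the
    conditional means of ys, divided by the total sum of squares of ys."""
    n = len(ys)
    grand = sum(ys) / n
    groups = defaultdict(list)
    for c, y in zip(cats, ys):
        groups[c].append(y)
    between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups.values())
    total = sum((y - grand) ** 2 for y in ys)
    return between / total

def nonlinearity(xs, ys):
    """eta^2 - r^2 for the regression of ys on the categories of xs.
    It is zero when the conditional means of ys are linear in xs."""
    return correlation_ratio(xs, ys) - pearson_r(xs, ys) ** 2

# Conditional means 1.5, 2.5, 3.5 are linear in x: nonlinearity is (numerically) zero.
linear_case = nonlinearity([1, 1, 2, 2, 3, 3], [1, 2, 2, 3, 3, 4])
# Conditional means 1, 3, 1 are not linear in x: eta^2 = 1 while r = 0.
curved_case = nonlinearity([1, 1, 2, 2, 3, 3], [1, 1, 3, 3, 1, 1])
```

The full criterion sums this quantity over every ordered pair of variables, which is why both regressions (X on Y and Y on X) are linearized at once.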
5. Plot of LINEALS Transformation
Criterion: Linearize both X on Y and Y on X simultaneously.
Figure: Red: X on Y , Blue: Y on X
7. Questions to ask
First, define good recovery as small deviation from true score.
1. Does LINEALS recover true population correlations better
than Pearson for categorical data?
2. Is the performance of LINEALS robust?
3. What factors influence good recovery?
9. Conditions tested
Correlation Type, True Population Correlation, Number of Categories, and Homogeneity
Condition: Parameters
1. Correlation Type (r): {0=LINEALS, 1=Pearson}
2. True Population Correlation (P): {0.3,0.5,0.7,0.9}
3. Number of Categories (V): {2,3,5,7,10}
4. Homogeneity (h): {0=Non-Homogeneous, 1=Homogeneous}
Total of 80 combinations (2x4x5x2).
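The count of 80 can be checked by enumerating the full factorial design. This is a small illustrative sketch; the variable names are hypothetical and only the factor levels come from the slide.

```python
from itertools import product

CORRELATION_TYPE = {0: "LINEALS", 1: "Pearson"}         # r
POPULATION_CORRELATION = [0.3, 0.5, 0.7, 0.9]           # P
NUMBER_OF_CATEGORIES = [2, 3, 5, 7, 10]                 # V
HOMOGENEITY = {0: "Non-Homogeneous", 1: "Homogeneous"}  # h

# One dictionary per cell of the 2 x 4 x 5 x 2 design.
design = [
    {"r": r, "P": P, "V": V, "h": h}
    for r, P, V, h in product(
        CORRELATION_TYPE, POPULATION_CORRELATION, NUMBER_OF_CATEGORIES, HOMOGENEITY
    )
]
n_cells = len(design)  # 2 * 4 * 5 * 2 = 80
```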
11. Creating functions in R
For each combination (total of 80):
1. Generate 1000 sets of bivariate normal data.
2. Make “cuts” (homogeneous vs. non-homogeneous).
3. Run through LINEALS / Pearson.
4. Calculate deviation of result and true population correlation.
5. Repeat Steps 1 - 4 twenty-five times.
Result: Total of 2000 deviations (80x25).
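The steps above can be sketched for the Pearson arm of one design cell. The study itself used R; this Python sketch is only an illustration under stated assumptions: the cut range of [-2, 2], the use of random thresholds for the non-homogeneous case, applying the same cuts to both variables, reading "1000 sets" as 1000 observations, and all function names are hypothetical. The LINEALS arm is omitted because it requires the optimal scaling routine.

```python
import random
from bisect import bisect_right

def bivariate_normal(n, rho, rng):
    """Draw n pairs from a standard bivariate normal with correlation rho."""
    pairs = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        pairs.append((z1, rho * z1 + (1 - rho ** 2) ** 0.5 * z2))
    return pairs

def make_cuts(n_categories, homogeneous, rng, lo=-2.0, hi=2.0):
    """Return n_categories - 1 thresholds on [lo, hi] (range is an assumption).
    Homogeneous cuts have equal interval widths; otherwise widths are random."""
    if homogeneous:
        width = (hi - lo) / n_categories
        return [lo + width * k for k in range(1, n_categories)]
    return sorted(rng.uniform(lo, hi) for _ in range(n_categories - 1))

def categorize(values, cuts):
    """Map each value to the index of the interval it falls in."""
    return [bisect_right(cuts, v) for v in values]

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def deviation(rho, n_categories, homogeneous, rng, n=1000):
    """Steps 1-4: simulate, cut, correlate, and compare with the true rho."""
    pairs = bivariate_normal(n, rho, rng)
    cuts = make_cuts(n_categories, homogeneous, rng)
    x = categorize([p[0] for p in pairs], cuts)
    y = categorize([p[1] for p in pairs], cuts)
    return abs(rho) - abs(pearson_r(x, y))

# Step 5: twenty-five replications of one cell (P = 0.7, V = 5, homogeneous).
rng = random.Random(2008)
replications = [deviation(0.7, 5, True, rng) for _ in range(25)]
```

Running the same loop over all 80 cells would give the 2000 deviations used as the dependent variable in the analysis.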
13. Hierarchical Regression
Description
DV: deviation of sample correlation from true population correlation, $|\rho_{12}| - |\hat{\rho}_{12}|$
IVs: main effects and interactions of the four conditions (total of 15)
Four main effects (h,r,P,V)
Six 2-way interactions (hr, hP, hV, . . . )
Four 3-way interactions (hrP, hrV, . . . )
One 4-way interaction (hrPV)
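The total of 15 terms follows from taking every non-empty subset of the four conditions. A short sketch of that enumeration (the variable names are hypothetical):

```python
from itertools import combinations

factors = ["h", "r", "P", "V"]

# terms[k] lists the k-way terms, e.g. terms[2] starts with 'hr', 'hP', 'hV'.
terms = {k: ["".join(c) for c in combinations(factors, k)] for k in range(1, 5)}
counts = [len(terms[k]) for k in range(1, 5)]  # [4, 6, 4, 1]
n_terms = sum(counts)                          # 15
```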
14. Hierarchical Regression
Model Selection
Tested full model against nested models.
Confirmed with Best Subset Regression.
Optimal Adj. $R^2$ and Mallows' $C_p$ found with 7-8 parameters.
Figure: (a) Adj. $R^2$ (b) Mallows' $C_p$
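For reference, the two selection criteria named here have the following standard textbook definitions. The symbols are not taken from the slides: $n$ is the number of observations (2000 deviations in this study), $p$ the number of fitted parameters including the intercept, $\mathrm{SSE}_p$ the residual sum of squares of the candidate model, and $\hat{\sigma}^2$ the residual mean square of the full model.

```latex
R^2_{\text{adj}} = 1 - (1 - R^2)\,\frac{n - 1}{n - p},
\qquad
C_p = \frac{\mathrm{SSE}_p}{\hat{\sigma}^2} - n + 2p
```

A candidate model is usually considered adequate when $C_p$ is close to $p$, while adjusted $R^2$ is maximized.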
15. Final Model
SPSS Output
Coefficients(a)
Model 1        B      Std. Error   Beta     t         Sig.
(Constant)     .189   .006                  31.240    .000
h             -.113   .012         -.620    -9.299    .000
r              .007   .002          .041     3.054    .002
V             -.024   .001         -.773   -40.558    .000
P              .098   .008          .241    12.655    .000
hV             .013   .002          .487     7.164    .000
hP             .117   .018          .435     6.392    .000
hPV           -.017   .003         -.422    -6.326    .000
B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient.
a Dependent Variable: difference
The difference between Pearson and LINEALS deviations is .007 (the coefficient of r, coded 0=LINEALS, 1=Pearson), controlling for other factors.
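As a worked illustration, the fitted equation can be evaluated for any cell of the design. The sketch below uses the unstandardized B coefficients from the SPSS output; it assumes P and V entered the regression as their raw numeric values, which the slides do not state, and the function name is hypothetical.

```python
# Unstandardized coefficients (B) from the final model.
COEF = {
    "const": 0.189, "h": -0.113, "r": 0.007, "V": -0.024,
    "P": 0.098, "hV": 0.013, "hP": 0.117, "hPV": -0.017,
}

def predicted_difference(h, r, P, V):
    """Predicted deviation for homogeneity h, correlation type r,
    population correlation P, and number of categories V."""
    return (
        COEF["const"]
        + COEF["h"] * h
        + COEF["r"] * r
        + COEF["V"] * V
        + COEF["P"] * P
        + COEF["hV"] * h * V
        + COEF["hP"] * h * P
        + COEF["hPV"] * h * P * V
    )

lineals = predicted_difference(h=0, r=0, P=0.5, V=5)  # about 0.118
pearson = predicted_difference(h=0, r=1, P=0.5, V=5)  # about 0.125
gap = pearson - lineals                               # the r coefficient, .007
```

Because r appears only as a main effect in the final model, the gap between the two correlation types is the same .007 in every cell.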
17. Plot of Main Effects I
Figure: Main Effect of Number of Categories V
Figure: Main Effect of Population Correlation P
18. Plot of Main Effects II
Figure: Main Effect of Homogeneity h
Figure: Main Effect of Correlation Type r
20. Plot of Significant Interactions
Note: The significant 3-way interaction hPV is not plotted.
Figure: Population Correlation by Levels of Homogeneity hP
Figure: Number of Categories by Levels of Homogeneity hV
21. Interaction of Correlation Type and Number of Categories
When rV is added to the regression model, the main effect of Correlation Type r disappears.
This suggests that the number of categories may contribute to the LINEALS vs. Pearson difference.
Figure: Number of Categories by Correlation Type (rV, marginally sig.)
22. Summary
1. LINEALS performs slightly better than Pearson under
bivariate normal categorizations.
2. The non-significant interactions with Correlation Type suggest
that LINEALS is robust.
3. Recovery of true population correlations is highly influenced by
homogeneity (i.e., the underlying equality of interval widths).
Future Studies
How does it compare against polychoric correlations?
Is the resulting matrix positive definite?