2. Me:
Political Science PhD, Data Scientist, Teacher, Do-Gooder.
Check me out on Twitter: @ruchowdh, or on
my website: rummanchowdhury.com (psst, I post
cool jobs there)
What’s Metis?
Metis accelerates the careers of data scientists by
providing full-time immersive bootcamps, evening
part-time professional development courses, online
training, and corporate programs.
Who is Rumman? What’s a Metis?
3. What is PCA?
Why do we need dimensionality reduction?
Intuition behind Principal Components Analysis
Coding example
18. Thousand dimensions:
I specified you with such high
resolution, with so much
detail, that you don’t look
like anybody else anymore.
You’re unique.
Curse of Dimensionality
19. Classification, clustering, and other analysis methods
become exponentially more difficult as the number of
dimensions increases.
[Scatter plot: Height vs. Cigarettes per day]
Curse of Dimensionality
20. Classification, clustering, and other analysis methods
become exponentially more difficult as the number of
dimensions increases.
To understand how to divide that huge space, we need
much more data (usually far more than we have, or can
get).
[Scatter plot: Height vs. Cigarettes per day]
Curse of Dimensionality
21. Lots of features with lots of data is best. But what if
you don't have the luxury of ginormous amounts of
data?
Not all features provide the same amount of
information. We can reduce the dimensions
(compress the data) without necessarily losing too
much information.
[Scatter plot: Height vs. Cigarettes per day]
Dimensionality Reduction
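The curse-of-dimensionality point above can be seen numerically. A quick sketch (synthetic data, seed fixed; the function name is my own) showing that pairwise distances between random points concentrate as dimensions grow, so "near" and "far" become harder to tell apart:

```python
import numpy as np

rng = np.random.default_rng(0)

def distance_spread(n_points, n_dims):
    X = rng.standard_normal((n_points, n_dims))
    # all pairwise Euclidean distances
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    d = d[np.triu_indices(n_points, k=1)]
    # relative spread: (max - min) / min — shrinks as dims grow
    return (d.max() - d.min()) / d.min()

print(distance_spread(200, 2))     # low dims: large relative spread
print(distance_spread(200, 1000))  # high dims: distances concentrate
```

With everything roughly equidistant, distance-based methods (clustering, k-NN) lose their signal, which is exactly why we reach for dimensionality reduction.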
22. Feature Extraction
Do I have to choose the
dimensions among existing
features?
[Scatter plot: Height vs. Cigarettes per day]
24. Why do we need dimensionality reduction?
- To better perform analyses
- …without sacrificing the information we
get from our features
- To better visualize our data
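Both motivations above fit in a few lines. A hedged sketch (not the talk's own code) using scikit-learn's `PCA` to project a synthetic 4-D dataset down to two plottable dimensions while keeping most of the variance:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
# synthetic data: 4 features driven by 2 latent factors, plus a little noise
latent = rng.standard_normal((150, 2))
X = latent @ rng.standard_normal((2, 4)) + 0.1 * rng.standard_normal((150, 4))

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)   # now drawable as an ordinary scatter plot

print(X_2d.shape)                           # (150, 2)
print(pca.explained_variance_ratio_.sum())  # most of the variance survives
```

Because the data here is (by construction) nearly rank-2, two components retain almost all the information; real datasets won't be this tidy.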
37. Singular Value Decomposition
The eigenvectors and eigenvalues of a covariance (or
correlation) matrix represent the "core" of a PCA:
The eigenvectors (principal components) determine
the directions of the new feature space, and the
eigenvalues determine their magnitude.
In other words, the eigenvalues explain the
variance of the data along the new feature axes.
PCA Math
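The slide's claim can be demonstrated from scratch in numpy (synthetic data; variable names are my own): the eigenvectors of the covariance matrix are the principal directions, and the eigenvalues are the variances along them.

```python
import numpy as np

rng = np.random.default_rng(1)
# synthetic data whose columns have very different variances
X = rng.standard_normal((500, 3)) @ np.diag([3.0, 1.0, 0.2])
Xc = X - X.mean(axis=0)                  # center the data first

cov = np.cov(Xc, rowvar=False)           # 3x3 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)   # eigh: for symmetric matrices

# sort from largest to smallest eigenvalue
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# project onto the top-2 principal components
X_reduced = Xc @ eigvecs[:, :2]
print(eigvals)           # variances along the new axes, descending
print(X_reduced.shape)   # (500, 2)
```

Note `np.linalg.eigh` returns eigenvalues in ascending order, hence the re-sort; dropping the smallest-eigenvalue directions is what "compressing without losing much information" means concretely.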
38. Correlation or Covariance Matrix?
Use the correlation matrix to calculate the principal components
if variables are measured on different scales and you want to
standardize them, or if the variances differ widely between
variables. Otherwise, either the covariance or the correlation
matrix will do.
Matrix Selection
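One way to see why the choice matters: the correlation matrix is just the covariance matrix of standardized (z-scored) variables, so "PCA on the correlation matrix" and "PCA on standardized data" are the same thing. A small numpy check (synthetic features with wildly different scales, invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(7)
X = np.column_stack([
    rng.normal(170, 10, 300),    # large-scale feature (e.g., height in cm)
    rng.normal(0.5, 0.05, 300),  # tiny-scale feature (e.g., a 0-1 rate)
])

# standardize: zero mean, unit (sample) standard deviation per column
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# covariance of standardized data == correlation of raw data
np.testing.assert_allclose(np.cov(Z, rowvar=False),
                           np.corrcoef(X, rowvar=False), atol=1e-10)
```

Without standardizing, the large-scale feature would dominate the covariance matrix and therefore the first principal component, purely because of its units.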
39. Kaiser Method
Retain any components with eigenvalues
greater than 1
Scree Test
Bar plot that shows the variance explained by each
component. Ideally you will see a clear drop-off
(elbow).
Percent Variance Explained
Sum the variance explained by each component
and stop once the cumulative total reaches a chosen
threshold (e.g., 90%).
How do I know how many dimensions to
reduce by?
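The three rules above can be sketched on a made-up eigenvalue spectrum (the numbers are purely illustrative):

```python
import numpy as np

# hypothetical eigenvalues from a correlation-matrix PCA, descending
eigvals = np.array([4.6, 2.2, 0.9, 0.6, 0.4, 0.3])

# Kaiser: keep components with eigenvalue > 1
kaiser_k = int(np.sum(eigvals > 1))              # -> 2

# Percent variance explained: keep enough to pass a threshold, e.g. 90%
cum_ratio = np.cumsum(eigvals) / eigvals.sum()
pct_k = int(np.searchsorted(cum_ratio, 0.90) + 1)

# Scree test: plot eigvals against component index and look for the elbow
print(kaiser_k, pct_k, np.round(cum_ratio, 2))
```

The rules can disagree (here Kaiser keeps 2 components while a 90% threshold keeps more), which is why the choice ultimately comes back to your end goal.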
40. What is the intuition behind PCA?
- We are attempting to resolve the curse of
dimensionality
- by shifting our perspective
- and keeping the eigenvectors that explain the
highest amount of variance.
- We select those components based on our end
goal, or by particular methods (Kaiser, Scree, %
Variance).