Reducing the number of features
- Commonly used by data scientists to visualize high-dimensional data and get a sense of its structure
Examples
Car measurements
From 3D to 2D
Data visualization
PCA algorithm
- Example: replace two features with one new feature
- Choose the axis along which the data varies the most (maximum variance)
- Coordinate on the new axis: dot product of the (mean-centered) example with the unit vector along that axis
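
The steps above can be sketched in NumPy. This is a minimal illustration, not a definitive implementation: the toy array `X` and the variable names are hypothetical, and the principal axis is found here through an eigendecomposition of the covariance matrix, which is one common way to compute it.

```python
import numpy as np

# Hypothetical toy data: 5 examples, 2 strongly correlated features.
X = np.array([[1.0, 2.0], [2.0, 3.9], [3.0, 6.1], [4.0, 8.0], [5.0, 10.0]])

# Center each feature so the new axis passes through the origin.
X_centered = X - X.mean(axis=0)

# Eigenvectors of the covariance matrix are candidate axes;
# np.linalg.eigh returns eigenvalues in ascending order.
cov = np.cov(X_centered, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)

# The first principal component is the unit vector with the largest variance.
u = eigvecs[:, -1]

# Coordinate of every example on the new axis: a dot product with u.
z = X_centered @ u
```

Each entry of `z` is a single number that replaces the two original features of one example, and the variance of `z` equals the largest eigenvalue.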
More principal components
PCA is not linear regression
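
One way to see the difference: linear regression minimizes vertical distances to predict a target `y` from `x`, while PCA treats both features equally and minimizes perpendicular distances to the new axis. The sketch below fits both to the same data and compares slopes. The synthetic data and all variable names are assumptions made for illustration.

```python
import numpy as np

# Hypothetical synthetic data: y depends on x plus noise.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 0.5 * x + rng.normal(scale=0.5, size=200)

xc, yc = x - x.mean(), y - y.mean()

# Linear regression of y on x: least-squares slope (vertical errors).
slope_regression = (xc @ yc) / (xc @ xc)

# PCA: direction of the first principal axis (perpendicular errors).
# Rows of Vt are unit-length principal directions, largest variance first.
_, _, Vt = np.linalg.svd(np.column_stack([xc, yc]), full_matrices=False)
u = Vt[0]
slope_pca = u[1] / u[0]

print(slope_regression, slope_pca)
```

The two slopes differ because the two methods minimize different errors; with noisy, positively correlated data the principal axis is steeper than the regression line.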
Approximation to the original data
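
Given only the coordinate on the principal axis, an approximation of the original example can be recovered by multiplying that coordinate by the axis's unit vector and adding the mean back. A minimal sketch, assuming hypothetical toy data and using an SVD to find the axis:

```python
import numpy as np

# Hypothetical toy data: 5 examples, 2 features lying close to a line.
X = np.array([[1.0, 2.0], [2.0, 3.9], [3.0, 6.1], [4.0, 8.0], [5.0, 10.0]])
mean = X.mean(axis=0)
X_centered = X - mean

# First principal axis from the SVD of the centered data.
_, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
u = Vt[0]

# Compress: one number per example.
z = X_centered @ u

# Reconstruct: map each coordinate back into the original 2-D space.
X_approx = np.outer(z, u) + mean

# Largest absolute difference between original and reconstructed values.
max_error = np.abs(X - X_approx).max()
```

The reconstruction is exact only for points that lie on the principal axis; every other point is replaced by its projection onto that axis.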
PCA in code
- Optional pre-processing: perform feature scaling
- Fit the data to obtain 2 (or 3) new axes (principal components)
- Optionally examine how much variance is explained by each principal component
- Transform (project) the data onto the new axes
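
The steps above map naturally onto scikit-learn's `PCA` class. The following is a sketch under stated assumptions: the toy array `X` is hypothetical, `StandardScaler` is one possible choice for the optional scaling step, and a single component is kept only because the toy data has two features.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: 6 examples, 2 features.
X = np.array([[1, 1], [2, 1], [3, 2], [-1, -1], [-2, -1], [-3, -2]], dtype=float)

# Optional pre-processing: scale features to zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(X)

# Fit: find the new axis (principal component).
pca = PCA(n_components=1)
pca.fit(X_scaled)

# Examine the fraction of variance explained by each component.
print(pca.explained_variance_ratio_)

# Transform: project the data onto the new axis.
Z = pca.transform(X_scaled)

# Optional: map back to the (scaled) original space as an approximation.
X_approx = pca.inverse_transform(Z)
```

For visualization, `n_components` would typically be set to 2 or 3 on data with many more features.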
Applications of PCA
- Visualization: reduce to 2 or 3 features
- Less frequently used for:
  - Data compression (to reduce storage or transmission costs)
  - Speeding up training of a supervised learning model