Principal component analysis
•Suppose each data point is N-dimensional
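–Same procedure applies: with $\bar{x}$ the mean of the training vectors, measure the variance along a unit direction $v$
–$\mathrm{var}(v) = \sum_x \| (x - \bar{x})^T v \|^2 = v^T A v$
–where $A = \sum_x (x - \bar{x})(x - \bar{x})^T$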
–The eigenvectors of A define a new coordinate system
•the eigenvector with the largest eigenvalue captures the most variation among the training vectors x
•the eigenvector with the smallest eigenvalue captures the least variation
–We can compress the data by using only the top few eigenvectors
•corresponds to choosing a “linear subspace”
–represent points on a line, plane, or “hyper-plane”
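
The procedure on this slide can be sketched in NumPy as below. This is an illustration under stated assumptions, not code from the lecture: A is taken to be the scatter matrix of the mean-centered training vectors, and the function name `pca_compress`, the parameter `k`, and the synthetic data are made up for the example.

```python
import numpy as np

def pca_compress(X, k):
    """Compress the rows of X (one N-dimensional point per row) to k coefficients.

    Hypothetical helper for illustration; assumes A is the scatter matrix
    of the mean-centered data.
    """
    mean = X.mean(axis=0)
    centered = X - mean
    A = centered.T @ centered                 # N x N scatter matrix

    # eigh is meant for symmetric matrices and returns eigenvalues in ascending order
    eigvals, eigvecs = np.linalg.eigh(A)
    order = np.argsort(eigvals)[::-1]         # re-sort so the largest eigenvalue comes first
    eigvals = eigvals[order]
    basis = eigvecs[:, order[:k]]             # top-k eigenvectors as columns (N x k)

    coeffs = centered @ basis                 # coordinates in the k-dimensional subspace
    reconstruction = mean + coeffs @ basis.T  # points mapped back into N dimensions
    return coeffs, reconstruction, eigvals

# Synthetic 3-D points lying close to a line, so one eigenvector dominates.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([t, 2 * t, -t]) + 0.01 * rng.normal(size=(200, 3))

coeffs, X_hat, eigvals = pca_compress(X, k=1)
print("eigenvalues (largest first):", eigvals)
print("max reconstruction error with k=1:", np.abs(X - X_hat).max())
```

With k = 1 each point is stored as a single number (its position along a line), with k = 2 as two numbers (a position in a plane), and so on. Keeping all N eigenvectors reproduces the data exactly, so the choice of k trades storage against reconstruction error.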