Definition

PCA (principal component analysis) is a linear dimensionality reduction technique. The correlated variables are linearly transformed onto a new coordinate system such that the directions capturing the largest variance in the data can be easily identified.
Population Version
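Given a random vector $X \in \mathbb{R}^p$ with covariance matrix $\Sigma = \operatorname{Cov}(X)$, we find a unit vector $a \in \mathbb{R}^p$ such that $\operatorname{Var}(a^\top X) = a^\top \Sigma a$ is maximized:

$$\max_{a} \; a^\top \Sigma a \quad \text{subject to} \quad a^\top a = 1.$$

Equivalently, by the method of Lagrange multipliers with multiplier $\lambda$, we look for stationary points of

$$L(a, \lambda) = a^\top \Sigma a - \lambda \, (a^\top a - 1).$$

By differentiation with respect to $a$, the solution is given by the eigenvalue problem

$$\Sigma a = \lambda a.$$

At any such solution, $\operatorname{Var}(a^\top X) = a^\top \Sigma a = \lambda \, a^\top a = \lambda$. Thus the $a$ maximizing the variance of $a^\top X$ is the eigenvector corresponding to the largest eigenvalue.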
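As a numerical sanity check of the population result, the following is a minimal sketch using NumPy. The covariance matrix `Sigma` and the helper `projected_variance` are hypothetical, chosen only for illustration. It compares the variance along the top eigenvector with the variance along many random unit vectors.

```python
import numpy as np

# Hypothetical 3x3 covariance matrix (symmetric, positive definite),
# chosen for illustration only.
Sigma = np.array([[4.0, 2.0, 0.6],
                  [2.0, 3.0, 0.4],
                  [0.6, 0.4, 1.0]])

# eigh is intended for symmetric matrices; eigenvalues come back in ascending order.
eigvals, eigvecs = np.linalg.eigh(Sigma)
lam_max = eigvals[-1]
a_star = eigvecs[:, -1]  # unit-norm eigenvector for the largest eigenvalue


def projected_variance(a, Sigma):
    """Variance of a^T X when X has covariance Sigma, i.e. a^T Sigma a."""
    return a @ Sigma @ a


# Draw random directions and normalize them to unit length.
rng = np.random.default_rng(0)
candidates = rng.normal(size=(1000, 3))
candidates /= np.linalg.norm(candidates, axis=1, keepdims=True)
best_random = max(projected_variance(a, Sigma) for a in candidates)

print("variance along top eigenvector:", projected_variance(a_star, Sigma))
print("largest eigenvalue:            ", lam_max)
print("best random unit direction:    ", best_random)
```

The first two printed values should agree up to floating-point error, and the third should not exceed them, which is what the eigenvalue argument predicts.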
Sample Version
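Given a data matrix $X \in \mathbb{R}^{n \times p}$, by singular value decomposition, $X$ can be factorized as $X = U D V^\top$, where $U$ and $V$ have orthonormal columns $u_j$ and $v_j$, and $D$ is diagonal with singular values $d_1 \ge d_2 \ge \dots \ge 0$.
By algebra, $XV = UD$, where we call $z_j = X v_j = d_j u_j$ the $j$-th principal component.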
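The sample version can be sketched with NumPy as follows. This is an illustrative example on synthetic data, not a reference implementation. It assumes the columns of the data matrix are centered first (a common convention that the text above does not state), and the names `Xc`, `scores`, and `S` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 3

# Synthetic correlated data, for illustration only.
mixing = rng.normal(size=(p, p))
X = rng.normal(size=(n, p)) @ mixing

# Assumption: center each column before applying the SVD.
Xc = X - X.mean(axis=0)

# Thin SVD: Xc = U @ diag(d) @ Vt, singular values in descending order.
U, d, Vt = np.linalg.svd(Xc, full_matrices=False)
V = Vt.T

# Principal components: column j is Xc @ v_j, which equals d_j * u_j.
scores = Xc @ V
print("X V == U D:", np.allclose(scores, U * d))

# Sample variance of the j-th component is d_j^2 / (n - 1), which matches
# the eigenvalues of the sample covariance matrix S.
S = Xc.T @ Xc / (n - 1)
eigvals = np.linalg.eigvalsh(S)[::-1]  # reorder to descending
print("component variances:", scores.var(axis=0, ddof=1))
print("d^2 / (n - 1):      ", d**2 / (n - 1))
print("eigenvalues of S:   ", eigvals)
```

The last three printed vectors should coincide up to floating-point error, linking the SVD of the data matrix to the eigenvalue problem for the covariance matrix.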
Facts
Since and