Definition

Spectral clustering technique make use of spectrum of the similarity matrix of the data to perform dimensionality reduction before clustering.

Algorithms

Apply Laplacian Eigenmap to the given data .

Form an sub matrix using the first -columns of the embedded matrix , and normalize each row to norm .

Apply clustering algorithm (e.g. K-Means Clustering) to , where is the -th row of .

Mathematical Background

Consider a cluster assignment vector for each data point. Then, clustering observations is corresponding to estimating ‘s The entries of the Adjacency Matrix represents the similarity between points and .

Under the assumption that the close data points (large ) have a similar label (), the optimization problem is set up that minimizing the difference between assignments for similar points. where the constraint is imposed to avoid the trivial solution .

In a matrix notation, the problem is

When , is large and when , is small. So, works as the weight for each pair in the optimization.

The Lagrangian function is defined as And the solution of the problem is given by an Eigenvalue problem. The solution is an matrix of eigenvectors sorted in ascending by their corresponding eigenvalues.