Spectral Clustering

Clustering is a popular data mining technique for placing data elements into groups that exhibit “similar behaviour”. The traditional clustering algorithm is the so-called k-means algorithm. However, k-means has some well-known problems: it does not work well on clusters without well-defined centers, the number k of clusters has to be chosen upfront, which is difficult, and different initial centers can lead to different final clusters.
In recent years, spectral clustering has become popular and widely used, since its results often outperform those of the k-means algorithm. Spectral clustering is more advanced than k-means: it draws on several mathematical concepts (degree matrices, weight matrices, similarity matrices, similarity graphs, graph Laplacians, eigenvalues and eigenvectors) in order to place similar data points in the same group and dissimilar data points in different groups.
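
To make the matrices named above more concrete, here is a minimal C++ sketch of how a weight matrix W, the degrees, and a graph Laplacian could be built from 2D points. It is an illustration only and not the code of the XVDM module: the Gaussian similarity function, the parameter sigma, the fully connected similarity graph, the unnormalized Laplacian L = D - W, and the names Point, Matrix, similarityMatrix and unnormalizedLaplacian are all assumptions made for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical types for this sketch; not taken from the XVDM code base.
struct Point { double x; double y; };
using Matrix = std::vector<std::vector<double>>;

// Weight (similarity) matrix W of a fully connected similarity graph.
// Assumed Gaussian kernel: w_ij = exp(-||p_i - p_j||^2 / (2 * sigma^2)),
// with w_ii = 0 (no self-loops).
Matrix similarityMatrix(const std::vector<Point>& pts, double sigma) {
    assert(sigma > 0.0);
    const std::size_t n = pts.size();
    Matrix W(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i + 1; j < n; ++j) {
            const double dx = pts[i].x - pts[j].x;
            const double dy = pts[i].y - pts[j].y;
            const double w = std::exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma));
            W[i][j] = w;  // W is symmetric, so fill both halves
            W[j][i] = w;
        }
    }
    return W;
}

// Unnormalized graph Laplacian L = D - W, where D is the diagonal
// degree matrix with d_ii = sum over j of w_ij.
Matrix unnormalizedLaplacian(const Matrix& W) {
    const std::size_t n = W.size();
    Matrix L(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        double degree = 0.0;
        for (std::size_t j = 0; j < n; ++j) {
            degree += W[i][j];
        }
        for (std::size_t j = 0; j < n; ++j) {
            L[i][j] = (i == j ? degree : 0.0) - W[i][j];
        }
    }
    return L;
}
```

A full spectral clustering run would then compute the eigenvectors of L that belong to its k smallest eigenvalues and cluster the rows of the resulting eigenvector matrix, typically with k-means. That step is left out here because it needs an eigen-solver.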

The goal of this project was to implement the spectral clustering algorithm as a module for the XVDM visual data mining tool. In addition, the strengths and weaknesses of the algorithm were to be studied by running experiments with the implemented solution on different, well-chosen datasets.

Overview

Programming Language: C++
Architecture: plugin
Development State: finished
Type: Visual data mining tool - Spectral Clustering

Tags: Spectral clustering, Data mining

Team Members

Juri Strumpflohner (me)