Eigenvalue-Based Incremental Spectral Clustering

Our previous experiments demonstrated that subsets of collections of short documents (with several hundred entries) share a common eigenvalue spectrum of the combinatorial Laplacian when that spectrum is normalized in some way. Based on this insight, we propose a method of incremental spectral clustering. The method consists of the following steps: (1) split the data into manageable subsets, (2) cluster each of the subsets, and (3) merge clusters from different subsets based on the similarity of their eigenvalue spectra to form clusters of the entire set. The method can be especially useful for clustering algorithms whose complexity grows steeply with the size of the data sample, as is the case for typical spectral clustering. Our experiments show that clustering the subsets and merging the results yields clusters close to those obtained by clustering the entire dataset. Our approach differs from other research streams in that we rely on the entire set (spectrum) of eigenvalues, whereas other researchers concentrate on a few eigenvectors related to the lowest eigenvalues. Such eigenvectors are considered in the literature to be of low reliability.
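Step (3) can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not say how the spectrum is normalized or how similarity is measured, so the choices below (dividing by the largest eigenvalue, resampling to a fixed length, and a root-mean-square distance) are assumptions, as are all function names. Steps (1) and (2) are taken as already done, and each cluster is represented by a symmetric similarity matrix; the toy matrices (a complete graph and a ring) merely stand in for two differently structured clusters.

```python
import numpy as np


def combinatorial_laplacian(S):
    """Return L = D - S for a symmetric similarity matrix S with zero diagonal."""
    S = np.asarray(S, dtype=float)
    return np.diag(S.sum(axis=1)) - S


def normalized_spectrum(S, n_points=20):
    """Sorted Laplacian eigenvalues, rescaled so clusters of any size compare.

    Assumed normalization (not taken from the paper): divide by the largest
    eigenvalue, then resample onto a common grid of n_points values.
    """
    eigvals = np.sort(np.linalg.eigvalsh(combinatorial_laplacian(S)))
    top = eigvals[-1]
    if top > 0:
        eigvals = eigvals / top
    src = np.linspace(0.0, 1.0, len(eigvals))
    dst = np.linspace(0.0, 1.0, n_points)
    return np.interp(dst, src, eigvals)


def spectrum_distance(a, b):
    """Root-mean-square difference between two resampled spectra (assumed metric)."""
    return float(np.sqrt(np.mean((a - b) ** 2)))


def match_clusters(spectra_a, spectra_b):
    """For each cluster of subset A, return the index of the closest cluster in subset B."""
    return [
        int(np.argmin([spectrum_distance(sa, sb) for sb in spectra_b]))
        for sa in spectra_a
    ]


def complete_graph(n):
    """Toy cluster: every pair of items has similarity 1."""
    return np.ones((n, n)) - np.eye(n)


def ring_graph(n):
    """Toy cluster: each item is similar only to its two ring neighbours."""
    S = np.zeros((n, n))
    for i in range(n):
        S[i, (i + 1) % n] = S[(i + 1) % n, i] = 1.0
    return S


# Subset A holds a dense and a ring-like cluster; subset B holds the same
# two kinds of cluster in the opposite order and with different sizes.
spectra_a = [normalized_spectrum(complete_graph(8)), normalized_spectrum(ring_graph(8))]
spectra_b = [normalized_spectrum(ring_graph(10)), normalized_spectrum(complete_graph(10))]
matches = match_clusters(spectra_a, spectra_b)
print(matches)
```

In this toy setting each cluster of subset A is paired with the cluster of subset B that has the same structure, despite the different sizes. A real pipeline would also need a rule for clusters that have no close counterpart, which this sketch does not address.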

eISSN: 2449-6499
Language: English

Frequency: 4 times per year
Journal subjects: Computer Sciences, Databases and Data Mining, Artificial Intelligence



Published online: 19 March 2024

Pages: 157 - 169

Received: 01 September 2023

Accepted: 07 February 2024

DOI: https://doi.org/10.2478/jaiscr-2024-0009

Keywords: text mining, artificial intelligence, machine learning, graph spectral clustering, incremental clustering, combinatorial Laplacian, eigenvalue spectrum analysis

© 2024 Mieczysław A. Kłopotek et al., published by Sciendo

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
