Published online: 15 Aug 2024
Pages: 14 - 23
Received: 12 Apr 2024
Accepted: 10 Jul 2024
DOI: https://doi.org/10.2478/acss-2024-0003
Keywords
© 2024 Vadim Romanuke, published by Sciendo
This work is licensed under the Creative Commons Attribution 4.0 International License.
An approach to speeding up the DBSCAN algorithm is suggested. The planar clusters to be revealed are assumed to be tightly packed and correlated, thus constituting a serpentine dataset that develops rightwards or leftwards as time goes on. The dataset is initially divided into a few sub-datasets along the time axis, whereupon the best neighbourhood radius is determined over the first sub-dataset, and the standard DBSCAN algorithm is then run over all the sub-datasets using that radius. To find the best neighbourhood radius, it is necessary to know the ground truth cluster labels of points within a region. The actual speedup registered in a series of 80 000 dataset computational simulations ranges from 5.0365 to 724.7633 and tends to increase as the dataset size increases.
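
The procedure summarised in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the author's implementation: the function names (`dbscan`, `split_along_time`, `best_radius`, `sliced_dbscan`), the grid of candidate radii, the `min_pts` value, and the use of the pair-counting Rand index to score a radius against the ground truth labels are all assumptions made for illustration. The abstract does not say how clusters are reconciled across sub-datasets, so the sketch simply gives every sub-dataset its own label range.

```python
import math
from itertools import combinations


def dbscan(points, eps, min_pts):
    """Plain O(n^2) DBSCAN; returns one label per point, -1 means noise."""
    n = len(points)
    labels = [None] * n  # None = not visited yet

    def neighbours(i):
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue  # already assigned, do not expand again
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # j is a core point, keep expanding
                queue.extend(reachable)
    return labels


def rand_index(a, b):
    """Share of point pairs on which two labelings agree (assumed criterion)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def split_along_time(points, n_parts):
    """Index lists of consecutive chunks after sorting by the time (x) axis."""
    order = sorted(range(len(points)), key=lambda i: points[i][0])
    size = math.ceil(len(order) / n_parts)
    return [order[k:k + size] for k in range(0, len(order), size)]


def best_radius(points, truth, candidates, min_pts):
    """Candidate radius whose clustering best matches the known labels."""
    return max(candidates,
               key=lambda eps: rand_index(dbscan(points, eps, min_pts), truth))


def sliced_dbscan(points, truth, n_parts, candidates, min_pts=3):
    """Tune eps on the first sub-dataset, then cluster every sub-dataset."""
    parts = split_along_time(points, n_parts)
    first = parts[0]
    # Only the ground truth labels of the first sub-dataset are read.
    eps = best_radius([points[i] for i in first],
                      [truth[i] for i in first], candidates, min_pts)
    labels = [-1] * len(points)
    offset = 0
    for idx in parts:
        sub = dbscan([points[i] for i in idx], eps, min_pts)
        for i, lab in zip(idx, sub):
            labels[i] = lab + offset if lab >= 0 else -1
        offset += max(sub, default=-1) + 1  # keep label ranges disjoint
    return eps, labels
```

With a brute-force neighbour search like the one above, clustering k sub-datasets of n/k points each needs roughly k times fewer distance evaluations than clustering all n points at once, which is one plausible source of a speedup that grows with the dataset size. A cluster that straddles a cut between sub-datasets would be reported as two clusters by this sketch.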