A Novel Method for Drift Detection in Streaming Data Based on Measurement of Changes in Feature Ranks
Published Online: Feb 05, 2025
Page range: 147 - 166
Received: Oct 02, 2024
Accepted: Dec 01, 2024
DOI: https://doi.org/10.2478/jaiscr-2025-0008
© 2025 Piotr Porwik et al., published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Hidden changes in a data stream, unknown to the learning algorithm, are referred to in the literature as drifts of various types. In non-stationary data streams, drift can degrade classifier accuracy, so the classifier must detect significant changes in the data and adapt its predictions. This article presents a new drift detection method based on analyzing changes in feature ranks across adjacent chunks of data. The input stream is divided into chunks; within each chunk the features are ranked by importance, and the rank of the most important feature is tracked from chunk to chunk. Changes in feature rankings between adjacent chunks are treated as symptoms of data drift. The Least Absolute Shrinkage and Selection Operator (LASSO) is proposed as an efficient procedure for determining the ranks. We compared our approach with well-known drift detection algorithms: the Drift Detection Method (DDM), Early Drift Detection Method (EDDM), ADaptive WINdowing (ADWIN), and Principal Component Analysis Feature Drift Detection (PCA-FDD). The tests were conducted on artificial data streams with different drift types (sudden, gradual, recurring, and incremental) as well as on real data, using both two-class and multi-class datasets. The experiments confirm that the proposed feature drift detection strategy produces valuable results.
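The chunk-wise idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the coordinate-descent LASSO solver, the rule "flag drift when the top-ranked feature changes between adjacent chunks", the penalty `alpha`, the chunk size, and the helper names (`lasso_coefficients`, `feature_ranking`, `detect_rank_drift`, `make_chunk`) are all assumptions chosen for illustration, and the synthetic stream models only a sudden drift.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the LASSO proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coefficients(X, y, alpha=0.05, n_iter=50):
    """Coordinate descent for (1/2n)*||y - Xw||^2 + alpha*||w||_1.

    Columns are standardized and y is centered, so each coordinate
    update reduces to a single soft-thresholding step.
    """
    n, d = len(X), len(X[0])
    cols = []
    for j in range(d):
        col = [row[j] for row in X]
        mean = sum(col) / n
        std = (sum((v - mean) ** 2 for v in col) / n) ** 0.5 or 1.0
        cols.append([(v - mean) / std for v in col])
    y_mean = sum(y) / n
    residual = [v - y_mean for v in y]
    w = [0.0] * d
    for _ in range(n_iter):
        for j, col in enumerate(cols):
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(c * r for c, r in zip(col, residual)) / n + w[j]
            new_w = soft_threshold(rho, alpha)
            delta = new_w - w[j]
            if delta:
                residual = [r - delta * c for r, c in zip(residual, col)]
                w[j] = new_w
    return w


def feature_ranking(weights):
    """Feature indices ordered from most to least important (by |weight|)."""
    return sorted(range(len(weights)), key=lambda j: -abs(weights[j]))


def detect_rank_drift(chunks, alpha=0.05):
    """Return indices of chunks whose top-ranked feature differs from the
    previous chunk's (an assumed, simplified decision rule)."""
    drift_points = []
    previous_top = None
    for index, (X, y) in enumerate(chunks):
        top = feature_ranking(lasso_coefficients(X, y, alpha))[0]
        if previous_top is not None and top != previous_top:
            drift_points.append(index)
        previous_top = top
    return drift_points


def make_chunk(rng, informative, size=200, n_features=3):
    """Hypothetical synthetic chunk: the label depends on one feature only."""
    X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(size)]
    y = [1.0 if row[informative] + 0.1 * rng.gauss(0, 1) > 0 else 0.0 for row in X]
    return X, y


rng = random.Random(0)
# Sudden drift: the informative feature switches from 0 to 2 at the third chunk.
stream = [make_chunk(rng, 0), make_chunk(rng, 0), make_chunk(rng, 2), make_chunk(rng, 2)]
drifts = detect_rank_drift(stream)
print("Drift flagged at chunk indices:", drifts)
```

Comparing only the top-ranked feature keeps the sketch short; a fuller variant could compare whole rank vectors between chunks. In practice a library LASSO solver would normally replace the hand-written coordinate descent, which is included here only to keep the example dependency-free.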