Open Access

Dynamic domain analysis for predicting concept drift in engineering AI-enabled software

May 07, 2025


Figure 1.

Class and covariate drifts, and how our approach aims to eventually address domain concept drift in AIS.

Figure 2.

High-level overview of our iterative process.

Figure 3.

Current research focuses on the prediction of concept drift, while future work aims at addressing the drift.

Figure 4.

Sudden drift in P(X): the frequency of pedestrian safety-related topics on March 18, 2018, the day a self-driving Uber ran over a pedestrian in Arizona.

Figure 5.

Model-generated captions and detected objects.

Figure 6.

The change in mean similarity scores for “pedestrian” in social topics before, during, and after car accidents.

Figure 7.

The Gaussian probability density function shows the interval probabilities as areas under the curve.
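
The idea behind the Gaussian caption above, that the probability of an interval equals the area under the density curve over that interval, can be sketched as follows. This is a minimal illustration rather than the authors' implementation; the function names `normal_cdf` and `interval_probability` and the standard-normal defaults are assumptions introduced here.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative distribution function of N(mu, sigma^2), via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def interval_probability(a, b, mu=0.0, sigma=1.0):
    """P(a <= X <= b): the area under the Gaussian density between a and b."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if a > b:
        raise ValueError("a must not exceed b")
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)


# Hypothetical usage: probability mass within one and two standard deviations.
within_one_sigma = interval_probability(-1.0, 1.0)
within_two_sigma = interval_probability(-2.0, 2.0)
print(f"{within_one_sigma:.4f} {within_two_sigma:.4f}")
```

Computing the area as a difference of two CDF values avoids numerical integration of the density itself.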

Figure 8.

The set of terms and their probability shifts in the car accident data.

Figure 9.

The change in mean similarity scores for “pedestrian” in social topics before, during, and after Halloween.

Figure 10.

The set of terms and their probability shifts in the Halloween data.

Figure 11.

Probability shifts of terms in the Airplane Crashes dataset.

Figure 12.

The change in mean similarity scores for the term “airplane” in social topics before, during, and after the plane crash.

Qualitative comparison of concept drift detection methods

Metric | Proposed Framework | FiCSUM | DDM/EDDM
Proactivity | High (Proactive) | Moderate (Recurring Drifts) | Low (Post Hoc)
Adaptability | High (Domain-Agnostic) | Moderate (Recurring Drifts) | Low (Frequent Retraining)
Feature | Semantic + Visual | Meta-Features | Error-Based Only
Efficiency | High | Moderate | Moderate
Detection Accuracy | High | High | Moderate

Top five words returned by different search queries on Google Books N-gram

“pedestrian” + [verb] | “pedestrian” + [noun]
pedestrian crossing | pedestrian traffic
pedestrian walks | pedestrian mall
pedestrian killed | pedestrian bridge
pedestrian pass | pedestrian street
pedestrian moving | pedestrian zone

Top ten words most similar to “pedestrian” from the Wikipedia and Google News corpora

Wikipedia term | Similarity | Google News term | Similarity
walkway | 0.6928 | bicyclist | 0.6166
lanes | 0.6808 | crosswalk | 0.5942
sidewalks | 0.6572 | motorist | 0.5460
roadway | 0.6411 | bike lanes | 0.5416
vehicular | 0.6380 | pedestrian walkways | 0.5328
thoroughfare | 0.6337 | bicycle lanes | 0.5256
subway | 0.6296 | bikeway | 0.5248
underpass | 0.6193 | traffic calming | 0.5239
overpass | 0.6157 | roadway | 0.5181
parking | 0.6129 | traffic | 0.5173
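
A ranking like the one in the table above can be produced by scoring each candidate word against the target and sorting. The sketch below assumes the similarity scores are cosine similarities between word embeddings, which this page does not state. The three-dimensional vectors in `toy_embeddings` are hypothetical values chosen for illustration, not embeddings from either corpus, and `most_similar` is a helper named here for convenience.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def most_similar(target, embeddings, top_n=10):
    """Return up to top_n (word, score) pairs, most similar to target first."""
    scores = [
        (word, cosine_similarity(embeddings[target], vector))
        for word, vector in embeddings.items()
        if word != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


# Hypothetical toy vectors; real embeddings have hundreds of dimensions.
toy_embeddings = {
    "pedestrian": [0.9, 0.1, 0.3],
    "crosswalk": [0.8, 0.2, 0.4],
    "subway": [0.2, 0.9, 0.1],
}
ranking = most_similar("pedestrian", toy_embeddings, top_n=2)
```

With pretrained embeddings in place of the toy dictionary, the same two functions would yield a list of nearest terms and scores in the format of the table.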

Collected datasets for autonomous car accidents

Date | Accident | # Tweets
29 July 2016 | Tesla | 89,881
18 March 2018 | Uber | 119,121
26 April 2019 | Tesla | 154,916

Summary of the GDELT dataset for airplane crash events

Airplane Crash | Date of Crash | Total News Articles | Articles Containing “Airplane”
California 2020 | Jan 26, 2020 | 1,813,710 | 3,900 (before: 1,327, during: 1,313, after: 1,260)
Washington 2022 | Sep 4, 2022 | 1,126,899 | 1,979 (before: 576, during: 768, after: 635)
Washington 2025 | Jan 29, 2025 | 1,886,173 | 11,541 (before: 1,437, during: 6,984, after: 3,120)
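
The before/during/after counts in the GDELT summary can be turned into simple indicators of how sharply coverage shifted around each crash. The sketch below is illustrative only and is not the paper's probability-shift metric. The counts are copied from the table, while the helper names `period_shares` and `during_to_before_ratio` are introduced here. Because the page does not state the window lengths, the ratio is only meaningful if the three windows are of comparable length.

```python
# Counts of articles containing "airplane", copied from the GDELT summary table.
airplane_articles = {
    "California 2020": {"before": 1327, "during": 1313, "after": 1260},
    "Washington 2022": {"before": 576, "during": 768, "after": 635},
    "Washington 2025": {"before": 1437, "during": 6984, "after": 3120},
}


def period_shares(counts):
    """Fraction of an event's matching articles that fall in each period."""
    total = sum(counts.values())
    return {period: n / total for period, n in counts.items()}


def during_to_before_ratio(counts):
    """How many times more matching articles appeared during than before."""
    return counts["during"] / counts["before"]


shares = {event: period_shares(c) for event, c in airplane_articles.items()}
ratios = {event: during_to_before_ratio(c) for event, c in airplane_articles.items()}

for event in airplane_articles:
    print(f"{event}: during share {shares[event]['during']:.2f}, "
          f"during/before ratio {ratios[event]:.2f}")
```

Under these assumptions, a ratio near 1 (California 2020) suggests little change in how often the term appears, whereas a ratio well above 1 (Washington 2025) is the kind of abrupt change in term frequency that a sudden drift in P(X) would produce.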

Language: English
Publication timeframe: 4 times per year
Journal Subjects: Computer Sciences, Information Technology, Project Management, Databases and Data Mining