[
Arjovsky, M., Chintala, S. and Bottou, L. (2017). Wasserstein generative adversarial networks, International Conference on Machine Learning, Sydney, Australia, pp. 214–223.
]Search in Google Scholar
[
Barua, S., Islam, M.M. and Murase, K. (2013). PROWSYN: Proximity weighted synthetic oversampling technique for imbalanced data set learning, Advances in Knowledge Discovery and Data Mining: 17th Pacific-Asia Conference, PAKDD 2013, Gold Coast, Australia, pp. 317–328.
]Search in Google Scholar
[
Bourou, S., El Saer, A., Velivassaki, T.-H., Voulkidis, A. and Zahariadis, T. (2021). A review of tabular data synthesis using GANs on an IDS dataset, Information 12(09): 375.
]Search in Google Scholar
[
Breiman, L. (2001). Random forests, Machine Learning 45(1): 5–32.
]Search in Google Scholar
[
Breiman, L. (2017). Classification and Regression Trees, Routledge, London.
]Search in Google Scholar
[
Budach, L., Feuerpfeil, M., Ihde, N., Nathansen, A., Noack, N., Patzlaff, H., Naumann, F. and Harmouch, H. (2022). The effects of data quality on machine learning performance, arXiv: 2207.14529.
]Search in Google Scholar
[
Chaabane, I., Guermazi, R. and Hammami, M. (2020). Enhancing techniques for learning decision trees from imbalanced data, Advances in Data Analysis and Classification 14(3): 1–69.
]Search in Google Scholar
[
Chawla, N.V., Bowyer, K.W., Hall, L.O. and Kegelmeyer, W.P. (2002). SMOTE: Synthetic minority over-sampling technique, Journal of Artificial Intelligence Research 16: 321–357.
]Search in Google Scholar
[
Chen, J., Huang, H., Cohn, A.G., Zhang, D. and Zhou, M. (2022). Machine learning-based classification of rock discontinuity trace: SMOTE oversampling integrated with GBT ensemble learning, International Journal of Mining Science and Technology 32(2): 309–322.
]Search in Google Scholar
[
Chen, J., Yan, Z., Lin, C., Yao, B. and Ge, H. (2023). Aero-engine high speed bearing fault diagnosis for data imbalance: A sample enhanced diagnostic method based on pre-training WGAN-GP, Measurement 213(7): 112709.
]Search in Google Scholar
[
Cover, T. and Hart, P. (1967). Nearest neighbor pattern classification, IEEE Transactions on Information Theory 13(1): 21–27.
]Search in Google Scholar
[
Cui, J., Zong, L., Xie, J. and Tang, M. (2023). A novel multi-module integrated intrusion detection system for high-dimensional imbalanced data, Applied Intelligence 53(1): 272–288.
]Search in Google Scholar
[
Derrac, J., Garcia, S., Sanchez, L. and Herrera, F. (2015). Keel data-mining software tool: Data set repository, integration of algorithms and experimental analysis framework, Journal of Multiple-Valued Logic and Soft Computing 17(2–3): 255–287.
]Search in Google Scholar
[
Douzas, G. and Bacao, F. (2018). Effective data generation for imbalanced learning using conditional generative adversarial networks, Expert Systems with Applications 91(1): 464–471.
]Search in Google Scholar
[
Dua, D. and Graff, C. (2019). UCI Machine Learning Repository, http://archive.ics.uci.edu/ml.
]Search in Google Scholar
[
Fernández, A., Garcia, S., Herrera, F. and Chawla, N.V. (2018). Smote for learning from imbalanced data: Progress and challenges, marking the 15-year anniversary, Journal of Artificial Intelligence Research 61: 863–905.
]Search in Google Scholar
[
Freund, Y. and Schapire, R.E. (1997). A decision-theoretic generalization of on-line learning and an application to boosting, Journal of Computer and System Sciences 55(1): 119–139.
]Search in Google Scholar
[
García, S., Luengo, J. and Herrera, F. (2016). Tutorial on practical tips of the most influential data preprocessing algorithms in data mining, Knowledge-Based Systems 98(7): 1–29.
]Search in Google Scholar
[
Gazzah, S. and Amara, N.E.B. (2008). New oversampling approaches based on polynomial fitting for imbalanced data sets, 2008 8th IAPR International Workshop on Document Analysis Systems, Nara, Japan, pp. 677–684.
]Search in Google Scholar
[
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A. and Bengio, Y. (2014). Generative adversarial nets, Advances in Neural Information Processing Systems 27: 2672–2680.
]Search in Google Scholar
[
Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V. and Courville, A.C. (2017). Improved training of Wasserstein GANs, Advances in Neural Information Processing Systems 30: 5767–5777.
]Search in Google Scholar
[
Guyon, I. and Elisseeff, A. (2003). An introduction to variable and feature selection, Journal of Machine Learning Research 3(Mar): 1157–1182.
]Search in Google Scholar
[
He, H. and Garcia, E.A. (2009). Learning from imbalanced data, IEEE Transactions on Knowledge and Data Engineering 21(9): 1263–1284.
]Search in Google Scholar
[
Hernandez, M., Epelde, G., Alberdi, A., Cilla, R. and Rankin, D. (2022). Synthetic data generation for tabular health records: A systematic review, Neurocomputing 493(27): 28–45.
]Search in Google Scholar
[
James, G., Witten, D., Hastie, T. and Tibshirani, R. (2013). An Introduction to Statistical Learning: With Applications to R, 2nd Edn, Springer, New York.
]Search in Google Scholar
[
Janicka, M., Lango, M. and Stefanowski, J. (2019). Using information on class interrelations to improve classification of multiclass imbalanced data: A new resampling algorithm, International Journal of Applied Mathematics and Computer Science 29(4): 769–781, DOI: 10.2478/amcs-2019-0057.
]Search in Google Scholar
[
Japkowicz, N. (2003). Class imbalances: Are we focusing on the right issue, Workshop on Learning from Imbalanced Data Sets II, Washington, USA, p. 63.
]Search in Google Scholar
[
Kaggle (2024), Datasets: Lower Back Pain, https://www.kaggle.com/datasets/sammy123/lower-back-pain-symptoms-dataset, and Telecom Churn, https://www.kaggle.com/datasets/mnassrib/telecom-churn-datasets.
]Search in Google Scholar
[
Kohavi, R. (1995). A study of cross-validation and bootstrap for accuracy estimation and model selection, 14th International Joint Conference on Artificial Intelligence (IJCAI), Montreal, Canada, pp. 1137–1145.
]Search in Google Scholar
[
Kovács, G. (2019). An empirical comparison and evaluation of minority oversampling techniques on a large number of imbalanced datasets, Applied Soft Computing 83(9): 105662.
]Search in Google Scholar
[
Liu, X.-Y., Wu, J. and Zhou, Z.-H. (2008). Exploratory undersampling for class-imbalance learning, IEEE Transactions on Systems, Man, and Cybernetics B: Cybernetics 39(2): 539–550.
]Search in Google Scholar
[
López, V., Fernández, A., García, S., Palade, V. and Herrera, F. (2013). An insight into classification with imbalanced data: Empirical results and current trends on using data intrinsic characteristics, Information Sciences 250(33): 113–141.
]Search in Google Scholar
[
Mirza, M. and Osindero, S. (2014). Conditional generative adversarial nets, arXiv: 1411.1784.
]Search in Google Scholar
[
Miyato, T., Kataoka, T., Koyama, M. and Yoshida, Y. (2018). Spectral normalization for generative adversarial networks, arXiv: 1802.05957.
]Search in Google Scholar
[
Moreo, A., Esuli, A. and Sebastiani, F. (2016). Distributional random oversampling for imbalanced text classification, Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, Pisa, Italy, pp. 805–808.
]Search in Google Scholar
[
Napierala, K. and Stefanowski, J. (2016). Types of minority class examples and their influence on learning classifiers from imbalanced data, Journal of Intelligent Information Systems 46: 563–597.
]Search in Google Scholar
[
Nik, A.H.Z., Riegler, M.A., Halvorsen, P. and Storås, A.M. (2023). Generation of synthetic tabular healthcare data using generative adversarial networks, International Conference on Multimedia Modeling, Bergen, Norway, pp. 434–446.
]Search in Google Scholar
[
Ohsaki, M., Wang, P., Matsuda, K., Katagiri, S., Watanabe, H. and Ralescu, A. (2017). Confusion-matrix-based kernel logistic regression for imbalanced data classification, IEEE Transactions on Knowledge and Data Engineering 29(9): 1806–1819.
]Search in Google Scholar
[
Park, N., Mohammadi, M., Gorde, K., Jajodia, S., Park, H. and Kim, Y. (2018). Data synthesis based on generative adversarial networks, Proceedings of the VLDB Endowment 11(10): 1071–1083.
]Search in Google Scholar
[
Park, S. and Park, H. (2021). Combined oversampling and undersampling method based on slow-start algorithm for imbalanced network traffic, Computing 103(3): 401–424.
]Search in Google Scholar
[
Powers, D.M. (2020). Evaluation: From precision, recall and f-measure to ROC, informedness, markedness and correlation, arXiv: 2010.16061.
]Search in Google Scholar
[
Ren, J., Wang, Y., Cheung, Y.-m., Gao, X.-Z. and Guo, X. (2023). Grouping-based oversampling in kernel space for imbalanced data classification, Pattern Recognition 133(1): 108992.
]Search in Google Scholar
[
Sáez, J.A., Luengo, J., Stefanowski, J. and Herrera, F. (2015). SMOTE–IPF: Addressing the noisy and borderline examples problem in imbalanced classification by a re-sampling method with filtering, Information Sciences 291(2): 184–203.
]Search in Google Scholar
[
Sun, B., Zhou, Q., Wang, Z., Lan, P., Song, Y., Mu, S., Li, A., Chen, H. and Liu, P. (2023). Radial-based undersampling approach with adaptive undersampling ratio determination, Neurocomputing 553(39): 126544.
]Search in Google Scholar
[
Sun, Y., Wong, A.K. and Kamel, M.S. (2009). Classification of imbalanced data: A review, International Journal of Pattern Recognition and Artificial Intelligence 23(04): 687–719.
]Search in Google Scholar
[
Wasserman, L. (2004). All of Statistics: A Concise Course in Statistical Inference, Springer, New York.
]Search in Google Scholar
[
Wold, S., Esbensen, K. and Geladi, P. (1987). Principal component analysis, Chemometrics and Intelligent Laboratory Systems 2(1–3): 37–52.
]Search in Google Scholar
[
Woods, K.S., Doss, C.C., Bowyer, K.W., Solka, J.L., Priebe, C.E. and Kegelmeyer Jr, W.P. (1993). Comparative evaluation of pattern recognition techniques for detection of microcalcifications in mammography, International Journal of Pattern Recognition and Artificial Intelligence 7(06): 1417–1436.
]Search in Google Scholar
[
Xie, Y. and Zhang, T. (2018). Imbalanced learning for fault diagnosis problem of rotating machinery based on generative adversarial networks, 2018 37th Chinese Control Conference (CCC), Wuhan, China, pp. 6017–6022.
]Search in Google Scholar
[
Xu, L., Skoularidou, M., Cuesta-Infante, A. and Veeramachaneni, K. (2019). Modeling tabular data using conditional GAN, Advances in Neural Information Processing Systems 32: 7335–7345.
]Search in Google Scholar
[
Yang, C., Lu, X., Lin, Z., Shechtman, E., Wang, O. and Li, H. (2017). High-resolution image inpainting using multi-scale neural patch synthesis, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, pp. 6721–6729.
]Search in Google Scholar
[
Zhang, M., Wan, X., Gang, L., Lv, X., Wu, Z. and Liu, Z. (2021). An automated driving strategy generating method based on WGAIL–DDPG, International Journal of Applied Mathematics and Computer Science 31(3): 461–470, DOI: 10.34768/amcs-2021-0031.
]Search in Google Scholar
[
Zhang, Y., Liu, Y., Wang, Y. and Yang, J. (2023). An ensemble oversampling method for imbalanced classification with prior knowledge via generative adversarial network, Chemometrics and Intelligent Laboratory Systems 235(4): 104775.
]Search in Google Scholar
[
Zhao, Y., Li, H., Bissyandé, T.F., Klein, J. and Grundy, J. (2021). On the impact of sample duplication in machine-learning-based android malware detection, ACM Transactions on Software Engineering and Methodology 30(3): 1–38.
]Search in Google Scholar
[
Zhao, Z., Kunar, A., Birke, R. and Chen, L.Y. (2021). CTAB-GAN: Effective table data synthesizing, Asian Conference on Machine Learning, pp. 97–112, (virtual).
]Search in Google Scholar
[
Zheng, M., Li, T., Zhu, R., Tang, Y., Tang, M., Lin, L. and Ma, Z. (2020a). Conditional wasserstein generative adversarial network-gradient penalty-based approach to alleviating imbalanced data classification, Information Sciences 512(7): 1009–1023.
]Search in Google Scholar
[
Zheng, W. and Zhao, H. (2020b). Cost-sensitive hierarchical classification for imbalance classes, Applied Intelligence 50(8): 2328–2338.
]Search in Google Scholar
[
Zhu, B., Pan, X., vanden Broucke, S. and Xiao, J. (2022). A GAN-based hybrid sampling method for imbalanced customer classification, Information Sciences 609(28): 1397–1411.
]Search in Google Scholar