Default Prediction in the Finance Industry Based on Ensemble Learning: Combining Machine Learning and Deep Learning

Language: English

Publication frequency: 2 issues per year
Journal subjects: Economics, Business Administration, Management, Organization and Corporate Governance, Principles of Corporate Governance, Business Administration, other, Mathematics and Statistics for Economists, Mathematics

Le Hoanh-Su

Le Quang Chan Phong

Truong Cong Vinh

Ho Mai Minh Nhat

Jong-Hwa Lee

Published online: 20 June 2025

Page range: 198-218

Submitted: 05 May 2024

Accepted: 11 Oct. 2024

DOI: https://doi.org/10.2478/bsrj-2025-0010

Keywords: default prediction, risk assessment, machine learning, deep learning, ensemble learning, online lending

© 2025 Le Hoanh-Su et al., published by Sciendo

This work is licensed under the Creative Commons Attribution 4.0 International License.
