
The recent increase in peer-to-peer lending has prompted the development of models that separate good from bad clients in order to mitigate risks for both lenders and platforms. The rapidly growing body of literature provides several comparisons between various models. Among the most frequently employed are logistic regression, Support Vector Machines, neural networks and decision tree-based models. Of these, logistic regression has proved to be a strong candidate, both for its good performance and for its high explainability. The present paper compares four pairs of models (fitted on imbalanced and on under-sampled data) for predicting charged-off clients by optimizing the F1 score. We found that, when the data is balanced, Logistic Regression, both plain and with Stochastic Gradient Descent, outperforms LightGBM and K-Nearest Neighbors on the F1 score. We chose this metric because it balances the interests of the lenders against those of the platform. Loan term, debt-to-income ratio and number of accounts were found to be important, positively related predictors of the risk of charge-off. At the other end of the spectrum, by far the strongest impact on charge-off probability is that of the FICO score. The final number of features retained by the two models differs considerably: although both use Lasso for feature selection, the Stochastic Gradient Descent Logistic Regression applies stronger regularization. The analysis was performed in Python (numpy, pandas, sklearn and imblearn).
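
The sketch below illustrates, under stated assumptions, the kind of comparison described above: the four classifiers evaluated on cross-validated F1 score, once on the imbalanced data and once after under-sampling with imblearn. It is not the authors' exact pipeline; the feature matrix `X`, target `y`, hyperparameter values, the 5-fold split and the choice of `RandomUnderSampler` are illustrative assumptions.

```python
# Minimal sketch of the model comparison (assumptions: X, y already prepared,
# hyperparameters and 5-fold CV are illustrative, RandomUnderSampler stands in
# for whatever under-sampling scheme was actually used).
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score
from imblearn.pipeline import Pipeline
from imblearn.under_sampling import RandomUnderSampler
from lightgbm import LGBMClassifier

models = {
    # L1 (Lasso) penalty performs feature selection in both logistic variants;
    # the SGD version is given a larger alpha, i.e. stronger regularization,
    # so it retains fewer features.
    "logreg": LogisticRegression(penalty="l1", solver="liblinear", C=1.0),
    "sgd_logreg": SGDClassifier(loss="log_loss", penalty="l1",
                                alpha=1e-3, max_iter=1000),
    "lightgbm": LGBMClassifier(n_estimators=200),
    "knn": KNeighborsClassifier(n_neighbors=25),
}

def f1_scores(X, y, under_sample=False):
    """Cross-validated F1 score (positive class = charged off) per model."""
    results = {}
    for name, clf in models.items():
        steps = [("scale", StandardScaler())]
        if under_sample:
            # Balance the classes by randomly dropping fully-paid loans.
            steps.append(("under", RandomUnderSampler(random_state=0)))
        steps.append(("clf", clf))
        pipe = Pipeline(steps)
        results[name] = cross_val_score(pipe, X, y, scoring="f1", cv=5).mean()
    return results
```

Calling `f1_scores(X, y, under_sample=False)` and `f1_scores(X, y, under_sample=True)` yields the two settings (imbalanced vs. balanced) whose pairing across the four classifiers corresponds to the "four pairs of models" compared in the paper.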