Early Warning System for Debt Group Migration: The Case of One Commercial Bank in Vietnam
10 Sept. 2024
About this article
Published online: 10 Sept. 2024
Page range: 195 - 216
DOI: https://doi.org/10.2478/fman-2024-0012
Keywords
© 2024 Quoc Hung Nguyen et al., published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The results of the evaluation criteria for the B Score model (Source: Author’s own calculation)
Dataset | Criteria (%) | Tuned by MCC | Tuned by F-Recall | ||||||
LR | SVM | DT | RF | LR | SVM | DT | RF | ||
Train | Accuracy | 64.62 | 83.22 | 96.29 | 64.62 | 47.05 | 50.31 | ||
Recall | 65.19 | 86.98 | 98.20 | 65.19 | 91.53 | 90.05 | |||
Precision | 47.59 | 69.86 | 91.29 | 47.59 | 37.73 | 39.19 | |||
F1 | 55.02 | 77.48 | 94.62 | 55.02 | 53.44 | 54.61 | |||
F-Recall | 60.70 | 82.91 | 96.74 | 60.70 | 71.22 | 71.49 | |||
F-Precision | 50.31 | 72.72 | 92.59 | 50.31 | 42.76 | 44.18 | |||
MCC | 27.92 | 65.34 | 91.94 | 27.92 | 19.60 | 22.82 | |||
Validation | Accuracy | 64.66 | 70.58 | 74.09 | 64.66 | 46.62 | 50.41 | ||
Recall | 64.98 | 61.58 | 66.95 | 64.98 | 87.83 | 70.78 | |||
Precision | 47.62 | 55.08 | 59.79 | 47.62 | 37.52 | 39.02 | |||
F1 | 54.96 | 58.15 | 63.17 | 54.96 | 53.22 | 54.04 | |||
F-Recall | 60.56 | 60.16 | 65.39 | 60.56 | 70.26 | 70.46 | |||
F-Precision | 50.31 | 56.27 | 61.10 | 50.31 | 42.54 | 43.90 | |||
MCC | 27.89 | 35.71 | 43.45 | 27.89 | 18.94 | 21.29 |
Matrix for different types of alerts (Source: Reihart et al., 2010)
- | The event occurred | The event did not occur |
---|---|---|
There is a warning signal | A | B |
There is no warning signal | C | D |
Summary of parameters of models (Source: Author’s compilation)
Model | Parameter | Description |
---|---|---|
LR | None | The baseline model is logistic regression (a linear model combined with the sigmoid activation function), so no tuning is required |
SVM | Kernel function | The kernel function used to map data into a different feature space for linear separation: Linear, Polynomial, Sigmoid, or RBF |
SVM | C | The coefficient that balances the weight between distance (margin) and noise |
SVM | d | The degree parameter when using the Polynomial kernel, which takes a natural number value |
SVM | γ | The gamma parameter for the Polynomial, Sigmoid, and RBF kernels, which takes a non-negative value |
SVM | r | The intercept for the Polynomial and Sigmoid kernels |
DT | Depth | The depth of the DT must be limited to avoid overfitting and reduce computational cost |
DT | Number of leaf nodes | The number of leaf nodes of the DT must be limited to avoid overfitting and reduce computational cost |
RF | Depth | The depth of each DT must be limited to avoid overfitting and reduce computational cost |
RF | Number of leaf nodes | The number of leaf nodes of each DT must be limited to avoid overfitting and reduce computational cost |
RF | Number of DTs | The number of DTs in the Random Forest Classifier (RF) must be chosen with computational cost in mind, since cost grows with the number of trees |
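To make the roles of γ, r, and d in the table above concrete, the following is a minimal sketch of the four kernel functions in their conventional form (the form also used by scikit-learn's `SVC`). The function names and default values are illustrative assumptions, not taken from the article.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    # K(x, y) = <x, y>; uses none of gamma, r, d.
    return dot(x, y)


def polynomial_kernel(x, y, gamma=1.0, r=0.0, d=3):
    # K(x, y) = (gamma * <x, y> + r) ** d; d is the degree, r the intercept.
    return (gamma * dot(x, y) + r) ** d


def sigmoid_kernel(x, y, gamma=1.0, r=0.0):
    # K(x, y) = tanh(gamma * <x, y> + r)
    return math.tanh(gamma * dot(x, y) + r)


def rbf_kernel(x, y, gamma=1.0):
    # K(x, y) = exp(-gamma * ||x - y||^2); has no intercept or degree.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)
```

The sketch shows why d matters only for the Polynomial kernel, r only for the Polynomial and Sigmoid kernels, and γ for every kernel except Linear.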
Confusion matrix (Source: Author’s illustration)
Target variable | Predicted: 1 | Predicted: 0 | Total |
---|---|---|---|
Actual: 1 | TP: True positives | FN: False negatives | P |
Actual: 0 | FP: False positives | TN: True negatives | N |
Total | | | P + N |
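The evaluation criteria reported in this article can all be derived from the four cells of the confusion matrix. The sketch below uses the standard definitions; it assumes that F-Recall and F-Precision are F-beta scores with beta = 2 and beta = 0.5 respectively. That assumption is not stated in this excerpt, but it reproduces the reported validation figures for LR (precision 47.62, recall 64.98, F-Recall 60.56, F-Precision 50.31). The function names are hypothetical.

```python
import math


def f_beta(precision, recall, beta):
    """Weighted harmonic mean of precision and recall.

    beta > 1 weights recall more heavily; beta < 1 weights precision.
    """
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def confusion_metrics(tp, fn, fp, tn):
    """Standard criteria from a 2x2 confusion matrix.

    Assumes every row and column total is non-zero (no zero-division guard).
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "recall": recall,
        "precision": precision,
        "f1": f_beta(precision, recall, 1.0),
        # Assumption: F-Recall = F2 and F-Precision = F0.5.
        "f_recall": f_beta(precision, recall, 2.0),
        "f_precision": f_beta(precision, recall, 0.5),
        "mcc": (tp * tn - fp * fn) / mcc_denominator,
    }
```

For example, `f_beta(0.4762, 0.6498, 2.0)` evaluates to roughly 0.6056, in line with the 60.56% F-Recall reported for LR on the validation set.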
Early warning system deployment for B Score customers (Source: Author’s own calculation)
Criteria (%) | Tuned by MCC: RF (best) | Tuned by F-Recall: SVM (best) | Tuned by F-Recall: RF (second best) |
---|---|---|---|
Accuracy | 81.84 → | 46.62 → | 79.85 → |
Recall | 67.45 → | 91.48 → | 70.78 → |
Precision | 75.26 → | 37.52 → | 69.20 → |
F-Recall | 68.88 → | 71.04 → | 70.46 → |
The results of the evaluation criteria for the C Score model (Source: Author’s own calculation)
Customer | Model | Selection | Parameters |
---|---|---|---|
B Score | Best | RF tuned by MCC | n_estimators = 100; |
B Score | Best | SVM tuned by F-Recall | kernel = ‘sigmoid’; |
B Score | Second best | RF tuned by F-Recall | n_estimators = 100; |
C Score | Best | SVM tuned by MCC | kernel = ‘poly’; |
C Score | Best | SVM tuned by F-Precision | kernel = ‘poly’; |
Model tuning parameters in Scikit-learn (Source: Author’s own research)
Model | Parameter | Parameters in Scikit-learn | Range of values for tuning |
---|---|---|---|
LR | None | None | None |
SVM | Kernel function | | ‘linear’, ‘poly’, ‘rbf’, ‘sigmoid’ |
SVM | C | | 0.01, 0.1, 1, 10 |
SVM | d | | 2, 3, 4, 5 (this is applicable only when |
SVM | γ | | 0.01, 0.1, 1, 10 (not applicable when kernel is set to ‘linear’) |
DT | Depth | | The range from 2 to 21 (with a step size of 2) and none |
DT | Number of leaf nodes | | The range from 2 to 21 (with a step size of 2) and none |
RF | Depth | Similar to DT | Similar to DT |
RF | Number of leaf nodes | Similar to DT | Similar to DT |
RF | Number of DTs | | 10, 50, 100 |
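The tuning ranges above map naturally onto a scikit-learn grid search. The following is a hedged sketch, not the authors' code: the parameter names (`kernel`, `C`, `degree`, `gamma`, `max_depth`, `max_leaf_nodes`, `n_estimators`) are the usual scikit-learn names and are assumed here, as are the 5-fold cross-validation, the restriction of `degree` to the ‘poly’ kernel, and the use of beta = 2 and beta = 0.5 for F-Recall and F-Precision. The helper names are hypothetical.

```python
import math

C_VALUES = [0.01, 0.1, 1, 10]
GAMMA_VALUES = [0.01, 0.1, 1, 10]
# 2, 4, ..., 20 plus None (no limit), following "2 to 21 with a step size of 2".
TREE_LIMITS = list(range(2, 22, 2)) + [None]

# A list of sub-grids keeps degree on 'poly' only and gamma off 'linear'.
SVM_GRID = [
    {"kernel": ["linear"], "C": C_VALUES},
    {"kernel": ["poly"], "C": C_VALUES,
     "degree": [2, 3, 4, 5], "gamma": GAMMA_VALUES},
    {"kernel": ["rbf", "sigmoid"], "C": C_VALUES, "gamma": GAMMA_VALUES},
]
DT_GRID = {"max_depth": TREE_LIMITS, "max_leaf_nodes": TREE_LIMITS}
RF_GRID = {**DT_GRID, "n_estimators": [10, 50, 100]}


def count_candidates(grid):
    """Number of parameter combinations a grid search would evaluate."""
    sub_grids = grid if isinstance(grid, list) else [grid]
    return sum(math.prod(len(v) for v in g.values()) for g in sub_grids)


def make_scorers():
    """Scorers for the three tuning criteria (beta values are assumptions)."""
    # Imported lazily so the grids can be inspected without scikit-learn.
    from sklearn.metrics import fbeta_score, make_scorer, matthews_corrcoef
    return {
        "MCC": make_scorer(matthews_corrcoef),
        "F-Recall": make_scorer(fbeta_score, beta=2),
        "F-Precision": make_scorer(fbeta_score, beta=0.5),
    }


def tune(estimator, grid, X, y, scoring, cv=5):
    """Exhaustive grid search; returns the fitted GridSearchCV object."""
    from sklearn.model_selection import GridSearchCV
    search = GridSearchCV(estimator, grid, scoring=scoring, cv=cv)
    search.fit(X, y)
    return search
```

A call such as `tune(SVC(), SVM_GRID, X_train, y_train, make_scorers()["MCC"])` would then expose the selected configuration through `best_params_`. Under these assumptions the SVM grid contains 100 candidates, the DT grid 121, and the RF grid 363, which illustrates why the number of trees is kept small.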
Early warning system deployment for C Score customers (Source: Author’s own calculation)
Criteria (%) | Tuned by MCC: SVM (best) | Tuned by F-Precision: SVM (best) |
---|---|---|
Accuracy | 70.78 → | 71.60 → |
Recall | 53.57 → | 45.24 → |
Precision (*) | 58.44 → | 62.30 → |
F-Precision (*) | 57.40 → | 57.93 → |