Figure 1:

Research development methodology for generating synthetic data, predicting money laundering transactions using CNN, and interpreting the predictions using SHAP. AI, artificial intelligence; CNN, convolutional neural network; ML, machine learning; RF, random forest; SHAP, SHapley Additive exPlanations; SVM, support vector machine.

Figure 2:

Synthetic financial transaction data generation methodology.

Figure 3:

CNN architecture to predict suspicious money laundering transactions. CNN, convolutional neural network.

Figure 4:

ROC curve for (A) CNN, (B) RF, (C) XGBoost, and (D) SVM classifiers. CNN, convolutional neural network; RF, random forest; SVM, support vector machine; XGBoost, extreme gradient boosting.

Figure 5:

Interpretation of CNN predictions using SHAP force plot. (A) Force plot of a transaction predicted as suspicious by CNN. (B) Force plot of a transaction predicted as legitimate by CNN. CNN, convolutional neural network; SHAP, SHapley Additive exPlanations.

Figure 6:

Global interpretation of predictions made by RF, XGBoost, and SVM using feature importance score. RF, random forest; SVM, support vector machine; XGBoost, extreme gradient boosting.

Synthetic financial transaction dataset summary

| Parameter | Value |
| --- | --- |
| Number of customers | 442 |
| Number of accounts | 442 |
| Approximate number of transactions per customer | 184 |
| Approximate time period of transactions | 12 months |
| Total number of transactions | 92,824 |
| Labeled suspicious transactions | 4,054 |
| Labeled legitimate transactions | 88,770 |
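
The label counts above imply a strongly imbalanced dataset. As a quick arithmetic check (a sketch using only the figures in the summary; the variable names are illustrative):

```python
# Label counts taken from the dataset summary table.
suspicious = 4_054
legitimate = 88_770
total = suspicious + legitimate

# Share of suspicious transactions and the legitimate-to-suspicious ratio.
suspicious_share = suspicious / total
imbalance_ratio = legitimate / suspicious

print(f"total transactions: {total:,}")
print(f"suspicious share:   {suspicious_share:.2%}")
print(f"legitimate per suspicious: {imbalance_ratio:.1f}")
```

The two labels sum to the reported total, and roughly one transaction in 23 is labeled suspicious, which is why accuracy alone is a weak summary of classifier quality on this data.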

Sample of two original transaction records considered for prediction by CNN and interpretation by SHAP

| Features | Suspicious transaction | Legitimate transaction |
| --- | --- | --- |
| Transaction date | 7/12/2017 | 6/02/2018 |
| Transaction number | 339549 | 359932 |
| Transaction account | 10300015 | 10202449 |
| Transaction amount | 6,000.00 | 322.00 |
| Credit | 6,000.00 | |
| Debit | | 322.00 |
| Balance | 52,659.00 | 16,054.00 |
| Transaction type | Credit | Debit |
| Transaction subtype | Cash deposit | Auto-debit |
| Transaction description | Cash deposit | Health insurance |
| Transaction currency | AUD | AUD |
| Transaction location type | ATM | Online |
| Transaction location code | 448 | 222 |
| Target account | 0 | 891141 |
| Target country code | 0 | Australia |
| Target bank code | 0 | 559059 |
| Customer ID | 20000736 | 20002452 |
| Customer type | Student | Individual |
| Gender | Male | Female |
| Date of birth | 24/09/1992 | 26/05/1965 |
| Age | 28 | 55 |
| Marital status | Single | Married |
| Residence country | Australia | Australia |
| State | New South Wales | New South Wales |
| City | Sydney | New Castle |
| Postcode | 2358 | 2361 |
| Tax resident country | Australia | Australia |
| Birth country | Overseas country | Australia |
| Nationality country | Overseas country | Australia |
| Profession | Student | Laborers |
| Income category | 4000 | 77668 |
| KYC updated on date | 22/04/2017 | 13/09/2019 |
| KYC state | Active | Active |
| Risk rating | 0 | 0.463290428 |
| Account number | 10300015 | 10202449 |
| BSB number | 203901 | 201807 |
| Account created on date | 22/04/2017 | 23/08/2017 |
| Account type | Savings | Savings |
| Daily transaction limit | 3,000 | 2,000 |
| TFN | 999528645 | 968305061 |
| Statement delivery method | Not set | Online |

CNN architecture hyperparameters

| Layer | Parameters |
| --- | --- |
| Conv1D | Filters = 32, Kernel size = 2, Input shape = 51,980 × 40, Activation = ReLU |
| Batch normalization | Axis = −1, momentum = 0.99, center = true, scale = true |
| Dropout | 0.3 |
| Conv1D | Filters = 64, Kernel size = 2, Activation = ReLU |
| Batch normalization | Axis = −1, momentum = 0.99, center = true, scale = true |
| Dropout | 0.3 |
| Conv1D | Filters = 128, Kernel size = 2, Activation = ReLU |
| Batch normalization | Axis = −1, momentum = 0.99, center = true, scale = true |
| Dropout | 0.3 |
| Flatten | Axis = −1, momentum = 0.99, center = true, scale = true |
| Dropout | 0.3 |
| Dense | Units = 512, Activation = ReLU |
| Dropout | 0.3 |
| Dense | Units = 1, Activation = Sigmoid |
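
To make the layer stack above easier to follow, the sketch below walks a single input through it and tallies output shapes and parameter counts in plain Python. Several things here are assumptions rather than facts from the table: the input shape "51,980 × 40" is read as 51,980 training rows of 40 features, each row is treated as a length-40 sequence with one channel, and the convolutions are assumed to use stride 1 with no padding. The parameters listed beside Flatten mirror the batch normalization rows and are ignored, and dropout layers are skipped because they change neither shape nor parameter count. The helper names are hypothetical.

```python
def conv1d_out_len(length, kernel_size, stride=1):
    """Output length of a 1-D convolution with no padding (assumed)."""
    return (length - kernel_size) // stride + 1


def conv1d_params(kernel_size, in_channels, filters):
    """Weights plus one bias per filter."""
    return kernel_size * in_channels * filters + filters


def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return inputs * units + units


def walk_cnn(seq_len=40, channels=1):
    """Return (layer, output shape, parameter count) for each layer.

    seq_len=40 and channels=1 are assumptions about how each
    transaction row is presented to the first Conv1D layer.
    """
    rows = []
    for filters in (32, 64, 128):  # filter counts from the table
        params = conv1d_params(2, channels, filters)  # kernel size = 2
        seq_len = conv1d_out_len(seq_len, 2)
        channels = filters
        rows.append((f"Conv1D({filters})", (seq_len, channels), params))
        # With center and scale enabled, batch normalization keeps four
        # values per channel: gamma, beta, moving mean, moving variance.
        rows.append(("BatchNormalization", (seq_len, channels), 4 * channels))
    flat = seq_len * channels
    rows.append(("Flatten", (flat,), 0))
    rows.append(("Dense(512)", (512,), dense_params(flat, 512)))
    rows.append(("Dense(1)", (1,), dense_params(512, 1)))
    return rows


for name, shape, params in walk_cnn():
    print(f"{name:<20}{str(shape):<12}{params:>10,}")
```

Under these assumptions almost all of the parameters sit in the first dense layer, so the flattened feature size is the main driver of model size.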

Overlapping transaction scenarios that share the characteristics of legitimate and suspicious transactions

| S. No. | Scenario description |
| --- | --- |
| OL-1 | Wire transfer of money to offshore accounts from savings account |
| OL-2 | Cash withdrawal from the account in the range of AU $2,000 to AU $5,000 |
| OL-3 | Wire transfer of money from offshore account into savings account |
| OL-4 | Shopping in the range of AU $10,000 to AU $30,000 |

Hyperparameters for RF, XGBoost, and SVM

| Classifier | Hyperparameter | Value |
| --- | --- | --- |
| RF | Number of trees in the forest | 100 |
| RF | Minimum number of data points in a node prior to splitting | 2 |
| RF | Minimum number of data points allowed in a leaf node | 1 |
| RF | Maximum number of features for splitting a node | sqrt |
| RF | Method for sampling data points | True |
| RF | Class weight | 0:1, 1:100 |
| XGBoost | Minimum number of data points in a node prior to splitting | 2 |
| XGBoost | Minimum number of data points allowed in a leaf node | 1 |
| XGBoost | Learning rate | 0.1 |
| XGBoost | Number of decision trees to be boosted | 100 |
| XGBoost | Subsample ratio of training data | 1 |
| XGBoost | Maximum depth | 3 |
| SVM | C | 1.0 |
| SVM | Kernel | Linear |
| SVM | Gamma | Scale |
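
For readers who want to reproduce a comparable baseline, the table can be mapped onto library keyword arguments roughly as follows. This is a sketch under assumptions: the source does not name the libraries used, so scikit-learn and the xgboost package are guesses, "Method for sampling data points = True" is read as bootstrap sampling, and the two XGBoost rows about minimum data points are left out because `XGBClassifier` has no parameter with that exact meaning (those row names match scikit-learn's `GradientBoostingClassifier` instead). The dictionary and function names are hypothetical.

```python
# Keyword arguments inferred from the hyperparameter table.
RF_PARAMS = {
    "n_estimators": 100,             # number of trees in the forest
    "min_samples_split": 2,          # min data points in a node before splitting
    "min_samples_leaf": 1,           # min data points allowed in a leaf
    "max_features": "sqrt",          # features considered per split
    "bootstrap": True,               # assumed meaning of "method for sampling"
    "class_weight": {0: 1, 1: 100},  # up-weight the suspicious class
}

XGB_PARAMS = {
    "learning_rate": 0.1,
    "n_estimators": 100,             # number of boosted trees
    "subsample": 1.0,                # subsample ratio of training data
    "max_depth": 3,
}

SVM_PARAMS = {
    "C": 1.0,
    "kernel": "linear",
    "gamma": "scale",                # has no effect with a linear kernel
}


def build_models():
    """Instantiate the three baselines (requires scikit-learn and xgboost)."""
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.svm import SVC
    from xgboost import XGBClassifier

    return {
        "RF": RandomForestClassifier(**RF_PARAMS),
        "XGBoost": XGBClassifier(**XGB_PARAMS),
        "SVM": SVC(**SVM_PARAMS),
    }
```

The imports are deferred into `build_models()` so the parameter dictionaries can be inspected without either library installed.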

Scenarios to develop money laundering transactions

| S. No. | Scenario description |
| --- | --- |
| ML-1 | Small deposits (<AU $5,000) of money through ATM by multiple people into a single account (<AU $10,000 per day) over a month. The same money is then transferred in batches of AU $10,000 to AU $30,000 to multiple overseas accounts in different countries. |
| ML-2 | Small deposits (<AU $5,000) of money through ATM by multiple people into a single account (<AU $10,000 per day) over a month. The same money is then used to buy luxury items locally in the range of AU $10,000 to AU $90,000 (vehicles, gold, property, etc.). |
| ML-3 | Transfer of money from multiple overseas accounts in multiple countries, then use of the same money to buy luxury items in the range of AU $10,000 to AU $90,000 (vehicles, gold, property). |
| ML-4 | Transfer of money from multiple overseas accounts in multiple countries, then withdrawal of the same money through ATM over the next couple of months in small amounts in the range of AU $2,000 to AU $4,900. |
| ML-5 | Deposit of a small amount of money in the range of AU $2,000 to AU $4,500 each month at an ATM deposit machine, then online transfer of the deposited amount to an account in a different local bank (but same account) the next day. |
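
As an illustration of how one of these scenarios could be turned into labeled synthetic records, the sketch below generates ML-1 style activity: a month of small ATM cash deposits into one account, followed by outbound transfers in batches of AU $10,000 to AU $30,000. It is not the authors' generator. The function name, record keys, number of deposits per day, deposit step size, and the placeholder destination countries are all assumptions made for the example; only the amount thresholds come from the scenario text.

```python
import random
from datetime import date, timedelta

# Hypothetical placeholders; the source does not name destination countries.
OVERSEAS = ["Country A", "Country B", "Country C"]


def generate_ml1(account, start, days=30, seed=0):
    """Generate ML-1 style transactions as a list of dicts (illustrative)."""
    rng = random.Random(seed)
    txns, pooled = [], 0

    # Phase 1: small ATM deposits, each < 5,000 and < 10,000 per day in total.
    for offset in range(days):
        day = start + timedelta(days=offset)
        daily_total = 0
        for _ in range(rng.randint(1, 3)):           # assumed 1-3 depositors
            amount = rng.randrange(500, 5_000, 50)   # 500 to 4,950
            if daily_total + amount >= 10_000:
                break                                # keep the day under 10,000
            daily_total += amount
            txns.append({
                "date": day, "account": account, "type": "Credit",
                "subtype": "Cash deposit", "location_type": "ATM",
                "amount": amount, "suspicious": 1,
            })
        pooled += daily_total

    # Phase 2: move the pooled money overseas in batches of 10,000-30,000.
    day = start + timedelta(days=days)
    while pooled >= 10_000:
        batch = rng.randint(10_000, min(30_000, pooled))
        pooled -= batch
        txns.append({
            "date": day, "account": account, "type": "Debit",
            "subtype": "Wire transfer", "location_type": "Online",
            "amount": batch, "target_country": rng.choice(OVERSEAS),
            "suspicious": 1,
        })
        day += timedelta(days=1)
    return txns


records = generate_ml1("10300015", date(2017, 7, 1))
print(len(records), "transactions generated")
```

Any residue below AU $10,000 stays in the account, so the outbound total never exceeds what was deposited. The other scenarios could follow the same two-phase pattern with different channels and thresholds.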

Suspicious transaction prediction results of CNN, RF, XGBoost, and SVM models

| Metrics | CNN | RF | XGBoost | SVM |
| --- | --- | --- | --- | --- |
| fβ score | 78.23% | 61.97% | 62.09% | 30.86% |
| Recall | 91.01% | 59.82% | 60.14% | 29.10% |
| Precision | 34.56% | 91.53% | 87.72% | 67.47% |
| Accuracy | 92.03% | 97.95% | 97.84% | 96.20% |
| AUC | 98.00% | 79.80% | 98.40% | 83.60% |
| TPs | 1,114 | 746 | 750 | 363 |
| TNs | 24,515 | 26,532 | 26,496 | 26,426 |
| FPs | 2,109 | 69 | 105 | 175 |
| FNs | 110 | 501 | 497 | 884 |
| Training time | 70 min | 6 s | 16 s | 4.4 min |
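
The recall, precision, and accuracy rows can be recomputed from the confusion-matrix counts in the same table, which is a useful sanity check. The sketch below does this in plain Python. The value of β is not stated in this section; β = 3 is an inference, chosen because it reproduces the reported fβ scores to within about 0.01 percentage point for all four models. The function name is illustrative.

```python
def classification_metrics(tp, tn, fp, fn, beta=3.0):
    """Derive summary metrics from confusion-matrix counts.

    beta=3 is an assumption inferred from the reported scores; it weights
    recall more heavily than precision.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    b2 = beta ** 2
    f_beta = (1 + b2) * precision * recall / (b2 * precision + recall)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f_beta": f_beta}


# Counts (TP, TN, FP, FN) copied from the results table.
counts = {
    "CNN": (1_114, 24_515, 2_109, 110),
    "RF": (746, 26_532, 69, 501),
    "XGBoost": (750, 26_496, 105, 497),
    "SVM": (363, 26_426, 175, 884),
}

for model, (tp, tn, fp, fn) in counts.items():
    m = classification_metrics(tp, tn, fp, fn)
    print(model, {name: f"{value:.2%}" for name, value in m.items()})
```

The comparison makes the trade-off in the table explicit: the CNN catches most suspicious transactions at the cost of many false positives, while RF and XGBoost are precise but miss roughly 40% of the suspicious cases.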
eISSN: 1178-5608
Language: English
Publication timeframe: Volume Open
Journal subjects: Engineering, Introductions and Overviews, other