Open Access

Predicting the Amount of Compensation for Harm Awarded by Courts Using Machine-Learning Algorithms


Figure 1.

Distribution of Amounts Awarded as Compensation for Harm Suffered and Their Logarithms

Figure 2.

Distribution of Amounts Awarded as Compensation for Harm Suffered among the 25 Most Impactful Features

Figure 3.

Partial Dependence Plots for the Most Impactful Features

Figure 4.

Example Model Prediction: Judgement of the Regional Court in Gliwice Dated August 31, 2014, File Number I C 1946/14

Figure 5.

Example Model Prediction: Judgement of the Regional Court in Zambrów Dated September 24, 2015, File Number I C 335/15

Tokens Characterized by the Highest Differences Between Maximum and Minimum Value of Their Partial Dependence Plots

| Id | Token | Average awarded compensation when token does not occur | Minimum awarded compensation (token count) | Maximum awarded compensation (token count) | Difference between maximum and minimum awarded compensation |
|---|---|---|---|---|---|
| 1 | pension | 21,027.80 | 21,027.80 (0) | 38,725.99 (5) | 17,698.18 |
| 2 | family | 16,532.26 | 16,532.26 (0) | 34,000.38 (16) | 17,468.12 |
| 3 | hospital | 20,559.30 | 20,559.30 (0) | 35,327.96 (20) | 14,768.65 |
| 4 | fracture | 19,852.62 | 19,852.62 (0) | 30,671.07 (25) | 10,818.46 |
| 5 | year | 19,473.51 | 19,473.51 (0) | 28,048.15 (10) | 8,574.65 |
| 6 | life | 19,712.14 | 19,712.14 (0) | 25,905.34 (36) | 6,193.20 |
| 7 | dead | 22,645.50 | 22,645.50 (0) | 28,181.21 (7) | 5,535.71 |
| 8 | zloty_monthly | 22,557.29 | 22,557.29 (0) | 27,708.96 (29) | 5,151.67 |
| 9 | extend | 22,570.76 | 22,570.76 (0) | 26,658.50 (6) | 4,087.74 |
| 10 | substantial | 22,346.35 | 22,346.35 (0) | 25,140.15 (19) | 2,793.80 |
| 11 | bones | 23,056.96 | 23,056.96 (0) | 25,536.80 (25) | 2,479.84 |
| 12 | child | 22,439.85 | 22,439.85 (0) | 24,908.14 (39) | 2,468.29 |
| 13 | bond | 23,108.54 | 23,108.54 (0) | 25,423.49 (24) | 2,314.95 |
| 14 | cervical | 23,675.45 | 21,368.98 (22) | 23,675.45 (0) | 2,306.47 |
| 15 | disorders | 22,655.92 | 22,655.92 (0) | 24,947.08 (17) | 2,291.17 |
| 16 | collar | 23,419.00 | 22,944.30 (11) | 23,419.00 (0) | 2,264.04 |
| 17 | disability | 22,944.30 | 22,944.30 (0) | 25,129.94 (5) | 2,185.64 |
| 18 | family_bond | 23,262.10 | 23,262.10 (0) | 25,430.35 (10) | 2,168.25 |
| 19 | twist | 23,417.04 | 21,541.70 (7) | 23,417.04 (0) | 1,875.34 |
| 20 | future | 22,640.79 | 22,640.79 (0) | 24,386.69 (22) | 1,745.90 |
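
The ranking above is based on the range of each token's partial dependence curve. The listing does not include the authors' code, so the following is only a minimal sketch of how such a range could be computed. It assumes the standard definition of partial dependence: force the token count to a fixed value in every document, then average the model's predictions. The names `partial_dependence`, `curve_range`, and `toy_predict`, and all numbers in the example, are hypothetical and not taken from the article.

```python
from statistics import mean


def partial_dependence(predict, rows, feature, grid):
    """Average prediction when `feature` is forced to each grid value in every row."""
    curve = {}
    for value in grid:
        # Overwrite only the chosen feature; all other features keep their observed values.
        modified = [{**row, feature: value} for row in rows]
        curve[value] = mean(predict(row) for row in modified)
    return curve


def curve_range(curve):
    """Summarise a curve as (value, token count) for its minimum and maximum, plus their difference."""
    lowest = min(curve, key=curve.get)
    highest = max(curve, key=curve.get)
    return {
        "min": (curve[lowest], lowest),
        "max": (curve[highest], highest),
        "difference": curve[highest] - curve[lowest],
    }


def toy_predict(row):
    # Hypothetical stand-in for a fitted model; not the article's model.
    return 20_000.0 + 3_000.0 * min(row["pension"], 4) + 500.0 * row["hospital"]


# Hypothetical token counts for three documents.
rows = [
    {"pension": 0, "hospital": 1},
    {"pension": 2, "hospital": 0},
    {"pension": 1, "hospital": 3},
]
curve = partial_dependence(toy_predict, rows, "pension", range(0, 7))
summary = curve_range(curve)
```

Under this reading, the curve value at a token count of 0 plays the role of the "average awarded compensation when token does not occur" column, and the count stored next to each extreme corresponds to the number shown in parentheses in the table.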

Error Measures Obtained on a Test Set With Different Algorithms Applied

| Algorithm applied | Predictors | Root mean squared error | Mean absolute percentage error (%) | Root median squared error | Median absolute percentage error (%) |
|---|---|---|---|---|---|
| OLS | Token counts | 308,207.87 | 112.32 | 11,747.55 | 52.79 |
| OLS | TF-IDF | 100,743.48 | 94.59 | 11,759.07 | 49.25 |
| LASSO | Token counts | 287,706.24 | 109.68 | 11,474.52 | 52.64 |
| LASSO | TF-IDF | 95,976.82 | 93.46 | 11,435.79 | 48.64 |
| Random forests | Token counts | 75,268.12 | 86.43 | 11,064.28 | 47.16 |
| Random forests | TF-IDF | 74,674.55 | 88.59 | 10,518.75 | 48.13 |
| XGBoost | Token counts | 94,271.95 | 105.27 | 12,654.46 | 52.18 |
| XGBoost | TF-IDF | 192,230.66 | 102.79 | 12,821.45 | 51.48 |
| Multilingual BERT | - | 95,665.94 | 426.45 | 38,062.44 | 88.03 |
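
The four error measures in this table are not defined in the listing itself. The sketch below assumes their conventional definitions: squared errors and absolute percentage errors are aggregated either by the mean or by the median. The function name `error_measures` and the sample amounts are hypothetical and are not drawn from the article's data set.

```python
import math
from statistics import mean, median


def error_measures(actual, predicted):
    """Return the four error measures for paired lists of awarded and predicted amounts."""
    errors = [p - a for a, p in zip(actual, predicted)]
    squared = [e * e for e in errors]
    # Percentage errors are relative to the awarded amount, so awards must be non-zero.
    pct = [abs(e) / abs(a) * 100 for e, a in zip(errors, actual)]
    return {
        "rmse": math.sqrt(mean(squared)),
        "mape": mean(pct),
        "root_median_se": math.sqrt(median(squared)),
        "median_ape": median(pct),
    }


# Hypothetical amounts: three moderate awards and one very large one.
awarded = [10_000.0, 20_000.0, 40_000.0, 500_000.0]
predicted = [12_000.0, 18_000.0, 30_000.0, 100_000.0]
measures = error_measures(awarded, predicted)
```

In this toy example a single badly predicted large award pushes the root mean squared error far above the root median squared error. A gap of that kind also appears in every row of the table, which would be consistent with a few very large awards dominating the mean-based measures.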

Error Measures Obtained on a Test Set With Different Algorithms Applied, With Models’ Overestimates Not Counted as Errors

| Algorithm applied | Predictors | Root mean squared error | Mean absolute percentage error (%) | Root median squared error | Median absolute percentage error (%) |
|---|---|---|---|---|---|
| OLS | Token counts | 61,829.04 | 23.92 | 1,024.79 | 5.29 |
| OLS | TF-IDF | 67,563.77 | 21.83 | 178.39 | 1.08 |
| LASSO | Token counts | 62,019.89 | 23.83 | 1,061.22 | 5.57 |
| LASSO | TF-IDF | 68,215.32 | 21.78 | 129.66 | 0.69 |
| Random forests | Token counts | 74,939.24 | 23.11 | 1,729.41 | 8.01 |
| Random forests | TF-IDF | 73,939.24 | 22.80 | 1,741.93 | 9.02 |
| XGBoost | Token counts | 71,414.42 | 23.83 | 1,069.24 | 7.93 |
| XGBoost | TF-IDF | 65,657.99 | 24.13 | 1,209.66 | 7.68 |
| Multilingual BERT | - | 82,212.22 | 29.04 | 0.00 | 0.00 |
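
The listing does not spell out how overestimates are excluded from the error. One plausible reading, sketched below as an assumption, is that a prediction at or above the awarded amount contributes zero error, and only shortfalls are penalised. The function name `one_sided_error_measures` and the sample amounts are hypothetical.

```python
import math
from statistics import mean, median


def one_sided_error_measures(actual, predicted):
    """Error measures in which only underestimates count as errors."""
    # Assumption: an overestimate (predicted >= awarded) contributes zero error.
    shortfalls = [max(a - p, 0.0) for a, p in zip(actual, predicted)]
    squared = [s * s for s in shortfalls]
    # Percentage shortfalls are relative to the awarded amount, so awards must be non-zero.
    pct = [s / abs(a) * 100 for s, a in zip(shortfalls, actual)]
    return {
        "rmse": math.sqrt(mean(squared)),
        "mape": mean(pct),
        "root_median_se": math.sqrt(median(squared)),
        "median_ape": median(pct),
    }


# Hypothetical amounts: three overestimates and two underestimates.
awarded = [10_000.0, 20_000.0, 40_000.0, 500_000.0, 15_000.0]
predicted = [12_000.0, 25_000.0, 45_000.0, 100_000.0, 14_000.0]
one_sided = one_sided_error_measures(awarded, predicted)
```

Under this assumption, both median-based measures are exactly zero whenever more than half of the predictions are overestimates, as in the toy data. That would be consistent with the 0.00 values reported for Multilingual BERT, although the listing does not confirm this explanation.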
eISSN:
2543-6821
Language:
English