
Fig. 1

Examples of very different houses located in the same zip code whose residents have the same expected claim frequency under the insurer’s current model.

Fig. 2

Features annotated from the Google Satellite View and Google Street View images of a particular address.

Fig. 3

Gini coefficients obtained on the 20% test sample in 20 bootstrapping trials for the null model (A), the best-in-class insurer’s model (B) and our model with newly created variables (C).

Fig. 4

Geolocation of the addresses from the dataset examined in this paper.

Fig. 5

Distribution of labels and corresponding observed claim frequency for the variables generated for this study.

Fig. 6

Illustration of the Gini coefficient computation for one of the bootstrapping trials.
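
As a rough illustration of the kind of computation behind such a figure, the sketch below derives a Gini coefficient from an ordered Lorenz curve: policies are sorted by predicted risk, and the cumulative share of claims is plotted against the cumulative share of exposure. This particular definition (one minus twice the area under the curve), the function name `gini_coefficient` and the toy inputs are assumptions made for illustration; the exact procedure used in the study may differ.

```python
def gini_coefficient(predicted, claims, exposure):
    """Gini coefficient from an ordered Lorenz curve (illustrative sketch).

    predicted: model score per policy (higher = riskier)
    claims:    observed claim count per policy
    exposure:  risk exposure per policy
    """
    # Sort policies from lowest to highest predicted risk.
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    total_exposure = sum(exposure)
    total_claims = sum(claims)

    area = 0.0      # area under the Lorenz curve (trapezoidal rule)
    cum_exp = 0.0   # cumulative share of exposure
    cum_clm = 0.0   # cumulative share of claims
    for i in order:
        next_exp = cum_exp + exposure[i] / total_exposure
        next_clm = cum_clm + claims[i] / total_claims
        area += (next_exp - cum_exp) * (cum_clm + next_clm) / 2
        cum_exp, cum_clm = next_exp, next_clm

    # A model with no discriminatory power follows the diagonal (area 0.5, Gini 0).
    return 1 - 2 * area


# Hypothetical toy data: four policies with equal exposure, one claim
# on the policy that the model ranks as riskiest.
toy_gini = gini_coefficient(
    predicted=[0.1, 0.2, 0.3, 0.4],
    claims=[0, 0, 0, 1],
    exposure=[1, 1, 1, 1],
)
print(toy_gini)
```

Under this definition, a model that ranks policies no better than chance scores close to zero, and a model that ranks them in reverse order scores below zero.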

Summary statistics of the dataset before and after cleansing.

| | Original database | After data cleansing |
|---|---|---|
| Number of policies | 20,000 | 19,871 |
| Risk exposure | 11,349 | 11,209 |
| MTPL PD claim count | 571 | 570 |
| Observed MTPL PD frequency | 5.03% | 5.09% |

Statistics for the seven newly created variables: original granularity, inter-rater reliability of four selected annotators on a common set of 500 observations, and significance in our risk model after applying the necessary simplifications.

| Variable | Original granularity | Inter-rater reliability: Fleiss’ kappa | Inter-rater reliability: interpretation | Risk model: granularity after simplification, p-value |
|---|---|---|---|---|
| Neighbourhood type | Seven types, multi-choice | 0.52 | Moderate agreement | 200.01 |
| Building density | Scale 1–5 | 0.50 | Moderate agreement | Not significant |
| Street View quality | Good/bad/missing | 0.79 | Substantial agreement | 200.02 |
| House type | Five types, single-choice | 0.69 | Substantial agreement | 200.01 |
| House age | Scale 1–3 | 0.51 | Moderate agreement | 200.03 |
| House condition | Scale 1–3 | 0.54 | Moderate agreement | 200.04 |
| Wealth of residents | Scale 1–10 | 0.32 | Fair agreement | Not significant |
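
For readers unfamiliar with the inter-rater statistic, the following minimal sketch shows how Fleiss’ kappa can be computed from a table of rating counts, together with the conventional Landis and Koch labels that match the interpretations used for these variables. The function names and the toy ratings are hypothetical; the annotation data from the study are not reproduced here.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of rating counts.

    counts[i][j] = number of raters who assigned subject i to category j.
    Every subject must be rated by the same number of raters.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Share of all ratings that fall into each category.
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement for each subject.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects   # mean observed agreement
    p_e = sum(p * p for p in p_j)   # agreement expected by chance
    return (p_bar - p_e) / (1 - p_e)


def landis_koch_label(kappa):
    """Conventional Landis and Koch interpretation bands for kappa."""
    if kappa < 0:
        return "Poor agreement"
    if kappa <= 0.20:
        return "Slight agreement"
    if kappa <= 0.40:
        return "Fair agreement"
    if kappa <= 0.60:
        return "Moderate agreement"
    if kappa <= 0.80:
        return "Substantial agreement"
    return "Almost perfect agreement"


# Hypothetical toy ratings: 4 raters, 3 houses, categories good/bad/missing.
toy_counts = [
    [4, 0, 0],
    [0, 4, 0],
    [3, 1, 0],
]
toy_kappa = fleiss_kappa(toy_counts)
print(toy_kappa, landis_koch_label(toy_kappa))
```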

Data for calculating the χ² statistic used to test the hypothesis that claims in our dataset follow the Poisson distribution. On average, λ = 3.9%, and the corresponding χ² = 0.08 with 1 degree of freedom.

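| Number of claims | Observed exposure (O) | Expected prob. P(X = k) | Expected exposure (E) | (E − O)²/E |
|---|---|---|---|---|
| 0 | 10,784 | 96% | 10,785 | 0.00 |
| 1 | 417 | 4% | 416 | 0.01 |
| 2 | 7 | 0% | 8 | 0.08 |
| All | 11,209 | | | |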
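
The goodness-of-fit calculation summarised in this table can be sketched as follows. The observed exposures are taken from the table; the value λ = 0.039 is the rounded figure from the caption, so the statistic computed here will not exactly match the reported χ² = 0.08, which was presumably obtained with unrounded inputs. The function names are hypothetical. With three claim-count classes and one parameter (λ) estimated from the data, the test has 3 − 1 − 1 = 1 degree of freedom.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def chi_square_poisson(observed, lam):
    """Pearson chi-square statistic of observed exposures against Poisson(lam).

    observed maps a claim count k to the exposure observed with k claims.
    The small probability mass above the largest k is ignored in this sketch.
    """
    total = sum(observed.values())
    statistic = 0.0
    for k, obs in observed.items():
        expected = total * poisson_pmf(k, lam)
        statistic += (expected - obs) ** 2 / expected
    return statistic


# Observed exposure by number of claims, as listed in the table.
observed_exposure = {0: 10784, 1: 417, 2: 7}
lam_rounded = 0.039  # rounded value from the caption (assumption: 3.9% = 0.039)

chi2 = chi_square_poisson(observed_exposure, lam_rounded)

# Standard critical value of the chi-square distribution, 1 degree of freedom, 5% level.
CRITICAL_5PCT_DF1 = 3.841
print(f"chi-square = {chi2:.3f}, reject Poisson at 5%: {chi2 > CRITICAL_5PCT_DF1}")
```

A statistic well below the critical value means the Poisson hypothesis is not rejected at the 5% level.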
eISSN: 2543-6821
Language: English