Thermal ablation methods, such as microwave ablation (MWA), have established themselves in recent years as a suitable therapy for various malignancies.1 Performing a tumor ablation involves many challenges. One crucial factor for a successful ablation and for preventing residual tumor tissue or tumor recurrence is maintaining a sufficient safety margin. There is currently no clear consensus on what a sufficient safety distance is.2 Most authors recommend a safety margin of 5–10 mm.1,3,4,5,6,7 Determining a suitable safety distance, however, first requires a proper measurement method. So far, this has proved to be another major challenge. Several studies have investigated postinterventional methods and measurement techniques, such as computed tomography (CT) or magnetic resonance imaging (MRI), for making a valid decision about whether an ablation is complete.8,9,10 Other authors have tried to improve intraprocedural tumor detection and the assessment of the ablation margin; a recently published study used FDG PET/CT-guided ablation for the intraprocedural determination of the safety distance and achieved good results.11 Many authors favour MRI for ablation margin control.10,12 However, in most cases tumor ablation is performed under CT guidance and the safety margin is assessed in the CT images, at least in the periinterventional setting. For best treatment, the decision whether ablation is complete should be made as early as possible. Therefore, in most cases, native or contrast-enhanced CT scans are performed, and the extent of the ablation is judged by side-by-side comparison of the pre- and post-interventional images or by simple and fast measuring techniques during the intervention, such as a simple distance measurement tool.
Unfortunately, we do not have reliable data on the consistency and reproducibility of these subjective estimations of a sufficient safety distance in a real-world setting. For this reason, in this study we investigated the inter- and intrareader variability of the safety margin assessment after microwave ablation of liver malignancies.
The local ethics committee approved this retrospective study. Written informed consent was obtained from all patients. A total of 58 patients who were treated with microwave ablation between September 2017 and June 2019 were included in this study. Tumor entities were hepatocellular carcinoma (HCC) and metastases of colorectal and pancreatic cancer. Exclusion criteria were the patient’s refusal to participate in the study and tumor entities other than those mentioned above. All patients received a CT scan one day before ablation and on the first postinterventional day (Figure 1). Subsequently, the safety margin between tumor and healthy liver tissue was assessed independently for all patients by three interventional radiologists using side-by-side measurement. No special evaluation software was used, in order to reproduce everyday practice as accurately as possible. The orientation was based on reference points that could be reproduced exactly,
Intraclass correlation coefficient (ICC) estimates and their 95% confidence intervals were calculated using the R irr statistical package, version 3.5.1, based on a mean-rating (k = 3), absolute-agreement, two-way mixed-effects model. The ICC was calculated for the estimation of the minimal safety margin. ICC values less than 0.5 are considered indicative of poor reliability, values between 0.5 and 0.75 indicate moderate reliability, values between 0.75 and 0.9 indicate good reliability, and values greater than 0.9 indicate excellent reliability. Bland–Altman analyses were used to assess agreement in the side-by-side measurements of the minimal safety margin, both between the two blinded readings by the same radiologist and between the readings by the three independent radiologists.
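The ICC model described above can be illustrated with a minimal Python sketch. It is not the code of the irr package used in the study; it assumes the usual mean-square form of the absolute-agreement, average-rating ICC, often written ICC(A,k), and it does not compute confidence intervals. The function names, the handling of the category boundaries at exactly 0.5 and 0.75, and the margin values are hypothetical and serve only as an illustration.

```python
def icc_absolute_agreement_mean(ratings):
    """ICC(A,k): two-way model, absolute agreement, mean of k raters.

    ratings: list of rows, one row per lesion, one column per reader.
    """
    n = len(ratings)        # number of lesions (subjects)
    k = len(ratings[0])     # number of readers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects, readers, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (msc - mse) / n)


def reliability_label(icc):
    """Map an ICC value to the reliability categories quoted in the text.

    Assumption: values of exactly 0.5 and 0.75 fall into the higher category.
    """
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.9:
        return "good"
    return "excellent"


# Hypothetical minimal safety margins in mm (rows: lesions, columns: readers).
example_margins = [
    [5.0, 7.0, 3.0],
    [8.0, 6.0, 9.0],
    [2.0, 4.0, 1.0],
    [6.0, 9.0, 5.0],
]
icc = icc_absolute_agreement_mean(example_margins)
print(round(icc, 3), reliability_label(icc))
```

Because the absolute-agreement form keeps the between-reader mean square in the denominator, a constant offset between readers lowers the ICC even when the readers rank the lesions identically.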
A total of 58 patients were included and evaluated. The mean age was 62.84 (10.85) years. Fifty-three patients (91%) were male. All 58 patients were treated with MWA. Most tumors (n = 12) were located in liver segment VII, followed by segment VIII (n = 9) and segments IVa and V with n = 7 each. The fewest tumors were found in segments I and IVb with n = 3 each. The baseline data are shown in Table 1.
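Table 1. Baseline characteristics
Number of patients | N = 58 |
---|---|
Age, mean (years) | 62.84 (10.85) |
Age, range (years) | 36–83 |
Male, n (%) | 53 (91) |
Microwave ablation, n (%) | 58 (100) |
Liver segment I, n (%) | 3 (5) |
Liver segment II, n (%) | 4 (7) |
Liver segment III, n (%) | 6 (10) |
Liver segment IVa, n (%) | 7 (12) |
Liver segment IVb, n (%) | 3 (5) |
Liver segment V, n (%) | 7 (12) |
Liver segment VI, n (%) | 4 (7) |
Liver segment VII, n (%) | 12 (21) |
Liver segment VIII, n (%) | 9 (16) |
Hepatocellular carcinoma, n (%) | 46 (79) |
Metastasis of colorectal cancer, n (%) | 9 (16) |
Metastasis of pancreatic cancer, n (%) | 3 (5) |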
The intraclass correlation coefficient (ICC) for the interindividual variability of the minimal safety margin assessment across all three readers was 0.357 (95% confidence interval 0.194–0.522), indicating poor reliability. The ICC for the variability of two repeated estimations by reader 1 was 0.774 (95% confidence interval 0.645–0.860), indicating good reliability for repeated measurements.
Bland–Altman plots were calculated to show intra- and interindividual variability (Figure 2). A systematic error was not detectable. The standard deviation of the intrareader differences was smaller than that of the interindividual evaluations. Nevertheless, deviations of more than 5 mm can be detected in some measurements. The differences between the safety margins measured by the two readers are clearly larger than the deviations between the two measurements performed by one reader.
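The quantities behind a Bland–Altman comparison can be sketched in a few lines of Python. The sketch assumes the conventional definition of the limits of agreement (mean difference ± 1.96 standard deviations of the paired differences); the function names and the margin values are hypothetical and are not study data.

```python
from statistics import mean, stdev


def bland_altman(first, second):
    """Return (bias, lower limit, upper limit) for paired measurements in mm."""
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)   # a bias close to zero suggests no systematic error
    sd = stdev(diffs)    # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def count_large_deviations(first, second, threshold_mm=5.0):
    """Count pairs whose absolute difference exceeds the threshold."""
    return sum(1 for a, b in zip(first, second) if abs(a - b) > threshold_mm)


# Hypothetical minimal safety margins (mm) from two readings of the same lesions.
reading_a = [5.0, 7.0, 9.0, 4.0, 12.0]
reading_b = [4.0, 8.0, 8.0, 5.0, 6.0]

bias, lower, upper = bland_altman(reading_a, reading_b)
print(f"bias={bias:.2f} mm, limits of agreement=({lower:.2f}, {upper:.2f}) mm")
print("pairs differing by more than 5 mm:",
      count_large_deviations(reading_a, reading_b))
```

Narrower limits of agreement for the repeated readings of one radiologist than for readings of different radiologists would correspond to the pattern described above.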
Compared with the six-week follow-up MRI as gold standard, readers 1/2/3 achieved a sensitivity of 93%/82%/82% and a specificity of 33%/17%/83%, respectively. The positive predictive value (PPV) was 91%/88%/97%. The negative predictive value (NPV) was 40%/10%/39%. The results are shown in Tables 2 and 3.
Table 2. Contingency table of the three independent readings (R 1, R 2, R 3) compared with the six-week follow-up MRI as gold standard
Reader | CT reading | Incomplete (6 weeks MRI) | Complete (6 weeks MRI) |
---|---|---|---|
R 1 | Incomplete | 2 | 3 |
R 1 | Complete | 4 | 41 |
R 2 | Incomplete | 1 | 8 |
R 2 | Complete | 5 | 36 |
R 3 | Incomplete | 5 | 8 |
R 3 | Complete | 1 | 36 |
Table 3. Sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of the three independent readings (R 1, R 2, R 3) compared with the six-week follow-up MRI as gold standard
Measure | R 1 | R 2 | R 3 |
---|---|---|---|
Sensitivity | 93% | 82% | 82% |
Specificity | 33% | 17% | 83% |
PPV | 91% | 88% | 97% |
NPV | 40% | 10% | 39% |
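How these four measures follow from a contingency table can be shown with a short Python sketch. It uses the first block of counts in the contingency table (2, 3, 4, 41), taken here to belong to reader 1, and assumes that "complete ablation" is the positive class, since that convention reproduces the reported values. The function and argument names are hypothetical.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity, PPV and NPV from a 2x2 contingency table.

    Assumption: 'complete ablation' is the positive class and the
    six-week follow-up MRI is the reference standard.
    """
    return {
        "sensitivity": tp / (tp + fn),  # complete on MRI, read as complete
        "specificity": tn / (tn + fp),  # incomplete on MRI, read as incomplete
        "ppv": tp / (tp + fp),          # read as complete, complete on MRI
        "npv": tn / (tn + fn),          # read as incomplete, incomplete on MRI
    }


# First block of the contingency table (assumed to be reader 1):
# read complete / MRI complete = 41, read incomplete / MRI complete = 3,
# read complete / MRI incomplete = 4, read incomplete / MRI incomplete = 2.
reader_1 = diagnostic_metrics(tp=41, fn=3, fp=4, tn=2)

for name, value in reader_1.items():
    print(f"{name}: {round(100 * value)}%")
# sensitivity: 93%, specificity: 33%, ppv: 91%, npv: 40%
```

With only six incomplete ablations on follow-up MRI, specificity and NPV rest on very few cases, so a single differing reading shifts them considerably.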
There is agreement that a safety distance is necessary after ablation of a liver tumor to prevent local tumor recurrence. However, approaches to defining the optimal safety distance differ, and there is no general definition. Most authors favour a minimum distance of 5 mm (1,3–7).
We agree with this in principle. In our opinion, however, the measurement methods are rarely described or questioned. Therefore, our approach was to question the measurement of the safety distance in the daily routine (Figures 3 and 4).
This confirmed our impression that measurement with the standard tools provided by the CT software can lead to difficulties in measurement and thus to considerable interindividual differences. Although the reading was performed by three experienced interventional radiologists, the inter-reader reliability was poor.
One reason could be the localization of the tumor. Subcapsular tumors represent a special measuring challenge. The same applies to tumors in the immediate vicinity of other organs or vessels, which are also difficult to measure.
Another aspect that can lead to considerable differences in the evaluation of the distance is the choice of the reconstruction planes and the slice thickness. Zhao
A contentious aspect is always the experience of the interventionalist. Therefore, in our study the reading was carried out by an experienced radiologist (5 years of experience), a specialist radiologist (7 years of experience) and the head of the Centre for Interventional Oncological Radiology. The aim was to rule out diagnostic errors due to inexperience. Nevertheless, there were considerable differences between all three readers, which calls the measuring method into question.
In our opinion, the fact that the intraindividual differences were smaller shows that there is no systematic error. The measurement results differ interindividually but are not random. This indicates that our study results are reliable and meaningful.
New measurement methods or software for tumor segmentation are already being investigated in some studies.5,8,9,12,13,14,15,16 The results were promising and improved the assessment of ablation success. Our study showed that conventional measurement methods are inaccurate and can lead to large interindividual differences. We therefore support the development of new measurement methods to achieve more reliable measurement results.