Abstract
Several semi-automatic tools exist for delineating the gross tumor volume (GTV) in PET scans for radiation therapy planning. In low-contrast amino acid PET images such as FET-PET, however, the contours produced by these algorithms are not consistent. Recent work on random walk (RW) algorithms has shown promising results for GTV delineation in low-contrast PET. The aim of this work is to compare RW algorithms with algorithms already in clinical use for low-contrast PET GTV delineation and with delineations made by a clinically experienced physician. Ten FET-PET scans of patients with recurrent glioblastoma after surgical and radio-oncologic therapy were used. We applied three RW-based algorithms (RW1, RW2, RW3) from different centers, which differ in how they determine foreground, background and edge weights. For comparison, the 1.6 opposite mean (OM) method and the 40 %, 50 % and 60 % of maximum SUV threshold methods were chosen. The reference contour was drawn by one experienced physician using an individually adapted OM method. The evaluation compared the following parameters: single contoured volume (VSC), common contoured volume (VCC) and the kappa statistic (K). According to K, RW2 provides the delineations most similar to the reference contour, with a mean K (K*) of 0.68 (95 % CI 0.57 - 0.79) [substantial observer agreement (OG) according to Landis and Koch]. The 1.6 OM method (K* = 0.58, 95 % CI 0.41 - 0.75) [moderate OG], the 40 % method (K* = 0.44, 95 % CI 0.23 - 0.65) [moderate OG], the 50 % method (K* = 0.53, 95 % CI 0.34 - 0.72) [moderate OG] and the 60 % method (K* = 0.57, 95 % CI 0.45 - 0.69) [moderate OG] have lower K values. VSC was lowest for the RW algorithm, which means the lowest number of false-positive segmented voxels and translates into a higher K. VCC varies from 56 % to 87 % across the algorithms but has no significant influence on the K values.
The presented work suggests that RW algorithms may provide clinically more useful delineations than threshold-based algorithms in low-contrast PET scans. The narrower CI of RW2 indicates that this algorithm is more stable and reliable than the other algorithms. Nonetheless, experience showed that the definition of foreground and background is an issue for RW algorithms. RW2, the best-performing RW algorithm, uses a method that takes the SUV distribution of the whole brain into account, in contrast to the local methods used by the other algorithms. This suggests that a holistic approach may outperform local methods. Further research will explore this and the applicability of the RW algorithms to larger datasets and to clinical practice.
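The percent-of-maximum-SUV thresholds and the kappa statistic used in the comparison can be illustrated with a minimal Python sketch. It is not the implementation used in the study: the function names `threshold_mask` and `cohen_kappa`, the toy SUV values and the toy reference mask are all hypothetical, and kappa is computed here as plain Cohen's kappa over every voxel of the array, which is an assumption about the evaluation region.

```python
import numpy as np


def threshold_mask(suv, fraction):
    """Segment all voxels whose SUV is at least `fraction` of the maximum SUV."""
    suv = np.asarray(suv, dtype=float)
    return suv >= fraction * suv.max()


def cohen_kappa(mask_a, mask_b):
    """Cohen's kappa between two binary masks of identical shape.

    Assumption: every voxel of the arrays counts as one rating, so the
    result depends on how much background the arrays contain.
    """
    a = np.asarray(mask_a, dtype=bool).ravel()
    b = np.asarray(mask_b, dtype=bool).ravel()
    if a.shape != b.shape:
        raise ValueError("masks must have the same number of voxels")

    p_observed = np.mean(a == b)              # fraction of voxels labelled alike
    p_a, p_b = a.mean(), b.mean()             # fraction labelled tumor in each mask
    p_chance = p_a * p_b + (1 - p_a) * (1 - p_b)

    if np.isclose(p_chance, 1.0):             # both masks constant and identical
        return 1.0
    return float((p_observed - p_chance) / (1 - p_chance))


# Toy 2D "scan" and a toy reference contour (hypothetical values).
suv = np.array([[0.5, 1.0, 1.2],
                [1.1, 2.0, 1.6],
                [0.4, 0.9, 1.0]])
reference = np.array([[0, 0, 0],
                      [0, 1, 1],
                      [0, 0, 0]], dtype=bool)

for fraction in (0.4, 0.5, 0.6):
    mask = threshold_mask(suv, fraction)
    print(f"{int(fraction * 100)} % of max SUV: "
          f"{int(mask.sum())} voxels, kappa = {cohen_kappa(mask, reference):.2f}")
```

Because agreement on background voxels enters both the observed and the chance term, restricting the arrays to a region around the lesion would change the kappa values; the sketch leaves that choice to the caller.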