Abstract

A clinical study on the potential of range verification in proton therapy (PT) by prompt gamma imaging (PGI) is being carried out at our institution. Manual interpretation of the detected spot-wise range shift information is time-consuming and highly complex, and therefore not feasible in broad routine application. Here, we present an approach to automatically detect and classify treatment deviations in realistically simulated PGI data for head-and-neck cancer (HNC) treatments using convolutional neural networks (CNNs) and conventional machine learning (ML) approaches. For 12 HNC patients and 1 anthropomorphic head phantom (n=13), pencil beam scanning (PBS) treatment plans were generated, and 1 field per plan was assumed to be monitored with a PGI slit camera system. In total, 386 scenarios representing different relevant or non-relevant treatment deviations were simulated on planning and control CTs and manually classified into 7 classes: non-relevant changes (NR) and relevant changes (RE) triggering treatment intervention due to range prediction errors (±RP), setup errors in beam direction (±SE), anatomical changes (AC), or a combination of such errors (CB). PBS spots with reliable PGI information were considered at their nominal Bragg peak position for the generation of two 3D spatial maps of 16×16×16 voxels containing PGI-determined range shift and proton number information. Three complexity levels of simulated PGI data were investigated: (I) optimal PGI data, (II) realistic PGI data with simulated Poisson noise based on the locally delivered proton number, and (III) realistic PGI data with an additional positioning uncertainty of the slit camera following an experimentally determined distribution. For each complexity level, 3D-CNNs were trained on a data subset (n=9) using patient-wise leave-one-out cross-validation and tested on an independent test cohort (n=4).
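The counting-statistics noise of complexity level (II) can be illustrated with a small sketch: each voxel of the proton-number map holds an expected count, and the simulated measurement replaces it with a Poisson draw whose mean is that count. The function and map names below are illustrative, not from the study, and the sampler is Knuth's classic inversion-by-multiplication method (adequate for moderate means; for very large means a normal approximation would be used instead).

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) sample via Knuth's method.

    Suitable for moderate lam; math.exp(-lam) underflows for lam >~ 700.
    """
    if lam <= 0:
        return 0
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while p > threshold:
        k += 1
        p *= rng.random()
    return k - 1


def add_poisson_noise(proton_map, rng):
    """Replace each voxel's expected proton count with a Poisson draw,
    mimicking counting statistics that scale with the locally delivered
    proton number (hypothetical 16x16x16 nested-list map)."""
    return [[[sample_poisson(v, rng) for v in row]
             for row in plane]
            for plane in proton_map]


# Toy 16x16x16 map with a uniform expected count of 100 protons per voxel.
rng = random.Random(42)
expected = [[[100.0] * 16 for _ in range(16)] for _ in range(16)]
noisy = add_poisson_noise(expected, rng)
```

With 16³ = 4096 voxels, the mean of the noisy map stays close to the expected count while individual voxels fluctuate with standard deviation ≈ √100 = 10, which is the behavior the realistic PGI simulation aims to capture.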
Both the binary task of detecting RE and the multi-class task of classifying the underlying error source were investigated. Similarly, four conventional ML classifiers (logistic regression, multilayer perceptron, random forest, and support vector machine) were trained on five previously established handcrafted features extracted from the PGI data and used for performance comparison. On the test data, the CNN ensemble achieved a binary accuracy of 0.95, 0.96, and 0.93 and a multi-class accuracy of 0.83, 0.81, and 0.76 for complexity levels (I), (II), and (III), respectively. In the binary classification task, the CNN ensemble detected treatment deviations in the most realistic scenario with a sensitivity of 0.95 and a specificity of 0.88. The best-performing ML classifiers showed similar test performance. This study demonstrates that CNNs can reliably detect relevant changes in realistically simulated PGI data and classify most of the underlying sources of treatment deviations. The CNNs extracted meaningful features from the PGI data, with performance comparable to that of ML classifiers trained on previously established handcrafted features. These results highlight the potential of reliable, automatic interpretation of PGI data for treatment verification, which is highly desired for broad clinical application and is a prerequisite for including PGI in an automated feedback loop for online adaptive PT.
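The reported binary test metrics follow the standard confusion-matrix definitions: sensitivity is the fraction of relevant changes (RE) that are flagged, and specificity is the fraction of non-relevant changes (NR) that pass. A minimal sketch (function name and the toy labels are illustrative, not data from the study):

```python
def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity and accuracy from binary labels
    (1 = relevant change RE, 0 = non-relevant change NR)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),   # detected RE / all RE
        "specificity": tn / (tn + fp),   # passed NR / all NR
        "accuracy": (tp + tn) / len(y_true),
    }


# Toy example: 4 relevant and 4 non-relevant scenarios,
# one missed RE and one false alarm.
m = binary_metrics([1, 1, 1, 1, 0, 0, 0, 0],
                   [1, 1, 1, 0, 0, 0, 0, 1])
# m["sensitivity"] == 0.75, m["specificity"] == 0.75
```

Reporting sensitivity and specificity separately, as the abstract does, matters because the two error types have different clinical costs: a missed relevant change (false negative) risks mistreatment, while a false alarm (false positive) only triggers an unnecessary intervention check.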
