In patient-specific quality assurance (QA) for static beam intensity-modulated radiation therapy (IMRT), machine-learning-based dose analysis methods have been developed to identify the cause of an error as an alternative to gamma analysis. Although these new methods have revealed that the cause of the error can be identified by analyzing the dose distribution obtained from a two-dimensional detector, they have not been extended to the analysis of volumetric-modulated arc therapy (VMAT) QA. In this study, we propose a deep learning approach to detect various types of errors in patient-specific VMAT QA. A total of 161 beams from 104 prostate VMAT plans were analyzed. All beams were measured using a cylindrical detector (Delta4; ScandiDos, Uppsala, Sweden), and predicted dose distributions in a cylindrical phantom were calculated using a treatment planning system (TPS). In addition to the error-free plan, we simulated 12 types of errors: two types of multileaf collimator positional errors (systematic or random leaf offset of 2 mm), two types of monitor unit (MU) scaling errors (±3%), two types of gantry rotation errors (±2° in the clockwise and counterclockwise directions), and six types of phantom setup errors (±1 mm in the lateral, longitudinal, and vertical directions). The error-introduced predicted dose distributions were created by editing the TPS-calculated dose distributions with in-house software. Thirteen types of dose difference maps, consisting of an error-free map and 12 error maps, were created from the measured and predicted dose distributions and were used to train the convolutional neural network (CNN) model. Our model was a multi-task model that individually detected each of the 12 types of errors. Two datasets, Test sets 1 and 2, were prepared to evaluate the performance of the model.
Test set 1 consisted of the same 13 types of dose maps as those used for training, whereas Test set 2 included dose maps with 25 types of errors in addition to the error-free dose map. Each of these 25 error types was generated by combining two of the 12 types of simulated errors. For comparison with the performance of our model, gamma analysis was performed for Test sets 1 and 2 with the criteria set to 3%/2 mm and 2%/1 mm (dose difference/distance to agreement). For Test set 1, the overall accuracy of our CNN model, gamma analysis with the criteria set to 3%/2 mm, and gamma analysis with the criteria set to 2%/1 mm was 0.92, 0.19, and 0.81, respectively. Similarly, for Test set 2, the overall accuracy was 0.44, 0.42, and 0.95, respectively. Our model outperformed gamma analysis in the classification of dose maps containing a single type of error, but it was inferior to gamma analysis with the 2%/1 mm criteria in the classification of dose maps containing compound errors. A multi-task CNN model for detecting errors in patient-specific VMAT QA using a cylindrical measuring device was constructed, and its performance was evaluated. Our results demonstrate that our model was effective in identifying the error type in the dose map for VMAT QA.
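
For readers unfamiliar with the gamma criteria quoted above, the following is a minimal NumPy sketch of a global two-dimensional gamma pass rate. It is an illustration under stated assumptions, not the analysis software used in this study: the function name `gamma_pass_rate`, the brute-force search at grid resolution without interpolation, normalization to the maximum predicted dose, the 10% low-dose cutoff, and the synthetic Gaussian dose map are all hypothetical choices made for the example.

```python
import numpy as np


def gamma_pass_rate(measured, predicted, spacing_mm, dose_pct, dta_mm, cutoff=0.1):
    """Fraction of measured points with gamma <= 1 (global normalization).

    Assumptions for this sketch: both maps share one grid, the search runs
    over whole-pixel shifts only, and tolerances are relative to the
    maximum predicted dose.
    """
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    dose_tol = dose_pct / 100.0 * predicted.max()
    # Shifts beyond the DTA always give gamma > 1, so they cannot change
    # the pass/fail decision and are skipped.
    reach = int(np.ceil(dta_mm / spacing_mm))
    rows, cols = measured.shape
    # Pad with NaN so that shifts leaving the grid are ignored by np.fmin.
    padded = np.pad(predicted, reach, mode="constant", constant_values=np.nan)
    gamma_sq = np.full(measured.shape, np.inf)
    for dy in range(-reach, reach + 1):
        for dx in range(-reach, reach + 1):
            dist_sq = (dy**2 + dx**2) * spacing_mm**2
            shifted = padded[reach + dy:reach + dy + rows,
                             reach + dx:reach + dx + cols]
            candidate = dist_sq / dta_mm**2 + ((shifted - measured) / dose_tol) ** 2
            gamma_sq = np.fmin(gamma_sq, candidate)
    # Only points above the low-dose cutoff are evaluated.
    evaluated = measured >= cutoff * predicted.max()
    return float(np.mean(gamma_sq[evaluated] <= 1.0))


# Synthetic example: a Gaussian "measured" map on a 1 mm grid and a
# predicted map carrying a +3% MU scaling error.
axis = (np.arange(41) - 20) * 1.0
yy, xx = np.meshgrid(axis, axis, indexing="ij")
measured = 2.0 * np.exp(-(xx**2 + yy**2) / (2 * 10.0**2))
predicted_mu_error = 1.03 * measured

rate_loose = gamma_pass_rate(measured, predicted_mu_error, 1.0, 3.0, 2.0)
rate_tight = gamma_pass_rate(measured, predicted_mu_error, 1.0, 2.0, 1.0)
print(f"3%/2 mm pass rate: {rate_loose:.3f}")
print(f"2%/1 mm pass rate: {rate_tight:.3f}")
```

In this toy case every evaluated point passes under 3%/2 mm, while points near the dose peak fail under 2%/1 mm. The example only illustrates how the choice of criteria affects sensitivity to a small MU scaling error and does not reproduce the data or results reported here.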