<h3>Purpose/Objective(s)</h3> Treatment planning for intensity-modulated radiotherapy (IMRT) is often time-consuming and requires multiple rounds of trial and error. The main difficulty is that the optimal achievable dose distribution for a given patient anatomy is not known in advance. This study aims to develop a deep learning model that predicts the dose distribution for a new patient with dosimetric quality similar to that of previous clinical plans.
<h3>Materials/Methods</h3> The dataset comes from OpenKBP 2020, an international competition on dose prediction for head and neck patients, with three target prescriptions (56 Gy, 63 Gy, and 70 Gy). The evaluation criteria are the voxel-by-voxel dose discrepancy and the point-by-point dose-volume histogram (DVH) difference. The organizer provides 200 training cases. Each has one planning CT image, one dose distribution, and one segmentation mask of seven organs at risk (OARs: brainstem, spinal cord, right parotid, left parotid, larynx, esophagus, and mandible). A further 40 cases are provided as a validation set with CT images and segmentation masks only. The organizer reported the prediction quality for these 40 validation patients to help refine the model, but did not disclose the ground-truth dosimetry. Finally, 100 testing cases were used to evaluate model performance. We propose a triple-stage cascaded U-Net that predicts the dose distribution in a coarse-to-fine manner using the auto-context mechanism. Specifically, the first U-Net takes the CT image and structure contours as input and outputs a coarse dose distribution. This distribution is then fed into the second U-Net, together with the CT image and contours, to predict a refined dose distribution. Finally, the third U-Net takes the dose distributions predicted by the first and second U-Nets, together with the CT image and contours, and generates the final dose distribution, which refines the previous two predictions. The network is implemented fully in 3D: it takes the 3D CT volume as input rather than 2D CT slices.
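The data flow of the three-stage auto-context cascade can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `cascade_predict` and `make_toy_stage` are hypothetical, volumes are flattened lists rather than 3D tensors, and each "stage" is a toy function standing in for a trained 3D U-Net. It only shows which inputs each stage receives.

```python
from typing import Callable, List, Sequence

Volume = List[float]  # one flattened 3D volume; a real model would use a 3D tensor
Stage = Callable[[Sequence[Volume]], Volume]  # stand-in for one trained 3D U-Net


def cascade_predict(
    ct: Volume, masks: Sequence[Volume], stages: Sequence[Stage]
) -> List[Volume]:
    """Return the dose predicted by every stage, from coarse to fine."""
    doses: List[Volume] = []
    for stage in stages:
        # Auto-context: the original inputs plus every earlier dose prediction.
        channels = [ct, *masks, *doses]
        doses.append(stage(channels))
    return doses


def make_toy_stage(seen: List[int]) -> Stage:
    """Hypothetical stage: records its channel count, returns the voxel-wise mean."""
    def stage(channels: Sequence[Volume]) -> Volume:
        seen.append(len(channels))
        return [sum(voxel) / len(channels) for voxel in zip(*channels)]
    return stage


channel_counts: List[int] = []
ct = [0.0, 1.0, 2.0, 3.0]                               # toy CT with four voxels
masks = [[0.0, 1.0, 1.0, 0.0], [1.0, 0.0, 0.0, 1.0]]    # two toy structure masks
doses = cascade_predict(ct, masks, [make_toy_stage(channel_counts)] * 3)
final_dose = doses[-1]       # output of the third stage
print(channel_counts)        # [3, 4, 5]
```

The growing channel count reflects the description above: the second stage sees one earlier prediction and the third stage sees two, in addition to the CT image and contours.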
Training the model for 400 epochs takes about 17 hours, using a batch size of 6 and the Adam optimizer with a base learning rate of 0.0005, on a GPU server equipped with six NVIDIA TITAN Xp graphics cards with 12 GB of memory each. Prediction is almost instantaneous, at about 0.2 seconds per case.
<h3>Results</h3> The proposed method performed better with 200 training cases than with 160, which demonstrates how strongly deep learning-based methods depend on the amount of training data. The dose similarity score and DVH similarity score on the 100 testing cases are 2.753 and 1.559, respectively. Our model was among the best-performing entries in the competition. The predicted dosimetry is similar to the ground-truth dosimetry, with no clinically meaningful difference.
<h3>Conclusion</h3> We developed a novel cascaded auto-context deep learning model to predict the optimal dosimetry for head and neck patients. The model achieved performance satisfactory for clinical use.
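For readers unfamiliar with the two scores reported in the Results, the sketch below shows one plausible way to compute them. The exact challenge definitions are not given in this abstract, so this is an assumption: the dose score is treated as the mean absolute voxel-wise dose difference inside a mask, and the DVH comparison is reduced to a single DVH point (mean dose to one structure). The function names and toy values are hypothetical.

```python
from typing import Sequence


def dose_score(
    predicted: Sequence[float], reference: Sequence[float], mask: Sequence[bool]
) -> float:
    """Mean absolute dose difference (Gy) over the voxels selected by mask."""
    diffs = [abs(p - r) for p, r, m in zip(predicted, reference, mask) if m]
    if not diffs:
        raise ValueError("mask selects no voxels")
    return sum(diffs) / len(diffs)


def mean_dose(dose: Sequence[float], structure: Sequence[bool]) -> float:
    """Mean dose (Gy) inside one structure: a single point on its DVH summary."""
    inside = [d for d, s in zip(dose, structure) if s]
    if not inside:
        raise ValueError("structure is empty")
    return sum(inside) / len(inside)


# Toy four-voxel example (values are illustrative, not from the study).
predicted = [70.0, 63.0, 20.0, 0.0]
reference = [68.0, 64.0, 23.0, 5.0]
body_mask = [True, True, True, False]      # voxels counted in the dose score
target = [True, True, False, False]        # voxels belonging to one structure

voxel_error = dose_score(predicted, reference, body_mask)
dvh_point_error = abs(mean_dose(predicted, target) - mean_dose(reference, target))
print(voxel_error, dvh_point_error)        # 2.0 0.5
```

Under this reading, lower values of both scores mean the prediction is closer to the clinical plan.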