Abstract

Although commercial treatment planning systems (TPSs) can automatically solve the optimization problem for treatment planning, human planners need to define and adjust the planning objectives/constraints to obtain clinically acceptable plans. This process is labor-intensive and time-consuming. In this work, we present an end-to-end study that trains a deep reinforcement learning (DRL) based virtual treatment planner (VTP) to operate, as a human planner would, a dose-volume constrained treatment plan optimization engine that follows the parameters used in Eclipse TPS for high-quality treatment planning. We used prostate cancer IMRT treatment planning as the testbed. The VTP took the dose-volume histogram (DVH) of a plan as input and predicted the optimal strategy for constraint adjustment to improve the plan quality. The training of the VTP followed the state-of-the-art Q-learning framework. Experience replay was implemented with epsilon-greedy search to explore the impacts of taking different actions on a large number of automatically generated plans, from which an optimal policy can be learned. Since a major computational cost in training was solving the plan optimization problem repeatedly, we implemented a graphics processing unit (GPU)-based technique that improved the efficiency 2-fold. Upon completion of training, the established VTP was deployed to plan for an independent set of 50 testing patient cases. By connecting the established VTP to the Eclipse workstation via the application programming interface, we also tested the performance of the VTP in operating Eclipse TPS for automatic treatment planning with another two independent patient cases. Like a human planner, the VTP kept adjusting the planning objectives/constraints to improve plan quality until the plan was acceptable or the maximum number of adjustment steps was reached under both scenarios. The generated plans were evaluated using the ProKnow scoring system.
The mean plan score (± standard deviation) of the 50 testing cases was improved from 6.18 ± 1.75 to 8.14 ± 1.27 by the VTP, with 9 being the maximal score. For the two cases under Eclipse dose optimization, the plan scores were improved from 8 to 8.4 and 8.7, respectively, by the VTP. These results indicate that the proposed DRL-based VTP was able to operate both the in-house dose-volume constrained TPS and Eclipse TPS to automatically generate high-quality treatment plans for prostate cancer IMRT.
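
The training scheme named in the abstract (Q-learning with experience replay and epsilon-greedy search, stopping when the plan is acceptable or a step limit is reached) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: a tabular Q-function stands in for the deep network, the state is a toy plan score from 0 to 9 rather than a DVH, the two actions are hypothetical "loosen" and "tighten" adjustments of a single constraint, and the hyperparameters are arbitrary.

```python
import random
from collections import deque

N_STATES = 10       # toy stand-in for plan quality levels 0..9 (9 = acceptable plan)
ACTIONS = (-1, +1)  # hypothetical actions: index 0 = loosen, index 1 = tighten
GAMMA = 0.9         # discount factor (illustrative value)
ALPHA = 0.5         # learning rate (illustrative value)


def step(state, action_idx):
    """Toy environment: tightening raises the score by one, loosening lowers it."""
    next_state = min(max(state + ACTIONS[action_idx], 0), N_STATES - 1)
    reward = next_state - state        # reward is the change in plan score
    done = next_state == N_STATES - 1  # stop once the plan is acceptable
    return next_state, reward, done


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    return max(range(len(ACTIONS)), key=lambda a: q[state][a])


def train(episodes=200, max_steps=30, batch_size=16, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]
    replay = deque(maxlen=1000)  # experience replay buffer
    for _ in range(episodes):
        state = rng.randrange(N_STATES - 1)  # random non-acceptable starting plan
        for _ in range(max_steps):           # maximum number of adjustment steps
            action = epsilon_greedy(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            replay.append((state, action, reward, next_state, done))
            # Q-learning update on a random mini-batch of stored transitions.
            batch = rng.sample(list(replay), min(batch_size, len(replay)))
            for s, a, r, s_next, d in batch:
                target = r if d else r + GAMMA * max(q[s_next])
                q[s][a] += ALPHA * (target - q[s][a])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Return the index of the highest-valued action for every state."""
    return [max(range(len(ACTIONS)), key=lambda a: q[s][a]) for s in range(N_STATES)]
```

Calling `greedy_policy(train())` returns one action index per toy state. In this toy setting the learned policy should favor the "tighten" action in every non-terminal state, since that is the only action that raises the score. In the study itself, each environment step would correspond to re-solving the plan optimization problem, which is the cost the GPU-based technique targets.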
