Deep reinforcement learning for treatment planning in high-dose-rate cervical brachytherapy.

Gang Pu, Yuanjing Hu, Zhiyong Yang, Shan Jiang, Ziqi Liu

doi:10.1016/j.ejmp.2021.12.009

https://doi.org/10.1016/j.ejmp.2021.12.009

Abstract

High-dose-rate (HDR) brachytherapy (BT) is an effective cancer treatment method in which the radiation source is placed within the body. Treatment planning is critical to a successful outcome. Almost all currently proposed treatment planning methods are built on stochastic heuristic algorithms, which limits the quality of the plans they can generate. This study proposed a novel treatment planning method that adjusts dwell times in a human-like fashion to improve plan quality. We built an intelligent treatment planner network (ITPN) based on deep reinforcement learning (DRL). ITPN uses a Dueling Double Deep Q-Network architecture. The state is the set of dwell times at all dwell positions, and an action selects which dwell time to adjust and how to adjust it. A hybrid equivalent uniform dose objective function was established, and rewards were assigned according to changes in its value. Experience replay was performed with the epsilon-greedy algorithm and a SumTree data structure. In the evaluation of ITPN on 20 patient cases, D90, D100 and V100 showed no significant difference compared with inverse planning simulated annealing (IPSA) optimization. However, D2cc of the bladder, rectum and sigmoid, as well as V150 and V200, were significantly reduced, and the homogeneity index and conformity index were significantly increased. Based on its learned dwell time adjustment policy, the proposed ITPN was able to generate higher quality plans than IPSA. This is the first artificial intelligence system that can directly determine the dwell times of HDR BT, and it demonstrates the potential feasibility of solving optimization problems via DRL.
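
The state and action formulation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the action set, the network, or the reward values, so the adjustment factors in `ADJUSTMENTS` and all function names here are hypothetical. The sketch shows one plausible way to encode "which dwell time to adjust and how" as a flat action index, the standard dueling aggregation of a state value and per-action advantages into Q-values, and epsilon-greedy action selection.

```python
import random

# Hypothetical multiplicative adjustments; the abstract does not give the real action set.
ADJUSTMENTS = (0.5, 0.9, 1.1, 1.5)


def decode_action(action_index, n_adjustments=len(ADJUSTMENTS)):
    """Map a flat action index to (dwell position index, adjustment index)."""
    return divmod(action_index, n_adjustments)


def apply_action(dwell_times, action_index):
    """Return a new state in which one dwell time is scaled by the chosen factor."""
    position, adjustment = decode_action(action_index)
    new_state = list(dwell_times)
    new_state[position] = dwell_times[position] * ADJUSTMENTS[adjustment]
    return new_state


def dueling_q(value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')."""
    mean_advantage = sum(advantages) / len(advantages)
    return [value + a - mean_advantage for a in advantages]


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


# Example: three dwell positions, so 3 * 4 = 12 possible actions.
state = [2.0, 4.0, 1.0]
next_state = apply_action(state, 5)  # position 1, factor 0.9
```

In a full agent, `value` and `advantages` would come from the two output streams of the network, and the Double DQN target would use the online network to choose the next action and the target network to evaluate it. Those parts are omitted here because the abstract gives no detail on them.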
