https://doi.org/10.1088/1361-6560/ad965d
Journal: Physics in medicine and biology | Publication Date: Nov 22, 2024

To develop a deep reinforcement learning (DRL) agent to self-interact with the treatment planning system (TPS) to automatically generate intensity modulated radiation therapy (IMRT) treatment plans for head-and-neck (HN) cancer with consistent organ-at-risk (OAR) sparing performance.
Methods:
With IRB approval, 120 HN patients receiving IMRT were included. The DRL agent was trained with 20 patients. During each inverse optimization process, the intermediate dosimetric endpoint values, dose-volume constraint values and structure objective function losses were collected as the DRL states. By adjusting the objective constraints as actions, the agent learned to seek optimal rewards by balancing OAR sparing and planning target volume (PTV) coverage. Rewards computed from the current dose-volume histogram (DVH) endpoints and clinical objectives were sent back to the agent to update the action policy during model training. The trained agent was evaluated on the remaining 100 patients.
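The reward idea described above can be illustrated with a minimal sketch. This is not the authors' reward function: the `Objective` class, the `plan_reward` function, the structure names, the dose limits and the weights are all hypothetical, chosen only to show how DVH endpoints and clinical objectives might be combined into a single scalar that penalizes both target under-coverage and OAR overdose.

```python
from dataclasses import dataclass


@dataclass
class Objective:
    """One clinical objective (hypothetical form, not taken from the paper)."""
    endpoint: str            # key of the DVH endpoint, e.g. "PTV_D95"
    limit_gy: float          # illustrative dose limit in Gy
    weight: float            # illustrative relative importance
    is_target: bool = False  # True: dose should be >= limit; False: <= limit


def plan_reward(endpoints: dict[str, float], objectives: list[Objective]) -> float:
    """Return 0.0 when every objective is met, negative otherwise."""
    reward = 0.0
    for obj in objectives:
        dose = endpoints[obj.endpoint]
        if obj.is_target:
            # Penalize target under-coverage.
            violation = max(0.0, obj.limit_gy - dose)
        else:
            # Penalize OAR dose above its limit.
            violation = max(0.0, dose - obj.limit_gy)
        # Normalize by the limit so objectives with different scales are comparable.
        reward -= obj.weight * violation / obj.limit_gy
    return reward


# Illustrative values only; they are not results from the study.
objectives = [
    Objective("PTV_D95", limit_gy=70.0, weight=10.0, is_target=True),
    Objective("parotid_L_median", limit_gy=26.0, weight=1.0),
]
met = plan_reward({"PTV_D95": 70.0, "parotid_L_median": 26.0}, objectives)
violated = plan_reward({"PTV_D95": 70.0, "parotid_L_median": 39.0}, objectives)
```

In a training loop, a scalar of this kind would be computed after each constraint adjustment and returned to the agent. Weighting the target term more heavily than the OAR terms is one simple way to express the balance between PTV coverage and OAR sparing.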
Results:
The DRL agent was able to generate a clinically acceptable IMRT plan within 12.4±3.1 minutes without human intervention. DRL plans had a lower PTV maximum dose (109.2%) than clinical plans (112.4%) (p<.05). The average median doses to the left parotid, right parotid, oral cavity, larynx and pharynx in DRL plans were 15.6 Gy, 12.2 Gy, 25.7 Gy, 27.3 Gy and 32.1 Gy, respectively, comparable to 17.1 Gy, 15.7 Gy, 24.4 Gy, 23.7 Gy and 35.5 Gy in the corresponding clinical plans. The maximum doses to the cord+5mm, brainstem and mandible were also comparable between the two groups. In addition, DRL plans demonstrated reduced variability, as evidenced by smaller 95% confidence intervals. The total monitor units (MU) were 1611 for DRL plans versus 1870 for clinical plans (p<.05). The results indicate that the DRL agent follows a consistent planning strategy, in contrast to the planners' occasional back-and-forth decision-making during planning.
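To make the variability comparison concrete, the following sketch computes a mean and a 95% confidence interval for a list of per-patient doses. The abstract does not state how the intervals were calculated, so the normal approximation (1.96 standard errors) is an assumption, and the function name `mean_ci95` and the sample values are hypothetical.

```python
import math
import statistics


def mean_ci95(values: list[float]) -> tuple[float, float, float]:
    """Return (mean, lower, upper) using a normal-approximation 95% CI."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to estimate a confidence interval")
    mean = statistics.mean(values)
    # Standard error of the mean from the sample standard deviation.
    sem = statistics.stdev(values) / math.sqrt(n)
    half_width = 1.96 * sem
    return mean, mean - half_width, mean + half_width


# Made-up median doses in Gy for two small groups, for illustration only.
consistent_group = [14.0, 15.0, 16.0, 15.0, 15.0]
variable_group = [10.0, 20.0, 15.0, 12.0, 18.0]

_, lo_a, hi_a = mean_ci95(consistent_group)
_, lo_b, hi_b = mean_ci95(variable_group)
narrower = (hi_a - lo_a) < (hi_b - lo_b)
```

Both made-up groups have the same mean of 15 Gy, but the first has the narrower interval, which is the sense in which a smaller 95% confidence interval signals more consistent plans. For small samples a t-distribution quantile would be more appropriate than 1.96.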
Conclusion:
The proposed deep reinforcement learning (DRL) agent is capable of efficiently generating HN IMRT plans with consistent quality. 