Abstract

In the field of automated segmentation, the most promising results have been obtained with deep learning (DL) architectures that combine fully convolutional neural networks (FCNs), generative adversarial networks (GANs), or recurrent neural networks (RNNs). In this paper we challenge the status quo of supervised learning algorithms by proposing and implementing deep reinforcement learning (DRL) architectures that perform the same task: automatic segmentation of human organs. We present a new approach that formulates segmentation as a Markov decision process (MDP) in which an agent navigates the environment (CT scans) and learns a policy for delineating the contour of a human organ. To solve this problem, we built several architectures around DRL agents. The first employs the Deep Q-Network (DQN) algorithm and uses discrete actions to navigate through the CT scans while segmenting the heart. The second implements the Proximal Policy Optimization (PPO) algorithm and uses a continuous action space to perform the same segmentation task. We tested both architectures in different setups and with varied reward schemes, and our results show that DRL can be applied successfully to automated medical segmentation. We also show that both the off-policy, discrete-action algorithm and the on-policy, continuous-action algorithm converge to generalizable results.
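
The abstract frames segmentation as an MDP but, by its nature, gives no implementation details. The sketch below is a minimal, hypothetical illustration of how such an environment could look, assuming a discrete up/down/left/right action space, an image-patch state centred on the agent, and a simple contour-based reward; the class name, patch size, and reward values are illustrative assumptions, not the authors' design.

```python
import numpy as np

# Illustrative sketch of the segmentation-as-MDP idea from the abstract.
# The state, actions, and reward below are assumptions for demonstration only.

ACTIONS = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}  # up, down, left, right


class ContourTracingEnv:
    """The agent moves pixel by pixel over a CT slice, rewarded for staying
    on the ground-truth organ contour (a binary mask of contour pixels)."""

    def __init__(self, ct_slice: np.ndarray, contour_mask: np.ndarray, patch: int = 15):
        self.ct = ct_slice
        self.mask = contour_mask
        self.patch = patch  # side length of the square observation patch
        self.pos = (0, 0)

    def _observe(self) -> np.ndarray:
        # State: an image patch centred on the agent's current position.
        r, c = self.pos
        h = self.patch // 2
        padded = np.pad(self.ct, h, mode="edge")
        return padded[r:r + self.patch, c:c + self.patch]

    def reset(self, start=(0, 0)) -> np.ndarray:
        self.pos = start
        return self._observe()

    def step(self, action: int):
        dr, dc = ACTIONS[action]
        r = int(np.clip(self.pos[0] + dr, 0, self.ct.shape[0] - 1))
        c = int(np.clip(self.pos[1] + dc, 0, self.ct.shape[1] - 1))
        self.pos = (r, c)
        # Reward: +1 on a contour pixel, small penalty otherwise (assumed scheme).
        reward = 1.0 if self.mask[r, c] else -0.1
        done = False  # a termination rule (e.g. contour closed) would go here
        return self._observe(), reward, done
```

Under these assumptions, a DQN agent would learn a policy over the four discrete moves, while a PPO agent would replace them with continuous displacements, mirroring the two architectures described above.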
