Abstract

Multimodal image registration is a vital initial step in many medical imaging applications, as it brings together complementary information from different data modalities. Since images of different modalities do not exhibit the same characteristics, finding accurate correspondences between them remains a challenge. For conventional multimodal registration methods, two components are critical: a descriptive image feature and a suitable similarity metric. However, these two components are often hand-crafted and cannot cope with the high diversity of tissue appearance across modalities. In this paper, we cast image registration as a decision-making problem, where registration is achieved by an artificial agent trained with asynchronous reinforcement learning. More specifically, a convolutional long short-term memory (LSTM) module is incorporated after stacked convolutional layers to extract spatio-temporal image features and learn the similarity metric implicitly. A customized reward function driven by landmark error guides the agent toward the correct registration. A Monte Carlo rollout strategy is further leveraged as look-ahead inference at test time to increase registration accuracy. Experiments on paired CT and MR images of patients diagnosed with nasopharyngeal carcinoma demonstrate that our method achieves state-of-the-art performance in medical image registration.
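
To make the two mechanisms above concrete, the following is a minimal, hypothetical Python sketch of a landmark-error-driven reward and a Monte Carlo rollout used as look-ahead inference. The rigid 2D translation action set, the function names, and the random rollout policy are illustrative assumptions for a toy setting, not the paper's actual architecture or training procedure.

import numpy as np

# Hypothetical toy setting: rigid 2D registration by unit translations.
# The action set, names, and policy below are illustrative assumptions.
ACTIONS = [np.array([1, 0]), np.array([-1, 0]),
           np.array([0, 1]), np.array([0, -1])]

def landmark_error(moving_landmarks, fixed_landmarks, offset):
    # Mean Euclidean distance between corresponding landmarks after
    # applying the current translation offset to the moving image.
    shifted = moving_landmarks + offset
    return np.mean(np.linalg.norm(shifted - fixed_landmarks, axis=1))

def reward(prev_error, new_error):
    # Reward is the reduction in landmark error: positive when an action
    # moves the images closer to alignment, negative otherwise.
    return prev_error - new_error

def monte_carlo_rollout(offset, moving, fixed, policy, depth=5,
                        n_rollouts=8, rng=np.random.default_rng(0)):
    # Look-ahead inference: for each candidate first action, simulate
    # several short rollouts under the given policy and select the action
    # with the highest average cumulative reward.
    best_action, best_value = None, -np.inf
    for a, first_step in enumerate(ACTIONS):
        values = []
        for _ in range(n_rollouts):
            off = offset + first_step
            err = landmark_error(moving, fixed, off)
            total = reward(landmark_error(moving, fixed, offset), err)
            for _ in range(depth - 1):
                step = ACTIONS[policy(off, rng)]
                new_err = landmark_error(moving, fixed, off + step)
                total += reward(err, new_err)
                off, err = off + step, new_err
            values.append(total)
        if np.mean(values) > best_value:
            best_value, best_action = np.mean(values), a
    return best_action

# Toy usage with synthetic landmarks and a random rollout policy.
fixed = np.array([[10.0, 10.0], [20.0, 15.0], [12.0, 25.0]])
moving = fixed - np.array([3.0, -2.0])  # misaligned by a known shift
random_policy = lambda off, rng: rng.integers(len(ACTIONS))
best = monte_carlo_rollout(np.zeros(2), moving, fixed, random_policy)
print("first action chosen by rollout:", ACTIONS[best])

In the paper itself, actions are scored by the trained conv-LSTM agent rather than a random policy; the sketch only illustrates how landmark error can drive the reward and how rollouts provide look-ahead at test time.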
