Abstract

Self-supervised learning aims to learn transferable representations from unlabeled data for downstream tasks. Inspired by masked language modeling in natural language processing, masked image modeling (MIM) has achieved certain success in the field of computer vision, but its effectiveness in medical images remains unsatisfactory. This is mainly due to the high redundancy and small discriminative regions in medical images compared to natural images. Therefore, this paper proposes an adaptive hard masking (AHM) approach based on deep reinforcement learning to expand the application of MIM in medical images. Unlike predefined random masks, AHM uses an asynchronous advantage actor-critic (A3C) model to predict reconstruction loss for each patch, enabling the model to learn where masking is valuable. By optimizing the non-differentiable sampling process using reinforcement learning, AHM enhances the understanding of key regions, thereby improving downstream task performance. Experimental results on two medical image datasets demonstrate that AHM outperforms state-of-the-art methods. Additional experiments under various settings validate the effectiveness of AHM in constructing masked images.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.