Abstract

Recently, deep multi-instance neural networks have been successfully applied to medical image classification, where only image-level labels rather than fine-grained patch-level labels are available. One key issue for these multi-instance neural networks is how to aggregate all patch (instance) features into a whole-image (bag) representation, an operation referred to as multi-instance pooling, e.g., max, mean, and attention-based pooling. However, these multi-instance pooling operations do not take the structural information within an image into account. This is problematic for medical image classification, since there often exist dependencies among regional patches/lesions. In this paper, we propose an adaptive recurrent pooling-based deep multi-instance neural network. In this network, we first extract multi-view global structural features from every bag using the self-attention mechanism, and then aggregate these multi-view features into a whole-bag representation via the adaptive recurrent pooling operation, which further captures the contextual information within the bag. Moreover, we introduce the cross-normalization operation from the Unit Force Operated Vision Transformer into the self-attention module to reduce its computational complexity. We experimentally evaluate the proposed network on three medical image datasets, namely UCSB breast, Messidor, and Colon cancer. The results demonstrate the advantage of our network over current state-of-the-art deep multi-instance networks in terms of classification accuracy and interpretability.
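To make the multi-instance pooling step concrete, the sketch below shows a minimal NumPy implementation of standard attention-based pooling (in the style of Ilse et al.), which the abstract names as one baseline: each instance feature receives a learned attention score, and the bag representation is the attention-weighted sum of instance features. All variable names and dimensions here are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(H, V, w):
    """Attention-based multi-instance pooling.

    H : (K, D) instance (patch) features for one bag
    V : (L, D) learned projection for the attention scorer
    w : (L,)   learned attention weight vector

    score_k = w^T tanh(V h_k); a = softmax(scores); bag = sum_k a_k h_k
    Returns the (D,) bag representation and the (K,) attention weights.
    """
    scores = np.tanh(H @ V.T) @ w      # one scalar score per instance
    a = softmax(scores)                # attention weights sum to 1
    return a @ H, a                    # weighted sum of instance features

# Toy bag: 5 patches with 8-dim features, 4-dim attention space (assumed sizes).
rng = np.random.default_rng(0)
K, D, L = 5, 8, 4
H = rng.standard_normal((K, D))
V = rng.standard_normal((L, D))
w = rng.standard_normal(L)

bag, a = attention_pool(H, V, w)
assert bag.shape == (D,) and np.isclose(a.sum(), 1.0)
```

Note that this baseline scores each instance independently, which is exactly the limitation the abstract points to: no structural dependency between patches is modeled, motivating the proposed self-attention plus adaptive recurrent pooling design.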
