Abstract

Precise and efficient polyp segmentation plays a crucial role in colonoscopy and is important for the prevention of colorectal cancer. Although CNN-based methods have made great progress on the polyp segmentation task, they are incapable of modeling long-range dependencies. Transformer-based models use the self-attention mechanism to overcome this problem but suffer from a heavy computational cost. Benefiting from their simple structure, MLP-based models seem to be an alternative; however, they struggle to handle flexible input scales and to model long-term dependencies. Both factors are important for image segmentation, which could explain why the MLP architecture performs poorly compared to the Transformer. To remedy these issues, we propose a novel Polyp-Mixer, which uses MLP-based structures in both the encoder and the decoder. In particular, we use CycleMLP as the encoder to overcome the fixed-input-scale issue. We also propose a Multi-head Mixer by converting the current CycleMLP into a multi-head fashion, allowing our model to explore rich context information from various subspaces. In addition, we build a powerful Contextual Bridger Module between the encoder and decoder, which captures semantics from larger receptive fields and combines them with the various decoder layers. Experiments demonstrate that the proposed method achieves state-of-the-art results on four public benchmarks with fewer parameters ( <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$\sim 16\text{M}$ </tex-math></inline-formula> ). Our code will be released at <uri xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">https://github.com/shijinghuihub/Polyp-Mixer</uri>.
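The "multi-head fashion" mentioned in the abstract can be illustrated with a minimal NumPy sketch: channels are split into heads, each head's channels are mixed by its own linear projection, and the results are concatenated. All names and shapes here are illustrative assumptions, and CycleMLP's spatially shifted sampling positions are omitted; this is not the paper's actual implementation.

```python
import numpy as np


def multi_head_channel_mix(x, head_weights):
    """Simplified multi-head channel mixing (illustrative only).

    x: (n_tokens, n_channels) token features.
    head_weights: list of per-head (d, d) mixing matrices, where
    d = n_channels // len(head_weights). Each head mixes only its own
    channel slice, so different heads can learn different subspaces.
    """
    n, c = x.shape
    h = len(head_weights)
    d = c // h
    outs = []
    for i, w in enumerate(head_weights):
        head = x[:, i * d:(i + 1) * d]  # channel slice for head i: (n, d)
        outs.append(head @ w)           # per-head channel mixing: (n, d)
    return np.concatenate(outs, axis=1)  # back to (n, c)


rng = np.random.default_rng(0)
x = rng.standard_normal((4, 16))                            # 4 tokens, 16 channels
weights = [rng.standard_normal((4, 4)) for _ in range(4)]   # 4 heads, 4 channels each
y = multi_head_channel_mix(x, weights)
print(y.shape)  # (4, 16)
```

Compared with a single 16x16 mixing matrix, the four 4x4 per-head matrices use fewer parameters while still letting each head attend to a distinct channel subspace, which matches the abstract's motivation of exploring context from various subspaces at low cost.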
