Abstract

Transformers have shown excellent performance on a wide range of visual tasks, making their application to medical imaging a natural next step. Nevertheless, naively applying a transformer to small-scale cervical nuclei datasets leads to poor performance. The scarce nuclei pixels cannot compensate for the absence of the intrinsic inductive biases inherent to CNNs, making it difficult for a transformer to model local visual structures and handle scale variations. We therefore propose a Pixel Adaptive Transformer (PATrans) that improves the segmentation of nuclei edges on small datasets through adaptive pixel tuning. Specifically, to mitigate the information loss caused by mapping different patches to similar latent representations, the Consecutive Pixel Patch (CPP) module embeds rich multi-scale context into otherwise isolated image patches. This provides intrinsic scale invariance for the 1D input sequence and maintains semantic consistency, allowing PATrans to establish long-range dependencies quickly. Furthermore, because existing handcrafted attention is agnostic to the widely varying pixel distributions, the Pixel Adaptive Transformer Block (PATB) models the relationships between pixels across the entire feature map in a data-dependent manner, guided by the important regions. By jointly learning local features and global dependencies, PATrans adaptively suppresses interference from irrelevant pixels. Extensive experiments demonstrate the superiority of our model on three datasets (ours, ISBI, and Herlev).
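To make the CPP idea concrete, the sketch below shows a generic multi-scale patch embedding: parallel dilated depthwise convolutions gather context at several receptive-field sizes before the patches are flattened into the 1D token sequence a transformer consumes. This is an illustrative assumption only; the module name, dilation rates, and dimensions (MultiScalePatchEmbed, dilations=(1, 2, 3), embed_dim=96) are not taken from the paper.

```python
# Hypothetical sketch of a CPP-style multi-scale patch embedding;
# not the authors' implementation.
import torch
import torch.nn as nn


class MultiScalePatchEmbed(nn.Module):
    """Embed image patches while mixing in multi-scale local context."""

    def __init__(self, in_ch=3, embed_dim=96, patch_size=4, dilations=(1, 2, 3)):
        super().__init__()
        # One depthwise context branch per scale (cheap, per-channel convs).
        self.context = nn.ModuleList(
            nn.Conv2d(in_ch, in_ch, kernel_size=3, padding=d, dilation=d,
                      groups=in_ch, bias=False)
            for d in dilations
        )
        # Fuse the raw pixels with the multi-scale context maps, then cut the
        # fused map into non-overlapping patch tokens.
        self.proj = nn.Conv2d(in_ch * (len(dilations) + 1), embed_dim,
                              kernel_size=patch_size, stride=patch_size)
        self.norm = nn.LayerNorm(embed_dim)

    def forward(self, x):                           # x: (B, C, H, W)
        feats = [x] + [branch(x) for branch in self.context]
        fused = torch.cat(feats, dim=1)             # (B, C*(S+1), H, W)
        tokens = self.proj(fused)                   # (B, D, H/p, W/p)
        tokens = tokens.flatten(2).transpose(1, 2)  # (B, N, D) token sequence
        return self.norm(tokens)


if __name__ == "__main__":
    emb = MultiScalePatchEmbed()
    seq = emb(torch.randn(2, 3, 64, 64))
    print(seq.shape)  # torch.Size([2, 256, 96])
```

Each token in the resulting sequence already carries context from several neighborhood sizes, which is one plausible way to give the 1D input the scale-aware, semantically consistent structure the abstract attributes to CPP.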
