Detection and segmentation of cell nuclei in hematoxylin and eosin-stained tissue images are critical clinical tasks, complicated by intricate backgrounds, variable nuclear appearances, overlapping nuclei, and unclear boundaries. These complexities make automated instance segmentation a difficult research problem. This paper therefore proposes a deep aggregation transformer network (DAT-Net) for automatic nucleus instance segmentation. The method employs a robust Vision Transformer-based backbone to extract informative features from images. To fuse multi-level information, we propose a group aggregation multi-level feature fusion module, which merges low-level features for shape representation with high-level features for pixel-level edge segmentation, enhancing overall model performance. A depthwise integrated multilayer perceptron module enhances feature learning by combining the global, context-dependent representations of MLPs with the local representations of convolutions through channel separation; this fosters feature learning across different modes and thereby improves the network's representational power. A bidirectional path interaction feature processing module is introduced for multi-scale information extraction and fusion. This module enhances the entire feature pyramid hierarchy by transferring deep semantic information to shallow layers with strong spatial information via bidirectional paths. A Large Kernel Attention module at each layer further improves feature extraction through its large receptive field. Our approach outperforms state-of-the-art models, including StarDist, CPP-Net, TSFD-Net, and TransNuSeg, by 7.8%, 6.3%, 1.5%, and 3.5% in average panoptic quality across the cell nucleus segmentation categories and tissues of the PanNuke dataset. Furthermore, DAT-Net outperforms all other state-of-the-art models in F1 score for the five cell nucleus classes in the PanNuke dataset.
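The bidirectional multi-scale fusion described above can be sketched schematically. The following minimal NumPy example illustrates only the general top-down/bottom-up pathway idea (deep semantics pushed into shallow levels, refined spatial detail propagated back into deep levels); the function names, nearest-neighbor resampling, and plain element-wise addition are simplifying assumptions, not the paper's actual module:

```python
import numpy as np

def upsample2x(x):
    # Nearest-neighbor upsampling of a (C, H, W) feature map.
    return x.repeat(2, axis=-2).repeat(2, axis=-1)

def downsample2x(x):
    # Stride-2 subsampling of a (C, H, W) feature map.
    return x[..., ::2, ::2]

def bidirectional_fusion(feats):
    """feats: pyramid levels ordered shallow -> deep; each level halves H and W.
    The top-down pass transfers deep semantic information to shallow layers;
    the bottom-up pass then feeds the refined spatial detail back into the
    deeper levels, enhancing the whole pyramid hierarchy."""
    td = list(feats)
    for i in range(len(td) - 2, -1, -1):   # top-down path
        td[i] = td[i] + upsample2x(td[i + 1])
    bu = list(td)
    for i in range(1, len(bu)):            # bottom-up path
        bu[i] = bu[i] + downsample2x(bu[i - 1])
    return bu

# Toy three-level pyramid of constant feature maps.
pyramid = [np.ones((4, 16, 16)), np.ones((4, 8, 8)), np.ones((4, 4, 4))]
fused = bidirectional_fusion(pyramid)
print([f.shape for f in fused])  # per-level shapes are preserved
```

In a real network the additions would typically be replaced by learned fusion blocks (such as the group aggregation module) and the resampling by learned up/down projections; the sketch shows only how information flows in both directions along the pyramid.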