Artificial intelligence-based segmentation models can assist the early-stage detection of COVID-19 lung infections or lesions from medical images with greater efficiency than traditional techniques, yet challenges remain owing to factors such as heterogeneous infection characteristics, small infection regions, blurred boundaries, mixtures of varying infection regions, and low intensity contrast between lesions and normal tissues. This study aims to improve the automatic segmentation of COVID-19 lung infections by proposing a novel approach named the parallel pyramid dual-stream modeling network (PDSMNet), with two major contributions: (1) a refined framework design comprising (a) a transformer module, the parallel pyramid dual-stream module (PDSM), that effectively preserves channel, spatial, and other latent features; (b) a multi-scale pyramid parallel-pooling module (MPM) fused into the framework to extract features in parallel at different scales; (c) calibrated skip-connection architectures that optimize prediction and preserve features; (d) calibrated attention mechanisms that combine multiple sources of information, including parallel-, serial-, and cross-attention contexts; and (e) an overall design that reduces the parameter and computation burden in multi-modality scenarios; and (2) loss functions that separately account for the training losses of normal tissues, lesions, and boundaries to enhance network performance, where the calibrated loss design, which allows for a margin, further improves prediction capacity. We conducted experiments on three datasets of different modalities and compared the proposed PDSMNet with two benchmarks and eight state-of-the-art (SOTA) networks. The experiments showed consistent performance improvements across all datasets. PDSMNet attained maximum increases of up to 16.5%, 6.2%, and 15.5% in mean F1 score (mF1S), mean Dice similarity coefficient (mDSC), and mean intersection over union (mIoU) over the SOTA PDEAtt-UNet; up to 49.6%, 25.6%, and 38.0% in mF1S, mDSC, and mIoU over InfNet; up to 26.1%, 10.8%, and 21.8% over MiniSeg; up to 14.0%, 3.0%, and 13.4% over TransUNet; and up to 37.9%, 16.8%, and 31.2% over Attention-UNet, respectively. Sizeable performance gains were also observed over other SOTA models such as MT-UNet, UCTransNet, and UTNetV2. PDSMNet further yielded maximum increases of up to 30.6%, 14.8%, and 26.3% in mF1S, mDSC, and mIoU over the UNet benchmark, and up to 42.8%, 21.5%, and 33.3% over the UNet++ benchmark, respectively. PDSMNet also demonstrated reduced computational cost, requiring approximately 0.53 M parameters and 6.55 G floating-point operations (FLOPs) across the different datasets, and quantitative and qualitative ablation tests reinforced the effectiveness of the various components of the proposed framework.
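For illustration only, the following is a minimal PyTorch sketch of a multi-scale parallel-pooling design of the kind the MPM description suggests; the class name, the PSPNet-style pooling branches, and all parameter choices (pooling sizes, channel reduction, fusion convolution) are assumptions for exposition, not the paper's actual MPM implementation.

```python
# Hypothetical sketch of a multi-scale pyramid parallel-pooling module; names,
# pooling sizes, and the PSPNet-style design are assumptions, not PDSMNet's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleParallelPooling(nn.Module):
    """Pools the input feature map at several scales in parallel, projects each
    pooled map with a 1x1 convolution, upsamples back to the input resolution,
    and fuses the results with the original features."""

    def __init__(self, in_channels: int, out_channels: int, pool_sizes=(1, 2, 4, 8)):
        super().__init__()
        branch_channels = in_channels // len(pool_sizes)
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.AdaptiveAvgPool2d(size),                  # pool to size x size
                nn.Conv2d(in_channels, branch_channels, 1),  # channel reduction
                nn.BatchNorm2d(branch_channels),
                nn.ReLU(inplace=True),
            )
            for size in pool_sizes
        ])
        self.fuse = nn.Sequential(
            nn.Conv2d(in_channels + branch_channels * len(pool_sizes),
                      out_channels, 3, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, w = x.shape[2:]
        pooled = [
            F.interpolate(branch(x), size=(h, w), mode="bilinear", align_corners=False)
            for branch in self.branches
        ]
        return self.fuse(torch.cat([x] + pooled, dim=1))

# Example: a 64-channel feature map pooled at four scales in parallel.
# x = torch.randn(1, 64, 128, 128)
# y = MultiScaleParallelPooling(64, 64)(x)   # -> shape (1, 64, 128, 128)
```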