Parallel Encoder Research Articles

Lumbar vertebral body cancellous bone location and segmentation is crucial in an automated lumbar spine processing pipeline. Accurate and reliable analysis of lumbar spine image is expected to advantage practical medical diagnosis and population-based analysis of bone strength. However, the design of automated algorithms for lumbar spine processing is demanding due to significant anatomical variations and scarcity of publicly available data. In recent years, convolutional neural network (CNN) and vision transformers (Vits) have been the de facto standard in medical image segmentation. Although adept at capturing global features, the inherent bias of locality and weight sharing of CNN constrains its capacity to model long-range dependency. In contrast, Vits excel at long-range dependency modeling, but they may not generalize well with limited datasets due to the lack of inductive biases inherent to CNN. In this paper, we propose a deep learning-based two-stage coarse-to-fine solution to address the problem of automatic location and segmentation of lumbar vertebral body cancellous bone. Specifically, in the first stage, a Swin-transformer based model is applied to predict the heatmap of lumbar vertebral body centroids. Considering the characteristic anatomical structure of lumbar spine, we propose a novel loss function called LumAnatomy loss, which enforces the order and bend of the predicted vertebral body centroids. To inherit the excellence of CNN and Vits while preventing their respective limitations, in the second stage, we propose an encoder–decoder network to segment the identified lumbar vertebral body cancellous bone, which consists of two parallel encoders, i.e., a Swin-transformer encoder and a CNN encoder. To enhance the combination of CNNs and Vits, we propose a novel multi-scale attention feature fusion module (MSA-FFM), which address issues that arise when fusing features given at different encoders. To tackle the issue of lack of data, we raise the first large-scale lumbar vertebral body cancellous bone segmentation dataset called LumVBCanSeg containing a total of 185 CT scans annotated at voxel level by 3 physicians. Extensive experimental results on the LumVBCanSeg dataset demonstrate the proposed algorithm outperform other state-of-the-art medical image segmentation methods. The data is publicly available at: https://zenodo.org/record/8181250. The implementation of the proposed method is available at: https://github.com/sia405yd/LumVertCancNet.

Read full abstract

Deep learning-based networks have become increasingly popular in the field of medical image segmentation. The purpose of this research was to develop and optimize a new architecture for automatic segmentation of the prostate gland and normal organs in the pelvic, thoracic, and upper gastro-intestinal (GI) regions. We developed an architecture which combines a shifted-window (Swin) transformer with a convolutional U-Net. The network includes a parallel encoder, a cross-fusion block, and a CNN-based decoder to extract local and global information and merge related features on the same scale. A skip connection is applied between the cross-fusion block and decoder to integrate low-level semantic features. Attention gates (AGs) are integrated within the CNN to suppress features in image background regions. Our network is termed "SwinAttUNet." We optimized the architecture for automatic image segmentation. Training datasets consisted of planning-CT datasets from 300 prostate cancer patients from an institutional database and 100 CT datasets from a publicly available dataset (CT-ORG). Images were linearly interpolated and resampled to a spatial resolution of (1.0×1.0×1.5) mm3 . A volume patch (192×192×96) was used for training and inference, and the dataset was split into training (75%), validation (10%), and test (15%) cohorts. Data augmentation transforms were applied consisting of random flip, rotation, and intensity scaling. The loss function comprised Dice and cross-entropy equally weighted and summed. We evaluated Dice coefficients (DSC), 95th percentile Hausdorff Distances (HD95), and Average Surface Distances (ASD) between results of our network and ground truth data. SwinAttUNet, DSC values were 86.54±1.21, 94.15±1.17, and 87.15±1.68% and HD95 values were 5.06±1.42, 3.16±0.93, and 5.54±1.63mm for the prostate, bladder, and rectum, respectively. Respective ASD values were 1.45±0.57, 0.82±0.12, and 1.42±0.38mm. For the lung, liver, kidneys and pelvic bones, respective DSC values were: 97.90±0.80, 96.16±0.76, 93.74±2.25, and 89.31±3.87%. Respective HD95 values were: 5.13±4.11, 2.73±1.19, 2.29±1.47, and 5.31±1.25mm. Respective ASD values were: 1.88±1.45, 1.78±1.21, 0.71±0.43, and 1.21±1.11mm. Our network outperformed several existing deep learning approaches using only attention-based convolutional or Transformer-based feature strategies, as detailed in the results section. We have demonstrated that our new architecture combining Transformer- and convolution-based features is able to better learn the local and global context for automatic segmentation of multi-organ, CT-based anatomy.

Read full abstract

Parallel Encoder Research Articles

Related Topics

Articles published on Parallel Encoder

Multi-scale dual-channel feature embedding decoder for biomedical image segmentation

An efficient XOR-free implementation of polar encoder for reconfigurable hardware

EAAC-Net: An Efficient Adaptive Attention and Convolution Fusion Network for Skin Lesion Segmentation.

INTREPPPID-an orthologue-informed quintuplet network for cross-species prediction of protein-protein interaction.

SRTRP-Net: A multi-task learning network for segmentation and prediction of stereotactic radiosurgery treatment response in brain metastases

An LDPC-RS Concatenation and Decoding Scheme to Lower the Error Floor for FTN Signaling

Moiré pattern generation-based image steganography

LumVertCancNet: A novel 3D lumbar vertebral body cancellous bone location and segmentation method based on hybrid Swin-transformer

Aided Diagnosis Model Based on Deep Learning for Glioblastoma, Solitary Brain Metastases, and Primary Central Nervous System Lymphoma with Multi-Modal MRI.

A Building Extraction Method for High-Resolution Remote Sensing Images with Multiple Attentions and Parallel Encoders Combining Enhanced Spectral Information

Prototype Learning Guided Hybrid Network for Breast Tumor Segmentation in DCE-MRI.

An efficient reconfigurable code rate cooperative low-density parity check codes for gigabits wide code encoder/decoder operations

A RGB-Thermal based adaptive modality learning network for day–night wildfire identification

LMQFormer: A Laplace-Prior-Guided Mask Query Transformer for Lightweight Snow Removal

Parallel encoder–decoder framework for image captioning

A new architecture combining convolutional and transformer-based networks for automatic 3D multi-organ segmentation on CT images.

DepthFormer: Exploiting Long-range Correlation and Local Information for Accurate Monocular Depth Estimation

End-to-End Dual-Stream Transformer with a Parallel Encoder for Video Captioning

Multi parallel U-net encoder network for effective polyp image segmentation

Aggregating transformers and CNNs for salient object detection in optical remote sensing images

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Parallel Encoder Research Articles

Related Topics

Articles published on Parallel Encoder

Multi-scale dual-channel feature embedding decoder for biomedical image segmentation

An efficient XOR-free implementation of polar encoder for reconfigurable hardware

EAAC-Net: An Efficient Adaptive Attention and Convolution Fusion Network for Skin Lesion Segmentation.

INTREPPPID-an orthologue-informed quintuplet network for cross-species prediction of protein-protein interaction.

SRTRP-Net: A multi-task learning network for segmentation and prediction of stereotactic radiosurgery treatment response in brain metastases

An LDPC-RS Concatenation and Decoding Scheme to Lower the Error Floor for FTN Signaling

Moiré pattern generation-based image steganography

LumVertCancNet: A novel 3D lumbar vertebral body cancellous bone location and segmentation method based on hybrid Swin-transformer

Aided Diagnosis Model Based on Deep Learning for Glioblastoma, Solitary Brain Metastases, and Primary Central Nervous System Lymphoma with Multi-Modal MRI.

A Building Extraction Method for High-Resolution Remote Sensing Images with Multiple Attentions and Parallel Encoders Combining Enhanced Spectral Information

Prototype Learning Guided Hybrid Network for Breast Tumor Segmentation in DCE-MRI.

An efficient reconfigurable code rate cooperative low-density parity check codes for gigabits wide code encoder/decoder operations

A RGB-Thermal based adaptive modality learning network for day–night wildfire identification

LMQFormer: A Laplace-Prior-Guided Mask Query Transformer for Lightweight Snow Removal

Parallel encoder–decoder framework for image captioning

A new architecture combining convolutional and transformer-based networks for automatic 3D multi-organ segmentation on CT images.

DepthFormer: Exploiting Long-range Correlation and Local Information for Accurate Monocular Depth Estimation

End-to-End Dual-Stream Transformer with a Parallel Encoder for Video Captioning

Multi parallel U-net encoder network for effective polyp image segmentation

Aggregating transformers and CNNs for salient object detection in optical remote sensing images