Channel attention has been demonstrated to improve the performance of convolutional neural networks. Most existing channel attention methods reduce the channel dimension to lower computational complexity. However, dimension reduction causes information loss and thus degrades performance. To alleviate this trade-off between complexity and performance, we propose two novel channel attention methods: the Grouping-Shuffle-Aggregation Channel Attention (GSACA) method and the Mixed Encoding Channel Attention (MECA) method. Our GSACA method partitions the channel variables into several groups and applies an independent matrix multiplication, without dimension reduction, to each group. It enables interaction between all groups using a "channel shuffle" operator, then performs another independent matrix multiplication on each group and aggregates all channel correlations. Our MECA method encodes channel information through a dual-path architecture to benefit from both topologies: one path uses a multilayer perceptron with dimension reduction to encode channel information, while the other encodes channel information without dimension reduction. Furthermore, a novel pooling operator named hierarchical pooling is presented and applied to both our GSACA and MECA methods. Experimental results showed that our GSACA method almost consistently outperformed most existing channel attention methods and that our MECA method consistently outperformed the existing channel attention methods.
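To make the grouped computation concrete, the following is a minimal PyTorch sketch of a GSACA-style block under stated assumptions: global average pooling produces the per-channel descriptors, each group uses a square (non-reducing) weight matrix, and the class name, group count, and initialization are illustrative rather than the authors' implementation.

```python
import torch
import torch.nn as nn


class GroupedShuffleChannelAttention(nn.Module):
    """Hypothetical sketch of a GSACA-style block: per-group matrix
    multiplications without dimension reduction, a channel shuffle to
    exchange information between groups, a second per-group multiplication,
    and sigmoid gating of the input feature map."""

    def __init__(self, channels: int, groups: int = 4):
        super().__init__()
        assert channels % groups == 0, "channels must be divisible by groups"
        self.groups = groups
        self.group_size = channels // groups
        # One square weight matrix per group; the channel dimension is kept intact.
        self.fc1 = nn.Parameter(torch.randn(groups, self.group_size, self.group_size) * 0.01)
        self.fc2 = nn.Parameter(torch.randn(groups, self.group_size, self.group_size) * 0.01)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # Global average pooling: one descriptor per channel (assumed here).
        s = x.mean(dim=(2, 3))                               # (b, c)
        s = s.view(b, self.groups, self.group_size)          # (b, g, c/g)
        # First independent matrix multiplication within each group.
        s = self.act(torch.einsum('bgi,gij->bgj', s, self.fc1))
        # Channel shuffle: transpose the group and sub-channel axes so each
        # new group mixes variables coming from every original group.
        s = s.transpose(1, 2).reshape(b, self.groups, self.group_size)
        # Second independent matrix multiplication, then aggregate all groups.
        s = torch.einsum('bgi,gij->bgj', s, self.fc2).reshape(b, c)
        w_att = torch.sigmoid(s).view(b, c, 1, 1)
        return x * w_att


# Usage example on a dummy feature map.
if __name__ == "__main__":
    attn = GroupedShuffleChannelAttention(channels=64, groups=4)
    feat = torch.randn(2, 64, 32, 32)
    print(attn(feat).shape)  # torch.Size([2, 64, 32, 32])
```

The shuffle step is the same reshape-transpose-reshape pattern used in ShuffleNet-style channel shuffling; here it serves only to let the second set of group-wise multiplications see information from all groups before aggregation.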