Irrelevant Regions Research Articles

Establishing reliable feature matches between a pair of images across varied scenarios is a long-standing open problem in photogrammetry. Attention-based, detector-free matching with a coarse-to-fine architecture has become a typical pipeline for building matches, but a cross-attention module with a global receptive field may compromise local structural consistency by introducing irrelevant regions (outliers). A motion field can maintain local structural consistency under the assumption that matches for adjacent features should be spatially proximate. However, a motion field can only estimate local displacements between consecutive images, and without spatial correlation priors it struggles to estimate long-range displacements in scenarios with large scale variation. Moreover, large scale variations may also disrupt geometric consistency when the mutual nearest neighbor criterion is applied in patch-level matching, making it difficult to recover accurate matches. In this paper, we propose a unified feature-motion consistency framework for robust image matching (MOMA) that maintains structural consistency at both global and local granularity in scale-discrepancy scenarios. MOMA devises a motion consistency-guided dependency range strategy (MDR) for cross-attention, which aggregates highly relevant regions within the motion consensus-restricted neighborhood to favor truly matchable regions. Meanwhile, a unified framework with a hierarchical attention structure couples the local motion field with global feature correspondence: the motion field provides local consistency constraints for feature aggregation, while feature correspondence provides a spatial context prior that improves motion field estimation. To alleviate the geometric inconsistency caused by the hard nearest neighbor criterion, we propose an adaptive (soft) neighbor search strategy to address scale discrepancy.
Extensive experiments on three datasets demonstrate that our method outperforms strong baselines, with AUC improvements of 4.73/4.02/3.34 in the two-view pose estimation task at thresholds of 5°/10°/20° on the MegaDepth test set, and a 5.94% increase in accuracy at a threshold of 1 px in the homography task on the HPatches dataset. Furthermore, in downstream tasks such as 3D mapping, the 3D models reconstructed with our method on the self-collected SYSU UAV datasets show significant improvement in structural completeness and detail richness, demonstrating its applicability to a wide range of downstream tasks. The code is publicly available at https://github.com/BunnyanChou/MOMA.
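
For readers unfamiliar with the hard mutual nearest neighbor criterion mentioned in the abstract, the following is a minimal sketch of that standard criterion only. It is not MOMA's adaptive (soft) neighbor search, whose details the abstract does not give. The function name `mutual_nearest_neighbors` and the toy similarity matrix are illustrative assumptions.

```python
def mutual_nearest_neighbors(sim):
    """Return (i, j) pairs that are each other's best match.

    `sim` is a list of rows; sim[i][j] is the similarity between
    patch i in image A and patch j in image B (hypothetical input).
    """
    if not sim or not sim[0]:
        return []
    n_rows, n_cols = len(sim), len(sim[0])
    # Best column for every row, and best row for every column.
    best_col = [max(range(n_cols), key=lambda j: sim[i][j]) for i in range(n_rows)]
    best_row = [max(range(n_rows), key=lambda i: sim[i][j]) for j in range(n_cols)]
    # Keep a pair only when the choice is mutual.
    return [(i, j) for i, j in enumerate(best_col) if best_row[j] == i]


# Toy example: rows 0 and 1 both prefer column 0, so row 1 is left unmatched.
similarity = [
    [0.9, 0.1, 0.2],
    [0.8, 0.3, 0.1],
    [0.1, 0.2, 0.7],
]
matches = mutual_nearest_neighbors(similarity)
print(matches)
```

In the toy example, patch 1 of image A loses its only plausible match because patch 0 claims the same target. Under scale discrepancy, where several patches in one image can fall within a single patch of the other, this one-to-one rule discards such candidates, which is the kind of inconsistency the abstract attributes to the hard criterion.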

Thyroid ultrasound video is of significant value for diagnosing thyroid diseases, but ultrasound imaging is often affected by speckle noise, which degrades video quality. Numerous video denoising methods have been proposed to remove noise while preserving texture details. However, existing methods still suffer from the following problems: (1) relevant temporal features in low-contrast ultrasound video cannot be accurately aligned and effectively aggregated by simple optical flow or motion estimation, resulting in artifacts and motion blur; (2) a fixed receptive field in spatial feature integration lacks the flexibility to aggregate features over the global region of interest and is susceptible to interference from irrelevant noisy regions. In this work, we propose a deformable spatial-temporal attention denoising network to remove speckle noise from thyroid ultrasound video. The entire network follows a bidirectional feature propagation mechanism to efficiently exploit the spatial-temporal information of the whole video sequence. Within this process, two modules address the above problems: (1) a deformable temporal attention module (DTAM), applied after optical flow pre-alignment, further captures and aggregates relevant temporal features according to learned offsets between frames, so that inter-frame information can be better exploited even when flow estimation is imprecise because of the low contrast of ultrasound video; (2) a deformable spatial attention module (DSAM) flexibly integrates spatial features over the global region of interest through learned intra-frame offsets, so that irrelevant noisy information can be ignored and essential information can be precisely exploited. Finally, all of these refined features are rectified and merged through residual convolution blocks to recover the clean video frames.
Experimental results on our thyroid ultrasound video (US-V) dataset and the DDTI dataset demonstrate that the proposed method outperforms other state-of-the-art methods by 1.2 to 1.3 dB in PSNR and yields clearer texture detail. The proposed model can also help thyroid nodule segmentation methods achieve more accurate segmentation, which provides an important basis for thyroid diagnosis. In the future, the proposed model can be improved and extended to other medical image sequence datasets, including CT and MRI slice denoising. The code and datasets are provided at https://github.com/Meta-MJ/DSTAN.
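
As context for the PSNR figures quoted above, here is a minimal sketch of the standard peak signal-to-noise ratio, PSNR = 10 log10(MAX² / MSE), for a single frame. It assumes 8-bit intensities (MAX = 255) and frames stored as nested lists; the function name `psnr` is illustrative, and this is not the DSTAN authors' evaluation code.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames.

    Frames are lists of rows of pixel intensities (an assumed layout).
    """
    flat_ref = [p for row in reference for p in row]
    flat_est = [p for row in estimate for p in row]
    if not flat_ref or len(flat_ref) != len(flat_est):
        raise ValueError("frames must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(flat_ref, flat_est)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)


clean = [[100, 100], [100, 100]]
denoised = [[101, 99], [101, 99]]  # every pixel off by one, so MSE = 1
print(round(psnr(clean, denoised), 2))
```

Because the scale is logarithmic, a gain of about 1.2 dB corresponds to roughly a 24% reduction in mean squared error (10^(-0.12) ≈ 0.76), under the assumption that the same reference frames are used for both methods.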

Articles published on Irrelevant Regions

CrackNet: A Hybrid Model for Crack Segmentation with Dynamic Loss Function

Bilevel progressive homography estimation via correlative region-focused transformer

A Patch-Level Region-Aware Module with a Multi-Label Framework for Remote Sensing Image Captioning

Polyp-Mamba: A Hybrid Multi-Frequency Perception Gated Selection Network for polyp segmentation

A unified feature-motion consistency framework for robust image matching

CRENet: Crowd region enhancement network for multi-person 3D pose estimation

Application of U-Net Network Utilizing Multiattention Gate for MRI Segmentation of Brain Tumors.

Decoupled Cross-Modal Transformer for Referring Video Object Segmentation.

MRSAPose: Multi-level routing sparse attention for multi-person pose estimation

Guiding attention in flow-based conceptual models through consistent flow and pattern visibility

Deep attention network for identifying ligand-protein binding sites

Identification of Pepper Leaf Diseases Based on TPSAO-AMWNet.

DSTAN: A Deformable Spatial-temporal Attention Network with Bidirectional Sequence Feature Refinement for Speckle Noise Removal in Thyroid Ultrasound Video.

Plane-wave medical image reconstruction based on dynamic Criss-Cross attention and multi-scale convolution.

Segmentation of ethnic clothing patterns with fusion of multiple attention mechanisms

LSKANet: Long Strip Kernel Attention Network for Robotic Surgical Scene Segmentation.

Multi-scale features and attention guided for brain tumor segmentation

TCNet: Continuous Sign Language Recognition from Trajectories and Correlated Regions

Determination of quality classes for material extrusion additive manufacturing using image processing

Fast and high-precision compressible flowfield inference method of transonic airfoils based on attention UNet
