Residual Modulation Research Articles

Vision Transformer (ViT) are widely used in the field of traffic video inspection because of their wide sensing field and strong global context-capturing capability. However, the ViT model has inadequate local detail feature extraction and high computational complexity. Convoluted neural networks(CNN) are commonly used as a means of video image detection. Although it is more flexible and less computationally complex in the extraction of local details, the fixed-size kernel limits it. It is difficult to effectively utilize the full local context information, resulting in the difficulty of effectively extracting the tiny targets with uneven scale distribution in the region. It is prone to leakage and misdetection of small targets. In view of this, this study proposes a new fully localized selective large kernel network (FL-SLKNet). The network effectively captures specific targets and background information through a fully localized feature extraction method, thus making small targets in the region more distinctive features. On this basis, the Adaptive Expansion Residual Module (AERM) is proposed, which solves the problem of image resolution degradation due to sensory field expansion by combining point-by-point convolution and expansion convolution and utilizing residual connections to supplement the lost information. In order to effectively identify tiny targets in the local area, this study proposes the Multi-Scale Frequency Domain Encoder (MSFDEncoder); this encoder utilizes a high-frequency branch to capture local details and a low-frequency branch to focus on the global structure, effectively capturing granular features of targets at various scales. Experimental results on the VisDrone2019-DET and BDD-100K datasets indicate that FL-SLKNet outperforms other advanced models in traffic video detection, with an increase of 3.3% and 3.4% in mAP0.5 compared to the RT-DETR model. Meanwhile, the detection performance of FL-SLKNet is excellent compared with the YOLO series on the SZ Actual Traffic Video dataset. In addition, in the rainy and hazy simulation scenarios, FL-SLKNet’s mAP0.5 improves by 5.5% and 2.6% compared to the RT-DETR model, and the extreme weather robustness performance is better.

Read full abstract

In recent years, accurate field monitoring has been a research hotspot in the domains of aerial remote sensing and satellite remote sensing. In view of this, this study proposes an innovative cross-platform super-resolution reconstruction method for remote sensing images for the first time, aiming to make medium-resolution satellites capable of field-level detection through a super-resolution reconstruction technique. The progressive growing generative adversarial network (PGGAN) model, which has excellent high-resolution generation and style transfer capabilities, is combined with a deep residual network, forming the Res-PGGAN model for cross-platform super-resolution reconstruction. The Res-PGGAN architecture is similar to that of the PGGAN, but includes a deep residual module. The proposed Res-PGGAN model has two main benefits. First, the residual module facilitates the training of deep networks, as well as the extraction of deep features. Second, the PGGAN structure performs well in cross-platform sensor style transfer, allowing for cross-platform high-magnification super-resolution tasks to be performed well. A large pre-training dataset and real data are used to train the Res-PGGAN to improve the resolution of Sentinel-2’s 10 m resolution satellite images to 0.625 m. Three evaluation metrics, including the structural similarity index metric (SSIM), the peak signal-to-noise ratio (PSNR), and the universal quality index (UQI), are used to evaluate the high-magnification images obtained by the proposed method. The images generated by the proposed method are also compared with those obtained by the traditional bicubic method and two deep learning super-resolution reconstruction methods: the enhanced super-resolution generative adversarial network (ESRGAN) and the PGGAN. The results indicate that the proposed method outperforms all the comparison methods and demonstrates an acceptable performance regarding all three metrics (SSIM/PSNR/UQI: 0.9726/44.7971/0.0417), proving the feasibility of cross-platform super-resolution image recovery.

Read full abstract

Residual Modulation Research Articles

Related Topics

Articles published on Residual Modulation

Lightweight monocular depth estimation using a fusion-improved transformer

ResMT: A hybrid CNN-transformer framework for glioma grading with 3D MRI

RAM-YOLOv8: A multi-fault detection method for photovoltaic module

Dunhuang mural inpainting based on reference guidance and multi‐scale fusion

DSC-Net: Enhancing Blind Road Semantic Segmentation with Visual Sensor Using a Dual-Branch Swin-CNN Architecture.

A fully locally selective large kernel network for traffic video detection

TAFENet: A Two-Stage Attention-Based Feature-Enhancement Network for Strip Steel Surface Defect Detection

3D Object Detection via Residual SqueezeDet

CP-RDM: a new object detection algorithm for casting and pouring robots

Luminance decomposition and reconstruction for high dynamic range Video Quality Assessment

A Multi-Scale Liver Tumor Segmentation Method Based on Residual and Hybrid Attention Enhanced Network with Contextual Integration.

CENN: Capsule-enhanced neural network with innovative metrics for robust speech emotion recognition

Brain tumor image segmentation method using hybrid attention module and improved mask RCNN

An Effective Res-Progressive Growing Generative Adversarial Network-Based Cross-Platform Super-Resolution Reconstruction Method for Drone and Satellite Images

A Self-Supervised Network-Based Smoke Removal and Depth Estimation for Monocular Endoscopic Videos.

A Multi-Branch Feature Extraction Residual Network for Lightweight Image Super-Resolution

Apple recognition in complex environments based on FC-DETR

MRCFN: A multi-sensor residual convolutional fusion network for intelligent fault diagnosis of bearings in noisy and small sample scenarios

FCT-Net: A dual-encoding-path network fusing atrous spatial pyramid pooling and transformer for pavement crack detection

Automatic Segmentation Model for Parkinson’s Images Based on SA-U2-Net

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Residual Modulation Research Articles

Related Topics

Articles published on Residual Modulation

Lightweight monocular depth estimation using a fusion-improved transformer

ResMT: A hybrid CNN-transformer framework for glioma grading with 3D MRI

RAM-YOLOv8: A multi-fault detection method for photovoltaic module

Dunhuang mural inpainting based on reference guidance and multi‐scale fusion

DSC-Net: Enhancing Blind Road Semantic Segmentation with Visual Sensor Using a Dual-Branch Swin-CNN Architecture.

A fully locally selective large kernel network for traffic video detection

TAFENet: A Two-Stage Attention-Based Feature-Enhancement Network for Strip Steel Surface Defect Detection

3D Object Detection via Residual SqueezeDet

CP-RDM: a new object detection algorithm for casting and pouring robots

Luminance decomposition and reconstruction for high dynamic range Video Quality Assessment

A Multi-Scale Liver Tumor Segmentation Method Based on Residual and Hybrid Attention Enhanced Network with Contextual Integration.

CENN: Capsule-enhanced neural network with innovative metrics for robust speech emotion recognition

Brain tumor image segmentation method using hybrid attention module and improved mask RCNN

An Effective Res-Progressive Growing Generative Adversarial Network-Based Cross-Platform Super-Resolution Reconstruction Method for Drone and Satellite Images

A Self-Supervised Network-Based Smoke Removal and Depth Estimation for Monocular Endoscopic Videos.

A Multi-Branch Feature Extraction Residual Network for Lightweight Image Super-Resolution

Apple recognition in complex environments based on FC-DETR

MRCFN: A multi-sensor residual convolutional fusion network for intelligent fault diagnosis of bearings in noisy and small sample scenarios

FCT-Net: A dual-encoding-path network fusing atrous spatial pyramid pooling and transformer for pavement crack detection

Automatic Segmentation Model for Parkinson’s Images Based on SA-U2-Net