Improved SAR Aircraft Detection Algorithm Based on Visual State Space Models
ABSTRACT
In recent years, the development of deep learning algorithms has significantly advanced the application of synthetic aperture radar (SAR) aircraft detection in remote sensing and military fields. However, existing methods face a dual dilemma: CNN‐based models suffer from insufficient detection accuracy because of their limited local receptive fields, whereas Transformer‐based models improve accuracy through attention mechanisms but incur significant computational overhead because of their quadratic complexity. This imbalance between accuracy and efficiency severely limits the development of SAR aircraft detection. To address this problem, this paper proposes a novel neural network based on state space models (SSM), termed the Mamba SAR detection network (MSAD). Specifically, we design a feature encoding module, MEBlock, that integrates CNN with SSM to enhance global feature modelling capabilities. The linear computational complexity of SSM also reduces computational overhead relative to Transformer architectures. Additionally, we propose a context‐aware feature fusion module (CAFF) that combines attention mechanisms to achieve adaptive fusion of multi‐scale features. Lastly, a lightweight parameter‐shared detection head (PSHead) is used to reduce redundant parameters through implicit feature interaction. Experiments on the SAR‐AirCraft‐v1.0 and SADD datasets show that MSAD achieves higher accuracy than existing algorithms, while its GFLOPs are 2.7 times lower than those of the Transformer‐based RT‐DETR. These results validate the core role of SSM as an accuracy‐efficiency balancer, reflecting MSAD's perceptual capability and performance in SAR aircraft detection in complex environments.
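The complexity argument in the abstract can be made concrete with a small sketch. The code below is not the MEBlock or any part of MSAD; it is a minimal diagonal state space recurrence with fixed, hypothetical parameters `a`, `b`, `c` (Mamba-style models make these input-dependent), shown next to a count of the pairwise scores that self-attention would compute for the same sequence length.

```python
from typing import List


def ssm_scan(x: List[float], a: List[float], b: List[float], c: List[float]) -> List[float]:
    """Run a diagonal discrete state space model over a 1-D sequence.

    Recurrence (per state channel i):  h_i[t] = a_i * h_i[t-1] + b_i * x[t]
    Output:                            y[t]   = sum_i c_i * h_i[t]
    The loop touches each input once, so cost is O(len(x) * len(a)).
    """
    h = [0.0] * len(a)  # hidden state, initialised to zero
    y = []
    for x_t in x:
        h = [a_i * h_i + b_i * x_t for a_i, h_i, b_i in zip(a, h, b)]
        y.append(sum(c_i * h_i for c_i, h_i in zip(c, h)))
    return y


def attention_pair_count(seq_len: int) -> int:
    """Number of query-key scores full self-attention computes: O(seq_len^2)."""
    return seq_len * seq_len


if __name__ == "__main__":
    # Impulse response of a single-state SSM with decay 0.5.
    print(ssm_scan([1.0, 0.0, 0.0], a=[0.5], b=[1.0], c=[1.0]))
    # Doubling the sequence length doubles SSM steps but quadruples attention pairs.
    for n in (1024, 2048):
        print(n, "SSM steps:", n, "attention pairs:", attention_pair_count(n))
```

The recurrence carries global context through the hidden state, which is why an SSM can model long-range dependencies without the all-pairs comparison that attention performs.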
- Conference Article
11
- 10.1109/iaeac50856.2021.9391019
- Mar 12, 2021
With the development of deep learning and synthetic aperture radar (SAR) technology, SAR image target detection based on convolutional neural networks (CNN) has achieved certain results. However, problems remain in SAR detection of near-shore ship targets in complex environments. To improve the detection rate for near-shore ship targets in SAR images of complex environments, this paper proposes a SAR image ship target detection algorithm. The algorithm first uses a convolutional neural network for coastal segmentation and then performs ship target detection on the SAR image using the segmentation results. The experimental results show that the algorithm detects near-shore ship targets efficiently in SAR images of complex environments.
- Research Article
1
- 10.1038/s41598-024-72523-8
- Sep 13, 2024
- Scientific Reports
To address the problem of dense crowd face detection in complex environments, this paper proposes a face detection model named Deep and Compact Face Detection (DCFD), which adopts an improved lightweight EfficientNetV2 network to replace the backbone network of RetinaFace. A large kernel attention mechanism is introduced to make face detection more accurate. In the backbone network, an improved efficient channel attention (ECA) mechanism is added to further improve the algorithm's performance. The feature fusion module is an improved neural architecture search feature pyramid network (NAS-FPN) that significantly improves face detection accuracy in different scenes. To balance positive and negative samples during training, we use the focal loss function to replace the traditional cross-entropy loss function. In different environments, the DCFD algorithm has shown efficient face detection performance. This algorithm provides not only a feasible and effective solution to the problem of face detection in dense crowds but also an important basis for improving the accuracy of face detection models in practical applications.
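The loss substitution described above, replacing cross-entropy with a loss that balances positive and negative samples, can be sketched under the assumption that it follows the standard binary focal loss formulation. The abstract does not give the exact form or hyperparameters, so `gamma=2.0` and `alpha=0.25` below are commonly used defaults chosen for illustration, not values taken from DCFD.

```python
import math


def binary_focal_loss(p: float, y: int, gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Focal loss for one binary prediction.

    p     : predicted probability of the positive class, in (0, 1)
    y     : ground-truth label, 1 (positive) or 0 (negative)
    gamma : focusing parameter; gamma = 0 recovers weighted cross-entropy
    alpha : weight of the positive class (negatives get 1 - alpha)
    """
    eps = 1e-12
    p_t = p if y == 1 else 1.0 - p            # probability assigned to the true class
    p_t = min(max(p_t, eps), 1.0 - eps)       # clamp to keep log() finite
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t)^gamma shrinks the loss of well-classified (easy) examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    easy = binary_focal_loss(0.9, 1)   # confident and correct
    hard = binary_focal_loss(0.1, 1)   # confident and wrong
    print(f"easy positive: {easy:.6f}, hard positive: {hard:.6f}")
```

With `gamma > 0`, the many easy background samples contribute little to the total loss, so training concentrates on the hard samples.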
- Research Article
8
- 10.1007/s40747-024-01580-3
- Aug 14, 2024
- Complex & Intelligent Systems
Prohibited item detection is crucial for the safety of public places. Deep learning, one of the mainstream methods in prohibited item detection tasks, has shown superior performance far beyond traditional prohibited item detection methods. However, most neural network architectures in deep learning still lack sufficient local feature representation ability for overlapping and small targets, and ignore the problem of semantic conflicts caused by direct feature fusion. In this paper, we propose MFPIDet, a novel prohibited item detection neural network architecture based on an improved YOLOV7, to achieve reliable prohibited item detection in complex environments. Specifically, a multi-scale attention module (MAM) backbone is proposed to filter the redundant information of target regions and to enhance the local feature representation ability for overlapping objects. Here, to reduce the redundant information of target regions, a squeeze-excitation (SE) block is used to filter the background. Then, to enhance the feature expression ability for overlapping objects, a multi-scale feature extraction module (MFEM) is designed for local feature representation. In addition, to obtain richer context information, we design an adaptive fusion feature pyramid network (AF-FPN) that combines the adaptive context information fusion module (ACIFM) with the feature fusion module (FFM) to improve the neck structure of YOLOV7. The proposed method is validated on the PIDray dataset, and the test results show that our method obtains the highest mAP (68.7%), a 3.5% improvement over YOLOV7. Our approach provides a new design pattern for prohibited item detection in complex environments and shows the development potential of deep learning in related fields.
- Research Article
6
- 10.1155/2022/6205108
- Jan 5, 2022
- Computational Intelligence and Neuroscience
Existing face detection methods are affected by the network model structure used. Most face recognition methods have a low recognition rate for face key point features because of their many parameters and heavy computation. To improve the recognition accuracy and detection speed for face key points, this paper proposes a real-time face key point detection algorithm based on an attention mechanism. Because face key point features are multiscale, a deep convolutional network model is adopted: an attention module is added to the VGG network structure, a feature enhancement module and a feature fusion module are combined to improve the shallow feature representation ability of VGG, and a cascade attention mechanism is used to improve the deep feature representation ability. Experiments show that the proposed algorithm not only realizes face key point recognition effectively but also has better recognition accuracy and detection speed than other similar methods. This method can provide a theoretical basis and technical support for face detection in complex environments.
- Research Article
10
- 10.1109/jstars.2022.3170361
- Jan 1, 2022
- IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Ship detection in complex environments is a challenging task because of strong background interference, and various deep-learning-based methods have been proposed for it. However, these methods perform poorly at detecting nearshore ships in medium-resolution synthetic aperture radar (SAR) images because of the loss of typical features and confusion with land scatterers. The availability of multitemporal SAR images gives the opportunity to separate nearshore ships from land scatterers by using their temporal characteristics. In this article, we propose a ship detection method based on SAR time series. First, we investigate the statistical stability of the SAR time series and propose a preclassification method to identify the potential changed pixel clusters. Then, we discriminate between ship and background pixel candidates in the preclassification by combining a rotating object detector and the transition detection algorithm and generate the corresponding frozen background reference (FBR) image. In addition, a dynamic framework for ship detection is proposed based on the FBR image and a two-stage outlier detection approach. The experiments show that the proposed method enables dynamic ship monitoring with high ship detection accuracy and a low false alarm rate for nearshore ship targets.
- Research Article
63
- 10.1109/tgrs.2021.3066432
- Jan 1, 2022
- IEEE Transactions on Geoscience and Remote Sensing
The complementarity of synthetic aperture radar (SAR) and optical images allows remote sensing observations to “see” unprecedented discoveries. Image matching plays a fundamental role in the fusion and application of SAR and optical images. However, both the geometric imaging pattern and the physical radiation mechanism of these two sensors are significantly different, so that the images show complex geometric distortion and nonlinear radiation differences. This phenomenon brings great challenges to image matching, which neither the handcrafted descriptors nor the deep learning-based methods have adequately addressed. In this article, a novel image-based matching method for SAR to optical images via an image-based convolutional network with spatial pyramid aggregated pooling (SPAP) and an attention mechanism is proposed, namely MAP-Net. The original image is embedded through the convolutional neural network to generate the feature map. Through the information extraction and abstraction of the original imagery, the embedded features containing the high-level semantic information are more robust to the geometric distortion and radiation variation among the different modal images, which is beneficial to the matching of cross-modal images. The adoption of the SPAP module makes the network more capable of integrating global and local contextual information. The attention block weights the dense features generated from the network to extract the key features that are invariant, distinguishable, repeatable, and suitable for the image matching task. In the experiments, five sets of multisource and multiresolution SAR and optical images with wide and varied ground coverage were used to evaluate the accuracy of MAP-Net, compared to both handcrafted and deep learning-based methods. The experimental results show that the MAP-Net method is superior to the current state-of-the-art image matching methods for SAR to optical images.
- Conference Article
1
- 10.1109/ursigass.2014.6929614
- Aug 1, 2014
Although synthetic aperture radar (SAR), one of the most important advanced techniques in international Earth observation, can capture rich land cover information, its practical application remains significantly limited. The reason is that SAR imaging processing, SAR image processing, and SAR applications are studied separately; integrated study of parameter selection for SAR imaging and image processing together with target identification for SAR applications is lacking. Focusing on these scientific problems, studies were conducted on a phase high-fidelity model for high-precision interferometric SAR, SAR non-stationary backscattering characteristics and cognition, SAR three-dimensional electromagnetic scattering modeling and its scattering mechanism, and SAR multi-parameter optimization for information extraction. Typical natural distributed targets and man-made targets, such as surface deformation, sea ice, building complexes, and collapsed buildings, were selected, and method and application studies oriented to SAR environmental parameter inversion were conducted. The main results and conclusions are as follows: (1) An overall theoretical framework for SAR information integrated processing was proposed; a simulation and ground validation system for assessing the effects of SAR information integrated processing was created; furthermore, a spaceborne SAR imaging algorithm with three-step focusing processing was proposed, which realized multi-modal integrated imaging processing and improved imaging precision. (2) For SAR information integrated processing of natural targets, focusing on phase high-fidelity imaging processing for interferometric SAR surface deformation monitoring, a novel spaceborne SAR high-fidelity simulation method based on stationary RCS and an improved Goldstein SAR interferogram filter based on empirical mode decomposition were proposed. Focusing on multi-parameter optimization processing for sea ice type classification, a multi-channel spaceborne SAR sparse imaging method based on compressive sensing, a Kalman filter for removal of scalloping and inter-scan banding in ScanSAR images, and an SVM sea ice classification method combined with sea ice concentration were proposed. (3) For SAR information integrated processing of man-made targets, focusing on non-stationary backscattering from building complexes, mitigation of azimuth ambiguities in spaceborne stripmap SAR images using selective restoration, SAR image despeckling by selective 3D filtering of multiple compressive reconstructed images, and man-made target detection in urban areas based on a new azimuth stationary extraction method were proposed. Focusing on three-dimensional modeling and fast extraction of collapsed buildings, an H-a-p method for collapsed building extraction was proposed.
- Research Article
6
- 10.3390/app11041431
- Feb 5, 2021
- Applied Sciences
This article proposes a method for the prediction of wide range two-dimensional refractivity for synthetic aperture radar (SAR) applications, using an inverse distance weighted (IDW) interpolation of high-altitude radio refractivity data from multiple meteorological observatories. The radio refractivity is extracted from an atmospheric data set of twenty meteorological observatories around the Korean Peninsula along a given altitude. Then, from the sparse refractive data, the two-dimensional regional radio refractivity of the entire Korean Peninsula is derived using the IDW interpolation, in consideration of the curvature of the Earth. The refractivities of the four seasons in 2019 are derived at the locations of seven meteorological observatories within the Korean Peninsula, using the refractivity data from the other nineteen observatories. The atmospheric refractivities on 15 February 2019 are then evaluated across the entire Korean Peninsula, using the atmospheric data collected from the twenty meteorological observatories. We found that the proposed IDW interpolation has the lowest average, the lowest average root-mean-square error (RMSE) of ∇M (gradient of M), and more continuous results than other methods. To compare the resulting IDW refractivity interpolation for airborne SAR applications, all the propagation path losses across Pohang and Heuksando are obtained using the standard atmospheric condition of ∇M = 118 and the observation-based interpolated atmospheric conditions on 15 February 2019. On the terrain surface ranging from 90 km to 190 km, the average path losses in the standard and derived conditions are 179.7 dB and 182.1 dB, respectively. Finally, based on the air-to-ground scenario in the SAR application, two-dimensional illuminated field intensities on the terrain surface are illustrated.
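The interpolation step described above can be illustrated with a short sketch of inverse distance weighting over station observations. Several choices here are assumptions rather than details from the article: great-circle (haversine) distance stands in for "consideration of the curvature of the Earth", the power parameter is set to 2, and the station coordinates and refractivity values in the usage example are made up.

```python
import math
from typing import List, Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed spherical Earth)


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def idw(stations: List[Tuple[float, float, float]], lat: float, lon: float,
        power: float = 2.0) -> float:
    """Inverse-distance-weighted estimate at (lat, lon).

    stations: list of (latitude_deg, longitude_deg, observed_value).
    Each station is weighted by 1 / distance**power.
    """
    num = 0.0
    den = 0.0
    for s_lat, s_lon, value in stations:
        d = haversine_km(s_lat, s_lon, lat, lon)
        if d < 1e-9:          # query coincides with a station: return its value
            return value
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


if __name__ == "__main__":
    # Hypothetical stations: (lat, lon, refractivity value at a fixed altitude).
    stations = [(36.0, 127.0, 300.0), (38.0, 127.0, 320.0), (37.0, 129.0, 310.0)]
    print(idw(stations, 37.0, 127.5))
```

Leaving one station out of `stations` and querying its location gives the kind of cross-validation the abstract describes, in which refractivity at one observatory is predicted from the other nineteen.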
- Conference Article
7
- 10.1109/plans.2004.1309001
- Apr 26, 2004
This paper describes the development process of a navigation system for synthetic aperture radar (SAR) applications, starting from the motivation for the sensor selection and ending with the first flight-test results. Sensor selection was one of the first steps in the design. The motivations for choosing the accelerometers, gyroscopes, Global Positioning System (GPS) board, magnetometer, and all other hardware components, such as the temperature sensor, are explained. A complete description of the hardware integration process is given, addressing details such as sensor output time synchronization, signal digitization, and calibration. The developed software is fully explained, starting from the digitized signals and ending with the flight path output by the iterative extended Kalman filter. The software includes several features that allow the development of an inertial navigation system (INS) focused on SAR applications. The first flight tests took place on board a Cessna Citation II jet aircraft. Results from these flight tests are presented as an evaluation of the performance of the INS.
- Research Article
17
- 10.12000/jr19104
- Feb 28, 2020
- DOAJ (Directory of Open Access Journals)
As an active microwave imaging sensor, Synthetic Aperture Radar (SAR) has become one of the main means of Earth observation owing to its unique technical advantages of all-day, all-weather operation and long working distance. As such, it plays a very important role in military and civilian fields. With the development of SAR remote-sensing technology, high-resolution, high-quality SAR images are produced continuously. However, manual detection and recognition of targets of interest is time-consuming and laborious, so the development of Automatic Target Recognition (ATR) technology is a matter of urgency. The typical SAR ATR system primarily comprises three stages: detection, discrimination, and classification/recognition. The detection and discrimination stages are the basis of the SAR ATR system, and research on SAR applications in the radar field has been conducted by researchers around the world. For single-channel SAR images, target detection and discrimination from simple scenes yield good results. However, in complex scenes, the clutter scattering intensity is relatively high, the clutter background is heterogeneous, the target scattering intensity is relatively weak, and the target distribution is dense. These factors continue to make accurate SAR target detection and discrimination difficult. In this paper, we summarize the recent research progress on single-channel SAR target detection and discrimination methods for complex scenes, analyze the characteristics and problems associated with various methods, and consider the future development trend of single-channel SAR target detection and discrimination methods for complex scenes.
- Conference Article
- 10.1117/12.799699
- Oct 2, 2008
- Proceedings of SPIE
This paper presents the progress toward marine applications of synthetic aperture radar (SAR) data and a review of the SAR satellite program in China. The technique development includes the development of algorithms and of methodology for extracting oceanographic parameters from SAR data. Marine applications range from environmental monitoring to oceanographic research. Two series of SAR satellites have been planned. The first SAR satellite of the Environmental and Disaster Monitoring Satellite series (HJ series) will be launched in 2009, while the first SAR satellite of the Ocean SAR Satellite series (HY-3 series) is in the planning phase. A description is given of the instruments and their platforms.
- Research Article
4
- 10.3390/rs16040664
- Feb 12, 2024
- Remote Sensing
In complex environments, the clutter statistical characteristics of synthetic aperture radar (SAR) are not constant, so the detection performance of a constant false alarm rate (CFAR) detector based on a clutter statistical model is hard to maintain. As a result, the overestimated threshold leads to a degradation in detection probability. To this end, this paper proposes a SAR ship detector different from CFAR detectors, which is independent of traditional clutter statistical distribution models and the probability of a false alarm (PFA). The proposed detector aims to raise the ship detection probability and alleviate interference from complex environments such as multi-target areas, shores, and breakwaters. It estimates clutter-truncated thresholds based on clutter intensity statistics (CIS). Firstly, three statistical parameters, including the mean, standard deviation, and maximum intensity of background clutter contaminated by outliers, are calculated; secondly, these parameters are utilized to estimate the clutter-truncated threshold using the novel CIS; and finally, the pixel under test is determined according to the CIS detection rule. Compared with CFAR-based algorithms, CIS obtains a high probability of detection in complex environments. In addition, the CIS detector is insensitive to the structure and size of the detection window. It is also computationally efficient due to its simple calculations. The superiority of the CIS detector is validated on SAR images of differing scenes from the DSSDD dataset.
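The threshold overestimation that motivates this work can be reproduced with the baseline it argues against. The sketch below is a one-dimensional cell-averaging CFAR, not the CIS detector, whose threshold formula the abstract does not give. It assumes exponentially distributed clutter power, for which the textbook scaling factor is alpha = N (PFA^(-1/N) - 1) with N training cells; the window sizes and power values in the usage example are hypothetical.

```python
from typing import List


def ca_cfar(power: List[float], num_train: int, num_guard: int, pfa: float) -> List[bool]:
    """Cell-averaging CFAR over a 1-D power profile.

    num_train : training cells on EACH side of the cell under test
    num_guard : guard cells on each side, excluded from the clutter estimate
    pfa       : design probability of false alarm
    Cells too close to the edges for a full window are left undetected.
    """
    n = 2 * num_train
    alpha = n * (pfa ** (-1.0 / n) - 1.0)   # threshold scaling factor
    half = num_train + num_guard
    detections = [False] * len(power)
    for i in range(half, len(power) - half):
        lead = power[i - half: i - num_guard]           # training cells before the test cell
        lag = power[i + num_guard + 1: i + half + 1]    # training cells after the test cell
        clutter = (sum(lead) + sum(lag)) / n            # mean clutter power estimate
        detections[i] = power[i] > alpha * clutter
    return detections


if __name__ == "__main__":
    # One isolated target on flat clutter: detected.
    single = [1.0] * 21
    single[10] = 50.0
    print([i for i, d in enumerate(ca_cfar(single, 4, 1, 1e-3)) if d])

    # Two nearby targets: each falls in the other's training window,
    # inflating the clutter estimate so that both are missed.
    pair = list(single)
    pair[13] = 50.0
    print([i for i, d in enumerate(ca_cfar(pair, 4, 1, 1e-3)) if d])
```

The second case shows the multi-target masking effect: outliers in the training window raise the estimated clutter level, and with it the threshold, which is the failure mode that clutter-truncated approaches aim to avoid.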
- Research Article
1
- 10.1109/jstars.2026.3671314
- Jan 1, 2026
- IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Ship detection is an important research direction in the intelligent interpretation of remote sensing imagery, with significant applications in marine monitoring, maritime traffic management, and search and rescue. Synthetic aperture radar (SAR) imagery plays a crucial role in maritime scenarios due to its all-weather, day-and-night imaging capabilities, making it a vital data source for ship detection. However, single-modality data are often constrained by imaging conditions and insufficient for detection in complex environments. Optical and SAR images each offer unique advantages, and their complementary nature motivates cross-modal fusion to enhance detection performance. To alleviate the reliance on limited annotated SAR data, this paper proposes an edge-constrained CycleGAN to generate structurally consistent SAR-style pseudo-images from unpaired optical images, thereby enhancing cross-modal feature alignment and increasing data diversity. To further address the complexity of learning unified representations across modalities within the teacher-student framework, a dual-teacher student architecture based on the probabilistic ensembling fusion strategy is designed to extract domain-invariant features from both optical and SAR pseudo-samples and to optimize the student network through collaborative distillation. Experimental results demonstrate that the proposed method achieves better overall detection performance compared to representative semi-supervised methods in multi-source remote sensing ship detection, validating its effectiveness for SAR ship detection under complex remote sensing conditions.
- Research Article
- 10.1142/s2301385027500749
- Feb 25, 2026
- Unmanned Systems
Ship detection in Synthetic Aperture Radar (SAR) imagery remains challenging due to complex backgrounds, scale variations, and limited semantic discrimination in conventional detectors. To address these critical challenges, we propose SAR-SwinX (SAR Swin Transformer-enhanced YOLOX): a novel hybrid lightweight one-stage detection model composed of anchor-free Exceeding You Only Look Once (YOLOX) as a baseline and enhanced with Swin transformer modules based on cross-stage partial connections (CSP) to improve contextual representation. This hybrid design combines the local feature extraction strengths of Convolutional Neural Networks (CNNs) with the global semantic modeling of visual transformers, enabling effective multiscale ship detection in cluttered maritime scenes. Extensive experiments conducted on two public SAR datasets, including SSDD and HRSID, consistently validate the superiority of SAR-SwinX over the baseline YOLOX-s and existing state-of-the-art methods. A key result of our approach is that SAR-SwinX improves mAP@50:95 by 2.03% and 3.46%, enhances recall by 0.73% and 0.52%, and boosts the F1-score by 0.56% and 0.16% for SSDD and HRSID, respectively. These results highlight SAR-SwinX as an efficient and robust solution for SAR ship detection in complex environments, with favorable computational efficiency.
- Book Chapter
1
- 10.1007/978-3-642-39109-5_8
- Jan 1, 2013
During detection in complex biological systems, molecular beacons, like other fluorescence probes, suffer severely from background signal interference. Recent studies indicate that the excimer molecular beacon (EMB) can address this problem. EMB is a dual-pyrene-labeled hairpin DNA structure with a large Stokes shift and long fluorescence lifetime, which affords an effective strategy for detection in complex biological environments. In this chapter, the recent development of EMB research is presented. Firstly, the general design of EMB, as well as its structure and working mechanism, is introduced. Furthermore, the synthesis and properties of EMB are described, which explain its capability for detection in complex environments and its high sensitivity and selectivity. Finally, examples of different applications are discussed, including the detection of nucleic acids and other molecules.