Multi-Feature Fusion Image Dehazing Based on SwinTransformerV2
- Research Article
- 10.1002/mp.16946
- Mar 4, 2024
- Medical physics
Breast tumors are a potentially fatal threat to women's health. Ultrasound (US) is a common and economical method for diagnosing breast cancer. Breast Imaging Reporting and Data System (BI-RADS) category 4 has the highest false-positive rate, about 30%, among the five categories. The classification task within BI-RADS category 4 is challenging and has not been fully studied. This work aimed to use convolutional neural networks (CNNs) to classify breast tumors from category 4 B-mode images, overcoming the dependence on operator experience and the influence of artifacts. It also intends to take full advantage of the morphological and textural features in breast tumor US images to improve classification accuracy. First, original US images obtained directly from the hospital were cropped and resized. Of 1385 B-mode US BI-RADS category 4 images, biopsy confirmed 503 benign and 882 malignant tumors. Then, a K-means clustering algorithm and sliding-window entropy were applied to the US images. Because the original B-mode images, K-means clustering images, and entropy images capture different characteristics of malignant and benign tumors, they were fused channel-wise into a three-channel multi-feature fusion image dataset. The training, validation, and test sets contained 969, 277, and 139 images, respectively. With transfer learning, 11 CNN models, including DenseNet and ResNet, were investigated. Finally, by comparing the accuracy, precision, recall, F1-score, and area under the curve (AUC) of the results, the best-performing models were selected. Normality of the data was assessed by the Shapiro-Wilk test. The DeLong test and independent t-tests were used to evaluate the significance of differences in AUC and the other metrics. False discovery rate correction was used to confirm the advantage of the CNN with the highest evaluation metrics. In addition, anti-log compression was studied but showed no improvement in CNN classification results. With multi-feature fusion images, DenseNet121 achieved the highest accuracy among the CNNs, 80.22±1.45%, with a precision of 77.97±2.89% and an AUC of 0.82±0.01. Multi-feature fusion improved the accuracy of DenseNet121 by 1.87% over classification of the original B-mode images (p<0.05). CNNs with multi-feature fusion show good potential for reducing the false-positive rate within category 4. This work illustrates that CNNs and fusion images have the potential to reduce the false-positive rate for breast tumors within US BI-RADS category 4 and to make the diagnosis of category 4 breast tumors more accurate and precise.
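As an illustration of the fusion step described above, the following sketch (not the authors' code; the cluster count and entropy window size are assumptions) builds the three-channel fusion image from a grayscale B-mode frame with scikit-learn and scikit-image:

```python
import numpy as np
from skimage.filters.rank import entropy
from sklearn.cluster import KMeans

def build_fusion_image(us: np.ndarray, n_clusters: int = 3,
                       win: int = 9) -> np.ndarray:
    assert us.ndim == 2 and us.dtype == np.uint8
    # Channel 1: the original B-mode image.
    ch1 = us
    # Channel 2: K-means clustering of pixel intensities, rescaled to 0-255.
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=0).fit_predict(us.reshape(-1, 1))
    ch2 = (labels.reshape(us.shape) * (255 // (n_clusters - 1))).astype(np.uint8)
    # Channel 3: local Shannon entropy over a sliding window, rescaled.
    ent = entropy(us, np.ones((win, win), dtype=np.uint8))
    ch3 = (255 * ent / ent.max()).astype(np.uint8)
    return np.dstack([ch1, ch2, ch3])  # (H, W, 3) input for the CNNs
```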
- Research Article
- 10.1061/jhtrcq.0000545
- Feb 15, 2017
- Journal of Highway and Transportation Research and Development (English Edition)
For multi-feature fusion in the automatic recognition of pavement distress images, we propose a multi-feature fusion method based on manifold learning. In this method, the intrinsic features o...
- Research Article
- 10.1002/mmce.23524
- Nov 2, 2022
- International Journal of RF and Microwave Computer-Aided Engineering
Because millimeter waves (MMW) have a strong ability to penetrate clothing, MMW holographic imaging technology can perform non-contact inspection of the surface of the human body. It is therefore of great significance to study security inspection equipment and target recognition technology based on millimeter-wave imaging. In this paper, an active and passive hybrid millimeter-wave imaging target recognition method based on multi-feature fusion is proposed, which improves the ability of the imaging system to identify different dangerous targets. First, active and passive millimeter-wave imaging techniques are discussed, the reconstruction algorithm for active millimeter-wave holographic imaging is derived in detail, and measured optical photos and data for active and passive imaging are obtained. Second, image preprocessing techniques for active and passive millimeter-wave imaging are studied, which can effectively highlight the target, eliminate background interference, and clearly delineate the target contour. Then, methods of image feature extraction, data feature extraction, and multi-feature fusion are studied. On this basis, a multi-feature fusion method based on weighted series fusion is proposed to obtain a fused feature vector for active and passive MMW imaging. Finally, the paper proposes a target recognition method for millimeter-wave imaging and concludes that the fused feature vector outperforms the original feature vectors, which provides an approach for fusing active and passive millimeter-wave imaging at the feature level. It also provides a theoretical basis for the application of security equipment.
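A minimal sketch of weighted series (serial) fusion as described: the active and passive feature vectors are scale-normalized, weighted, and concatenated. The weights and the normalization step are illustrative assumptions, not the paper's values.

```python
import numpy as np

def weighted_serial_fusion(f_active: np.ndarray, f_passive: np.ndarray,
                           w_active: float = 0.6,
                           w_passive: float = 0.4) -> np.ndarray:
    # Normalize each modality so neither dominates by scale alone.
    f_a = f_active / (np.linalg.norm(f_active) + 1e-12)
    f_p = f_passive / (np.linalg.norm(f_passive) + 1e-12)
    # "Series" fusion: weighted concatenation into one feature vector.
    return np.concatenate([w_active * f_a, w_passive * f_p])
```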
- Research Article
- 10.1002/jmri.29399
- Apr 24, 2024
- Journal of magnetic resonance imaging : JMRI
MRI-based placental analyses have been used to improve fetal growth restriction (FGR) assessment by complementing ultrasound-based measurements. However, they are still limited by time-consuming manual annotation of MRI data and the lack of mother-based information. This study aimed to develop and validate a hybrid model for accurate FGR assessment using automatic placental radiomics on T2-weighted imaging (T2WI) and multi-feature fusion. In this retrospective study, 274 pregnant women (29.5 ± 4.0 years) from two centers were included and randomly divided into training (N = 119), internal test (N = 40), time-independent validation (N = 43), and external validation (N = 72) sets. Images were acquired at 1.5 T with a T2WI half-Fourier acquisition single-shot turbo spin-echo pulse sequence. First, the placentas on T2WI were manually annotated, and a deep learning model was developed to segment them automatically. Then, radiomic features were extracted from the placentas and selected by three-step feature selection. In addition, fetus-based measurement features and mother-based clinical features were obtained from ultrasound examinations and medical records, respectively. Finally, a hybrid model based on random forest was constructed by fusing these features and compared with models based on other machine learning methods and different feature combinations. The performance of placenta segmentation and FGR assessment was evaluated by the Dice similarity coefficient (DSC) and the area under the receiver operating characteristic curve (AUROC), respectively. A P-value <0.05 was considered statistically significant. The placentas were automatically segmented with an average DSC of 90.0%. The hybrid model achieved AUROCs of 0.923, 0.931, and 0.880 on the internal test, time-independent validation, and external validation sets, respectively. The mother-based clinical features yielded significant performance improvements for FGR assessment. The proposed hybrid model may be able to assess FGR with high accuracy. Furthermore, complementary placental, fetal, and maternal information could also lead to better FGR assessment performance. Technical efficacy: Stage 2.
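The feature-level fusion plus random-forest step can be sketched on synthetic stand-in data as follows (the feature dimensions and hyperparameters are assumptions, not the study's values):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
radiomics = rng.normal(size=(274, 30))  # placental radiomic features (synthetic)
fetal = rng.normal(size=(274, 5))       # ultrasound measurement features
maternal = rng.normal(size=(274, 4))    # mother-based clinical features
y = rng.integers(0, 2, size=274)        # FGR label (synthetic)

X = np.hstack([radiomics, fetal, maternal])         # feature-level fusion
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)
print(roc_auc_score(y_te, rf.predict_proba(X_te)[:, 1]))  # AUROC
```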
- Conference Article
- 10.1109/ams.2015.23
- Sep 1, 2015
Image fusion merges the complementary information of different sensors and wavelengths. Images from multiple sensors, such as visible and infrared (IR), are of particular interest in many applications. We present a multi-stage image fusion scheme for multi-sensor images. At the first stage, the proposed method segments the image into homogeneous regions and generates segmentation maps. At the second stage, the segmentation maps are combined by an adaptive weight-adjustment procedure. The third stage fuses the input images and segmentation maps via a genetic-algorithm-based multi-objective optimization strategy. The results indicated that our proposed fusion scheme yields good-quality fused images compared against existing image fusion methods.
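The region-driven fusion idea can be sketched as below, with a simple per-region activity rule standing in for the paper's genetic-algorithm optimization (the variance-based selection criterion is an assumption):

```python
import numpy as np

def region_wise_fuse(visible: np.ndarray, infrared: np.ndarray,
                     seg_map: np.ndarray) -> np.ndarray:
    # For each region in the combined segmentation map, copy pixels from
    # the source with higher local variance (a stand-in activity measure).
    fused = np.empty_like(visible, dtype=float)
    for label in np.unique(seg_map):
        m = seg_map == label
        src = visible if visible[m].var() >= infrared[m].var() else infrared
        fused[m] = src[m]
    return fused
```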
- Research Article
- 10.1142/s146902682442001x
- Nov 25, 2024
- International Journal of Computational Intelligence and Applications
The objective of image captioning is to enable computers to autonomously generate human-like sentences describing a given image. To tackle the challenges of insufficient accuracy in image feature extraction and underutilization of visual information, we present a Swin Transformer-based model for image captioning with feature enhancement and multi-stage fusion (Swin-Caption). Initially, the Swin Transformer is employed as an encoder to extract image features, while feature enhancement is adopted to gather additional image feature information. Subsequently, a multi-stage image and semantic fusion module is constructed to utilize the semantic information from past time steps. Lastly, a two-layer LSTM decodes the semantic and image data to generate captions. The proposed model outperforms the baseline model in experimental tests and instance analysis on the public datasets Flickr8K, Flickr30K, and MS-COCO.
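A minimal PyTorch skeleton in the spirit of Swin-Caption, using torchvision's Swin-T as the encoder and a two-layer LSTM decoder; the paper's feature-enhancement and multi-stage fusion modules are omitted here, and all dimensions are assumptions:

```python
import torch
import torch.nn as nn
from torchvision.models import swin_t

class SwinCaptionSketch(nn.Module):
    def __init__(self, vocab_size: int, embed_dim: int = 512,
                 hidden: int = 512):
        super().__init__()
        backbone = swin_t(weights=None)
        backbone.head = nn.Identity()      # expose 768-d pooled features
        self.encoder = backbone
        self.proj = nn.Linear(768, embed_dim)
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden, num_layers=2, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, images: torch.Tensor, captions: torch.Tensor):
        feat = self.proj(self.encoder(images)).unsqueeze(1)  # (B, 1, E)
        tokens = self.embed(captions)                        # (B, T, E)
        hidden, _ = self.lstm(torch.cat([feat, tokens], dim=1))
        return self.out(hidden)            # next-token logits per step
```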
- Research Article
- 10.12928/telkomnika.v14i2.2748
- Jun 1, 2016
- TELKOMNIKA (Telecommunication Computing Electronics and Control)
Image fusion is a comprehensive information-processing technique whose purpose is to enhance the reliability of an image by processing the redundant data among multiple images, and to improve image definition and information content by fusing the complementary information of multiple images, so as to obtain information about the object or scene in a more accurate, reliable, and comprehensive manner. Using the sparse representation method of compressive sensing theory, this paper proposes a multi-source, multi-feature image information fusion method based on compressive sensing that is tailored to the characteristics of image fusion. The source images are sparsified with the K-SVD and OMP algorithms to transfer them from the spatial domain to the frequency domain and decompose them into a low-frequency part and a high-frequency part, which are then fused with different fusion rules. The experimental results show that the proposed method outperforms traditional methods and obtains better fusion effects.
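A sketch of patch-wise sparse-coding fusion, with scikit-learn's dictionary learning standing in for K-SVD; OMP is used for coding, and the max-L1-activity fusion rule and all sizes are assumptions:

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning
from sklearn.feature_extraction.image import (extract_patches_2d,
                                              reconstruct_from_patches_2d)

def sparse_fusion(img_a: np.ndarray, img_b: np.ndarray,
                  patch: int = 8, atoms: int = 64, nnz: int = 5) -> np.ndarray:
    pa = extract_patches_2d(img_a, (patch, patch)).reshape(-1, patch * patch)
    pb = extract_patches_2d(img_b, (patch, patch)).reshape(-1, patch * patch)
    # Learn a shared dictionary and sparse-code both images with OMP.
    dico = MiniBatchDictionaryLearning(
        n_components=atoms, transform_algorithm='omp',
        transform_n_nonzero_coefs=nnz, random_state=0,
    ).fit(np.vstack([pa, pb]).astype(float))
    ca = dico.transform(pa.astype(float))
    cb = dico.transform(pb.astype(float))
    # Fuse per patch: keep the code with larger L1 activity.
    keep_a = np.abs(ca).sum(axis=1) >= np.abs(cb).sum(axis=1)
    fused = np.where(keep_a[:, None], ca, cb) @ dico.components_
    return reconstruct_from_patches_2d(
        fused.reshape(-1, patch, patch), img_a.shape)
```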
- Conference Article
- 10.1117/12.2304845
- Apr 10, 2018
Generating a description for an image can be regarded as visual understanding. It spans artificial intelligence, machine learning, natural language processing, and many other areas. In this paper, we present a model that generates descriptions for images based on an RNN (recurrent neural network) with object attention and multiple image features. Deep recurrent neural networks have excellent performance in machine translation, so we use one to generate natural-sentence descriptions for images. The proposed method uses a single CNN (convolutional neural network) trained on ImageNet to extract image features. However, such features cannot adequately capture the full content of an image, as they may focus only on the object regions. We therefore add scene information to the image features using a CNN trained on Places205. Experiments show that the model with multiple features extracted by two CNNs performs better than the one with a single feature. In addition, we apply saliency weights to images to emphasize the salient objects. We evaluate our model on MSCOCO using public metrics, and the results show that it performs better than several state-of-the-art methods.
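The two-CNN feature extraction can be sketched as below; ResNet-50 backbones stand in for the paper's networks, and Places205 weights would have to be loaded separately (torchvision only ships ImageNet weights):

```python
import torch
from torchvision.models import resnet50

obj_cnn = resnet50(weights=None)     # would carry ImageNet weights
scene_cnn = resnet50(weights=None)   # would carry Places205 weights
obj_cnn.fc = torch.nn.Identity()     # expose 2048-d pooled features
scene_cnn.fc = torch.nn.Identity()

def multi_feature(images: torch.Tensor) -> torch.Tensor:
    # Concatenate object and scene descriptors as the RNN's image input.
    with torch.no_grad():
        return torch.cat([obj_cnn(images), scene_cnn(images)], dim=1)
```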
- Research Article
- 10.1016/j.bspc.2022.103810
- May 17, 2022
- Biomedical Signal Processing and Control
Diabetic retinopathy classification using a novel DAG network based on multi-feature of fundus images
- Research Article
- 10.3390/agriculture16030298
- Jan 23, 2026
- Agriculture
Current agricultural irrigation management practices are often extensive, and traditional soil moisture content (SMC) monitoring methods are inefficient, so there is a pressing need for innovative approaches in precision irrigation. This study proposes a Multi-Feature Fusion Network (MFF-Net) for SMC inversion. The model uses a designed Channel-Changeable Residual Block (ResBlockCC) to construct a multi-branch feature extraction and fusion architecture. Integrating the Channel Squeeze and Spatial Excitation (sSE) attention module with U-Net-like skip connections, MFF-Net inverts root-zone SMC from summer maize leaf images. Field experiments were conducted in Zhengzhou, Henan Province, China, from 2024 to 2025, under three irrigation treatments: 60–70% θfc, 70–90% θfc, and 60–90% θfc (θfc denotes field capacity). This study shows that (1) MFF-Net achieved its smallest inversion error under the 60–70% θfc treatment, suggesting the inversion was most effective when SMC variation was small and relatively low; (2) MFF-Net demonstrated superior performance to several benchmark models, achieving an R2 of 0.84; and (3) the ablation study confirmed that each feature branch and the sSE attention module contributed positively to model performance. MFF-Net thus offers a technological reference for real-time precision irrigation and shows promise for field SMC inversion in summer maize.
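The sSE (channel squeeze and spatial excitation) attention module mentioned above has a standard form (Roy et al.); a minimal PyTorch version:

```python
import torch
import torch.nn as nn

class SpatialSqueezeExcite(nn.Module):
    """sSE: squeeze channels to one map, then gate each spatial location."""
    def __init__(self, channels: int):
        super().__init__()
        self.squeeze = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * torch.sigmoid(self.squeeze(x))  # (B, C, H, W) gated
```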
- Research Article
- 10.23977/jaip.2016.11002
- Jan 1, 2016
- Journal of Artificial Intelligence Practice
Traffic congestion occurs more and more frequently on urban roads. Detecting congestion rapidly and effectively can prevent secondary damage. In this paper, we use traffic images instead of videos as the data source to detect traffic congestion, which has the advantages of low cost and strong potential for wide deployment. First, the region of interest in each traffic image is calibrated manually, and image features are then extracted from that region, including SIFT corners, gray-histogram variance, and the energy and contrast of the gray-level co-occurrence matrix. Finally, a BP neural network fuses these image features and classifies the traffic condition depicted by each image. Simulation results show that the method recognizes the traffic condition with an accuracy of 95%.
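A sketch of the feature extraction and BP-network (backpropagation MLP) classification using scikit-image and scikit-learn; the hyperparameters are illustrative:

```python
import numpy as np
from skimage.feature import SIFT, graycomatrix, graycoprops
from sklearn.neural_network import MLPClassifier

def roi_features(roi: np.ndarray) -> np.ndarray:
    """Feature vector for an 8-bit grayscale region of interest."""
    # SIFT keypoint count in the region of interest.
    sift = SIFT()
    try:
        sift.detect_and_extract(roi)
        n_kp = len(sift.keypoints)
    except RuntimeError:          # no keypoints detected
        n_kp = 0
    # Gray-histogram variance.
    hist, _ = np.histogram(roi, bins=256, range=(0, 256))
    # GLCM energy and contrast.
    glcm = graycomatrix(roi, distances=[1], angles=[0], levels=256,
                        symmetric=True, normed=True)
    return np.array([n_kp, hist.var(),
                     graycoprops(glcm, 'energy')[0, 0],
                     graycoprops(glcm, 'contrast')[0, 0]])

# A BP network is a classic backprop-trained MLP.
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000)
```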
- Research Article
- 10.1109/tgrs.2022.3159345
- Jan 1, 2022
- IEEE Transactions on Geoscience and Remote Sensing
It is challenging to estimate the absolute attitude and size of satellite targets with limited observational data. This article proposes an innovative way to jointly estimate the absolute attitude and size of satellite targets in 3-D stable coordinates based on inverse synthetic aperture radar (ISAR) image interpretation with only one image. By taking advantage of the rectangular solar panels commonly mounted on satellites, this article extracts the solar panel's principal components, line features, and phase features from a single ISAR image using principal component analysis (PCA), the Radon transform (RT), and a minimum-entropy (ME)-based autofocus method, respectively. The projection relationships between these features and the absolute attitude and size of the satellite are established separately. Through multi-feature fusion, a joint parameter-estimation optimization function is established and solved iteratively by a quasi-Newton method. The attitude and size parameters can be estimated simultaneously and rapidly, realizing satellite state estimation under limited observation data. The excellent performance of the proposed algorithm is verified through different experiments.
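The joint estimation step can be illustrated with a toy orthographic projection model solved by a quasi-Newton (BFGS) optimizer; the real ISAR projection geometry and the fused phase features that further constrain the solution are omitted, so this is only a shape of the method, not the paper's formulation:

```python
import numpy as np
from scipy.optimize import minimize

def panel_edges(yaw, pitch, roll, length, width):
    """Project a rectangular panel's two edge vectors onto the image plane
    (toy Z-Y-X Euler model with orthographic projection)."""
    cz, sz = np.cos(yaw), np.sin(yaw)
    cy, sy = np.cos(pitch), np.sin(pitch)
    cx, sx = np.cos(roll), np.sin(roll)
    R = (np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]]) @
         np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]]) @
         np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]]))
    return (R @ [length, 0, 0])[:2], (R @ [0, width, 0])[:2]

def cost(p, m1, m2):
    # Residual between projected and "measured" panel edge vectors.
    e1, e2 = panel_edges(*p)
    return np.sum((e1 - m1) ** 2) + np.sum((e2 - m2) ** 2)

truth = np.array([0.3, -0.2, 0.1, 6.0, 3.0])     # synthetic ground truth
m1, m2 = panel_edges(*truth)                     # synthetic line features
est = minimize(cost, x0=[0.0, 0.0, 0.0, 5.0, 2.5],
               args=(m1, m2), method='BFGS')     # quasi-Newton solve
print(est.x)  # 4 measurements vs. 5 unknowns: not unique without phase cues
```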
- Research Article
- 10.3390/agronomy15040788
- Mar 23, 2025
- Agronomy
Broccoli is a highly nutritious vegetable that is favored worldwide. Assessing and predicting the shelf life of broccoli holds considerable importance for effective resource optimization and management. The physicochemical parameters and spectral characteristics of broccoli are important indicators that partially reflect its shelf life. However, few studies have used spectral image information to predict and evaluate the shelf life of broccoli. In this study, multispectral imaging combined with multi-feature data fusion was used to predict and evaluate the shelf life of broccoli. Spectral data and textural features were extracted from multispectral images of broccoli and fused with the physicochemical parameters for analysis. Savitzky-Golay (SG) convolution smoothing, standard normal variate (SNV), and normalization (Norm) preprocessing methods were employed to preprocess the original spectral data and textural features, while a successive projection algorithm (SPA) was used to extract relevant feature bands. The physicochemical parameters for broccoli shelf life were predicted using three methods: support vector regression (SVR), random forest (RF), and 2D convolutional neural network (2D-CNN) models. Broccoli shelf-life prediction models were evaluated using three classification methods: RF, 1D-CNN, and 2D-CNN. The results demonstrate that, among the models used for predicting and evaluating the shelf life of broccoli, the SPA+SG+RF classification model employing fused data Type C achieves the highest accuracy, namely 88.98% and 88.64% for the training and validation sets, respectively. Multi-feature fusion of spectral image information and physicochemical parameters was thus combined with different machine learning methods to predict and evaluate the shelf life of broccoli.
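The SG-plus-SNV preprocessing can be sketched as follows (the window length and polynomial order are assumptions):

```python
import numpy as np
from scipy.signal import savgol_filter

def sg_snv(spectra: np.ndarray, window: int = 11, poly: int = 2) -> np.ndarray:
    """Savitzky-Golay smoothing, then standard normal variate per spectrum.

    `spectra` is (n_samples, n_bands); SNV centers and scales each row."""
    sg = savgol_filter(spectra, window_length=window, polyorder=poly, axis=1)
    return (sg - sg.mean(axis=1, keepdims=True)) / sg.std(axis=1, keepdims=True)
```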
- Research Article
- 10.1080/01431161.2025.2586472
- Nov 15, 2025
- International Journal of Remote Sensing
To mitigate the negative impacts of unregulated aquaculture development and promote sustainable industry growth, it is essential to quickly and accurately identify and extract aquaculture ponds. These ponds are distinctive grid-like water bodies segmented by dikes and roads, making their accurate extraction challenging with a single spectral feature. This study addresses this challenge by using data from Sentinel-1 and Sentinel-2, incorporating a comprehensive analysis of spectral, shape, polarization, environmental and temporal features. We propose a two-stage hierarchical decision tree – random forest (HDT-RF) framework: the first stage extracts water bodies using polarization and vegetation indices to suppress non-water interference, while the second stage employs a random forest classifier that fuses multi-source features to refine pond identification. The study results indicate that: 1) HDT-RF achieves an overall accuracy of 95.31%. 2) Compared to traditional methods, the inclusion of environmental and temporal characteristics improves classification accuracy by 6% and enhances the ability to identify water bodies with shapes and structures similar to aquaculture ponds. 3) The introduction of VV and VH polarization features and the NDVI effectively mitigates the impacts of building shadows, non-water dark surfaces, and vegetation, improving the accuracy of water body extraction. 4) HDT-RF enables the automated extraction of aquaculture ponds in different study areas, demonstrating strong portability and high extraction accuracy. This method provides a valuable reference for large-scale pond extraction and offers technical support for fisheries management and sustainable development.
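The two-stage HDT-RF structure can be sketched as below; the stage-1 thresholds are illustrative placeholders, not the paper's calibrated values:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def stage1_water_mask(vv_db: np.ndarray, ndvi: np.ndarray,
                      vv_thr: float = -16.0,
                      ndvi_thr: float = 0.2) -> np.ndarray:
    # Hierarchical decision tree: low VV backscatter and low NDVI -> water.
    return (vv_db < vv_thr) & (ndvi < ndvi_thr)

def stage2_pond_classifier(train_X: np.ndarray,
                           train_y: np.ndarray) -> RandomForestClassifier:
    # Random forest over fused spectral, shape, polarization, environmental,
    # and temporal features of the stage-1 water pixels.
    return RandomForestClassifier(n_estimators=200,
                                  random_state=0).fit(train_X, train_y)
```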
- Research Article
- 10.1088/1361-6501/ae2cae
- Jan 21, 2026
- Measurement Science and Technology
Substantial advancements have been achieved in image analysis methods for dust concentration detection. However, current methods typically rely on a single image feature type, which restricts the utilization of the full spectrum of information present in dust images, thus limiting the accuracy of detection. To address this limitation, the present study systematically examines the response patterns of image quality evaluation features, shallow features, and semantic features to variations in dust concentration within mining environments. Furthermore, it proposes a multi-feature fusion algorithm for dust concentration detection. This algorithm extracts multi-level features from images using techniques such as image quality evaluation, machine vision, and deep learning. A dual feature selection strategy is applied to eliminate redundant features. Subsequently, kernel ridge regression models are trained using feature concatenation, model averaging, model weighted averaging, and stacking fusion methods. The best-performing model, based on predictive accuracy, is selected for dust concentration detection, followed by a residual analysis of the detection results. The experimental dataset comprises 1440 coal dust images collected using a dust concentration measurement system, along with the corresponding dust concentration data. The experimental results demonstrate that the proposed algorithm achieves a high detection accuracy, with an average relative error of 2.40%. The manually designed image features exhibit strong fitting capabilities for dust concentration, while the multi-feature fusion regression model outperforms the single-feature regression models significantly. Among the various models, the regression model that integrates image quality evaluation, shallow, and semantic features, and employs stacking fusion, shows the best predictive performance.
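The stacking fusion of per-feature-group kernel ridge regressors can be sketched with scikit-learn; the column splits and kernel choices are assumptions:

```python
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import StackingRegressor
from sklearn.kernel_ridge import KernelRidge
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline

def krr_on(cols: slice) -> Pipeline:
    # One kernel ridge regressor restricted to a single feature group.
    return Pipeline([
        ('select', ColumnTransformer([('keep', 'passthrough', cols)])),
        ('krr', KernelRidge(kernel='rbf', alpha=1.0)),
    ])

# Hypothetical column ranges for the three feature groups.
stack = StackingRegressor(
    estimators=[('quality', krr_on(slice(0, 10))),     # image-quality features
                ('shallow', krr_on(slice(10, 40))),    # machine-vision features
                ('semantic', krr_on(slice(40, 552)))], # deep semantic features
    final_estimator=Ridge())   # stacking meta-learner over base predictions
```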