Character classification enhancement through hybrid feature fusion in challenging scripts systems
- Research Article
- 10.3390/math13152347
- Jul 23, 2025
- Mathematics
Water body detection in synthetic aperture radar (SAR) imagery plays a critical role in applications such as disaster response, water resource management, and environmental monitoring. However, it remains challenging due to complex background interference in SAR images. To address this issue, a bi-encoder and hybrid feature fusion network (BiEHFFNet) is proposed for accurate water body detection. First, a bi-encoder structure based on ResNet and Swin Transformer jointly extracts local spatial details and global contextual information, enhancing feature representation in complex scenarios. Additionally, the convolutional block attention module (CBAM) is employed to suppress irrelevant information in the output features of each ResNet stage. Second, a cross-attention-based hybrid feature fusion (CABHFF) module interactively integrates local and global features through cross-attention, followed by channel attention, to achieve effective hybrid feature fusion and improve the model’s ability to capture water structures. Third, a multi-scale content-aware upsampling (MSCAU) module integrates atrous spatial pyramid pooling (ASPP) with Content-Aware ReAssembly of FEatures (CARAFE), enhancing multi-scale contextual learning while alleviating the feature distortion caused by upsampling. Finally, a composite loss function combining Dice loss and Active Contour loss provides stronger boundary supervision. Experiments on the ALOS PALSAR dataset demonstrate that the proposed BiEHFFNet outperforms existing methods across multiple evaluation metrics, achieving more accurate water body detection.
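The composite loss described above combines Dice loss with an Active Contour loss. A minimal sketch of the Dice component follows; the contour term is abstracted to a precomputed scalar and the weight `alpha` is an assumed hyperparameter, so this illustrates the idea rather than the authors' implementation:

```python
def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss over flattened masks: 1 - 2|P∩T| / (|P| + |T|)."""
    inter = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 - (2.0 * inter + eps) / (total + eps)

def composite_loss(pred, target, contour_term=0.0, alpha=0.5):
    # The paper adds an Active Contour loss for boundary supervision;
    # here it is a placeholder scalar (assumption for illustration).
    return alpha * dice_loss(pred, target) + (1.0 - alpha) * contour_term
```

A perfectly predicted mask drives the Dice term to zero, so any remaining supervision signal comes from the boundary term.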
- Research Article
- 10.1109/tcyb.2022.3178116
- Dec 1, 2023
- IEEE Transactions on Cybernetics
The rapid development of information and communication technologies has facilitated machining condition monitoring toward a data-driven paradigm, of which the Industrial Internet of Things (IIoT) serves as the fundamental basis to acquire data from physical equipment with sensing technologies as well as to learn the relationship between the system condition and the collected condition monitoring data. However, most data-driven methods suffer from using a single-domain space, ignoring the importance of the learned features, and failing to incorporate handcrafted features informed by domain knowledge. To address these limitations, a novel deep learning approach is proposed for machining condition monitoring in the IIoT environment, consisting of three phases: 1) unsupervised parallel feature extraction; 2) adaptive feature importance weighting; and 3) hybrid feature fusion. First, separate sparse autoencoders conduct the unsupervised parallel feature extraction, making it possible to learn abstract feature representations from multiple domain spaces simultaneously. Then, an attention module is designed for adaptive feature importance weighting, assigning higher weights to critical features. Moreover, hybrid feature fusion complements the automatic feature learning and further improves model performance by fusing in the handcrafted features informed by domain knowledge. Finally, a real-life case study and extensive experiments show the effectiveness and superiority of the proposed approach.
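The adaptive feature importance weighting in phase 2 can be illustrated with a softmax over per-feature scores. The magnitude-based scoring below is an assumption standing in for the paper's learned attention module:

```python
import math

def attention_weight(features):
    """Assign softmax importance weights to features and rescale them,
    so critical (high-scoring) features contribute more downstream."""
    scores = [abs(f) for f in features]       # stand-in scoring (assumption)
    m = max(scores)                           # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    return [w * f for w, f in zip(weights, features)], weights
```

In the actual module the scores would be produced by trainable layers rather than by feature magnitude.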
- Research Article
- 10.1109/access.2023.3286935
- Jan 1, 2023
- IEEE Access
Coffee leaf diseases can significantly impact the productivity and quality of the crops. Accurate and timely identification of these diseases is crucial for effective management and control. This paper proposes a hybrid feature fusion approach for identifying coffee leaf disease, including early and late feature fusion. First, we propose several hybrid models to extract informative features from the input images by combining MobileNetV3, Swin Transformer, and variational autoencoder (VAE). MobileNetV3, with its inductive bias of locality, extracts features among image regions that are close to one another (local features), while the Swin Transformer captures feature interactions that are further apart (high-level features). These differently extracted features contain complementary information that enriches a unified feature map. Second, the features extracted by these models are fused in the early fusion network, where an early-fusion learner network learns the rich information in the extracted features. The late fusion network then comprehensively learns the fused features before a classification network classifies coffee leaf diseases. The proposed hybrid feature fusion approach is evaluated on the challenging, real-world Robusta Coffee Leaf (RoCoLe) dataset with various diseases, including red spider mite and leaf rust disease. The results show that our approach, the hybrid feature fusion of MobileNetV3 and Swin Transformer, outperforms the individual models with an accuracy of 84.29%. In conclusion, the hybrid feature fusion approach combining MobileNetV3 and Swin Transformer models is promising for coffee leaf disease identification, providing accurate and timely diagnosis for effective management and control of the diseases in real-world conditions.
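The early-fusion step above can be sketched as normalization followed by concatenation of the two backbones' feature vectors; the L2 normalization is an assumption made here to put local and global features on a comparable scale:

```python
import math

def l2_normalize(v, eps=1e-12):
    """Scale a feature vector to unit L2 norm."""
    n = math.sqrt(sum(x * x for x in v)) + eps
    return [x / n for x in v]

def early_fuse(local_feats, global_feats):
    """Early fusion sketch: normalize each backbone's features, then
    concatenate them into one unified feature vector."""
    return l2_normalize(local_feats) + l2_normalize(global_feats)
```

The fused vector would then feed the early-fusion learner network described in the abstract.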
- Research Article
- 10.3389/fonc.2025.1595980
- Aug 18, 2025
- Frontiers in Oncology
Objective: Cervical cancer screening through cytology remains the gold standard for early detection, but manual analysis is time-consuming, labor-intensive, and prone to inter-observer variability. This study proposes an automated deep learning-based framework that integrates lesion detection, feature extraction, and classification to enhance the accuracy and efficiency of cytological diagnosis.
Materials and methods: A dataset of 4,236 cervical cytology samples was collected from six medical centers, with lesion annotations categorized into six diagnostic classes (NILM, ASC-US, ASC-H, LSIL, HSIL, SCC). Four deep learning models, Swin Transformer, YOLOv11, Faster R-CNN, and DETR (DEtection TRansformer), were employed for lesion detection, and their performance was compared using mAP, IoU, precision, recall, and F1-score. From detected lesion regions, radiomics features (n=71) and deep learning features (n=1,792) extracted from EfficientNet were analyzed. Dimensionality reduction techniques (PCA, LASSO, ANOVA, MI, t-SNE) were applied to optimize feature selection before classification using XGBoost, Random Forest, CatBoost, TabNet, and TabTransformer. Additionally, an end-to-end classification model using EfficientNet was evaluated. The framework was validated using internal cross-validation and external testing on APCData (3,619 samples).
Results: The Swin Transformer achieved the highest lesion detection accuracy (mAP: 0.94 external), outperforming YOLOv11, Faster R-CNN, and DETR. Combining radiomics and deep features with TabTransformer yielded superior classification (test accuracy: 94.6%, AUC: 95.9%, recall: 94.1%), exceeding both single-modality and end-to-end models. Ablation studies confirmed the importance of both the detection module and hybrid feature fusion. External validation demonstrated high generalizability (accuracy: 92.8%, AUC: 95.1%). Comprehensive statistical analyses, including bootstrapped confidence intervals and DeLong’s test, further substantiated the robustness and reliability of the proposed framework.
Conclusions: The proposed AI-driven cytology analysis framework offers superior lesion detection, feature fusion-based classification, and robust generalizability, providing a scalable solution for automated cervical cancer screening. Future efforts should focus on explainable AI (XAI), real-time deployment, and larger-scale validation to facilitate clinical integration.
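The feature-selection step before classification can be illustrated with a simple variance filter over the feature columns; this is a deliberately simplified stand-in for the PCA/LASSO/ANOVA/MI selection the study actually applies:

```python
def select_top_k_by_variance(rows, k):
    """Keep the k feature columns with the highest variance. A basic
    filter standing in for the paper's dimensionality reduction."""
    n = len(rows)
    d = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    variances = [sum((r[j] - means[j]) ** 2 for r in rows) / n
                 for j in range(d)]
    keep = sorted(range(d), key=lambda j: variances[j], reverse=True)[:k]
    keep.sort()  # preserve original column order
    return [[r[j] for j in keep] for r in rows], keep
```

In the study, the radiomics (n=71) and deep (n=1,792) feature sets would be concatenated per sample before such a reduction.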
- Research Article
- 10.3390/cancers15215247
- Oct 31, 2023
- Cancers
Oral cancer is a fatal disease and ranks seventh among the most common cancers worldwide. It usually affects the head and neck. The current gold standard for diagnosis is histopathological investigation; however, this conventional approach is time-consuming and requires professional interpretation. Early diagnosis of Oral Squamous Cell Carcinoma (OSCC) is therefore crucial for successful therapy, reducing the risk of mortality and morbidity while improving the patient's chances of survival. Thus, we employed several artificial intelligence techniques to aid clinicians and physicians, thereby significantly reducing the workload of pathologists. This study aimed to develop hybrid methodologies based on fused features to generate better results for early diagnosis of OSCC. The study employed three different strategies, each using five distinct models. The first strategy is transfer learning using the Xception, InceptionV3, InceptionResNetV2, NASNetLarge, and DenseNet201 models. The second strategy uses pre-trained state-of-the-art CNNs for feature extraction coupled with a Support Vector Machine (SVM) for classification: features were extracted using the same five pre-trained models and subsequently fed to the SVM algorithm to evaluate classification accuracy. The final strategy employs a hybrid feature fusion technique, using the aforementioned state-of-the-art CNN models to extract deep features. These deep features underwent dimensionality reduction through principal component analysis (PCA). Subsequently, the low-dimensional features were combined with shape, color, and texture features extracted using gray-level co-occurrence matrix (GLCM), Histogram of Oriented Gradients (HOG), and Local Binary Pattern (LBP) methods.
Hybrid feature fusion was incorporated into the SVM to enhance the classification performance. The proposed system achieved promising results for rapid diagnosis of OSCC using histological images. The accuracy, precision, sensitivity, specificity, F-1 score, and area under the curve (AUC) of the support vector machine (SVM) algorithm based on the hybrid feature fusion of DenseNet201 with GLCM, HOG, and LBP features were 97.00%, 96.77%, 90.90%, 98.92%, 93.74%, and 96.80%, respectively.
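Of the handcrafted descriptors fused with the deep features above, LBP is the simplest to sketch. A basic 8-neighbour variant follows, not necessarily the exact configuration used in the study:

```python
def lbp_code(img, r, c):
    """8-neighbour Local Binary Pattern code for pixel (r, c)."""
    center = img[r][c]
    # clockwise from top-left; each neighbour contributes one bit
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(img):
    """256-bin LBP histogram over interior pixels, one of the texture
    descriptors (with GLCM and HOG) concatenated with deep features."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist
```

The resulting histogram would be concatenated with the PCA-reduced deep features before the SVM.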
- Research Article
- 10.36548/jiip.2025.3.002
- Sep 1, 2025
- Journal of Innovative Image Processing
Effective disease control and agricultural production require accurate and rapid crop disease detection. Chili, a commodity crop grown worldwide, is susceptible to several diseases, including anthracnose, which reduces yields and harms farmers. Traditional disease detection approaches are laborious, time-consuming, and require specialized knowledge, adding to intervention delays and economic losses. The lack of systematic chili disease data makes identification more difficult. This research aims to improve agricultural disease identification using feature fusion, transfer learning, and a Convolutional Neural Network (CNN) to accurately and efficiently diagnose anthracnose disease in chili plants. Images are represented by two feature extractors: the first is a CNN based on VGG19, and the second is a Hybrid Feature Extractor (HFE). The HFE combines three feature extraction techniques, Speeded-Up Robust Features (SURF), Local Binary Pattern (LBP), and Histogram of Oriented Gradients (HOG), into a single fused feature vector. The classification model is then created by combining these two feature vectors. Using this combined feature set, a CNN with a fully connected layer and softmax function is trained to identify whether chili images are healthy or diseased. The model is further improved and optimized through data augmentation. The feature fusion approach shows great promise because it can more precisely detect anthracnose disease in chili plants. Using 128 × 128-pixel images, the model was trained with a learning rate of 0.01 and achieved 99.58% accuracy after 100 iterations, and it performs well across different batch sizes and learning rates. Compared to the top models currently in use, the feature fusion approach produces better performance results. This research can help reduce the financial losses caused by anthracnose disease and support sustainable chili crop management.
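The HFE concatenates SURF, LBP, and HOG descriptors into one fused vector. Below is a sketch of the HOG core (an unsigned orientation histogram for a single cell) plus the concatenation step, with simplified assumptions about cell size and binning:

```python
import math

def hog_cell_histogram(cell, bins=9):
    """Unsigned-gradient orientation histogram for one cell, the core
    computation of HOG; 9 bins over 0-180 degrees is the common default."""
    hist = [0.0] * bins
    for r in range(1, len(cell) - 1):
        for c in range(1, len(cell[0]) - 1):
            gx = cell[r][c + 1] - cell[r][c - 1]   # central differences
            gy = cell[r + 1][c] - cell[r - 1][c]
            mag = math.hypot(gx, gy)
            ang = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[min(int(ang / (180.0 / bins)), bins - 1)] += mag
    return hist

def fuse(*descriptors):
    """HFE-style fusion: concatenate descriptor vectors into one."""
    return [x for d in descriptors for x in d]
```

The fused vector would then be combined with the VGG19 features before the fully connected classifier.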
- Research Article
- 10.1080/21681163.2023.2193649
- Apr 2, 2023
- Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization
In this research, a multimodal (image and video) face shape detection model for identifying human temperament is developed. The video is captured by webcam in a live recording session, and the proposed model comprises three major stages: pre-processing, feature extraction and fusion, and temperament detection. Facial landmark and facial boundary extraction take place in the pre-processing stage. In the feature extraction stage, handcrafted features are extracted from the image (face, forehead, eyes, cheeks, nose, and mouth), while intrinsic features of the face region are extracted from the video frames using a pretrained Inception V3 model. Robust principal component analysis (RPCA) is then introduced to reduce the dimensionality of the extracted features. Further, feature fusion is performed using discriminant correlation analysis (DCA) and canonical correlation analysis (CCA) in a hybrid phase. Finally, a gated recurrent unit (GRU) classifier is applied to identify human temperaments based on face shape. In the experimental scenario, accuracy (98.51%, 98.86%), precision (96.14%, 97.89%), recall (96.34%, 97.95%), F-measure (96.24%, 97.94%), and other performance measures are evaluated and compared with state-of-the-art methods on two datasets. In addition, a statistical test is conducted to validate the efficacy of the proposed model.
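DCA and CCA both fuse two feature views by finding projections that maximize cross-view correlation. The quantity being maximized is ordinary Pearson correlation, sketched here for two already-projected 1-D views (the projection learning itself is omitted):

```python
import math

def pearson(u, v):
    """Correlation between two projected feature views, the objective
    that CCA-style fusion maximizes over learned projections."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)
```

Once projections with high correlation are found, the projected image and video features can be summed or concatenated for the GRU classifier.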
- Research Article
- 10.3390/s24072052
- Mar 23, 2024
- Sensors (Basel, Switzerland)
Multi-frame super-resolution (MFSR) leverages complementary information between image sequences of the same scene to increase the resolution of the reconstructed image. As a branch of MFSR, burst super-resolution aims to restore image details by leveraging the complementary information between noisy sequences. In this paper, we propose an efficient burst-enhanced super-resolution network (BESR). Specifically, we introduce Geformer, a gate-enhanced transformer, and construct an enhanced CNN-Transformer block (ECTB) by combining convolutions to enhance local perception. ECTB efficiently aggregates intra-frame context and inter-frame correlation information, yielding an enhanced feature representation. Additionally, we leverage reference features to facilitate inter-frame communication, enhancing spatiotemporal coherence among multiple frames. To address the critical processes of inter-frame alignment and feature fusion, we propose optimized pyramid alignment (OPA) and hybrid feature fusion (HFF) modules to capture and utilize complementary information between multiple frames to recover more high-frequency details. Extensive experiments demonstrate that, compared to state-of-the-art methods, BESR achieves higher efficiency and competitively superior reconstruction results. On the synthetic dataset and real-world dataset of BurstSR, our BESR achieves PSNR values of 42.79 dB and 48.86 dB, respectively, outperforming other MFSR models significantly.
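The reported 42.79 dB and 48.86 dB figures are PSNR values; for reference, a minimal PSNR computation over flattened 8-bit images:

```python
import math

def psnr(ref, recon, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference image and
    a reconstruction, both flattened to 1-D sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(ref, recon)) / len(ref)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(peak * peak / mse)
```

Higher is better: each +6 dB corresponds to roughly halving the root-mean-square error.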
- Research Article
- 10.1177/15347346251412645
- Jan 19, 2026
- The international journal of lower extremity wounds
Automated leprosy chronic wound analysis from smartphone-acquired images remains hindered by uneven illumination, indistinct lesion margins, and poor spatial-textural integration. The CO-WinF framework introduces three specialized modules: AINCE, which performs tile-adaptive contrast remapping guided by local intensity distributions and edge-preserving smoothing to restore fine lesion textures; MEPS, which executes simultaneous multi-resolution encoding within a U-Net backbone enhanced by gradient-driven attention to emphasize boundary transitions and ensure accurate segmentation of irregular wound contours; and HSTFC, which leverages attention-weighted fusion of deep spatial embeddings and handcrafted LBP and HOG texture histograms, followed by classification via a gradient-boosted ensemble optimized for class imbalance. Validation on the CO2Wounds-V2 dataset yields 94.98% precision, 95.78% recall, 93.10% F1-score, and 87.09% IoU, surpassing existing state-of-the-art approaches. By integrating localized enhancement, edge-aware segmentation, and hybrid feature fusion in a computationally efficient pipeline, CO-WinF delivers robust, interpretable diagnostic support in resource-constrained clinical environments. Key novelties include the tile-adaptive remapping within AINCE, the gradient-driven attention integrated across scales in MEPS, and the attention-weighted fusion of spatial and textural features in HSTFC. By addressing pre-processing and feature-level integration challenges, CO-WinF establishes a benchmark for smartphone wound analysis.
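AINCE's tile-adaptive contrast remapping also uses local intensity distributions and edge-preserving smoothing; a much-simplified per-tile min-max stretch conveys the basic idea of adapting contrast to each tile independently:

```python
def stretch_tile(tile, out_max=255.0):
    """Per-tile min-max contrast stretch: remap the tile's own intensity
    range to [0, out_max]. A simplified stand-in for AINCE's remapping."""
    lo = min(min(row) for row in tile)
    hi = max(max(row) for row in tile)
    if hi == lo:
        # flat tile: nothing to stretch
        return [[0.0 for _ in row] for row in tile]
    scale = out_max / (hi - lo)
    return [[(p - lo) * scale for p in row] for row in tile]
```

In a full pipeline each image tile would be remapped this way and the tile boundaries blended to avoid seams.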
- Research Article
- 10.1109/jsen.2025.3581717
- Aug 1, 2025
- IEEE Sensors Journal
GSD-YOLO: A Gear Surface Defects Detection Method Using Adaptive Multiscale Fusion and Hybrid Feature Fusion
- Research Article
- 10.1504/ijesms.2025.147416
- Jan 1, 2025
- International Journal of Engineering Systems Modelling and Simulation
Character classification enhancement through hybrid feature fusion in challenging scripts systems
- Research Article
- 10.1016/j.procs.2025.04.283
- Jan 1, 2025
- Procedia Computer Science
Optimizing Oral Cancer Detection: A Hybrid Feature Fusion using Local Binary Pattern and CNN
- Conference Article
- 10.1109/icme59968.2025.11210244
- Jun 30, 2025
Continuous Lane Detection Network with Hybrid Feature Fusion and Differential Aggregation
- Conference Article
- 10.15625/vap.2024.0226
- Dec 24, 2024
HYBRID FEATURE FUSION: ENHANCING SCENE IMAGE CLASSIFICATION WITH HANDCRAFTED AND DEEP LEARNING APPROACHES
- Research Article
- 10.1007/s42835-025-02425-w
- Sep 2, 2025
- Journal of Electrical Engineering & Technology
Hybrid Feature Fusion with a Stacking Classifier for Accurate High-Voltage Equipment Identification