Articles published on Multi-level Features
- Research Article
- 10.5194/isprs-annals-x-1-w2-2025-239-2025
- Nov 5, 2025
- ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences
- Anhao Yang + 5 more
Abstract. In the context of agricultural modernization, precise 3D organ segmentation has become indispensable for automated extraction of phenotypic traits. In particular, the precise delineation of stem and leaf structures from 3D point clouds is critical for monitoring plant growth and supporting high-throughput breeding programs. However, the intricate structure of crops and the blurred boundaries between stems and leaves present significant challenges, leading to poor segmentation performance. To address these problems, we propose a Semantic Embedding-Guided Graph Self-Attention Network for stem-leaf separation in 3D point clouds that targets weak feature representation and low inter-class separability in complex plant structures. During the encoding stage, a multi-scale feature extraction module captures fine-grained local geometries, while a feature fusion module integrating graph convolution and self-attention facilitates deep fusion of local and global semantic information. In the decoding stage, hierarchical upsampling combined with multi-level feature fusion reconstructs high-resolution representations to achieve fine-grained segmentation. Furthermore, we introduce a joint loss function that integrates an inter-class discriminative loss with cross-entropy, aiming to optimize intra-class uniformity and reinforce class boundary delineation. Validation experiments on the Plant-3D dataset demonstrate that our method attains superior performance, with mean precision, recall, and IoU of 96.47%, 96.39%, and 93.50%, respectively. The proposed approach demonstrates high robustness and generalizability across diverse plant species and growth stages, providing an effective solution for high-throughput plant phenotyping.
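As a concrete reading of the joint loss described above, the sketch below combines standard cross-entropy with a margin-based penalty between per-class feature centroids. The discriminative term, margin, and weighting are assumptions for illustration; the abstract does not specify the exact formulation.

```python
import torch
import torch.nn.functional as F

def joint_loss(logits, features, labels, margin=1.0, alpha=0.1):
    """Cross-entropy plus a hypothetical inter-class discriminative term.

    logits:   (N, C) per-point class scores
    features: (N, D) per-point embeddings
    labels:   (N,)   ground-truth class indices
    """
    ce = F.cross_entropy(logits, labels)

    # Per-class centroids of the embedding space.
    classes = labels.unique()
    centroids = torch.stack([features[labels == c].mean(dim=0) for c in classes])

    # Penalize pairs of class centroids that sit closer than `margin`.
    if centroids.shape[0] > 1:
        dists = torch.cdist(centroids, centroids)
        off_diag = ~torch.eye(len(classes), dtype=torch.bool, device=dists.device)
        inter_class = F.relu(margin - dists[off_diag]).mean()
    else:
        inter_class = torch.zeros((), device=logits.device)

    return ce + alpha * inter_class
```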
- Research Article
- 10.1038/s41598-025-22545-7
- Nov 5, 2025
- Scientific reports
- Ruba Abu Khurma + 4 more
Dermatological diseases are prevalent globally and pose significant challenges in diagnosis and treatment. Dermatology has changed due to developments in high-resolution digital photography and medical imaging, making it possible to document and analyze skin, nail, and hair diseases in great detail. With more than 10,000 photos, the Skin Condition Image Network (SCIN) dataset has become an essential tool in this area. In dermatological image analysis, image segmentation is essential because it makes it easier to identify and classify areas of interest for applications including automated disease diagnosis, lesion identification, and measurement. However, because skin textures and lighting vary and skin disorders present differently from person to person, it is difficult to achieve reliable segmentation in dermatological images. While existing segmentation techniques are helpful for general image analysis tasks, they are frequently insufficient for dermatological images from datasets such as SCIN. Reliable and consistent segmentation results are hampered by problems such as uneven lighting, varying lesion scales, and image artifacts. Therefore, optimization algorithms that can adapt to the unique characteristics of dermatological images are needed to increase segmentation accuracy. This work proposes an enhanced multilevel image segmentation optimization method designed explicitly for SCIN dermatological images. The Enhanced Secretary Bird Optimization Algorithm (mSBOA) incorporates two improvements, Opposition-Based Learning (OBL) and Orthogonal Learning (OL), to increase segmentation accuracy, robustness to image artifacts, and computational efficiency. This study aims to improve optimization algorithms for robust multilevel feature segmentation in SCIN dataset dermatological images, mitigate problems such as overlapping textures and variable illumination, increase computational efficiency without sacrificing accuracy, and investigate possible clinical benefits of higher segmentation accuracy in automated dermatological diagnostics. Accurate segmentation can help create personalized treatment approaches, enhance patient outcomes, and lower diagnostic errors. Dermatologists benefit from the wider adoption of AI-based healthcare solutions made possible by robust segmentation algorithms, especially in remote or underdeveloped areas. By increasing the potential for automated dermatological evaluations and enhancing diagnostic capacities, the study's findings advance the field of dermatological image analysis.
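The abstract does not state which thresholding criterion mSBOA optimizes; a common choice in multilevel image segmentation is Otsu's between-class variance. The sketch below shows such a fitness function that a population-based optimizer like mSBOA could evaluate on candidate threshold vectors; the histogram handling and names are illustrative, not the paper's implementation.

```python
import numpy as np

def between_class_variance(hist, thresholds):
    """Otsu-style fitness for a candidate multilevel threshold set.

    hist:       normalized 256-bin grayscale histogram (sums to 1)
    thresholds: sorted integer thresholds in (0, 255), e.g. [60, 120, 190]
    """
    levels = np.arange(len(hist))
    edges = [0] + list(thresholds) + [len(hist)]
    mu_total = np.sum(levels * hist)

    fitness = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        w = hist[lo:hi].sum()                             # class probability
        if w <= 0:
            continue
        mu = np.sum(levels[lo:hi] * hist[lo:hi]) / w      # class mean
        fitness += w * (mu - mu_total) ** 2               # between-class variance term
    return fitness

# A metaheuristic such as mSBOA would maximize this fitness over threshold vectors:
# best = max(candidate_sets, key=lambda t: between_class_variance(hist, sorted(t)))
```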
- Research Article
- 10.1088/1361-6501/ae1aa9
- Nov 3, 2025
- Measurement Science and Technology
- Bowen Xiao + 4 more
Abstract Data imbalance remains a significant challenge to the practical application of intelligent fault diagnosis in accessory gearboxes. While data augmentation has proven to be an effective solution, deep generative models are difficult to train under limited sample conditions. To address this issue, this paper proposes a hierarchical contextual feature fusion synthetic minority over-sampling technique (HCFF-SMOTE). First, a novel skip-connected encoder-decoder architecture is developed. The skip connections enhance the model's ability to learn features with limited labeled data. The encoder employs a multi-scale convolutional neural network to hierarchically extract multi-level features. Meanwhile, the decoder integrates an HCFF mechanism, which combines channel attention, spatial attention, and fusion attention to adaptively capture dependencies across these multi-level features, thereby enhancing the fine-grained feature representation. After training, fault samples are mapped into the deep feature space by the encoder. New features are generated using the synthetic minority over-sampling technique (SMOTE) and are then reconstructed by the decoder to synthesize realistic and diverse fault samples. Extensive experiments show that HCFF-SMOTE outperforms state-of-the-art methods, achieving up to 10.76% higher accuracy than training on the imbalanced dataset at a fault sample proportion of 2.5%, demonstrating its robustness and effectiveness under extreme data imbalance.
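The central step of HCFF-SMOTE, as described, is SMOTE-style interpolation between minority-class samples in the encoder's feature space before decoding. A minimal sketch of that step, assuming the classic SMOTE recipe (k-nearest-neighbour interpolation with a random factor); the encoder/decoder names and settings are placeholders, not the paper's configuration.

```python
import numpy as np

def smote_in_feature_space(minority_feats, n_new, k=5, rng=None):
    """Generate synthetic minority features by SMOTE-style interpolation.

    minority_feats: (N, D) latent vectors produced by the encoder
    n_new:          number of synthetic vectors to create
    """
    rng = rng or np.random.default_rng()
    synth = []
    for _ in range(n_new):
        i = rng.integers(len(minority_feats))
        anchor = minority_feats[i]
        # k nearest neighbours of the anchor (excluding itself).
        d = np.linalg.norm(minority_feats - anchor, axis=1)
        neighbours = np.argsort(d)[1:k + 1]
        j = rng.choice(neighbours)
        lam = rng.random()                                  # interpolation factor in [0, 1)
        synth.append(anchor + lam * (minority_feats[j] - anchor))
    return np.stack(synth)

# new_feats = smote_in_feature_space(encoder(x_minority), n_new=200)
# x_synthetic = decoder(new_feats)   # decoded back into realistic fault samples
```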
- Research Article
- 10.5194/isprs-annals-x-1-w2-2025-27-2025
- Nov 3, 2025
- ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences
- Jing Du + 3 more
Abstract. Urban environments are continually evolving, which presents significant challenges for 3D semantic segmentation systems that must adapt to emerging object categories. In this paper, we address the problem of Novel Class Discovery (NCD) in 3D semantic segmentation for urban scenes. We introduce a feature-driven framework that leverages the Dynamic Multi-level Feature Synthesis Module (D-MFSM) to extract and integrate multi-scale, cross-view structural information from raw urban point clouds. D-MFSM dynamically partitions point clouds via an adaptive grouping mechanism that utilizes a learnable spatial weight vector, and subsequently constructs local neighborhoods by means of an improved farthest point sampling strategy. The extracted local features are then processed by a dual-path adaptive synthesis mechanism and further refined through a novel cross-axis reordering strategy, which together yield comprehensive aggregated feature representations. These representations facilitate robust novel class discovery while maintaining high segmentation accuracy on known classes. Comprehensive evaluations on the DALES dataset demonstrate that the proposed approach yields substantial improvements in segmentation performance across diverse urban scenarios. The proposed framework, therefore, offers a complementary solution to existing methods and contributes to the development of more adaptive and accurate 3D semantic segmentation systems in complex urban settings.
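The improved farthest point sampling strategy in D-MFSM is not detailed in the abstract; the baseline FPS it presumably builds on can be sketched as follows (names and the random starting point are illustrative).

```python
import numpy as np

def farthest_point_sampling(points, n_samples, seed=0):
    """Baseline FPS: greedily pick points that maximize distance to the chosen set.

    points: (N, 3) point cloud coordinates
    """
    rng = np.random.default_rng(seed)
    chosen = [int(rng.integers(len(points)))]        # random starting point
    min_dist = np.full(len(points), np.inf)

    while len(chosen) < n_samples:
        # Distance from every point to the most recently chosen centre.
        d = np.linalg.norm(points - points[chosen[-1]], axis=1)
        min_dist = np.minimum(min_dist, d)           # distance to nearest chosen point
        chosen.append(int(np.argmax(min_dist)))      # farthest remaining point
    return np.array(chosen)
```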
- Research Article
- 10.3390/s25216694
- Nov 2, 2025
- Sensors
- Zengbo Xu + 1 more
A method for super-resolution reconstruction of sonograms based on a Residual Dense Conditional Generative Adversarial Network (RDC-GAN) is proposed in this paper. The resolution of medical ultrasound images is limited, and single-frame super-resolution algorithms based on convolutional neural networks are prone to losing texture details and extracting too few features, which blurs the reconstructed images. It is therefore important to reconstruct high-resolution medical images while retaining texture details. A Generative Adversarial Network can learn the mapping between low-resolution and high-resolution images. Building on GAN, a new network is designed whose generator is composed of dense residual modules. On the one hand, low-resolution (LR) images are fed into the dense residual network, where multi-level image features are learned and fused into global residual features. On the other hand, conditional variables are introduced into the discriminator network to guide the process of super-resolution image reconstruction. The proposed method achieves 4× magnification reconstruction of medical ultrasound images. Experimental results show that, compared with classical algorithms including Bicubic, SRCNN, and SRGAN, RDC-GAN effectively improves the super-resolution of medical ultrasound images in both objective numerical evaluation and subjective visual assessment. Moreover, the application of the super-resolution reconstructed images to cirrhosis staging is discussed, and the resulting accuracy rates demonstrate their practicality compared with the original images.
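The generator described above stacks dense residual modules and fuses their outputs into global residual features. A compact, hedged PyTorch sketch of one such block follows, with illustrative channel counts and growth rate rather than the paper's configuration; several such blocks would typically be chained and their outputs combined into a global residual before upsampling.

```python
import torch
import torch.nn as nn

class ResidualDenseBlock(nn.Module):
    """Dense convolutions with a local residual, as used in RDN-style generators."""
    def __init__(self, channels=64, growth=32, n_layers=4):
        super().__init__()
        self.layers = nn.ModuleList()
        for i in range(n_layers):
            self.layers.append(nn.Sequential(
                nn.Conv2d(channels + i * growth, growth, 3, padding=1),
                nn.ReLU(inplace=True)))
        # 1x1 conv fuses all densely connected features back to `channels`.
        self.fuse = nn.Conv2d(channels + n_layers * growth, channels, 1)

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))
        return x + self.fuse(torch.cat(feats, dim=1))    # local residual connection
```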
- Research Article
- 10.1016/j.ipm.2025.104240
- Nov 1, 2025
- Information Processing & Management
- Bangcheng Zhan + 3 more
CoSegNet: Color medical image segmentation via quaternion self-attention and multi-level feature constraints
- Research Article
- 10.1016/j.neucom.2025.131069
- Nov 1, 2025
- Neurocomputing
- Routhu Srinivasa Rao + 2 more
Multi-level feature enhancement and dual attention mechanisms for improved osteoporosis diagnosis
- Research Article
- 10.1016/j.neucom.2025.131233
- Nov 1, 2025
- Neurocomputing
- Pingzhu Liu
A multimodal fusion framework for semantic segmentation of remote sensing based on multilevel feature fusion learning
- Research Article
- 10.1016/j.energy.2025.138657
- Nov 1, 2025
- Energy
- Xifeng Guo + 6 more
Short-term power load forecasting for estate-level buildings considering multilevel feature extraction and adaptive fusion
- Research Article
- 10.1002/mp.70130
- Nov 1, 2025
- Medical physics
- Shuaikun Han + 6 more
Anterior cruciate ligament (ACL) injuries are common among athletes, and accurate diagnosis is essential for recovery. Magnetic resonance imaging (MRI) is the preferred tool, but manual interpretation is time-consuming and often affected by slice noise. This study proposes a novel automated framework, the multi-level feature aggregation network with slice-aligning (MLFANet-SA), to detect ACL injuries from MRI scans by excluding irrelevant slices and focusing on diagnostically significant regions, without requiring ROI or segmentation labels. MLFANet-SA consists of two modules: (1) a slice-aligning (SA) module using a local context perceptron (LCP) to identify boundary slices and unify diagnostic regions, and (2) a multi-level feature aggregation (MLFA) module that captures spatial and cross-slice lesion patterns via channel-wise Top-K pooling and cross-slice fusion. On the MRNet dataset, MLFANet-SA achieves an AUC of 0.981, sensitivity of 0.961, specificity of 0.941, precision of 0.933, accuracy of 0.949, and MCC of 0.892. On the private GDAPF dataset, it reaches an AUC of 0.975, sensitivity of 0.945, specificity of 0.947, precision of 0.945, accuracy of 0.946, and MCC of 0.901. These results demonstrate that MLFANet-SA achieves superior diagnostic performance with fewer annotation requirements compared to existing models. MLFANet-SA combines slice selection and feature aggregation to improve lesion localization and classification of ACL injuries. Our model outperforms previous approaches on both public and private datasets. Its ability to reduce manual labeling while maintaining high accuracy suggests strong potential for assisting radiologists in diagnosing ACL injuries.
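One plausible reading of the channel-wise Top-K pooling in the MLFA module is averaging the K strongest per-channel responses across slices; the sketch below follows that reading, with the tensor layout and the cross-slice fusion step as assumptions rather than the paper's exact design.

```python
import torch

def channelwise_topk_pool(features, k=3):
    """Average the top-k activations per channel over the slice axis.

    features: (B, S, C) per-slice feature vectors (S slices, C channels)
    returns:  (B, C)    pooled study-level representation
    """
    topk_vals, _ = torch.topk(features, k=min(k, features.shape[1]), dim=1)
    return topk_vals.mean(dim=1)

# A simple cross-slice fusion could broadcast the pooled vector back to each slice:
# pooled = channelwise_topk_pool(features)
# fused = features + pooled.unsqueeze(1)
```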
- Research Article
- 10.1016/j.infrared.2025.105964
- Nov 1, 2025
- Infrared Physics & Technology
- Xuan Fang + 3 more
Progressive Multi-Level Feature Fusion network with Global–Local Feature Enhancement for Infrared small target detection
- Research Article
- 10.1016/j.neucom.2025.131235
- Nov 1, 2025
- Neurocomputing
- Tanusree Ghosh + 1 more
Multi-level feature fusion for generalized face forgery detection
- Research Article
- 10.1080/19392699.2025.2580303
- Oct 31, 2025
- International Journal of Coal Preparation and Utilization
- Xianwu Huang + 4 more
ABSTRACT Extracting froth feature information through image segmentation technology to guide flotation production is crucial for optimizing and precisely controlling the coal purification process. However, various production factors degrade the quality of froth images obtained during flotation, which often suffer from large amounts of noise, blurry boundaries, and strong adhesion. Traditional segmentation methods are sensitive to noise, typically focus only on local pixel information, and lack an understanding of global image features, making it difficult to effectively segment froth edges and details. To address these challenges, we propose a Multi-Scale Residual Network (MSRNet) for froth image segmentation. The method uses an improved U-shaped network architecture with a fifth-order residual structure, the Residual-Conv Module, as the backbone network. By stacking residual blocks, the network depth is extended to enhance the model's learning capability. Additionally, a Multi-Scale Attention Residual Module is integrated to better capture multi-scale information and focus on key channel features while extracting global features. MSRNet also incorporates a Multi-Scale Dual Attention Module to connect the two sides of the U-shaped architecture, enhancing multi-level feature extraction capabilities. Experimental results demonstrate that the proposed MSRNet model effectively segments coal slurry froth images, laying a foundation for the subsequent development of an intelligent reagent control system for the flotation production process.
- Research Article
- 10.1088/2631-8695/ae15d3
- Oct 30, 2025
- Engineering Research Express
- Shihao Hao Gu + 5 more
Abstract In dynamic environments, moving objects introduce unstable features that significantly degrade the accuracy of simultaneous localization and mapping (SLAM) systems. To address this issue, we propose Neural-KF, a robust visual SLAM framework that integrates three key modules: (1) a modified SuperPoint network with multi-level feature fusion for reliable static keypoint extraction, (2) a YOLOv8-based dynamic object detector, and (3) a Kalman-consistent state estimation mechanism that predicts object motion trajectories to enhance temporal consistency. By associating predicted and detected bounding boxes via the Hungarian algorithm, Neural-KF achieves accurate suppression of dynamic points while preserving sufficient static features for pose estimation. Experimental evaluations on public datasets, including KITTI and EuRoC, demonstrate that Neural-KF improves absolute trajectory error by up to 28% compared to VINS-Fusion and achieves competitive accuracy against advanced dynamic SLAM systems such as DynaSLAM. Furthermore, the system maintains real-time performance (>30 FPS) with a balanced trade-off between accuracy and computational cost. These results highlight the effectiveness of Neural-KF in achieving robust and efficient visual odometry under challenging dynamic conditions.
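The association of Kalman-predicted and YOLO-detected bounding boxes via the Hungarian algorithm can be sketched with an IoU-based cost matrix and SciPy's solver; the gating threshold below is an illustrative value, not a parameter reported in the paper.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def associate(predicted, detected, min_iou=0.3):
    """Match Kalman-predicted boxes to detections by maximizing total IoU."""
    cost = np.array([[1.0 - iou(p, d) for d in detected] for p in predicted])
    rows, cols = linear_sum_assignment(cost)
    # Keep only matches whose overlap exceeds the gating threshold.
    return [(r, c) for r, c in zip(rows, cols) if 1.0 - cost[r, c] >= min_iou]
```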
- Research Article
- 10.1088/2057-1976/ae12f8
- Oct 29, 2025
- Biomedical Physics & Engineering Express
- Xiao Tian + 6 more
Multi-modal learning leverages complementary information from different modalities to enhance medical image segmentation. However, existing methods often require large-scale, high-quality annotations, which are scarce in clinical practice, and suffer from anatomical misalignment between modalities. We propose MHPC-Net, a Manhattan Hybrid-attentive Prototype-aligned Cross-modal Network for semi-supervised multi-modal segmentation. MHPC-Net uses a dual-branch design to integrate CT and MRI information, producing anatomically consistent and complementary outputs under limited supervision. A feature interaction module combines spatial weights derived from Manhattan distance with dual-modal cross-attention to enhance inter-modal exchange while preserving fine anatomical details, mitigating modality misalignment. A multi-level feature fusion module achieves deep semantic integration with spatial and anatomical consistency in a lightweight manner. To address semantic inconsistency from modality-specific traits and channel misalignment, a modality contrast strategy projects modality-invariant representations and modality-specific knowledge into distinct spaces. A feature decorrelation loss enforces independence between them, preserving complementary information. A prototype alignment mechanism with a memory bank further refines structure consistency and aligns modality-invariant representations for robust cross-modal representation learning. Experiments on cardiac and abdominal segmentation show that MHPC-Net achieves state-of-the-art performance under limited labels, improving accuracy and generalization in semi-supervised multi-modal scenarios.
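The feature decorrelation loss that separates modality-invariant from modality-specific representations could be realized as a cross-covariance penalty; one plausible sketch is shown below, and the paper's exact formulation may differ.

```python
import torch

def decorrelation_loss(invariant, specific):
    """Penalize correlation between two feature sets of shape (N, D1) and (N, D2)."""
    inv = invariant - invariant.mean(dim=0, keepdim=True)
    spe = specific - specific.mean(dim=0, keepdim=True)
    n = invariant.shape[0]
    cross_cov = inv.t() @ spe / (n - 1)        # (D1, D2) cross-covariance matrix
    return (cross_cov ** 2).mean()             # drive every entry towards zero
```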
- Research Article
- 10.1088/1361-6501/ae1856
- Oct 28, 2025
- Measurement Science and Technology
- Shijie Lv + 1 more
Abstract Fabric defect detection represents a pivotal step in quality control within the textile industry. However, it confronts challenges such as high similarity between pattern backgrounds and defects, the complexity of fabric backgrounds, and the diversity of defects. To address these issues, this paper proposes a fabric defect detection method based on contrast enhancement and aggregated downsampling. Firstly, a Multi-scale Local Contrast Enhancement (MLCE) module is designed to enhance defect features and suppress the interference from complex background information by extracting significant nonlinear features from images. Secondly, a Parallel Wavelet Downsampling (PWD) module is introduced to avoid the loss of target information when extracting multi-level feature maps, and a lightweight design is adopted to reduce the computational burden of the network. To tackle the feature map misalignment issue during upsampling, a Feature Alignment Aggregation (FAA) module is proposed, which employs a learnable interpolation strategy to align cross-layer features. Experimental results demonstrate that this method achieves an mAP@50 of 83.9% in fabric defect detection tasks, representing a 7.3% improvement over the baseline model YOLOv8s, with an average inference speed of 67.8 FPS, meeting the requirements for real-time detection. Furthermore, the model exhibits good generalization ability, adapting to defect detection scenarios involving fabrics of different colors and textures. This study provides an efficient and robust solution for fabric defect detection in complex backgrounds.
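The Parallel Wavelet Downsampling idea, replacing lossy strided pooling with a decomposition that keeps all sub-bands, can be illustrated with a single-level Haar transform; the paper's PWD presumably adds parallel convolution paths on top of something like this, so the sketch is an assumption about the underlying principle only.

```python
import torch

def haar_downsample(x):
    """Single-level 2D Haar transform: halves spatial size, quadruples channels.

    x: (B, C, H, W) with even H and W
    returns: (B, 4C, H/2, W/2) containing LL, LH, HL, HH sub-bands
    """
    a = x[:, :, 0::2, 0::2]
    b = x[:, :, 0::2, 1::2]
    c = x[:, :, 1::2, 0::2]
    d = x[:, :, 1::2, 1::2]
    ll = (a + b + c + d) / 2
    lh = (a - b + c - d) / 2
    hl = (a + b - c - d) / 2
    hh = (a - b - c + d) / 2
    return torch.cat([ll, lh, hl, hh], dim=1)   # no target information is discarded
```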
- Research Article
- 10.1080/20964471.2025.2570574
- Oct 19, 2025
- Big Earth Data
- Shiying Yuan + 7 more
ABSTRACT Under strong wind conditions, floatable plastics (i.e., plastic mulching films, plastic greenhouses, plastic dust-proof nets, etc.) are prone to be blown up and hang on the power transmission lines of high-speed railways, leading to malfunctions or even train service suspensions. Therefore, timely and accurate mapping of these floatable plastics is of great significance to both railway administrations and the safety of passengers onboard. However, the potential of remote sensing has not been well demonstrated in this field. To tackle this issue, we take the Beijing-Shanghai high-speed railway, the busiest railway in China, as an example and propose a novel deep learning based semantic segmentation model to map floatable plastics from very high-resolution optical satellite imagery. Specifically, a well-annotated sample dataset of floatable plastics is prepared, consisting of three typical categories: plastic mulching films, plastic greenhouses, and plastic dust-proof nets. Afterwards, a hybrid Convolutional Neural Network-Mamba (CNN-Mamba) network is proposed, which integrates multi-scale convolutions with various local receptive fields and Mamba with global receptive fields into an end-to-end model. Within this network, the Multi-Perspective Fusion Block leverages multi-kernel convolutions to capture multi-scale local features, while the Feature Refinement Module integrates encoder-decoder multi-level features, thereby improving semantic consistency and boundary precision. Experimental results showed that the proposed model achieved high performance in floatable plastics mapping, with an mIoU of 0.8641 and an average F1-score of 0.9261. Ablation studies justify the rationality of each module in the proposed hybrid model. Besides, the proposed model also outperformed several CNN-based and Mamba-based networks, not only in floatable plastics mapping but also on two other popular land use and land cover datasets. Overall, this study provides an effective pipeline for monitoring floatable plastics along high-speed railways.
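The Multi-Perspective Fusion Block is described as multi-kernel convolutions with different local receptive fields; a minimal sketch of that idea follows, with the kernel sizes and concatenation-based fusion as assumptions rather than the paper's exact block.

```python
import torch
import torch.nn as nn

class MultiKernelFusion(nn.Module):
    """Parallel convolutions with different receptive fields, fused by a 1x1 conv."""
    def __init__(self, in_ch, out_ch, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, k, padding=k // 2) for k in kernel_sizes)
        self.fuse = nn.Conv2d(out_ch * len(kernel_sizes), out_ch, 1)

    def forward(self, x):
        multi_scale = torch.cat([branch(x) for branch in self.branches], dim=1)
        return self.fuse(multi_scale)
```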
- Research Article
- 10.26689/jera.v9i5.11990
- Oct 17, 2025
- Journal of Electronic Research and Application
- Qian Xu
This paper proposes SW-YOLO (StarNet Weighted-Conv YOLO), a lightweight human pose estimation network for edge devices. Current mainstream pose estimation algorithms are computationally inefficient and have poor feature capture capabilities for complex poses and occlusion scenarios. This work introduces a lightweight backbone architecture that integrates WConv (Weighted Convolution) and StarNet modules to address these issues. Leveraging StarNet’s superior capabilities in multi-level feature fusion and long-range dependency modeling, this architecture enhances the model’s spatial perception of human joint structures and contextual information integration. These improvements significantly enhance robustness in complex scenarios involving occlusion and deformation. Additionally, the introduction of WConv convolution operations, based on weight recalibration and receptive field optimization, dynamically adjusts feature importance during convolution. This reduces redundant computations while maintaining or enhancing feature representation capabilities at an extremely low computational cost. Consequently, SW-YOLO substantially reduces model complexity and inference latency while preserving high accuracy, significantly outperforming existing lightweight networks.
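One way to read the weight recalibration described for WConv is a learned per-channel gate applied around a convolution; the sketch below is an interpretation along those lines, not SW-YOLO's exact operator, and the reduction ratio is illustrative.

```python
import torch
import torch.nn as nn

class RecalibratedConv(nn.Module):
    """Convolution followed by a learned per-channel gate (SE-style recalibration)."""
    def __init__(self, in_ch, out_ch, reduction=4):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(out_ch, out_ch // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch // reduction, out_ch, 1), nn.Sigmoid())

    def forward(self, x):
        y = self.conv(x)
        return y * self.gate(y)    # dynamically re-weight channel importance
```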
- Research Article
- 10.1088/2057-1976/ae0483
- Oct 16, 2025
- Biomedical Physics & Engineering Express
- Chaozhi Yang + 5 more
Purpose. Cerebrovascular segmentation is crucial for the diagnosis and treatment of cerebrovascular diseases. However, accurately extracting cerebral vessels from Time-of-Flight Magnetic Resonance Angiography (TOF-MRA) remains challenging due to topological complexity and anatomical variability. Methods. This paper presents a novel Y-shaped segmentation network with fast Fourier convolution and Mamba, termed F-Mamba-YNet. The network employs a dual-encoder architecture that effectively leverages the complementarity of the spectral and spatial domains to fuse multi-level features. The spectral encoder features the Fast Fourier Convolution Module, which captures high-frequency changes in vessel edges, improving segmentation completeness and connectivity. The spatial encoder incorporates a Spatial Mamba Module, which captures long-range dependencies while enhancing the spatial feature representation of cerebral vessels. Additionally, a Multi-scale Feature Selection Module in the decoder adaptively enhances discriminative features, enabling improved feature reuse. Results. Experiments demonstrate that the proposed F-Mamba-YNet achieved Dice Similarity Coefficients (DSC) of 86.28% and 72.24% on the IXI-A-SegAN and MIDAS datasets, respectively. Conclusions. Compared with existing algorithms, F-Mamba-YNet provided more connected and continuous segmentation results and achieved competitive generalization performance.
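The spectral path of a fast Fourier convolution applies a convolution in the frequency domain so that every output position has a global receptive field; a minimal PyTorch sketch of that path follows, with the channel layout, pointwise kernel, and normalization as assumptions rather than the module used in F-Mamba-YNet.

```python
import torch
import torch.nn as nn

class SpectralConv(nn.Module):
    """Pointwise convolution applied to the real/imaginary parts of a 2D FFT."""
    def __init__(self, channels):
        super().__init__()
        # Real and imaginary parts are stacked along the channel axis.
        self.conv = nn.Conv2d(channels * 2, channels * 2, 1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):                                     # x: (B, C, H, W)
        spec = torch.fft.rfft2(x, norm="ortho")               # complex (B, C, H, W//2+1)
        y = torch.cat([spec.real, spec.imag], dim=1)
        y = self.act(self.conv(y))
        real, imag = torch.chunk(y, 2, dim=1)
        return torch.fft.irfft2(torch.complex(real, imag), s=x.shape[-2:], norm="ortho")
```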
- Research Article
- 10.1186/s12859-025-06272-4
- Oct 15, 2025
- BMC Bioinformatics
- Siqi Chen + 3 more
Background. The identification of protein-protein interaction (PPI) plays a crucial role in understanding the mechanisms of complex biological processes. Current research in predicting PPI has shown remarkable progress by integrating protein information with PPI topology structure. Nevertheless, these approaches frequently overlook the dynamic nature of protein and PPI structures during cellular processes, including conformational alterations and variations in binding affinities under diverse environmental circumstances. Additionally, the insufficient availability of comprehensive protein data hinders accurate protein representation. Consequently, these shortcomings restrict the model's generalizability and predictive precision. Results. To address this, we introduce DCMF-PPI (Dynamic condition and multi-feature fusion framework for PPI), a novel hybrid framework that integrates dynamic modeling, multi-scale feature extraction, and probabilistic graph representation learning. DCMF-PPI comprises three core modules: (1) PortT5-GAT Module: the protein language model PortT5 is utilized to extract residue-level protein features, which are integrated with dynamic temporal dependencies; graph attention networks are then employed to capture context-aware structural variations in protein interactions; (2) MPSWA Module: employs parallel convolutional neural networks combined with wavelet transform to extract multi-scale features from diverse protein residue types, enhancing the representation of sequence and structural heterogeneity; (3) VGAE Module: utilizes a Variational Graph Autoencoder to learn probabilistic latent representations, facilitating dynamic modeling of PPI graph structures and capturing uncertainty in interaction dynamics. Conclusion. We conducted comprehensive experiments on benchmark datasets demonstrating that DCMF-PPI outperforms state-of-the-art methods in PPI prediction, achieving significant improvements in accuracy, precision, and recall. The framework's ability to fuse dynamic conditions and multi-level features highlights its effectiveness in modeling real-world biological complexities, positioning it as a robust tool for advancing PPI research and downstream applications in systems biology and drug discovery.
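The VGAE module learns probabilistic latent node representations; a minimal sketch of the standard VGAE encoder it presumably builds on is shown below, with illustrative dimensions and a dense normalized adjacency for simplicity rather than the framework's actual graph layers.

```python
import torch
import torch.nn as nn

class VGAEEncoder(nn.Module):
    """Minimal VGAE-style encoder: two GCN layers produce mean and log-variance."""
    def __init__(self, in_dim, hidden_dim=64, latent_dim=32):
        super().__init__()
        self.w1 = nn.Linear(in_dim, hidden_dim, bias=False)
        self.w_mu = nn.Linear(hidden_dim, latent_dim, bias=False)
        self.w_logvar = nn.Linear(hidden_dim, latent_dim, bias=False)

    def forward(self, x, adj_norm):
        # adj_norm: symmetrically normalized adjacency with self-loops, shape (N, N)
        h = torch.relu(adj_norm @ self.w1(x))
        mu, logvar = adj_norm @ self.w_mu(h), adj_norm @ self.w_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterization
        return z, mu, logvar

# Edge probabilities are then decoded as sigmoid(z @ z.T), the standard inner-product decoder.
```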