Articles published on Attention network
13558 Search results
- New
- Research Article
- 10.1007/s11222-025-10801-9
- Jan 18, 2026
- Statistics and Computing
- Kesen Wang + 1 more
Abstract In recent decades, statisticians have been increasingly encountering spatial data that exhibit non-Gaussian behaviors such as asymmetry and heavy-tailedness. As a result, the assumptions of symmetry and fixed tail weight in Gaussian processes have become restrictive and may fail to capture the intrinsic properties of the data. To address the limitations of the Gaussian models, a variety of skewed models have been proposed, and their popularity has grown rapidly. These skewed models introduce parameters that govern skewness and tail weight. Among various proposals in the literature, unified skewed distributions, such as the Unified Skew-Normal (SUN), have received considerable attention. In this work, we revisit a more concise and interpretable re-parameterization of the SUN distribution and apply the distribution to random fields by constructing a generalized unified skew-normal (GSUN) spatial process. We demonstrate that the GSUN is a valid spatial process by showing its vanishing correlation at large distances and provide the corresponding spatial interpolation method. In addition, we develop an inference mechanism for the GSUN process using the concept of neural Bayes estimators with deep graphical attention networks (GATs) and an encoder transformer. We show the superiority of our proposed estimator over conventional CNN-based architectures in terms of stability and accuracy by means of a simulation study. In addition, we demonstrate that the GSUN process offers enhanced flexibility compared to another model proposed in the literature through an application to Pb-contaminated soil data. Furthermore, we show that the GSUN process is different from the conventional Gaussian processes and Tukey g-and-h processes, through the probability integral transform (PIT).
- New
- Research Article
- 10.1016/j.braindev.2026.104502
- Jan 17, 2026
- Brain & development
- Xinyu Yang + 3 more
Neurocognitive profiles of attentional networks in children with Tic disorders.
- New
- Research Article
- 10.1108/dta-06-2025-0512
- Jan 16, 2026
- Data Technologies and Applications
- Kai Kong + 4 more
Purpose The purpose of this paper is to propose a hierarchical deep reinforcement learning (H-DRL) framework for real-time tactical decision-making in team sports. The framework addresses challenges such as continuous action spaces, partial observability and adversarial environments by leveraging multi-agent collaboration, adaptive strategy optimization and explainable artificial intelligence (AI). It aims to enhance tactical accuracy, decision speed and resource efficiency while uncovering novel tactical patterns that human experts may overlook. Design/methodology/approach The study combines graph neural networks (GNNs) for spatial-temporal player interactions and transformer-based attention for strategic pattern recognition. It integrates opponent modeling via inverse reinforcement learning (IRL) and self-play. The hierarchical architecture decomposes decisions into strategic, tactical and technical levels. Experiments were conducted on professional basketball, soccer and rugby datasets, with validation by expert coaches. The framework was tested in real-world deployments, including youth academies and professional teams, to evaluate performance and tactical innovations. Findings The H-DRL framework achieved a 34.7% higher tactical accuracy, 28.3% faster decision-making and 41.2% lower resource usage compared to state-of-the-art methods. It identified 17 new tactical patterns, such as dynamic role-switching, which improved scoring efficiency by 23.6%. Real-world deployments demonstrated significant performance gains, including a 42.3% improvement in tactical decision-making for youth teams. The system's explainable AI module bridged algorithmic insights with coach expertise, fostering trust and adoption. Research limitations/implications The study is limited by its reliance on proprietary tracking data and the computational demands of real-time deployment. 
Future research could explore cross-sport generalization and integration of physiological/psychological factors. The framework's scalability to larger team sizes and more complex environments remains a challenge. These limitations highlight opportunities for advancements in model compression and hardware optimization. Practical implications The framework provides actionable insights for coaches and players, enhancing in-game decision-making and training efficiency. It enables teams to adopt data-driven tactics, such as elastic pressing in soccer or optimized phase play in rugby. The system's real-time capabilities (30.9 ms latency) make it suitable for live match analysis. Professional teams reported improved tactical understanding (91.7% of coaches) and scoring efficiency (23.6% increase). The technology is applicable beyond sports, including autonomous systems and emergency response. Social implications The study promotes the ethical use of AI in sports, emphasizing augmentation over replacement of human expertise. It fosters collaboration between coaches and AI, enhancing tactical literacy and innovation. The framework's transparency builds trust, addressing concerns about black-box AI. By uncovering counterintuitive strategies, it challenges traditional coaching paradigms and encourages continuous learning. The technology's broader applications (e.g. military, robotics) underscore its societal impact. Originality/value This paper presents a hierarchical DRL framework for real-time tactical decision-making in team sports, integrating GNNs, transformers and IRL. Its dual-stream architecture and adaptive computation are novel contributions. The system's ability to discover and explain tactical innovations (e.g. dynamic role-switching) sets it apart from prior work. The rigorous validation across multiple sports and real-world deployments demonstrates its practical value. 
The study advances multi-agent AI, offering scalable solutions for complex, dynamic environments.
- New
- Research Article
- 10.3389/fnsys.2025.1674124
- Jan 16, 2026
- Frontiers in Systems Neuroscience
- Almira Kustubayeva + 9 more
Objective Behavioral and neurological studies suggest that major depressive disorder (MDD) is associated with pervasive deficits in executive control of attention. Research using Event-Related Potentials (ERPs) to investigate attentional impairments in depression has provided mixed results. The current study aimed to clarify abnormalities in ERPs associated with depression through use of the Attention Network Test (ANT) which assesses efficiency of three fundamental brain networks: executive control, alerting, and orienting. Methods Participants were 93 volunteers. We compared ERP amplitudes in healthy, subsyndromal depression, and MDD groups (31 participants per group) during performance of an extended-duration version of the ANT. Results Both N100 and P300 ERP amplitudes were generally lower in the MDD group across central-parietal and posterior sites, with medium-to-large effect sizes. There were also significant effects of depression on the ANT indices for executive control and alerting. Further analyses showed that some abnormalities in ERPs were seen in the subsyndromal group and that depression effects were stable across time, despite vigilance decrement. Conclusion Neurocognitive deficits in depression may relate to depletion of a general attentional resource.
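As a rough illustration of the three ANT indices named in this abstract, the sketch below uses the conventional subtraction scores of the standard ANT (mean reaction time in one condition minus another). The abstract does not say how this study scored its extended-duration version, so the condition names, the `ant_indices` helper, and the sample reaction times are all hypothetical.

```python
from statistics import mean

def ant_indices(rt):
    """Compute conventional ANT network scores.

    rt maps a condition name to a list of correct-trial reaction times (ms).
    The condition names are illustrative assumptions, not taken from the study.
    """
    m = {cond: mean(times) for cond, times in rt.items()}
    return {
        # Alerting: benefit of a warning cue over no cue.
        "alerting": m["no_cue"] - m["double_cue"],
        # Orienting: benefit of a spatially informative cue over a central cue.
        "orienting": m["center_cue"] - m["spatial_cue"],
        # Executive control: cost of incongruent flankers.
        "executive": m["incongruent"] - m["congruent"],
    }

# Made-up reaction times, for illustration only.
example_rt = {
    "no_cue": [600, 620],
    "double_cue": [560, 580],
    "center_cue": [570, 590],
    "spatial_cue": [530, 550],
    "incongruent": [650, 670],
    "congruent": [560, 580],
}
scores = ant_indices(example_rt)
```

Larger alerting and orienting scores indicate a bigger benefit from cues, while a larger executive score indicates more interference from conflicting flankers.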
- New
- Research Article
- 10.3390/app16020946
- Jan 16, 2026
- Applied Sciences
- Liang Dong + 3 more
Infrared and visible image fusion improves the visual representation of scenes. Current deep learning-based fusion methods typically rely on either convolution operations for local feature extraction or Transformers for global feature extraction, often neglecting the contribution of multi-scale features to fusion performance. To address this limitation, we propose MRMAFusion, a nested connection model that relies on the multi-scale restoration-Transformer (Restormer) and multi-dimensional attention. We construct an encoder–decoder architecture on UNet++ network with multi-scale local and global feature extraction using convolution blocks and Restormer. Restormer can provide global dependency and more comprehensive attention to texture details of the target region along the vertical dimension, compared to extracting features by convolution operations. Along the horizontal dimension, we enhance MRMAFusion’s multi-scale feature extraction and reconstruction capability by incorporating multi-dimensional attention into the encoder’s convolutional blocks. We perform extensive experiments on the public datasets TNO, NIR and RoadScene and compare with other state-of-the-art methods for both objective and subjective evaluation.
- New
- Research Article
- 10.3389/fpsyt.2025.1722172
- Jan 16, 2026
- Frontiers in Psychiatry
- Weiqing Li + 8 more
Background Mild cognitive impairment (MCI) is a precursor state of Alzheimer’s disease (AD) and has attracted attention, but why amnestic mild cognitive impairment (aMCI) is more likely to progress to AD than non-amnestic mild cognitive impairment (naMCI) is unclear. The present study compares intra- and inter-network functional connectivity (FC) across multiple networks between aMCI and naMCI and further correlates FC with cognitive assessment scores to assess their ability to predict AD progression. Methods Resting-state functional magnetic resonance imaging (rs-fMRI) was performed in 30 naMCI and 40 aMCI cases, and 12 resting-state networks (RSNs) were identified by independent component analysis (ICA). Two-sample t-tests were performed to detect intra-network FC differences, and functional network connectivity (FNC) was calculated to compare inter-network FC differences. Subsequently, Pearson or Spearman correlation analyses were used to explore the correlation between altered FC and cognitive assessment scores. Results Compared with naMCI, aMCI showed differences within the default mode network (DMN), dorsal attention network (DAN), sensorimotor system (SMN), and salience network (SN) (FWEc-corrected, P < 0.05), and inter-network differences in DAN-DMN, DMN-SN, and SN-SMN (FWEc-corrected, P < 0.05). Conclusion Compared with naMCI, aMCI shows widespread static intra- and inter-network connectivity differences, mainly involving the DMN, DAN, SMN, and SN. These network interactions provide a powerful method for assessing and predicting why aMCI is more likely to progress to AD, and contribute to our understanding of the neurological mechanisms underlying the pathological process of AD.
- New
- Research Article
- 10.3390/rs18020299
- Jan 16, 2026
- Remote Sensing
- Yuyuan Liu + 6 more
With the deployment of mega-constellations, the proliferation of on-orbit Resident Space Objects (RSOs) poses a severe challenge to Space Situational Awareness (SSA). RSOs produce elongated and stripe-like signatures in long-exposure imagery as a result of their relative orbital motion. The accurate detection of these signatures is essential for critical applications like satellite navigation and space debris monitoring. However, on-orbit detection faces two challenges: the obscuration of dim RSOs by complex stray light interference, and their dense overlapping trajectories. To address these challenges, we propose the Shape-Aware Attention Network (SAANet), establishing a unified Shape-Aware Paradigm. The network features a streamlined Shape-Aware Feature Pyramid Network (SA-FPN) with structurally integrated Two-way Orthogonal Attention (TTOA) to explicitly model linear topologies, preserving dim signals under intense stray light conditions. Concurrently, we propose an Adaptive Linear Oriented Bounding Box (AL-OBB) detection head that leverages a Joint Geometric Constraint Mechanism to resolve the ambiguity of regressing targets amid dense, overlapping trajectories. Experiments on the AstroStripeSet and StarTrails datasets demonstrate that SAANet achieves state-of-the-art (SOTA) performance, achieving Recalls of 0.930 and 0.850, and Average Precisions (APs) of 0.864 and 0.815, respectively.
- New
- Research Article
- 10.1080/17452759.2025.2611194
- Jan 16, 2026
- Virtual and Physical Prototyping
- Suk Ki Lee + 6 more
ABSTRACT In additive manufacturing (AM) processes, in-situ monitoring combined with machine learning (ML) approaches plays a crucial role in ensuring consistent product quality and preventing defects. However, existing ML methods for anomaly detection predominantly rely on correlation-based models that lack interpretability and fail to capture underlying spatiotemporal and causal dynamics. This study proposes an anomaly detection framework that integrates spatiotemporal dependency learning (STL) and Granger causality learning (GCL) through graph attention network mechanisms. The STL module enforces spatial consistency and temporal smoothness in learned feature representations, while the GCL module identifies causal relationships between historical process signatures and both historical and current parameters, and current states through attention-based causal aggregation and disentanglement techniques. By combining these complementary modules, our method achieves superior anomaly detection performance while providing interpretable insights through spatial-temporal dependency interpretation, causal disentanglement analysis, and causal attribution analysis. Experimental validation demonstrates improved detection accuracy compared to existing baselines, with attention-based mechanisms enabling the identification of specific process parameters and spatial regions contributing to anomalous behaviour. This framework facilitates proactive quality control in AM processes by bridging the gap between high-accuracy anomaly detection and practical interpretability requirements in manufacturing applications.
- New
- Research Article
- 10.3389/fnins.2025.1667360
- Jan 16, 2026
- Frontiers in Neuroscience
- Junzhuo Chen + 10 more
Background Impaired attention is a key feature of HIV-associated brain damage, and people living with HIV (PLWH) often have potential visual–auditory perceptual deficits. This study aimed to explore functional alterations in divided attention in PLWH using a parallel audio-visual spatiotemporal task with multimodal functional magnetic resonance imaging (fMRI) and to explore candidate neuroimaging markers of HIV-related attention impairment. Methods Thirty-one cognitively unimpaired PLWH and 34 healthy controls (HC) completed a divided attention task during fMRI via a modified Posner paradigm. Behavioral performance and task-related brain activation were compared between the two groups. Seed-based whole-brain functional connectivity (FC) maps were computed in resting-state fMRI (rs-fMRI) using a priori anatomical regions of interest (ROIs) from the audiovisual attention network, defined based on previous independent fMRI studies employing similar spatial–temporal attention paradigms. Results PLWH showed lower accuracy than HC. Task-related brain activation was more extensive in PLWH, including increased activation in occipital/temporal lobes, plus frontal/parietal lobes, insula, and limbic system. Using a priori anatomical regions of interest from the audiovisual attention network as seeds, PLWH exhibited increased resting-state FC between these frontal–parietal–temporal–insular regions and bilateral posterior cerebellar lobules VIII–IX, as well as with multimodal associative cortices. Within the PLWH group, percent BOLD signal change showed significant positive correlations with HIV infection duration in a subset of task-difference ROIs: 7 regions identified under spatial cueing and 13 regions identified under temporal cueing. Conclusion HIV impairs audio-visual divided attention, with fMRI revealing neural alterations in cognitively unimpaired PLWH. 
These findings suggest that task-related activation patterns and resting-state connectivity measures may serve as sensitive candidate markers of HIV-related brain involvement and help identify individuals at increased risk of cognitive decline, although longitudinal studies are needed to establish their prognostic value.
- New
- Research Article
- 10.1016/j.jep.2026.121218
- Jan 15, 2026
- Journal of ethnopharmacology
- Fengming Chen + 6 more
MAGED: Multimodal Attentive Graph learning with Gene Expression Dynamics on Knowledge Graphs for TCM Target Prediction.
- New
- Research Article
- 10.4103/atmr.atmr_100_25
- Jan 14, 2026
- Journal of Advanced Trends in Medical Research
- Abdulrhman Salem Al-Harthy + 9 more
Abstract Background: The Beers Criteria are widely used to identify potentially inappropriate medicines (PIMs) in older adults. However, they do not account for pharmacogenomic variability, which can lead to adverse drug reactions (ADRs) in genetically susceptible individuals. This study aimed to develop and validate a machine learning–driven framework that integrates pharmacogenomic data into the Beers Criteria to support personalized prescribing and reduce ADR risk. Methods: We designed a hybrid neural network combining graph attention networks (GATs) and Transformers to predict clinically relevant drug–gene associations using multi-omics, pharmacogenomic and ADR datasets. In silico quantitative trait locus mapping was performed to validate predicted interactions, complemented by experimental validation using luciferase reporter assays. Pharmacogenomic interactions confirmed through this process were proposed as contraindications for integration into the Beers Criteria. The framework was implemented using PyTorch Geometric and the Hugging Face Transformer library. Results: The GAT-Transformer model achieved high predictive performance, with an area under the receiver operating characteristic curve of 0.92, area under the precision-recall curve of 0.87 and F1 score of 0.83 for 78 high-risk Beers Criteria medications, outperforming baseline models. Amongst 14,520 candidate drug–gene pairs, 327 novel interactions were predicted (false discovery rate < 0.05) and 31 were experimentally validated, yielding a validation rate of 73.8%. Conclusion: This framework provides a scalable, evidence-based approach to integrate pharmacogenomics into PIM assessment, enhancing the Beers Criteria for genetically susceptible elderly populations. By identifying patient-specific risk factors for ADRs, it has the potential to improve medication safety and optimise geriatric pharmacotherapy outcomes.
- New
- Research Article
- 10.1038/s41386-025-02318-6
- Jan 13, 2026
- Neuropsychopharmacology : official publication of the American College of Neuropsychopharmacology
- Kathryn Biernacki + 6 more
From rest to focus: pharmacological modulation of the relationship between resting state dorsal attention network dynamics and task-based brain activation.
- New
- Research Article
- 10.1177/09544070251405042
- Jan 13, 2026
- Proceedings of the Institution of Mechanical Engineers, Part D: Journal of Automobile Engineering
- Solanki Mital Shantibhai + 1 more
Timely detection of traffic accidents is crucial for enabling rapid emergency response and minimizing road disruptions. Existing surveillance systems often struggle with accurate classification in complex environments due to limitations in processing static and dynamic features. This study presents a Traffic Vision-based Fusion Network (TVFN), enhanced by a Temporal Attentive Inception Network (TAIN), to improve accident detection accuracy and reliability. The model fuses RGB and optical flow features using a dual-feature strategy and leverages temporal attention to capture subtle motion anomalies. Evaluation was conducted on two benchmark datasets: the HWID12 dataset, comprising 12 accident categories, and the Accident Detection from CCTV Footage dataset with two classes (accident and non-accident). On the HWID12 dataset, the proposed model achieved an accuracy of 99.61%, precision of 99.62%, recall of 99.61%, and F1-score of 99.60%. Similarly, on the CCTV Footage dataset, it attained 96.9% accuracy, 96% precision, 99% recall, and 96% F1-score, outperforming existing CNN, R-CNN, and SVM-based methods. These results highlight the robustness and generalization capability of the proposed framework, offering a real-time, efficient, and reliable solution for intelligent traffic surveillance and road safety enhancement.
- New
- Research Article
- 10.1088/2631-8695/ae37cf
- Jan 13, 2026
- Engineering Research Express
- Wanxiang Li + 2 more
Abstract Reliable anomaly detection is crucial for ensuring the operational safety and reducing maintenance costs of industrial equipment. However, existing anomaly detection methods for industrial equipment are limited in their ability to fuse multi-sensor data and adapt to varying operating conditions, leading to performance degradation from redundant information and poor cross-condition generalization. To address these challenges, this paper proposes a novel anomaly detection method based on information fusion of multiple operating conditions and multiple sensor sources (MOCSS), aiming to improve detection accuracy and robustness. Firstly, the proposed method employs a residual network to extract generalized features from multi-condition, multi-sensor data. Subsequently, a multi-head shrinkage graph attention network (MSGAT) is designed to adaptively learn the contribution of each sensor and suppress redundant information, achieving deep feature fusion. Furthermore, a multi-condition discriminator and a central moment discrepancy-based feature distribution constraint are introduced to align features across different operating conditions, enhancing cross-condition generalization. Finally, a one-class classifier is designed based on support vector data description to aggregate normal features into a high-dimensional hypersphere, increase the distance between abnormal features, and achieve efficient anomaly detection. Experiments on gear and 110 kV power transformer datasets validate the effectiveness of the proposed method. The results indicate that, compared with state-of-the-art methods, MOCSS demonstrates superior detection accuracy and noise robustness, achieving average accuracy improvements ranging from 11.67% to 29.64%, highlighting its potential for application in complex multi-condition and multi-sensor scenarios.
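The final step of this abstract, enclosing normal features in a hypersphere and flagging points that fall outside it, can be sketched as follows. This is a deliberately simplified stand-in for support vector data description: the centre is fixed at the mean of the normal features and the squared radius is taken as a quantile of the training distances, whereas SVDD proper solves an optimisation problem. The function names and the `quantile` parameter are hypothetical and not taken from the paper.

```python
def sq_dist(x, center):
    """Squared Euclidean distance between a feature vector and the centre."""
    return sum((a - b) ** 2 for a, b in zip(x, center))

def fit_hypersphere(normal_feats, quantile=0.95):
    """Fit a centre and squared radius from normal-only feature vectors.

    Simplification (assumption): the centre is the feature mean and the
    squared radius is an empirical quantile of the squared distances.
    """
    n = len(normal_feats)
    dim = len(normal_feats[0])
    center = [sum(x[j] for x in normal_feats) / n for j in range(dim)]
    d2 = sorted(sq_dist(x, center) for x in normal_feats)
    radius2 = d2[min(n - 1, int(quantile * n))]
    return center, radius2

def anomaly_score(x, center, radius2):
    """Positive scores lie outside the hypersphere, negative scores inside."""
    return sq_dist(x, center) - radius2

# Toy normal features, for illustration only.
normal = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
center, radius2 = fit_hypersphere(normal)
inside = anomaly_score([0.5, 0.5], center, radius2)
outside = anomaly_score([5.0, 5.0], center, radius2)
```

In the method described above, the inputs would be the fused, condition-aligned features rather than raw sensor values.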
- New
- Research Article
- 10.1088/2057-1976/ae3763
- Jan 13, 2026
- Biomedical physics & engineering express
- Hao Yue + 2 more
Electroencephalogram (EEG)-based emotion recognition holds great potential in affective computing, mental health assessment, and human-computer interaction. However, EEG signals are non-stationary, noisy, and composed of multiple frequency bands, making direct feature learning from raw data particularly challenging. While end-to-end models alleviate the need for manual feature engineering, advancing the performance frontier of lightweight architectures remains a crucial and complex challenge for practical deployment. To address these issues, we propose LMSA-Net (Lightweight Multi-Scale Attention Network), a lightweight, interpretable, and end-to-end model that directly learns spatio-temporal features from raw EEG signals. The architecture integrates learnable channel weighting for adaptive spatial encoding, multi-scale temporal separable convolution for rhythm-specific feature extraction, and Sim Attention Module for parameter-free saliency enhancement. Our proposed LMSA-Net is evaluated on three benchmark datasets, SEED, SEED-IV, and DEAP, under subject-dependent protocols. It achieves top performance on SEED (65.53% accuracy), competitive results on SEED-IV (48.52% accuracy), and strong performance in arousal classification on DEAP, demonstrating good generalization. Ablation studies confirm the critical role of each proposed module. Frequency analysis reveals that our multi-scale temporal kernels inherently specialize in distinct EEG rhythms, validating their neurophysiological alignment. Lightweight design is evidenced by minimal parameters (7.64K) and low latency, ideal for edge deployment. Interpretability analysis further shows the model's focus on emotion-related brain regions. LMSA-Net thus delivers an efficient, interpretable, and high-performing solution. The code is available at https://github.com/rhr0411/LMSA-Net.git.
- New
- Research Article
- 10.3390/bdcc10010031
- Jan 13, 2026
- Big Data and Cognitive Computing
- Jiabin Ye + 3 more
Retrieval-Augmented Generation (RAG) is widely used for long-text summarization due to its efficiency and scalability. However, standard RAG methods flatten documents into independent chunks, disrupting sequential flow and thematic structure, resulting in significant loss of contextual information. This paper presents MOEGAT, a novel graph-enhanced retrieval framework that addresses this limitation by explicitly modeling document structure. MOEGAT constructs an Orthogonal Context Graph to capture sequential discourse and global semantic relationships—long-range dependencies between non-adjacent text spans that reflect topical similarity and logical associations beyond local context. It then employs a query-aware Mixture-of-Experts Graph Attention Network to dynamically activate specialized reasoning pathways. Experiments conducted on three public long-text summarization datasets demonstrate that MOEGAT achieves state-of-the-art performance. Notably, on the WCEP dataset, it outperforms the previous state-of-the-art Graph of Records (GOR) baseline by 14.9%, 18.1%, and 18.4% on ROUGE-L, ROUGE-1, and ROUGE-2, respectively. These substantial gains, especially the 14.9% improvement in ROUGE-L, reflect significantly better capture of long-range coherence and thematic integrity in summaries. Ablation studies confirm the effectiveness of the orthogonal graph and Mixture-of-Experts components. Overall, this work introduces a novel structure-aware approach to RAG that explicitly models and leverages document structure through an orthogonal graph representation and query-aware Mixture-of-Experts reasoning.
- New
- Research Article
- 10.36001/phmap.2025.v5i1.4340
- Jan 13, 2026
- PHM Society Asia-Pacific Conference
- Shuquan Xiao + 3 more
Aero engines are widely used in modern aviation due to their high thrust-to-weight ratio, high efficiency, and high reliability, placing greater demands on the operational safety of key components such as bearings. Traditional bearing fault diagnosis methods typically rely on vibration signals collected by a single sensor, which makes it difficult to handle challenges such as incomplete information and noise interference in industrial settings. The paper proposes an intelligent fault diagnosis model called the Time-Frequency Attention Network, which is based on a time-frequency-aware convolutional layer and a fused attention mechanism. The goal is to fully exploit the time-frequency feature information from multi-sensor signals. First, a time-frequency-aware convolutional layer is designed using a kernel function constrained by the Short-Time Fourier Transform, leveraging a complex-valued convolution structure to effectively extract non-stationary features and local instantaneous frequency variations. Subsequently, a fused attention module is constructed, introducing a dual-attention mechanism in both channel and spatial dimensions to adaptively adjust the response intensity and frequency-domain focus areas of different sensor signals. The proposed network is experimentally validated on the Harbin Institute of Technology bearing dataset, achieving an accuracy of 99.54%. The results demonstrate that the proposed method outperforms existing benchmark models in terms of fault recognition accuracy and robustness, showcasing excellent diagnostic performance and generalization ability.
- New
- Research Article
- 10.3390/math14020289
- Jan 13, 2026
- Mathematics
- Pulikandala Nithish Kumar + 2 more
Accurate volatility forecasting is essential for risk management in increasingly interconnected financial markets. Traditional econometric models capture volatility clustering but struggle to model nonlinear cross-market spillovers. This study proposes a Temporal Graph Attention Network (TemporalGAT) for multi-horizon volatility forecasting, integrating LSTM-based temporal encoding with graph convolutional and attention layers to jointly model volatility persistence and inter-market dependencies. Market linkages are constructed using the Diebold–Yilmaz volatility spillover index, providing an economically interpretable representation of directional shock transmission. Using daily data from major global equity indices, the model is evaluated against econometric, machine learning, and graph-based benchmarks across multiple forecast horizons. Performance is assessed using MSE, R², MAFE, and MAPE, with statistical significance validated via Diebold–Mariano tests and bootstrap confidence intervals. The study further conducts a strict expanding-window robustness test, comparing fixed and dynamically re-estimated spillover graphs in a fully out-of-sample setting. Sensitivity and scenario analyses confirm robustness across hyperparameter configurations and market regimes, while results show no systematic gains from dynamic graph updating over a fixed spillover network.
- New
- Research Article
- 10.1088/1361-6501/ae378e
- Jan 13, 2026
- Measurement Science and Technology
- Yongbing Zhou + 4 more
Abstract Deep learning models have been developed and applied to printed circuit board (PCB) surface defect detection. However, existing models still struggle to balance detection accuracy and model complexity. To address this issue, a lightweight defect inspection network (LDINet) is proposed to achieve accurate and efficient detection with reduced model parameters. Firstly, to improve the ability to express tiny defects on the PCB surface, an improved MobileNetV3, based on a lightweight and efficient channel attention network (ECANet), is adopted as the model backbone to achieve more efficient feature extraction with fewer parameters. Subsequently, to utilize both the deep semantic features and the shallow fine-grained features extracted by the backbone and to enhance detection capability under notable scale changes, a novel cross-scale feature fusion module (ICCFM) combining reparameterization and the ECANet attention mechanism is designed, which effectively integrates context information and detail features and captures richer small-defect features. Finally, comparative experiments on public and industrial datasets demonstrate that the proposed model is better suited for deployment in PCB production line inspection devices with limited computing resources.
- New
- Research Article
- 10.36001/phmap.2025.v5i1.4505
- Jan 13, 2026
- PHM Society Asia-Pacific Conference
- Jaewoong Choi + 2 more
Dual‑fuel (DF) marine engines, capable of operating on both diesel and LNG, face significant monitoring challenges due to frequent mode switching, dual valve timing, and load variability, which create nonlinear, time‑varying dependencies among sensors. Such dynamics undermine conventional time‑series anomaly detection methods that overlook structural relationships. To address this, we propose a graph‑based anomaly detection framework tailored for DF engine monitoring. Sensor readings are modeled as nodes, with edges encoding domain‑informed physical or functional dependencies. A multi‑head Graph Attention Network (GAT)–based overcomplete autoencoder captures both local sensor behavior and global structural patterns; the expanded latent space preserves fine‑grained features and heightens sensitivity to subtle deviations. The encoder aggregates context‑aware features, and the decoder ensures graph‑consistent reconstruction. Anomalies are scored using a λ‑weighted combination of node‑level reconstruction error (RMSE) and graph‑level structural inconsistency from Graph Laplacian Smoothness (GLS). The λ parameter is optimized post hoc on validation data via F1‑score, balancing sensitivity and precision. Evaluation on ten months of DF engine data demonstrates interpretable, real‑time fault detection and sensor‑level localization, supporting practical, condition‑based maintenance.
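The λ-weighted anomaly score described in this abstract can be sketched under explicit assumptions. The abstract does not give the formula, so the code below assumes a convex combination `lam * RMSE + (1 - lam) * GLS`, an unweighted sensor graph, and a Laplacian smoothness term evaluated on the reconstruction residual. All function names and the example graph are hypothetical.

```python
import math

def rmse(x, x_hat):
    """Node-level reconstruction error over all sensors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x))

def laplacian_smoothness(signal, edges):
    """Quadratic form s^T L s for an unweighted, undirected graph.

    With each edge listed once, this equals the sum over edges of
    (s_i - s_j)^2; it is small when connected nodes carry similar values.
    """
    return sum((signal[i] - signal[j]) ** 2 for i, j in edges)

def anomaly_score(x, x_hat, edges, lam=0.5):
    """Assumed form: lam * RMSE + (1 - lam) * smoothness of the residual."""
    residual = [a - b for a, b in zip(x, x_hat)]
    return lam * rmse(x, x_hat) + (1.0 - lam) * laplacian_smoothness(residual, edges)

# Three sensors in a chain 0 - 1 - 2; the values are made up.
edges = [(0, 1), (1, 2)]
readings = [1.0, 2.0, 3.0]
perfect = anomaly_score(readings, [1.0, 2.0, 3.0], edges)
faulty = anomaly_score(readings, [1.0, 2.0, 1.0], edges)
```

In this assumed form, choosing λ as the abstract describes would amount to sweeping `lam` over a grid and keeping the value that gives the best F1-score on labelled validation data.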