Articles published on Deep Networks
23723 Search results
- New
- Research Article
- 10.1108/ir-06-2025-0222
- Mar 13, 2026
- Industrial Robot: the international journal of robotics research and application
- Ruhan He + 2 more
Purpose: This study aims to address the challenges of robotic cloth folding, stemming from complex dynamics and high degrees of freedom. While existing learning-based approaches have shown promise, they often suffer from limited generalization across diverse fabrics and require extensive real-world data. To address this gap, we propose a perception-centric strategy introducing HrcbamFolding, a dual-arm system that leverages a deep network to directly map visual inputs to the key manipulation points required. This approach simplifies the complex problems of state estimation and motion planning into a structured keypoint detection task, effectively bypassing the need for explicit physical modeling of the fabric.
Design/methodology/approach: This study proposes HrcbamFolding, which combines a multiresolution neural network with a channel–spatial attention mechanism to spotlight task-critical fabric regions, thereby enhancing visual-perception accuracy and generalization across diverse materials. It further uses a grasp-pose prediction module that translates visual inputs directly into coordinated grasping and placement actions for each arm, which reduces motion-planning errors and improves execution efficiency. The framework is trained purely in simulation on rectangular fabrics before being assessed on three multistep folding benchmarks.
Findings: This study shows that in all three tasks, HrcbamFolding achieves greater execution efficiency than baseline methods. It also delivers higher folding accuracy in two tasks while maintaining competitive performance in the third. Despite being trained only on simulated rectangular cloth, the system generalizes well to real-world manipulation of nonrectangular garments such as T-shirts and shorts, requiring only minimal fine-tuning. The demonstration video is available at: https://www.youtube.com/@Jaui-g9j.
Originality/value: This study presents a practical dual-arm folding system featuring a novel perception architecture that yields both high accuracy and exceptional sample efficiency for sim-to-real transfer. By focusing on structural feature learning via multiresolution attention and direct action prediction, HrcbamFolding advances the state-of-the-art toward generalized and data-efficient robotic fabric manipulation, offering significant value for real-world automation.
- New
- Research Article
- 10.3390/fi18030143
- Mar 11, 2026
- Future Internet
- Amir A Ghavifekr + 3 more
Reliable earthquake detection is one of the most complicated and demanding tasks in seismology. The key challenge is that detection models must be applied to a specific region, and models trained on one region may not perform as well in others. The limited datasets available for most regions of the world pose another challenge, because comprehensive, high-quality datasets are essential for developing robust earthquake detection algorithms. Despite these challenges, developing effective earthquake detection systems is critically important. This paper proposes a novel deep network, Earth–Transformer–LSTM (ETL), to estimate earthquake magnitude with high precision. The proposed method uses Transformer encoders as its first layer to extract deep features from the dataset. To obtain highly accurate results, the extracted features are used as the input to a Long Short-Term Memory (LSTM) neural network. Additionally, one-dimensional convolution is replaced by a Multi-Layer Perceptron (MLP), which performs better in the Transformer encoders’ feed-forward networks. The Turkey earthquake dataset 2000–2018 was used in this research because significant earthquakes have occurred in this region in recent years. According to the obtained results, the proposed method’s Root Mean Squared Error (RMSE) is 0.7, representing a noticeable improvement over advanced conventional models.
- New
- Research Article
- 10.3390/diagnostics16060836
- Mar 11, 2026
- Diagnostics
- Florin Mihail Filipoiu + 8 more
Background and Clinical Significance: Deep thalamic and periventricular lesions are uncommon in adults but can result in significant loss of function because of their convergence on three interdependent processes: thalamocortical state regulation, throughput of periventricular long association systems, and ventricular compartmental compliance. The resulting combination of executive control collapse, retrieval-weighted language fragility, and load-sensitive gait instability may occur early after a lesion forms an atrial/posterior horn interface, and pressure-linked autonomic symptoms may be late to develop. Screening deficits will likely be minimal and therefore underreported. Objective/Aim: To present a thalamic–atrial/posterior horn tumor case with quantified load-sensitive cognitive–language–gait dysfunction and to detail a physiology-guided, sequence-driven decompression approach emphasizing ventricular relaxation and perforator-preserving, interface-limited thalamic resection. Case Presentation: A 56-year-old female patient experienced a 3-month, rapidly progressive decline in her cognitive and language abilities. The clinical progression was not stepwise or punctuated by a single “sentinel” event. She had a moderate level of cognitive impairment consistent with both Broca’s and Wernicke’s aphasias (MoCA: 22/30) and suffered from significant interference effects and increased cost of task-switching. Her ability to generate novel responses and name objects was significantly impaired; however, she was able to repeat words and phrases appropriately. In addition, she exhibited a severe sustained attention signature and a high error rate during dual-task performance, indicating severe gait instability, although her overall global anchors were nearly neutral (GCS 15; FOUR 15/16; NIHSS 2). Nausea and vomiting occurred simultaneously with the cognitive and language decline, suggesting decreased intracranial compliance. 
MRI revealed a heterogeneous left-sided thalamic tumor extending into the posterior horn of the lateral ventricle. The tumor caused deformation of the lateral ventricle and midline displacement. The patient underwent microsurgical intervention using a physiology-conscious sequence of graded cerebrospinal fluid (CSF) equilibration and primary mechanical removal of the tumor from the ventricular system. Additionally, decompression of the thalamus was performed in a manner that was cognizant of the boundaries formed by the perforating arteries of the thalamus. Early resolution of pressure symptoms was noted postoperatively. Objective measures demonstrated significant improvement in the patient’s executive functioning, language skills, attentional errors, and dual-task performance stability. The patient remained functionally independent at discharge and at subsequent follow-up visits. Surveillance imaging did not demonstrate any evidence of tumor recurrence. Conclusions: The clinical presentation described above is supportive of a model in which the synergy between deep network damage and distortion of the posterior ventricular compartment amplifies network dysfunction. Additionally, the use of quantitative stress-phenotyping makes it possible to identify deep network pathology early in its course. Finally, the physiology-guided decompression approach that was used in this case has the potential to increase functional reserve in patients with pathology that requires millimeter transitions.
- New
- Research Article
- 10.1007/s42161-026-02161-8
- Mar 11, 2026
- Journal of Plant Pathology
- Prabahar Ravichandran + 3 more
Estimation of blast severity in rice with deep learning networks and canopy images from universal blast nursery (UBN)
- New
- Research Article
- 10.55041/ijsrem57466
- Mar 11, 2026
- International Journal of Scientific Research in Engineering and Management
- Dr N.Mahendiran + 1 more
Abstract: Machine learning (ML) and deep learning (DL) approaches have become integral parts of modern malware detection systems because of their ability to evaluate large and complex volumes of data. However, while many models have demonstrated strong detection accuracy in laboratory settings, they have significant limitations when placed in operational, security-critical environments. These limitations include a lack of interpretability, exposure to adversarial evasion, high false-positive rates, and performance that degrades over time. This paper proposes a Secure Interpretable Deep Convolutional Network (SIDCN) that incorporates interpretability into the learning process. In contrast to conventional black-box models and post-hoc methods for explaining model predictions, SIDCN co-optimizes the accuracy of malware detection and the stability of explanatory outputs. The proposed approach enforces explanation-consistency regularization, which yields stable and robust explanatory outputs under adversarial perturbations. Additionally, instability in the explanatory outputs is used as an extra signal to identify behavior that may be abnormal or evasive. Results from both an experimental analysis and real-world attack case studies demonstrate that the proposed SIDCN yields enhanced trustworthiness, robustness and operational effectiveness compared with conventional ML/DL-based malware detection systems and is, therefore, applicable within real-time security scenarios. Keywords: Malware Detection, Interpretable Deep Learning, Cybersecurity, Adversarial Attacks, Explainable AI
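The abstract does not give the SIDCN objective, so the following is only one plausible way to write a loss that co-optimizes detection accuracy and explanation stability. All symbols are assumptions introduced for illustration: f_θ is the classifier, E_θ an attribution map, δ a bounded perturbation, and λ a weighting factor.

```latex
% Illustrative only: a generic explanation-consistency objective,
% not the loss published for SIDCN.
\mathcal{L}(\theta)
  = \underbrace{\mathcal{L}_{\mathrm{cls}}\bigl(f_\theta(x),\, y\bigr)}_{\text{detection accuracy}}
  + \lambda \,
    \underbrace{\bigl\lVert E_\theta(x) - E_\theta(x + \delta) \bigr\rVert_2^2}_{\text{explanation stability}},
  \qquad \lVert \delta \rVert_\infty \le \epsilon
```

Under this reading, the second term evaluated at inference time could also serve as the instability signal the abstract mentions: a large value would flag an input whose explanation changes sharply under small perturbations.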
- New
- Research Article
- 10.3390/app16052610
- Mar 9, 2026
- Applied Sciences
- Mengxiao Cui + 2 more
Microsystem devices are widely used in key fields such as aerospace. The contaminants generated during their manufacture take diverse forms and are easily obscured by background interference, making them difficult to detect. To solve this problem, this paper proposes a surface contaminant detection transformer for microsystem devices with scale sequence feature fusion (SSFF-DETR). The model is based on the real-time detection transformer (RT-DETR) framework. Faster efficient channel attention (Faster-ECA) is constructed as the backbone network, improving both the extraction of key contaminant features and computational efficiency. Dynamic feature region collaborative attention (DFRCA), introduced at the end of the backbone network, enhances the contrast between contaminant features and the background, thereby improving the model’s ability to identify contaminants. An encoder based on scale sequence feature (SSF) and triple-branch feature fusion (TFF) is designed. By enhancing multi-scale representation, it retains the detailed features of contaminants in complex backgrounds and alleviates feature loss during transmission in deep networks. The experimental results show that, compared with the RT-DETR model, the SSFF-DETR model increases mean average precision (mAP) by 2.6%. At the same time, the computational cost in giga floating-point operations (GFLOPs) decreases by 2 G, and the parameter count falls by 0.8 M. This provides a feasible solution for the high-precision and high-efficiency automated detection of surface contaminants in microsystem devices.
- New
- Research Article
- 10.1080/19942060.2026.2637646
- Mar 9, 2026
- Engineering Applications of Computational Fluid Mechanics
- Nan Chen + 4 more
Accurate pump energy consumption forecasting in long-distance water supply systems (LWSS) is crucial for operational scheduling optimisation and demand response. However, nonlinear variations in pump parameters during long-term operation often degrade the forecasting accuracy. To address this, this study proposes a novel hybrid framework, CV-CBiLSTM-Att, which integrates Chaos Particle Swarm Optimization (CPSO)-tuned Variational Mode Decomposition (VMD) with a deep learning network, utilising hydraulic flow rate as the primary predictor to capture the intrinsic nonlinear relationship between flow dynamics and pump energy consumption. Specifically, the CPSO algorithm is employed to minimise envelope entropy, globally searching for the optimal decomposition mode number (K) and penalty factor (α). This adaptive decomposition effectively disentangles the non-stationary flow rate signal into band-limited Intrinsic Mode Functions (IMFs), avoiding mode mixing and residual noise. Subsequently, a Convolutional Neural Network (CNN) extracts local invariant features from the multiscale IMFs, while a Bidirectional Long Short-Term Memory (BiLSTM) network captures long-range temporal dependencies. Crucially, an Attention mechanism is integrated to assign adaptive weights to pivotal hidden states, thereby enhancing the model's sensitivity to peak-valley transitions. Validated against 11 benchmark models using real-world LWSS operational data, the proposed framework demonstrates superior robustness. Experimental results indicate that the CV-CBiLSTM-Att model reduces the Root Mean Square Error (RMSE) by 68.06% compared to the baseline LSTM. Further, the model exhibits exceptional distributional consistency, achieving a Nash-Sutcliffe Efficiency (NSE) of 0.972, a Kling-Gupta Efficiency (KGE) of 0.976, and a negligible systematic bias (PBIAS < 0.1%), confirming its stability in capturing peak-to-valley energy dynamics. 
These findings verify that the proposed framework offers a highly accurate and reliable approach for energy management in LWSS.
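The goodness-of-fit metrics quoted above (RMSE, NSE, KGE, PBIAS) have standard textbook definitions, sketched below in plain Python. This is not the authors' evaluation code. The KGE form shown is the original 2009 formulation, and the PBIAS sign convention (simulated minus observed) is an assumption, since references differ on it.

```python
import math
from statistics import mean, pstdev


def rmse(obs, sim):
    """Root mean square error between observed and simulated series."""
    return math.sqrt(mean((s - o) ** 2 for o, s in zip(obs, sim)))


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 minus squared error over observed variance."""
    mu = mean(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mu) ** 2 for o in obs)
    return 1.0 - sse / sst


def pearson_r(x, y):
    """Pearson linear correlation coefficient."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def kge(obs, sim):
    """Kling-Gupta Efficiency from correlation, variability ratio and bias ratio."""
    r = pearson_r(obs, sim)
    alpha = pstdev(sim) / pstdev(obs)  # variability ratio
    beta = mean(sim) / mean(obs)       # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def pbias(obs, sim):
    """Percent bias; positive means overestimation under this sign convention."""
    return 100.0 * sum(s - o for o, s in zip(obs, sim)) / sum(obs)
```

A perfect forecast gives RMSE = 0, NSE = 1, KGE = 1 and PBIAS = 0, while a forecast equal to the observed mean gives NSE = 0.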
- New
- Research Article
- 10.3389/fenrg.2026.1770773
- Mar 9, 2026
- Frontiers in Energy Research
- Xiaojuan Chen + 1 more
Correction: Grey wolf optimization–based deep echo state network for time series prediction
- New
- Research Article
- 10.3390/electronics15051108
- Mar 7, 2026
- Electronics
- Syed Mahedi Hasan + 2 more
Respiratory rate (RR) is a critical vital sign for the early detection of hypoxia and respiratory deterioration, yet its continuous monitoring remains challenging in clinical environments. Photoplethysmography (PPG) provides a non-invasive source of physiological information from which respiratory dynamics can be inferred. In this study, numerical physiological features derived from PPG data were used to comparatively evaluate multiple deep learning models for respiratory rate estimation. Fixed-length sliding windows were constructed from the dataset and used to train five neural network architectures: a Deep Feedforward Neural Network (DFNN), unidirectional and bidirectional Recurrent Neural Networks (RNN, Bi-RNN), and unidirectional and bidirectional Long Short-Term Memory networks (LSTM, Bi-LSTM). Model performance was assessed using mean absolute error (MAE), root mean squared error (RMSE), coefficient of determination (R²), and computational runtime. Results indicate that models incorporating temporal dependencies outperform the static feedforward baseline, achieving MAE values as low as 0.521 breaths/min, making them competitive with or lower than previously reported PPG-based approaches. These findings highlight the effectiveness of temporal deep learning models for respiratory rate estimation from PPG-derived numerical features and provide insight into accuracy–efficiency trade-offs relevant to real-time monitoring applications.
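The fixed-length sliding-window construction mentioned above can be sketched as follows. The window length, the step size, and the choice to pair each window with the label of its last sample are assumptions for illustration; the abstract does not state them.

```python
def sliding_windows(series, targets, window, step=1):
    """Return (window, target) pairs from a feature series.

    Each window covers `window` consecutive samples; its target is the
    label aligned with the window's last sample (an assumed convention).
    """
    pairs = []
    for start in range(0, len(series) - window + 1, step):
        end = start + window
        pairs.append((series[start:end], targets[end - 1]))
    return pairs


def mae(y_true, y_pred):
    """Mean absolute error, e.g. in breaths/min for respiratory rate."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
```

A series of six samples with `window=3` and `step=1` yields four overlapping windows, which is how a short recording can still supply many training examples to a recurrent model.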
- New
- Research Article
- 10.1109/jbhi.2026.3669366
- Mar 4, 2026
- IEEE journal of biomedical and health informatics
- Du Jiang + 6 more
High-quality medical images are the foundation of clinical diagnosis and treatment, but their quality can be degraded by imaging noise, artifacts, and uneven lighting. To address this issue, this paper proposes a novel Multi-scale Deep Residual Shrinkage Generative Adversarial Network (MDRSGAN) for unpaired medical image enhancement. Its core innovations include: (1) a customized generator with learnable channel-shared soft thresholds (DRSN-CS), which achieves hierarchical feature extraction and adaptive noise suppression; (2) a dual-core discriminator that ensures global statistical consistency and high fidelity of local structure; (3) content perception loss and lighting loss that optimize overall details and image features. MDRSGAN is validated on fundus retina, endoscope, and self-built NIR-II mouse datasets, where it outperforms five mainstream methods, for example with a 5% increase in SNR for NIR-II mouse image enhancement. Downstream retinal vessel segmentation experiments show improvements of 13% in IoU and 8% in DSC, demonstrating the clinical applicability and performance advantages of this method.
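Soft thresholding, the shrinkage operation that residual shrinkage networks build on, can be sketched in a few lines. In the network described above the threshold is learned and shared across channels; here `tau` is simply a fixed, hypothetical number passed by the caller.

```python
def soft_threshold(x, tau):
    """Shrink x toward zero by tau; values with |x| <= tau become 0."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def shrink_features(features, tau):
    """Apply the same threshold to every value (the channel-shared case)."""
    return [soft_threshold(v, tau) for v in features]
```

Small activations, which are more likely to be noise, are zeroed out, while larger ones are kept with reduced magnitude.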
- New
- Research Article
- 10.1109/tnnls.2026.3665811
- Mar 4, 2026
- IEEE transactions on neural networks and learning systems
- Xiangyu Shao + 4 more
The intrinsic memory and nonlocality that allow fractional-order calculus to capture complex dynamical behaviors also pose significant challenges for accurate modeling and stable control. This article presents a unified data-driven framework that simultaneously addresses these challenges through three key innovations. First, we propose a fractional-order deep Lagrangian network (DeLaN) with a Transformer-like structure, fPLCS-DeLaN, to learn a system's inherent fractional-order behaviors directly from uniformly sampled data. It not only enforces a fractional-order Lagrangian structure by integrating key physical priors, but also improves its ability to capture memory effects by incorporating a long-short-term convolutional self-attention mechanism. Second, we develop a hybrid network-based disturbance observer, T2F-CRNN, which combines CNN-based temporal feature extraction, hierarchical recurrence, and interval-based fuzzy inference to robustly estimate uncertainties with unknown nonuniform bounds and capture temporal dependencies. Third, we establish a fully fractional-order controller with practical finite-time convergence. It incorporates input saturation compensation and sliding mode constraints to ensure robustness and high performance. Simulations show that fPLCS-DeLaN achieves modeling errors at least one order of magnitude lower with less than a 15% increase in computational time. The proposed fractional-order controller reduces transient and steady-state tracking errors by 23.1% and 87.6%, respectively, compared to state-of-the-art controllers. Experiments on a soft manipulator platform further demonstrate consistent superiority in model learning and tracking performance.
- New
- Research Article
- 10.1038/s41746-026-02499-4
- Mar 3, 2026
- NPJ digital medicine
- Junjun Huang + 12 more
Bladder cancer is one of the most prevalent malignancies of the urinary system and is associated with high morbidity and mortality. With advances in medical image analysis, deep learning has shown promise for automated bladder cancer classification using magnetic resonance imaging (MRI). However, clinical deployment remains challenging due to substantial inter-center distributional discrepancies and limited feature discriminability between non-muscle-invasive bladder cancer (NMIBC) and muscle-invasive bladder cancer (MIBC). To address these challenges, we propose a Domain-Adaptive Deep Contrastive Network (DADCNet) for MRI-based bladder cancer classification. The proposed framework jointly incorporates source- and target-domain samples during feature learning to obtain domain-invariant yet discriminative representations, thereby improving cross-center generalization. In addition, a deep contrastive learning strategy is introduced to enhance inter-class separability and intra-class compactness, leading to more robust classification. Experiments conducted on a multi-center bladder cancer MRI dataset demonstrate that DADCNet consistently outperforms existing convolutional neural network- and Transformer-based methods, achieving an accuracy of 0.955, an F1-score of 0.955, and an area under the curve of 0.991.
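The abstract does not specify the contrastive objective used in DADCNet. As an illustration of how a contrastive term can encourage intra-class compactness and inter-class separability, here is the classic pairwise margin-based contrastive loss. The function name, the embeddings, and the margin value are all assumptions.

```python
import math


def contrastive_loss(z1, z2, same_class, margin=1.0):
    """Pairwise contrastive loss on two embedding vectors.

    Same-class pairs are pulled together (squared distance); different-class
    pairs are pushed apart until they are at least `margin` away.
    """
    d = math.dist(z1, z2)  # Euclidean distance between the embeddings
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2
```

Different-class pairs that are already farther apart than the margin contribute zero loss, so training effort concentrates on pairs that are still confusable.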
- New
- Research Article
- 10.1117/1.jbo.31.3.036002
- Mar 3, 2026
- Journal of biomedical optics
- Er Ouyang + 4 more
Light-field microscopy (LFM) is a scanning-free 3D imaging technique that is useful for observing dynamic biological systems due to its unique capability to capture both spatial and angular information from samples in a single exposure. However, LFM suffers from the spatial-angular information trade-off associated with microlens arrays, and its spatial resolution is usually unsatisfactory for fine-structure imaging. To overcome this bottleneck, we introduce a deep-learning-based image fusion technique that combines LFM images with Fourier LFM (FLFM) images. The high spatial resolution of FLFM is combined with the dense angular acquisition capability of LFM to improve 3D image reconstruction quality. The deep learning network was trained with LFM, FLFM, and epipolar plane image data. The proposed neural network employs specialized feature extraction modules for each modality, with a U-Net backbone for 3D reconstruction, and integrates a hierarchical cascade-based result-level fusion strategy to jointly optimize multimodal features. This approach significantly enhances detail preservation and depth recovery in the final output. Results obtained using a publicly available dataset of synthetic tubulins demonstrate that the proposed method outperforms state-of-the-art techniques. Quantitatively, it achieved a peak signal-to-noise ratio (PSNR) of 38.4729 and a structural similarity index measure (SSIM) of 0.9876, significantly outperforming both traditional algorithms and single-modality deep learning approaches. Furthermore, validation on a mouse brain blood vessels dataset confirms the effectiveness of the method in reconstructing biological structures, achieving a PSNR of 35.0548 and an SSIM of 0.8424. We introduce an approach that combines LFM with FLFM, providing an efficient and reliable solution for practical LFM applications. 
The deep-learning-based framework demonstrates significant potential to simultaneously accelerate imaging acquisition and enhance 3D reconstruction quality, offering further possibilities for computational microscopy.
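For reference, the PSNR figures quoted above follow the standard definition based on mean squared error, conventionally reported in decibels. The sketch below works on flattened intensity lists and assumes intensities scaled to the range [0, `max_value`]; it is not the authors' evaluation code.

```python
import math


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized signals."""
    n = len(reference)
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / n
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because the scale is logarithmic, halving the mean squared error raises PSNR by about 3 dB.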
- New
- Research Article
- 10.1016/j.inffus.2025.103748
- Mar 1, 2026
- Information Fusion
- Shuoling Zhou + 7 more
Multi-tissue deep fusion network for prediction of pulmonary metastasis in hepatocellular carcinoma
- New
- Research Article
- 10.1002/mp.70346
- Mar 1, 2026
- Medical physics
- Xiaohong Wang + 5 more
Accurate dose prediction is challenged by the lack of available training samples and the rapid evolution of radiotherapy techniques. A cross-technique transfer learning strategy was developed to predict the dose distribution for radiotherapy planning using limited training samples. Data were collected from 154 patients with nasopharyngeal carcinoma: 60 treated with intensity-modulated radiotherapy (IMRT) and 94 treated with volumetric modulated arc therapy (VMAT). The Res-U Net was selected as the base deep learning network. Cross-technique models were pretrained on the IMRT dataset and subsequently fine-tuned on VMAT data using limited samples (five and seven cases). Independent models were trained from scratch using the same limited samples, while a standard model trained on the full VMAT training set served as the reference. Model performance was evaluated on a test set using metrics including the dose-volume histogram (DVH), voxel-based mean absolute error (MAE), and the Dice similarity coefficient (DSC) of the isodose volume. The cross-technique models exhibited clinically acceptable performance with only five training samples and were comparable to the standard model (MAE deviation: 0.15%, p>0.01 after Bonferroni correction; DSC deviation: 0.11%-0.72%). Performance improved further with seven training samples (MAE deviation: 0.05%, p>0.01; DSC deviation: 0.02%-0.40%). However, the independent models trained with five or seven samples showed significantly inferior performance (five samples: MAE deviation: 1.14%, p<0.01, DSC deviation: 0.98%-2.48%; seven samples: MAE deviation: 0.50%, p<0.01, DSC deviation: 0.48%-1.05%). The cross-technique models accurately and reliably predicted the dose distribution for a new radiotherapy technique using a limited sample size.
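The Dice similarity coefficient of an isodose volume compares the regions that receive at least a given dose in the predicted and reference distributions. A minimal sketch on flattened voxel lists follows; the function names and the example dose level are hypothetical.

```python
def isodose_mask(dose, level):
    """Mark voxels whose dose is at least `level` (same unit as `dose`)."""
    return [d >= level for d in dose]


def dice(mask_a, mask_b):
    """Dice similarity coefficient between two boolean masks."""
    intersection = sum(a and b for a, b in zip(mask_a, mask_b))
    total = sum(mask_a) + sum(mask_b)
    return 1.0 if total == 0 else 2.0 * intersection / total


def isodose_dice(predicted, reference, level):
    """DSC of the isodose volumes at one dose level."""
    return dice(isodose_mask(predicted, level), isodose_mask(reference, level))
```

Repeating the calculation over several dose levels gives a range of DSC values, consistent with the ranges of DSC deviation reported above.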
- New
- Research Article
- 10.1016/j.est.2026.120366
- Mar 1, 2026
- Journal of Energy Storage
- Shilpa Dnyaneshwar Ghode + 1 more
Deep belief network based reinforcement learning for energy management in hybrid electric vehicles
- New
- Research Article
- 10.1016/j.applthermaleng.2026.129695
- Mar 1, 2026
- Applied Thermal Engineering
- Ali Mirzagoli Ganji + 3 more
Physics-informed deep operator network for real-time dynamic simulation of extruded-absorber photovoltaic/thermal collectors
- New
- Research Article
- 10.4308/hjb.33.3.739-749
- Mar 1, 2026
- HAYATI Journal of Biosciences
- Yanti Ariyanti + 9 more
Why populations persist in active volcanic zones poses a fundamental challenge to risk perception models. We propose that perceptual advantages, in which tangible and intangible benefits are filtered through local worldviews, underpin this resilience. Through a thematic analysis of an open-ended question/answer study of communities at Mount Semeru, Indonesia, we identify six core frameworks of perceptual advantages. Analysis of Multiple Response Categorical Variables (MRCV) reveals that residents of high-risk zones cognitively amplify bio-cultural and livelihood benefits, showing significantly higher odds of emphasizing BIRTH/FAMILY (lineage/kin ties, birthright, and deep social networks that create an intergenerational connection to the land) and ECONOMIC (livelihood opportunities, including agriculture and volcano-driven tourism) advantages, while safe-zone residents prioritize ambient CLIMATE-related benefits (a preferred quality of life attributed to the region’s cool, fresh air and superior air quality). Crucially, the influence of gender is context-dependent, as significant disparities in perception vanish within the high-risk zone. In this environment, the shared experience of chronic volcanic threat supersedes gender distinction to foster a "community of fate". Within this collective, a shared risk-benefit calculus and a unified identity override individual perspectives. These findings demonstrate that persistence in hazardous environments reflects an active cognitive recalibration of risk and benefit, necessitating disaster policies that integrate these perceptual realities.
- New
- Research Article
- 10.1016/j.cageo.2026.106122
- Mar 1, 2026
- Computers & Geosciences
- Yingjie Ma + 3 more
Prediction of natural gamma and neutron porosity based on waveform structures of elastic parameters using a closed-loop deep learning fusion network
- New
- Research Article
- 10.1016/j.engappai.2026.113783
- Mar 1, 2026
- Engineering Applications of Artificial Intelligence
- Jiahao Shen + 5 more
Progressive deep feature learning network based on fault-aware deformable convolution and its application in railway defect visual inspection