Articles published on knowledge-distillation
4488 Search results
- Research Article
- 10.1016/j.asoc.2026.114897
- May 1, 2026
- Applied Soft Computing
- Karthick Sharma + 1 more
Data-centric single teacher guided knowledge distillation for alleviating sub-optimal supervision in image classification
- Research Article
- 10.1016/j.knosys.2026.115713
- May 1, 2026
- Knowledge-Based Systems
- Pengchen Liang + 4 more
Task-specific knowledge distillation from the vision foundation model for enhanced medical image segmentation
- Research Article
- 10.1016/j.ins.2026.123126
- May 1, 2026
- Information Sciences
- Ping Li + 3 more
Class semantics guided knowledge distillation for few-shot class incremental learning
- Research Article
- 10.1016/j.eswa.2026.131204
- May 1, 2026
- Expert Systems with Applications
- Wenming Cao + 2 more
Hierarchical joint contrastive learning with knowledge distillation for self-supervised 3D skeleton-based action recognition
- Research Article
- 10.1016/j.knosys.2026.115690
- May 1, 2026
- Knowledge-Based Systems
- Ning Li + 5 more
Context-aware knowledge distillation for anomaly detection
- Research Article
- 10.1016/j.future.2025.108253
- May 1, 2026
- Future Generation Computer Systems
- Jiali Zheng + 3 more
CFLKD: Clustered federated learning via cross-group knowledge distillation
- Research Article
- 10.1016/j.bspc.2026.109447
- May 1, 2026
- Biomedical Signal Processing and Control
- Wajid Ali + 3 more
Enhancing cancer detection with a lightweight knowledge distillation approach for Multi-Class image classification
- Research Article
- 10.1016/j.eswa.2026.131393
- May 1, 2026
- Expert Systems with Applications
- Bolei He + 3 more
D2A2: Enhancing LLM knowledge distillation efficiency and performance with difficulty-aware and adaptive distillation framework
- Research Article
- 10.1016/j.neunet.2026.109050
- Apr 30, 2026
- Neural networks : the official journal of the International Neural Network Society
- Yilong Chen + 4 more
Non-target divergence hypothesis: Toward understanding modality differences in cross-modal knowledge distillation.
- Research Article
- 10.1016/j.slast.2026.100422
- Apr 30, 2026
- SLAS technology
- Jyotirmayee Rautaray + 7 more
Retraction notice to "Leveraging FastViT Based Knowledge Distillation with EfficientNet-B0 for Diabetic Retinopathy Severity Classification" [SLAS Technology 33 (2025) 100325]
- Research Article
- 10.1007/s10994-026-07016-y
- Apr 29, 2026
- Machine Learning
- Sonakshi Garg + 1 more
Large Language Models (LLMs) have demonstrated exceptional capabilities in language understanding and generation, but their large-scale architecture poses significant challenges in deployment and inference, such as increased computational demands and slower processing times. While various techniques like model pruning, knowledge distillation, and quantization have been developed to compress LLMs, they often result in task-specific compression, limiting the model’s versatility. Additionally, LLMs face privacy risks due to their potential to memorize and reproduce sensitive training data, raising concerns when deployed in real-world applications. To address these challenges, we propose a novel methodology, PrunePrivyTune, that combines efficient model compression with privacy-preserving fine-tuning. Our approach leverages pairwise cosine similarity to identify redundant layers in transformer models, enabling structural pruning that reduces model size without compromising performance. After pruning, we apply Low-Rank Adaptation (LoRA) with DPSGD to fine-tune the model. This ensures that the fine-tuning process is both efficient and privacy-preserving, outperforming training and preventing the model from memorizing sensitive data. We then generated synthetic data using the fine-tuned model and conducted a training data extraction attack to assess the model’s privacy vulnerabilities in terms of perplexity and BERTScore. Our framework demonstrates that the proposed methodology effectively reduces inference time through model compression, and that pruning complements privacy when followed by private fine-tuning. Additionally, our privacy risk assessment indicates that integrating DP successfully mitigates the risk of the model’s memorization. This approach upholds strong privacy guarantees, making it highly suitable for real-time applications and deployment in sensitive domains where data confidentiality is paramount.
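The layer-redundancy idea in this abstract can be illustrated with a minimal sketch. It is not the authors' PrunePrivyTune implementation: the function names, the use of one flat vector per layer, and the 0.99 threshold are all assumptions made for illustration. It only shows how pairwise cosine similarity could flag layers whose outputs point in nearly the same direction.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)

def redundant_layer_pairs(layer_outputs, threshold=0.99):
    """Return (i, j, similarity) for every layer pair at or above threshold.

    layer_outputs: one flat vector per layer (hypothetical representation,
    e.g. hidden states averaged over a calibration batch).
    threshold: illustrative cutoff, not a value taken from the paper.
    """
    pairs = []
    for i in range(len(layer_outputs)):
        for j in range(i + 1, len(layer_outputs)):
            sim = cosine_similarity(layer_outputs[i], layer_outputs[j])
            if sim >= threshold:
                pairs.append((i, j, sim))
    return pairs

# Toy example: layer 2 has the same direction as layer 1 (scaled by 2).
layer_outputs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 2.0, 0.0],
]
pairs = redundant_layer_pairs(layer_outputs)
```

In this toy input only layers 1 and 2 are flagged. How a flagged pair translates into an actual pruning decision is left open here, since the abstract does not specify it.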
- Research Article
- 10.1186/s12870-026-08818-x
- Apr 29, 2026
- BMC plant biology
- Huiling Jiang + 6 more
Plant diseases threaten global agriculture, and deep learning-based disease recognition has become crucial for addressing this challenge. While DenseNet excels in plant disease classification due to its dense connectivity, its large size limits deployment on resource-constrained edge devices. This paper proposes Connection-Aware DenseNet Pruning (CADP), achieving efficient compression through three collaborative modules. First, the EdgePrune module explicitly models inter-channel feature flows via an edge weight network, using dual-channel importance scoring that fuses activation correlation and gradient information to remove redundant connections while preserving critical propagation paths. Second, connection-guided CP decomposition leverages EdgePrune's importance information, adaptively assigning differentiated ranks through the Connection Importance Index (CII) to balance preservation of critical layers with deep compression of secondary layers. Third, dual-stream knowledge distillation integrates throughout post-pruning and post-decomposition fine-tuning, combining output-level soft labels and intermediate spatial attention transfer to recover compression losses. CADP achieves 88% parameter reduction and 89% computational savings on DenseNet-121, maintaining 99.67% and 99.66% accuracy on PlantVillage and RiceLeaf datasets, achieving competitive accuracy with significantly fewer parameters. This provides a promising approach for resource-constrained deployment with potential generalizability and practical value.
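The output-level soft-label stream mentioned in this abstract can be sketched with the standard temperature-scaled distillation loss. This is a generic formulation, not the CADP code: the function names and the temperature of 4.0 are assumptions, and the intermediate spatial attention transfer stream is not covered.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax with max-subtraction for stability."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def soft_label_kd_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    The T^2 factor is the usual convention for keeping gradient magnitudes
    comparable across temperatures. The default temperature is illustrative.
    Assumes moderate logit gaps so no student probability underflows to 0.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * math.log(pt / ps)
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )
    return temperature ** 2 * kl

# Toy logits for a three-class problem.
teacher = [2.0, 1.0, 0.1]
student = [0.5, 0.5, 0.5]
loss_mismatch = soft_label_kd_loss(student, teacher)
loss_match = soft_label_kd_loss(teacher, teacher)
```

The loss is zero when the student reproduces the teacher's logits and positive otherwise. In practice this term is usually combined with a hard-label loss, with a weighting that the abstract does not state.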
- Research Article
- 10.1063/5.0324705
- Apr 28, 2026
- The Journal of chemical physics
- Feranmi V Olowookere + 4 more
Molecular dynamics simulations are an integral tool for studying the atomistic behavior of materials under diverse conditions. However, they can be computationally demanding in wall-clock time, especially for large systems, which limits the time and length scales accessible. Coarse-grained (CG) models reduce computational expense by grouping atoms into simplified representations commonly called beads, but sacrifice atomic detail and introduce mapping noise, complicating the training of machine-learned surrogates. Moreover, because CG models inherently include entropic contributions, they cannot be fit directly to all-atom (AA) energies, leaving instantaneous, noisy forces as the only state-specific quantities available for training. Here, we apply a knowledge distillation framework by first training an initial CG neural network potential (the teacher) solely on AA-mapped forces to denoise those labels, then distill its force and energy predictions to train refined CG models (the student) in both single- and ensemble-training setups while exploring different force and energy target combinations. We validate this framework on a complex molecular fluid, a deep eutectic solvent, by evaluating two-, three-, and many-body properties and comparing the CG and AA results. Our findings demonstrate that training a student model on ensemble teacher-predicted forces and per-bead energies improves the quality and stability of CG force fields.
- Research Article
- 10.3390/systems14050476
- Apr 28, 2026
- Systems
- Jiangchuan Liu + 3 more
As a common brain-computer interface (BCI) paradigm, electroencephalogram (EEG)-based motor imagery provides a critical pathway for both assistive technology (restoring communication and control) and active rehabilitation (promoting neural plasticity and functional recovery). Domain adaptation has been shown to effectively enhance the decoding performance of motor intentions for target subjects by leveraging labeled data from source subjects. However, EEG data from source subjects often contain extensive personal information, and direct access to source EEG data easily leads to privacy leakage. An important research topic is therefore to achieve domain adaptation without directly accessing the source subjects’ raw data. To address this challenge, a privacy-preserving source-free domain adaptation framework, termed Transformer-based SFDA with Class-balanced Multicentric Dynamic Pseudo-labeling (T-CMDP), is proposed for cross-subject motor-imagery EEG classification. This framework consists of three coupled stages. In the source model training stage, a Transformer-based encoder combined with Riemannian manifold-aware feature extraction is employed to learn transferable and discriminative EEG feature representations. In the source-free target adaptation stage, only the pretrained source model is transferred to the target domain and adapted through knowledge distillation and information maximization, without accessing raw source EEG data. In the self-supervised learning stage, class-balanced multicentric prototypes and high-confidence pseudo-label updates are introduced to progressively refine the target-domain decision boundaries. Extensive experiments on three motor-imagery EEG datasets demonstrate that the proposed T-CMDP framework consistently outperforms eleven representative baselines from traditional machine learning, deep learning, and source-free transfer approaches, achieving average accuracies of 56.85%, 76.34%, and 74.49%, respectively. These results indicate that T-CMDP effectively alleviates inter-subject EEG distribution discrepancies and preserves the privacy of source subjects, thereby facilitating more reliable and practical deployment of EEG-based BCI systems.
- Research Article
- 10.1007/s11831-026-10598-4
- Apr 27, 2026
- Archives of Computational Methods in Engineering
- A Aruna Gladys + 5 more
Building Expert Small Models: A Comprehensive Survey of Model Compression, Knowledge Distillation, and Augmented Inference
- Research Article
- 10.1088/1361-6501/ae5df5
- Apr 24, 2026
- Measurement Science and Technology
- Ma Xiumin + 5 more
Augmenting gas turbine health monitoring: invariant feature extraction from unlabeled blade temperature profiles via GAF and knowledge distillation
- Research Article
- 10.3390/machines14050476
- Apr 24, 2026
- Machines
- Liyuan Yu + 4 more
Machine condition monitoring increasingly depends on distributed sensing, edge intelligence, and cloud analytics, yet timely and trustworthy health assessment remains constrained by latency, bandwidth, privacy, and reliability requirements. Cloud-only architectures provide scalable computation and historical data integration but often fail to satisfy real-time industrial needs, whereas edge-only deployments are limited by restricted computing resources and fragmented local knowledge. Edge–cloud collaboration has, therefore, emerged as a practical architecture for distributing perception, inference, learning, and coordination across hierarchical industrial systems. This review examines 147 publications on edge–cloud collaboration for machine condition monitoring published between 2019 and February 2026. A four-dimensional taxonomy is developed to organize the literature into model-centric, data-centric, resource and task-centric, and architecture and trust-centric mechanisms, while 13 survey and review papers are considered separately for contextual comparison. On this basis, the review analyzes representative collaboration mechanisms and enabling technologies, with particular attention to federated learning, transfer learning, knowledge distillation, digital twins, and deep reinforcement learning, and surveys their deployment in manufacturing, energy, transportation, and infrastructure monitoring scenarios. The literature remains dominated by model-centric collaboration, while architecture and trust-centric studies increasingly provide the system foundations required for practical deployment. 
The review further identifies major open challenges, including robust generalization under changing operating conditions, efficient data transmission, real-time resource coordination, interoperability, and trustworthy large-scale deployment, and outlines future directions in foundation-model-based edge–cloud collaboration, continual learning, dual digital twins, trustworthy collaboration, and privacy-preserving industrial ecosystems.
- Research Article
- 10.1007/s11227-026-08525-2
- Apr 24, 2026
- The Journal of Supercomputing
- Alireza Taremi + 3 more
Improved detection of abnormal cervical tissue growth using cascade knowledge distillation based deep learning
- Research Article
- 10.1007/s42452-026-08618-w
- Apr 24, 2026
- Discover Applied Sciences
- Mahaswetha S + 1 more
Segmentation-guided knowledge distillation framework for interpretable breast cancer classification
- Research Article
- 10.3390/ani16091301
- Apr 23, 2026
- Animals : an Open Access Journal from MDPI
- Jie Hu + 7 more
Cattle behavior constitutes important phenotypic information reflecting animals' health status, activity level, and welfare condition, and is therefore of considerable significance for automated monitoring and precision management in smart livestock farming. However, under complex barn conditions, cattle behavior recognition is easily affected by factors such as illumination variation, partial occlusion, background interference, and individual differences, thereby reducing recognition stability and generalization capability. To address these challenges, this study proposes a pose-driven method for cattle behavior recognition in complex barn environments. First, a 16-keypoint annotation scheme suitable for describing bovine posture, termed cow16, was constructed. Based on this scheme, OpenPose was employed to extract heatmaps (HMs) and part affinity fields (PAFs), which were then used to build an intermediate HM/PAF posture representation. Subsequently, this representation was taken as the input to a lightweight convolutional neural network for classifying three behavioral categories: stand, walk, and lying. On this basis, class-imbalance correction during training and a multi-random-seed logits ensemble strategy during inference were further introduced. In addition, knowledge distillation was adopted to transfer knowledge from a high-performance teacher model to a lightweight student model. Experimental results demonstrate that training-stage class-imbalance correction and inference-stage multi-random-seed logits ensembling exhibit strong complementarity; when combined, the AB configuration improves the test-set Macro-F1 by 3.83 percentage points. Moreover, the distilled student model still achieves competitive recognition performance while maintaining 1× inference cost, indicating a favorable trade-off between accuracy and efficiency. 
This study provides a useful reference for deployment-oriented cattle behavior recognition in smart farming scenarios and offers a lightweight technical basis for subsequent practical applications.