Related Topics

  • Knowledge Distillation Method
  • Pre-trained Model

Articles published on knowledge-distillation

4488 Search results
  • Research Article
  • Cite Count: 1
  • 10.1609/aaai.v40i9.37684
ConsistTalk: Intensity Controllable Temporally Consistent Talking Head Generation with Diffusion Noise Search
  • Mar 14, 2026
  • Proceedings of the AAAI Conference on Artificial Intelligence
  • Zhenjie Liu + 4 more

Recent advancements in video diffusion models have significantly enhanced audio-driven portrait animation. However, current methods still suffer from flickering, identity drift, and poor audio-visual synchronization. These issues primarily stem from entangled appearance-motion representations and unstable inference strategies. In this paper, we introduce ConsistTalk, a novel intensity-controllable and temporally consistent talking head generation framework with diffusion noise search inference. First, we propose an optical flow-guided temporal module (OFT) that decouples motion features from static appearance by leveraging facial optical flow, thereby reducing visual flicker and improving temporal consistency. Second, we present an Audio-to-Intensity (A2I) model obtained through multimodal teacher-student knowledge distillation. By transforming audio and facial velocity features into a frame-wise intensity sequence, the A2I model enables joint modeling of audio and visual motion, resulting in more natural dynamics. This further enables fine-grained, frame-wise control of motion dynamics while maintaining tight audio-visual synchronization. Third, we introduce a diffusion noise initialization strategy (IC-Init). By enforcing explicit constraints on background coherence and motion continuity during inference-time noise search, we achieve better identity preservation and refine motion dynamics compared to the current autoregressive strategy. Extensive experiments demonstrate that ConsistTalk significantly outperforms prior methods in reducing flicker, preserving identity, and delivering temporally stable, high-fidelity talking head videos.

  • Research Article
  • 10.1609/aaai.v40i13.38028
Asymmetric Cross-Modal Knowledge Distillation: Bridging Modalities with Weak Semantic Consistency
  • Mar 14, 2026
  • Proceedings of the AAAI Conference on Artificial Intelligence
  • Riling Wei + 5 more

Cross-modal Knowledge Distillation has demonstrated promising performance on paired modalities with strong semantic connections, referred to as Symmetric Cross-modal Knowledge Distillation (SCKD). However, implementing SCKD is severely constrained in real-world scenarios by the limited availability of paired modalities. To this end, we investigate a general and effective knowledge learning concept under weak semantic consistency, dubbed Asymmetric Cross-modal Knowledge Distillation (ACKD), aiming to bridge modalities with limited semantic overlap. Although the shift from strong to weak semantic consistency improves flexibility, it exacerbates the challenges posed by knowledge transmission costs, which we rigorously verify using optimal transport theory. To mitigate the issue, we further propose a framework, namely SemBridge, integrating a Student-Friendly Matching module and a Semantic-aware Knowledge Alignment module. The former leverages self-supervised learning to acquire semantic-based knowledge and provide personalized instruction for each student sample by dynamically selecting the relevant teacher samples. The latter seeks the optimal transport path by employing Lagrangian optimization. To facilitate the research, we curate a benchmark dataset derived from two modalities, namely Multi-Spectral (MS) and asymmetric RGB images, tailored for remote sensing scene classification. Comprehensive experiments show that our framework achieves state-of-the-art performance compared with 7 existing approaches on 6 different model architectures across various datasets.

  • Research Article
  • 10.1088/1361-6501/ae4cab
DADF: dual-attention distillation framework using adaptive feature fusion for traffic object recognition
  • Mar 13, 2026
  • Measurement Science and Technology
  • Yihong Zhang + 5 more

Training lightweight traffic object detectors with knowledge distillation (KD) is crucial for intelligent transportation systems under resource-limited conditions. However, most KD approaches still rely on fixed thresholding to generate binary masks for student feature reconstruction, disrupting semantic continuity in complex traffic scenes. In this work, a Dual Attention Distillation Framework (DADF) using Adaptive Feature Fusion is proposed for traffic object recognition. Instead of binary masks, DADF produces soft masks, expressed as softmax-normalized distributions, along both spatial and channel dimensions, thereby better preserving the continuity of semantic information. To adaptively balance spatial and channel cues, teacher feature variances are used to weight and fuse the masks into a unified attention map. A multilayer perceptron (MLP) generator then reconstructs the masked student features. Finally, the distillation process is optimized by minimizing the mean squared error (MSE) between the reconstructed and teacher features. We extensively validated the effectiveness of the DADF method across multiple datasets and detectors. On Cityscapes, it boosts YOLOv8 mAP from 41.9% to 44.1%, while cutting parameters and GFLOPs by 73.0% and 71.6%, and raising inference speed from 188.7 to 202.8 FPS. On KITTI, DADF boosts the RT-DETR mAP from 85.8% to 90.5%, even surpassing its teacher model. It also cuts parameters by 31.0%, reduces GFLOPs by 32.5%, and increases speed from 33.8 to 35.3 FPS. These results highlight DADF’s suitability for traffic measurement applications under resource constraints.
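The masking and fusion steps in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the saliency measure (mean absolute activation), the variance-ratio fusion weights, and every function name are hypothetical, and the MLP generator is omitted.

```python
import math

def softmax(values):
    """Numerically stable softmax over a flat list."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def soft_attention(teacher, eps=1e-12):
    """teacher: C channels, each a list of N flattened spatial activations.

    Returns (spatial_mask, channel_mask, fused), where fused[c][n] blends the
    two soft masks. The saliency and weighting rules are assumptions.
    """
    n_ch, n_pos = len(teacher), len(teacher[0])
    # Assumed saliency: mean absolute activation per position / per channel.
    spatial_energy = [sum(abs(teacher[c][n]) for c in range(n_ch)) / n_ch
                      for n in range(n_pos)]
    channel_energy = [sum(abs(v) for v in teacher[c]) / n_pos
                      for c in range(n_ch)]
    spatial_mask = softmax(spatial_energy)   # soft mask, sums to 1
    channel_mask = softmax(channel_energy)   # soft mask, sums to 1
    # Assumed fusion rule: weight each mask by its share of teacher variance.
    var_s, var_c = variance(spatial_energy), variance(channel_energy)
    w_s = (var_s + eps) / (var_s + var_c + 2 * eps)
    w_c = 1.0 - w_s
    fused = [[w_s * spatial_mask[n] + w_c * channel_mask[c]
              for n in range(n_pos)] for c in range(n_ch)]
    return spatial_mask, channel_mask, fused

def mse(a, b):
    """Mean squared error between two equally sized flat lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
```

In a full pipeline the fused map would weight the student features before reconstruction, and `mse` would compare the reconstructed features with the teacher's.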

  • Research Article
  • 10.3390/bioengineering13030339
Stable Longitudinal Screening of Latent Physiological Dysregulation from Psychometric Data Using Machine Learning.
  • Mar 13, 2026
  • Bioengineering (Basel, Switzerland)
  • Alin Adrian Alecu

Physiological dysregulation arising from chronic stress is a key mechanism linking psychosocial factors to long-term health outcomes, yet early identification typically relies on invasive or resource-intensive measurements. This study evaluates whether high-dimensional psychometric survey data can support scalable, non-invasive screening for latent physiological dysregulation. Using longitudinal data from the Midlife in the United States (MIDUS) Waves 2 and 3, we develop a screening-oriented modeling framework that separates longitudinal risk estimation from deployable screening model construction. Physiological targets are defined across inflammatory, metabolic, and neuroendocrine domains using three canonical allostatic load formulations. A teacher-ranking-pruning-student pipeline combines stable feature ranking, parsimony-driven dimensionality reduction, and knowledge distillation. Predictor dimensionality is reduced by more than an order of magnitude without loss of screening performance. Distilled student models consistently outperform linear, tree-based, and direct neural baselines, achieving area under the receiver operating characteristic curve values up to approximately 0.78 and substantial precision-recall lift over baseline prevalence. Longitudinal information is exploited during model development but not required at inference, enabling deployment using psychometric data alone. These findings demonstrate the feasibility of non-invasive screening for latent physiological dysregulation and provide a generalizable framework for translating longitudinal cohort data into deployable population health tools.

  • Research Article
  • 10.1109/tpami.2026.3672655
Co-Boosting++: Coupled Optimization of Data and Ensemble for One-Shot Federated Learning.
  • Mar 12, 2026
  • IEEE transactions on pattern analysis and machine intelligence
  • Xun Yang + 5 more

One-shot Federated Learning (OFL) has emerged as a promising paradigm, enabling global model training with minimal communication overhead. In OFL, the server model is usually distilled from an ensemble of pre-trained client models, while the ensemble also facilitates synthetic data generation for the knowledge distillation process. Prior works show that the performance of the final model is fundamentally tied to both the quality of the synthetic data and the ensemble. However, existing methods often optimize these two components separately, overlooking their interaction. To address this coupled optimization problem and provide a unified solution to the dual challenges of data and model heterogeneity inherent in OFL, we introduce Co-Boosting++, a novel OFL framework where synthetic data generation and ensemble construction mutually enhance each other in an iterative fashion. First, we fix the ensemble and generate hard samples in an adversarial manner. These samples are crucial for enhancing the robustness of knowledge transfer, as they challenge the model to generalize better, thereby improving the quality of the synthetic data and the subsequent distillation process. Second, leveraging these hard samples, we enhance the ensemble via a Mixture of Experts (MoE) mechanism. MoE allows dynamic adjustment of ensemble weights based on the generated hard samples, which enables the ensemble to better capture diverse and heterogeneous knowledge from client models. Furthermore, we extend Co-Boosting++ to support the simultaneous generation of multiple heterogeneous target models, enabling efficient adaptation to diverse device constraints. Extensive experiments on benchmark datasets demonstrate that Co-Boosting++ consistently outperforms state-of-the-art methods due to its coupled optimization of data and ensemble quality.
Additionally, Co-Boosting++ is highly practical in real-world model market scenarios, requiring no local training modifications, additional transmissions, or restrictions on client model architectures. Our code is available at https://github.com/rong-dai/Co-Boosting-PP.

  • Research Article
  • 10.1177/24056456261426603
Separate Reverse: A Gradient-Conflict-Free Training Framework for Multi-Exit Transformers
  • Mar 11, 2026
  • Web Intelligence
  • Lisha Gao + 6 more

Pretrained transformer models have demonstrated excellent performance on complex tasks. To improve their inference efficiency, recent studies have introduced the multi-exit mechanism, which enables early exiting through multiple intermediate classifiers. However, the deep architectures of pretrained transformers cause severe gradient conflicts during multi-exit fine-tuning, leading to degraded shallow-exit accuracy and reduced early-exit efficiency. To address this issue, we propose Separate Reverse, a multi-exit training strategy specifically designed for pretrained transformer models. The method iteratively combines reverse iterative optimization with hierarchical knowledge distillation from deeper to shallower exits. It maintains pretrained parameter integrity, enhances the representation capacity of shallow exits, and coordinates gradient updates across exits to balance the optimization of shallow and deep classifiers. Experiments on multiple GLUE benchmark datasets using BERT demonstrate that our method significantly improves shallow-exit accuracy, maintains main-exit performance, and accelerates inference for simple samples by a large margin.
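For readers unfamiliar with the multi-exit mechanism, the inference side can be sketched as below. The abstract does not state the exit criterion, so the max-softmax confidence threshold, the default value 0.9, and the function names are all assumptions made for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a flat list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def early_exit(exit_logits, threshold=0.9):
    """Return (exit_index, predicted_class) for the first confident exit.

    exit_logits is ordered from the shallowest classifier to the main exit.
    The last exit always answers, whatever its confidence.
    """
    last = len(exit_logits) - 1
    for index, logits in enumerate(exit_logits):
        probs = softmax(logits)
        confidence = max(probs)
        if confidence >= threshold or index == last:
            return index, probs.index(confidence)
```

Easy inputs leave at a shallow exit and skip the remaining layers, which is why shallow-exit accuracy determines how much inference time is actually saved.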

  • Research Article
  • 10.3390/s26061780
Toward Energy-Efficient and Low-Carbon Intrusion Detection in Edge and Cloud Computing Based on GreenShield Cybersecurity Framework.
  • Mar 11, 2026
  • Sensors (Basel, Switzerland)
  • Abdullah Alshammari

The rapid growth of edge-cloud computing infrastructures has increased the cybersecurity burden while substantially amplifying the energy use and carbon footprint of intrusion detection systems (IDSs). To address this challenge, this paper proposes GreenShield, a low-carbon cybersecurity framework that combines lightweight cryptography, energy-efficient deep learning, and carbon-conscious system optimization across distributed edge and cloud deployments. GreenShield employs a hierarchical federated learning architecture with integrated knowledge distillation and a carbon-aware scheduling controller that dynamically adjusts security response execution based on threat intensity and renewable energy availability. Extensive experiments on the UNSW-NB15 and CIC-IDS2017 datasets show that GreenShield attains 98.73% detection accuracy and is 67.4% more energy efficient than traditional deep-learning-based IDSs. Further, the proposed system reduces operational carbon emissions by up to 97.6%, equivalent to a reduction of around 2.8 kg CO2-equivalent per hour in a typical edge deployment, without undermining detection performance. These findings suggest that GreenShield is a viable and scalable option for sustainable cybersecurity that supports carbon-conscious security workflows in future edge-cloud computing architectures.

  • Research Article
  • 10.3390/computers15030184
BiteAI: Attention-Guided Distillation and Weight-Only Quantization for Compact Insect-Bite Classification
  • Mar 11, 2026
  • Computers
  • Mohamed Echchidmi + 1 more

Insect bites are a common cause of skin irritation and can contribute to disease transmission through vector-borne pathogens. Early identification of the likely biting organism can assist preliminary guidance (e.g., monitoring for warning signs, considering exposure history) and may reduce complications through timely follow-up. This paper studies a compact attention-guided learning framework for multiclass insect-bite image classification under strict storage constraints. A teacher network (BiteAI-T) based on MobileNetV3-Small is trained with spatial attention pooling to emphasize lesion-relevant regions while maintaining an efficient backbone. A lightweight depthwise-separable student (BiteAI-S) is trained using multi-level knowledge distillation that combines softened-logit matching with intermediate supervision through attention-map alignment and pooled-feature matching. Model storage is further reduced through weight-only quantization-aware training using an LSQ-inspired learnable scaling factor; BatchNorm running statistics are frozen during quantization fine-tuning to improve stability. Experiments on an eight-class dataset (ants, bed bugs, chiggers, fleas, mosquitos, no bites, spiders, ticks) show that BiteAI-T reaches 93.75% test accuracy. For deployment, we export (i) a TorchScript Lite teacher artifact (BiteAI-TLite, 2.35 MB) and (ii) a weight-only int8 student artifact (BiteAI-Sint8, 0.992 MB). Comparative results are also reported for an SVD-compressed + fine-tuned FP16 variant (92.66% test accuracy, 2.84 MB), illustrating accuracy–size trade-offs across compression strategies.
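Two of the ingredients named in this abstract, softened-logit matching and weight-only int8 quantization, can be sketched in a few lines. This is a generic illustration, not the BiteAI code: the temperature of 4.0 is an assumed value, and the fixed max-abs scale stands in for the paper's LSQ-inspired learnable scale.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax; higher temperature flattens the output."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * (math.log(pt) - math.log(ps))
             for pt, ps in zip(p_teacher, p_student) if pt > 0.0)
    return temperature ** 2 * kl

def quantize_int8(weights):
    """Symmetric weight-only quantization with a fixed max-abs scale.

    Returns (int8_values, scale); dequantize with value * scale.
    """
    largest = max(abs(w) for w in weights)
    scale = largest / 127.0 if largest > 0.0 else 1.0
    values = [max(-128, min(127, round(w / scale))) for w in weights]
    return values, scale
```

Only the weights are quantized here; activations stay in floating point, which is what "weight-only" refers to.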

  • Research Article
  • 10.1007/s00530-026-02245-6
Cross-branch knowledge distillation via shallow layer guidance
  • Mar 11, 2026
  • Multimedia Systems
  • Mingfu Zhu + 2 more


  • Research Article
  • 10.13052/jwe1540-9589.2523
Cross-scenario Multi-modal Knowledge Fusion and Knowledge Recommendation Based on a MDR-DKD Model
  • Mar 10, 2026
  • Journal of Web Engineering
  • Jiang Jiang + 1 more

With the widespread application of recommendation systems in e-commerce, education, and other fields, the heterogeneity of cross-scenario data and the insufficient integration of multi-modal information such as text, images, and user behavior are becoming increasingly prominent. To achieve cross-scenario multi-modal knowledge fusion and knowledge recommendation, a meta doubly robust-debiasing knowledge distillation (MDR-DKD) model is proposed. This model efficiently extracts universal features across scenarios from a small amount of unbiased data through a meta-learning mechanism and optimizes the model with knowledge distillation. A knowledge recommendation module then produces targeted recommendations by calculating the matching degree between user interests and knowledge nodes. The results showed that multi-modal feature extraction took an average of 18.61 ms, the parameter utilization rate during feature extraction was 91.3%, the feature extraction throughput reached 2460 samples/s, and the knowledge recommendation accuracy was 97.84%. This model can effectively extract cross-scenario multi-modal features for accurate knowledge recommendation. The research provides an effective technical path for cross-domain knowledge recommendation, which can promote the deployment of recommendation systems in practical multi-scenario, multi-modal settings and help improve the personalized recommendation experience for users.

  • Research Article
  • 10.1371/journal.pdig.0001275.r003
Efficient information extraction using LLMs and knowledge distillation: A study on HPV health communication
  • Mar 10, 2026
  • PLOS Digital Health
  • Saadat Hasan Khan + 3 more

State Department of Health (DOH) websites serve as authoritative sources of HPV-related health communications, presenting state-specific content that influences public awareness and vaccination decisions. We develop a computationally efficient framework to systematically evaluate these information repositories based on their content quality, completeness, and their motivational impact on vaccination behavior. We propose a dataset consolidating 48 different DOH websites’ data targeted towards HPV and HPV vaccination. By developing an annotated dataset (n = 400), efficient prompting techniques and a Knowledge Distillation framework, we develop and evaluate efficient student models based on the Llama family of Large Language Models (LLMs) and the RoBERTa Large encoder architecture. We finally deploy the best-performing student model for a computationally feasible evaluation of the content of DOH websites. We show that fine-tuned RoBERTa Large model achieves an F1 score of 0.74 on the test set, outperforming all other student models and approaching the teacher model's performance (F1 = 0.77). The fine-tuned RoBERTa-Large model is subsequently applied to data from various state DOH websites to evaluate the information presented. We also discuss the broader implications, limitations, and ethical and legal considerations of the proposed approach.

  • Research Article
  • 10.1093/jamia/ocag032
MedRep: medical concept representations for general electronic health record foundation models.
  • Mar 10, 2026
  • Journal of the American Medical Informatics Association : JAMIA
  • Junmo Kim + 3 more

Traditional electronic health record (EHR) foundation models fail to process unseen medical codes, limiting generalizability across institutions with different vocabularies. To address this problem, we introduce medical concept representation (MedRep), standardized medical concept representations for EHR foundation models, enabling recognition of semantically similar concepts regardless of their specific IDs. We utilized Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) vocabulary covering 7.5 million concepts from 66 medical vocabularies. MedRep integrates large language model-generated concept descriptions and OMOP graph ontology using graph contrastive learning with knowledge distillation. We evaluated MedRep-based models on MIMIC-IV (internal validation) and EHRSHOT (external validation) across 9 prediction tasks including clinical outcomes, phenotypes, and in-hospital events. MedRep consistently outperformed baseline models, particularly in external validation with average improvements of 0.088 in area under the receiver operating characteristic curve and 0.208 in area under the precision-recall curve. Qualitative analysis demonstrated that MedRep-based models identified more clinically relevant concepts when making decisions than the baseline models. Performance improvements remained stable across diverse EHR foundation model architectures, including BEHRT, Med-BERT, and CDM-BERT. MedRep improves the generalizability of EHR foundation models by encouraging similar concepts to have similar representations. EHR foundation models developed at different institutions could cooperate through MedRep, merging knowledge from multiple hospital datasets. In addition, our approach could reduce healthcare disparities by enabling smaller institutions to benefit from models trained on larger datasets. 
MedRep improves EHR foundation model performance, interpretability, and generalizability, serving as a standard baseline representation for EHR foundation models adopting OMOP CDM.

  • Research Article
  • 10.1038/s41598-026-35627-x
Distilling population specific expertise into a unified model for generalizable brain tumor segmentation.
  • Mar 10, 2026
  • Scientific reports
  • Ahmed Elzayat + 8 more

Multimodal brain tumor segmentation models often struggle to generalize across diverse populations due to variations in tumor pathology, patient demographics, and imaging protocols. A common approach to mitigate these challenges involves training separate models per population, employing ensemble methods, fine-tuning pretrained networks, or adopting curriculum learning strategies. While these approaches may yield improvements within specific domains, they often suffer from limited scalability, increased inference cost, poor adaptability to heterogeneous populations, and susceptibility to overfitting or catastrophic forgetting. To address these challenges, we propose a novel Multi-Teacher Single-Student Knowledge Distillation framework (MTSS-KDNet), built on the specialized knowledge of individual teacher models and distilling their collective expertise into a unified student model. Our framework performs population-aware knowledge transfer, guiding the student to integrate the strengths of multiple specialized teachers through both latent- and output-level supervision. This enables effective and independent generalization across all tumor types. In this paper, we focus on five distinct tumor populations: Adult Gliomas, Pediatric Gliomas, Sub-Saharan African Gliomas—which, although pathologically similar to their adult counterparts, often suffer from degraded MRI image quality—Intracranial Meningiomas and Brain Metastases. These tumor types exhibit unique developmental, morphological, anatomical and imaging characteristics, introducing heterogeneity that poses significant challenges to the ability of models to generalize accurately. Our approach achieves superior performance across all five populations, with average dice scores (DSC) of 0.87, 0.84, and 0.77 in the whole tumor (WT), tumor core (TC) and enhancing tumor (ET) regions, respectively, outperforming both population-specific and strong benchmark models. 
These results highlight the robustness and versatility of our method, offering a promising solution for enhancing generalizability in brain tumor segmentation while facilitating seamless clinical deployment.
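For reference, the Dice score (DSC) reported above measures overlap between a predicted and a reference mask as twice the intersection divided by the sum of the two mask sizes. A minimal sketch for flat binary masks follows; returning 1.0 when both masks are empty is a convention assumed here, not something the abstract specifies.

```python
def dice_score(pred, target):
    """Dice similarity coefficient for two flat binary masks (0/1 values)."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total
```

In segmentation benchmarks this is computed per region (here WT, TC, and ET) and then averaged over cases.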

  • Research Article
  • 10.3390/rs18050842
A Unified Framework for Vehicle Detection, Tracking, and Counting Across Ground and Aerial Views Using Knowledge Distillation with YOLOv10-S
  • Mar 9, 2026
  • Remote Sensing
  • Md Rezaul Karim Khan + 1 more

Accurate and reliable vehicle detection, tracking, and counting across different surveillance platforms are fundamental requirements for developing smart Traffic Management Systems (TMS) and promoting sustainable urban mobility. Recent advances in both ground-level surveillance and remote sensing using deep learning have opened new opportunities for extracting detailed vehicular information from high-resolution aerial and surveillance video data. This research presents a unified, real-time vehicle analysis framework that integrates lightweight deep learning–based detection, robust multi-object tracking, and trajectory-driven counting within a single modular pipeline. The proposed framework employs YOLOv10-S, a “You Only Look Once” detector, as the detection backbone and enhances its robustness through supervision-level knowledge distillation without introducing any architectural modifications. Temporal consistency is enforced using an observation-centric multi-object tracking algorithm (OC-SORT), enabling stable identity preservation under camera motion and dense traffic conditions. Vehicle counting is performed using a trajectory-based virtual gate strategy, reducing duplicate counts and improving counting reliability. Comprehensive experiments conducted on the UA-DETRAC and VisDrone benchmarks show that the proposed framework effectively balances detection performance, tracking robustness, counting accuracy, and real-time efficiency in both ground-based and aerial surveillance settings. Furthermore, cross-dataset evaluations under direct train–test transfer highlight the inherent challenges of domain shift while showing that knowledge distillation consistently improves robustness in detection, tracking identity consistency, and vehicle counting. Overall, this framework enables effective real-world traffic monitoring by adopting a scalable and practical system design, where reliability is prioritized over architectural complexity.
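A trajectory-based virtual gate can be sketched as a line-segment crossing test applied to each tracked centroid path. The abstract gives no implementation details, so the data layout (a dict of track id to centroid list), the strict-crossing test, and the function names below are assumptions.

```python
def _side(point, a, b):
    """Signed area test: which side of line a->b the point lies on."""
    return (b[0] - a[0]) * (point[1] - a[1]) - (b[1] - a[1]) * (point[0] - a[0])

def crosses(p, q, gate_a, gate_b):
    """True if the step p->q strictly crosses the gate segment."""
    d1, d2 = _side(p, gate_a, gate_b), _side(q, gate_a, gate_b)
    d3, d4 = _side(gate_a, p, q), _side(gate_b, p, q)
    return d1 * d2 < 0 and d3 * d4 < 0

def count_crossings(tracks, gate_a, gate_b):
    """Count distinct tracks whose trajectory crosses the gate at least once.

    tracks maps a track id to its ordered list of (x, y) centroids.
    Counting per track id, not per crossing, avoids duplicate counts.
    """
    counted = set()
    for track_id, points in tracks.items():
        for p, q in zip(points, points[1:]):
            if crosses(p, q, gate_a, gate_b):
                counted.add(track_id)
                break
    return len(counted)
```

Because each identity is counted at most once, a vehicle that jitters back and forth over the gate does not inflate the total, which is the duplicate-count problem the abstract mentions.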

  • Research Article
  • Cite Count: 1
  • 10.1145/3794845
A Survey on Inference Optimization Techniques for Mixture of Experts Models
  • Mar 9, 2026
  • ACM Computing Surveys
  • Jiacheng Liu + 7 more

The emergence of large-scale Mixture of Experts (MoE) models represents a significant advancement in artificial intelligence, offering larger model capacity and computational efficiency through conditional computation. However, deploying and running inference on these models presents significant challenges in computational resources, latency, and energy efficiency. This comprehensive survey analyzes optimization techniques for MoE models across the entire system stack. We first establish a taxonomical framework that categorizes optimization approaches into model-level, system-level, and hardware-level optimizations. At the model level, we examine architectural innovations including efficient expert design, attention mechanisms, various compression techniques such as pruning, quantization, and knowledge distillation, as well as algorithm improvement including dynamic routing strategies and expert merging methods. At the system level, we investigate distributed computing approaches, load balancing mechanisms, and efficient scheduling algorithms that enable scalable deployment. Furthermore, we delve into hardware-specific optimizations and co-design strategies that maximize throughput and energy efficiency. This survey provides both a structured overview of existing solutions and identifies key challenges and promising research directions in MoE inference optimization. To facilitate ongoing updates and the sharing of cutting-edge advances in MoE inference optimization research, we have established a repository accessible at https://github.com/MoE-Inf/awesome-moe-inference/ .

  • Research Article
  • 10.3389/fmicb.2026.1791871
Dual-graph knowledge distillation for few-shot class-incremental microorganism recognition
  • Mar 9, 2026
  • Frontiers in Microbiology
  • Sihang Xu + 4 more

Environmental microorganism recognition from microscopic images is crucial for environmental monitoring and ecological analysis. In practical scenarios, microorganism categories often evolve over time, and newly emerging classes usually have only a few labeled samples due to high annotation costs. This combination naturally gives rise to the few-shot class-incremental learning (FSCIL) problem. FSCIL requires models to incrementally learn new classes under severe data scarcity while effectively retaining knowledge of previously learned ones. In this work, we propose a unified FSCIL framework for environmental microorganism recognition. The proposed method is composed of three complementary components. First, a contrastive-inspired fine-grained representation learning strategy is introduced in the base session. This strategy enhances intra-class compactness by mining prediction-consistent augmented samples, without introducing explicit contrastive losses. Second, a prototype rectification mechanism is designed to stabilize the representations of incremental classes by leveraging semantic structures learned from base classes. Third, a dual-graph knowledge distillation framework is proposed to preserve both instance-level and class-level relational knowledge during incremental learning. This process is guided by a teacher model updated via exponential moving average. Experiments conducted on the EMDS-7 dataset demonstrate the effectiveness of the proposed approach. Compared with state-of-the-art FSCIL methods, our method achieves the highest average accuracy of 78.19% and maintains the best final-session accuracy of 65.36%. Meanwhile, strong base-session performance is consistently preserved. These results indicate that the proposed framework effectively mitigates catastrophic forgetting and enables robust adaptation to new microorganism categories in real-world incremental recognition scenarios.
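The exponential-moving-average teacher mentioned above follows a standard update rule: each teacher weight moves a small step toward the corresponding student weight. A minimal sketch follows; the flat-list weight layout and the momentum value of 0.99 are assumptions, since the abstract does not report them.

```python
def ema_update(teacher, student, momentum=0.99):
    """Return new teacher weights: momentum * teacher + (1 - momentum) * student."""
    return [momentum * t + (1.0 - momentum) * s
            for t, s in zip(teacher, student)]
```

A momentum close to 1 makes the teacher change slowly, so it acts as a smoothed history of the student and gives stable distillation targets across incremental sessions.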

  • Research Article
  • Cite Count: 1
  • 10.1145/3796229
Improving Arabic Information Retrieval and Reranking Performance Using Knowledge Distillation
  • Mar 6, 2026
  • ACM Transactions on Asian and Low-Resource Language Information Processing
  • M'Hamed Amine Hatem + 2 more

Transformer-based models have revolutionized information retrieval, achieving state-of-the-art performance in document retrieval and ranking. For high-resource languages like English, an abundance of high-quality labeled datasets has facilitated the development of powerful models. However, developing powerful models for low-resource languages such as Arabic is challenging due to the scarcity of labeled data. While using translated English datasets can be considered to overcome the lack of labeled data, translated datasets have inherent information loss and inconsistencies introduced during the translation process. As a result, models fine-tuned on translated datasets typically underperform relative to their English counterparts. To address this issue, we explore the potential of transferring expertise from high-resource models to low-resource models. In particular, we investigate whether knowledge learned by English retrieval and reranking models can be effectively transferred to Arabic models via knowledge distillation. Our results demonstrate that knowledge distillation significantly improves the performance of Arabic information retrieval. Our models, fine-tuned using knowledge distillation on the mMARCO Arabic passage-ranking dataset, outperform state-of-the-art retrieval and reranker models. Specifically, our cross-encoder achieves an MRR@10 of 0.254, representing an 8% relative improvement over the previous best cross-encoder, mT5. In terms of recall, our bi-encoder achieves an R@1000 of 0.799, surpassing the late-interaction model mColBERT (R@1000 = 0.749, +6.7%) and the baseline BM25 (R@1000 = 0.637, +25%). Furthermore, by leveraging knowledge distillation with soft labels generated by an ensemble of IR models, we manage to achieve comparable or higher performance without requiring extensive manual annotation. This approach offers an effective mechanism for automatic annotation and pseudo-labeling in low-resource language scenarios.
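The metrics quoted in this abstract can be reproduced with short helper functions. The sketch below uses hypothetical function names and a simple data layout (one ranked list and one relevant set per query); the relative-gain helper checks the recall percentages from the reported R@1000 values.

```python
def mrr_at_k(rankings, relevant, k=10):
    """Mean reciprocal rank of the first relevant document in the top k."""
    total = 0.0
    for ranked, rel in zip(rankings, relevant):
        for rank, doc in enumerate(ranked[:k], start=1):
            if doc in rel:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(rankings)

def recall_at_k(rankings, relevant, k=1000):
    """Mean fraction of each query's relevant documents found in the top k."""
    scores = [len(set(ranked[:k]) & rel) / len(rel)
              for ranked, rel in zip(rankings, relevant)]
    return sum(scores) / len(scores)

def relative_gain_percent(new, old):
    """Relative improvement of new over old, in percent."""
    return 100.0 * (new - old) / old

# Recall figures from the abstract: bi-encoder 0.799, mColBERT 0.749, BM25 0.637.
gain_vs_mcolbert = relative_gain_percent(0.799, 0.749)  # about 6.7
gain_vs_bm25 = relative_gain_percent(0.799, 0.637)      # about 25
```

The stated gains are relative, not absolute: the bi-encoder's recall is 5.0 points above mColBERT in absolute terms, which corresponds to roughly 6.7% of mColBERT's score.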

  • Research Article
  • 10.1007/s10846-026-02379-9
SUSHI: A Vision System for Reactive, Uninformed ASV Navigation via Multi-Field Path Planning and Visual Exploration
  • Mar 6, 2026
  • Journal of Intelligent & Robotic Systems
  • Hamze Hammami + 5 more

Vision offers richer context than traditional marine sensors (e.g., LiDAR, Doppler Velocity Logger (DVL), sonar) but is harder to interpret on water due to reflections, glare, and dynamic surfaces. SUSHI is a vision-first navigation system for Autonomous Surface Vehicles (ASVs) that fuses detection, water segmentation, and monocular depth to produce camera-centric navigation grids for planning and control. The proposed perception methods achieve 90% segmentation accuracy through knowledge distillation with SAM2 logits, requiring only 500-550 frames and approximately 30 minutes of training. The system implements a YOLO detection model that achieves 94.5% mAP@0.5 (F1 score: 0.91) for trash and obstacle detection in simulation, and benchmarks a monocular depth method that solves the issue of reflective surfaces and can work universally. Path planning uses a Multi-Field Synthesis (MFS) approach: a locally reactive artificial-potential-field component blended adaptively with a global wavefront flow field, mitigating local minima while preserving real-time responsiveness. A behavior layer prioritizes target seeking and mask-based visual exploration when explicit goals are absent. Validation was performed in the TOAST simulator and in a pool environment, demonstrating robust goal targeting and exploration using cameras with minimal side sensing for emergency avoidance.
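
The idea of blending a local potential field with a global wavefront flow field can be illustrated with a minimal grid sketch. Everything below is an assumption made for illustration: the abstract does not give the grid representation, the repulsion formula, or the adaptive blending rule, so a fixed weight `alpha` stands in for the adaptive blend and all function names are hypothetical.

```python
import math
from collections import deque

STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connected moves (row, col)


def wavefront(grid, goal):
    """BFS distance-to-goal over free cells (0 = free, 1 = obstacle)."""
    rows, cols = len(grid), len(grid[0])
    dist = [[math.inf] * cols for _ in range(rows)]
    dist[goal[0]][goal[1]] = 0
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for dr, dc in STEPS:
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and dist[nr][nc] == math.inf):
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist


def flow_direction(dist, cell):
    """Step toward the neighbor with the smallest wavefront distance."""
    r, c = cell
    best, best_d = (0, 0), dist[r][c]
    for dr, dc in STEPS:
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(dist) and 0 <= nc < len(dist[0]) and dist[nr][nc] < best_d:
            best, best_d = (dr, dc), dist[nr][nc]
    return best


def apf_direction(grid, cell, goal, influence=2.0):
    """Unit attraction toward the goal plus repulsion from nearby obstacles."""
    r, c = cell
    d_goal = math.hypot(goal[0] - r, goal[1] - c) or 1.0
    fr, fc = (goal[0] - r) / d_goal, (goal[1] - c) / d_goal
    for orow, row in enumerate(grid):
        for ocol, value in enumerate(row):
            d = math.hypot(r - orow, c - ocol)
            if value == 1 and 0 < d <= influence:
                push = 1.0 / d - 1.0 / influence  # assumed repulsion profile
                fr += push * (r - orow) / d
                fc += push * (c - ocol) / d
    return fr, fc


def blend(local, global_, alpha):
    """Weighted sum of the local and global directions, normalized to unit length."""
    vr = alpha * local[0] + (1 - alpha) * global_[0]
    vc = alpha * local[1] + (1 - alpha) * global_[1]
    norm = math.hypot(vr, vc)
    return (vr / norm, vc / norm) if norm > 0 else (0.0, 0.0)


# Toy map (hypothetical): a wall separates the start cell from the goal.
grid = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
start, goal = (2, 2), (0, 4)
dist = wavefront(grid, goal)
heading = blend(apf_direction(grid, start, goal), flow_direction(dist, start), alpha=0.5)
print(dist[start[0]][start[1]], flow_direction(dist, start), heading)
```

In this toy map the wavefront term steers the vehicle along the wall toward the opening, which is the kind of global guidance that helps a purely local potential field avoid getting stuck.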

  • Research Article
  • 10.1186/s13677-026-00872-y
LMM-guided knowledge distillation for power operation object detection in cloud-edge environment
  • Mar 6, 2026
  • Journal of Cloud Computing
  • Bingyang Li + 4 more

Power-grid field operations demand real-time visual monitoring to verify personal protective equipment and tool usage under large depth-of-field. Conventional real-time detectors are efficient but closed-vocabulary; they struggle with rare or unseen objects. Large multimodal models (LMMs) offer open-vocabulary understanding guided by prompts, yet are too heavy for edge deployment. To address these challenges, we propose an LMM-guided distillation framework that transfers prompt-grounded semantics from a large teacher to a lightweight YOLO-style student. The teacher, queried with an expanded prompt set, produces pseudo labels and region–text embeddings. The student is trained with a standard detection objective and three semantic transfers: first, feature distillation aligns student features to teacher region embeddings via a linear projector; second, prompt-aware logit distillation matches student logits to the teacher’s temperature-smoothed prompt distribution; and third, vision–language contrastive alignment ties projected student regions to the correct prompt embedding. Experiments on two benchmark datasets indicate consistent gains on both common and rare categories while retaining real-time throughput on edge hardware, demonstrating a practical cloud-to-edge pipeline for safety monitoring.
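
Matching student logits to a temperature-smoothed teacher distribution is commonly done with a KL-divergence loss. The sketch below shows that general technique in plain Python; it is not the paper's implementation, and the temperature value, the `T**2` scaling, and the example logits are assumptions.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax; a higher temperature gives a flatter distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits: List[float],
                      teacher_logits: List[float],
                      temperature: float = 2.0) -> float:
    """KL(teacher || student) on temperature-smoothed distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Hypothetical per-prompt logits for a single region.
teacher = [4.0, 1.0, 0.5]
student = [2.0, 2.0, 0.0]
print(distillation_loss(student, teacher))
print(distillation_loss(teacher, teacher))
```

The loss is zero when the two distributions agree and grows as the student's prompt distribution drifts from the teacher's.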

  • Research Article
  • 10.1038/s41598-026-42981-3
Hallucination-aware learning and latency optimization transformer (HALL-OPT) for real-time edge intelligence
  • Mar 5, 2026
  • Scientific Reports
  • Danah Algawiaz

Transformer architectures and large language models perform competitively across a broad range of AI tasks, yet they are challenging to deploy in resource-constrained edge computing environments because of their high resource demands and their tendency to generate erroneous or fabricated outputs (hallucinations). In this paper, a single scheme, HALL-OPT, is proposed to address both hallucination detection and latency reduction for real-time edge intelligence. The framework has three main elements: (1) a dual-stream hallucination detector that analyses internal attention behaviour, (2) an adaptive token-pruning system, which decodes and extracts the necessary context at minimal computation, and (3) a lightweight edge-optimized transformer obtained by knowledge distillation. On SQuAD 2.0 and CNN/DailyMail, HALL-OPT detects hallucinations with 94.3% accuracy and achieves a 67.8% reduction in inference latency with only a 2.1% decrease in accuracy compared to the BERT-base model. When deployed on edge hardware, the system provides sub-50 ms response times while consuming 43% less energy. It is appropriate for real-time applications in industrial IoT, autonomous systems, healthcare monitoring, and other settings where low latency is critical. Existing transformer optimisation and hallucination mitigation approaches treat reliability and efficiency as separate objectives, limiting their applicability in real-time edge environments. HALL-OPT integrates hallucination-aware attention, adaptive pruning, and edge-oriented optimisation into a single unified framework, enabling simultaneous reductions in hallucination, latency, and energy consumption. This integrated design distinguishes HALL-OPT from prior work that optimises accuracy or efficiency in isolation.

