Articles published on Perception system
4279 Search results
- New
- Research Article
- 10.1016/j.bandl.2025.105659
- Jan 1, 2026
- Brain and language
- Jiayi Zhao + 8 more
Word selection, concreteness and brain lateralization.
- New
- Research Article
- 10.1109/tpami.2025.3611795
- Jan 1, 2026
- IEEE transactions on pattern analysis and machine intelligence
- Linshen Liu + 4 more
Autonomous vehicles, open-world robots, and other automated systems rely on accurate, efficient perception modules for real-time object detection. Although high-precision models improve reliability, their processing time and computational overhead can hinder real-time performance and raise safety concerns. This paper introduces an Edge-based Mixture-of-Experts Optimal Sensing (EMOS) System that addresses the challenge of co-achieving accuracy, latency and scene adaptivity, further demonstrated in open-world autonomous driving scenarios. Algorithmically, EMOS fuses multimodal sensor streams via an Adaptive Multimodal Data Bridge and uses a scenario-aware MoE switch to activate only a complementary set of specialized experts as needed. The proposed hierarchical backpropagation and a multiscale pooling layer let model capacity scale with real-world demand complexity. System-wise, an edge-optimized runtime with accelerator-aware scheduling (e.g., ONNX/TensorRT), zero-copy buffering, and overlapped I/O-compute enforces explicit latency/accuracy budgets across diverse driving conditions. Experimental results establish EMOS as the new state of the art: on KITTI, it increases average AP by 3.17% while running 2.6× faster on Nvidia Jetson. On nuScenes, it improves accuracy by 0.2% mAP and 0.5% NDS, with 34% fewer parameters and a 15.35× Nvidia Jetson speedup. Leveraging multimodal data and intelligent expert cooperation, EMOS delivers an accurate, efficient and edge-adaptive perception system for autonomous vehicles, thereby ensuring robust, timely responses in real-world scenarios.
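The scenario-aware expert switch described above can be illustrated with a toy sparse-gating sketch. This is the generic sparse mixture-of-experts routing pattern, not the EMOS implementation; the random gate weights here are a stand-in for a learned gate.

```python
import numpy as np

def moe_gate(scene_features, num_experts=4, top_k=2, seed=0):
    """Toy scenario-aware gate: scores experts from scene features and
    activates only the top-k, as in sparse mixture-of-experts routing.
    The weight matrix is random -- a placeholder for a trained gate."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((scene_features.shape[-1], num_experts))
    logits = scene_features @ W
    # keep only the top-k experts; mask the rest before the softmax
    keep = np.argsort(logits)[-top_k:]
    masked = np.full(num_experts, -np.inf)
    masked[keep] = logits[keep]
    weights = np.exp(masked - masked[keep].max())
    weights /= weights.sum()
    return keep, weights

active, w = moe_gate(np.array([0.3, -1.2, 0.8]))
```

Because inactive experts receive exactly zero weight, only the selected sub-networks would need to run, which is what makes such routing attractive on edge hardware.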
- New
- Research Article
- 10.70695/iaai202504a12
- Dec 31, 2025
- Innovative Applications of AI
- Hui Kou
In complex scenarios, single-sensor perception is unstable, trajectory planning lacks safety constraints, and there are problems with multi-source coordination. To address these issues, this paper proposes an intelligent robot trajectory decision-making method combining LiDAR and image processing, forming a multimodal perception system. This system requires a robot platform, proper extrinsic parameter calibration, and time synchronization. A LiDAR-Image feature acquisition and attention fusion network is designed from a unified BEV perspective to generate an environmental cost map that considers both geometry and semantics. Based on this cost map, an RL/MPC trajectory decision-making model is constructed, introducing chance constraints and dynamic boundaries to ensure safety margins. Simulations and real-world experiments included indoor corridors, office areas, and crowded places. Results show that the multimodal approach outperforms DWA and single-modal RL in terms of mAP, distance RMSE, minimum obstacle distance, and task completion rate. Furthermore, it can run continuously on embedded platforms, demonstrating the effectiveness of the proposed method and its value for engineering applications.
- New
- Research Article
- 10.1038/s41598-025-28326-6
- Dec 29, 2025
- Scientific Reports
- Gérard Coureaud + 6 more
Newborn mammals must adapt to the chemically complex environment by detecting and prioritizing relevant stimuli. In rabbits, the mammary pheromone (MP) emitted by lactating females triggers a typical behaviour in newborns, helping them to locate the nipples, suck and survive. The MP also promotes very rapid learning of other odorants by associative conditioning. In this study, MP-induced learning was used to investigate the neonatal detection and recognition abilities of two odorants very different in volatility, ethyl isobutyrate and ethyl maltol, across concentrations ranging from 10⁻⁵ to 10⁻²⁵ g/ml. The results show, firstly, that the odorants could be learned even at very low concentrations; and secondly, that a process of generalisation of the odorant quality was effective after learning over a wide range of concentrations. However, the degree of generalisation depended on the concentration at which the odorants had been learned, with quality and intensity becoming closely interdependent for very low concentrations of learning. Taken together, these data highlight the remarkable adaptability of the olfactory perceptual and cognitive systems of newborn rabbits, enabling them not only to rapidly learn new odorants, but also to attribute qualities to them that depend on the quality perceived at the learning concentration.
- New
- Research Article
- 10.1088/2631-8695/ae2e80
- Dec 29, 2025
- Engineering Research Express
- Xiaoyu Zhang
This study addresses the core safety challenges of Advanced Driver-Assistance Systems (ADAS), particularly those specified by the Safety of the Intended Functionality (SOTIF, ISO 21448). These challenges stem from algorithmic limitations in uncertain or ambiguous scenarios. To mitigate such risks, an enhanced Transformer-based detector, the Uncertainty-Aware Transformer (U-Transformer), is developed to quantify its own predictive uncertainty. This model forms the basis of a reliability and safety design framework that integrates algorithmic innovation with systems engineering. An uncertainty evaluation mechanism is embedded within the Transformer architecture, enabling the model to output both object detection results and a quantitative measure of prediction confidence. Experimental results show that the system achieves a perception accuracy of 95.77%. In complex scenarios, it sustains a Minimum Risk Response (MRR) rate of 90.9%, with a failure recovery time of only 1.93 seconds. By providing the perception system with an intrinsic and quantifiable self-assessment capability, this approach improves the trustworthiness of intelligent driving systems and also enhances safety in complex, open-world environments. Together, these advances establish a solid technical foundation for advanced autonomous driving.
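One common way a detector can quantify its own predictive uncertainty (the general idea behind such embedded uncertainty mechanisms, though not necessarily the U-Transformer's exact formulation) is to average softmax outputs over several stochastic forward passes, e.g. MC dropout, and take the entropy of the mean:

```python
import numpy as np

def predictive_uncertainty(logit_samples):
    """Mean softmax over stochastic forward passes plus the entropy of
    that mean as a scalar uncertainty score. A generic stand-in for a
    learned uncertainty head, not the paper's specific mechanism."""
    z = np.asarray(logit_samples, dtype=float)
    z = z - z.max(axis=-1, keepdims=True)          # numerical stability
    probs = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    mean_p = probs.mean(axis=0)
    entropy = -(mean_p * np.log(mean_p + 1e-12)).sum()
    return mean_p, entropy

# three noisy passes over one detection, 3 classes
confident = predictive_uncertainty([[5, 0, 0], [4.8, 0.1, 0], [5.1, 0, 0.1]])
ambiguous = predictive_uncertainty([[1, 1, 1], [0.9, 1.1, 1], [1, 1, 1.1]])
```

A downstream safety layer can then trigger a minimum-risk response whenever the entropy exceeds a calibrated threshold, which is the system-level behaviour the abstract describes.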
- New
- Research Article
- 10.64808/engineeringperspective.1814718
- Dec 28, 2025
- Engineering Perspective
- Lorant Szabo + 2 more
The development of autonomous transportation systems represents a critical step toward achieving intelligent and reliable mobility. Ensuring accurate, real-time environmental perception and the robust detection of unexpected or rare events remains a major challenge for autonomous vehicles operating in complex and dynamic environments. To address this, we propose a novel processing pipeline that constructs Bird’s Eye View (BEV) representations from raw 3D LiDAR point clouds using both intensity and height map channels, thereby retaining essential geometric and reflective features. On top of these BEV representations, an optimized YOLOv11-based deep learning model is applied for high-precision object detection. A key contribution of our work is the integration of a real-time Out-of-Distribution (OOD) detection module, which employs lightweight statistical techniques in conjunction with learned feature representations to ensure minimal computational overhead while maintaining operational robustness. The proposed architecture enables the reliable identification of standard traffic objects as well as the detection of atypical or previously unseen events, such as overturned vehicles or unknown obstacles. Experimental evaluation on representative driving scenarios demonstrates that our method achieves approximately 95% detection accuracy, outperforming conventional baselines in both speed and reliability. Overall, the results highlight the potential of combining state-of-the-art deep neural detection frameworks with efficient, statistically grounded OOD analysis for enhancing the safety and trustworthiness of autonomous vehicle perception systems.
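Constructing BEV representations from raw point clouds with intensity and height channels can be sketched as follows; the ranges and resolution are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def points_to_bev(points, x_range=(0, 40), y_range=(-20, 20), res=0.5):
    """Rasterize LiDAR points (x, y, z, intensity) into a two-channel
    BEV grid: max height and max intensity per cell."""
    pts = np.asarray(points, dtype=float)
    h = int((x_range[1] - x_range[0]) / res)
    w = int((y_range[1] - y_range[0]) / res)
    bev = np.zeros((2, h, w))
    ix = ((pts[:, 0] - x_range[0]) / res).astype(int)
    iy = ((pts[:, 1] - y_range[0]) / res).astype(int)
    ok = (ix >= 0) & (ix < h) & (iy >= 0) & (iy < w)  # drop out-of-range points
    for x, y, z, i in zip(ix[ok], iy[ok], pts[ok, 2], pts[ok, 3]):
        bev[0, x, y] = max(bev[0, x, y], z)   # height channel
        bev[1, x, y] = max(bev[1, x, y], i)   # intensity channel
    return bev

bev = points_to_bev([[10.0, 0.0, 1.5, 0.8],
                     [10.1, 0.1, 0.4, 0.2],
                     [55.0, 0.0, 1.0, 0.5]])  # third point lies outside range
```

The resulting grid is an image-like tensor, which is why a 2D detector such as YOLOv11 can be applied directly on top of it.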
- New
- Research Article
- 10.61260/2074-1618-2025-4-19-27
- Dec 24, 2025
- Psychological and pedagogical problems of human and social security
- Alexey Vostryh
The article addresses the problem of improving software operators' perception of user-interface signals by analyzing the functional field of the human perceptual system. The analysis made it possible to identify the limitations of the perceptual system; to visualize how sensations received from interfaces are processed by the complex of users' sensory systems; to identify the periods in users' psyche when illusions arise in the perception of sensations from interaction with interfaces; and to compile a list of recommendations for designing interfaces with respect to the limitations of the perceptual system.
- New
- Research Article
- 10.37391/ijeer.130421
- Dec 20, 2025
- International Journal of Electrical and Electronics Research
- Likang Bo + 3 more
With the rapid development of autonomous driving technology, real-time ranging of preceding vehicles has become a critical component to ensure driving safety. Although monocular vision-based ranging methods offer advantages of low cost and easy deployment, they still suffer from limited accuracy in long-distance targets, small objects, and complex traffic scenarios. To address these challenges, this paper improves the classic Smoke monocular 3D detection model by introducing a multi-scale feature enhancement module and a dynamic Gaussian heatmap generation mechanism, which effectively strengthen feature representation and stabilize depth estimation. Experiments conducted on the KITTI dataset demonstrate that the improved model outperforms the baseline in both 3D AP and BEV AP metrics, with a significant reduction in average ranging error, especially in small-target and long-distance scenarios. This study provides a feasible improvement strategy for monocular vision-based ranging in complex traffic environments and has important implications for enhancing the robustness of autonomous driving perception systems.
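Center-based monocular detectors such as SMOKE supervise a per-class heatmap with Gaussian peaks at projected object centers. Below is a minimal sketch of the static case; the paper's dynamic variant would additionally scale sigma, e.g. with object size or depth, which is an assumption here rather than the authors' exact rule.

```python
import numpy as np

def gaussian_heatmap(shape, center, sigma):
    """Render a 2D Gaussian peak at an object's projected center --
    the standard training target for center-based detectors."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    cx, cy = center
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))

hm = gaussian_heatmap((64, 64), center=(20, 30), sigma=3.0)
```

Making sigma depend on the target (rather than a fixed constant) spreads supervision appropriately for small, distant objects, which is the failure mode the abstract targets.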
- New
- Research Article
- 10.61173/xz4crn81
- Dec 19, 2025
- Interdisciplinary Humanities and Communication Studies
- Siyu Wu
With the popularization of social media, adolescents are increasingly exposed to online information at an early age and become immersed in it, facing growing influences from negative cultural trends and narrow aesthetic standards. This paper systematically reviews over ten empirical studies conducted within the past five years, revealing that social platforms amplify the exposure frequency of “borderline culture” and the “white, young, and thin” body ideal image through their content recommendation mechanisms. This exposure triggers adolescents’ comparisons of their appearance and increases body anxiety. Simultaneously, interactive behaviors such as posting selfies and reading comments amplify aesthetic internalization, gradually leading adolescents to adopt online standards as benchmarks for self-evaluation. The research further indicates that adolescents demonstrate active agency in imitating and reproducing content during the process of cultural assimilation, forming a “participatory exposure” mechanism. Based on these findings, this paper proposes specific intervention strategies across four domains—platform algorithms, media literacy education, family communication, and research pathways—to help adolescents develop diverse and healthy body perception systems.
- New
- Research Article
- 10.1371/journal.pone.0338638.r006
- Dec 18, 2025
- PLOS One
- Yin Lei + 6 more
Autonomous driving perception systems still encounter significant challenges in edge scenarios involving multi-scale target changes and adverse weather, which seriously compromise detection reliability. To address this issue, we introduce a novel edge case dataset that extends existing benchmarks by capturing extreme road conditions (fog, rain, snow, nighttime, etc.) with precise annotations, and develop EdgeCaseDNet as an optimized object-detection framework. EdgeCaseDNet’s architecture extends YOLOv8 through four synergistic innovations: (1) a Haar_HGNetv2 backbone that enables hierarchical feature extraction with enhanced long-range dependencies, (2) an asymptotic feature pyramid network for context-aware multi-scale fusion, (3) a hybrid partial depth-wise separable convolution module, and (4) Wise-IoU loss optimization for accelerated convergence. Comprehensive evaluations demonstrated the superiority of EdgeCaseDNet over YOLOv8, achieving improvements of +10.6% in mAP@50, and +8.4% in mAP@[.5:.95]. All the relevant codes are available at https://github.com/yutianku/EdgeCaseDNet.
- New
- Research Article
- 10.11648/j.ajma.20251204.14
- Dec 17, 2025
- American Journal of Mechanics and Applications
- Axmedov O’G’Li
Ensuring reliable camera vision in autonomous driving systems requires continuous monitoring of image quality and lens integrity. External contaminants such as dust, raindrops, and mud, as well as permanent defects like cracks or scratches, can severely degrade visual perception and compromise safety-critical tasks such as lane detection, obstacle recognition, and path planning. This paper presents an AI-based framework that integrates image quality assessment (IQA) and lens defect analysis to enhance the robustness of camera-based perception systems in autonomous vehicles. Building on previous conceptual work in safety-aware lens defect detection, the proposed framework introduces a dual-layer architecture that combines real-time IQA monitoring with deep learning-based soiling segmentation. As an initial experimental validation, a U-Net model was trained on the WoodScape Soiling dataset to perform pixel-level detection of lens contamination. The model achieved an average Intersection-over-Union (IoU) of 0.6163, a Dice coefficient of 0.7626, and a recall of 0.9780, confirming its effectiveness in identifying soiled regions under diverse lighting and environmental conditions. Beyond the experiment, this framework outlines pathways for future integration of semantic segmentation, anomaly detection, and safety-driven decision policies aligned with ISO 26262 and ISO 21448 standards. By bridging conceptual modeling with experimental evidence, this study establishes a foundation for intelligent camera health monitoring and fault-tolerant perception in autonomous driving. The presented results demonstrate that AI-based image quality and defect assessment can significantly improve system reliability, supporting safer and more adaptive driving under real-world conditions.
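The IoU and Dice metrics reported for the soiling-segmentation experiment are computed per pixel over binary masks; a minimal sketch:

```python
import numpy as np

def iou_and_dice(pred, target):
    """Pixel-level Intersection-over-Union and Dice coefficient for
    binary segmentation masks (the two overlap metrics the study reports)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    iou = inter / union if union else 1.0
    total = pred.sum() + target.sum()
    dice = 2 * inter / total if total else 1.0
    return iou, dice

iou, dice = iou_and_dice([[1, 1, 0, 0]], [[1, 0, 1, 0]])
```

Note that Dice is always at least as large as IoU for the same masks (Dice = 2·IoU/(1+IoU)), which is consistent with the reported 0.6163 IoU vs. 0.7626 Dice.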
- Research Article
- 10.3389/fpls.2025.1721484
- Dec 12, 2025
- Frontiers in Plant Science
- Weixiang Yao + 7 more
Unmanned Aerial Vehicles (UAVs), as a new generation of intelligent equipment, have gradually become essential tools across multiple industries due to their high maneuverability and strong task adaptability. UAV payload technology (UPT) serves as a key support for enhancing mission performance and expanding application scenarios. UPT is being rapidly integrated into agriculture and other key fields, emerging as a driving force for the low-altitude economy and intelligent operations. This study systematically analyzed and discussed the development status of UPT, its typical application scenarios, and the challenges faced. By conducting a comprehensive review of global research on UPT from 2012 to 2025, this review summarized research hotspots and revealed evolutionary trends. The findings demonstrated that UPT had made notable progress in typical application areas, including crop monitoring, precision agricultural operations, agricultural product harvesting and aerial transportation, power line inspection, emergency rescue, and logistics. However, UPT was still constrained by limited autonomous perception and path planning capabilities, insufficient universality of payload platforms, a lack of standardized device interfaces, as well as challenges related to endurance, communication, and operational stability under adverse weather conditions. Future research should focus on lightweight and multifunctional payload design, intelligent operation control, and modular and standardized integration, while building a “satellite-UAV-ground” collaborative perception and decision-making system. The outcomes of this study provide both theoretical reference and practical guidance for promoting UAV adoption in agriculture and other low-altitude application scenarios, thereby contributing to the sustainable development of smart agriculture and the low-altitude economy.
- Research Article
- 10.1109/tpami.2025.3642123
- Dec 10, 2025
- IEEE transactions on pattern analysis and machine intelligence
- Jiacheng Zhang + 7 more
Semi-supervised object detection (SSOD) mitigates the annotation burden in object detection by leveraging unlabeled data, providing a scalable solution for modern perception systems. Concurrently, detection transformers (DETRs) have emerged as a popular end-to-end framework, offering advantages such as non-maximum suppression (NMS)-free inference. However, existing SSOD methods are predominantly designed for conventional detectors, leaving the exploration of DETR-based SSOD largely uncharted. This paper presents a systematic study to bridge this gap. We begin by identifying two principal obstacles in semi-supervised DETR training: (1) the inherent one-to-one assignment mechanism of DETRs is highly sensitive to noisy pseudo-labels, which impedes training efficiency; and (2) the query-based decoder architecture complicates the design of an effective consistency regularization scheme, limiting further performance gains. To address these challenges, we propose Semi-DETR++, a novel framework for efficient SSOD with DETRs. Our approach introduces a stage-wise hybrid matching strategy that enhances robustness to noisy pseudo-labels by synergistically combining one-to-many and one-to-one assignments while preserving NMS-free inference. Furthermore, based on our observation of the unique layer-wise decoding behavior in DETRs, we develop a simple yet effective re-decode query consistency training method to regularize the decoder. Extensive experiments demonstrate that Semi-DETR++ enables more efficient semi-supervised learning across various DETR architectures, outperforming existing methods by significant margins. The proposed components are also flexible and versatile, showing superior generalization by readily extending to semi-supervised segmentation tasks. Code is available at https://github.com/JCZ404/Semi-DETR.
- Research Article
- 10.3390/s25247495
- Dec 9, 2025
- Sensors (Basel, Switzerland)
- Pierluigi Rossi + 6 more
Highlights
- What is the chance of detecting distance errors in RGB-D cameras? Depth camera errors can be predicted according to distance, angle of the target, and light conditions.
- How can these sources of measurement bias be considered? A geometry-aware model was tested to provide depth measurement corrections in outdoor environments.
- What is the behavior of the sensor across different distances? Distance measurement errors grow with distance and angle, up to 3.5 m with targets at 16 m.
- What is the precision of the depth correction model developed in this research? Depth correction models can achieve RMSE between 0.46 and 0.64 m, even at long distances.

Stereo cameras, also known as depth cameras or RGB-D cameras, are increasingly employed in a large variety of machinery for obstacle detection and navigation planning. This also represents an opportunity in agricultural machinery for safety purposes, to detect the presence of workers on foot and avoid collisions. However, their outdoor performance at medium and long range under operational light conditions remains weakly quantified: the authors therefore designed a field protocol and a model to characterize the pipeline of stereo cameras, taking the Intel RealSense D455 as a benchmark, across distances from 4 m to 16 m in realistic farm settings. Tests were conducted using a 1 square meter planar target in outdoor environments, under diverse illumination conditions and with the panel located at 0°, 10°, 20° and 35° from the center of the camera’s field of view (FoV). Built-in presets were also adjusted during tests, generating a total of 128 samples. The authors then fit disparity surfaces to predict and correct systematic bias as a function of distance and radial FoV position, allowing them to compute mean depth and estimate a model of systematic error that expresses depth bias as a function of distance, light conditions and FoV position. The results showed that the model can predict depth errors with a good degree of precision in every tested scenario (RMSE: 0.46–0.64 m, MAE: 0.40–0.51 m), enabling replication and benchmarking on other sensors and field contexts while supporting safety-critical perception systems in agriculture.
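A simple instance of such a depth-bias correction is a least-squares polynomial fit from measured to true depth. The calibration pairs below are illustrative only (bias growing with distance, as the study reports), not the paper's data, and the quadratic form is an assumption; the authors fit full disparity surfaces over distance and FoV position.

```python
import numpy as np

# Hypothetical calibration pairs: measured depth vs. ground-truth depth,
# with bias growing with distance.
measured = np.array([4.1, 6.3, 8.6, 11.2, 14.0, 17.5])
truth    = np.array([4.0, 6.0, 8.0, 10.0, 12.0, 14.0])

coeffs = np.polyfit(measured, truth, deg=2)   # fit correction curve
correct = np.poly1d(coeffs)                   # callable corrector

raw_err = np.abs(measured - truth)            # error before correction
fit_err = np.abs(correct(measured) - truth)   # residual after correction
```

In a real deployment the corrector would be fit per light condition and per FoV angle, matching the factors the study identifies as the main bias sources.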
- Research Article
- 10.1007/s10462-025-11455-9
- Dec 8, 2025
- Artificial Intelligence Review
- Zhongnan Zhao + 3 more
A collaborative metaverse-digital twin system for traffic perception, reasoning, and resource scheduling
- Research Article
- 10.1111/nyas.70147
- Dec 7, 2025
- Annals of the New York Academy of Sciences
- Camila Alviar + 2 more
Learning to successfully participate in social interactions is a monumental task for infants, whose perceptual systems are immature and whose communicative signals are complex and hard to parse. To support their infants, caregivers naturally modify their communicative behaviors to be more repetitive, redundant, and rhythmic, thus engaging infants' perceptual biases. In this paper, we present the InCHORRRuS framework, which considers the role of rhythm in organizing caregivers' communicative behaviors across modalities to scaffold communication and dyadic coordination in early social interactions. We argue rhythm's role in infant-directed (ID) communication is particularly highlighted in ID singing, in which metrically structured beat-based rhythms make the multimodal redundancy and repetition in ID communication also temporally predictable, thus "supercharging" the cues' communicative value. Additionally, the repetition in songs, across verses and over time, offers caregivers a natural way of leveraging predictability and familiarity at the local level and at longer interactional timescales alike, increasing the impact of the enriched communicative signal. We review the current literature on timing and rhythm, redundancy, and repetition in ID signals; discuss the evidence on the confluence of redundancy and repetition in rhythmic contexts; and consider open questions and future directions our framework inspires.
- Research Article
- 10.3390/bs15121684
- Dec 4, 2025
- Behavioral Sciences
- Jiahui Liu + 2 more
The cultivation of aesthetic appreciation through engagement with exemplary artworks constitutes a fundamental pillar in fostering children’s cognitive and emotional development, while simultaneously facilitating multidimensional learning experiences across diverse perceptual domains. However, children in early stages of cognitive development frequently encounter substantial challenges when attempting to comprehend and internalize complex visual narratives and abstract artistic concepts inherent in sophisticated artworks. This study presents an innovative methodological framework designed to enhance children’s artwork comprehension capabilities by systematically leveraging the theoretical foundations of audio-visual cross-modal integration. Through investigation of cross-modal correspondences between visual and auditory perceptual systems, we developed a methodology that extracts and interprets musical elements based on gaze behavior patterns derived from prior pilot studies of artwork observation. Utilizing state-of-the-art deep learning techniques, specifically Recurrent Neural Networks (RNNs), these extracted visual-musical correspondences are subsequently transformed into cohesive, aesthetically pleasing musical compositions that maintain semantic and emotional congruence with the observed visual content. The efficacy and practical applicability of our proposed method were validated through empirical evaluation involving 96 children (analyzed through objective behavioral assessments using eye-tracking technology), complemented by qualitative evaluations from 16 parents and 5 experienced preschool educators. The results demonstrate statistically significant improvements in children’s sustained engagement (fixation duration: 58.82 ± 7.38 s vs. 41.29 ± 6.92 s, p < 0.001, Cohen’s d ≈ 1.29), attentional focus (AOI gaze frequency increased 73%, p < 0.001), and subjective evaluations from parents (mean ratings 4.56–4.81/5) when visual experiences are augmented by AI-generated, personalized audio-visual support, potentially scaffolding deeper processing and informing future developments in aesthetic education.
- Research Article
- 10.1088/1402-4896/ae26df
- Dec 2, 2025
- Physica Scripta
- Wei Yi + 7 more
Single-pixel panoramic imaging (SPPI) technology, which integrates convex mirrors with computational imaging algorithms, achieves efficient panoramic perception and demonstrates application potential in autonomous driving perception, pipeline inspection, and other low-altitude applications. However, its algorithmic efficiency and accuracy still require optimization. This study improves the SPPI system from two aspects: modulation mode and reconstruction algorithm. In terms of modulation, the Walsh-Hadamard (W-H) orthogonal basis structure pattern is introduced, which effectively alleviates the noise sensitivity problem of the random mode in SPPI. In terms of algorithms, this study adapts and optimizes two key approaches for panoramic scenes: the orthogonal algorithm based on two-dimensional (2D) W-H transformation and the 2D compressive sensing (2DCS) algorithm based on the 2D Variational Augmented Lagrange Multiplier (V2DALM). These approaches enable independent asymmetric sampling in the radial and angular dimensions, allowing flexible configuration of sampling resources according to the characteristics of different scenes. The experimental system adopts a passive imaging mode, further enhancing the practicality and adaptability of SPPI. This study provides scalable theoretical support for single-pixel panoramic perception systems and corresponding SPPI schemes for three typical requirements: high-fidelity imaging, high-speed reconstruction, and speed-fidelity trade-off. Finally, the SPPI performance of the V2DALM algorithm under different modulation patterns has been preliminarily analyzed and discussed.
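The Walsh-Hadamard modulation patterns mentioned above come from Hadamard matrices, whose mutually orthogonal ±1 rows make reconstruction well conditioned and less noise-sensitive than random patterns. A sketch of the standard Sylvester construction:

```python
import numpy as np

def hadamard(n):
    """Walsh-Hadamard matrix of order n (n a power of two), built by the
    Sylvester construction; its rows are the orthogonal +/-1 patterns
    used to modulate single-pixel measurements."""
    H = np.array([[1]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])  # double the order each step
    return H

H = hadamard(8)
gram = H @ H.T   # rows are mutually orthogonal: H @ H.T = n * I
```

Because H·Hᵀ = n·I, inverting the measurement process reduces to a transpose-and-scale, which is part of why orthogonal-basis single-pixel schemes reconstruct faster and more stably than random-pattern ones.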
- Research Article
- 10.1016/j.atech.2025.101102
- Dec 1, 2025
- Smart Agricultural Technology
- Changjoo Lee + 3 more
Monitoring runtime input data distribution for the safety of the intended functionality in perception systems
- Research Article
- 10.3168/jds.2025-27187
- Dec 1, 2025
- Journal of dairy science
- Lisa Ekman + 3 more
Farmer attitudes and motivation affect their health-seeking behavior in relation to mastitis in dairy cows: A survey on Swedish dairy farms with automatic milking systems.