Articles published on Stereo camera
3200 Search results
- New
- Research Article
- 10.12913/22998624/211569
- Feb 1, 2026
- Advances in Science and Technology Research Journal
- Andrzej Chmielowiec + 4 more
Utilization of stereo camera and artificial intelligence methods for automatic employee motion tracking in manufacturing enterprises
- New
- Research Article
- 10.3390/electronics15030573
- Jan 28, 2026
- Electronics
- Zhongli Ma + 7 more
Ensuring production safety and enabling rapid emergency response in complex industrial environments remains a critical challenge. Traditional inspection robots are often limited by perception delays when confronted with sudden dynamic threats. This paper presents a vision-driven dynamic digital twin system designed to enhance real-time monitoring and emergency management capabilities. The framework constructs high-fidelity 3D models using SolidWorks 2024, Scaniverse 5.0.0, and 3ds Max 2024, and integrates them into a unified digital twin environment via the Unity 3D engine. Its core contribution is a vision-driven dynamic mapping mechanism: robots operating on the Robot Operating System (ROS) and equipped with ZED stereo cameras and embedded YOLOv5m models perform real-time detection of targets such as personnel and fire sources. Recognized targets trigger the dynamic instantiation of corresponding virtual models from a pre-built library, enabling automated, real-time reconstruction within the digital twin. An integrated service platform further supports early warning, status monitoring, and maintenance functions. Experimental validation confirms that the system satisfies key performance metrics, including data collection completeness exceeding 99.99%, incident detection accuracy of 80%, and state synchronization latency below 90 milliseconds. The system improves the dynamic updating efficiency of digital twins and demonstrates strong potential for proactive safety assurance and efficient emergency response in dynamic industrial settings.
- New
- Research Article
- 10.5194/isprs-archives-xlviii-4-w18-2025-255-2026
- Jan 27, 2026
- The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences
- Simla Özbayrak + 4 more
Abstract. Simultaneous Localisation and Mapping (SLAM) is a technique that allows a vehicle to determine its location and map its surroundings simultaneously. This study was carried out to produce a 3-dimensional (3D) model of the environment using the SLAM technique by processing the data obtained from Light Detection and Ranging (LiDAR) and stereo camera sensors mounted on an Unmanned Ground Vehicle (UGV) capable of operating in an indoor-outdoor area. The environment was modelled using LiDAR-SLAM and Visual Simultaneous Localisation and Mapping (VSLAM) methods, using the LiDAR sensor and the stereo camera integrated into the UGV. The accuracy assessment of the produced models was made by comparing the real sizes of the objects in the environment with the sizes in the produced model. In addition, the model’s surface accuracies were tested by examining the linearity of flat surfaces selected from the study area.
- New
- Research Article
- 10.1088/2631-8695/ae3e3a
- Jan 27, 2026
- Engineering Research Express
- Ligang Ye + 2 more
Abstract To address the challenges of blurred equipment boundaries and cumulative registration errors from multiple viewpoints in 3D reconstruction of substations, this study proposes a method based on superpixel segmentation and point cloud registration. Laser radar and stereo cameras are used to synchronously capture images of the substation scene. The K-means clustering algorithm, which does not require predefined cluster numbers, is employed to extract key frames from the visible light images, thereby reducing image redundancy. The key frames are processed using a graph-based superpixel segmentation algorithm to obtain semantically coherent superpixel sets. Combined with a plane estimation algorithm, effective superpixels with sufficient projection points are identified. The superpixel blocks are accurately segmented using the reprojection error of semi-dense point clouds from neighboring key frame superpixel images. The ICP algorithm is employed to achieve precise registration between the laser point cloud and the superpixel segmentation results. Based on the point cloud registration results, we complete the 3D reconstruction of the substation using Delaunay triangulation and point-by-point interpolation. Experimental results show that the key structure loss rate of the 3D reconstruction of the substation is below 0.1%, with high scene completeness and detail restoration, effectively supporting the digital operation and maintenance of substations.
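The ICP registration used above alternates two steps: match each point to its nearest neighbour in the other cloud, then solve the best rigid transform for the matched pairs in closed form. A minimal sketch of the closed-form alignment step (the Kabsch/SVD solution, shown here as a generic illustration, not the paper's implementation):

```python
import numpy as np

def rigid_align(src, dst):
    """Closed-form best-fit rotation R and translation t mapping src onto
    dst (Kabsch/SVD). ICP alternates this step with nearest-neighbour
    matching until the registration converges."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)      # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:               # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_d - R @ mu_s
    return R, t

# Toy check: recover a known 20-degree rotation plus translation
theta = np.radians(20.0)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -0.2, 1.0])
P = np.random.default_rng(0).normal(size=(50, 3))
Q = P @ R_true.T + t_true
R, t = rigid_align(P, Q)
print(np.allclose(R, R_true) and np.allclose(t, t_true))
```

With exact correspondences the transform is recovered to machine precision; in real ICP the nearest-neighbour matching supplies approximate correspondences that improve over iterations.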
- New
- Research Article
- 10.1038/s41597-026-06668-8
- Jan 27, 2026
- Scientific Data
- David Rodríguez-Martínez + 5 more
Exploring high-latitude lunar regions presents a challenging visual environment for robots. The low sunlight elevation angle and minimal light scattering result in a visual field dominated by a strong contrast featuring long, dynamic shadows. Reproducing these conditions on Earth requires sophisticated simulators and specialized facilities. We introduce a unique dataset recorded at the LunaLab from the SnT - University of Luxembourg, an indoor test facility designed to replicate the optical characteristics of multiple lunar latitudes. Our dataset includes images, inertial measurements, and wheel odometry data from robots navigating different trajectories under multiple illumination scenarios, simulating high-latitude lunar conditions from dawn to nighttime with and without the aid of headlights, resulting in 88 distinct sequences containing a total of 1.3 M images. Data was captured using a stereo RGB-inertial sensor, a monocular monochrome camera, and, for the first time, a novel single-photon avalanche diode (SPAD) camera. We recorded both static and dynamic image sequences, with robots navigating at slow (5 cm/s) and fast (50 cm/s) speeds. All data is calibrated, synchronized, and timestamped, providing a valuable resource for validating perception tasks from vision-based autonomous navigation to scientific imaging for future lunar missions targeting high-latitude regions or those intended for robots operating across perceptually degraded environments.
- Research Article
- 10.1364/ol.582735
- Jan 15, 2026
- Optics Letters
- Fengming Huang + 5 more
Accurate in-situ volume measurement of small (1 mm–10 cm) drifting underwater particles is critical for marine ecology and pollutant monitoring, yet it demands snapshot 3D imaging to avoid motion artifacts. Existing imaging techniques, including digital holography and conventional light field imaging, face a fundamental limitation in recovering the complete surface geometry of opaque and semi-transparent particles due to optical occlusion and limited perspective sampling. We overcome this challenge with a face-to-face dual light field camera (F2F-DLFC) system, which simultaneously captures both sides of a target under incoherent dark-field illumination. This dual-side snapshot strategy enables full 3D reconstruction of opaque particles, with experimental results showing volume errors below 6% for targets such as live fish and irregular pellets. While semi-transparent objects still present reconstruction challenges, this work establishes a foundational methodology for in-situ volumetric instrument development, providing a viable approach for accurate volumetry of a wide range of underwater particles.
- Research Article
- 10.3390/computers15010053
- Jan 13, 2026
- Computers
- Pushkar Kadam + 4 more
Robot hand-to-eye calibration is a necessary process for a robot arm to perceive and interact with its environment. Past approaches required collecting multiple images using a calibration board placed at different locations relative to the robot. When the robot or camera is displaced from its calibrated position, hand-to-eye calibration must be redone using the same tedious process. In this research, we developed a novel method that uses a semi-automatic process to perform hand-to-eye calibration with a stereo camera, generating a transformation matrix from the world to the camera coordinate frame from a single image. We use a robot-pointer tool attached to the robot’s end-effector to manually establish a relationship between the world and the robot coordinate frame. Then, we establish the relationship between the camera and the robot using a transformation matrix that maps points observed in the stereo image frame from two-dimensional space to the robot’s three-dimensional coordinate frame. Our analysis of the stereo calibration showed a reprojection error of 0.26 pixels. An evaluation metric was developed to test the camera-to-robot transformation matrix, and the experimental results showed median root mean square errors of less than 1 mm in the x and y directions and less than 2 mm in the z direction in the robot coordinate frame. The results show that, with this work, we contribute a hand-to-eye calibration method that uses three non-collinear points in a single stereo image to map camera-to-robot coordinate-frame transformations.
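The three-non-collinear-points idea can be sketched generically: each point triple defines an orthonormal frame, and chaining the frame seen by the camera with the frame seen by the robot yields the camera-to-robot transform. This is an illustrative simplification under assumed coordinates, not the authors' code:

```python
import numpy as np

def frame_from_three_points(p0, p1, p2):
    """Right-handed frame from three non-collinear points: origin at p0,
    x-axis toward p1, z-axis normal to the plane of the triple."""
    x = (p1 - p0) / np.linalg.norm(p1 - p0)
    z = np.cross(x, p2 - p0)
    z /= np.linalg.norm(z)
    y = np.cross(z, x)
    return np.column_stack([x, y, z]), p0      # rotation, origin

def chain_frames(Rc, tc, Rr, tr):
    """Camera-to-robot transform when the same three points were
    observed in both coordinate frames."""
    R = Rr @ Rc.T
    return R, tr - R @ tc

# The same (hypothetical) triple expressed in camera and robot coordinates
pts_cam = np.array([[0.1, 0.2, 1.0], [0.4, 0.2, 1.1], [0.1, 0.5, 0.9]])
theta = np.radians(30.0)
R_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, np.cos(theta), -np.sin(theta)],
                   [0.0, np.sin(theta),  np.cos(theta)]])
t_true = np.array([0.2, -0.1, 0.3])
pts_rob = pts_cam @ R_true.T + t_true

Rc, tc = frame_from_three_points(*pts_cam)
Rr, tr = frame_from_three_points(*pts_rob)
R, t = chain_frames(Rc, tc, Rr, tr)
print(np.allclose(R, R_true) and np.allclose(t, t_true))
```

Because rotations preserve cross products, the two frames differ by exactly the camera-to-robot transform, which the chaining recovers.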
- Research Article
- 10.1002/njz2.70005
- Jan 1, 2026
- New Zealand Journal of Zoology
- Alice Jane Sacheverall Tansell + 5 more
Kiwis have high conservation needs and populations are often closely monitored. Mark-recapture approaches provide useful information, but they are difficult to implement with these species. We developed a method using trail cameras that, with further development, could identify individuals in the wild and may allow for a mark-recapture approach. We measured kiwi bills with high accuracy, in ideal conditions (daylight, stationary, taxidermied kiwi), using a home-modified stereo camera. Under these conditions, the level of accuracy obtained was within the required inter-observer error (<1.5%) for hand measurement of live birds. Measuring a live kiwi bill at night in field conditions imposed additional challenges. Background composition was key in correctly identifying the bill tip. Camera orientation changes (e.g., a higher, angled-down placement) may potentially mitigate at least one source of error (the correct identification of the bill tip). We were able to synchronise videos, but synchronising photos would allow for higher-quality images, providing greater accuracy in identifying features for measurements. Given ongoing improvements in trail camera technology, there is great potential to use stereo cameras for the measurement of kiwi bills to determine age class and eventually individual identification.
- Research Article
- 10.1016/j.aquaculture.2025.743251
- Jan 1, 2026
- Aquaculture
- Atsushi Ikegami + 2 more
Spatio-temporal-dependent characteristic evaluation of yellowtail (Seriola quinqueradiata) in aquaculture cages based on stereo camera measurements
- Research Article
- 10.3126/jiee.v8i1.82571
- Dec 31, 2025
- Journal of Innovations in Engineering Education
- Bal Krishna Shah + 4 more
The robotics sector struggles to integrate vision-based navigation on a bipedal humanoid robot capable of performing human-like tasks. Although the use of ultrasonic sensors and infrared sensors is a traditional method for object detection, it has significant drawbacks such as low range, high cost and sensitivity to the environment. “Enhancing Humanoid Robot Functionality Through Vision-Based Navigation with Fall Recovery and Object Manipulation” proposes to give vision to the robot, making it capable of transporting objects from one location to another. Two ESP32-CAMs are used as a stereo camera for image capture, with YOLOv11 for object detection and the stereo principle for depth calculation. With the use of one of the most robust and accurate object detection algorithms available, the project aims to enhance object transportation within the visual range of the robot. The final robot can navigate intelligently and grab objects using image processing. The developed humanoid robot encompasses the feature of automatic fall recovery in simulation and natural human movement patterns through kinematical calculations, showcasing potential applications in hazardous environments, industrial automation and interplanetary exploration.
- Research Article
- 10.1088/1361-6501/ae2348
- Dec 30, 2025
- Measurement Science and Technology
- Hanna Pot + 2 more
Abstract Wave-structure interactions of flexible membrane-type materials are an emerging research field, driven by their potential in renewable energy and breakwater concepts. This study proposes stereoscopic Digital Image Correlation (DIC) as a scalable method for spatiotemporal measurements of fluid-structure interactions in wave tanks. The scalability is presented by two setups with domain dimensions ranging from O(10⁻¹ m) to O(10¹ m). The calibrations of 5 adjacent and synchronized stereoscopic camera pairs are projected onto a common frame of reference to cover the large domain. The presented methodology includes suggestions on the calibration method, and a practical speckle application technique is proposed. The benefits of the method are highlighted by the preliminary indication of a dynamic scaling law for wave-structure interactions. This work can serve as a foundation for further development and application of stereoscopic DIC for such structures. It is expected that this large domain method will contribute to further physical understanding of the fluid-structure interactions of large floating structures in waves.
- Research Article
- 10.3390/machines14010027
- Dec 24, 2025
- Machines
- Mingxin Li + 6 more
In industrial environments such as ports and warehouses, autonomous logistics vehicles face significant challenges in coordinating multiple vehicles while ensuring safe and efficient path planning. This study proposes a novel real-time cooperative control framework for autonomous vehicles, combining reinforcement learning (RL) and distributed model predictive control (DMPC). The RL agent dynamically adjusts the optimization weights of the DMPC to adapt to the vehicle’s real-time environment, while the DMPC enables decentralized path planning and collision avoidance. The system leverages multi-source sensor fusion, including GNSS, UWB, IMU, LiDAR, and stereo cameras, to provide accurate state estimations of vehicles. Simulation results demonstrate that the proposed RL-DMPC approach outperforms traditional centralized control strategies in terms of tracking accuracy, collision avoidance, and safety margins. Furthermore, the proposed method significantly improves control smoothness compared to rule-based strategies. This framework is particularly effective in dynamic and constrained industrial settings, offering a robust solution for multi-vehicle coordination with minimal communication delays. The study highlights the potential of combining RL with DMPC to achieve real-time, scalable, and adaptive solutions for autonomous logistics.
- Research Article
- 10.20965/jrm.2025.p1602
- Dec 20, 2025
- Journal of Robotics and Mechatronics
- Hayato Mitsuhashi + 2 more
This study proposes a novel stair recognition method that integrates a monocular camera and a laser to improve the safety of the stair-climbing function in an omnidirectional autonomous electric wheelchair equipped with three Mecanum wheels mounted on a single axle. We performed three-dimensional stair measurements, estimated the angle of descent using a camera and laser, and built an automatic stair angle adjustment function into the wheelchair. The proposed method uses coordinate points to detect the staircase structure in three dimensions (x, y, z), performs distance transformation to achieve high-accuracy three-dimensional distance estimation, and provides a detailed visualization of the staircase geometry. Estimating the descent angle from the obtained 3D data yielded a maximum error of 2.72° and an average error of 1.04°, demonstrating higher accuracy than a stereo camera. Furthermore, the automatic stair angle adjustment function of the proposed wheelchair was validated, and an algorithm was developed to automatically maintain the wheelchair’s horizontal orientation based on the acquired stair angle. The experimental results confirmed that the proposed method can accurately adjust in real time with varying staircase angles, significantly improving the safety of the stair-climbing function. In addition, by applying this method to an autonomous mobile robot, it can detect obstacles and recognize staircase structures in the absence of ambient illumination, allowing for autonomous operation while analyzing its environment in three dimensions.
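A descent-angle estimate of the kind evaluated above can be illustrated with a generic sketch (simplifying assumptions, not the paper's camera-and-laser pipeline): fit a plane to the measured 3D points by least squares and take the angle between its normal and the vertical.

```python
import numpy as np

def descent_angle_deg(points):
    """Fit a plane z = a*x + b*y + c to Nx3 points (least squares) and
    return the angle in degrees between its normal and the vertical."""
    A = np.column_stack([points[:, 0], points[:, 1], np.ones(len(points))])
    (a, b, _), *_ = np.linalg.lstsq(A, points[:, 2], rcond=None)
    normal = np.array([-a, -b, 1.0])           # plane normal (up-facing)
    cos_t = normal[2] / np.linalg.norm(normal)
    return np.degrees(np.arccos(cos_t))

# Synthetic 30-degree incline rising along x
x, y = np.meshgrid(np.linspace(0, 2, 10), np.linspace(0, 1, 5))
z = np.tan(np.radians(30.0)) * x
pts = np.column_stack([x.ravel(), y.ravel(), z.ravel()])
print(round(descent_angle_deg(pts), 2))        # ≈ 30.0
```

Real stair data would first need segmentation of tread/edge points; the plane fit only makes sense over a consistent surface.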
- Research Article
- 10.1002/rob.70136
- Dec 19, 2025
- Journal of Field Robotics
- Jianyuan Ruan + 1 more
ABSTRACT Public data sets are essential for progress in autonomous robotics. The advancements in sensor technology and evolving application scenarios create ongoing demands for updated benchmarking resources. This paper introduces the HK-MEMS Data set, the first public data set to provide automotive-grade MEMS LiDAR (Micro-Electromechanical Systems Light Detection and Ranging) data in complex urban environments. While MEMS LiDAR has emerged as a cost-effective and durable alternative to mechanical LiDAR for autonomous vehicles, the lack of data sets hinders corresponding research. Our work targets the under-explored robustness challenge of Simultaneous Localization and Mapping (SLAM) in degenerate and dynamic urban scenarios. The data set integrates multi-modal sensors, including MEMS LiDAR, stereo cameras, GNSS, and an Inertial Navigation System (INS), collected across three platforms: a handheld device, a mobile robot, and public buses exhibiting real-world driving behaviors. Over 187 min (75.4 km) of data were captured, spanning diverse urban scenarios with dynamic objects and degenerate scenarios, such as tunnels, highways, shopping areas, and subway stations. Comprehensive evaluations of state-of-the-art SLAM algorithms on this benchmark reveal significant performance degradation in degenerate and dynamic scenes, highlighting unresolved challenges in real-world deployment. The HK-MEMS Data set provides a comprehensive resource for evaluating emerging MEMS LiDAR technology and establishes a challenging benchmark for advancing robust SLAM methodologies in urban navigation. The data set is openly available at: https://github.com/RuanJY/HK_MEMS_Dataset
- Research Article
- 10.63313/aerpc.9063
- Dec 17, 2025
- Advances in Engineering Research Possibilities and Challenges
- Yuanhang Xuan
This paper presents the design and implementation of a real-time 3D acquisition system utilizing binocular stereoscopic vision. The primary objective is to develop a robust and efficient pipeline capable of capturing, processing, and reconstructing three-dimensional geometry of dynamic scenes with low latency. The core methodology integrates synchronized image capture from a calibrated stereo camera pair, followed by real-time stereo rectification and a dense stereo matching algorithm optimized for speed. The system successfully achieves real-time performance, generating dense depth maps at a frame rate sufficient for interactive applications. Experimental results demonstrate the system's accuracy in reconstructing static objects and its capability to track depth variations in moderately dynamic environments. The conclusion highlights that the implemented system provides a practical and effective solution for real-time 3D perception, establishing a reliable foundation for applications in robotics guidance, quality inspection, and augmented reality where immediate spatial feedback is critical.
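The dense depth maps described above come from the standard rectified-stereo relation Z = f·B/d (focal length f in pixels, baseline B in metres, disparity d in pixels). A minimal sketch with illustrative parameter values (not taken from the paper):

```python
import numpy as np

def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Metric depth from rectified-stereo disparity: Z = f * B / d.
    Zero or negative disparities are mapped to infinite depth."""
    d = np.asarray(disparity_px, dtype=float)
    depth = np.full_like(d, np.inf)
    valid = d > 0
    depth[valid] = focal_px * baseline_m / d[valid]
    return depth

# Example: 700 px focal length, 12 cm baseline
# 70 px of disparity -> 1.2 m; 35 px -> 2.4 m (halving disparity doubles depth)
print(depth_from_disparity([70.0, 35.0], focal_px=700.0, baseline_m=0.12))
```

The inverse relationship also explains why stereo depth precision degrades quadratically with range: a fixed disparity error costs more metres the farther the target.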
- Research Article
- 10.3390/s25247495
- Dec 9, 2025
- Sensors (Basel, Switzerland)
- Pierluigi Rossi + 6 more
Highlights:
- What is the chance of detecting distance errors in RGB-D cameras? Depth camera errors can be predicted according to distance, angle of the target, and light conditions.
- How can these sources of measurement bias be considered? A geometry-aware model was tested to provide depth measurement corrections in outdoor environments.
- What is the behavior of the sensor across different distances? Distance measurement errors grow with distance and angle, up to 3.5 m with targets at 16 m.
- What is the precision of the depth correction model developed in this research? Depth correction models can achieve RMSE between 0.46 and 0.64 m, even at long distances.

Stereo cameras, also known as depth cameras or RGB-D cameras, are increasingly employed in a large variety of machinery for obstacle detection and navigation planning. This also represents an opportunity in agricultural machinery to detect the presence of workers on foot and avoid collisions, for safety purposes. However, their outdoor performance at medium and long range under operational light conditions remains weakly quantified: the authors therefore developed a field protocol and a model to characterize the depth pipeline of stereo cameras, taking the Intel RealSense D455 as a benchmark, across distances from 4 m to 16 m in realistic farm settings. Tests were conducted using a 1 square meter planar target in outdoor environments, under diverse illumination conditions and with the panel located at 0°, 10°, 20° and 35° from the center of the camera’s field of view (FoV). Built-in presets were also adjusted during tests, generating a total of 128 samples. The authors then fit disparity surfaces to predict and correct systematic bias as a function of distance and radial FoV position, computing mean depth and estimating a model of systematic error that expresses depth bias as a function of distance, light conditions, and FoV position. The results showed that the model can predict depth errors with a good degree of precision in every tested scenario (RMSE: 0.46–0.64 m; MAE: 0.40–0.51 m), enabling replication and benchmarking on other sensors and field contexts while supporting safety-critical perception systems in agriculture.
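A geometry-aware bias model of the kind described can be sketched as a polynomial least-squares fit of depth error against distance and off-axis angle. The functional form and synthetic data below are assumptions for illustration, not the authors' disparity-surface model:

```python
import numpy as np

def bias_design_matrix(distance, angle, deg=2):
    """Polynomial terms dist^i * ang^j with total degree <= deg."""
    terms = [distance**i * angle**j
             for i in range(deg + 1) for j in range(deg + 1 - i)]
    return np.column_stack(terms)

def fit_bias_model(distance, angle, bias, deg=2):
    """Least-squares coefficients of the depth-bias polynomial."""
    coeffs, *_ = np.linalg.lstsq(bias_design_matrix(distance, angle, deg),
                                 bias, rcond=None)
    return coeffs

# Synthetic bias growing with distance and off-axis angle
rng = np.random.default_rng(1)
dist = rng.uniform(4.0, 16.0, 200)     # metres, matching the tested range
ang = rng.uniform(0.0, 35.0, 200)      # degrees from FoV centre
bias = 0.01 * dist**2 + 0.002 * dist * ang
c = fit_bias_model(dist, ang, bias)
pred = bias_design_matrix(dist, ang) @ c
rmse = np.sqrt(np.mean((pred - bias) ** 2))
print(rmse < 1e-6)                     # exact model lies in the fit's span
```

Corrected depth is then simply the raw measurement minus the predicted bias at that distance, angle, and lighting condition.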
- Research Article
- 10.3390/s25247467
- Dec 8, 2025
- Sensors (Basel, Switzerland)
- Xu Li + 5 more
Highlights:
- A general lightweight SLAM system is proposed that achieves high real-time performance by replacing traditional feature matching with efficient sparse optical flow tracking.
- A coarse-to-fine pose estimation strategy ensures robust and accurate camera localization through RANSAC PnP and subsequent nonlinear optimization.
- Main findings: a lightweight visual SLAM approach for RGB-D and stereo cameras is proposed, incorporating loop detection and re-localization; a three-strategy adaptive ORB feature extraction method is combined with a coarse-to-fine, two-stage pose estimation process to improve localization accuracy.
- Implications: high-precision visual localization and mapping can be achieved without compromising computational efficiency; the integration of multi-level optical flow with coarse-to-fine pose refinement enables GL-VSLAM to balance accuracy and speed for deployment on resource-limited platforms.

Feature-based indirect SLAM is more robust than direct SLAM; however, feature extraction and descriptor computation are time-consuming. In this paper, we propose GL-VSLAM, a general lightweight visual SLAM approach designed for RGB-D and stereo cameras. GL-VSLAM utilizes sparse optical flow matching based on uniform motion model prediction to establish keypoint correspondences between consecutive frames, rather than relying on descriptor-based feature matching, thereby achieving high real-time performance. To enhance positioning accuracy, we adopt a coarse-to-fine strategy for pose estimation in two stages. In the first stage, the initial camera pose is estimated using RANSAC PnP based on robust keypoint correspondences from sparse optical flow. In the second stage, the camera pose is further refined by minimizing the reprojection error. Keypoints and descriptors are extracted from keyframes for backend optimization and loop closure detection. We evaluate our system on the TUM and KITTI datasets, as well as in a real-world environment, and compare it with several state-of-the-art methods. Experimental results demonstrate that our method achieves comparable positioning accuracy, while its efficiency is up to twice that of ORB-SLAM2.
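The reprojection error minimized in the second refinement stage is straightforward to compute for a pinhole camera. A minimal sketch with assumed intrinsics and landmarks (not GL-VSLAM code):

```python
import numpy as np

def reprojection_error(R, t, pts3d, pts2d, K):
    """Mean pixel distance between observed keypoints pts2d and 3D points
    projected with pose (R, t) under pinhole intrinsics K; this is the
    quantity a coarse-to-fine pose refinement drives toward zero."""
    cam = pts3d @ R.T + t                    # world -> camera frame
    proj = cam @ K.T                         # pinhole projection
    uv = proj[:, :2] / proj[:, 2:3]          # perspective divide
    return np.linalg.norm(uv - pts2d, axis=1).mean()

# Illustrative intrinsics and landmarks (assumed values)
K = np.array([[700.0, 0.0, 320.0],
              [0.0, 700.0, 240.0],
              [0.0, 0.0, 1.0]])
pts3d = np.array([[0.0, 0.0, 2.0], [0.3, -0.1, 2.5], [-0.2, 0.2, 3.0]])
proj = pts3d @ K.T
pts2d = proj[:, :2] / proj[:, 2:3]           # observations at identity pose

err_exact = reprojection_error(np.eye(3), np.zeros(3), pts3d, pts2d, K)
err_off = reprojection_error(np.eye(3), np.array([0.01, 0.0, 0.0]),
                             pts3d, pts2d, K)
print(err_exact, err_off > err_exact)        # exact pose gives zero error
```

In a full pipeline the pose would be iteratively adjusted (e.g. by Gauss-Newton) to minimize this error over all inlier correspondences.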
- Research Article
- 10.1002/itl2.70187
- Dec 8, 2025
- Internet Technology Letters
- Yunna Liu + 3 more
ABSTRACT This paper introduces an enhanced motion analysis technology and framework specifically designed for sports scenarios in the industrial 5.0 environment. This technology takes advantage of human gaze features and binocular vision, which significantly improves the accuracy and reliability of motion analysis. By using dual cameras, the system can precisely calibrate the gaze direction, achieving more accurate gaze tracking. In addition, this framework conducts a comprehensive analysis of the inherent errors in the binocular gaze tracking algorithm, which achieves more accurate and robust motion detection. The intelligent motion analysis system built based on advanced image recognition technology further consolidates the capabilities of this technology. This system not only provides real‐time, high‐precision motion data but also has outstanding application potential in the industrial Internet of Things (IIoT) environment, as accurate motion understanding is crucial for optimizing human‐machine interaction and improving operational efficiency in such an environment. Moreover, the system's adaptability to dynamic conditions ensures consistent performance across various industrial settings.
- Research Article
- 10.1080/01431161.2025.2579803
- Dec 5, 2025
- International Journal of Remote Sensing
- Väinö Karjalainen + 7 more
ABSTRACT Drones are increasingly used in forestry to capture high-resolution remote sensing data, supporting enhanced monitoring, assessment, and decision-making processes. While operations above the forest canopy are already highly automated, flying inside forests remains challenging, primarily relying on manual piloting. In dense forests, relying on the Global Navigation Satellite System (GNSS) for localization is not feasible. In addition, the drone must autonomously adjust its flight path to avoid collisions. Recently, advancements in robotics have enabled autonomous drone flights in GNSS-denied obstacle-rich areas. In this article, a step towards autonomous forest data collection is taken by building a prototype of a robotic under-canopy drone utilizing state-of-the-art open source methods and validating its performance for data collection inside forests. Specifically, the study focused on camera-based autonomous flight under the forest canopy and photogrammetric post-processing of the data collected with the low-cost onboard stereo camera. The autonomous flight capability of the prototype was evaluated through multiple test flights in boreal forests. The tree parameter estimation capability was studied by performing diameter at breast height (DBH) estimation. The prototype successfully carried out flights in selected challenging forest environments, and the experiments showed promising performance in forest 3D modelling with a miniaturized stereoscopic photogrammetric system. The DBH estimation achieved a root mean square error (RMSE) of 3.33–3.97 cm (10.69–12.98%) across all trees. For trees with a DBH less than 30 cm, the RMSE was 1.16–2.56 cm (5.74–12.47%). The results provide valuable insights into autonomous under-canopy forest mapping and highlight the critical next steps for advancing lightweight robotic drone systems for mapping complex forest environments.
- Research Article
- 10.1055/s-0045-1813722
- Dec 1, 2025
- Arquivos Brasileiros de Neurocirurgia: Brazilian Neurosurgery
- Fabiana Ramos Viana + 5 more
Abstract Neuronavigation systems have become an essential tool for accurate surgical guidance. However, the influence of operator experience on the accuracy of these systems is still debated. This study aims to investigate the accuracy and precision of neuronavigation in an environment mimicking the conditions found in a surgical room and the impact of operator experience. We conducted a series of experiments using a neuronavigation system with operators of varying levels of experience. The accuracy of the system was measured and compared across 3 different operators. Inexperienced operators exhibited significantly lower levels of accuracy compared with their more experienced counterparts. The measured accuracy for an experienced operator was 2.9 ± 1.2 mm, with an overall mean of 3.5 ± 1.7 mm when including results from inexperienced individuals. The best scenario appears to be when the point of interest is in the right temporal region, closer to the stereo vision camera of the tracking system. Our results demonstrate different accuracies in the neuronavigation system between operators with varying levels of experience. However, individuals without prior experience or training exhibit an acceptable level of accuracy for its use in surgical applications.