Infrared object detection for robot vision based on multiple focus diffusion and task interaction alignment
Infrared object detection plays a key role in tasks such as autonomous robot navigation, industrial inspection, and search and rescue. However, the small gray-scale difference between object and background regions in infrared images, together with the single-channel gray-scale information, blurs the semantic content of the image and prevents the robot from detecting objects effectively. To address these problems, this paper proposes a robot-vision infrared object detection model (MFDTIA-Detection) based on multiple focus diffusion and task interaction alignment. The feature extraction module adopts a dual-stream fusion structure in the backbone network, combining the local feature extraction of a CNN with the global feature modeling of a Transformer. The task-aligned dynamic detection head enhances inter-task interaction through shared convolutions and a task-alignment structure, exploiting task interaction features to improve detection accuracy while reducing the parameter count to suit resource-constrained devices. Experimental results on public datasets show that the proposed model improves precision by 1.3%, recall by 1%, mAP50 by 0.6%, and mAP50-95 by 1.4%.
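The dual-stream idea in the abstract can be illustrated with a toy sketch in Python. This is not the paper's implementation: the smoothing kernel, sequence length, and concatenation-style fusion are illustrative assumptions, with a 1-D convolution standing in for the CNN branch and single-head self-attention for the Transformer branch.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_branch(x, kernel):
    # CNN-style local feature extraction: 1-D valid convolution
    n = len(x) - len(kernel) + 1
    return np.array([np.dot(x[i:i + len(kernel)], kernel) for i in range(n)])

def global_branch(x):
    # Transformer-style global modeling: single-head self-attention
    # over the whole sequence, with identity Q/K/V projections
    scores = np.outer(x, x) / np.sqrt(len(x))
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ x

x = rng.normal(size=16)                                # toy row of infrared intensities
local = local_branch(x, np.array([0.25, 0.5, 0.25]))   # shape (14,)
glob = global_branch(x)                                # shape (16,)
fused = np.concatenate([local, glob])                  # dual-stream fusion
print(fused.shape)  # (30,)
```

A real dual-stream backbone would fuse 2-D feature maps at matched resolutions rather than concatenating 1-D vectors, but the division of labor is the same: the convolution sees only a local window, while attention mixes information across the entire input.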
- Citations: 91
- 10.1109/lgrs.2023.3330957
- Jan 1, 2023
- IEEE Geoscience and Remote Sensing Letters
- Citations: 110
- 10.1007/s10489-020-01882-2
- Sep 19, 2020
- Applied Intelligence
- Citations: 80
- 10.1016/j.procs.2020.03.302
- Jan 1, 2020
- Procedia Computer Science
- Citations: 30175
- 10.1109/tpami.2016.2577031
- Jun 6, 2016
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- Citations: 103
- 10.1016/j.patcog.2023.109788
- Jun 28, 2023
- Pattern Recognition
- Citations: 13
- 10.1109/tcsvt.2022.3202094
- Feb 1, 2023
- IEEE Transactions on Circuits and Systems for Video Technology
- Citations: 54
- 10.1016/j.eswa.2022.119132
- Oct 28, 2022
- Expert Systems with Applications
- Citations: 1235
- 10.1109/tpami.2019.2956516
- Apr 1, 2021
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- Research Article
- Citations: 3
- 10.1016/j.infrared.2024.105627
- Nov 19, 2024
- Infrared Physics and Technology
YOFIR: High precise infrared object detection algorithm based on YOLO and FasterNet
- Research Article
- 10.3390/s25134054
- Jun 29, 2025
- Sensors (Basel, Switzerland)
Currently, infrared object detection is utilized in a broad spectrum of fields, including military applications, security, and aerospace. Nonetheless, the limited computational power of edge devices presents a considerable challenge in balancing accuracy and computational efficiency in infrared object detection. To enhance the accuracy of infrared target detection and support the deployment of robust models on edge platforms for rapid real-time inference, this paper presents LKD-YOLOv8, an infrared object detection method that integrates the YOLOv8 architecture with masked generative distillation (MGD), further augmented by a lightweight convolution design and an attention mechanism for improved feature adaptability. Linear deformable convolution (LDConv) strengthens spatial feature extraction by dynamically adjusting kernel offsets, while coordinate attention (CA) refines feature alignment through channel-wise interaction. A large-scale model (YOLOv8s) serves as the teacher to impart knowledge and supervise the training of a compact student model (YOLOv8n). Experiments show that LKD-YOLOv8 achieves a 1.18% mAP@0.5:0.95 improvement over baseline methods while reducing the parameter size by 7.9%. Our approach effectively balances accuracy and efficiency, rendering it applicable to resource-constrained edge devices in infrared scenarios.
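Masked generative distillation masks the student's feature maps and trains the student to reconstruct the teacher's features. The sketch below is a heavily simplified stand-in, not the paper's implementation: the generation network is omitted, and the mask ratio and feature shapes are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def masked_distillation_loss(student_feat, teacher_feat, mask_ratio=0.5):
    # Randomly zero out a fraction of the student's features, then
    # penalise the squared distance to the teacher's features. Real MGD
    # passes the masked features through a small generation network that
    # reconstructs the masked regions before the loss is applied.
    mask = (rng.random(student_feat.shape) >= mask_ratio).astype(float)
    return float(np.mean((student_feat * mask - teacher_feat) ** 2))

student = rng.normal(size=(8, 8))   # hypothetical student feature map
teacher = rng.normal(size=(8, 8))   # hypothetical teacher feature map
loss = masked_distillation_loss(student, teacher)
print(loss >= 0.0)  # True: a squared-error loss is non-negative
```

Because the loss only sees masked student activations, the student cannot simply copy features through; it must learn representations informative enough to recover what the teacher encodes.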
- Research Article
- 10.3390/drones9070479
- Jul 7, 2025
- Drones
Within the field of remote sensing, Unmanned Aerial Vehicle (UAV) infrared object detection plays a pivotal role, especially in complex environments. However, existing methods face challenges such as insufficient accuracy or low computational efficiency, particularly in the detection of small objects. This paper proposes a lightweight and accurate UAV infrared object detection model, YOLO-UIR, for small object detection from a UAV perspective. The model is based on the YOLO architecture and mainly includes the Efficient C2f module, lightweight spatial perception (LSP) module, and bidirectional feature interaction fusion (BFIF) module. The Efficient C2f module significantly enhances feature extraction capabilities by combining local and global features through an Adaptive Dual-Stream Attention Mechanism. Compared with the existing C2f module, the introduction of Partial Convolution reduces the model’s parameter count while maintaining high detection accuracy. The BFIF module further enhances feature fusion effects through cross-level semantic interaction, thereby improving the model’s ability to fuse contextual features. Moreover, the LSP module efficiently combines features from different distances using Large Receptive Field Convolution Layers, significantly enhancing the model’s long-range information capture capability. Additionally, the use of Reparameterized Convolution and Depthwise Separable Convolution ensures the model’s lightweight nature, making it highly suitable for real-time applications. On the DroneVehicle and HIT-UAV datasets, YOLO-UIR achieves superior detection performance compared to existing methods, with an mAP of 71.1% and 90.7%, respectively. The model also demonstrates significant advantages in terms of computational efficiency and parameter count. Ablation experiments verify the effectiveness of each optimization module.
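The parameter savings from depthwise separable convolution, which the YOLO-UIR abstract leans on for its lightweight design, are easy to quantify. The channel and kernel sizes below are arbitrary examples, not the model's actual configuration.

```python
def conv_params(c_in, c_out, k):
    # Standard k x k convolution parameter count (bias omitted)
    return c_in * c_out * k * k

def dws_conv_params(c_in, c_out, k):
    # Depthwise separable = depthwise k x k (one filter per input
    # channel) followed by a pointwise 1 x 1 convolution
    return c_in * k * k + c_in * c_out

std = conv_params(64, 128, 3)      # 64 * 128 * 9 = 73728
dws = dws_conv_params(64, 128, 3)  # 64 * 9 + 64 * 128 = 8768
print(std, dws, round(std / dws, 1))  # 73728 8768 8.4
```

For these sizes the separable form uses roughly an eighth of the parameters, which is why it recurs in models aimed at real-time inference on UAV-class hardware.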
- Conference Article
- Citations: 5
- 10.1109/ines.2012.6249883
- Jun 1, 2012
The paper deals with the issues of autonomous navigation of mobile robots. Nowadays, many automatically operated vehicles and robots are in wide use, from industry (storage systems) to everyday life (automatic parking systems). The paper describes the realization of a smaller-sized robot vehicle capable of autonomous navigation. Within this framework we describe the construction of the vehicle, the on-board computer developed for navigation, and the possible sensors. At the beginning of the paper, earlier developments are presented that also dealt with autonomous robot navigation and environment mapping. Building on these works, we started to develop our robot system. In a former paper we presented our robot vehicle and an early navigation algorithm. The current paper summarizes which sensors can detect the environment and localize the vehicle, and presents our experimental results on the accuracy of the Kinect sensor in more detail.
- Conference Article
- Citations: 2
- 10.1109/honet50430.2020.9322812
- Dec 14, 2020
Robot navigation relies heavily on sensors to identify the robot's current location and guide it toward its designated position. This movement is mostly achieved via semi- or fully autonomous navigation methods that use a pre-built map. However, we argue that autonomous robot navigation should not only provide a route to the target location; it should also offer a way to return to the starting position or reference location. In this work, an autonomous robot navigation system has been built using a combination of Simultaneous Localization and Mapping (SLAM) and the signal of a device as a reference point for finding the original position. We use a Turtlebot2 robot with a Kinect sensor to create the map, and an infrared beam to create the homing signal. In reaching the target location, the robot uses SLAM with odometry to track distance while simultaneously sensing the presence of the homing signal. It then returns to its home location by tracking the navigation map and the home signal to find the best route. Our results show that the combination of SLAM and a signal reference point simplifies autonomous robot navigation management.
- Research Article
- Citations: 2
- 10.1016/j.infrared.2024.105454
- Nov 4, 2024
- Infrared Physics and Technology
Small aircraft detection in infrared aerial imagery based on deep neural network
- Conference Article
- Citations: 14
- 10.1109/iros.2008.4651215
- Sep 1, 2008
Autonomous robot navigation in unstructured outdoor environments is a challenging area of active research. The navigation task requires identifying safe, traversable paths which allow the robot to progress toward a goal while avoiding obstacles. Machine learning techniques - in particular, classifier ensembles - are well adapted to this task, accomplishing near-to-far learning by augmenting near-field stereo readings in order to identify safe terrain and obstacles in the far field. Composition of the ensemble and subsequent combination of model outputs in this dynamic problem domain remain open questions. Recently, ensemble selection has been proposed as a mechanism for selecting and combining models from an existing model library and shown to perform well in static domains. We propose the adaptation of this technique to the time-evolving data associated with the outdoor robot navigation domain. Important research questions as to the composition of the model library, as well as how to combine selected models' outputs, are addressed in a two-factor experimental evaluation. We evaluate the performance of our technique on six fully labeled datasets, and show that our technique outperforms memoryless baseline techniques that do not leverage past experience.
- Conference Article
- Citations: 8
- 10.1109/robot.2010.5509634
- May 1, 2010
Autonomous robot navigation in unstructured outdoor environments is a challenging and largely unsolved area of active research. The navigation task requires identifying safe, traversable paths that allow the robot to progress towards a goal while avoiding obstacles. Machine learning techniques are well adapted to this task, accomplishing near-to-far learning by training appearance-based models using near-field stereo readings in order to predict safe terrain and obstacles in the far field. However, these methods are subject to degraded performance when training data sets exhibit class imbalance, or skew, where data instances of one class outnumber those in another. In such scenarios, classifiers can be overwhelmed by the majority class, and will tend to ignore the minority class. In this paper, we show that typical outdoor terrain scenarios are associated with training data imbalance, and examine the impact of using undersampling, oversampling, SMOTE, and biased penalties techniques to correct for imbalance in stereo-derived training data. We conduct a statistically significant, repeated measures empirical evaluation and demonstrate improved far-field terrain prediction performance when using such methods for handling class imbalance versus taking no corrective action at all.
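Of the imbalance corrections the paper compares, random oversampling is the simplest to sketch. The toy feature vectors and labels below are invented, and duplication stands in for SMOTE, which would instead synthesise interpolated minority samples.

```python
import random

def random_oversample(X, y):
    # Naive random oversampling: duplicate minority-class samples until
    # every class matches the size of the largest class.
    random.seed(0)
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    target = max(len(v) for v in by_class.values())
    Xb, yb = [], []
    for label, samples in by_class.items():
        extra = [random.choice(samples) for _ in range(target - len(samples))]
        for xi in samples + extra:
            Xb.append(xi)
            yb.append(label)
    return Xb, yb

X = [[0.1], [0.2], [0.3], [0.4], [0.9]]           # toy stereo-derived features
y = ["safe", "safe", "safe", "safe", "obstacle"]  # skewed class labels
Xb, yb = random_oversample(X, y)
print(yb.count("safe"), yb.count("obstacle"))  # 4 4
```

Balancing this way keeps the classifier from being overwhelmed by the majority class, at the cost of repeated (or, with SMOTE, synthetic) minority samples.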
- Research Article
- 10.3390/fractalfract9100640
- Sep 30, 2025
- Fractal and Fractional
Autonomous robotic navigation is essential in modern systems, revolutionising industries that operate in both static and dynamic environments. Conventional techniques that handle environment mapping, path planning, and motion control as separate modules often struggle with the complexities of autonomous navigation. Therefore, this paper presents an integrated technique that combines three essential components: environment mapping, path planning, and motion control. The first component, mapping, utilises both binary and probabilistic occupancy maps to represent the environment. The second, path planning, incorporates graph- and sampling-based algorithms such as PRM, A*, Hybrid A*, RRT, RRT*, and BiRRT, which are evaluated in terms of path length, computational time, and safety margin on various maps. The final component, motion control, utilises both conventional and advanced controller strategies, such as PID, FOPID, SFC, and MPC, for sinusoidal trajectory tracking. Four case studies on path planning and one on trajectory tracking across various occupancy maps demonstrated that the A* algorithm and MPC outperformed all compared techniques in terms of optimal path length, computational time, safety margin, and trajectory tracking error. The proposed integrated approach thus shows that the interplay between mapping fidelity, planning efficiency, and control robustness is vital for reliable autonomous navigation.
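Among the planners evaluated, A* came out ahead. A minimal grid A* can be sketched as follows; the 4-connected occupancy grid and Manhattan heuristic are illustrative choices rather than the paper's exact setup.

```python
import heapq

def astar(grid, start, goal):
    # A* on a 4-connected occupancy grid (0 = free, 1 = occupied).
    # Manhattan distance is admissible for unit-cost 4-connected moves.
    def h(p):
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    open_set = [(h(start), 0, start, None)]
    came, gbest = {}, {start: 0}
    while open_set:
        _, g, cur, parent = heapq.heappop(open_set)
        if cur in came:
            continue              # already expanded with a better cost
        came[cur] = parent
        if cur == goal:           # reconstruct path by walking parents
            path = []
            while cur is not None:
                path.append(cur)
                cur = came[cur]
            return path[::-1]
        r, c = cur
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc] == 0:
                ng = g + 1
                if ng < gbest.get((nr, nc), float("inf")):
                    gbest[(nr, nc)] = ng
                    heapq.heappush(open_set, (ng + h((nr, nc)), ng, (nr, nc), cur))
    return None  # no path exists

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
path = astar(grid, (0, 0), (2, 0))
print(path)  # [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0)]
```

The heuristic is what separates A* from Dijkstra here: nodes are expanded in order of cost-so-far plus estimated cost-to-go, so the search is steered toward the goal without sacrificing optimality.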
- Conference Article
- Citations: 132
- 10.1145/3308561.3353771
- Oct 24, 2019
Navigation robots have the potential to overcome some of the limitations of traditional navigation aids for blind people, specially in unfamiliar environments. In this paper, we present the design of CaBot (Carry-on roBot), an autonomous suitcase-shaped navigation robot that is able to guide blind users to a destination while avoiding obstacles on their path. We conducted a user study where ten blind users evaluated specific functionalities of CaBot, such as a vibro-tactile handle to convey directional feedback; experimented to find their comfortable walking speed; and performed navigation tasks to provide feedback about their overall experience. We found that CaBot's performance highly exceeded users' expectations, who often compared it to navigating with a guide dog or sighted guide. Users' high confidence, sense of safety, and trust on CaBot poses autonomous navigation robots as a promising solution to increase the mobility and independence of blind people, in particular in unfamiliar environments.
- Conference Article
- Citations: 5
- 10.1109/nebc.2005.1431924
- Apr 2, 2005
The complexity of an autonomous robot's navigation task poses several roadblocks to traditional fuzzy control schemes: the input space is much larger than in typical fuzzy applications, adding inputs increases the required number of set evaluations exponentially, and as a rule base swells, manual description becomes difficult to impossible. This paper therefore describes a new behavior-based navigation system. A serious problem in fuzzy behavior implementation is the need for multiple fuzzy rules, which produce multiple output recommendations. After studying different methods for coordinating multiple behaviors and resolving conflicts among competing behaviors, it was observed that degrees of applicability are analogous to neuronal activation levels. A novel neuro-fuzzy system is therefore proposed for behavior integration, resulting in a more accurate and optimal path.
- Conference Article
- Citations: 2
- 10.1109/wrc-sara.2019.8931954
- Aug 1, 2019
To solve the problem of autonomous localization and navigation caused by loss of the mobile robot's position information in a complex environment, a method for autonomous navigation of a mobile robot based on ROS is proposed in this paper. The ROS platform is used to avoid problems such as code redundancy and poor portability. In this method, a Monte Carlo algorithm first realizes autonomous localization of the robot; a raster map is constructed from lidar data, and the concept of raster occupancy is introduced to reduce the influence of sensor noise. An improved ant colony algorithm is then proposed for global path planning to realize autonomous navigation. Finally, the robot control system is combined with the ROS navigation system for map construction and autonomous navigation experiments. The experimental results show that the improved ant colony algorithm reduces the number of iterations by 38%, improves adaptability to the environment, facilitates fast finding of the shortest path, and accurately realizes the robot's autonomous navigation task.
- Conference Article
- Citations: 12
- 10.1109/ictc49870.2020.9289333
- Oct 21, 2020
Developing an autonomous indoor mobile robot navigation system from scratch is very difficult and it takes a long time. It requires a series of complex tasks such as handling given sensor inputs, calculating all the information needed for autonomous navigation, and controlling actuators required for movement. In this paper, an autonomous navigation system for indoor mobile robots is introduced mainly based on open source provided by the robot operating system. The presented system is capable of autonomously navigating an unstructured indoor environment avoiding collision with static or dynamic objects. To this end, the system consists of three main modules: mapping, localization, and planning. The mapping module builds a global map for an unknown environment by means of a simultaneous localization and mapping algorithm based on laser scanner data. The localization module estimates the mobile robot’s pose within the prebuilt map by way of an adaptive Monte Carlo localization approach. The planning module builds a local cost map for collision avoidance, generates collision-free trajectories to reach a goal pose based on the cost map, and produces control commands to follow the trajectories. The presented system has been tested not only in simulation environments built in the Gazebo simulator but also in real environments utilizing the Jackal mobile robot, to validate its performance for autonomous navigation including collision avoidance.
- Conference Article
- Citations: 1
- 10.1109/hsi.2014.6860458
- Jun 1, 2014
Toward fully autonomous navigation, guidance plays an important role. In this paper, the authors propose an obstacle avoidance strategy based on distance clustering analysis for safe autonomous robot navigation. Autonomous navigation systems must be able to recognize objects in order to perform collision-free motion in unknown indoor and outdoor environments. First, objects are detected using Density-Based Spatial Clustering of Applications with Noise (DBSCAN) through a dynamic density-reachable implementation. Second, a distance clustering analysis is applied to determine an optimal path for collision avoidance. A set of possible waypoints is then extracted to estimate the best path candidate. Preliminary results were gathered and tested on a group of consecutive frames to prove the effectiveness of these methods.
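The base DBSCAN algorithm the paper builds on is compact enough to sketch; the eps and min_pts values and the toy obstacle points below are illustrative, and the paper's dynamic density-reachable extension is not reproduced here.

```python
import numpy as np

def dbscan(points, eps=0.5, min_pts=3):
    # Minimal DBSCAN: returns one cluster id per point, -1 = noise.
    # A point is a core point if it has >= min_pts neighbours (itself
    # included) within distance eps; clusters grow from core points.
    n = len(points)
    labels = np.full(n, -1)
    visited = np.zeros(n, bool)
    cluster = 0
    dists = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    for i in range(n):
        if visited[i]:
            continue
        visited[i] = True
        neighbors = list(np.flatnonzero(dists[i] <= eps))
        if len(neighbors) < min_pts:
            continue  # noise for now; may still become a border point
        labels[i] = cluster
        queue = neighbors
        while queue:
            j = queue.pop()
            if not visited[j]:
                visited[j] = True
                nbrs = np.flatnonzero(dists[j] <= eps)
                if len(nbrs) >= min_pts:
                    queue.extend(nbrs)  # j is core: expand through it
            if labels[j] == -1:
                labels[j] = cluster
        cluster += 1
    return labels

# Two tight obstacle clusters plus one isolated outlier (toy data)
pts = np.array([[0, 0], [0.1, 0], [0, 0.1], [0.1, 0.1],
                [5, 5], [5.1, 5], [5, 5.1],
                [10, 10]])
print(dbscan(pts))  # [ 0  0  0  0  1  1  1 -1]
```

For obstacle avoidance, each cluster can then be summarized (e.g. by its centroid and extent) and treated as a single object when selecting waypoints.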
- Conference Article
- Citations: 8
- 10.1109/icma.2014.6885953
- Aug 1, 2014
Simultaneous Localization and Mapping (SLAM) aims to estimate the positions and orientations of the mobile robot and to construct a model of the environment. SLAM helps the robot plan and execute a collision-free trajectory from the current configuration to the target configuration, so it is essential and critical for the mobile robot's autonomous navigation and effective task execution. SLAM plays an important role in a wide range of application fields, from indoor to outdoor, from industry to military, and from terrestrial and submarine settings to outer space. In indoor dynamic scenarios with moving objects, robust SLAM is also important for the mobile robot to co-exist safely with humans and to improve its ability to estimate its own pose and the surrounding world model. The research goal of this dissertation is to design, implement and validate Graph-SLAM algorithms for mobile robots in indoor office-like dynamic scenarios. Graph-SLAM belongs to the category that addresses the issues of localization and mapping in a hierarchical way, where a topological graph is constructed to represent the robot poses, the local relative motion constraints between them are estimated as the edges of the graph, and a globally consistent registration is performed to estimate the trajectory of the robot. Graph-SLAM can lead to highly accurate results approaching the ground truth. The designed and implemented Graph-SLAM algorithm comprises three parts: scan matching to estimate the local relative roto-translation, batch optimization to estimate the global mobile robot trajectory, and global line-feature-based mapping to construct the line-feature model of the environment.
The details of each chapter are briefly shown as follows: On the local level, moving-object-detection based scan matching is accomplished: first, conditioned-Hough-Transform-based segmentation is performed to extract and group the small-scale line-feature-candidate samples; second, occupancy-analysis-based moving-object detection is executed to detect and discard the segments corresponding to the moving objects; third, linear-regression-based line-feature matching is applied to merge the similar small-scale line features into larger-scale line features, and also to match the larger-scale line features in order to estimate the roto-translation values. The experiments will prove the effectiveness of the algorithm to estimate the relative roto-translation value even faced with the disturbances of the moving objects in the dynamic scenario. On the global level, the motion constraints computed from scan matching between the immediate consecutive, the close-by-but-not-adjacent robot poses are used to construct the topological graph, and the least-square cost function associated with the graph is optimized by a linear solution. The experimental tests dealing with the publicly available dataset will prove the effectiveness of the batch optimization method, which is quite efficient and accurate. In addition, for the local-level relative roto-translation estimation, yet-another robust wall-detection-based scan-matching algorithm is proposed and implemented to enhance the capability of the previous scan-matching algorithm: first, conditioned-Hough-Transform-and-linear-regression-based line-segment detection is performed to detect the line segments from the raw laser-scan-range data; second, wall detection is done to select the line segments that correspond to the walls of the environment; third, matching by fitting point to line is executed to estimate the roto-translation value. 
The experimental result will verify the effectiveness of this algorithm even when the moving object is close to the wall and there is much rotation error in the input odometry data. Moreover, on the global level, with the knowledge of the estimated global robot poses for the transfor