Infrared object detection for robot vision based on multiple focus diffusion and task interaction alignment

Abstract

Infrared object detection plays a key role in tasks such as autonomous robot navigation, industrial inspection, and search and rescue. However, infrared images carry only single-channel gray-level information, and the gray-level difference between an object and the background is small. Both factors blur the semantic information in the image and prevent the robot from detecting objects effectively. To address these problems, this paper proposes MFDTIA-Detection, an infrared object detection method for robot vision based on multiple focus diffusion and task interaction alignment. In the backbone network, the feature extraction module adopts a dual-stream fusion structure that combines the local feature extraction of a CNN with the global feature modeling of a transformer. The task-aligned dynamic detection head enhances inter-task interaction through shared convolutional and task-aligned structures, exploiting task interaction features to improve detection accuracy while reducing the number of parameters to suit resource-constrained devices. Experimental results on public datasets show that the proposed model improves precision by 1.3%, recall by 1%, mAP50 by 0.6%, and mAP50-95 by 1.4%.
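
The reported metrics can be made concrete with a small worked example. The sketch below is not the paper's evaluation code. It assumes the common convention that a prediction counts as a true positive when its intersection over union (IoU) with an unmatched ground-truth box is at least 0.5, which is the threshold that the "50" in mAP50 usually refers to. The function names `iou` and `precision_recall`, the box format `(x1, y1, x2, y2)`, and the sample boxes are all hypothetical choices made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(predictions, ground_truth, iou_threshold=0.5):
    """Greedy matching: visit predictions in descending confidence and
    match each to the best still-unmatched ground-truth box."""
    matched = set()
    true_positives = 0
    for box, _score in sorted(predictions, key=lambda p: p[1], reverse=True):
        best_index, best_iou = None, iou_threshold
        for index, gt_box in enumerate(ground_truth):
            if index in matched:
                continue  # each ground-truth box is matched at most once
            overlap = iou(box, gt_box)
            if overlap >= best_iou:
                best_index, best_iou = index, overlap
        if best_index is not None:
            matched.add(best_index)
            true_positives += 1
    precision = true_positives / len(predictions) if predictions else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Hypothetical data: two objects, three detections (box, confidence).
ground_truth = [(10, 10, 50, 50), (60, 60, 100, 100)]
predictions = [
    ((12, 12, 50, 50), 0.9),    # close to the first object
    ((0, 0, 20, 20), 0.6),      # barely overlaps anything
    ((61, 59, 99, 101), 0.8),   # close to the second object
]
precision, recall = precision_recall(predictions, ground_truth)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Here two of the three detections match an object, so precision is 2/3 and recall is 1. Under the usual convention, average precision summarizes the precision-recall curve obtained by sweeping the confidence threshold, mAP50 averages it over classes at an IoU threshold of 0.5, and mAP50-95 additionally averages over IoU thresholds from 0.5 to 0.95. Whether the paper follows exactly this protocol is not stated in the abstract.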

Similar Papers
  • Research Article
  • Citations: 3
  • 10.1016/j.infrared.2024.105627
YOFIR: High precise infrared object detection algorithm based on YOLO and FasterNet
  • Nov 19, 2024
  • Infrared Physics and Technology
  • Mi Wen + 5 more

  • Research Article
  • 10.3390/s25134054
LKD-YOLOv8: A Lightweight Knowledge Distillation-Based Method for Infrared Object Detection.
  • Jun 29, 2025
  • Sensors (Basel, Switzerland)
  • Xiancheng Cao + 2 more

Currently, infrared object detection is utilized in a broad spectrum of fields, including military applications, security, and aerospace. Nonetheless, the limited computational power of edge devices presents a considerable challenge in achieving an optimal balance between accuracy and computational efficiency in infrared object detection. In order to enhance the accuracy of infrared target detection and strengthen the implementation of robust models on edge platforms for rapid real-time inference, this paper presents LKD-YOLOv8, an innovative infrared object detection method that integrates the YOLOv8 architecture with masked generative distillation (MGD), further augmented by a lightweight convolution design and an attention mechanism for improved feature adaptability. Linear deformable convolution (LDConv) strengthens spatial feature extraction by dynamically adjusting kernel offsets, while coordinate attention (CA) refines feature alignment through channel-wise interaction. We employ a larger model (YOLOv8s) as the teacher to impart knowledge and supervise the training of a compact student model (YOLOv8n). Experiments show that LKD-YOLOv8 achieves a 1.18% mAP@0.5:0.95 improvement over baseline methods while reducing the parameter size by 7.9%. Our approach effectively balances accuracy and efficiency, making it suitable for resource-constrained edge devices in infrared scenarios.

  • Research Article
  • 10.3390/drones9070479
YOLO-UIR: A Lightweight and Accurate Infrared Object Detection Network Using UAV Platforms
  • Jul 7, 2025
  • Drones
  • Chao Wang + 4 more

Within the field of remote sensing, Unmanned Aerial Vehicle (UAV) infrared object detection plays a pivotal role, especially in complex environments. However, existing methods face challenges such as insufficient accuracy or low computational efficiency, particularly in the detection of small objects. This paper proposes a lightweight and accurate UAV infrared object detection model, YOLO-UIR, for small object detection from a UAV perspective. The model is based on the YOLO architecture and mainly includes the Efficient C2f module, lightweight spatial perception (LSP) module, and bidirectional feature interaction fusion (BFIF) module. The Efficient C2f module significantly enhances feature extraction capabilities by combining local and global features through an Adaptive Dual-Stream Attention Mechanism. Compared with the existing C2f module, the introduction of Partial Convolution reduces the model’s parameter count while maintaining high detection accuracy. The BFIF module further enhances feature fusion effects through cross-level semantic interaction, thereby improving the model’s ability to fuse contextual features. Moreover, the LSP module efficiently combines features from different distances using Large Receptive Field Convolution Layers, significantly enhancing the model’s long-range information capture capability. Additionally, the use of Reparameterized Convolution and Depthwise Separable Convolution ensures the model’s lightweight nature, making it highly suitable for real-time applications. On the DroneVehicle and HIT-UAV datasets, YOLO-UIR achieves superior detection performance compared to existing methods, with an mAP of 71.1% and 90.7%, respectively. The model also demonstrates significant advantages in terms of computational efficiency and parameter count. Ablation experiments verify the effectiveness of each optimization module.

  • Conference Article
  • Citations: 5
  • 10.1109/ines.2012.6249883
Map building with RGB-D camera for mobil robot
  • Jun 1, 2012
  • Laszlo Somlyai + 1 more

The paper deals with the autonomous navigation of mobile robots. Automatically operated vehicles and robots are now widely used, from industry (storage systems) to everyday applications (automatic parking systems). The paper describes the realization of a small robot vehicle capable of autonomous navigation, covering the vehicle itself, the on-board computer developed for navigation, and the candidate sensors. It begins by presenting earlier developments that also addressed autonomous robot navigation and environment mapping, which informed the development of our robot system. A former paper presented our robot vehicle and an early navigation algorithm. The current paper summarizes which sensors can be used to perceive the environment and localize the vehicle, and presents in more detail our experimental results on the accuracy of the Kinect sensor.

  • Conference Article
  • Citations: 2
  • 10.1109/honet50430.2020.9322812
A Consolidation of SLAM and Signal Reference Point for Autonomous Robot Navigation
  • Dec 14, 2020
  • I Made Murwantara + 3 more

Robot navigation relies heavily on sensors to identify the robot's current location and guide it to its designated position. This movement is mostly achieved through semi- or fully autonomous navigation methods that use a pre-built map. However, we argue that autonomous robot navigation should not only provide a route to the target location; it should also have a feature for returning to the starting position or a reference location. In this work, autonomous robot navigation has been built using a combination of Simultaneous Localization and Mapping (SLAM) and a device signal that serves as a reference point for finding the original position. We use a Turtlebot2 robot with a Kinect sensor to create the map, and an infrared beam to create the home signal guidance. To reach the target location, the robot uses SLAM with odometry to track distance while simultaneously sensing for the presence of the homing signal. It then returns to its home location by tracking the navigation map and the home signal to find the best route. Our results show that the combination of SLAM and a signal reference point simplifies autonomous robot navigation management.

  • Research Article
  • Citations: 2
  • 10.1016/j.infrared.2024.105454
Small aircraft detection in infrared aerial imagery based on deep neural network
  • Nov 4, 2024
  • Infrared Physics and Technology
  • Kai Zhang + 3 more

  • Conference Article
  • Citations: 14
  • 10.1109/iros.2008.4651215
Learning in dynamic environments with Ensemble Selection for autonomous outdoor robot navigation
  • Sep 1, 2008
  • M.J Procopio + 2 more

Autonomous robot navigation in unstructured outdoor environments is a challenging area of active research. The navigation task requires identifying safe, traversable paths which allow the robot to progress toward a goal while avoiding obstacles. Machine learning techniques - in particular, classifier ensembles - are well adapted to this task, accomplishing near-to-far learning by augmenting near-field stereo readings in order to identify safe terrain and obstacles in the far field. Composition of the ensemble and subsequent combination of model outputs in this dynamic problem domain remain open questions. Recently, ensemble selection has been proposed as a mechanism for selecting and combining models from an existing model library and shown to perform well in static domains. We propose the adaptation of this technique to the time-evolving data associated with the outdoor robot navigation domain. Important research questions as to the composition of the model library, as well as how to combine selected models' outputs, are addressed in a two-factor experimental evaluation. We evaluate the performance of our technique on six fully labeled datasets, and show that our technique outperforms memoryless baseline techniques that do not leverage past experience.

  • Conference Article
  • Citations: 8
  • 10.1109/robot.2010.5509634
Coping with imbalanced training data for improved terrain prediction in autonomous outdoor robot navigation
  • May 1, 2010
  • Michael J Procopio + 2 more

Autonomous robot navigation in unstructured outdoor environments is a challenging and largely unsolved area of active research. The navigation task requires identifying safe, traversable paths that allow the robot to progress towards a goal while avoiding obstacles. Machine learning techniques are well adapted to this task, accomplishing near-to-far learning by training appearance-based models using near-field stereo readings in order to predict safe terrain and obstacles in the far field. However, these methods are subject to degraded performance when training data sets exhibit class imbalance, or skew, where data instances of one class outnumber those in another. In such scenarios, classifiers can be overwhelmed by the majority class, and will tend to ignore the minority class. In this paper, we show that typical outdoor terrain scenarios are associated with training data imbalance, and examine the impact of using undersampling, oversampling, SMOTE, and biased penalties techniques to correct for imbalance in stereo-derived training data. We conduct a statistically significant, repeated measures empirical evaluation and demonstrate improved far-field terrain prediction performance when using such methods for handling class imbalance versus taking no corrective action at all.

  • Research Article
  • 10.3390/fractalfract9100640
Integrated Analysis of Mapping, Path Planning, and Advanced Motion Control for Autonomous Robotic Navigation
  • Sep 30, 2025
  • Fractal and Fractional
  • Kishore Bingi + 4 more

Autonomous robotic navigation is essential in modern systems for revolutionising various industries that operate in both static and dynamic environments. To achieve this autonomous navigation, various conventional techniques that handle environment mapping, path planning, and motion control as individual modules often face challenges in addressing the complexities of autonomous navigation. Therefore, this paper presents an integrated technique that combines three essential components, such as environment mapping, path planning, and motion control, to enhance autonomous navigation performance. The first component, i.e., the mapping, utilises both binary and probabilistic occupancy maps to represent the environment. The second component is path planning, which incorporates various graph- and sampling-based algorithms such as PRM, A*, Hybrid A*, RRT, RRT*, and BiRRT, which are evaluated in terms of path length, computational time, and safety margin on various maps. The final component, i.e., motion control, utilises both conventional and advanced controller strategies such as PID, FOPID, SFC, and MPC, for better sinusoidal trajectory tracking. The four case studies for path planning and one case study on trajectory tracking on various occupancy maps demonstrated that the A* algorithm and MPC outperformed all the compared techniques in terms of optimal path length, computational time, safety margin, and trajectory tracking error. Thus, the proposed integrated approach reveals that the interplay between mapping fidelity, planning efficiency, and control robustness is vital for reliable autonomous navigation.

  • Conference Article
  • Citations: 132
  • 10.1145/3308561.3353771
CaBot: Designing and Evaluating an Autonomous Navigation Robot for Blind People
  • Oct 24, 2019
  • João Guerreiro + 5 more

Navigation robots have the potential to overcome some of the limitations of traditional navigation aids for blind people, especially in unfamiliar environments. In this paper, we present the design of CaBot (Carry-on roBot), an autonomous suitcase-shaped navigation robot that is able to guide blind users to a destination while avoiding obstacles on their path. We conducted a user study where ten blind users evaluated specific functionalities of CaBot, such as a vibro-tactile handle to convey directional feedback; experimented to find their comfortable walking speed; and performed navigation tasks to provide feedback about their overall experience. We found that CaBot's performance greatly exceeded the expectations of users, who often compared it to navigating with a guide dog or sighted guide. Users' high confidence, sense of safety, and trust in CaBot position autonomous navigation robots as a promising solution to increase the mobility and independence of blind people, in particular in unfamiliar environments.

  • Conference Article
  • Citations: 5
  • 10.1109/nebc.2005.1431924
Behavior coordination of autonomous mobile robot navigation by neuro-fuzzy system
  • Apr 2, 2005
  • S Khatoon

The complexity of an autonomous robot's navigation task poses several roadblocks to the use of traditional fuzzy control schemes: the input space is much larger than in typical fuzzy applications, adding inputs increases the required number of set evaluations exponentially, and as the rule base swells, manual description becomes difficult or impossible. Therefore, this paper describes a new behavior-based navigation system. A serious problem in fuzzy behavior implementation is the requirement of multiple fuzzy rules, which results in multiple output recommendations. After studying different methods used for coordinating multiple behaviors and resolving conflicts among competing behaviors, it has been observed that the degrees of applicability are analogous to neuronal activation levels. Therefore, a novel neuro-fuzzy system is proposed for behavior integration, which results in a more accurate and optimal path.

  • Conference Article
  • Citations: 2
  • 10.1109/wrc-sara.2019.8931954
An autonomous navigation method for mobile robot based on ROS
  • Aug 1, 2019
  • Shuyu Wang + 2 more

In order to solve the problem of autonomous localization and navigation caused by the loss of the position information of the mobile robot in a complex environment, a method for autonomous navigation of a mobile robot based on ROS is proposed in this paper. The ROS platform is used to avoid problems such as code redundancy and poor portability. In this method, a Monte Carlo algorithm is first used to realize the autonomous localization of the robot, a raster map is constructed from lidar data, and the concept of raster occupancy is introduced to reduce the influence of sensor noise. Then an improved ant colony algorithm is proposed for global path planning to realize autonomous navigation of the robot. Finally, the robot control system is combined with the ROS navigation system for map construction and autonomous navigation experiments. The experimental results show that the improved ant colony algorithm reduces the number of iterations by 38%, improves adaptability to the environment, facilitates fast finding of the shortest path, and accurately realizes the autonomous navigation task of the robot.

  • Conference Article
  • Citations: 12
  • 10.1109/ictc49870.2020.9289333
Autonomous Mobile Robot Navigation in Indoor Environments: Mapping, Localization, and Planning
  • Oct 21, 2020
  • Samyeul Noh + 2 more

Developing an autonomous indoor mobile robot navigation system from scratch is very difficult and it takes a long time. It requires a series of complex tasks such as handling given sensor inputs, calculating all the information needed for autonomous navigation, and controlling actuators required for movement. In this paper, an autonomous navigation system for indoor mobile robots is introduced mainly based on open source provided by the robot operating system. The presented system is capable of autonomously navigating an unstructured indoor environment avoiding collision with static or dynamic objects. To this end, the system consists of three main modules: mapping, localization, and planning. The mapping module builds a global map for an unknown environment by means of a simultaneous localization and mapping algorithm based on laser scanner data. The localization module estimates the mobile robot’s pose within the prebuilt map by way of an adaptive Monte Carlo localization approach. The planning module builds a local cost map for collision avoidance, generates collision-free trajectories to reach a goal pose based on the cost map, and produces control commands to follow the trajectories. The presented system has been tested not only in simulation environments built in the Gazebo simulator but also in real environments utilizing the Jackal mobile robot, to validate its performance for autonomous navigation including collision avoidance.

  • Conference Article
  • Citations: 1
  • 10.1109/hsi.2014.6860458
Laser based obstacle avoidance strategy for autonomous robot navigation using DBSCAN for versatile distance
  • Jun 1, 2014
  • Danilo Caceres Hernandez + 2 more

Guidance plays an important role in achieving fully autonomous navigation. In this paper, the authors propose an obstacle avoidance strategy based on distance clustering analysis for safe autonomous robot navigation. Autonomous navigation systems must be able to recognize objects in order to perform collision-free motion in unknown indoor and outdoor environments. Firstly, it was proposed to detect objects using the density-based spatial clustering of applications with noise (DBSCAN) method through a dynamic density-reachable implementation. Secondly, in order to determine an optimal path for collision avoidance, a distance clustering analysis was implemented. Subsequently, a set of possible waypoints was extracted in order to estimate the best path candidate. Preliminary results were gathered and tested on a group of consecutive frames. These specific methods of measurement were chosen to prove their effectiveness.

  • Conference Article
  • Citations: 8
  • 10.1109/icma.2014.6885953
Graph-based robust localization and mapping for autonomous mobile robotic navigation
  • Aug 1, 2014
  • Jingchun Yin + 3 more

Simultaneous Localization and Mapping (SLAM) aims to estimate the positions and orientations of the mobile robot and to construct a model of the environment. SLAM can help the robot plan and execute a collision-free trajectory from the current configuration to the target configuration, so it is essential for the mobile robot's autonomous navigation and effective task execution. SLAM plays an important role in a wide range of application fields, from indoor to outdoor, from industry to military, and from land and underwater to outer space. In indoor dynamic scenarios where there are moving objects, robust SLAM is also important for the mobile robot to co-exist safely with humans and to improve the robot's ability to estimate its own pose and the surrounding world model. The research goal of this dissertation is to design, implement and validate Graph-SLAM algorithms for mobile robots in indoor office-like dynamic scenarios. Graph-SLAM belongs to the category that addresses localization and mapping in a hierarchical way: a topological graph is constructed to represent the robot poses, the local relative motion constraints between them are estimated as the edges of the graph, and a globally consistent registration is performed to estimate the trajectory of the robot. Graph-SLAM can lead to much more accurate results that approach the ground truth. The designed and implemented Graph-SLAM algorithm consists of the following three parts: scan matching to estimate the local relative roto-translation, batch optimization to estimate the mobile robot's global trajectory, and global line-feature-based mapping to construct the global line-feature model of the environment.
The details of each chapter are briefly shown as follows: On the local level, moving-object-detection based scan matching is accomplished: first, conditioned-Hough-Transform-based segmentation is performed to extract and group the small-scale line-feature-candidate samples; second, occupancy-analysis-based moving-object detection is executed to detect and discard the segments corresponding to the moving objects; third, linear-regression-based line-feature matching is applied to merge the similar small-scale line features into larger-scale line features, and also to match the larger-scale line features in order to estimate the roto-translation values. The experiments will prove the effectiveness of the algorithm to estimate the relative roto-translation value even faced with the disturbances of the moving objects in the dynamic scenario. On the global level, the motion constraints computed from scan matching between the immediate consecutive, the close-by-but-not-adjacent robot poses are used to construct the topological graph, and the least-square cost function associated with the graph is optimized by a linear solution. The experimental tests dealing with the publicly available dataset will prove the effectiveness of the batch optimization method, which is quite efficient and accurate. In addition, for the local-level relative roto-translation estimation, yet-another robust wall-detection-based scan-matching algorithm is proposed and implemented to enhance the capability of the previous scan-matching algorithm: first, conditioned-Hough-Transform-and-linear-regression-based line-segment detection is performed to detect the line segments from the raw laser-scan-range data; second, wall detection is done to select the line segments that correspond to the walls of the environment; third, matching by fitting point to line is executed to estimate the roto-translation value. 
The experimental result will verify the effectiveness of this algorithm even when the moving object is close to the wall and there is much rotation error in the input odometry data. Moreover, on the global level, with the knowledge of the estimated global robot poses for the transfor

More from: International Journal of Advanced Robotic Systems
  • Research Article
  • 10.1177/17298806251360659
Development of a low-cost modular snake-like robot with 2-DOF modules for rescue operations in collapsed environments with fast communication
  • Jul 1, 2025
  • International Journal of Advanced Robotic Systems
  • G Seeja + 3 more

  • Research Article
  • 10.1177/17298806251325135
Research on variable impedance control of SEA-driven upper limb rehabilitation robot based on singular perturbation method
  • Jul 1, 2025
  • International Journal of Advanced Robotic Systems
  • Bingshan Hu + 4 more

  • Research Article
  • 10.1177/17298806251348118
Automatic cutting and suturing control system based on improved FP16 visual recognition algorithm
  • Jul 1, 2025
  • International Journal of Advanced Robotic Systems
  • Jiayin Wang + 1 more

  • Research Article
  • 10.1177/17298806251342040
Infrared object detection for robot vision based on multiple focus diffusion and task interaction alignment
  • Jul 1, 2025
  • International Journal of Advanced Robotic Systems
  • Jixu Zhang + 6 more

  • Research Article
  • 10.1177/17298806251360454
Robustness evaluation of offline reinforcement learning for robot control against action perturbations
  • Jul 1, 2025
  • International Journal of Advanced Robotic Systems
  • Shingo Ayabe + 3 more

  • Research Article
  • 10.1177/17298806251356720
Real-time path planning for Mecanum-wheeled robots with type-2 fuzzy logic controller
  • Jul 1, 2025
  • International Journal of Advanced Robotic Systems
  • Thanh-Lam Bui + 2 more

  • Research Article
  • 10.1177/17298806251352007
Smooth likelihood-based collision avoidance for polygon shaped and differential drive vehicles
  • Jul 1, 2025
  • International Journal of Advanced Robotic Systems
  • Yang Zhou + 1 more

  • Research Article
  • 10.1177/17298806251363648
Evaluation of a model-driven approach for the integration of robot operating system-based complex robot systems
  • Jul 1, 2025
  • International Journal of Advanced Robotic Systems
  • Nadia Hammoudeh García + 3 more

  • Research Article
  • 10.1177/17298806251352059
Intraoperative computed tomography-guided robotic needle biopsy system with real-time imaging ability and remote-center-of-motion control
  • May 1, 2025
  • International Journal of Advanced Robotic Systems
  • Zheng-Yang Lai + 6 more

  • Research Article
  • 10.1177/17298806251339684
A two-wheeled robotic wheelchair with a slidable seat for elderly and people with lower limb disabilities
  • May 1, 2025
  • International Journal of Advanced Robotic Systems
  • Munyu Kim + 6 more
