Purpose
When fire breaks out inside a building, a firefighting robot must perceive the scene efficiently. Individual sensors each provide a specific type of data, but correlating the data from multiple sensors closely is challenging. To address this issue, this study explores a fusion approach that integrates a thermal imaging camera and a LiDAR sensor to enhance the perception capabilities of firefighting robots in fire environments.

Design/methodology/approach
Accurate calibration of the sensors is essential before they can be fused. This paper proposes an extrinsic calibration method based on rigid body transformation. The collected data are optimized with the Ceres solver to obtain precise calibration parameters. Building on this calibration, a sensor fusion method based on coordinate projection transformation is proposed, enabling real-time mapping between images and point clouds. In addition, data collection with the proposed fusion device is validated in experimental smoke-filled fire environments.

Findings
The extrinsic calibration method based on rigid body transformation yields an average reprojection error of 1.02 pixels, indicating good accuracy. The fused data combine the advantages of the thermal imaging camera and the LiDAR, overcoming the limitations of the individual sensors.

Originality/value
This paper introduces an extrinsic calibration method based on rigid body transformation, along with a sensor fusion approach based on coordinate projection transformation. The effectiveness of this fusion strategy is validated in simulated fire environments.
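
To illustrate what a coordinate projection transformation and a reprojection error involve, the following is a minimal sketch and not the authors' implementation. It assumes an ideal pinhole model for the thermal camera with no lens distortion, an intrinsic matrix `K`, and extrinsic parameters `R` and `t` that map LiDAR coordinates into the camera frame. The function names and all numeric values are hypothetical.

```python
import numpy as np


def project_points(points_lidar, R, t, K):
    """Project N x 3 LiDAR points to pixel coordinates (assumed pinhole model).

    R (3 x 3) and t (3,) are the assumed extrinsics taking LiDAR coordinates
    into the camera frame; K (3 x 3) is the assumed camera intrinsic matrix.
    Returns an N x 2 pixel array and a boolean mask of points in front of
    the camera. Points behind the camera get NaN pixel coordinates.
    """
    points_lidar = np.asarray(points_lidar, dtype=float)
    # Rigid body transformation: p_cam = R @ p_lidar + t, applied row-wise.
    points_cam = points_lidar @ R.T + t
    in_front = points_cam[:, 2] > 0.0

    pixels = np.full((points_lidar.shape[0], 2), np.nan)
    # Perspective projection: multiply by K, then divide by depth.
    uvw = points_cam[in_front] @ K.T
    pixels[in_front] = uvw[:, :2] / uvw[:, 2:3]
    return pixels, in_front


def mean_reprojection_error(points_lidar, pixels_observed, R, t, K):
    """Mean Euclidean pixel distance between projected and observed points."""
    pixels, in_front = project_points(points_lidar, R, t, K)
    diffs = pixels[in_front] - np.asarray(pixels_observed, dtype=float)[in_front]
    return float(np.mean(np.linalg.norm(diffs, axis=1)))
```

In a calibration setting, an optimizer such as Ceres would adjust `R` and `t` to minimize this kind of error over matched LiDAR and image features. The sketch only evaluates the error for given parameters.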