In emergency rescue operations, unmanned aerial vehicles (UAVs) equipped with thermal infrared (TIR) sensors are essential for obtaining ground information at night. However, existing target detection algorithms mainly consider detection accuracy and neglect processing speed and storage requirements. Furthermore, current neural network target detection algorithms focus primarily on conventional RGB images and are not optimized for UAV-based TIR video streams. To address these deficiencies, this paper proposes an improved Mask-RCNN algorithm for target detection in UAV TIR video streams. First, MobileNetV3, which was designed for processing RGB images, is applied to TIR data from outdoor emergency rescue operations, significantly improving the time efficiency of the algorithm. Second, prior knowledge, such as the projection model of the airborne camera and the temperature characteristics of targets in the UAV TIR video stream, is used to filter the detection results, rather than to apply temperature masks before detection. Compared with the original Mask-RCNN algorithm, the improved algorithm increases the processing speed, reduces the storage requirements, and achieves detection performance equal to or slightly better than that of the original.
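The post-detection filtering step can be illustrated with a minimal sketch. It is not the implementation used in this paper: it assumes a nadir-pointing pinhole camera, and the `Detection` fields, function names, size range, and temperature range are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    width_px: float     # bounding-box width in pixels
    height_px: float    # bounding-box height in pixels
    mean_temp_c: float  # mean temperature inside the mask, in degrees Celsius


def ground_sample_distance(altitude_m, focal_length_mm, pixel_pitch_um):
    """Metres of ground covered by one pixel (assumes a nadir-pointing pinhole camera)."""
    return altitude_m * (pixel_pitch_um * 1e-6) / (focal_length_mm * 1e-3)


def filter_detections(detections, altitude_m, focal_length_mm, pixel_pitch_um,
                      size_range_m=(0.2, 2.5), temp_range_c=(20.0, 40.0)):
    """Keep detections whose ground extent and temperature are plausible.

    size_range_m and temp_range_c are illustrative assumptions, not values
    taken from the paper.
    """
    gsd = ground_sample_distance(altitude_m, focal_length_mm, pixel_pitch_um)
    kept = []
    for det in detections:
        # Convert the larger box side from pixels to metres on the ground.
        extent_m = max(det.width_px, det.height_px) * gsd
        size_ok = size_range_m[0] <= extent_m <= size_range_m[1]
        temp_ok = temp_range_c[0] <= det.mean_temp_c <= temp_range_c[1]
        if size_ok and temp_ok:
            kept.append(det)
    return kept
```

Because the filter runs on the detector output, it only needs per-detection box sizes and temperatures, so no mask has to be applied to each frame before detection.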