Terahertz (THz) waves can penetrate many non-metallic materials, such as paper, plastic, and clothing, which makes them highly valuable for security inspection. However, the low contrast between target objects and the background in THz images complicates object detection and recognition. Although many schemes have been proposed to tackle this problem, low detection accuracy and high computational complexity remain challenges. To address them, we propose Cross-Feature Fusion Transformer YOLO (CFT-YOLO), a method for detecting concealed hazardous objects in THz images. We design a Multi-Channel Spatial Feature Module (MCSFM) that transfers information across channels and across spatial positions between feature maps, and that also passes feature parameters between the backbone and the neck of the network. Additionally, we develop a CrossFuse-Transformer (CF-Former) module that uses self-attention to exploit the correlations and structural information between different feature maps. We conduct a series of experiments on public datasets. The results show that CFT-YOLO outperforms other advanced methods in THz hazardous object detection. Moreover, CFT-YOLO achieves a better balance between model complexity and detection accuracy than YOLOv8s, making it more practical for real-world applications.
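
To make the idea of self-attention across different feature maps concrete, the following is a minimal sketch and not the CF-Former architecture itself. It assumes two feature maps with the same number of channels, flattens each into tokens, and applies single-head scaled dot-product attention over the joint token set, so that every position can attend to positions in both maps. The function names (`flatten_to_tokens`, `cross_map_self_attention`) and the use of identity projections in place of learned query, key, and value weights are illustrative assumptions.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def flatten_to_tokens(fmap):
    """Turn a feature map indexed as [C][H][W] into H*W tokens of length C."""
    channels = len(fmap)
    height, width = len(fmap[0]), len(fmap[0][0])
    return [[fmap[c][y][x] for c in range(channels)]
            for y in range(height) for x in range(width)]


def cross_map_self_attention(fmap_a, fmap_b):
    """Single-head self-attention over the joint tokens of two feature maps.

    Assumption: queries, keys, and values are the raw tokens (identity
    projections). A trained module would use learned projections instead.
    Both maps must have the same channel count.
    """
    tokens = flatten_to_tokens(fmap_a) + flatten_to_tokens(fmap_b)
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    fused = []
    for query in tokens:
        # Similarity of this token to every token from both maps.
        scores = [scale * sum(q * k for q, k in zip(query, key))
                  for key in tokens]
        weights = softmax(scores)
        # Each output token is a weighted average of all tokens.
        fused.append([sum(w * value[d] for w, value in zip(weights, tokens))
                      for d in range(dim)])
    return fused


# Toy inputs: a 2-channel 2x2 map and a 2-channel 1x1 map (hypothetical values).
map_a = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [0.5, 0.5]]]
map_b = [[[0.2]], [[0.8]]]
fused = cross_map_self_attention(map_a, map_b)
```

Because the attention weights for each query sum to one, every fused token is a weighted average of tokens drawn from both maps, which is the sense in which information from one feature map can reach another in this sketch.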