Efficient and accurate smoke detection is urgently needed in fire monitoring, industrial safety, and urban surveillance. Detecting smoke across diverse environments under real-time constraints is difficult because of low-resolution imagery, limited computational resources, and environmental variability. This study introduces a smoke detection system built on the real-time detection Transformer (RT-DETR) architecture to improve the speed and precision of video analysis. The system integrates three modules to address these challenges: triplet attention, ADown, and a high-level screening-feature fusion pyramid network (HS-FPN). The triplet attention mechanism helps capture subtle smoke features that are often overlooked. The ADown module reduces computational complexity, enabling real-time operation on devices with limited resources. The HS-FPN improves robustness by fusing multi-scale features, which supports reliable detection across smoke types and sizes. On a diverse dataset, the system improved both average precision (AP50) and frames per second (FPS) relative to existing state-of-the-art networks. Ablation studies validated the contribution of each component to the balance between accuracy and operational efficiency. The RT-DETR-based smoke detection system meets the real-time requirements of these applications and establishes a new performance benchmark in this field.
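For readers unfamiliar with the AP50 metric named above: it is conventionally computed by counting a predicted box as correct when its intersection over union (IoU) with a ground-truth box reaches 0.5. The following is a minimal sketch of that matching criterion only, not the evaluation code used in this study. The box format `(x1, y1, x2, y2)` and the function names `iou` and `is_match_at_50` are assumptions made for illustration.

```python
from typing import Tuple

# Hypothetical box format for this sketch: (x1, y1, x2, y2) with x1 < x2, y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred: Box, gt: Box) -> bool:
    """True when the prediction overlaps the ground truth with IoU >= 0.5."""
    return iou(pred, gt) >= 0.5


if __name__ == "__main__":
    smoke_gt = (0.0, 0.0, 4.0, 4.0)
    print(is_match_at_50((0.0, 0.0, 4.0, 3.0), smoke_gt))  # large overlap
    print(is_match_at_50((3.0, 3.0, 7.0, 7.0), smoke_gt))  # small overlap
```

A full AP50 computation would additionally rank detections by confidence and integrate the resulting precision-recall curve; that part is omitted here.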