Recent advances in smart home technologies have underscored the need for early fire and smoke detection systems that enhance safety and security. Traditional fire detection methods that rely on thermal or smoke sensors are limited in response time and environmental adaptability. To address these limitations, this paper introduces the multi-scale information transformer–DETR (MITI-DETR) model, which combines multi-scale feature extraction with transformer-based attention mechanisms and is tailored specifically to fire detection in smart homes. On a custom dataset designed to reflect diverse lighting and spatial conditions in smart homes, MITI-DETR achieves a precision of 99.00%, a recall of 99.50%, and a mean average precision (mAP) of 99.00%. Extensive experiments demonstrate that MITI-DETR outperforms state-of-the-art models on precision, recall, and mAP, especially under challenging environmental conditions. This work provides a robust solution for early fire detection in smart homes, combining high accuracy with real-time deployment feasibility.