Fire disasters cause social, economic, and environmental damage worldwide, so early detection and rapid response are crucial for saving lives and property, and soft computing approaches make such early detection feasible. Researchers have therefore proposed various artificial intelligence-based approaches to detect fire at an early stage. However, several factors, including domain shift, limited performance, and model complexity, reduce their effectiveness in real-world environments. To address these challenges, this study proposes an extensive data augmentation approach that tackles the domain shift problem by employing monocular depth estimation and an atmospheric scattering model to synthesise hazy, foggy, and night-time images. To further enhance the detection of small fire-affected areas, a refined DenseNet backbone is employed; this network is fine-tuned to extract intricate spatial details across different scales and establish meaningful connections between them. We then introduce a soft attention mechanism into the backbone network, using a 3D convolutional operation within the attention block to capture more extensive spatial features and thereby improve the model's discriminative capability for fire detection. Moreover, we have created a novel dataset explicitly designed for effective fire detection from a remote sensing perspective, the aerial drone and satellite fire (ADSF) dataset, which comprises images captured by drones and satellites. This broader range of image sources covers a far wider expanse of geographical locations than CCTV cameras can. Our proposed model achieves significant accuracy improvements, surpassing the state-of-the-art (SOTA) on both original and augmented datasets. On the original Yar, FD, DFAN, and ADSF datasets, we observed average accuracy increases of 1.25%, 1.95%, 2%, and 1.5%, respectively; on the augmented versions of these datasets, our model consistently outperforms the SOTA, with accuracy improvements of 0.7%, 1.97%, 1.42%, and 1.53%, respectively. Additionally, in our ablation studies, the proposed model excels in both performance and resource efficiency, achieving frame rates of 22.0, 72.5, and 7.0 FPS on CPU, GPU, and Raspberry Pi, respectively, with just 7.2 million learnable parameters, making it well-suited for real-time deployment on edge devices. Finally, the Grad-CAM explainable AI (XAI) method is used to visualise and highlight the image regions the model relies on most. The code and dataset are publicly available at https://github.com/Hikmat-Yar/ISPRS-Fire-Detection.
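For reference, the haze and fog synthesis mentioned above is typically realised with the standard atmospheric scattering model, I(x) = J(x)·t(x) + A·(1 − t(x)) with transmission t(x) = exp(−β·d(x)). The minimal sketch below illustrates this idea using a clear image and a monocular depth map; the function name, parameter values, and normalisation choices are illustrative assumptions, not taken from the released code.

```python
import numpy as np

def synthesize_haze(image, depth, beta=1.0, atmospheric_light=0.9):
    """Render a hazy view of a clear image via atmospheric scattering.

    image: H x W x 3 float array in [0, 1], the clear scene radiance J(x)
    depth: H x W float array of relative depth d(x) (e.g. from a monocular
           depth estimator), normalised to [0, 1]
    beta:  scattering coefficient controlling haze density
    atmospheric_light: global atmospheric light A
    """
    transmission = np.exp(-beta * depth)          # t(x) = exp(-beta * d(x))
    transmission = transmission[..., np.newaxis]  # broadcast over RGB channels
    hazy = image * transmission + atmospheric_light * (1.0 - transmission)
    return np.clip(hazy, 0.0, 1.0)

# Example (hypothetical inputs): denser haze with a larger beta
# hazy_img = synthesize_haze(clear_img, depth_map, beta=1.5)
```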