Thermopile sensor arrays strike a good balance between person detection and localization on the one hand and privacy preservation on the other, owing to their low resolution. Privacy is especially important in smart building automation applications. Current research shows that two machine learning-based algorithms are particularly prominent for general object detection: You Only Look Once (YOLOv5) and Detection Transformer (DETR). In this paper, both algorithms are adapted to localize people in 32 × 32-pixel thermal array images. The loss in precision caused by the scarcity of labeled data was counteracted with a novel generative image generator (IIG), which creates synthetic thermal frames from the available labeled data. Multiple robustness tests were performed during evaluation to determine the overall usability of both algorithms as well as the benefit of the image generator. Both algorithms achieve a high mean average precision (mAP) exceeding 98%. They also prove robust against disturbances from warm air streams and solar radiation, replacement of the sensor with another of the same type, previously unseen persons, cold objects, movement along the image frame border, and people standing still. However, precision decreases for persons wearing thick layers of clothing, such as winter clothing, and in scenarios where the number of persons present exceeds the number the algorithm was trained on. In summary, both algorithms are suitable for detection and localization, although YOLOv5m has the advantage in real-time image processing, along with a smaller model size and slightly higher precision.