HOLT-Net: Detecting smokers via human–object interaction with lite transformer network

Hua-Bao Ling, Dong Huang, Jinrong Cui, Chang-Dong Wang

doi:10.1016/j.engappai.2023.106919

Abstract

Growing concerns about public health and safety have created a practical need to detect smoking behaviors (or smokers) in public places. Previous smoker detection methods often focus on cigarette detection and overlook the interaction between the smoker and the cigarette. In light of this, this paper presents a single-image smoker detection framework via human–object interaction with a lite transformer network (HOLT-Net). Specifically, a one-stage human–object interaction module is devised to identify the interaction between the smoker and the cigarette. To incorporate global information for better feature representation, a simple yet powerful lite transformer module built on multi-head self-attention blocks is leveraged. In addition, a post-refinement module is integrated, which uses an additional fine-grained cigarette detector to improve interaction detection accuracy. We also present a new benchmark dataset named SCAU-Smoker Detection (SCAU-SD), which, to the best of our knowledge, is the first benchmark dataset for the specific task of smoker detection in single images with human–object interaction annotations. Extensive experimental results demonstrate the effectiveness of our HOLT-Net framework. The code is publicly available at https://github.com/JackKoLing/HOLT-Net.

Full Text