Abstract

A large proportion of construction accidents are caused by unintentional and unsafe actions and behaviors. It is of significant difficulties and ineffectiveness to monitor unsafe behaviors using conventional manual supervision due to the complex and dynamic working conditions on construction sites. Recently, surveillance videos and computer vision (CV) techniques have been increasingly adopted to automatically identify risky behaviors. However, the challenge remains that spatial and temporal features in video clips cannot be effectively captured and fused by current CV models. To address this challenge, this paper describes a deep learning model named Spatial Temporal Relation Transformer (STR-Transformer), where spatial and temporal features of work behaviors are simultaneously extracted in paralleling video streams and then fused by a specially designed module. To verify the effectiveness of the STR-Transformer, a customized dataset is developed, including seven categories of construction worker behaviors and 1595 video clips. In numerical experiments and case studies, the STR-Transformer achieves an average precision of 88.7%, 4.0% higher than the baseline model. The STR-Transformer enables more accurate and reliable automatic safety surveillance on construction projects, and is expected to reduce accident rates and management costs. Moreover, the performance of STR-Transformer relies on efficient feature integration, which may inspire future studies to identify, extract, and fuse richer features when applying CV-based deep learning models in construction management.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.