Abstract

Using computer vision and deep learning (e.g., convolutional neural networks) to automatically recognise unsafe behaviour in digital images can help managers identify such actions, respond quickly, and mitigate adverse events. However, computer vision studies in construction have tended to focus solely on detecting unsafe behaviour (i.e., object detection) or on regions of interest with pre-defined labels. Moreover, such approaches have been unable to consider the rich semantic information shared among multiple unsafe actions in a digital image. The research presented in this paper uses a safety-rule query to determine and locate several unsafe behaviours in a digital image by employing a visual grounding approach. Our approach consists of: (1) visual and text feature extraction; (2) recursive sub-query refinement; and (3) bounding-box generation. We validate our approach by conducting an experiment to demonstrate its effectiveness. The experimental results yield an average precision, recall, and F1-score of 0.55, 0.85, and 0.65, respectively, suggesting our approach can accurately identify and locate different types of unsafe behaviour in digital images acquired from a construction site.
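The three stages listed above can be illustrated with a minimal toy sketch. This is not the authors' implementation: the attribute tags, the stop-word filter, and the narrowing rule are all assumptions made purely to show how a safety-rule query can be decomposed into sub-queries that progressively narrow the set of candidate bounding boxes.

```python
from dataclasses import dataclass
from typing import Dict, List, Set

@dataclass(frozen=True)
class Box:
    """An axis-aligned bounding box (x, y, width, height) in pixels."""
    x: int
    y: int
    w: int
    h: int

def extract_sub_queries(rule: str) -> List[str]:
    """Toy text encoder: break a safety-rule query into sub-query tokens,
    dropping common function words (a stand-in for a language model)."""
    stop = {"a", "the", "must", "be", "is", "are", "not", "wear"}
    return [t for t in rule.lower().replace(",", " ").split() if t not in stop]

def ground(rule: str, candidates: Dict[Box, Set[str]]) -> List[Box]:
    """Recursive sub-query grounding (illustrative): each round keeps only
    the candidate boxes whose attribute set matches the current sub-query
    token, mimicking the progressive narrowing the abstract describes.
    `candidates` maps each detected region to its attribute tags."""
    tokens = extract_sub_queries(rule)
    kept = list(candidates)
    for tok in tokens:
        narrowed = [b for b in kept if tok in candidates[b]]
        if narrowed:  # only narrow when the sub-query actually discriminates
            kept = narrowed
    return kept

# Hypothetical detections from one site image, tagged with toy attributes.
cands = {
    Box(10, 20, 50, 80): {"worker", "hardhat"},
    Box(120, 30, 40, 90): {"worker", "no-hardhat"},
    Box(200, 10, 60, 60): {"excavator"},
}
violations = ground("worker no-hardhat", cands)
```

In a real system, the token/tag match would be replaced by a learned similarity score between text and visual features, but the control flow (query decomposition followed by iterative region refinement) is the same.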
