Automatic classification of construction safety reports using semi-supervised YAKE-Guided LDA approach

Hrishikesh Gadekar, Nikhil Bugalia

doi:10.1016/j.aei.2023.101929

Abstract

Literature on supervised machine-learning (ML) approaches for classifying text-based safety reports in the construction sector has been growing. Recent studies have emphasized the need to build ML approaches that balance high classification accuracy with performance on management criteria, such as resource intensiveness. However, despite being highly accurate, the widely studied supervised ML approaches may not perform well on management criteria, as many factors contribute to their resource intensiveness. In contrast, the potential of semi-supervised ML approaches to achieve balanced performance has rarely been explored in the construction safety literature. The current study contributes to the scarce knowledge on semi-supervised ML approaches by demonstrating the applicability of a state-of-the-art semi-supervised learning approach, i.e., Yet Another Keyword Extractor (YAKE) integrated with Guided Latent Dirichlet Allocation (GLDA), for construction safety report classification. Construction-safety-specific knowledge is extracted as keywords through YAKE, relying on accessible literature with minimal manual intervention. The keywords from YAKE are then seeded into the GLDA model for the automatic classification of safety reports, without requiring a large prelabeled dataset. The YAKE-GLDA classification performance (F1 score of 0.66) is superior to that of existing unsupervised methods on the benchmark data containing injury narratives from the Occupational Safety and Health Administration (OSHA). The YAKE-GLDA approach is also applied to near-miss safety reports from a construction site. The study demonstrates a high degree of generality of the YAKE-GLDA approach through a moderately high F1 score of 0.86 for a few categories in the near-miss data.
The current research demonstrates that, unlike existing supervised approaches, the semi-supervised YAKE-GLDA approach can consistently achieve reasonably good classification performance across various construction-specific safety datasets while remaining resource-efficient. Results from an objective comparative and sensitivity analysis contribute much-needed, knowledge-contesting insights into the functioning and applicability of YAKE-GLDA. The results of the current study will help construction organizations implement and optimize an efficient ML-based knowledge-mining strategy for domains beyond safety and across sites where the availability of a prelabeled dataset is a significant limitation.
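
The idea of seeding a topic model with keywords can be illustrated with a small sketch. The code below is not the authors' implementation and uses neither the YAKE nor the GLDA software: it is a simplified collapsed Gibbs sampler for LDA in which seed words receive extra prior weight in their assigned topic. The function name `seeded_lda`, the `boost` parameter, the toy reports, and the hand-written seed sets (standing in for keywords that YAKE would extract) are all assumptions made for illustration.

```python
import random
from collections import Counter

def seeded_lda(docs, seeds, n_iter=100, alpha=0.1, beta=0.01, boost=5.0, rng_seed=0):
    """Simplified seeded LDA (collapsed Gibbs sampling).

    docs  : list of token lists (one per safety report)
    seeds : list of disjoint seed-word sets, one per topic (hypothetical
            stand-in for YAKE keywords)
    Returns the dominant topic index of each document.
    """
    rng = random.Random(rng_seed)
    vocab = sorted({w for doc in docs for w in doc})
    n_topics = len(seeds)
    # Asymmetric topic-word prior: a seed word gets `boost` extra
    # pseudo-counts in its own topic (an assumed seeding scheme).
    prior = [{w: beta + (boost if w in seeds[k] else 0.0) for w in vocab}
             for k in range(n_topics)]
    prior_sum = [sum(p.values()) for p in prior]

    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [Counter() for _ in range(n_topics)]
    topic_total = [0] * n_topics

    def update(d, w, z, delta):
        doc_topic[d][z] += delta
        topic_word[z][w] += delta
        topic_total[z] += delta

    # Initialisation: seed words start in their seed topic, others at random.
    assign = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            seeded = [k for k in range(n_topics) if w in seeds[k]]
            z = rng.choice(seeded) if seeded else rng.randrange(n_topics)
            zs.append(z)
            update(d, w, z, +1)
        assign.append(zs)

    # Gibbs sweeps: resample each token's topic given all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                update(d, w, assign[d][i], -1)
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + prior[k][w])
                    / (topic_total[k] + prior_sum[k])
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                update(d, w, z, +1)

    return [max(range(n_topics), key=lambda k: doc_topic[d][k])
            for d in range(len(docs))]

# Toy example: topic 0 = falls from height, topic 1 = electrical incidents.
seeds = [{"fell", "ladder", "roof", "scaffold"},
         {"shock", "wire", "electric", "voltage"}]
docs = [
    ["worker", "fell", "ladder", "roof", "scaffold"],
    ["electric", "shock", "wire", "voltage", "worker"],
    ["roof", "fell", "scaffold", "ladder", "site"],
    ["wire", "voltage", "electric", "shock", "site"],
]
labels = seeded_lda(docs, seeds)
print(labels)
```

In this sketch no labeled reports are needed: the only supervision is the seed-word list per category, which is the property the abstract attributes to the semi-supervised approach. Real reports would additionally need tokenization, stop-word removal, and far more documents than this toy corpus.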

Journal: Advanced Engineering Informatics
Publication Date: Mar 10, 2023
Citations: 17
