Abstract

According to the latest fatal work injury rates reported by the Bureau of Labor Statistics, construction sites remain the most hazardous workplaces. In the construction sector, fatality investigation summary reports are available for past accidents, and analyzing these reports can yield valuable insights. This study explores text mining algorithms for the automatic classification of construction accident causes. Specifically, a Word2Vec skip-gram model is used to learn word embeddings from a domain-specific corpus, and a hybrid structured deep neural network that incorporates the learned embeddings is proposed for accident report classification. A dataset from the Occupational Safety and Health Administration (OSHA) is used to evaluate the proposed approach. Five baseline models are used for comparison: support vector machine (SVM), linear regression (LR), K-nearest neighbor (KNN), decision tree (DT), and Naive Bayes (NB). Experimental results show that the proposed model achieves the highest average weighted F1 score among all models considered in this study. The results also demonstrate the effectiveness of the Word2Vec skip-gram algorithm for semantic information augmentation, which improves the robustness of the model when classifying cases with low support values.
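The weighted F1 score used as the headline metric can be illustrated with a minimal sketch. It assumes the standard definition: per-class F1 averaged with weights proportional to each class's support (its number of true instances). The function name `weighted_f1` and the toy labels are hypothetical and are not taken from the study's code or data.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores (assumed standard definition)."""
    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        # Each class contributes in proportion to its support.
        score += (n / total) * f1
    return score


# Hypothetical accident-cause labels, for illustration only.
y_true = ["falls", "falls", "falls", "struck", "struck", "electrocution"]
y_pred = ["falls", "falls", "struck", "struck", "struck", "falls"]
print(round(weighted_f1(y_true, y_pred), 4))
```

In this toy example the single "electrocution" case is misclassified, so that class has an F1 of zero, yet it lowers the weighted average only in proportion to its support of one out of six. This is why performance on low-support classes is worth examining separately from the aggregate score.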
