Abstract

Workplace safety is a major concern in many countries. Among various industries, the construction sector is identified as the most hazardous. Construction accidents not only cause human suffering but also result in huge financial losses. To prevent the recurrence of similar accidents and to make scientific risk control plans, analysis of accidents is essential. In the construction industry, fatality and catastrophe investigation summary reports are available for past accidents. In this study, text mining and natural language processing (NLP) techniques are applied to analyze construction accident reports. More specifically, five baseline models, support vector machine (SVM), linear regression (LR), K-nearest neighbor (KNN), decision tree (DT), and Naive Bayes (NB), together with an ensemble model, are proposed to classify the causes of the accidents. In addition, the Sequential Quadratic Programming (SQP) algorithm is used to optimize the weight of each classifier in the ensemble model. Experimental results show that the optimized ensemble model outperforms the other models considered in this study in terms of average weighted F1 score. The results also show that the proposed approach is more robust to cases with low support.
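
The weight optimization step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each base classifier outputs class probabilities, uses SciPy's SLSQP solver as a readily available SQP-type method, and minimizes a smooth log-loss surrogate instead of the weighted F1 score, which is not differentiable. The function names and the synthetic data are hypothetical and serve only as an illustration.

```python
import numpy as np
from scipy.optimize import minimize


def ensemble_proba(weights, probas):
    """Weighted average of per-classifier class probabilities.

    probas has shape (n_classifiers, n_samples, n_classes).
    """
    return np.tensordot(weights, probas, axes=1)


def neg_log_likelihood(weights, probas, y):
    """Smooth surrogate objective (an assumption, not the paper's objective)."""
    p = ensemble_proba(weights, probas)
    picked = p[np.arange(len(y)), y]  # probability given to the true class
    return -np.mean(np.log(np.clip(picked, 1e-12, None)))


def optimize_weights(probas, y):
    """Search for non-negative weights that sum to 1 using SLSQP."""
    k = probas.shape[0]
    start = np.full(k, 1.0 / k)  # begin from a plain average
    result = minimize(
        neg_log_likelihood,
        start,
        args=(probas, y),
        method="SLSQP",
        bounds=[(0.0, 1.0)] * k,
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
    )
    return result.x


def make_toy_probas(seed=0, n_samples=300, n_classes=4):
    """Synthetic stand-in for the outputs of three classifiers of varying quality."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, n_classes, size=n_samples)
    one_hot = np.eye(n_classes)[y]
    probas = []
    for noise in (0.5, 1.0, 3.0):  # larger noise means a weaker classifier
        scores = one_hot + noise * rng.random((n_samples, n_classes))
        probas.append(scores / scores.sum(axis=1, keepdims=True))
    return np.stack(probas), y


if __name__ == "__main__":
    toy_probas, toy_y = make_toy_probas()
    print("optimized weights:", optimize_weights(toy_probas, toy_y))
```

In practice the weights would be fitted on held-out predictions from the trained base classifiers, and the resulting ensemble would then be scored with the weighted F1 metric reported in the study.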
