Most risk identification practices in construction rely on the expertise, views, and judgments of subject matter experts. Although these conventional expert-based approaches are valuable, they are time-consuming and expensive. Moreover, public agencies' limited experience with major projects makes them susceptible to subjective judgment biases. To address these limitations, this study introduces a data-driven framework for risk identification that uses historical data and artificial intelligence techniques, particularly word embedding models. The model matches risk items across past projects by considering the semantic meaning of words, in order to identify high-frequency, high-consequence risks. Risk registers from more than 70 major U.S. transportation projects form the input dataset. In testing, the model achieved more than 66% recall and an F1-score of 0.59 for detecting risks in new projects. Knowledge acquired from previous projects equips project teams and public agencies with a risk identification model, so they do not have to start from scratch.