Purpose
Learning from past construction accident reports is critical to reducing the occurrence of accidents. Digital technology makes it feasible to extract risk factors from unstructured reports, but related studies are few, and existing extraction approaches cannot take textual context into account, so they tend to miss important factors. Further analysis, assessment and control of the extracted factors are also lacking. This paper explores an integrated model that combines the advantages of multiple digital technologies to address these problems.

Design/methodology/approach
A total of 1000 construction accident reports from Chinese government websites were used as the dataset. After text pre-processing, risk factors related to accident causes were extracted using KeyBERT, and the accident texts were encoded into structured data. Tree-augmented naive Bayes (TAN) was used to learn from the data and construct a visualized risk analysis network for construction accidents.

Findings
KeyBERT took textual context into account, which made the extracted risk factors more complete. The integrated TAN model further analysed construction risk factors from multiple perspectives, including the identification of key risk factors, the coupling analysis of risk factors and the troubleshooting of accident risk sources. The area under the curve (AUC) of the model reaches up to 0.938 under 10-fold cross-validation, indicating good performance.

Originality/value
This paper presents a new machine-assisted integrated model for accident report mining and risk factor analysis. The research findings can provide theoretical and practical support for accident safety management.
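
The encoding and TAN steps described under Design/methodology/approach can be illustrated with a minimal sketch. It is not the authors' implementation. The factor list, the toy reports and the accident-type class label are all hypothetical, and plain substring matching stands in for KeyBERT so that the example needs only the Python standard library. The structure-learning part follows the usual TAN recipe: weight each pair of factors by conditional mutual information given the class, then keep a maximum-weight spanning tree.

```python
from collections import Counter
from itertools import combinations
from math import log

# Hypothetical risk-factor vocabulary; in the paper this comes from KeyBERT.
FACTORS = ["no safety belt", "inadequate training", "poor supervision"]

# Toy (text, accident type) pairs, invented for illustration only.
REPORTS = [
    ("Worker fell from scaffold; no safety belt, poor supervision.", "fall"),
    ("Fall from height after inadequate training; no safety belt.", "fall"),
    ("Fall at roof edge; no safety belt was worn.", "fall"),
    ("Trench collapse linked to poor supervision.", "collapse"),
    ("Formwork collapse; inadequate training and poor supervision.", "collapse"),
    ("Wall collapse during demolition; inadequate training.", "collapse"),
]


def encode(text, factors=FACTORS):
    """Encode a report as a 0/1 tuple: 1 if the factor phrase occurs in the text."""
    lowered = text.lower()
    return tuple(int(f in lowered) for f in factors)


def conditional_mutual_information(rows, labels, i, j):
    """Estimate I(X_i; X_j | C) from counts, using the natural logarithm."""
    n = len(rows)
    n_xyc = Counter((r[i], r[j], c) for r, c in zip(rows, labels))
    n_xc = Counter((r[i], c) for r, c in zip(rows, labels))
    n_yc = Counter((r[j], c) for r, c in zip(rows, labels))
    n_c = Counter(labels)
    total = 0.0
    for (x, y, c), k in n_xyc.items():
        # p(x, y, c) * log( p(x, y | c) / (p(x | c) * p(y | c)) )
        total += (k / n) * log(k * n_c[c] / (n_xc[(x, c)] * n_yc[(y, c)]))
    return total


def tan_tree(rows, labels):
    """Maximum-weight spanning tree over the factors (Kruskal with union-find)."""
    m = len(rows[0])
    edges = sorted(
        ((conditional_mutual_information(rows, labels, i, j), i, j)
         for i, j in combinations(range(m), 2)),
        reverse=True,
    )
    parent = list(range(m))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    tree = []
    for weight, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # adding this edge does not create a cycle
            parent[root_i] = root_j
            tree.append((i, j, weight))
    return tree


ROWS = [encode(text) for text, _ in REPORTS]
LABELS = [label for _, label in REPORTS]
TREE = tan_tree(ROWS, LABELS)

for i, j, weight in TREE:
    print(f"{FACTORS[i]} -- {FACTORS[j]}: {weight:.3f}")
```

In a full TAN classifier the tree would then be directed from a chosen root, every factor would also receive the class node as a parent, and the conditional probability tables would be estimated from the encoded data. Those steps are omitted here for brevity.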