Accidents in chemical production usually result in fatal injuries, economic loss, and negative social impact. Chemical accident reports, which record information about past accidents, contain a large amount of expert knowledge. However, manually identifying the key factors that cause accidents requires reading and analyzing numerous accident reports, which is time-consuming and labor-intensive. In this paper, a semi-automatic method based on natural language processing (NLP) technology is developed to construct a knowledge graph of chemical accidents. First, we build a named entity recognition (NER) model using SoftLexicon (simplify the usage of lexicon) + BERT-Transformer-CRF (conditional random field) to automatically extract accident information and risk factors. The risk factors leading to accidents in chemical accident reports are divided into five categories: human, machine, material, management, and environment. By analyzing the extraction results for different chemical industries and different accident types, we give corresponding accident prevention suggestions. Second, based on the definition of classes and hierarchies of information in chemical accident reports, the seven-step method developed at Stanford University is used to construct an ontology-based chemical accident knowledge description model. Finally, the ontology knowledge description model is imported into the graph database Neo4j, and the knowledge graph is constructed to realize structured storage of chemical accident knowledge. In a case study on information extraction from 290 Chinese chemical accident reports, SoftLexicon + BERT-Transformer-CRF shows the best extraction performance among nine experimental models. These results demonstrate that the method developed in the current work can be a promising tool for obtaining the factors that cause accidents, which contributes to intelligent accident analysis and supports accident prevention.
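As a rough illustration of the last two stages, the sketch below turns sequence-labeling output into risk-factor entities and then into parameterized Cypher statements that could be sent to Neo4j. It is a minimal sketch under stated assumptions, not the authors' implementation: the BIO tag names, the node labels `Accident` and `RiskFactor`, the relationship `CAUSED_BY`, and the helper names `decode_bio` and `to_cypher` are all hypothetical and do not come from the paper. The NER model itself is not reproduced here; the tags are assumed to be its output.

```python
# Hypothetical label set mirroring the five risk-factor categories in the abstract.
CATEGORIES = {"HUMAN", "MACHINE", "MATERIAL", "MANAGEMENT", "ENVIRONMENT"}


def decode_bio(tokens, tags, sep=" "):
    """Collect (text, category) spans from BIO tags such as B-HUMAN / I-HUMAN / O.

    Use sep="" for character-level Chinese input.
    """
    spans, current, label = [], [], None
    for token, tag in zip(tokens, tags):
        prefix, _, name = tag.partition("-")
        if prefix == "B" and name in CATEGORIES:
            # A new entity starts; close any entity that is still open.
            if current:
                spans.append((sep.join(current), label))
            current, label = [token], name
        elif prefix == "I" and current and name == label:
            # Continuation of the currently open entity.
            current.append(token)
        else:
            # "O" tag or an inconsistent tag: close the open entity, if any.
            if current:
                spans.append((sep.join(current), label))
            current, label = [], None
    if current:
        spans.append((sep.join(current), label))
    return spans


def to_cypher(accident_id, spans):
    """Build (query, parameters) pairs linking one accident to its risk factors."""
    query = (
        "MERGE (a:Accident {id: $accident_id}) "
        "MERGE (f:RiskFactor {name: $name, category: $category}) "
        "MERGE (a)-[:CAUSED_BY]->(f)"
    )
    return [
        (query, {"accident_id": accident_id, "name": text, "category": category})
        for text, category in spans
    ]


# Toy English example; the tags stand in for the output of an NER model.
tokens = ["operator", "error", "caused", "valve", "leak"]
tags = ["B-HUMAN", "I-HUMAN", "O", "B-MACHINE", "I-MACHINE"]
spans = decode_bio(tokens, tags)
statements = to_cypher("accident-001", spans)
```

Using `MERGE` rather than `CREATE` means that the same risk factor extracted from several reports maps to a single node, which is what makes cross-report analysis by industry or accident type possible in the graph.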