Abstract

Accident investigation reports are text documents that systematically review and analyze the causes and course of accidents after they occur, and they are widely used in fields such as transportation, construction and aerospace. With the aid of these reports, the cause of an accident can be clearly identified, which provides an important basis for accident prevention and reliability assessment. However, because accident reports consist mostly of unstructured data such as text, analyzing accident causes relies heavily on expert experience, and statistical analyses require extensive manual classification. In recent years, advances in natural language processing have prompted many efforts to analyze and classify text automatically. Existing methods, however, either depend on large corpora and cumbersome data preprocessing, or extract text information with bidirectional encoder representation from transformers (BERT) at extremely high computational cost. These shortcomings mean that automatically analyzing accident investigation reports and extracting the information they contain remains a great challenge. To address these problems, this study proposes a text-mining-based accident causal classification method that combines a relational graph convolutional network (R-GCN) with pre-trained BERT. On the one hand, the proposed method avoids preprocessing steps such as stop word removal and word segmentation, which not only preserves as much of the information in accident investigation reports as possible but also avoids tedious operations. On the other hand, because an R-GCN processes the semantic features obtained from the BERT representation, the dependence of BERT retraining on computing resources is avoided.

Highlights

  • Accident investigation reports are usually text documents formed by professional investigators or teams through visits, conversations, viewing video surveillance and analyzing recorded data after accidents occur [1] and have been widely used in aviation, construction, transportation and other fields [2]

  • To address the aforementioned problems, this study proposes a text-mining-based accident causal classification method based on a relational graph convolutional network (R-GCN) and pre-trained bidirectional encoder representation from transformers (BERT)

  • The results of each cross-validation run were measured by the average F1-score to evaluate the performance of the whole model, and the model with the highest average F1-score was adopted as the final model, which combines the pre-trained BERT and an R-GCN
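
The selection step in the last highlight can be illustrated with a minimal sketch. It assumes that "average F1-score" means the unweighted (macro) mean of per-class F1, which the text does not state explicitly; the function names `macro_f1` and `best_fold` and the cause labels used below are hypothetical and serve only as illustration.

```python
from typing import List, Sequence, Tuple


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 (assumed meaning of 'average F1-score')."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def best_fold(
    folds: List[Tuple[Sequence[str], Sequence[str]]]
) -> Tuple[int, float]:
    """Return (index, score) of the fold whose model has the highest macro F1.

    Each fold is a (true labels, predicted labels) pair on its validation split.
    """
    scores = [macro_f1(y_true, y_pred) for y_true, y_pred in folds]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Hypothetical accident-cause labels for two validation folds
truth = ["human", "human", "equipment", "environment"]
fold_a = (truth, ["human", "equipment", "equipment", "environment"])
fold_b = (truth, ["human", "human", "equipment", "environment"])
index, score = best_fold([fold_a, fold_b])
```

Here `index` identifies the fold whose model would be kept as the final one under this assumed criterion; a weighted or micro average would be a different, equally plausible reading of the text.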


Summary

Introduction

Accident investigation reports are usually text documents that professional investigators or teams compile after an accident through site visits, interviews, review of video surveillance and analysis of recorded data [1], and they are widely used in aviation, construction, transportation and other fields [2]. Experts can use the accident process and consequences recorded in these reports to analyze the cause of an accident, which is of great significance for preventing recurrence and for forming accident response plans [3]. At present, the analysis of accident investigation reports relies mainly on experts manually determining the cause of the accident, which requires a great deal of work, and its accuracy is affected by the experts' subjective experience [4]. On 29 October 2018, an Indonesian Lion Air Boeing 737 MAX 8 carrying 189 passengers and crew crashed while flying from Jakarta’s Soekarno-Hatta International Airport to Pangkal Pinang, Bangka Belitung Province. While experts were still investigating the cause of that accident, on 10 March 2019 an Ethiopian Airlines Boeing 737 MAX 8 with 157 passengers and crew on board suffered the same kind of accident [6]. If the causes of accidents can be identified as early as possible, for example by making a preliminary determination of the cause from the accident records, appropriate measures can be taken in advance to prevent similar accidents from occurring [7].

