Abstract
Aviation Safety Reporting System (ASRS) data, which are accident investigation reports issued by the National Transportation Safety Board (NTSB), are often used to model aviation risk levels. However, ASRS data suffer from sample imbalance and from inaccurate encoding of textual information. To address these issues, this paper proposes a novel risk identification model. First, a pre-trained Sentence-BERT (SBERT, sentence bidirectional encoder representations from transformers) model is fine-tuned on the textual information in the ASRS data, improving the accuracy of the text encoding. Second, a generative adversarial network (GAN) generates samples for underrepresented categories in the ASRS data, addressing the sample imbalance problem. Finally, a Bayesian optimization algorithm automatically searches for the hyperparameters of the risk identification model, further improving its performance in risk level identification. Experimental results show that SBERT fine-tuning substantially improves the accuracy of text encoding, and that both data augmentation and automatic hyperparameter search contribute significantly to the performance of the risk level identification model.
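To make the sample imbalance step concrete, the following is a minimal sketch of how one might compute the number of synthetic samples a generator would need to produce for each underrepresented risk level. The abstract does not state the balancing target, so topping every class up to the majority-class count is an assumption here, and the function name and the label values are hypothetical and invented for illustration only.

```python
from collections import Counter


def synthetic_samples_needed(labels):
    """Return, per class, how many generated samples would raise it
    to the majority-class count (an assumed balancing target)."""
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}


# Hypothetical risk-level labels, invented for illustration (not ASRS data).
labels = ["low"] * 6 + ["medium"] * 3 + ["high"] * 1
needed = synthetic_samples_needed(labels)
print(needed)
```

In a pipeline like the one described, these per-class counts would determine how many samples to draw from the trained generator for each category before training the classifier.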