Abstract
Tibetan is a low-resource language, and Tibetan judicial texts are usually lengthy and contain numerous judicial entities, so traditional simple neural network approaches to judicial event extraction struggle to capture the deep semantics and features of these texts. To overcome this limitation, this research presents an event extraction model for the Tibetan judicial domain that combines deep word representations with hybrid neural networks. The model uses the Chinese minority pre-trained language model (CINO) to generate dynamic word vector representations, addressing the challenge of modeling the deep semantics inherent in Tibetan texts. During feature extraction, a bidirectional long short-term memory network (BiLSTM) extracts temporal and contextual dependencies, while a convolutional neural network (CNN) captures local semantic features; together they form a comprehensive global semantic representation. Finally, a conditional random field (CRF) decodes the sequences to produce the optimal predictions, achieving efficient extraction of Tibetan judicial events. The experimental results show that the model outperforms the baselines, achieving F1 scores of 70.47% for trigger detection and 62.99% for argument recognition, improvements of 16.6% and 16.42%, respectively. These results confirm the effectiveness and superiority of the proposed model.
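The CRF decoding step can be illustrated with a minimal Viterbi sketch in plain Python. This is not the authors' implementation: the three-tag set (O, B-TRIGGER, I-TRIGGER), the score values, and the function name `viterbi_decode` are hypothetical, chosen only for illustration. In the described model, the per-token emission scores would come from the BiLSTM and CNN features computed over the CINO representations, and the transition scores would be learned by the CRF layer.

```python
from typing import List, Sequence


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
) -> List[int]:
    """Return the highest-scoring tag sequence under a linear-chain CRF.

    emissions[t][j]   : score of tag j at token position t
    transitions[i][j] : score of moving from tag i to tag j
    """
    if not emissions:
        return []
    num_tags = len(transitions)
    # Best score of any path ending in each tag at the first position.
    score = list(emissions[0])
    backpointers = []
    for emit in emissions[1:]:
        new_score = []
        pointers = []
        for j in range(num_tags):
            # Best previous tag for arriving at tag j.
            best_prev = max(
                range(num_tags), key=lambda i: score[i] + transitions[i][j]
            )
            new_score.append(score[best_prev] + transitions[best_prev][j] + emit[j])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final tag.
    best_last = max(range(num_tags), key=lambda j: score[j])
    path = [best_last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical tag set: 0 = O, 1 = B-TRIGGER, 2 = I-TRIGGER.
# The large negative score forbids the invalid transition O -> I-TRIGGER.
TRANSITIONS = [
    [0.0, 0.0, -1e4],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
EMISSIONS = [
    [2.0, 1.5, 0.0],  # token 1: O scores slightly higher than B-TRIGGER
    [0.0, 0.0, 3.0],  # token 2: strongly I-TRIGGER
]
print(viterbi_decode(EMISSIONS, TRANSITIONS))
```

Choosing the best tag for each token independently would yield O followed by I-TRIGGER, which is not a valid label sequence. Because the CRF scores the whole sequence, the transition penalty steers decoding to B-TRIGGER followed by I-TRIGGER instead.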