Abstract

This paper focuses on understanding ancient Chinese Buddhist literature. Buddhist literature incorporates a plethora of dialects and slang, making it challenging to extract semantic meaning. To address this issue, a Generative Adversarial Network Masking Model (GAN-MM) is proposed to pre-train BERT models. The method optimizes the Masked Language Model (MLM) objective, exploiting the rich semantics of Buddhist terminology as well as the absence of function words in Buddhist literature. Furthermore, a semi-supervised learning algorithm is developed to train the GAN-MM. The model is evaluated on two private tasks related to Buddhist literature understanding, sentiment classification and text segmentation, as well as two public tasks pertaining to ancient Chinese understanding. Experimental results demonstrate that GAN-MM significantly improves BERT pre-training compared with conventional MLM methods. A large-scale Buddhist dataset, comprising 20,075 utterance documents and 146 million tokens, is publicly released at https://data.mendeley.com/datasets/5hzs8w46jh/1.
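The abstract does not specify GAN-MM's internals, so the following is only an illustrative sketch of the general idea it names: an adversarial generator choosing which tokens to mask, with the MLM as its opponent. Every name below (MaskGenerator, TinyMLM, the REINFORCE-style update) is an assumption for illustration, not the authors' method.

```python
# Purely illustrative sketch of adversarial mask selection for MLM
# pre-training. Not the paper's GAN-MM: architecture, losses, and the
# update rule here are all assumptions made for the example.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, MASK_ID = 1000, 1

class MaskGenerator(nn.Module):
    """Scores token positions; high-scoring positions get masked."""
    def __init__(self, dim=64):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, dim)
        self.score = nn.Linear(dim, 1)

    def forward(self, ids):
        return self.score(self.embed(ids)).squeeze(-1)  # (B, T)

class TinyMLM(nn.Module):
    """Stand-in for BERT: a one-layer encoder with an MLM head."""
    def __init__(self, dim=64):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, dim)
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=1)
        self.head = nn.Linear(dim, VOCAB)

    def forward(self, ids):
        return self.head(self.encoder(self.embed(ids)))  # (B, T, VOCAB)

def step(ids, gen, mlm, gen_opt, mlm_opt, ratio=0.15):
    B, T = ids.shape
    k = max(1, int(ratio * T))
    pos = gen(ids).topk(k, dim=1).indices          # positions to mask
    masked = ids.scatter(1, pos, MASK_ID)

    # MLM step: minimize cross-entropy at the masked positions only.
    pred = mlm(masked).gather(1, pos.unsqueeze(-1).expand(-1, -1, VOCAB))
    target = ids.gather(1, pos)
    mlm_loss = F.cross_entropy(pred.reshape(-1, VOCAB), target.reshape(-1))
    mlm_opt.zero_grad(); mlm_loss.backward(); mlm_opt.step()

    # Generator step: hard top-k selection is non-differentiable, so use
    # a REINFORCE-style update with the MLM loss as reward, pushing the
    # generator toward positions the MLM finds hard (the minimax game).
    log_prob = F.log_softmax(gen(ids), dim=1).gather(1, pos).sum(1)
    gen_loss = -(mlm_loss.detach() * log_prob).mean()
    gen_opt.zero_grad(); gen_loss.backward(); gen_opt.step()
    return mlm_loss.item()

gen, mlm = MaskGenerator(), TinyMLM()
g_opt = torch.optim.Adam(gen.parameters(), lr=1e-3)
m_opt = torch.optim.Adam(mlm.parameters(), lr=1e-3)
batch = torch.randint(2, VOCAB, (8, 32))           # toy token ids
print(step(batch, gen, mlm, g_opt, m_opt))
```

The score-function update is only one plausible way to train a discrete masking policy; a Gumbel-softmax relaxation or sampling-based masking would serve equally well in a sketch like this.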
