Abstract

The rapid development of generative artificial intelligence has created significant opportunities for digital humanities research. The intelligent processing of ancient texts, an essential part of digital humanities, is likewise undergoing a methodological transformation amid the wave of AI-generated content (AIGC). Integrating generative pre-trained models with Chinese ancient texts, a vital carrier of Chinese culture, enables deep mining of their content and supports services that make these texts more understandable and accessible to the general public. In this research, we propose a method that combines the most renowned Chinese anthology, the “Siku Quanshu,” with generative pre-trained models. Through continued pretraining of GPT-type language models, we developed SikuGPT, a generative model for ancient text processing tasks. We tested the model on two typical tasks: translation between classical and modern Chinese, and classification of ancient texts. The results show that our model has advantages in both understanding and generation scenarios for ancient texts. SikuGPT's capability in processing traditional Chinese texts helps promote the organization of ancient-text information and knowledge services, and advances the international dissemination of traditional Chinese culture.

