Abstract
Classical Chinese poetry has become increasingly popular in recent years, and modeling its topics is a promising area of research. Chinese poems are characteristically short, but traditional topic models perform poorly on short texts due to text sparsity. Topic models therefore need to be adapted to the scenario of classical Chinese poems. In this paper, a relational background knowledge boosting based topic model (RBKBTM) is proposed to overcome the text sparsity of Chinese poems. We incorporated background information into the model, which expanded the text content from a semantic perspective. The background knowledge was combined using word embeddings and TextRank and was then fed into the core computing process, and a new sampling formula was derived. Our proposed model was tested on three different tasks using three different datasets. The results demonstrate that the incorporated background knowledge effectively overcomes text sparsity, improving the performance and effectiveness of the topic model.
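As a rough illustration of the background-expansion step described above, the sketch below enriches a short poem with semantically related words chosen by a TextRank-style ranking over word-embedding similarities. This is a minimal sketch under stated assumptions, not the paper's implementation: the embedding file `poem_vectors.txt`, the `expand_poem` helper, and all parameter values are hypothetical.

```python
# Minimal sketch: expand a short poem with background words ranked by
# PageRank (TextRank) over a word-embedding similarity graph.
# Not the authors' implementation; names and parameters are assumptions.
import networkx as nx
from gensim.models import KeyedVectors

# Hypothetical pretrained word vectors for classical Chinese.
vectors = KeyedVectors.load_word2vec_format("poem_vectors.txt")

def expand_poem(words, top_k=5):
    """Append top_k background words to the poem's word list."""
    # Candidate background words: nearest embedding neighbours of poem words.
    candidates = set()
    for w in words:
        if w in vectors:
            candidates.update(n for n, _ in vectors.most_similar(w, topn=10))

    # Weighted graph: edges carry cosine similarity between poem words
    # and candidate background words.
    graph = nx.Graph()
    for w in words:
        for c in candidates:
            if w in vectors and c in vectors:
                graph.add_edge(w, c, weight=float(vectors.similarity(w, c)))

    # TextRank is PageRank on this weighted word graph.
    scores = nx.pagerank(graph, weight="weight")
    ranked = sorted(candidates, key=lambda c: scores.get(c, 0.0), reverse=True)
    return list(words) + ranked[:top_k]
```

The expanded word list could then be passed to the topic model's sampling step in place of the original short document, which is the general idea behind using background knowledge to mitigate sparsity.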