The Mixture of Textrank and Lexrank Techniques of Single Document Automatic Summarization Research in Tibetan

Ailin Li, Tao Jiang, Qingshuai Wang, Hongzhi Yu

doi:10.1109/ihmsc.2016.278

Abstract

We live in an era dominated by the knowledge economy and information. Automatic summarization is an important research topic in natural language processing; its purpose is to help people obtain valuable information from natural language texts. Because Tibetan information processing technology is less developed, no results on automatic summarization for Tibetan have been publicly reported. Drawing on existing Chinese and English automatic summarization techniques from China and abroad, this paper proposes a method for Tibetan automatic summarization. The method combines the strengths of TextRank-based keyword processing with LexRank-based processing of the relationships between sentences. It takes full account of the frequency, part of speech, position, and length of words, and of the content and position of sentences. In particular, the similarity between candidate sentences is considered when generating the summary. Experiments compare three summarization methods, based on TextRank, on LexRank, and on LexRank+TextRank respectively, and use the ROUGE value to evaluate summary quality. The results show that the mixture of TextRank and LexRank performs better for single-document automatic summarization in Tibetan, with accuracy reaching 80%.
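To make the combination described in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes sentences are already segmented into word lists (Tibetan word segmentation is out of scope here), scores words with a TextRank-style co-occurrence graph, scores sentences with a LexRank-style cosine-similarity graph, mixes the two scores, and skips candidates that are too similar to sentences already chosen. The names `summarize`, `alpha`, `max_sim`, and `window` are hypothetical, and the part-of-speech, word-position, word-length, and sentence-position features mentioned in the abstract are not modeled.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def pagerank(weights, damping=0.85, iters=50):
    """Power iteration over a symmetric weighted adjacency matrix."""
    n = len(weights)
    if n == 0:
        return []
    out = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iters):
        scores = [
            (1 - damping) / n
            + damping * sum(weights[j][i] / out[j] * scores[j]
                            for j in range(n) if out[j] > 0)
            for i in range(n)
        ]
    return scores


def keyword_scores(sentences, window=2):
    """TextRank-style word scores from a co-occurrence graph."""
    vocab = sorted({w for s in sentences for w in s})
    idx = {w: i for i, w in enumerate(vocab)}
    weights = [[0.0] * len(vocab) for _ in vocab]
    for s in sentences:
        for i, w in enumerate(s):
            for v in s[i + 1:i + 1 + window]:
                if v != w:
                    weights[idx[w]][idx[v]] += 1.0
                    weights[idx[v]][idx[w]] += 1.0
    return dict(zip(vocab, pagerank(weights)))


def normalize(xs):
    """Scale scores so the largest is 1 (all zeros stay zero)."""
    m = max(xs) if xs else 0.0
    return [x / m if m else 0.0 for x in xs]


def summarize(sentences, k=2, alpha=0.5, max_sim=0.7):
    """Return indices of up to k sentences, in document order.

    alpha (assumed) weights the LexRank score against the keyword score;
    max_sim (assumed) is the redundancy threshold between chosen sentences.
    """
    n = len(sentences)
    bags = [Counter(s) for s in sentences]
    sim = [[cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
           for i in range(n)]
    lex = normalize(pagerank(sim))  # LexRank-style sentence centrality
    kw = keyword_scores(sentences)  # TextRank-style word importance
    text = normalize([sum(kw[w] for w in s) / len(s) if s else 0.0
                      for s in sentences])
    combined = [alpha * l + (1 - alpha) * t for l, t in zip(lex, text)]
    chosen = []
    for i in sorted(range(n), key=lambda i: -combined[i]):
        # Skip a candidate that is too similar to one already selected.
        if all(sim[i][j] <= max_sim for j in chosen):
            chosen.append(i)
        if len(chosen) == k:
            break
    return sorted(chosen)
```

Called as `summarize(tokenized_sentences, k=3)`, the function returns sentence indices rather than text, so the caller can restore the original sentences in document order. A linear mix with a single `alpha` is only one possible way to combine the two rankings; the abstract does not state how the paper weights them.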

Full Text