Abstract
An essential part of a text generation task is to extract critical information from the text. People usually obtain this critical information via manual extraction; however, the gap between the human capacity to process information manually and the speed of information growth makes this impractical. This problem can be addressed by automatic keyphrase extraction. In this paper, we summarize the mainstream unsupervised keyphrase extraction methods, analyze in detail the reasons for the differences in their performance, and provide some solutions.
Highlights
As the information age continues to develop, text-based content grows exponentially, making it more challenging to manage this large-scale information. In the past, this information could be processed manually.
Keyphrase extraction is widely used in many fields, such as natural language processing (NLP), information retrieval (IR) [9,10,11,12], opinion mining [13,14,15], document indexing [16], and document classification [17].
SingleRank: The graphs constructed by TextRank are unweighted, yet edge weights can reflect the strength of the semantic relationship between two nodes, so using a weighted graph may work better in the keyphrase extraction task.
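The weighted-graph idea can be illustrated with a minimal sketch. It is not the implementation from the paper: the function names, the `window` parameter, the damping factor of 0.85, and the fixed iteration count are all assumptions chosen for illustration. Edge weights are co-occurrence counts within a sliding window, and a candidate phrase is scored by summing the scores of its words, which is one common choice.

```python
from collections import defaultdict


def build_weighted_graph(tokens, window=2):
    """Undirected co-occurrence graph (hypothetical helper).

    The weight of an edge is the number of times the two words appear
    within `window` consecutive tokens of each other.
    """
    weights = defaultdict(lambda: defaultdict(float))
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window, len(tokens))):
            other = tokens[j]
            if other == word:
                continue  # skip self-loops
            weights[word][other] += 1.0
            weights[other][word] += 1.0
    return weights


def weighted_pagerank(weights, damping=0.85, iterations=50):
    """Power iteration for PageRank on a weighted undirected graph.

    Each node passes its score to its neighbours in proportion to the
    edge weight, instead of splitting it evenly as on an unweighted graph.
    """
    nodes = list(weights)
    scores = {n: 1.0 / len(nodes) for n in nodes}
    out_sum = {n: sum(weights[n].values()) for n in nodes}
    for _ in range(iterations):
        updated = {}
        for n in nodes:
            incoming = sum(
                weights[m][n] / out_sum[m] * scores[m] for m in weights[n]
            )
            updated[n] = (1 - damping) / len(nodes) + damping * incoming
        scores = updated
    return scores


def phrase_score(phrase, scores):
    """Score a candidate phrase as the sum of its word scores (assumed rule)."""
    return sum(scores.get(word, 0.0) for word in phrase.split())
```

For the toy token sequence `["a", "b", "a", "c", "a", "d"]` with `window=2`, the edges a-b and a-c each receive weight 2 while a-d receives weight 1, so `d` ends up ranked below `b` and `c`. On an unweighted graph the three neighbours of `a` would be indistinguishable.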
Summary
As the information age continues to develop, text-based content grows exponentially, making it more challenging to manage this large-scale information. In the past, this information could be processed manually. The supervised method [18] transforms the keyphrase extraction task into a classification problem [19,20] or regression problem [21]. It trains a model on a labeled training set and uses the trained model to determine whether a candidate word in a text is a keyphrase. We divide keyphrase extraction into the linguistic school and the statistical school. We carry this classification through to commonly used metrics, features that affect keyphrase extraction, and mainstream unsupervised keyphrase extraction methods, which makes the structure and development path of the entire field clear.