Abstract

A keyphrase of a document is any expression from which we can learn about its content without having to read it; automatic keyphrase extraction (AKE) is the task of identifying such expressions. Keyphrases are exploited in many natural language processing (NLP) applications. They are usually mentioned in the document, but some keyphrases may be absent from it. In the field of AKE, researchers have exploited many techniques, such as statistical calculation, deep learning algorithms, graph representation, and sentence embedding techniques. Approaches based on embedding techniques calculate the similarity between a document and each candidate keyphrase, and the candidates most similar to the document are selected as keyphrases. Representing the whole document by a single vector degrades the performance of these approaches, especially on long documents. In addition, these methods cannot generate absent keyphrases. To overcome these problems, this paper proposes an unsupervised approach to AKE that uses the universal sentence encoder (USE) to represent candidate keyphrases and the parts of the document that are likely to contain keyphrases. Our method also generates keyphrases not mentioned in the text. We compared the proposed approach with other methods based on embedding techniques; the results showed that our approach performs better, especially on long documents.
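
The idea of scoring candidates against parts of the document rather than against one document vector can be illustrated with a minimal sketch. It is not the method of the paper: `toy_embed` is a hypothetical bag-of-words stand-in for the universal sentence encoder, the split into parts is assumed to be given, and taking the best similarity over the parts is an assumed scoring rule. All function names are illustrative.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Stand-in encoder: a bag-of-words count vector (NOT the universal sentence encoder)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors; 0.0 if either is empty."""
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    dot = sum(v * b[t] for t, v in a.items())
    return dot / (norm_a * norm_b)


def rank_candidates(parts, candidates, embed=toy_embed):
    """Score each candidate by its best similarity to any document part (assumed rule)."""
    part_vecs = [embed(p) for p in parts]
    scored = [(c, max(cosine(embed(c), pv) for pv in part_vecs)) for c in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy document, already split into two parts.
parts = ["graph ranking of phrases", "sentence embedding of phrases"]
candidates = ["weather forecast", "sentence embedding"]

by_parts = rank_candidates(parts, candidates)                  # one vector per part
by_whole_doc = rank_candidates([" ".join(parts)], candidates)  # single document vector
print(by_parts)
print(by_whole_doc)
```

With this toy encoder, "sentence embedding" scores higher against the part that contains it than against the single vector for the whole document, where unrelated words dilute the similarity. A real sentence encoder can be passed through the `embed` parameter, provided `cosine` is adapted to dense vectors.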

