Abstract

Automatic keyphrase generation has attracted growing attention because it supports a wide variety of downstream AI applications, such as information retrieval, text summarization, and opinion mining. Although applying a sequence-to-sequence architecture with attention and copy mechanisms (CopyNet) to this task has shown promising results, it still suffers from two shortcomings: (i) it encodes a keyphrase (which usually consists of several words) only at the word level, which cannot adequately capture the overall meaning of the keyphrase; (ii) it lacks a suitable way to model the correlation among different keyphrases, which is very helpful for generating richer and more comprehensive candidate phrases. To overcome these shortcomings, we propose a novel keyphrase generation model named Hierarchical CopyNet with graph attention networks (HCopy-GAT). First, a Hierarchical Recurrent Encoder-Decoder neural network (HRED) is employed to learn expressive embeddings of keyphrases at both the word level and the phrase level. Second, a graph attention network (GAT) is applied to model the correlation among different keyphrases. Furthermore, we develop a new dataset named SOFTWARE, which can serve as a new testbed for keyphrase generation tasks. In experiments on several real datasets (including our newly built dataset), the proposed HCopy-GAT model outperforms state-of-the-art keyphrase generation models.
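
To make the idea of modeling correlation among keyphrases concrete, the following is a minimal sketch of a generic single-head graph attention layer applied to phrase-level embeddings. It is not the HCopy-GAT implementation: the function name `gat_layer`, the toy embeddings `H`, the adjacency matrix `adj`, and the parameters `W` and `a` are all hypothetical, and the sketch assumes every phrase node has a self-loop so that each node has at least one neighbor.

```python
import math


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def gat_layer(H, adj, W, a):
    """Generic single-head graph attention over phrase embeddings.

    H   : list of n phrase embeddings (each a list of floats)
    adj : n x n 0/1 adjacency matrix; assumed to contain self-loops
    W   : shared linear projection (d_out x d_in), hypothetical values
    a   : attention vector of length 2 * d_out, hypothetical values
    Returns (updated embeddings, attention weights).
    """
    Z = [matvec(W, h) for h in H]  # project every phrase embedding
    n = len(H)
    out, attn = [], []
    for i in range(n):
        nbrs = [j for j in range(n) if adj[i][j]]
        # Unnormalized score for each neighbor j: LeakyReLU(a . [z_i || z_j])
        scores = [
            leaky_relu(sum(p * q for p, q in zip(a, Z[i] + Z[j])))
            for j in nbrs
        ]
        # Numerically stable softmax restricted to the neighborhood of i
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        alpha = [0.0] * n
        for j, e in zip(nbrs, exps):
            alpha[j] = e / total
        attn.append(alpha)
        # New embedding of phrase i: attention-weighted sum of its neighbors
        out.append(
            [sum(alpha[j] * Z[j][k] for j in nbrs) for k in range(len(Z[i]))]
        )
    return out, attn


# Toy example: three phrase embeddings, where phrase 0 and phrase 2 are not linked.
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adj = [[1, 1, 0], [1, 1, 1], [0, 1, 1]]
W = [[1.0, 0.0], [0.0, 1.0]]
a = [0.5, -0.5, 0.25, 0.25]
out, attn = gat_layer(H, adj, W, a)
```

Each row of `attn` is a probability distribution over the neighbors of one phrase, so an updated embedding mixes in information only from phrases it is connected to. How the phrase graph is built and how this layer feeds the decoder are design choices of the proposed model that this sketch does not attempt to reproduce.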
