Abstract

Generating absent keyphrases, which do not appear in the input document, is challenging in the keyphrase prediction task. Most previous works treat the problem as an autoregressive sequence-to-sequence generation task, which demonstrates promising results for generating grammatically correct and fluent absent keyphrases. However, such an end-to-end process with a complete data-driven manner is unconstrained, which is prone to generate keyphrases inconsistent with the input document. In addition, the existing autoregressive decoding method makes the generation of keyphrases must be done from left to right, leading to slow speed during inference. In this paper, we propose a constrained absent keyphrase generation method in a prompt-based learning fashion. Specifically, the prompt will be created firstly based on the keywords, which are defined as the overlapping words between absent keyphrase and document. Then, a mask-predict decoder is used to complete the absent keyphrase on the constraint of prompt. Experiments on keyphrase generation benchmarks have demonstrated the effectiveness of our approach. In addition, we evaluate the performance of constrained absent keyphrases generation from an information retrieval perspective. The result shows that our approach can generate more consistent keyphrases, which can improve document retrieval performance. What’s more, with a non-autoregressive decoding manner, our model can speed up the absent keyphrase generation by 8.67× compared with the autoregressive method.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call