Keyphrase prediction is a crucial task that provides underlying support for numerous downstream Natural Language Processing (NLP) tasks, e.g., information retrieval and document summarization. Existing keyphrase prediction approaches are mostly either extractive or generative. Extractive methods directly extract keyphrases that appear in the document but cannot obtain absent keyphrases. Generative methods are designed to generate both present and absent keyphrases; however, the absent keyphrases are generated at the cost of degrading present keyphrase prediction, since the generation of present keyphrases relies mainly on the copying mechanism and ignores the interdependence among the overall extraction decisions. In contrast, an extractive model that directly extracts text spans from the document is better suited to predicting present keyphrases. It is therefore necessary to coordinate the extractive and generative paradigms to obtain accurate and comprehensive keyphrases. Specifically, we divide keyphrase prediction into two subtasks, i.e., present keyphrase extraction (PKE) and absent keyphrase generation (AKG), and propose a joint inference framework to fully exploit their respective advantages. For PKE, we treat the task as a sequence labeling problem and apply a BERT-based sentence selector to select salient sentences that contain present keyphrases. For AKG, we introduce a Transformer-based architecture equipped with a gated fusion attention module, which fully integrates the present keyphrase knowledge learned in PKE through the fine-tuned BERT. Experimental results demonstrate that our approach achieves state-of-the-art performance on all benchmark datasets.
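The abstract leaves the gated fusion attention module unspecified. As a rough illustration only, the PyTorch sketch below shows one plausible reading: the decoder state attends separately to the document encodings and to the BERT-derived features carried over from the PKE model, and a learned sigmoid gate blends the two context vectors. The class name GatedFusionAttention, the two-stream layout, and all tensor shapes are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class GatedFusionAttention(nn.Module):
    """Hypothetical sketch of a gated fusion attention block (assumed design,
    not the paper's code): decoder queries attend to document encodings and
    to PKE-side BERT features, and a gate fuses the two contexts."""

    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.doc_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.pke_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Gate computed from the concatenated context vectors (assumption).
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, dec_state, doc_enc, pke_enc):
        # dec_state: (B, T_dec, d); doc_enc: (B, T_doc, d); pke_enc: (B, T_pke, d)
        ctx_doc, _ = self.doc_attn(dec_state, doc_enc, doc_enc)
        ctx_pke, _ = self.pke_attn(dec_state, pke_enc, pke_enc)
        g = torch.sigmoid(self.gate(torch.cat([ctx_doc, ctx_pke], dim=-1)))
        return g * ctx_doc + (1.0 - g) * ctx_pke  # element-wise gated fusion


# Toy usage with random tensors standing in for real encoder/decoder outputs.
fusion = GatedFusionAttention(d_model=512)
dec = torch.randn(2, 10, 512)   # Transformer decoder states
doc = torch.randn(2, 120, 512)  # document encoder outputs
pke = torch.randn(2, 120, 512)  # fine-tuned BERT features from the PKE model
out = fusion(dec, doc, pke)
print(out.shape)  # torch.Size([2, 10, 512])
```

A gate of this form would let the decoder fall back on plain document attention whenever the PKE-side features are uninformative, which is consistent with the abstract's stated goal of integrating present keyphrase knowledge into AKG without disrupting generation.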