Abstract
Owing to the exponentially increasing number of academic articles, discovering and citing comprehensive and appropriate resources has become a non-trivial task. Conventional citation recommendation methods suffer from severe information loss. For example, they do not consider the section of the paper that the user is writing and for which they need to find a citation, the relatedness between the words in the local context (the text span that describes a citation), or the importance of each word in the local context. These shortcomings make such methods insufficient for recommending adequate citations for academic manuscripts. In this study, we propose a novel embedding-based neural network called the “dual attention model for citation recommendation (DACR)” to recommend citations during manuscript preparation. Our method embeds three types of semantic information: words in the local context, structural contexts, and the section on which the user is working. A neural network model is designed to maximize the similarity between the embeddings of the three inputs (local context words, section, and structural contexts) and the target citation appearing in the context. The core of the neural network model is composed of self-attention and additive attention; the former captures the relatedness between the contextual words and structural contexts, and the latter learns their importance. Experiments on real-world datasets demonstrate the effectiveness of the proposed approach.
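The following is a minimal sketch of the dual attention scoring idea described above, written in PyTorch; it is not the authors' implementation, and the class name, layer choices, and dimensions are illustrative assumptions.

```python
# Minimal sketch (not the authors' code): self-attention captures relatedness
# between local-context words and structural contexts, additive attention
# weighs their importance, and the fused vector is scored against candidate
# citation embeddings. All names and sizes are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualAttentionScorer(nn.Module):
    def __init__(self, vocab_size, num_papers, num_sections, dim=128, heads=4):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, dim)        # local-context words
        self.paper_emb = nn.Embedding(num_papers, dim)        # structural contexts (cited papers)
        self.section_emb = nn.Embedding(num_sections, dim)    # section being written
        self.out_emb = nn.Embedding(num_papers, dim)          # target-citation ("OUT") vectors
        # self-attention: relatedness between context words and structural contexts
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        # additive attention: importance of each word / structural context
        self.add_w = nn.Linear(dim, dim)
        self.add_v = nn.Linear(dim, 1)

    def forward(self, word_ids, paper_ids, section_id):
        # concatenate word and structural-context embeddings: (batch, n, dim)
        x = torch.cat([self.word_emb(word_ids), self.paper_emb(paper_ids)], dim=1)
        x, _ = self.self_attn(x, x, x)                         # capture relatedness
        weights = F.softmax(self.add_v(torch.tanh(self.add_w(x))), dim=1)
        ctx = (weights * x).sum(dim=1)                         # importance-weighted context vector
        ctx = ctx + self.section_emb(section_id)               # section as an additional feature
        # similarity of the fused context vector to every candidate citation
        return ctx @ self.out_emb.weight.t()                   # (batch, num_papers) logits

# Training would maximize similarity to the true citation, e.g. via
# F.cross_entropy(model(words, papers, section), target_paper_id).
```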
Highlights
When writing an academic paper, one of the most frequently considered questions is: “Which paper should I cite at this place?” Given the massive number of papers being published, it is impossible for a researcher to read every article that might be relevant to their study.
We propose a novel embedding-based neural network called the dual attention model for citation recommendation (DACR), designed to capture the relatedness and importance of words in the context that needs citations and of structural contexts in the manuscript, as well as the section on which the user is working.
The target citations were removed from the context, and recommendations were made by ranking the IN document vectors by cosine similarity against the vector inferred from the learned model, which takes the context words and structural contexts as input (a sketch of this ranking step is given below).
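A minimal sketch of the ranking step, assuming the trained model exposes a matrix of IN document vectors and some routine that infers a query vector from the context words and structural contexts; `doc_in`, `query`, and `recommend` are illustrative names, not identifiers from the paper.

```python
# Rank candidate papers by cosine similarity to the inferred context vector.
import torch
import torch.nn.functional as F

def recommend(doc_in: torch.Tensor, query: torch.Tensor, k: int = 10) -> torch.Tensor:
    """doc_in: (num_papers, dim) IN document vectors; query: (dim,) inferred vector."""
    sims = F.cosine_similarity(doc_in, query.unsqueeze(0), dim=1)  # (num_papers,)
    return torch.topk(sims, k).indices                             # top-k citation ids
```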
Summary
When writing an academic paper, one of the most frequently considered questions is: “Which paper should I cite at this place?” Given the massive number of papers being published, it is impossible for a researcher to read every article that might be relevant to their study. Studies in (McNee et al., 2002; Gori and Pucci, 2006; Caragea et al., 2013; Kucuktunc et al., 2013; Jia and Saule, 2018) considered recommendations based on a collection of seed papers, and (Alzoghbi et al., 2015; Li et al., 2018) proposed methods using metadata such as authorship information, titles, abstracts, keyword lists, and publication years. When applying such methods to real-world paper-writing tasks, the local context of a citation within a draft is not taken into account, which can lead to suboptimal results. Adequate citation recommendations for a manuscript should capture the relatedness and importance of words and cited articles in the context that needs citations, as well as the purpose of the section on which the writer is currently working. The proposed model embeds sections into an embedding space and utilizes the embedded sections as additional features for recommendation tasks.
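To make the last point concrete, a small sketch of how sections might be used as additional features, assuming a fixed section taxonomy; the section labels and the 128-dimensional size are illustrative assumptions, not details from the paper.

```python
# Map section names to ids and look up a learnable section embedding that is
# fused with the context representation. Labels below are hypothetical.
import torch
import torch.nn as nn

SECTIONS = ["introduction", "related work", "method", "experiment", "conclusion"]
section_to_id = {name: i for i, name in enumerate(SECTIONS)}

section_emb = nn.Embedding(len(SECTIONS), 128)  # one vector per section type

def section_feature(section_name: str) -> torch.Tensor:
    """Return the section embedding used as an additional recommendation feature."""
    idx = torch.tensor([section_to_id[section_name]])
    return section_emb(idx)  # shape (1, 128)
```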