Abstract

The digital repository of research articles is growing rapidly, and finding the right paper is therefore becoming a tedious task for researchers. A research paper recommendation system is advocated to help researchers in this context. In designing such a system, proper representation of articles is essential; more specifically, feature identification and extraction are the two key tasks. Existing approaches mainly consider direct features, which are readily available from research articles. However, certain features are not readily available from a paper but may greatly influence the performance of recommendation systems. This paper proposes four indirect features to represent a research article: keyword diversification, text complexity, citation analysis over time, and scientific quality measurement. Keyword diversification measures the uniqueness of a paper’s keywords, which helps introduce variation in recommendations. Text complexity measurement helps match a paper to the user’s level of understanding. Citation analysis over time determines the relevancy of a paper. Scientific quality measurement assesses the scientific value of a paper. This paper discusses formal definitions of the proposed indirect features, schemes to extract the feature values from a given research article, and metrics to measure them quantitatively. To substantiate the efficacy of the proposed features, a number of experiments have been carried out. The experimental results reveal that the proposed indirect features define a research article more uniquely than the direct features do. Given a research paper, extraction of the feature vector is computationally fast, and it is thus feasible to filter a large corpus of papers in real time. More significantly, indirect features can be matched with the features of a user’s profile, thus satisfying an important criterion in collaborative filtering.

