Abstract

This paper proposes a method for improving the retrieval performance of the vector space model (VSM), in part by preserving user-supplied relevance information in the system over the long term. The method incorporates user relevance feedback and the original document similarity information into a retrieval model built from a sequence of linear transformations. High-dimensional, sparse vectors are mapped into a low-dimensional vector space, namely the space representing the latent semantic meanings of words, using simple principal component analysis (SPCA). An experimental information retrieval system based on the proposed method has been built, and experiments have been carried out on the Medline and Cranfield collections. Compared with the latent semantic indexing (LSI) model, average precision improves by 6.80% (Medline) and 67.46% (Cranfield) on the two training data sets, and by 4.71% (Medline) and 8.12% (Cranfield) on the test data. These results show that the proposed method achieves better retrieval performance and makes it possible to preserve user-supplied relevance information in the system for later use.
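The dimensionality-reduction and ranking steps described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions: the abstract does not specify the SPCA algorithm, so plain power iteration with deflation stands in for it here; the relevance-feedback transformations are not modeled at all; and the toy data and every function name (`principal_directions`, `project`, `rank_documents`) are hypothetical, not taken from the paper.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)

def principal_directions(rows, k, iters=200):
    """Approximate the top-k principal directions of already-centered rows.

    Assumption: power iteration with deflation is used as a stand-in for
    SPCA, whose details the abstract does not give.
    """
    dim = len(rows[0])
    directions = []
    for _component in range(k):
        # Deterministic, non-degenerate starting vector.
        v = normalize([float(j + 1) for j in range(dim)])
        for _step in range(iters):
            # w = X^T (X v), computed without forming the covariance matrix.
            scores = [dot(r, v) for r in rows]
            w = [sum(s * r[j] for s, r in zip(scores, rows)) for j in range(dim)]
            # Deflation: remove the directions that were already found.
            for d in directions:
                p = dot(w, d)
                w = [a - p * b for a, b in zip(w, d)]
            v = normalize(w)
        directions.append(v)
    return directions

def project(vec, mean, directions):
    """Map a term-space vector into the low-dimensional space."""
    centered = [x - m for x, m in zip(vec, mean)]
    return [dot(centered, d) for d in directions]

def cosine(u, v):
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    return dot(u, v) / (nu * nv) if nu > 0 and nv > 0 else 0.0

def rank_documents(docs, query, k):
    """Return document indices sorted by cosine similarity in the reduced space."""
    mean = [sum(col) / len(docs) for col in zip(*docs)]
    centered = [[x - m for x, m in zip(d, mean)] for d in docs]
    directions = principal_directions(centered, k)
    reduced_docs = [project(d, mean, directions) for d in docs]
    reduced_query = project(query, mean, directions)
    sims = [cosine(reduced_query, d) for d in reduced_docs]
    return sorted(range(len(docs)), key=lambda i: -sims[i])

# Hypothetical toy data: each row is a document, and the columns are counts
# of the terms heart, blood, cardiac, wing, airflow.
DOCS = [
    [2, 1, 1, 0, 0],
    [1, 2, 0, 0, 0],
    [0, 0, 0, 2, 1],
    [0, 0, 0, 1, 2],
]
QUERY = [1, 1, 0, 0, 0]  # a "heart blood" query

ranking = rank_documents(DOCS, QUERY, k=2)
print(ranking)
```

On this toy data the two documents that share the query's vocabulary rank ahead of the other two. A real system would work with sparse vectors over thousands of terms and would add the feedback-derived transformations the abstract mentions.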
