Abstract

The objectives of this paper are to describe the effect of using weighted index terms in a document retrieval system, and to evaluate retrieval performance when queries are expanded by terms occurring in clusters with the query terms. Three data collections, each indexed by several methods, two of which were studied and reported on in previous work, are used to develop explicit results. The study both expands upon and extends previous work at the University of Maryland.

The effect of weighting index terms in the document collection, the queries and the formation of clusters is analyzed. Eight cases are investigated in which index terms are weighted and unweighted. The best results are obtained when weighted index terms are used in forming clusters, in queries, and in documents. In this case, the results on the new collection demonstrate a significant improvement in retrieval performance relative to the performance with the unmodified data base, when clustered terms are added to queries. The improvement is in contrast to the results in the previous study, where a degradation in performance, or at best an insignificant improvement, was obtained.

Comparisons are made to related work by Sparck‐Jones and her colleagues. This study tends to support the conclusion of Sparck‐Jones that weighted index terms provide better retrieval performance than unweighted terms.

The cluster addition of index terms to queries yields unpredictable results. Some collections show an improvement in retrieval performance, others a degradation or no change in performance. Sparck‐Jones obtained an improvement in retrieval performance for her document collection. We conclude that the results are highly dependent upon the document collection, and the technique should be employed with caution.
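As a rough illustration of what adding clustered terms to a weighted query could look like, the following Python sketch expands a query with terms that share a cluster with a query term and ranks documents by a weighted inner product. The abstract does not state the weighting scheme, the clustering method, or the matching function used in the study, so everything here is an assumption: the function names, the `added_weight` factor, and the toy data are hypothetical and are not taken from the paper.

```python
def expand_query(query, clusters, added_weight=0.5):
    """Add every term that shares a cluster with a query term.

    query: dict mapping term -> weight
    clusters: list of sets of terms (assumed to be built beforehand)
    added_weight: hypothetical down-weighting factor for added terms
    """
    query_terms = set(query)
    expanded = dict(query)
    for cluster in clusters:
        shared = cluster & query_terms
        if not shared:
            continue  # this cluster contains no query term
        # Assumption: an added term gets a scaled-down copy of the
        # largest query-term weight found in its cluster.
        base = max(query[t] for t in shared)
        for term in cluster - query_terms:
            expanded[term] = max(expanded.get(term, 0.0), added_weight * base)
    return expanded


def score(query, document):
    """Weighted inner product of query and document term vectors."""
    return sum(w * document.get(t, 0.0) for t, w in query.items())


def rank(query, documents):
    """Return document ids ordered from highest to lowest score."""
    return sorted(documents, key=lambda d: score(query, documents[d]), reverse=True)


# Toy data, invented for illustration only.
query = {"retrieval": 1.0}
clusters = [{"retrieval", "search"}, {"index", "term"}]
documents = {
    "d1": {"search": 2.0},
    "d2": {"retrieval": 0.5},
    "d3": {"index": 3.0},
}

expanded = expand_query(query, clusters)
original_ranking = rank(query, documents)
expanded_ranking = rank(expanded, documents)
```

In this toy case the expansion moves `d1` ahead of `d2`, because `d1` contains only the clustered term. Whether such a change helps or hurts depends on the collection, which is consistent with the caution expressed in the abstract.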
