Abstract

Ranking, a key function of Web search engines, is a user-centric process. However, click-through data, the source of implicit user feedback, are missing from almost all datasets published for the ranking task. This limitation also applies to the majority of benchmark datasets prepared for learning to rank, a new and promising trend in the information retrieval literature. In this paper, inspired by the concept of click-through data, the notion of click-through features is introduced. Click-through features can be derived from a given primitive dataset even when the benchmark dataset in use contains no click-through data. These features fall into three categories, relating to users' queries, search results, or users' clicks. Using click-through features, a novel learning to rank algorithm is proposed. In its first step, the proposed algorithm takes into account informativeness measures such as MAP, NDCG, InformationGain and OneR to generate a classifier for each category of click-through features. These classifiers are then fused using exponential ordered weighted averaging operators. Experimental results from extensive investigations on the WCL2R and LETOR4.0 benchmark datasets demonstrate that, in the presence of explicit click-through data, the proposed method can substantially outperform well-known ranking methods on the MAP and NDCG criteria. The improvement is most noticeable at the top of the ranked lists, which usually attracts more user attention than the rest of the list. On the WCL2R dataset, the improvement over SVMRank, a well-known learning to rank algorithm, is about 20.25% for P@1 and 5.68% for P@3.
The proposed method, CF-Rank, can also achieve performance higher than or comparable to that of baseline methods even when the primitive datasets in use contain no explicit click-through data. On the LETOR4.0 dataset, for instance, it achieves an improvement of about 2.7% in MAP over the AdaRank-NDCG algorithm.
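
The fusion step mentioned above can be illustrated with a minimal Python sketch. The abstract does not give the exact operator, so this assumes the commonly cited "optimistic" exponential OWA weighting, in which the i-th largest score receives weight alpha * (1 - alpha)^(i - 1) and the last position receives (1 - alpha)^(n - 1). The function names, the value of alpha, and the three per-category scores are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def exponential_owa_weights(n: int, alpha: float) -> List[float]:
    """Optimistic exponential OWA weights for n inputs (assumed form).

    w_i = alpha * (1 - alpha)**(i - 1) for i = 1..n-1,
    w_n = (1 - alpha)**(n - 1), so the weights sum to 1.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    weights = [alpha * (1.0 - alpha) ** i for i in range(n - 1)]
    weights.append((1.0 - alpha) ** (n - 1))
    return weights


def owa_fuse(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Apply an OWA operator: weights attach to rank positions, not sources."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    # Sort descending so the first weight multiplies the largest score.
    ordered = sorted(scores, reverse=True)
    return sum(w * s for w, s in zip(weights, ordered))


if __name__ == "__main__":
    # Hypothetical relevance scores for one document, one per feature
    # category (query, result, click); the values are illustrative only.
    category_scores = [0.2, 0.9, 0.5]
    w = exponential_owa_weights(len(category_scores), alpha=0.5)
    print(w, owa_fuse(category_scores, w))
```

Because the weights are tied to positions in the sorted score list rather than to particular classifiers, alpha alone controls how strongly the fused score leans toward the most confident classifier: alpha = 1 reduces to the maximum and alpha = 0 to the minimum.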
