Abstract

Random forest (RF)-based pointwise learning-to-rank (LtR) algorithms use surrogate loss functions to minimize the ranking error. Despite performance competitive with other state-of-the-art LtR algorithms, these algorithms, unlike frameworks such as boosting and neural networks, have not yet been thoroughly investigated in the literature. In the first part of this study, we aim to better understand and improve RF-based pointwise LtR algorithms. When working with such an algorithm, one currently must choose among a number of available settings, including (1) a classification versus a regression setting, (2) using absolute relevance judgements versus mapped labels, (3) the number of features considered when choosing a split point for the data, and (4) using a weighted versus an unweighted average of the predictions of the multiple base learners (i.e., trees). We conduct a thorough study of these four aspects, as well as of a pairwise objective function for RF-based rank-learners. Experimental results on several benchmark LtR datasets demonstrate that exploring these aspects can significantly improve performance. In the second part of the paper, guided by our investigation of RF-based rank-learners, we conduct an extensive comparison between these and state-of-the-art rank-learning algorithms. This comparison reveals some interesting and insightful findings about LtR algorithms, including that RF-based LtR algorithms are among the most robust techniques across datasets with diverse properties.
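To make aspect (4) concrete, the following is a minimal illustrative sketch of how a pointwise RF ranker aggregates per-tree relevance predictions for a document, contrasting an unweighted mean with a weighted mean, and then ranks documents by the aggregated score. The tree predictions and per-tree weights below are hypothetical stand-ins (e.g., the weights might come from each tree's validation performance); this is not the authors' implementation.

```python
def aggregate(tree_preds, weights=None):
    """Combine per-tree predicted relevance scores for one document.

    tree_preds: list of predictions, one per tree in the forest.
    weights: optional per-tree weights (hypothetically, e.g., each
             tree's validation quality); None gives the plain
             unweighted average.
    """
    if weights is None:
        return sum(tree_preds) / len(tree_preds)
    total = sum(weights)
    return sum(p * w for p, w in zip(tree_preds, weights)) / total


def rank(doc_scores):
    """Return document ids sorted by descending predicted relevance."""
    return sorted(doc_scores, key=doc_scores.get, reverse=True)


# Three documents, each scored by a (toy) forest of three trees.
preds = {
    "d1": [2.0, 3.0, 2.0],
    "d2": [1.0, 1.0, 2.0],
    "d3": [3.0, 2.0, 3.0],
}
tree_weights = [0.5, 0.2, 0.3]  # hypothetical per-tree quality weights

unweighted = {d: aggregate(p) for d, p in preds.items()}
weighted = {d: aggregate(p, tree_weights) for d, p in preds.items()}

print(rank(unweighted))  # ['d3', 'd1', 'd2']
print(rank(weighted))
```

Under the pointwise view, each document's score is predicted independently and the ranking is induced by sorting; the weighted variant simply lets better-performing trees contribute more to the score.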
