Abstract
The ranking function is instrumental to the performance of a search engine. Designing and optimizing a search engine's ranking function remains a daunting task for computer and information scientists. Recently, genetic programming (GP), a machine learning technique based on evolutionary theory, has shown promise in tackling this difficult problem. Ranking functions discovered by GP have been found to be significantly better than many existing ranking functions. However, current GP implementations for ranking function discovery are all built on the Vector Space model, in which the same term weighting strategy is applied to all terms in a document. This may not be an ideal representation scheme at the individual query level, because different query terms should often play different roles in the final ranking. In this paper, we propose a novel nonlinear ranking function representation scheme and compare this new design to the well-known Vector Space model. We show theoretically that the new representation scheme subsumes the traditional Vector Space model representation scheme as a special case and hence allows for additional flexibility in term weighting. We test the new representation scheme with the GP-based discovery framework in a personalized search (information routing) context using a TREC web corpus. The experimental results show that the new ranking function representation design outperforms the traditional Vector Space model for GP-based ranking function discovery.