Abstract

While learning to rank (LTR) has been widely used in web search to prioritize the most relevant webpages among the retrieved contents subject to the input queries, traditional LTR models fail to deliver decent performance for two main reasons: 1) the lack of well-annotated query-webpage pairs with ranking scores covering search queries of various popularity, and 2) ill-trained models based on a limited number of training samples with poor generalization performance. To improve the performance of LTR models, tremendous efforts have been made on both aspects, such as enlarging training sets with pseudo-labels of ranking scores via self-training, or refining the features used for LTR through feature extraction and dimension reduction. Though LTR performance has been marginally improved, we believe these methods could be further advanced in the newly-fashioned "interpolating regime". Specifically, instead of lowering the number of features used for LTR models, our work proposes to transform the original data with random Fourier features, so as to over-parameterize the downstream LTR models (e.g., GBRank or LightGBM) with features of ultra-high dimensionality and achieve superb generalization performance. Furthermore, rather than self-training with pseudo-labels produced by the same LTR model in a "self-tuned" fashion, the proposed method incorporates the diversity of prediction results between the listwise and pointwise LTR models while co-training both models with a cyclic labeling-prediction pipeline in a "ping-pong" manner. We deploy the proposed Co-trained and Over-parameterized LTR system, COLTR, at Baidu search and evaluate COLTR against a large number of baseline methods. The results show that COLTR achieves $\Delta NDCG_{4}$ = 3.64%~4.92% over the baselines under various ratios of labeled samples. We also conduct a 7-day A/B test on the real-world web traffic of Baidu Search, where we still observe a significant performance improvement of around $\Delta NDCG_{4}$ = 0.17%~0.92% in real-world applications. COLTR performs consistently in both online and offline experiments.
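
As a rough illustration of the over-parameterization step described above, the sketch below maps low-dimensional query-webpage features into a much higher-dimensional space with random Fourier features before fitting a LightGBM ranker. The dimensions, kernel bandwidth, and synthetic data are illustrative assumptions, not the configuration used in COLTR.

    # Minimal sketch (not the paper's production code): over-parameterize an LTR
    # model by mapping query-webpage features to ultra-high-dimensional random
    # Fourier features, then train a listwise-style LightGBM ranker on them.
    import numpy as np
    import lightgbm as lgb

    rng = np.random.default_rng(0)
    n_queries, docs_per_query, d_in, d_rff = 100, 10, 50, 4096   # d_rff >> d_in

    # Synthetic stand-ins for query-webpage feature vectors and relevance grades.
    X = rng.normal(size=(n_queries * docs_per_query, d_in))
    y = rng.integers(0, 5, size=n_queries * docs_per_query)      # 0-4 ranking grades
    group = np.full(n_queries, docs_per_query)                   # docs per query

    # Random Fourier feature map approximating an RBF kernel:
    # z(x) = sqrt(2/D) * cos(W x + b), W ~ N(0, 1/sigma^2), b ~ U[0, 2*pi].
    sigma = 1.0
    W = rng.normal(scale=1.0 / sigma, size=(d_in, d_rff))
    b = rng.uniform(0.0, 2.0 * np.pi, size=d_rff)
    Z = np.sqrt(2.0 / d_rff) * np.cos(X @ W + b)                 # ultra-high-dim features

    # Train the downstream LTR model on the over-parameterized features.
    ranker = lgb.LGBMRanker(objective="lambdarank", n_estimators=200)
    ranker.fit(Z, y, group=group)
    scores = ranker.predict(Z)                                   # ranking scores per webpage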
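The "ping-pong" co-training can likewise be sketched as an alternating pseudo-labeling loop between a pointwise and a listwise model, each labeling the unlabeled pool for the other. The function name, the crude quantization of scores into grades, and the fixed number of rounds are simplifying assumptions for illustration; the deployed COLTR pipeline is more involved.

    # Hedged sketch of cyclic "ping-pong" co-training between a pointwise and a
    # listwise LTR model; cotrain_ping_pong is a hypothetical helper name.
    import numpy as np
    import lightgbm as lgb

    def cotrain_ping_pong(Z_lab, y_lab, group_lab, Z_unlab, group_unlab, rounds=3):
        pointwise = lgb.LGBMRegressor(n_estimators=100)          # pointwise view
        listwise = lgb.LGBMRanker(objective="lambdarank", n_estimators=100)  # listwise view

        pointwise.fit(Z_lab, y_lab)
        listwise.fit(Z_lab, y_lab, group=group_lab)

        for _ in range(rounds):
            # "Ping": listwise scores pseudo-label the pool for the pointwise model
            # (scores crudely quantized into 0-4 grades purely for illustration).
            pseudo_list = np.clip(np.rint(listwise.predict(Z_unlab)), 0, 4).astype(int)
            pointwise.fit(np.vstack([Z_lab, Z_unlab]),
                          np.concatenate([y_lab, pseudo_list]))

            # "Pong": pointwise predictions pseudo-label the pool for the listwise model.
            pseudo_point = np.clip(np.rint(pointwise.predict(Z_unlab)), 0, 4).astype(int)
            listwise.fit(np.vstack([Z_lab, Z_unlab]),
                         np.concatenate([y_lab, pseudo_point]),
                         group=np.concatenate([group_lab, group_unlab]))
        return pointwise, listwise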
