Abstract
Incorporating semantic information into document representations is an effective and potentially significant way to improve retrieval performance. Recently, the log-bilinear language model (LBL), a form of neural language model, has been shown to be an effective way to learn semantic word representations, but its feasibility and effectiveness in information retrieval remain largely unexplored. In this paper, we study how to efficiently use LBL to improve ad hoc retrieval. We propose a log-bilinear document language model (LB-DM) within the language modeling framework. The key idea is to learn semantically oriented representations for words and to estimate document language models based on these representations. Noise-contrastive estimation is employed to perform fast training on large document collections. Experimental results on standard TREC collections show that LB-DM performs better than the translation language model and the LDA-based retrieval model.
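For readers unfamiliar with the log-bilinear model named above, the following is a minimal sketch of how a generic LBL turns context word representations into a next-word distribution. It illustrates the general LBL formulation only, not the LB-DM estimator or its noise-contrastive training, whose details the abstract does not give. The names `R`, `Q`, `C`, `b`, the function name, and the toy sizes are all assumptions made for illustration.

```python
import math
import random


def lbl_next_word_probs(context_ids, R, Q, C, b):
    """Generic log-bilinear next-word distribution (illustrative only).

    R: context word vectors (V x d)     -- assumed name
    Q: target word vectors (V x d)      -- assumed name
    C: one d x d matrix per context position
    b: per-word bias (length V)
    """
    d = len(R[0])
    # Predicted representation: sum over positions of C[pos] @ R[word].
    pred = [0.0] * d
    for pos, w in enumerate(context_ids):
        for row in range(d):
            pred[row] += sum(C[pos][row][col] * R[w][col] for col in range(d))
    # Score each candidate word: dot product with the prediction, plus bias.
    scores = [sum(p * q for p, q in zip(pred, Q[v])) + b[v]
              for v in range(len(Q))]
    # Softmax, subtracting the max score for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


# Toy parameters (random, untrained): 5-word vocabulary, 3 dimensions,
# 2 context positions.
rng = random.Random(0)
V, d, n = 5, 3, 2
R = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(V)]
Q = [[rng.uniform(-1, 1) for _ in range(d)] for _ in range(V)]
C = [[[rng.uniform(-1, 1) for _ in range(d)] for _ in range(d)]
     for _ in range(n)]
b = [0.0] * V

probs = lbl_next_word_probs([1, 3], R, Q, C, b)
```

The softmax normalizes over the whole vocabulary, so its cost grows with vocabulary size. Noise-contrastive estimation is a common way to avoid computing that normalizer during training, which is presumably why it matters for large document collections.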