Abstract

Mining query sub-intents, or sub-topics, is an important task in information retrieval. It offers the user several candidate queries that cover the possible intents behind a search. With the development of Big Data and Natural Language Processing, pre-trained language models have been applied to model the complex semantic information in different text resources in order to mine robust query sub-intents. Existing studies usually treat search results and query logs as two separate resources and use them independently to generate query sub-intents. However, we argue that the contextual information contained in search results and the user interest information contained in query logs can be combined to make sub-intent mining more effective, drawing on the strengths of both resources. To generate high-quality sub-intents, we design a sequence-to-sequence pre-trained language model that takes search result texts and query suggestions extracted from query logs as input and outputs generated sub-intent phrases. To model the relation between search results and query logs, we design two information encoders and a novel attention mechanism in the decoder. At each decoding step, the model weights its attention between the input search results and query logs to determine the output token. In experiments on the MIMICS dataset, our model outperforms strong baseline methods on almost all evaluation metrics, illustrating the effectiveness of the proposed approach. We also conduct ablation studies to verify the individual contributions of search results and query logs, and we experimentally compare different sub-intent generation paradigms. Finally, we present several generated examples to illustrate the quality of the generated sub-intents directly.
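As a rough illustration of how one decoding step might weight its attention between the two inputs, the following minimal sketch attends separately over search-result and query-log encodings and mixes the two context vectors with a scalar gate. This is not the model's actual formulation, which the abstract does not specify: the gating scheme and the names `attend`, `fuse_sources`, and `gate_weights` are assumptions made only for this example.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(state, memory):
    """Scaled dot-product attention of one decoder state over one encoder's outputs."""
    scale = math.sqrt(len(state))
    weights = softmax([dot(state, row) / scale for row in memory])
    return [sum(w * row[i] for w, row in zip(weights, memory))
            for i in range(len(state))]


def fuse_sources(state, search_memory, log_memory, gate_weights):
    """Hypothetical fusion step: a sigmoid gate mixes the two context vectors.

    gate_weights is an assumed learned vector of length 3 * len(state).
    """
    ctx_search = attend(state, search_memory)  # context from search results
    ctx_logs = attend(state, log_memory)       # context from query logs
    gate_input = state + ctx_search + ctx_logs  # list concatenation
    gate = 1.0 / (1.0 + math.exp(-dot(gate_weights, gate_input)))
    context = [gate * s + (1.0 - gate) * l
               for s, l in zip(ctx_search, ctx_logs)]
    return context, gate


# Toy inputs: a 2-dimensional decoder state, two search-result vectors,
# one query-log vector, and an all-zero gate (equal weight to both sources).
state = [1.0, 0.0]
search_memory = [[1.0, 0.0], [0.0, 1.0]]
log_memory = [[0.5, 0.5]]
context, gate = fuse_sources(state, search_memory, log_memory, [0.0] * 6)
print(gate, context)
```

A gate close to 1 makes the step rely mostly on the search results, and a gate close to 0 makes it rely mostly on the query logs. In a trained model the resulting context vector would feed the output layer that selects the next token.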
