Abstract
Pseudo-relevance feedback has been perceived as an effective solution for automatic query expansion. However, a recent study has shown that traditional pseudo-relevance feedback may bring into topic drift and hence be harmful to the retrieval performance. It is often crucial to identify those good feedback documents from which useful expansion terms can be added to the query. Compared with traditional query expansion, XML query expansion needs not only content expansion but also considering structural expansion. This paper presents a solution for both identifying related documents and selecting good expansion information with new content and path constrains. Combined with XML semantic feature, a naive document similarity measurement is proposed in this paper. Based on this, kmedian clustering algorithm is firstly implemented and some related documents are found. Secondly, query expansion is only performed by two steps in the set of related documents, which key phrase extraction algorithm is carried out to expand original query in the first step and the second step is structural expansion based on the expanded key phrases. Finally a full-edged content-structure query expression which can represent user’s intention is formalized. Experimental results on IEEE CS collection show that the proposed method can reduce the topic drift effectively and obtain the better retrieval quality.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.