Abstract

<p>With the massive and fast-growing amount of information on the Web, maintaining the effectiveness of Information Retrieval (IR) is a real challenge. A system in charge of online search must be able to search through billions of documents stored on millions of devices (Manning et al., 2010). Traditional information retrieval systems answer input queries mostly by emphasizing lexical similarity and exact term matching between the query and the documents, using frequency-based methods. In other words, the relevance of a document to a query is judged by how closely the distribution of words in the candidate document matches the query. Since the user usually does not know the lexical content of the optimal response, the user formulates a query whose vocabulary may have minimal overlap with the vocabulary of the optimal document. Low overlap between query and document vocabulary is called term mismatch, and it shows up in retrieval results as poor recall. The term mismatch problem has also been referred to as the lexical gap or lexical chasm, with the query on one side of the gap and the documents on the other. IR systems use different techniques to bridge the lexical chasm and solve the term mismatch problem, and many query refinement techniques have already been developed. Given the user query, each refinement technique outputs a modified version of the query that serves as a bridge over the lexical gap from the query side to the document side.</p>
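
To make term mismatch concrete, the following minimal Python sketch counts the terms a query shares with a document before and after a simple synonym-based expansion. It is an illustration only, not one of the refinement techniques the abstract refers to: the synonym table `HYPOTHETICAL_SYNONYMS`, the helper names, and the sample query and document are all assumptions made up for this example.

```python
def tokenize(text):
    """Lowercase the text and split on whitespace into a set of terms."""
    return set(text.lower().split())


def shared_terms(query, document):
    """Return the sorted list of terms that appear in both query and document."""
    return sorted(tokenize(query) & tokenize(document))


# Hypothetical synonym table, invented for this illustration only.
HYPOTHETICAL_SYNONYMS = {
    "car": ["automobile", "vehicle"],
    "fix": ["repair"],
}


def expand_query(query, synonyms):
    """Append known synonyms of each query term, keeping the original terms first."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        for synonym in synonyms.get(term, []):
            if synonym not in expanded:
                expanded.append(synonym)
    return " ".join(expanded)


document = "automobile repair manual for engine maintenance"
query = "fix car engine"

# Only one query term matches the document: the lexical gap.
before = shared_terms(query, document)

# The refined query shares more vocabulary with the same document.
refined = expand_query(query, HYPOTHETICAL_SYNONYMS)
after = shared_terms(refined, document)
```

Here the original query and the document are about the same topic but share only one term, so an exact-match retriever would rank the document poorly. The expanded query matches three terms, which is the sense in which a refined query bridges the gap from the query side.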
