Abstract
Sentence retrieval is an information retrieval technique that aims to find sentences corresponding to an information need. It is used for tasks such as question answering (QA) and novelty detection. Because it resembles document retrieval but uses a smaller unit of retrieval, document retrieval methods such as term frequency-inverse document frequency (TF-IDF), BM25, and language modeling-based methods are also used for sentence retrieval. The effect of partial matching of words on sentence retrieval has not been analyzed. We think that sentence retrieval methods could improve substantially if this approach is considered. We adapted term frequency-inverse sentence frequency (TF-ISF), BM25, and language modeling-based methods to test the partial matching of terms by combining sentence retrieval with sequence similarity, which allows matching of words that are similar but not identical. All tests were conducted using data from the novelty tracks of the Text Retrieval Conference (TREC). The scope of this paper was to find out whether such an approach is generally beneficial to sentence retrieval; we did not examine in depth how partial matching helps or hinders the finding of relevant sentences.
Highlights
Information retrieval involves finding material of an unstructured nature that satisfies an information need from within large collections [1].
One of the first and most successful methods for sentence retrieval is the term frequency-inverse sentence frequency (TF-ISF) method, which is an adaptation of the term frequency-inverse document frequency (TF-IDF) method to sentence retrieval [3,5].
Our experiment was entirely focused on sentence retrieval, which represents the first task of novelty detection.
Summary
Information retrieval involves finding material (e.g., documents) of an unstructured nature (e.g., text) that satisfies an information need from within large collections [1]. Sentence retrieval is similar to document retrieval and is defined as the task of acquiring relevant sentences in response to a query, question, or reference sentence [2], or as the task of finding relevant sentences in a document [3,4]. It can be used in various ways to simplify the end user's task of finding the right information in document collections [4]. BM25 and language modeling-based methods are also used for sentence retrieval, where the sentence is the unit of retrieval [6].
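To make the idea of partial matching in TF-ISF concrete, the following is a minimal sketch and not the implementation evaluated in the paper. It assumes a commonly cited form of the TF-ISF score, log(tf_q + 1) * log(tf_s + 1) * log((n + 1) / (0.5 + sf)) summed over query terms, and it assumes Python's `difflib.SequenceMatcher` ratio as a stand-in for the sequence similarity measure. The function names, the tokenizer, and the 0.8 threshold are all hypothetical choices made for illustration.

```python
import math
from difflib import SequenceMatcher

PUNCT = ".,;:!?()\"'"


def tokenize(text):
    # Lowercase, split on whitespace, and strip surrounding punctuation.
    words = (w.strip(PUNCT).lower() for w in text.split())
    return [w for w in words if w]


def similarity(a, b):
    # Assumed similarity measure: difflib's ratio in [0, 1]; 1.0 means identical.
    return SequenceMatcher(None, a, b).ratio()


def soft_count(term, tokens, threshold):
    # Partial-match term frequency: each token that is similar enough to the
    # term contributes its similarity (an exact match contributes 1.0).
    sims = (similarity(term, tok) for tok in tokens)
    return sum(s for s in sims if s >= threshold)


def rank_sentences(query, sentences, threshold=0.8):
    # Score every sentence against the query and return (score, sentence)
    # pairs sorted from highest to lowest score.
    q_tokens = tokenize(query)
    s_tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    scored = []
    for sentence, tokens in zip(sentences, s_tokens):
        score = 0.0
        for term in dict.fromkeys(q_tokens):  # unique terms, original order
            tf_q = q_tokens.count(term)
            tf_s = soft_count(term, tokens, threshold)
            # Sentence frequency: sentences with at least one (partial) match.
            sf = sum(1 for other in s_tokens
                     if soft_count(term, other, threshold) > 0)
            score += (math.log(tf_q + 1)
                      * math.log(tf_s + 1)
                      * math.log((n + 1) / (0.5 + sf)))
        scored.append((score, sentence))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    docs = [
        "Sentence retrieval finds relevant text.",
        "The weather is sunny today.",
        "Document ranking uses term statistics.",
    ]
    for score, sentence in rank_sentences("retrieving sentences", docs):
        print(f"{score:.3f}  {sentence}")
```

Setting `threshold=1.0` reduces the sketch to exact term matching, so the same function can be used to compare a baseline against the partial-matching variant. In the example query, "sentences" partially matches "sentence", so the first sentence receives a nonzero score even though no query word occurs in it verbatim.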