Abstract

We present an information retrieval system that simultaneously allows to search for text and speech documents. The retrieval system accepts vague queries and performs a best-match search to find those documents that are relevant to the query. The output of the retrieval system is a list of ranked documents where the documents on the top of the list satisfy best the user's information need. The relevance of the documents is estimated by means of metadata (document description vectors). The metadata is automatically generated and it is organized such that queries can be processed efficiently. We introduce a controlled indexing vocabulary for both speech and text documents. The size of the new indexing vocabulary is small (1000 features) compared with the sizes of indexing vocabularies of conventional text retrieval (10000 - 100000 features). We show that the retrieval effectiveness based on such a small indexing vocabulary is similar to the retrieval effectiveness of a Boolean retrieval system.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.