Abstract

An information retrieval system is a system capable of storing, retrieving, and maintaining information. In this context, information can consist of text (including numeric and date data), images, audio, video, and other multimedia objects. The TF-IDF weight is a statistical measure used to evaluate how important a word is to a document in a collection or corpus. Various models exist for weighting the terms of corpus documents and queries. This work analyzes and evaluates the retrieval effectiveness of the vector space model on the new FIRE 2011 data set. The experiments were performed with TF-IDF and its variants. All experiments and evaluations used the open-source search engine Terrier 3.5. Our results show that the TF-IDF model gives the highest precision values on the new corpus.
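
As a rough illustration of the TF-IDF weight described above, the following minimal sketch computes one common textbook variant: raw term frequency multiplied by log(N / df). The abstract does not state which exact formula or normalization the experiments used, so this variant, the function name `tf_idf`, and the three toy documents are all assumptions made for illustration only; they are not taken from the paper or from Terrier.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    docs is a list of token lists. The weight is raw term frequency
    times log(N / df), an assumed textbook variant, not necessarily
    the one used in the experiments.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)  # raw count of each term in this document
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights


# Hypothetical toy corpus, tokenized by whitespace.
docs = [
    "information retrieval system".split(),
    "retrieval of text documents".split(),
    "audio and video retrieval".split(),
]
w = tf_idf(docs)
```

In this toy corpus, "retrieval" occurs in every document, so its weight is zero everywhere, while a term that occurs in only one document, such as "information", receives the largest weight. This is the sense in which TF-IDF measures how important a word is to one document relative to the collection.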
