Abstract

When a user searches with an information retrieval (IR) system, the documents returned do not always match the user's information need. Query expansion (QE) based on pseudo relevance feedback (PRF) tries to overcome this problem by adding terms to the query, drawn from the top N ranked documents, that are expected to improve retrieval results. A previous study showed that the firefly algorithm (FA), an optimization method, can improve the performance of an IR system. However, that study weighted terms using the Rocchio function over the pseudo-relevant documents (PRD), so IR performance may degrade when the PRD contain few or no relevant documents. This study therefore scores terms by the term relationship between the query and the PRD, in combination with the Rocchio algorithm. The results show that using a term relationship, either word co-occurrence or word similarity, improves the performance of the previously built IR system. In addition, word co-occurrence with the Jaccard coefficient performs best, compared with both the previous study and the other combinations. FA was able to select the optimal terms even as the number of top N ranked documents increased. Furthermore, combining the term relationship with the Rocchio algorithm yields a higher convergence rate than using the term relationship without the Rocchio algorithm.
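To make the scoring idea concrete, the following is a minimal sketch of how a candidate expansion term might be scored by Jaccard word co-occurrence with the query terms over the PRD, and then combined with a Rocchio-style weight. It is an illustration under stated assumptions, not the study's implementation: the function names, the additive combination, the value `beta=0.75`, and the toy documents are all hypothetical, and the firefly-based selection of terms is not shown.

```python
from collections import Counter


def jaccard_cooccurrence(term, query_term, docs):
    """Jaccard coefficient over the sets of documents that contain each term."""
    with_term = {i for i, doc in enumerate(docs) if term in doc}
    with_query = {i for i, doc in enumerate(docs) if query_term in doc}
    union = with_term | with_query
    return len(with_term & with_query) / len(union) if union else 0.0


def rocchio_weight(term, docs, beta=0.75):
    """Positive-feedback part of Rocchio: beta times the mean term frequency in the PRD."""
    if not docs:
        return 0.0
    return beta * sum(Counter(doc)[term] for doc in docs) / len(docs)


def score_candidates(query, docs, beta=0.75):
    """Score every non-query term found in the pseudo-relevant documents."""
    vocabulary = {term for doc in docs for term in doc} - set(query)
    scores = {}
    for term in vocabulary:
        relation = sum(jaccard_cooccurrence(term, q, docs) for q in query)
        # Assumption: the two signals are simply added; the abstract does not
        # state how the study combines them.
        scores[term] = relation + rocchio_weight(term, docs, beta)
    return scores


# Toy pseudo-relevant documents, already tokenized (hypothetical data).
prd = [
    ["firefly", "algorithm", "optimization"],
    ["firefly", "swarm", "optimization"],
    ["query", "expansion", "retrieval"],
]
scores = score_candidates(["firefly"], prd)
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy example, a term that appears in the same documents as the query term receives a high score even when the third document is off-topic. This is the kind of robustness to non-relevant PRD that motivates adding a term relationship to the Rocchio weight.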
