Abstract

Pseudorelevance feedback (PRF) was proposed to overcome a limitation of relevance feedback (RF), which depends on a user-in-the-loop process. In PRF, the top-k retrieved images are treated as relevant without any user judgment. Although this pseudorelevant set contains noise, PRF has proven effective for automatically improving overall retrieval results. The Rocchio algorithm is regarded as a reasonable and well-established baseline for implementing PRF. However, the performance of Rocchio-based PRF depends on various representation choices (or factors). In this article, we examine the factors that affect the performance of Rocchio-based PRF: the image-feature representation, the number of top-ranked images, the weighting parameters of Rocchio, and the similarity measure. We offer practical insights on how to optimize the performance of Rocchio-based PRF by making appropriate representation choices. Our extensive experiments on the NUS-WIDE-LITE and Caltech 101 + Corel 5000 data sets show that the best feature representation in terms of retrieval efficiency and effectiveness is color moment + wavelet texture. As for the other choices, the best retrieval results are obtained by using the top-20 ranked images as the pseudopositive and pseudonegative feedback sets, weighting the two sets equally (i.e., 0.5), and measuring similarity with the correlation and cosine distance functions.
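
To make the procedure concrete, the following is a minimal sketch of one round of Rocchio-based PRF over plain feature vectors. It is not the authors' implementation. The function names, the choice to keep the original query with weight `alpha = 1.0`, and the choice to draw the pseudonegative set from the bottom of the initial ranking are all assumptions made for illustration; the abstract specifies only the set size (20), the equal weights (0.5), and the distance functions. Only the cosine distance is shown.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def rank(query, database):
    """Return database indices sorted from most to least similar to the query."""
    return sorted(range(len(database)),
                  key=lambda i: cosine_distance(query, database[i]))


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def rocchio_prf(query, database, k=20, alpha=1.0, beta=0.5, gamma=0.5):
    """One round of Rocchio-based pseudorelevance feedback (illustrative).

    Assumptions, not taken from the paper: the pseudopositive set is the
    top-k of the initial ranking, the pseudonegative set is the bottom-k,
    and the original query keeps weight alpha.
    """
    initial = rank(query, database)
    # Cap k so the two feedback sets never overlap on small collections.
    k = min(k, len(database) // 2)
    if k == 0:
        return initial
    positive = centroid([database[i] for i in initial[:k]])
    negative = centroid([database[i] for i in initial[-k:]])
    # Rocchio update: move toward the positive centroid, away from the negative.
    new_query = [alpha * q + beta * p - gamma * n
                 for q, p, n in zip(query, positive, negative)]
    return rank(new_query, database)
```

Calling `rocchio_prf(query, database)` returns a re-ranked list of indices into `database`. Swapping in a correlation distance would only require replacing `cosine_distance` with a function that mean-centers both vectors before computing the same quantity.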
