Outlier Detection from Pooled Data for Image Retrieval System Evaluation

Wei Xiong, Changsheng Xu, Ning Zhang, Joo Hwee Lim, Kelvin Foong, S.H Ong, Qi Tian

doi:10.1109/icassp.2007.366073

Abstract

Widely used in the evaluation of retrieval systems, the pooling method collects the top-ranked images from the submitted retrieval systems, possibly resulting in a very large pool of images. Inevitably, the pool may contain outliers. Human experts then manually annotate the relevance of the pooled images to create a ground truth for evaluation. Studies show that this annotation is time-consuming, tedious, and inconsistent. To reduce the human workload, this paper introduces an automatic method to detect outliers. Unlike traditional detection methods, which use unsupervised techniques only, we apply supervised and unsupervised techniques sequentially, since both positive and negative examples are (partially) available in this context. Specifically, support vector machines (SVMs) and fuzzy c-means clustering are used to predict data relevance and outlierness. Performance improvements obtained with our method after outlier removal have been validated on the medical image retrieval task of ImageCLEF 2004.
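
As an illustration of the pooling step only, the following minimal sketch forms a pool as the union of the top-ranked image IDs from each submitted run. The function name `build_pool`, the `depth` parameter, and the example runs are assumptions made for illustration and are not taken from the paper; the sketch does not cover the SVM or fuzzy c-means stages.

```python
from typing import Dict, List, Set


def build_pool(runs: Dict[str, List[str]], depth: int) -> Set[str]:
    """Return the union of the top-`depth` image IDs from every run.

    `runs` maps a (hypothetical) system name to its ranked list of image IDs,
    best match first.
    """
    pool: Set[str] = set()
    for ranked_ids in runs.values():
        # Slicing keeps only the first `depth` entries of each ranking.
        pool.update(ranked_ids[:depth])
    return pool


# Hypothetical submitted runs, invented for this example.
runs = {
    "system_a": ["img3", "img1", "img7", "img2"],
    "system_b": ["img1", "img9", "img3", "img5"],
    "system_c": ["img8", "img1", "img4", "img6"],
}

pool = build_pool(runs, depth=2)
print(sorted(pool))  # ['img1', 'img3', 'img8', 'img9']
```

Because the pool is a union, an image retrieved by only one system (such as `img8` or `img9` here) still enters the pool and must be judged, which is how rarely retrieved and possibly irrelevant images reach the annotators.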

Full Text