Abstract

This paper presents a project called KnowIng camera prototype SyStem (KISS) for real-time places-of-interest (POI) recognition and annotation of smartphone photos, using online geotagged images of POIs as our knowledge base. We propose a “Spatial+Visual” (S+V) framework that consists of a probabilistic field-of-view (pFOV) model in the spatial phase and a sparse coding similarity metric in the visual phase to recognize phone-captured POIs. Moreover, we put forward an offline Collaborative Salient Area (COSTAR) mining algorithm to detect common visual features (called Costars) among the noisy photos geotagged on each POI, thereby cleaning the geotagged image database. The mining result can be used to annotate the region-of-interest on the query image during online query processing. In addition, this mining procedure improves both the efficiency and the accuracy of the S+V framework. Furthermore, we extend the pFOV model into a Bayesian FOV (<inline-formula><tex-math notation="LaTeX"> $\beta$</tex-math></inline-formula> FOV) model, which improves spatial recognition accuracy by more than 30 percent and further reduces visual computation. From a Bayesian point of view, the likelihood of a certain POI being captured by a phone, which is represented as a prior probability in the pFOV model, becomes a posterior probability in the <inline-formula> <tex-math notation="LaTeX">$\beta$</tex-math></inline-formula> FOV model. Our experiments on a real-world dataset and the Oxford 5K dataset show promising recognition results. To provide a fine-grained annotation ground truth, we labeled a new dataset based on Oxford 5K and made it publicly available on the web. Our COSTAR mining technique outperforms the state-of-the-art approach on both datasets.
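The Bayesian framing above (a spatial field-of-view prior refined by a visual-similarity likelihood into a posterior over candidate POIs) can be illustrated with a minimal sketch. This is a hypothetical illustration of the general prior-times-likelihood idea, not the paper's implementation; the function name `posterior_over_pois` and all numeric values are assumptions for illustration.

```python
def posterior_over_pois(spatial_prior, visual_likelihood):
    """Rank candidate POIs by P(poi | photo) ∝ P(poi) * P(photo | poi).

    spatial_prior: dict mapping POI -> prior probability from a
        field-of-view-style spatial model (hypothetical values).
    visual_likelihood: dict mapping POI -> visual similarity score
        treated as a likelihood (hypothetical values).
    Returns a normalized posterior distribution over the POIs.
    """
    unnormalized = {poi: spatial_prior[poi] * visual_likelihood.get(poi, 0.0)
                    for poi in spatial_prior}
    z = sum(unnormalized.values())
    if z == 0.0:
        # No candidate supported by both phases; return a zero distribution.
        return {poi: 0.0 for poi in spatial_prior}
    return {poi: p / z for poi, p in unnormalized.items()}


# Illustrative-only numbers: the spatial phase favors POI "A", but the
# visual phase strongly favors "B", so the posterior ranks "B" first.
prior = {"A": 0.6, "B": 0.3, "C": 0.1}
likelihood = {"A": 0.2, "B": 0.7, "C": 0.1}
posterior = posterior_over_pois(prior, likelihood)
```

The point of the sketch is the division of labor the abstract describes: the spatial phase narrows the candidate set cheaply, and the visual phase only has to discriminate among the surviving candidates.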

