Abstract
The authors present a robust and extendable localization system for monocular images. To achieve both robustness to noise factors and extendibility to unfamiliar scenes, the system combines a traditional content-based image retrieval structure with a CNN feature extraction model to localize monocular images. The core of the system is a deep CNN feature extraction model that maps an image to a d-dimensional space in which image pairs taken close together in the real world have smaller Euclidean distances. The feature extraction model is a deep ConvNet modified from GoogLeNet, and the article proposes a dedicated way to train it using localization results from the Cambridge Landmarks dataset. Experiments show that the system is robust to noise factors, supported by high-level CNN features, and that it extends well to unfamiliar scenes, supported by the generic nature and structure of the feature extraction model.
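The pipeline described above can be summarized as embedding-based retrieval: reference images with known poses are mapped into the d-dimensional feature space, and a query image is localized by finding its nearest reference embedding under Euclidean distance. The following is a minimal sketch of that idea, not the authors' implementation; the `extract_feature` function is a hypothetical stand-in for the modified-GoogLeNet extractor, and the toy poses are illustrative only.

```python
# Sketch of retrieval-based localization with a CNN embedding.
# Assumptions: `extract_feature` stands in for the paper's modified GoogLeNet;
# reference poses are placeholders, not Cambridge Landmarks data.

import numpy as np

def extract_feature(image: np.ndarray, d: int = 128) -> np.ndarray:
    """Hypothetical d-dimensional CNN embedding of an image."""
    rng = np.random.default_rng(abs(hash(image.tobytes())) % (2**32))
    return rng.standard_normal(d)

def build_database(ref_images, ref_poses):
    """Embed reference images whose camera poses are known."""
    feats = np.stack([extract_feature(img) for img in ref_images])
    return feats, list(ref_poses)

def localize(query_image, feats, poses):
    """Return the pose of the reference image nearest in Euclidean distance."""
    q = extract_feature(query_image)
    dists = np.linalg.norm(feats - q, axis=1)
    idx = int(np.argmin(dists))
    return poses[idx], float(dists[idx])

if __name__ == "__main__":
    # Toy example: three reference images with made-up poses, one query.
    refs = [np.random.rand(224, 224, 3).astype(np.float32) for _ in range(3)]
    poses = [{"xyz": (float(i), 0.0, 0.0), "quat": (1.0, 0.0, 0.0, 0.0)} for i in range(3)]
    feats, poses = build_database(refs, poses)
    pose, dist = localize(refs[1], feats, poses)
    print("estimated pose:", pose, "distance:", dist)
```

In this framing, the retrieval structure supplies extendibility (new scenes only require adding embedded reference images to the database), while robustness comes from the learned CNN features rather than hand-crafted descriptors.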