Abstract
We tackle the problem of mobile visual search. The Moving Picture Experts Group (MPEG) has completed a standard named Compact Descriptors for Visual Search (CDVS), which provides a standardized syntax for image retrieval applications. CDVS applies principal component analysis (PCA) to reduce the dimensionality of local feature descriptors before they enter the global descriptor pipeline, and uses the traditional Fisher vector (FV) to aggregate the local feature descriptors. However, the descriptor components of SIFT and FV have highly non-Gaussian statistics, and applying a single PCA transform can in fact hurt compression performance at high rates. We develop a network-based architecture that combines neural networks with an FV layer to obtain the Fisher vector. Our architecture has two advantages over the CDVS global descriptor pipeline: first, we employ autoencoder networks to reduce the dimensionality of the data; second, we use a trainable system to learn the parameters after the FV codebook has been obtained. Experiments demonstrate a clear advantage of the proposed architecture on the CDVS retrieval task.
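For orientation, the baseline pipeline described above (PCA on local descriptors, followed by Fisher vector aggregation) can be sketched as follows. This is a minimal illustration and not the CDVS reference implementation or the proposed network. The function names, the 128-to-32 reduction, the 4-component mixture, and the random codebook are all assumptions chosen for brevity. Only the gradient with respect to the mixture means is computed, and the result is simply L2-normalized.

```python
import numpy as np

def pca_fit(X, k):
    """Return the data mean and the top-k principal axes of the rows of X."""
    mu = X.mean(axis=0)
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:k]

def pca_apply(X, mu, axes):
    """Project the rows of X onto the given principal axes."""
    return (X - mu) @ axes.T

def fisher_vector_means(X, weights, means, variances):
    """Fisher vector restricted to gradients w.r.t. the GMM means.

    X: (N, D) descriptors; weights: (K,); means, variances: (K, D),
    with a diagonal covariance per component. Returns a (K*D,) vector.
    """
    N = X.shape[0]
    diff = X[:, None, :] - means[None, :, :]              # (N, K, D)
    # Log of the weighted diagonal Gaussian density of each component.
    log_p = (np.log(weights)[None, :]
             - 0.5 * np.sum(np.log(2 * np.pi * variances), axis=1)[None, :]
             - 0.5 * np.sum(diff ** 2 / variances[None], axis=2))
    log_p -= log_p.max(axis=1, keepdims=True)              # numerical stability
    gamma = np.exp(log_p)
    gamma /= gamma.sum(axis=1, keepdims=True)              # posteriors (N, K)
    # Accumulate posterior-weighted, variance-normalized residuals.
    g = np.einsum('nk,nkd->kd', gamma, diff / np.sqrt(variances)[None])
    g /= N * np.sqrt(weights)[:, None]
    fv = g.ravel()
    norm = np.linalg.norm(fv)
    return fv / norm if norm > 0 else fv

# Illustrative run on random stand-ins for SIFT descriptors.
rng = np.random.default_rng(0)
descriptors = rng.normal(size=(200, 128))
mu, axes = pca_fit(descriptors, 32)
reduced = pca_apply(descriptors, mu, axes)

K = 4                                   # hypothetical codebook size
weights = np.full(K, 1.0 / K)
means = rng.normal(size=(K, 32))        # placeholder, not a trained codebook
variances = np.ones((K, 32))
fv = fisher_vector_means(reduced, weights, means, variances)
```

In practice the mixture parameters would come from a codebook fitted to training descriptors. The architecture proposed here replaces the single PCA step with autoencoder networks and makes the remaining parameters trainable.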