Abstract

The bag-of-visual-words (BOW) model has been widely applied to image recognition and image classification. However, conventional BOW clusters all scale-invariant feature transform (SIFT) features to construct the visual words, which substantially reduces the discriminative power of the words; the corresponding visual phrases further render the generated BOW histogram sparse. In this study, the authors aim to improve classification accuracy by extracting highly discriminative SIFT features and feature pairs. First, highly discriminative SIFT features are extracted using within- and between-class correlation coefficients. Second, highly discriminative SIFT feature pairs are selected using a minimum spanning tree and its total cost. Finally, the selected features and feature pairs are used to construct a visual-word dictionary and a visual-phrase dictionary, respectively, which are concatenated into a joint histogram with different weights. Experimental results on the Caltech 101 dataset show that the proposed method achieves higher classification accuracy than state-of-the-art BOW-based methods.
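The final fusion step, concatenating the visual-word and visual-phrase histograms with different weights, can be sketched as follows. This is a minimal illustration only: the weight `alpha` and the L1 normalization are assumptions, since the abstract does not specify how the weights are chosen or how the histograms are normalized.

```python
import numpy as np

def joint_histogram(word_hist, phrase_hist, alpha=0.6):
    """Concatenate a visual-word histogram and a visual-phrase histogram
    into one weighted joint histogram.

    alpha is a hypothetical weight balancing the two dictionaries; the
    paper would tune such a weight empirically. Each histogram is
    L1-normalized (an assumption) before weighting, so the joint
    histogram sums to 1.
    """
    w = word_hist / (np.linalg.norm(word_hist, 1) or 1.0)
    p = phrase_hist / (np.linalg.norm(phrase_hist, 1) or 1.0)
    return np.concatenate([alpha * w, (1.0 - alpha) * p])
```

The joint histogram can then be fed to any standard classifier (e.g. an SVM) in place of the plain BOW histogram.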
