Abstract

In recent years, the Bag-of-Visual-Words (BoVW) model has been widely used in computer vision. However, BoVW ignores both the spatial information and the semantic relations between visual words. In this study, a latent Dirichlet allocation (LDA) based model is proposed to capture the semantic relations among visual words. Because an LDA-based topic model used alone usually degrades performance, it is linearly combined with a visual language model (VLM) to represent each image. On our dataset, the proposed approach is compared with state-of-the-art approaches such as BoVW, LLC, SPM, and VLM. Experimental results indicate that the proposed approach outperforms the original BoVW, LLC, SPM, and VLM.
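The fusion step described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function name, the mixing weight `alpha`, and the assumption that both models yield same-length, nonnegative score vectors per image are all hypothetical.

```python
import numpy as np

def combine_representations(lda_topics, vlm_scores, alpha=0.5):
    """Linearly blend an LDA topic vector with a VLM score vector.

    Hypothetical sketch: both inputs are nonnegative vectors of the same
    length; `alpha` weights the LDA side, (1 - alpha) the VLM side.
    """
    lda = np.asarray(lda_topics, dtype=float)
    vlm = np.asarray(vlm_scores, dtype=float)
    # L1-normalize each vector so neither modality dominates the blend.
    lda = lda / lda.sum()
    vlm = vlm / vlm.sum()
    return alpha * lda + (1 - alpha) * vlm

# Example: blend two toy 3-dimensional representations equally.
combined = combine_representations([2.0, 1.0, 1.0], [1.0, 1.0, 2.0], alpha=0.5)
```

With equal weighting, the result is itself a probability-like vector (it sums to 1), so it can be fed to any classifier that expects a normalized image descriptor.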
