Abstract
We propose an enhanced latent topic model, based on latent Dirichlet allocation and convolutional neural networks (CNNs), for event classification and annotation in images. Our model builds on the semantic structure relating events, objects and scenes in images. Starting from initial labels extracted by CNNs, and optionally from user-defined tags, we estimate the event category and final annotation of an image through a refinement process based on the expectation–maximization (EM) algorithm. The EM steps progressively ascertain the event category and refine the final annotation of the image. Our model can be thought of as a two-level annotation system: the first level derives the image event from CNN labels and image tags, and the second level derives the final annotation, which consists of event-related objects and scenes. Experimental results show that the proposed model yields improved classification and annotation performance on two standard datasets, UIUC-Sports and WIDER.
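The two-level idea can be illustrated with a toy sketch. It is not the authors' model: it replaces the LDA-based formulation with a much simpler mixture of fixed label distributions and uses EM only to estimate the event proportions of a single image. The table `LABEL_GIVEN_EVENT`, the function names, and all numeric values are hypothetical and chosen purely for illustration.

```python
# Toy sketch (assumption, not the paper's method): EM over event mixing
# weights for one image, given fixed, made-up p(label | event) values.
LABEL_GIVEN_EVENT = {
    "rowing": {"boat": 0.50, "water": 0.30, "oar": 0.15, "horse": 0.05},
    "polo": {"boat": 0.05, "water": 0.05, "oar": 0.05, "horse": 0.85},
}


def em_event_weights(label_scores, label_given_event, n_iter=50):
    """Estimate event proportions theta for one image via EM.

    label_scores maps each initial label (e.g. from a CNN or user tags)
    to a positive confidence, used as a soft count.
    """
    events = list(label_given_event)
    theta = {e: 1.0 / len(events) for e in events}  # uniform start
    total = sum(label_scores.values())
    for _ in range(n_iter):
        expected = {e: 0.0 for e in events}
        for label, score in label_scores.items():
            # E-step: responsibility of each event for this label.
            joint = {
                e: theta[e] * label_given_event[e].get(label, 1e-6)
                for e in events
            }
            norm = sum(joint.values())
            for e in events:
                expected[e] += score * joint[e] / norm
        # M-step: renormalize expected soft counts into proportions.
        theta = {e: expected[e] / total for e in events}
    return theta


def annotate(label_scores, label_given_event, top_k=2):
    """Level 1: pick the event. Level 2: keep labels related to it."""
    theta = em_event_weights(label_scores, label_given_event)
    event = max(theta, key=theta.get)
    ranked = sorted(
        label_scores,
        key=lambda l: label_scores[l] * label_given_event[event].get(l, 1e-6),
        reverse=True,
    )
    return event, ranked[:top_k], theta
```

Calling `annotate({"boat": 0.9, "water": 0.8, "horse": 0.2}, LABEL_GIVEN_EVENT)` first selects an event and then keeps the labels most compatible with it, so a low-confidence label that does not fit the selected event is dropped from the final annotation.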