Abstract

Image aesthetic assessment (IAA) aims to design algorithms that can make human-like aesthetic decisions. Due to its high subjectivity and complexity, visual information alone is limited to fully predict the aesthetic quality of an image. More and more researchers try to use complementary information from user comments. However, user comments are not always available due to various technical and practical reasons. Therefore, it is necessary to find a way to reconstruct the missing textual information for aesthetic prediction with visual information only. This paper solves this problem by proposing a Confidence-based Dynamic Cross-modal Memory Network (CDCM-Net). Specifically, the proposed CDCM-Net consists of two key components: Visual and Textual Memory (VTM) network and Confidence-based Dynamical Multi-modal Fusion module (CDMF). VTM is based on the key–value memory network. It consists of a visual key memory and a textual value memory. The visual key memory learns the visual information. While the textual value memory learns to remember the textual feature and align them with the corresponding visual features. During inference, textual information can be reconstructed using only visual features. Furthermore, a CDMF module is introduced to perform trustworthy fusion. CDMF evaluates modality-level informativeness and then dynamically integrates reliable information. Extensive experiments are performed to demonstrate the superiority of the proposed method.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.