Question‐aware prediction with candidate answer recommendation for visual question answering

B Kim, J Kim

doi:10.1049/el.2017.1881

Abstract

An approach to visual question answering is described. The proposed network solves the open-ended problem using candidate answer recommendation, which is generated solely from the given question. The score from the proposed question-aware prediction module and the score from the candidate answer recommendation module are then combined to determine the final composite score. The proposed approach uses a bag-of-words framework to understand questions instead of a complex neural-network-based module; therefore, no additional dataset is required to pre-train a language model. Although the proposed approach does not achieve state-of-the-art performance overall, it performs best for certain types of questions with a small amount of training data.
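The two ideas named in the abstract, a bag-of-words question encoding and a composite score built from two per-answer scores, can be illustrated with a minimal sketch. The abstract does not state the combination rule, so the weighted sum below is an assumption, and the function names, vocabulary, candidate answers, and score values are all hypothetical placeholders rather than the authors' implementation.

```python
from collections import Counter


def bag_of_words(question, vocabulary):
    """Count how often each vocabulary word occurs in the question."""
    tokens = question.lower().rstrip("?").split()
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def combine_scores(prediction_scores, recommendation_scores, weight=0.5):
    """Merge two per-answer score tables into one composite table.

    ASSUMPTION: a weighted sum is used here purely for illustration;
    the abstract does not specify how the two scores are combined.
    """
    return {
        answer: weight * score
        + (1.0 - weight) * recommendation_scores.get(answer, 0.0)
        for answer, score in prediction_scores.items()
    }


def best_answer(composite_scores):
    """Return the answer with the highest composite score."""
    return max(composite_scores, key=composite_scores.get)


# Hypothetical toy vocabulary and question.
vocabulary = ["what", "color", "is", "the", "how", "many"]
question_vector = bag_of_words("What color is the cat?", vocabulary)

# Hypothetical module outputs; a real system would compute these.
prediction_scores = {"black": 0.5, "two": 0.4, "yes": 0.1}
recommendation_scores = {"black": 0.7, "two": 0.1, "yes": 0.2}

composite = combine_scores(prediction_scores, recommendation_scores)
answer = best_answer(composite)
```

In this toy setup the recommendation scores depend only on the question, so they can down-weight answers of the wrong type (such as a count for a colour question) before the composite score is ranked.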
