Answer-Type Prediction for Visual Question Answering

Kushal Kafle, Christopher Kanan

doi:10.1109/cvpr.2016.538

Open Access

Publication Date: Jun 1, 2016

Citations: 134

Affiliation: Rochester Institute of Technology

#Visual Question Answering #Visual Question Answering Dataset

Abstract

Recently, algorithms for object recognition and related tasks have become sufficiently proficient that new vision tasks can now be pursued. In this paper, we build a system capable of answering open-ended text-based questions about images, which is known as Visual Question Answering (VQA). Our approach's key insight is that we can predict the form of the answer from the question. We formulate our solution in a Bayesian framework. When our approach is combined with a discriminative model, the combined model achieves state-of-the-art results on four benchmark datasets for open-ended VQA: DAQUAR, COCO-QA, The VQA Dataset, and Visual7W.
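The abstract does not give the model's equations, so the following is only a toy sketch of how a predicted answer type might be combined with type-conditioned answer scores. It assumes the final answer distribution is obtained by marginalizing over answer types, P(a | q, x) = sum over t of P(a | t, q, x) P(t | q). The function name, the answer types, and all probabilities are hypothetical and are not taken from the paper.

```python
def predict_answer(type_probs, answer_probs_by_type):
    """Combine P(type | question) with P(answer | type, question, image).

    type_probs: dict mapping answer type -> P(type | question)
    answer_probs_by_type: dict mapping answer type -> {answer: probability}
    Returns the highest-scoring answer and the full score table.
    """
    scores = {}
    for answer_type, p_type in type_probs.items():
        # Each type contributes its answers, weighted by how likely the type is.
        for answer, p_answer in answer_probs_by_type.get(answer_type, {}).items():
            scores[answer] = scores.get(answer, 0.0) + p_type * p_answer
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical numbers for a question such as "What color is the bus?"
type_probs = {"color": 0.9, "count": 0.05, "yes/no": 0.05}
answer_probs_by_type = {
    "color": {"red": 0.7, "blue": 0.3},
    "count": {"two": 1.0},
    "yes/no": {"yes": 0.5, "no": 0.5},
}

best, scores = predict_answer(type_probs, answer_probs_by_type)
print(best, scores[best])
```

In this toy setup the question alone makes the "color" type dominant, so answers of other types are suppressed regardless of how confident their type-specific scores are.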
