Abstract
With recent advances in machine perception and scene understanding, Visual Question Answering (VQA) has attracted considerable attention from researchers, who train neural models to jointly analyze, ground, and reason over the multi-modal space of image visual context and natural language in order to answer natural language questions about image contents. Although recent works have achieved significant improvements over state-of-the-art models on questions that can be answered solely from the visual context of the image, such models are often incapable of tackling questions that require external world knowledge beyond the visible contents. Research has recently begun to address external-knowledge-based VQA as well, but few studies exist in this area and there is significant room for improvement. Motivated by these challenges, this paper aims to answer free-form, open-ended natural language questions that draw not only on the visual context of an image but on external world knowledge as well. Inspired by the human cognitive ability to comprehend and reason out answers when given a set of facts, this paper proposes a novel architecture that models VQA as a factoid question answering problem, leveraging state-of-the-art deep learning techniques to reason over and infer answers to free-form questions, in an attempt to improve the state of the art in open-ended visual question answering.