Abstract

Visual Question Answering (VQA) is the task of predicting an answer to a question based on the content of an image. However, recent VQA methods rely more on language priors between the question and the answer than on the image content. To address this issue, many debiasing methods have been proposed to reduce language bias in model reasoning. Bias, however, can be divided into two categories: good bias and bad bias. Good bias can benefit answer prediction, while bad bias may cause the model to rely on unrelated information. Therefore, instead of excluding good and bad bias indiscriminately, as existing debiasing methods do, we propose a bias discrimination module to distinguish them. Additionally, bad bias may reduce the model's reliance on image content during answer reasoning, so that the model pays little attention to updating image features. To tackle this, we leverage Markov theory to construct a Markov field with image regions and question words as nodes. This helps with feature updating for both image regions and question words, thereby facilitating more accurate and comprehensive reasoning about both the image content and the question. We evaluate our network on the VQA v2 and VQA-CP v2 datasets and conduct extensive quantitative and qualitative studies to verify its effectiveness. Experimental results show that our network achieves strong performance compared with previous state-of-the-art methods.
