Abstract

We propose a model for free-form visual question answering (VQA) from human brain activity. The task of VQA is to provide an answer given an image and a question about that image. Given brain activity data measured by functional magnetic resonance imaging (fMRI) and a natural language question about the viewed image, the proposed method provides an accurate natural language answer using a VQA algorithm. Visual questions selectively address different regions of an image, such as objects and backgrounds, so answering them typically requires a more detailed understanding of the image and more complex reasoning than general image captioning. In this paper, we propose a method for answering a given question about a viewed image from fMRI data based on a VQA algorithm. We estimate the relationship between fMRI data and visual features extracted from the viewed images, and use this relationship to convert fMRI data into visual features. Finally, the proposed method answers a visual question from fMRI data measured while subjects view images. Experimental results show that the proposed method enables accurate answering of questions about viewed images.
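
The decoding pipeline described above can be sketched roughly as follows. This is a minimal illustration only, assuming ridge regression for the fMRI-to-feature mapping and some pretrained VQA model; the variable names, array shapes, and the `vqa_model` interface are hypothetical and do not reflect the paper's actual implementation.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Hypothetical data (shapes are illustrative assumptions):
#   fmri_train: fMRI responses to training images, (n_samples, n_voxels)
#   feat_train: visual features (e.g., CNN activations) of the same images, (n_samples, n_dims)
rng = np.random.default_rng(0)
fmri_train = rng.standard_normal((200, 5000))
feat_train = rng.standard_normal((200, 2048))
fmri_test = rng.standard_normal((10, 5000))

# Step 1: estimate the relationship between fMRI data and visual features
# (here with ridge regression, one common choice for fMRI decoding).
decoder = Ridge(alpha=1.0)
decoder.fit(fmri_train, feat_train)

# Step 2: convert held-out fMRI data into predicted visual features.
feat_pred = decoder.predict(fmri_test)  # shape (10, 2048)

# Step 3: feed the decoded visual features and the question to a VQA model.
# `vqa_model.answer` is a placeholder for whatever VQA algorithm is used.
# answer = vqa_model.answer(visual_features=feat_pred[0],
#                           question="What color is the car?")
```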
