Abstract

The visual question generation task aims to generate high-quality questions about a given image. To make this task applicable to various scenarios, e.g., the growing demand for exams, it is important to generate diverse questions. Existing methods for this task control question diversity through different question types, e.g., “what” and “when.” Although different question types lead to descriptive diversity, they cannot guarantee semantic diversity when asking about the same objects. Research in psychology shows that humans attend to different objects in an image depending on their preferences, which is beneficial for constructing semantically diverse questions. Motivated by this research, we propose a multi-selector visual question generation (MS-VQG) model that focuses on different objects to generate diverse questions. Specifically, our MS-VQG model employs multiple selectors to imitate different humans, each selecting different objects in a given image. Based on these selected objects, the model generates diverse questions, one corresponding to each selector. Extensive experiments on two datasets show that our proposed model outperforms the baselines in generating diverse questions.
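To make the multi-selector idea concrete, the following is a minimal sketch of one plausible reading of the abstract: K independent attention-based selectors, each producing its own weighting over detected object features, so that each selector yields a different context vector for a downstream question decoder. The class name MultiSelector, the attention form, and all dimensions are illustrative assumptions, not the paper's published implementation.

```python
import torch
import torch.nn as nn

class MultiSelector(nn.Module):
    """Hypothetical sketch: K selectors, each attending to different
    objects in the same image (names and design are assumptions)."""

    def __init__(self, num_selectors: int, obj_dim: int, hidden_dim: int):
        super().__init__()
        # One attention scorer per selector, so each selector can
        # "prefer" different objects, imitating different humans.
        self.scorers = nn.ModuleList([
            nn.Sequential(
                nn.Linear(obj_dim, hidden_dim),
                nn.Tanh(),
                nn.Linear(hidden_dim, 1),
            )
            for _ in range(num_selectors)
        ])

    def forward(self, obj_feats: torch.Tensor) -> torch.Tensor:
        # obj_feats: (batch, num_objects, obj_dim), e.g. region
        # features from an off-the-shelf object detector.
        contexts = []
        for scorer in self.scorers:
            weights = scorer(obj_feats).softmax(dim=1)          # (B, N, 1)
            contexts.append((weights * obj_feats).sum(dim=1))   # (B, obj_dim)
        # One context vector per selector; each would condition a
        # separate run of the question decoder.
        return torch.stack(contexts, dim=1)                     # (B, K, obj_dim)

# Usage: 3 selectors over 36 detected objects with 2048-d features.
selector = MultiSelector(num_selectors=3, obj_dim=2048, hidden_dim=512)
contexts = selector(torch.randn(2, 36, 2048))
print(contexts.shape)  # torch.Size([2, 3, 2048])
```

Under this reading, question diversity comes from feeding each of the K context vectors to the same decoder, rather than from varying the question type.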
