The Core of Smart Cities: Knowledge Representation and Descriptive Framework Construction in Knowledge-Based Visual Question Answering

Ruiping Wang, Shihong Wu, Xiaoping Wang

doi:10.3390/su142013236

Open Access

https://doi.org/10.3390/su142013236

Journal: Sustainability
Publication Date: Oct 14, 2022
Citations: 1
License type: CC BY 4.0

Affiliation: Huazhong University of Science and Technology

Abstract

Visual question answering (VQA), an important form of AI-complete task and visual Turing test with considerable potential application value, has attracted widespread attention from researchers in both computer vision and natural language processing. However, there has been no relevant research on how knowledge is expressed in VQA or how it participates in the answering process. Considering the importance of knowledge for answering questions correctly, this paper analyzes the stratification, expression and participation process of knowledge in VQA and proposes a knowledge description framework (KDF) to guide research on knowledge-based VQA (Kb-VQA). The KDF consists of a basic theory, implementation methods and specific applications. This paper focuses on the mathematical models at the basic theoretical level, as well as the knowledge hierarchy theories and key implementation behaviors built on them. In our experiment, accuracy statistics for VQA reported in the relevant literature provide good corroboration of this paper's findings on knowledge stratification, participation methods and expression forms.

Full Text