Abstract
Developing agents capable of commonsense reasoning is an important goal in Artificial Intelligence (AI) research. Because commonsense is broadly defined, a computational theory that can formally categorize the various kinds of commonsense knowledge is critical for enabling fundamental research in this area. In a recent book, Gordon and Hobbs described such a categorization, argued to be reasonably complete. However, the theory’s reliability has not been independently evaluated through human annotator judgments. This paper describes such an experimental study, whereby annotations were elicited across a subset of eight foundational categories proposed in the original Gordon-Hobbs theory. We avoid bias by eliciting annotations on 200 sentences from a commonsense benchmark dataset independently developed by an external organization. The results show that, while humans agree on relatively concrete categories like time and space, they disagree on more abstract concepts. The implications of these findings are briefly discussed.
Highlights
There is no single, agreed-upon definition of commonsense, but most definitions include the notion that commonsense is a shared human ability to understand and judge everyday matters (Davis & Marcus, 2015; Minsky, 2006)
Commonsense reasoning in Artificial Intelligence (AI) concerns the development of computational agents capable of achieving human-like performance on tasks that involve commonsense judgements (e.g., “True or False: If you throw a wine glass against a wooden floor, it will probably shatter”, from the CycIC-entangled task)
Eight of the theory’s 48 representational areas were judged by the authors to be examples of foundational categories that are involved in human commonsense reasoning and are commonly used in knowledge representation research
Summary
There is no single, agreed-upon definition of commonsense, but most definitions include the notion that commonsense is a shared human ability to understand and judge everyday matters (Davis & Marcus, 2015; Minsky, 2006). The Gordon-Hobbs theory’s 988 concepts were clustered into 48 representational areas. Eight of these areas (the ones selected for our experiment) were judged by the authors to be examples of foundational categories that are involved in human commonsense reasoning and are commonly used in knowledge representation research. Our interest in these foundational categories is their potential to support the development of a formal logic for commonsense reasoning. Our end-goal is similar in that we draw on the results of this study to advocate for better design of commonsense benchmarks. We note that such a reliable categorization, if it is found to exist, can facilitate several important purposes in the AI sub-community that investigates commonsense reasoning. For some categories, this is the case, while for others, there is disagreement among human annotators.
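Because the study judges a category’s reliability by how much human annotators agree on it, a chance-corrected agreement statistic is the natural analysis tool. As a purely illustrative sketch (the paper does not specify which statistic it uses, and the data below is a toy example, not the study’s annotations), the following Python snippet computes Fleiss’ kappa for one category’s binary yes/no judgments:

```python
# Sketch: per-category inter-annotator agreement via Fleiss' kappa.
# Hypothetical setup: each sentence receives a binary "involves this
# category?" judgment from the same number of annotators.
from collections import Counter

def fleiss_kappa(ratings):
    """ratings: list of per-item label lists; every item is rated by
    the same number of annotators."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    labels = sorted({lab for item in ratings for lab in item})

    # Item-by-label count table.
    counts = [[Counter(item)[lab] for lab in labels] for item in ratings]

    # Observed agreement, averaged over items.
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_items

    # Chance agreement from the marginal label proportions.
    totals = [sum(row[j] for row in counts) for j in range(len(labels))]
    p_e = sum((t / (n_items * n_raters)) ** 2 for t in totals)

    return (p_bar - p_e) / (1 - p_e)

# Toy example: 5 sentences, 3 annotators, judgments for one category ("time").
time_judgments = [
    ["yes", "yes", "yes"],
    ["no", "no", "no"],
    ["yes", "yes", "no"],
    ["no", "no", "no"],
    ["yes", "yes", "yes"],
]
print(f"Fleiss' kappa (time): {fleiss_kappa(time_judgments):.3f}")
```

Running the same computation separately for each of the eight foundational categories would yield one agreement score per category, which is one way the contrast between concrete categories (e.g., time, space) and more abstract ones could be quantified.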