Abstract

In this theoretical paper, we consider the notion of semantic competence and its relation to general language understanding, one of the most sought-after goals of Artificial Intelligence. We revisit three main accounts of competence involving (a) lexical knowledge; (b) truth-theoretic reference; and (c) causal chains in language use. We argue that all three are needed to reach a notion of meaning in artificial agents and suggest that they can be combined in a single formalisation, in which competence develops from exposure to observable performance data. We introduce a theoretical framework which translates set theory into vector-space semantics by applying distributional techniques to a corpus of utterances associated with truth values. The resulting meaning space naturally satisfies the requirements of a causal theory of competence, but it can also be regarded as some ‘ideal’ model of the world, allowing extensions and standard lexical relations to be retrieved.
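
As a rough illustration of the kind of construction described above, the sketch below is a toy reading, not the paper's actual formalisation: the corpus format, the triples, and the helper names (`extension`, `similarity`) are all illustrative assumptions. It builds a Boolean entity-by-predicate matrix from truth-valued utterances and treats its columns as vectors, so that set-theoretic extensions and distributional similarity are read off the same object.

```python
# Toy sketch: an entity-by-predicate matrix built from truth-valued utterances.
# This is an illustrative assumption, not the formalisation developed in the paper.
import numpy as np

# Hypothetical "performance data": (entity, predicate, truth value) triples.
utterances = [
    ("tweety", "bird", True), ("tweety", "flies", True),
    ("rex", "dog", True), ("rex", "flies", False),
    ("opus", "bird", True), ("opus", "flies", False),
]

entities = sorted({e for e, _, _ in utterances})
predicates = sorted({p for _, p, _ in utterances})

# M[i, j] = 1 iff predicate j was asserted true of entity i.
# (A fuller treatment would distinguish 'asserted false' from 'never asserted'.)
M = np.zeros((len(entities), len(predicates)))
for e, p, v in utterances:
    if v:
        M[entities.index(e), predicates.index(p)] = 1.0

def extension(pred):
    """Set-theoretic reading: the entities of which pred holds."""
    col = M[:, predicates.index(pred)]
    return {e for e, val in zip(entities, col) if val == 1.0}

def similarity(p1, p2):
    """Distributional reading: cosine between predicate (column) vectors."""
    a, b = M[:, predicates.index(p1)], M[:, predicates.index(p2)]
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

print(extension("bird"))            # e.g. {'tweety', 'opus'}
print(similarity("bird", "flies"))  # higher than similarity("dog", "flies")
```

On this toy picture, the same matrix supports both a model-theoretic reading (rows and columns as characteristic functions of sets) and a distributional one (rows and columns as points in a vector space), which is the sense in which set theory can be said to be translated into vector-space semantics.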

Highlights

  • From a high-level perspective, research in Natural Language Processing (NLP) can be said to be dedicated to the question ‘Can we give machines the faculty of language?’ Seen from a theoretical linguistics point of view, this question boils down to solving the problem of competence acquisition

  • We focus in this paper on the goal of finding a formal representation which would be amenable to defining various types of semantic competence, and which could be shown to be acquirable from performance data

  • According to Chomsky, the acquisition of competence from performance data implies the existence of an underlying Universal Grammar (UG), i.e. an innate system shared by all human beings, which kick-starts the process of learning one’s native language

Summary

Introduction

From a high-level perspective, research in Natural Language Processing (NLP) can be said to be dedicated to the question ‘Can we give machines the faculty of language?’ Seen from a theoretical linguistics point of view, this question boils down to solving the problem of competence acquisition. The job of linguistics, as theoreticians would have it, is to describe the formal structure of competence and to explain the cognitive processes that might lead to its acquisition from performance data. Following this ideal, we focus in this paper on the goal of finding a formal representation which would be amenable to defining various types of semantic competence (accounting for theoretical matters), and which could be shown to be acquirable from performance data (accounting for cognitive reality and, of importance to us, allowing for the computational simulation of specific aspects of linguistic cognition). The resulting meaning space has a number of properties desirable in both formal and distributional semantics, which we describe in the course of the paper: the ability to compute pluralities and to differentiate collective from distributive predicates, compositionality, amenability to probabilistic approaches, and word-meaning contextualisation.
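
Continuing the toy matrix sketch given after the abstract (again an illustrative assumption, not the aggregation function defined later in the paper), one simple way to picture pluralities and the collective/distributive contrast is to give a plurality its own vector alongside those of its members:

```python
# Hypothetical handling of pluralities; not the paper's aggregation function.
import numpy as np

predicates = ["bird", "flies", "gather"]
row = {
    "tweety":          np.array([1.0, 1.0, 0.0]),
    "opus":            np.array([1.0, 0.0, 0.0]),
    # The plurality gets its own row: 'gather' is asserted of the group only.
    "tweety_and_opus": np.array([0.0, 0.0, 1.0]),
}

def distributive(pred, members):
    """A distributive predicate holds of a plurality iff it holds of each member."""
    j = predicates.index(pred)
    return all(row[m][j] == 1.0 for m in members)

def collective(pred, plurality):
    """A collective predicate is read off the plurality's own vector."""
    return row[plurality][predicates.index(pred)] == 1.0

print(distributive("bird", ["tweety", "opus"]))   # True: each is a bird
print(distributive("flies", ["tweety", "opus"]))  # False: opus does not fly
print(collective("gather", "tweety_and_opus"))    # True: asserted of the group only
```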

Competence and Performance
Competence and Performance in Syntax
Competence and Performance in Semantics
How to Position this Paper
Preliminaries
Distributional Semantics
Grammar and Logic
A Distributional Account of Semantic Competence
Formalisation of the Super‐Competent Speaker
Computing Languages
The Ideal Entity Matrix
Aggregation Function
Similarity Function
Relation to Performance
Composition
Probabilistic Interpretation and Possible Worlds
Formalisation of Lexical Relations
Implementation
Conclusion