We develop a vector space semantics for verb phrase ellipsis with anaphora using type-driven compositional distributional semantics based on the Lambek calculus with limited contraction (LCC) of Jäger (Anaphora and type logical grammar, Springer, Berlin, 2006). Distributional semantics has a lot to say about the statistical, collocation-based meanings of content words, but provides little guidance on how to treat function words. Formal semantics, on the other hand, has powerful mechanisms for dealing with relative pronouns, coordinators, and the like. Type-driven compositional distributional semantics brings these two models together. We review previous compositional distributional models of relative pronouns and coordination, as well as a restricted account of ellipsis, in the DisCoCat framework of Coecke et al. (Mathematical foundations for a compositional distributional model of meaning, 2010. arXiv:1003.4394, Ann Pure Appl Log 164(11):1079–1100, 2013). We show that DisCoCat cannot deal with general forms of ellipsis, which rely on copying of information, and develop a novel way of connecting typelogical grammar to distributional semantics by assigning vector-interpretable lambda terms to derivations of LCC in the style of Muskens and Sadrzadeh (in: Amblard, de Groote, Pogodalla, Retoré (eds) Logical aspects of computational linguistics, Springer, Berlin, 2016). The result is an account of (verb phrase) ellipsis in which word meanings can be copied: the meaning of a sentence is now a program with non-linear access to individual word embeddings. We present the theoretical setting, work out examples, and demonstrate our results with a state-of-the-art distributional model on an extended verb disambiguation dataset.