Abstract

For many years, vector space models have been used in information retrieval and computational linguistics to represent terms, queries, and documents, using vector addition as a simple operator to model semantic composition. Though this approach has been surprisingly successful, it fails to capture many aspects of meaning, including word order, typed relationships, and nested structures.

In recent years, this has changed dramatically. Several researchers have had considerable success at representing other semantic operations in vector models, including negation, typed relationships, distributed inference, adjective-noun modification, and nested composition. This success is partly due to the ready availability of established algebraic methods including orthogonal projection, tensor algebra and matrix multiplication, circular convolution, and permutation.

When applied to vectors with complex or binary numbers as coordinates, these operations, their implementations, and experimental results sometimes differ markedly from those obtained with real numbers as coordinates. This brings our attention to a surprising gap in information retrieval and indeed machine learning: in these rapidly developing empirical fields, we tend to tacitly assume that real numbers are the canonical ground field. This is in marked contrast to physics, where complex numbers are ubiquitous, and logic, where binary numbers are the established starting point.

In this talk, we will review some of the algebraic operators used today for modelling composition of meaning with vectors, and compare their implementations and behaviours when using different number fields for the vector coordinates. The main goal is to encourage theoretical and practical researchers in information retrieval to experiment much more with complex and binary vectors as well as real vectors, in the hope that such investigations may prove as fruitful for information retrieval as they have been for physics and logic.
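As a rough illustration of how one composition operator can look quite different depending on the ground field, the sketch below pairs each field with a binding operation commonly used for it in the literature: circular convolution for real vectors, elementwise multiplication of unit-modulus coordinates for complex vectors, and elementwise XOR for binary vectors. It also includes orthogonal projection as one possible model of negation over real vectors. The abstract does not say which pairings the talk covers, so this choice is an assumption, and all function names are hypothetical. The convolution is written as a direct O(n^2) sum for readability rather than with an FFT.

```python
import cmath
import random


def bind_real(a, b):
    """Circular convolution of two equal-length real vectors (direct O(n^2) sum)."""
    n = len(a)
    return [sum(a[j] * b[(i - j) % n] for j in range(n)) for i in range(n)]


def bind_complex(a, b):
    """Elementwise product; for unit-modulus coordinates the phase angles add."""
    return [x * y for x, y in zip(a, b)]


def unbind_complex(c, a):
    """Undo bind_complex by multiplying with the conjugate of a (assumes |a_i| = 1)."""
    return [x * y.conjugate() for x, y in zip(c, a)]


def bind_binary(a, b):
    """Elementwise XOR of bit vectors; XOR is its own inverse."""
    return [x ^ y for x, y in zip(a, b)]


def project_away(a, b):
    """Orthogonal projection: remove from a its component along b (real vectors)."""
    scale = sum(x * y for x, y in zip(a, b)) / sum(y * y for y in b)
    return [x - scale * y for x, y in zip(a, b)]


# Small demonstration with randomly generated vectors (illustrative data only).
rng = random.Random(0)
n = 8
real_a = [rng.gauss(0, 1) for _ in range(n)]
real_b = [rng.gauss(0, 1) for _ in range(n)]
phase_a = [cmath.exp(1j * rng.uniform(-cmath.pi, cmath.pi)) for _ in range(n)]
phase_b = [cmath.exp(1j * rng.uniform(-cmath.pi, cmath.pi)) for _ in range(n)]
bits_a = [rng.randint(0, 1) for _ in range(n)]
bits_b = [rng.randint(0, 1) for _ in range(n)]

bound_real = bind_real(real_a, real_b)
recovered_phase = unbind_complex(bind_complex(phase_a, phase_b), phase_a)
recovered_bits = bind_binary(bind_binary(bits_a, bits_b), bits_b)
```

In this sketch the binary and complex bindings can be inverted exactly (up to floating-point error in the complex case), whereas recovering an argument from a real circular convolution generally needs an approximate inverse. That is one concrete way in which implementations and behaviour can diverge across fields.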
