Distributional Vector Space Research Articles

We explore the potential of a popular distributional semantics vector space model, word2vec, for capturing meaningful relationships in ecological (complex polyphonic) music. More precisely, the skip-gram version of word2vec is used to model slices of music from a large corpus spanning eight musical genres. In this newly learned vector space, a metric based on cosine distance is able to distinguish between functional chord relationships, as well as harmonic associations in the music. Evidence, based on cosine distance between chord-pair vectors, suggests that an implicit circle of fifths exists in the vector space. In addition, a comparison between pieces in different keys reveals that key relationships are represented in word2vec space. These results suggest that the newly learned embedded vector representation does in fact capture tonal and harmonic characteristics of music, without receiving explicit information about the musical content of the constituent slices. In order to investigate whether proximity in the discovered space of embeddings is indicative of 'semantically related' slices, we explore a music generation task, by automatically replacing existing slices from a given piece of music with new slices. We propose an algorithm to find substitute slices based on spatial proximity and the pitch class distribution inferred in the chosen subspace. The results indicate that the size of the subspace used has a significant effect on whether slices belonging to the same key are selected. In sum, the proposed word2vec model is able to learn music-vector embeddings that capture meaningful tonal and harmonic relationships in music, thereby providing a useful tool for exploring musical properties and comparisons across pieces, as a potential input representation for deep learning models, and as a music generation device.
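
The proximity measure described in this abstract can be illustrated with a minimal sketch. It assumes slice embeddings are already available as NumPy vectors; the labels and 3-dimensional values in `toy_embeddings` are invented for illustration and are not taken from the paper, and the helper names `cosine_distance` and `nearest_slices` are hypothetical. The sketch covers only the spatial-proximity part of slice substitution, not the pitch class distribution step.

```python
import numpy as np


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return 1.0 - float(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))


def nearest_slices(query, embeddings, k=2):
    """Return the k labels closest to `query` by cosine distance.

    The query label itself is excluded from the candidates.
    """
    q = embeddings[query]
    ranked = sorted(
        (label for label in embeddings if label != query),
        key=lambda label: cosine_distance(q, embeddings[label]),
    )
    return ranked[:k]


# Hypothetical embeddings, made up for this example only.
toy_embeddings = {
    "C_major": np.array([1.0, 0.1, 0.0]),
    "G_major": np.array([0.9, 0.3, 0.0]),
    "F_major": np.array([0.8, 0.0, 0.3]),
    "F#_major": np.array([-0.7, 0.2, 0.6]),
}

if __name__ == "__main__":
    print(nearest_slices("C_major", toy_embeddings, k=2))
```

With real embeddings, the same ranking could serve as the candidate list from which a substitute slice is drawn; how large that candidate neighbourhood should be is the subspace-size question the abstract raises.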

The Distributional Compositional Categorical (DisCoCat) model is a mathematical framework that provides compositional semantics for meanings of natural language sentences. It consists of a computational procedure for constructing meanings of sentences, given their grammatical structure in terms of compositional type-logic, and given the empirically derived meanings of their words. For the particular case that the meaning of words is modelled within a distributional vector space model, its experimental predictions, derived from real large scale data, have outperformed other empirically validated methods that could build vectors for a full sentence. This success can be attributed to a conceptually motivated mathematical underpinning, something which the other methods lack, by integrating qualitative compositional type-logic and quantitative modelling of meaning within a category-theoretic mathematical framework. The type-logic used in the DisCoCat model is Lambek's pregroup grammar. Pregroup types form a posetal compact closed category, which can be passed, in a functorial manner, on to the compact closed structure of vector spaces, linear maps and tensor product. The diagrammatic versions of the equational reasoning in compact closed categories can be interpreted as the flow of word meanings within sentences. Pregroups simplify Lambek's previous type-logic, the Lambek calculus. The latter and its extensions have been extensively used to formalise and reason about various linguistic phenomena. Hence, the apparent reliance of the DisCoCat on pregroups has been seen as a shortcoming. This paper addresses this concern, by pointing out that one may as well realise a functorial passage from the original type-logic of Lambek, a monoidal bi-closed category, to vector spaces, or to any other model of meaning organised within a monoidal bi-closed category. The corresponding string diagram calculus, due to Baez and Stay, now depicts the flow of word meanings, and also reflects the structure of the parse trees of the Lambek calculus.
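
As a rough illustration of the pregroup-based composition this abstract describes, the sketch below treats a transitive verb as a tensor in N ⊗ S ⊗ N and obtains a sentence vector by contracting it with subject and object vectors in N. The dimensions, the tensor entries, and the function name `transitive_sentence` are assumptions made for this example; it shows the standard tensor-contraction reading of a subject-verb-object sentence, not the Lambek calculus construction proposed in the paper.

```python
import numpy as np


def transitive_sentence(subject, verb, obj):
    """Contract a verb tensor of shape (N, S, N) with subject and object vectors.

    The noun indices i and j are summed out, leaving a vector in the
    sentence space S.
    """
    return np.einsum("i,isj,j->s", subject, verb, obj)


# Toy noun space (dimension 2) and sentence space (dimension 2).
# All values are hypothetical and chosen only to make the contraction visible.
noun_a = np.array([1.0, 0.0])
noun_b = np.array([0.0, 1.0])

verb = np.zeros((2, 2, 2))
verb[0, 0, 1] = 1.0  # "a verb b" maps to the first sentence basis vector
verb[1, 1, 0] = 1.0  # "b verb a" maps to the second sentence basis vector

if __name__ == "__main__":
    print(transitive_sentence(noun_a, verb, noun_b))
    print(transitive_sentence(noun_b, verb, noun_a))
```

Because the verb tensor is not symmetric in its two noun indices, swapping subject and object yields a different sentence vector, which is the sense in which grammatical structure guides the flow of word meanings.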

Articles published on Distributional Vector Space

Evaluation of taxonomic and neural embedding methods for calculating semantic similarity

Exploring semantic differences between the Indonesian prefixes PE- and PEN- using a vector space model

Learning Lexical Subspaces in a Distributional Vector Space

Distinguishing between paradigmatic semantic relations across word classes: human ratings and distributional similarity

A General Framework for Implicit and Explicit Debiasing of Distributional Word Vector Spaces

From context to concept: exploring semantic relationships in music with word2vec

Using a distributional semantic vector space with a knowledge base for reasoning in uncertain conditions

From Logical to Distributional Models

Geographic variation of quite + ADJ in twenty national varieties of English: A pilot study

Syntactic N-gram Collection from a Large-Scale Corpus of Internet Finnish

Lambek vs. Lambek: Functorial vector space semantics and string diagrams for Lambek calculus

A Distributional Structured Semantic Space for Querying RDF Graph Data
