Abstract

The construction of high-quality word embeddings is essential in natural language processing. In existing approaches trained on a large text corpus, the word embeddings capture only sequential patterns in the context; thus, accurate learning of the syntactic and semantic relationships between words is limited. Several methods have been proposed for constructing word embeddings using syntactic information. However, these methods are not trained on the semantic relationships between words within sentences or on external knowledge. In this paper, we present a method for improving word embeddings using symbolic graphs that encode external knowledge and the syntactic and semantic-role relationships between words in sentences. The proposed model sequentially learns two symbolic graphs with different properties through a graph convolutional network (GCN) model. A new symbolic graph representation is generated to capture sentences both grammatically and semantically; this representation combines comprehensive information from dependency parsing and semantic role labeling. Word embeddings are then constructed through the GCN model. In the second step, the same GCN model is initialized with the word representations created in the first step and is trained on the relationships between words drawn from ConceptNet. The proposed word embeddings outperform the baselines on benchmarks and extrinsic tasks.
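The abstract describes a two-stage pipeline: a GCN first learns over a sentence-level graph that merges dependency-parse and semantic-role edges, and the same GCN is then reused, initialized with those representations, over a ConceptNet relation graph. The sketch below illustrates that two-stage idea under stated assumptions; the paper does not publish code, so every name, tensor, and hyperparameter here is hypothetical, and the layer follows the standard Kipf and Welling GCN formulation rather than the authors' exact architecture.

```python
# Hedged sketch of the two-stage GCN training described in the abstract.
# All names, shapes, and adjacencies are illustrative assumptions.
import torch
import torch.nn as nn


class GCNLayer(nn.Module):
    """One graph-convolution layer: H' = ReLU(A_hat @ H @ W)."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, h: torch.Tensor, a_hat: torch.Tensor) -> torch.Tensor:
        # a_hat: normalized adjacency (with self-loops) of a symbolic graph
        return torch.relu(a_hat @ self.linear(h))


def normalize_adjacency(a: torch.Tensor) -> torch.Tensor:
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} (Kipf & Welling)."""
    a = a + torch.eye(a.size(0))
    d_inv_sqrt = a.sum(dim=1).pow(-0.5)
    return d_inv_sqrt.unsqueeze(1) * a * d_inv_sqrt.unsqueeze(0)


vocab_size, dim = 10_000, 300            # hypothetical sizes
embeddings = nn.Embedding(vocab_size, dim)
gcn = GCNLayer(dim, dim)

# Stage 1: sentence-level graph whose edges merge dependency-parse arcs
# and semantic-role arcs (placeholder adjacency and token ids).
sent_adj = normalize_adjacency(torch.zeros(5, 5))
word_ids = torch.tensor([1, 42, 7, 99, 3])
h = gcn(embeddings(word_ids), sent_adj)   # syntax/semantics-aware states

# Stage 2: the same GCN, now starting from the stage-1 representations,
# is applied to a ConceptNet-derived graph of word-to-word relations.
concept_adj = normalize_adjacency(torch.zeros(5, 5))
h2 = gcn(h, concept_adj)
```

In practice each stage would be trained with its own objective before the embeddings are extracted; the sketch only shows how one GCN can be applied sequentially to two graphs with different edge semantics.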
