Abstract

Lexical Entailment Recognition (LER) aims to identify the is-a relation between words. This problem has recently received attention from researchers in natural language processing because of its application to varied downstream tasks. However, almost all prior studies have focused only on datasets of single words; how to handle compound words effectively thus remains a challenge. In this study, we propose a novel method called LERC (Lexical Entailment Recognition Combination) to solve this problem by combining embedding representations with subword semantic features. To this end, we first train a word embedding model specialized for LER tasks. Second, subword semantic information of each word pair is exploited to compute an additional feature vector, which is combined with the embedding vectors for supervised classification. We consider three LER tasks: Lexical Entailment Detection, Lexical Entailment Directionality, and Lexical Entailment Determination. Experimental results on several benchmark datasets in English and Vietnamese demonstrate that the subword semantic feature is useful for these tasks. Moreover, LERC outperforms several recently published methods.
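To make the combination step concrete, the following is a minimal illustrative sketch, not the authors' implementation: it pairs toy random embeddings with a hypothetical subword feature (character n-gram overlap, standing in for the paper's subword semantic information) and feeds the concatenated vector to a supervised classifier. All names, the embedding dimension, and the training pairs are assumptions for illustration only.

```python
# Illustrative sketch (NOT the LERC code): combine word-pair embedding
# vectors with a simple subword-overlap feature, then classify.
import numpy as np
from sklearn.linear_model import LogisticRegression

def char_ngrams(word, n=3):
    """Character n-grams with boundary markers, a stand-in for subword units."""
    padded = f"<{word}>"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}

def subword_feature(w1, w2):
    """Jaccard overlap of character n-grams (hypothetical subword feature)."""
    a, b = char_ngrams(w1), char_ngrams(w2)
    return len(a & b) / len(a | b) if a | b else 0.0

rng = np.random.default_rng(0)
vocab = ["animal", "dog", "sheepdog", "fruit", "apple", "crab apple"]
emb = {w: rng.normal(size=8) for w in vocab}  # toy 8-dim embeddings

def pair_vector(w1, w2):
    # Concatenate both embedding vectors with the scalar subword feature.
    return np.concatenate([emb[w1], emb[w2], [subword_feature(w1, w2)]])

# Toy training pairs: label 1 if w1 entails w2 (w1 is-a w2), else 0.
pairs = [("dog", "animal", 1), ("sheepdog", "dog", 1), ("apple", "fruit", 1),
         ("animal", "dog", 0), ("fruit", "apple", 0), ("dog", "fruit", 0)]
X = np.array([pair_vector(a, b) for a, b, _ in pairs])
y = np.array([lbl for _, _, lbl in pairs])

clf = LogisticRegression(max_iter=1000).fit(X, y)
print(X.shape)  # (6, 17): two 8-dim embeddings + one subword feature
```

In this toy setup, compound pairs such as ("sheepdog", "dog") get a high subword-overlap score, which is the kind of signal that embeddings alone can miss for compound words.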
