Abstract

Word representations obtained from text using the distributional hypothesis have proved useful for various natural language processing tasks. To prepare vector representations from text, some researchers use a predictive model (Word2vec) or a dense count-based model (GloVe), whereas others explore the network structure obtained from text, namely the distributional thesaurus network, where the neighborhood of a word is the set of words having adequate context-feature overlap. Inspired by the successful application of network embedding techniques (DeepWalk, LINE, node2vec, etc.) in various tasks, we apply network embedding to turn a distributional thesaurus network into dense word vectors and investigate the usefulness of distributional thesaurus embedding in improving the overall word vector representation. This is the first work to show that combining the proposed word representation, obtained by distributional thesaurus embedding, with state-of-the-art word representations improves performance by a significant margin on several NLP tasks. These include intrinsic tasks such as word similarity and relatedness, subspace alignment, synonym detection, and analogy detection; extrinsic tasks such as noun compound interpretation and sentence pair similarity; and subconscious intrinsic evaluation using neural activation patterns in the brain.
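To make the pipeline concrete, the sketch below is a minimal DeepWalk-style illustration, not the paper's exact setup: a toy distributional thesaurus network (hypothetical words and overlap scores) is embedded via truncated random walks plus skip-gram, and the resulting vectors are concatenated with stand-in pre-trained vectors playing the role of GloVe. The graph, the walk parameters, and the placeholder `glove` dictionary are all assumptions made for illustration.

import random
import numpy as np
import networkx as nx
from gensim.models import Word2Vec

# Toy DT network: nodes are words, weighted edges connect words with
# high context-feature overlap (weights are made-up overlap scores).
G = nx.Graph()
G.add_weighted_edges_from([
    ("car", "automobile", 0.9), ("car", "truck", 0.7),
    ("truck", "lorry", 0.8), ("automobile", "vehicle", 0.6),
])

def random_walk(graph, start, length=10):
    # One truncated random walk; the next node is sampled with
    # probability proportional to edge weight.
    walk = [start]
    for _ in range(length - 1):
        nbrs = list(graph.neighbors(walk[-1]))
        if not nbrs:
            break
        weights = [graph[walk[-1]][n]["weight"] for n in nbrs]
        walk.append(random.choices(nbrs, weights=weights, k=1)[0])
    return walk

# Treat the walks as a corpus and train skip-gram on it (DeepWalk).
walks = [random_walk(G, node) for node in G.nodes() for _ in range(20)]
dt_model = Word2Vec(walks, vector_size=32, window=3, min_count=1, sg=1, epochs=10)

# Combine the DT embedding with a pre-trained vector by concatenation.
# `glove` is a placeholder dict {word: np.ndarray}; real loading is omitted.
glove = {w: np.random.rand(50) for w in G.nodes()}
combined = {w: np.concatenate([dt_model.wv[w], glove[w]]) for w in G.nodes()}
print(combined["car"].shape)  # (82,) = 32 DT dims + 50 placeholder GloVe dims

Concatenation is only one plausible combination strategy; the relative dimensionality of the two spaces is an arbitrary choice here.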
