Abstract

Word embedding is the process of mapping words to real-valued vectors, so that each word in a corpus is assigned a unique vector in a shared vector space. Word embeddings have gained popularity in natural language processing because of their usefulness in real-world tasks involving syntactic and semantic text entailment. Syntactic text entailment comprises tasks such as Parts of Speech (POS) tagging, chunking and tokenization, whereas semantic text entailment covers tasks such as Named Entity Recognition (NER), Complex Word Identification (CWI), sentiment classification, community question answering, word analogies and Natural Language Inference (NLI). This study explores eight word embedding models used for the aforementioned real-world tasks and proposes a novel word embedding based on deep neural networks. Experiments were performed on two freely available datasets: the English Wikipedia dump of April 2017 and the pre-processed Wikipedia text8 corpus. The proposed word embedding is validated against a baseline of four traditional word embedding techniques evaluated on the same corpora. Results averaged over 10 epochs show that the proposed technique outperforms the other word embedding techniques.
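To illustrate the mapping of words to real-valued vectors described above, the following minimal sketch trains a standard Word2Vec baseline (one of the traditional techniques commonly used as a reference point, not the model proposed in this study) on the text8 corpus using the gensim library. The file path and hyperparameters are assumptions chosen for the example, not the settings used in the paper.

# Minimal sketch: training a Word2Vec baseline embedding on the text8 corpus.
# Assumes gensim >= 4.0 and a local copy of the text8 file; the path and
# hyperparameters below are illustrative only.
from gensim.models.word2vec import Word2Vec, Text8Corpus

sentences = Text8Corpus("text8")   # stream the pre-processed Wikipedia text8 corpus

model = Word2Vec(
    sentences,
    vector_size=100,   # dimensionality of the real-valued word vectors
    window=5,          # context window size
    min_count=5,       # ignore very rare words
    workers=4,         # parallel training threads
    epochs=10,         # mirrors the 10-epoch averaging mentioned in the abstract
)

# Each vocabulary word is now mapped to a unique vector in the shared space.
vector = model.wv["king"]                      # 100-dimensional NumPy array
print(model.wv.most_similar("king", topn=5))   # nearest neighbours, useful for word-analogy tasks

A proposed embedding can then be compared against such baselines by evaluating both on the same corpus and downstream tasks (e.g. analogy or similarity benchmarks), which is the kind of comparison the study reports.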
