Abstract

Word embedding plays a crucial role in natural language processing and related domains. The wide variety of language modelling and feature learning techniques often leaves practitioners unsure which method to choose. This work was motivated by the need for a comparative analysis of these methods, which are then used to flag hate speech on social media. Progress in word embedding techniques has led to remarkable results across various natural language applications. One capability that has evolved over time with these models is distinguishing the different contexts of polysemous words. This paper presents a systematic review of word embedding methodologies. Each model is analysed in detail using several experimental metrics, covering aspects such as the handling of multi-sense words and rarely occurring words, and the findings are consolidated into a coherent analysis report. The models under analysis are Word2Vec (Skip-Gram, CBOW), GloVe, Fast-Text and ELMo. These models are then applied to a real-life task, hate speech detection on Twitter data, and their individual capacities and accuracies are compared. We show how ELMo uses different embeddings for a polysemous word depending on its context, and how this lets ELMo detect hate speech better, since separating such speech from normal text requires a better understanding of the context of words.

Keywords: Word2Vec, Skip-gram, CBOW, GloVe, Fast-text, ELMo, Word embedding, Hate-speech detection
