As technology evolves, Hindi web content is growing considerably and gaining popularity, since a larger audience feels more connected and heard when using its native language. Online movie reviews (MRs) in Hindi have grown explosively of late, and analyzing them manually is impractical. Automatic organization and classification of Hindi reviews is therefore a natural research problem, as it can help viewers decide whether a movie is worth watching. This work develops a deep learning-based system for bipolar sentiment classification of MRs in Hindi, a resource-deficient language. To this end, a primary Hindi MR corpus is built and manually annotated with binary polarity labels (positive or negative). The corpus is preprocessed, and random word embeddings (WEs) are used for feature extraction. This paper proposes an ensemble, CNN_BiGRU, which integrates a 1D CNN with a BiGRU for bipolar classification of Hindi MRs. To demonstrate the ensemble's efficacy, other widely used deep learning models (DLMs), namely a Convolutional Neural Network (CNN) and the Recurrent Neural Network (RNN) based Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM) models, are also applied and compared with the proposed ensemble on average classification accuracy. Empirical results show that the proposed ensemble achieves an average accuracy of 89.366%. The results indicate that the proposed CNN_BiGRU compares favorably with the state-of-the-art DLMs applied, and hence offers an effective solution for sentence-level bipolar classification of MRs in a resource-deficient scenario.
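The pipeline described above (random word embeddings, a 1D CNN, a BiGRU, and a binary output) can be illustrated with a minimal, untrained forward pass in plain Python. This is only a sketch under stated assumptions, not the authors' implementation: the abstract does not give the layer order, so the convolution is assumed to feed the BiGRU, and every name and hyperparameter here (`CnnBiGru`, `embed_dim`, `n_filters`, `width`, `hidden`) is hypothetical. Biases and training are omitted for brevity.

```python
import math
import random


def rand_matrix(rows, cols, rng, scale=0.1):
    """Matrix of small uniform random weights (rows x cols)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    """Matrix-vector product on nested lists."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def conv1d_relu(seq, filters, width):
    """Valid 1D convolution over time followed by ReLU.

    seq: list of T embedding vectors; filters: n_filters x (width * embed_dim).
    Returns T - width + 1 feature vectors.
    """
    out = []
    for t in range(len(seq) - width + 1):
        window = [x for vec in seq[t:t + width] for x in vec]
        out.append([max(0.0, a) for a in matvec(filters, window)])
    return out


def gru_last_state(seq, p, hidden):
    """Run a bias-free GRU over seq and return the final hidden state."""
    h = [0.0] * hidden
    for x in seq:
        z = [sigmoid(a) for a in matvec(p["Wz"], x + h)]  # update gate
        r = [sigmoid(a) for a in matvec(p["Wr"], x + h)]  # reset gate
        gated = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a) for a in matvec(p["Wh"], x + gated)]
        h = [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]
    return h


class CnnBiGru:
    """Hypothetical CNN -> BiGRU -> sigmoid classifier (assumed layer order)."""

    def __init__(self, vocab_size, embed_dim=8, n_filters=6, width=3, hidden=4, seed=0):
        rng = random.Random(seed)
        self.width, self.hidden = width, hidden
        # Random word embeddings: one randomly initialized vector per token id.
        self.embed = rand_matrix(vocab_size, embed_dim, rng)
        self.filters = rand_matrix(n_filters, width * embed_dim, rng)

        def gru_params():
            return {k: rand_matrix(hidden, n_filters + hidden, rng)
                    for k in ("Wz", "Wr", "Wh")}

        self.fwd, self.bwd = gru_params(), gru_params()
        self.out = rand_matrix(1, 2 * hidden, rng)[0]

    def predict_proba(self, token_ids):
        """Probability of the positive class for one tokenized review."""
        seq = [self.embed[i] for i in token_ids]
        feats = conv1d_relu(seq, self.filters, self.width)
        h_f = gru_last_state(feats, self.fwd, self.hidden)        # left to right
        h_b = gru_last_state(feats[::-1], self.bwd, self.hidden)  # right to left
        logit = sum(w * x for w, x in zip(self.out, h_f + h_b))
        return sigmoid(logit)


model = CnnBiGru(vocab_size=20)
print(model.predict_proba([1, 5, 7, 2, 9]))
```

Because the weights are random and untrained, the printed probability carries no meaning; the sketch only shows how a token sequence flows through the embedding, convolution, and bidirectional recurrent stages to a single polarity score. In practice a deep learning framework would supply these layers along with training.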