Aspect Based Sentiment Analysis Marketplace Product Reviews Using BERT, LSTM, and CNN

Syaiful Imron, Mauridhi Hery Purnomo, Joan Santoso, Esther Irawati Setiawan

doi:10.29207/resti.v7i3.4751

Abstract

Bukalapak is one of the largest marketplaces in Indonesia. Reviews on Bukalapak consist only of text, images, videos, and star ratings, with no special filters. Reading and analyzing these reviews manually is difficult for potential buyers. To help with this, the reviews can be analyzed with aspect-based sentiment analysis, because an entity cannot be represented by a single sentiment. Several previous studies reported that LSTM-CNN achieves better results than LSTM or CNN alone. In addition, using BERT as the word embedding gives better results than word2vec or GloVe. For this reason, this study performs aspect-based sentiment classification on reviews from the Bukalapak marketplace, using BERT as the word embedding and the LSTM-CNN method, where the LSTM handles aspect extraction and the CNN handles sentiment extraction. In testing, the LSTM-CNN method achieved better results than LSTM or CNN alone, with an accuracy of 93.91%. An unbalanced dataset distribution can affect model performance. As the amount of data used increases, model accuracy increases. Classification without stemming the dataset can increase accuracy by 2.04%.
