One of the most popular online transportation providers in Indonesia is GO-JEK. When it was first established, GO-JEK offered only an online motorcycle taxi service (in Bahasa Indonesia: ojek), an evolution of the conventional motorcycle taxi. After several years, GO-JEK began to develop more services, such as GO-FOOD, GO-SEND, GO-CAR, GO-MART, GO-RIDE, GO-PAY, GO-TIX, GO-BOX, and GO-MED. As the GO-JEK services expand into more categories, it becomes more challenging to automatically analyze the sentiment polarity for each service category. An ordinary, single-label classification algorithm learns from a set of examples that are each associated with a single label. In this case, however, we want to classify tweets about GO-JEK services against two targets: the GO-JEK service category and the sentiment polarity. The research methodology consists of dataset preparation, feature selection, basic text mining, splitting the dataset into training and test sets, and classification. We implemented two classification methods: multi-label classification and, as a comparison, simultaneous classification using the Random Forest algorithm. In this dataset, the most mentioned GO-JEK service is GO-FOOD, followed by GO-SEND, GO-RIDE, GO-CAR, and GO-MART. By service category, GO-FOOD gets the most positive reviews, followed by GO-RIDE and GOJEK 90. Some service categories, such as GO-SEND and GO-MART, get more negative reviews than positive reviews. The multi-label classification method reached an accuracy of 76%. Simultaneous classification using the Random Forest algorithm yields 97% accuracy for service category classification but only 78% for sentiment polarity classification. Both approaches, multi-label classification and the Random Forest algorithm, therefore yield almost the same accuracy for sentiment polarity classification.
We conclude that the imperfect accuracy in sentiment polarity classification is related to the difficulty of identifying the most suitable polarity for each tweet: many tweets contain sarcasm and slang words that cannot be identified.
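To make the two-target setting concrete, the following minimal sketch scores predictions that carry both a service category and a sentiment polarity per tweet. It is only an illustration: the abstract does not say how the reported accuracies were computed, the labels below are invented for the example, and the helper names `per_target_accuracy` and `exact_match_accuracy` are hypothetical.

```python
from typing import List, Tuple

# Each tweet carries two labels: (service category, sentiment polarity).
Label = Tuple[str, str]


def per_target_accuracy(y_true: List[Label], y_pred: List[Label], index: int) -> float:
    """Fraction of tweets whose label at position `index` is predicted correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t[index] == p[index])
    return correct / len(y_true)


def exact_match_accuracy(y_true: List[Label], y_pred: List[Label]) -> float:
    """Fraction of tweets for which BOTH labels are predicted correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Invented labels for illustration only; these are not data from the study.
y_true = [
    ("GO-FOOD", "positive"),
    ("GO-SEND", "negative"),
    ("GO-RIDE", "positive"),
    ("GO-MART", "negative"),
]
y_pred = [
    ("GO-FOOD", "positive"),
    ("GO-SEND", "positive"),  # service correct, polarity wrong
    ("GO-RIDE", "positive"),
    ("GO-MART", "negative"),
]

service_acc = per_target_accuracy(y_true, y_pred, index=0)
polarity_acc = per_target_accuracy(y_true, y_pred, index=1)
joint_acc = exact_match_accuracy(y_true, y_pred)

print(f"service category accuracy:  {service_acc:.2f}")
print(f"sentiment polarity accuracy: {polarity_acc:.2f}")
print(f"exact-match accuracy:        {joint_acc:.2f}")
```

In this toy example the service category is always predicted correctly, so the joint score is limited entirely by the polarity errors. That mirrors the pattern in the reported figures, though whether the study's multi-label accuracy was computed as an exact match is an assumption.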