Abstract

Recently, many social media users expressed their conditions, ideas, emotions using local languages ​​on social media, for example via tweets or status. Due to the large number of texts, sentiment analysis is used to identify opinions, ideas, or thoughts from social media. Sentiment analysis research has also been widely applied to local languages. Karonese is one of the largest local languages ​​in North Sumatera, Indonesia. Karo society actively use the language in expression on twitter. This study proposes two things: Karonese tweet dataset for classification and analysis of sentiment on Karonese. Several machine learning algorithms are implemented in this research, that is Logistic regression, Naive bayes, K-nearest neighbor, and Support Vector Machine (SVM). Karonese tweets is obtained from timeline twitter based on several keywords and hashtags. Transcribers from ethnic figures helped annotating the Karo tweets into three classes: positive, negative, and neutral. To get the best model, several scenarios were run based on various compositions of training data and test data. The SVM algorithm has highest accuracy, precision, recall, and F-1 scores than others. As the research is a preliminary research of sentiment analysis on Karonese language, there are many feature works to improvement.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.