Abstract

Coronavirus Disease 2019 (Covid-19) is a viral disease that causes respiratory infections in humans. Indonesia is among the affected countries, and its government implemented restrictions on community activities (PPKM) as a policy to reduce the spread of Covid-19. The policy's impact has drawn both support and opposition, so it is important to assess public opinion, or sentiment, towards it. This study aims to implement the XGBoost algorithm for sentiment classification. The analysis targets public opinion on Twitter, using a dataset of 1958 positive tweets and 3980 negative tweets. The preprocessing stage applies case folding, stopword removal, tokenizing, and stemming. Terms are then weighted with the Term Frequency-Relevance Frequency method, which turns each term into a numeric value. In the final stage, classification is carried out with XGBoost using tuned hyperparameter values, and K-fold cross validation is used to evaluate model performance. Based on the evaluation results, the best performance was obtained by a model with an n_estimator of 1000, a learning_rate of 0.1, a max_depth of 6, a subsample of 1, and a gamma of 0, with stemming applied during preprocessing. This model achieved an accuracy of 85.27%, a precision of 86.07%, and a recall of 85.23%.
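
The term-weighting step can be illustrated with a minimal sketch. It assumes the commonly cited definition of relevance frequency, rf = log2(2 + a / max(1, c)), where a and c are the numbers of positive and negative documents containing the term. The abstract does not state which variant the study used, nor which class was treated as the reference category, so both choices are assumptions here. The function names and the toy English tokens are hypothetical and are not taken from the study's dataset.

```python
import math
from collections import Counter


def rf_weights(docs, labels, positive_label="positive"):
    """Relevance frequency per term: log2(2 + a / max(1, c)).

    a = number of positive documents containing the term,
    c = number of negative documents containing the term.
    Treating the positive class as the reference category is an assumption.
    """
    pos_df, neg_df = Counter(), Counter()
    for tokens, label in zip(docs, labels):
        target = pos_df if label == positive_label else neg_df
        target.update(set(tokens))  # count each term once per document
    vocab = set(pos_df) | set(neg_df)
    return {t: math.log2(2 + pos_df[t] / max(1, neg_df[t])) for t in vocab}


def tf_rf_vector(tokens, rf, vocab):
    """Weight raw term counts by rf; terms outside the vocabulary are ignored."""
    tf = Counter(tokens)
    return [tf[t] * rf[t] for t in vocab]


# Hypothetical, already preprocessed tweets (case-folded, stopwords removed, stemmed).
docs = [
    ["ppkm", "help", "reduce", "case"],
    ["ppkm", "hurt", "trader"],
    ["ppkm", "extend", "again", "hurt"],
]
labels = ["positive", "negative", "negative"]

rf = rf_weights(docs, labels)
vocab = sorted(rf)  # fixed column order for the feature matrix
features = [tf_rf_vector(tokens, rf, vocab) for tokens in docs]
```

Each row of `features` is a numeric vector that could be passed to a classifier such as XGBoost. Terms that appear mostly in positive documents receive a larger weight than terms that appear only in negative ones, whose weight stays at 1.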
