Abstract

Sentiment analysis is a method for analyzing opinions and feelings; its goal is to determine whether a document expresses a positive or negative emotion. As Covid-19 cases spread, news related to Covid-19 often became a trending topic in the mass media. Conducting sentiment analysis on all of this news is challenging because of the time and cost involved. Therefore, a sampling method is needed to obtain representative news for the analysis. Web scraping was employed to obtain news articles about Covid-19 in Indonesia. To select representative news, two-step sampling was employed, using stratified and systematic random sampling. According to the topic modelling results using lambda 0.6, news articles are grouped into three topics: Covid-19 case updates, vaccination, and government policy. In addition, based on the number of positive and negative words, news articles are grouped into news dominated by positive words, news dominated by negative words, and news with equal numbers of positive and negative words. Several methods for representing text in numerical form have been developed, including tf-idf weighting and word embedding. Tf-idf weighting does not take word order or meaning into account; it relies only on the frequency of words both locally (within a document) and globally (across the corpus). Furthermore, this method produces a vector as large as the number of unique words in the corpus, so it is less effective when many documents are used. Meanwhile, the vector generated by the word2vec method is smaller than the number of unique words in the corpus. In addition, word2vec considers the context of the words in the corpus.
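The two properties of tf-idf noted above (vector length tied to vocabulary size, and insensitivity to word order) can be illustrated with a minimal sketch. The function name `tfidf_vectors`, the toy documents, and the smoothed idf formula are assumptions made for illustration only; the abstract does not state which tf-idf variant or tokenizer was used.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) for a list of whitespace-tokenized documents.

    Hypothetical helper: raw term frequency times a smoothed idf,
    idf(t) = ln((1 + N) / (1 + df(t))) + 1. This is one common variant,
    not necessarily the one used in the study.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)

    # Global statistic: in how many documents does each term occur?
    df = Counter(tok for toks in tokenized for tok in set(toks))
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocabulary}

    vectors = []
    for toks in tokenized:
        tf = Counter(toks)  # local statistic: term counts within this document
        vectors.append([tf[t] * idf[t] for t in vocabulary])
    return vocabulary, vectors


# Toy corpus (invented for this sketch, not taken from the study's data).
docs = [
    "vaccination reduces covid cases",
    "cases reduces covid vaccination",  # same words, different order
    "government policy on covid",
]
vocabulary, vectors = tfidf_vectors(docs)

print(len(vocabulary), len(vectors[0]))  # vector length equals vocabulary size
print(vectors[0] == vectors[1])          # word order does not affect the vector
```

In this sketch every document vector has one entry per unique word in the corpus, so the dimensionality grows as more documents introduce new words. A word2vec model, by contrast, uses a fixed embedding size chosen in advance.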
