Abstract

The volume of textual information on the World Wide Web (WWW) has grown exponentially over time, creating research opportunities in machine learning (ML) and natural language processing (NLP). Sentiment analysis of scientific articles is an active research topic. The main purpose of this research is to help researchers identify high-quality research papers through sentiment analysis. In this research, sentiment analysis of scientific articles is carried out on citation sentences, using an existing annotated corpus of 8736 citation sentences. Noise was removed from the corpus by applying a set of data normalization rules. To classify this data set, we developed a system that implements six machine learning algorithms: Naive Bayes (NB), Support Vector Machine (SVM), Logistic Regression (LR), Decision Tree (DT), K-Nearest Neighbor (KNN), and Random Forest (RF). The system is then evaluated using different evaluation metrics, e.g., F-score and accuracy score. To improve the system's accuracy, additional feature selection techniques, such as lemmatization, n-grams, tokenization, and stop-word removal, were applied, and we found that our system outperformed the base system in every case. Our method improved results by up to about 9% compared to the base system.
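The preprocessing and evaluation steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the stop-word list, the normalization rules (lowercasing, dropping bracketed citation markers, stripping punctuation), the function names, and the choice of macro-averaged F-score are all assumptions made for illustration, since the abstract does not specify them. Lemmatization is omitted because it requires an external language resource.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the abstract does not say which list was used.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "is", "are",
              "this", "that", "to", "and", "for"}


def normalize(sentence):
    """Lowercase, drop bracketed citation markers such as [12], strip punctuation."""
    text = sentence.lower()
    text = re.sub(r"\[[^\]]*\]", " ", text)   # assumed rule: remove [..] markers
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # assumed rule: remove punctuation
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text, remove_stop_words=True):
    """Split on whitespace and optionally filter out stop words."""
    tokens = text.split()
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return tokens


def ngrams(tokens, n):
    """Return all contiguous n-grams as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(sentence, max_n=2):
    """Count unigram through max_n-gram features for one citation sentence."""
    tokens = tokenize(normalize(sentence))
    features = Counter()
    for n in range(1, max_n + 1):
        features.update(ngrams(tokens, n))
    return features


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

The resulting feature counts could be fed to any of the six classifiers listed above; comparing `accuracy` and `macro_f1` with and without the n-gram and stop-word steps mirrors the kind of base-system comparison the abstract describes.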
