Abstract

Growing demand for journal article search should be supported by relevant search results. One way to find relevant articles is to group them, or the information they present, into categories. A problem often encountered in categorization is imbalanced data, where some categories contain far fewer articles than others. This study addresses the imbalanced categorization of journal articles by combining over-sampling with text classification. Its purpose is to compare two approaches to classifying Primary School Teacher Education (PGSD) journal articles: Support Vector Machine (SVM) classification preceded by over-sampling with the Synthetic Minority Over-Sampling Technique (SMOTE), and SVM classification alone. The research applies text preprocessing steps such as tokenizing and case folding, which produce a set of terms. The terms are weighted using TF-IDF. Once the TF-IDF values are obtained, SMOTE is applied to overcome the data imbalance, and the over-sampled data are then classified with SVM. The performance of each approach is measured by three metrics: accuracy, recall, and precision. The test results show that SVM with SMOTE applied to the TF-IDF values classifies better than SVM alone. SVM with SMOTE achieved a precision of 97.3%, recall of 97.23%, and accuracy of 97.22%, while SVM alone achieved a precision of 72.55%, recall of 50.71%, and accuracy of 80.56%.
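
The weighting and over-sampling steps described above can be sketched as follows. This is a minimal illustration in plain Python, not the study's implementation: the function names `tfidf` and `smote`, the whitespace tokenizer, the unsmoothed `log(N / df)` weighting, and the parameters `k` and `seed` are all assumptions made for the example, and the SVM step is omitted.

```python
import math
import random
from collections import Counter


def tfidf(docs):
    """Weight documents with TF-IDF after case folding and tokenizing.

    Assumption: tokens are split on whitespace and the IDF is the plain
    log(N / df) form; the study does not state which variant it used.
    """
    tokenized = [doc.lower().split() for doc in docs]  # case folding + tokenizing
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)  # raw term counts; missing terms count as 0
        vectors.append([tf[t] * math.log(n_docs / df[t]) for t in vocab])
    return vocab, vectors


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic vectors for the minority class.

    Each synthetic vector lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = [v for j, v in enumerate(minority) if j != i]
        # Nearest neighbours by Euclidean distance in TF-IDF space.
        neighbours = sorted(others, key=lambda v: math.dist(base, v))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy documents, invented purely for illustration.
docs = ["Reading skills in primary school",
        "Primary school reading assessment",
        "Teacher training for science lessons"]
vocab, vectors = tfidf(docs)
extra = smote(vectors[:2], n_new=3)  # over-sample the first two documents
```

In a full pipeline the synthetic vectors would typically be added to the training split only, so that the classifier is still evaluated on original articles.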
