Multi-class SVM Classification Comparison for Health Service Satisfaction Survey Data in Bahasa

Gede Indrawan, Aris Gunadi, Heri Setiawan

doi:10.28991/hij-2022-03-04-05

Open Access

https://doi.org/10.28991/hij-2022-03-04-05

Abstract

This study compared Multi-class Support Vector Machine (MSVM) classification under the One-versus-One (OvO) and One-versus-Rest (OvR) approaches, using unigram and bigram features. The data came from the service satisfaction survey report on Denpasar public health centers produced by the Center for Public Health Innovation (CPHI), Medical School, Udayana University. Because Bali is known as one of the world's main tourism destinations, it is important to understand its supporting public health services, represented here by its capital city, Denpasar. The study also lays a foundation for applying available classification methods to Indonesian health service satisfaction survey data, which can support decisions to improve health services. Because Bali is an Indonesian province and all provinces follow the same national regulation, health service satisfaction survey data in the Indonesian language (Bahasa) should share the same aspects, such as category, priority, and word-related matters (including abbreviations, acronyms, and terminology), which together make the data distinctive and call for specific processing. This work is considered a contribution because, to the best of the authors' knowledge, no such study exists, and the foundation would be useful as part of the future vision for an integrated Indonesian health big data system. Because real satisfaction survey data tend to be imbalanced, this study also compared the developed models using unigram and bigram features with and without feature selection (FS). Those features were then processed using the OvO MSVM and OvR MSVM models. k-fold cross-validation was used both to split the data into training and testing sets and to validate the models. In experiments both with and without FS, the OvO MSVM and OvR MSVM models generally performed better with unigram features than with bigram features.
Without FS and with unigram features, the two models were comparable: the OvO MSVM model was slightly better on accuracy and precision, while the OvR MSVM model was slightly better on recall and F1-score. Without FS and with bigram features, the models were again comparable, with the OvR MSVM model performing slightly better than the OvO MSVM model. With FS, for both unigram and bigram features, the OvR MSVM model generally performed better than the OvO MSVM model.
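
The building blocks named in the abstract (unigram and bigram features, the OvO and OvR decompositions, and k-fold splitting) can be illustrated with a minimal standard-library Python sketch. This is not the authors' code: the sample comment, the class labels `A` to `D`, and the helper names `ngrams`, `ovo_tasks`, `ovr_tasks`, and `kfold_indices` are all assumptions made for illustration, and no SVM is actually trained here.

```python
from itertools import combinations


def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ovo_tasks(classes):
    """One-versus-One: one binary problem per unordered pair of classes."""
    return list(combinations(classes, 2))


def ovr_tasks(classes):
    """One-versus-Rest: one binary problem per class against all the others."""
    return [(c, tuple(o for o in classes if o != c)) for c in classes]


def kfold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous (train, test) index folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train, test))
        start = stop
    return folds


# Hypothetical survey comment in Bahasa ("the service is fast and friendly").
tokens = "pelayanan cepat dan ramah".split()
unigrams = ngrams(tokens, 1)
bigrams = ngrams(tokens, 2)

# Hypothetical category labels; the survey's real categories are not given here.
classes = ["A", "B", "C", "D"]
ovo = ovo_tasks(classes)   # pairs of classes, one binary SVM each
ovr = ovr_tasks(classes)   # each class against the rest, one binary SVM each

# Assumed sizes for illustration: 10 samples split into 5 folds.
folds = kfold_indices(10, 5)
```

For k classes, OvO requires k(k-1)/2 binary classifiers while OvR requires k, so with the four assumed labels above the sketch yields six OvO problems and four OvR problems. Each fold from `kfold_indices` serves once as the test set while the remaining folds form the training set.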