Analyzing Customer Satisfaction using Support Vector Machine and Naive Bayes Utilizing Filipino Text

Joseph B Campit

doi:10.37394/232015.2023.19.50

Abstract

This study compared the classification performance of Support Vector Machine (SVM) and Naive Bayes (NB) machine learning models for estimating customer satisfaction from Filipino text. Specifically, it analyzed the characteristics of the customer satisfaction data and examined how different model configurations (n-gram size, stop word removal, and stemming) affect the classification performance of the two models. The research employed qualitative and quantitative methods, using text analytics and sentiment analysis to extract and analyze information from unstructured responses to a satisfaction survey on the University President’s leadership performance, conducted among PSU personnel and students. The dataset comprised 56,000 Filipino- and English-word responses, which were manually annotated and randomly split into training and testing sets. The study followed a general framework of data pre-processing, modeling, and model comparison, and used 10-fold cross-validation to validate the classifiers’ performance. The findings revealed that most personnel and students expressed positive sentiment toward the University President’s leadership performance. SVM outperformed NB across all model configurations. The SVM trigram model with both stop word removal and stemming achieved the highest classification performance, using 75% of the data for training and 25% for testing. The proposed model could be applied to estimate customer satisfaction from other unstructured customer satisfaction data in Filipino text.
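To make the model configurations named in the abstract more concrete, the following is a minimal Python sketch of how n-gram size, stop word removal, and stemming could feed a Naive Bayes sentiment classifier. It is not the study's implementation: the abstract does not name a toolkit, and the stop word list, the suffix-stripping rule, the four example responses, and all function names are hypothetical, chosen only for illustration. The sketch uses fixed-length n-grams and a multinomial Naive Bayes with add-one smoothing; the SVM counterpart and the 10-fold cross-validation are omitted for brevity.

```python
import math
from collections import Counter

# Hypothetical stop words and suffixes (illustration only, not the study's lists).
STOP_WORDS = {"ang", "ng", "sa", "at", "the", "is", "and"}
SUFFIXES = ("ing", "ed", "s")  # toy rule; real Filipino stemming must handle affixes


def stem(token):
    """Strip the first matching suffix if a reasonably long stem remains."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def features(text, n=1, remove_stop_words=True, use_stemming=True):
    """Return the n-grams of a response under one model configuration."""
    tokens = text.lower().split()
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    if use_stemming:
        tokens = [stem(t) for t in tokens]
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def train_nb(docs, labels, **config):
    """Count n-grams per class for a multinomial Naive Bayes model."""
    class_counts = Counter(labels)
    gram_counts = {label: Counter() for label in class_counts}
    for text, label in zip(docs, labels):
        gram_counts[label].update(features(text, **config))
    vocab = set().union(*gram_counts.values())
    return class_counts, gram_counts, vocab, config


def predict_nb(model, text):
    """Pick the class with the highest log-posterior (add-one smoothing)."""
    class_counts, gram_counts, vocab, config = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        denom = sum(gram_counts[label].values()) + len(vocab)
        for gram in features(text, **config):
            if gram in vocab:  # n-grams unseen in training are ignored
                score += math.log((gram_counts[label][gram] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented example responses; the survey data itself is not reproduced here.
train_docs = [
    "mahusay ang pamumuno ng presidente",
    "mabuti ang serbisyo",
    "mabagal ang serbisyo sa opisina",
    "pangit ang pamamahala",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = train_nb(train_docs, train_labels, n=1)
print(predict_nb(model, "mahusay ang pamumuno"))
print(features("Mahusay ang pamumuno ng presidente", n=2))
```

Changing `n`, `remove_stop_words`, or `use_stemming` in the call to `train_nb` yields one model per configuration, which is the kind of grid the abstract describes comparing. Scoring each configuration on a held-out split or with cross-validation would be the natural next step.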
