Empirical comparison of clustering and classification methods for detecting Internet addiction

Oksana V Klochko, Vasyl M Fedorets, Vitalii I Klochko

doi:10.55056/cte.664

Abstract

Machine learning methods for clustering and classification are widely used in various domains. However, their performance and applicability may depend on the characteristics of the data and the problem. In this paper, we present an empirical comparison of several clustering and classification methods using WEKA, a free machine learning software suite. We apply these methods to survey data collected from students of different majors, aiming to detect signs of Internet addiction (IA), a behavioural disorder caused by excessive Internet use. We use Expectation Maximization, Farthest First and K-Means for clustering, and AdaBoost, Bagging, Random Forest and Vote for classification. We evaluate the methods based on their accuracy, complexity and interpretability. We also describe the models these methods produce and discuss their implications for identifying respondents with IA symptoms and risk groups. The results show that these methods can be used effectively for clustering and classifying IA-related data. However, each method has different strengths and limitations, which should be weighed when choosing the best method for a specific task.

Full Text