Mitigating Imbalanced Data in Online Social Networks using Stratified K-Means Sampling

P Kathiravan, R Saranya, P Shanmugavadivu

doi:10.1109/icbir57571.2023.10147677

Abstract

The K-means clustering technique is widely used in many fields, such as anomaly detection, customer segmentation, cyber-physical systems, medical diagnosis, sentiment analysis, and fraud detection. We apply k-means to imbalanced datasets, preserving the structure of the minority class through stratified resampling. For this experimental study, we used a labeled benchmark dataset from Kaggle on fake news, collected from online social media. The proposed model, Stratified K-Means Sampling (SKMS), is compared empirically with the Synthetic Minority Oversampling Technique (SMOTE) using different machine learning algorithms. The Random Forest (RF) algorithm gives significant accuracy, and Support Vector Classification (SVC) produces a better F1-score than the other algorithms. SMOTE was evaluated on the same dataset with the same algorithms. While SKMS seeks to preserve the structure of the minority class, SMOTE aims to diversify the minority class by interpolating between existing samples. Depending on the dataset, one may be more suitable than the other.
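The abstract does not spell out the SKMS procedure, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that SKMS clusters the minority class with k-means and then resamples existing points with replacement, stratified by cluster, so that cluster proportions are kept. For contrast, it also shows a simplified SMOTE-style step that interpolates toward the single nearest neighbour (real SMOTE picks among several nearest neighbours). All function names and parameters here are hypothetical, and the code uses only the Python standard library to stay self-contained.

```python
import random

def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist2(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def stratified_kmeans_oversample(minority, target_size, k=2, seed=0):
    """ASSUMED reading of SKMS: duplicate existing minority points,
    drawing from each k-means cluster in proportion to its size."""
    rng = random.Random(seed)
    labels = kmeans(minority, k, seed=seed)
    clusters = [[p for p, l in zip(minority, labels) if l == c] for c in range(k)]
    sizes = [len(c) for c in clusters]
    extra = max(0, target_size - len(minority))
    # Proportional quota per cluster, rounded down.
    quotas = [extra * s // len(minority) for s in sizes]
    # Hand the rounding leftovers to the largest clusters.
    leftover = extra - sum(quotas)
    for c in sorted(range(k), key=lambda c: -sizes[c])[:leftover]:
        quotas[c] += 1
    resampled = list(minority)
    for members, quota in zip(clusters, quotas):
        if quota:
            resampled += rng.choices(members, k=quota)
    return resampled

def smote_like_oversample(minority, target_size, seed=0):
    """Simplified SMOTE-style step: new points lie on the segment
    between a random sample and its single nearest neighbour."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(max(0, target_size - len(minority))):
        p = rng.choice(minority)
        q = min((x for x in minority if x is not p), key=lambda x: dist2(p, x))
        t = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(p, q)))
    return list(minority) + synthetic

# Toy minority class with two visible groups (made-up numbers).
minority = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
            (5.0, 5.0), (5.2, 4.9), (4.8, 5.1), (5.1, 5.3)]
skms = stratified_kmeans_oversample(minority, target_size=20, k=2)
smote = smote_like_oversample(minority, target_size=20)
print(len(skms), len(set(skms)))    # size after resampling, distinct points
print(len(smote), len(set(smote)))  # size after interpolation, distinct points
```

Under this reading, the stratified variant never creates a point that was not already in the minority class, whereas the interpolation variant adds new points between existing ones. In practice one would more likely use scikit-learn's `KMeans` and imbalanced-learn's `SMOTE` than hand-rolled versions.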

Similar Papers

SMOTE-based Category Imbalance for Radar Radiation Source Sorting and Identification
Weixun Ma
06 Nov 2020

The Improvement of Stress Level Detection in Twitter: Imbalance Classification Using SMOTE
Mohd Shahrul Nizam Mohd Danuri ... Rohizah Abd Rahman
14 Nov 2022

An Oversampling Technique for Handling Imbalanced Data in Patients with Metabolic Syndrome and Periodontitis
Sema Merve Altingöz ... Elif Ünsal
Cumhuriyet Dental Journal | VOL. 26
31 Dec 2024

The Impact of Oversampling with SMOTE on the Performance of 3 Classifiers in Prediction of Type 2 Diabetes.
Azra Ramezankhani ... Omid Pournik
Medical Decision Making | VOL. 36
01 Dec 2014
