Abstract

Imbalanced data is common in real-world datasets because their classes are rarely evenly distributed. Even with methods traditionally used to achieve class balance, such as re-sampling and re-weighting, class imbalance remains a significant obstacle for current deep learning. The main objective of this study is to propose a data augmentation technique that balances the data by increasing the sample sizes of the minority classes. The study is implemented in Python using several machine learning methods. The classification models used are Logistic Regression, Naïve Bayes, Support Vector Machine, Decision Tree, Random Forest, Extra Trees Classifier, AdaBoost Classifier, and Gradient Boost Classifier. Precision, recall, and F-score were used to determine which model is the most effective. According to the study's analysis, the Naïve Bayes approach, with parameters Wn = 3, Cn = 3, and CWn = 3, yields the most accurate results, with an F1-score of 95.85%.
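
The abstract evaluates models with precision, recall, and F-score rather than accuracy alone. The following minimal Python sketch illustrates why that choice matters on imbalanced data. It is not the study's code: the function names `precision_recall_f1` and `accuracy`, the labels `"maj"` and `"min"`, and the toy predictions are all hypothetical and chosen only for illustration.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 for one class, treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical toy data: 8 majority-class samples and 2 minority-class samples.
y_true = ["maj"] * 8 + ["min"] * 2

# A degenerate model that always predicts the majority class.
always_majority = ["maj"] * 10

# A model that finds one of the two minority samples, at the cost of one false alarm.
mixed_model = ["maj"] * 7 + ["min"] + ["min", "maj"]

for name, y_pred in [("always_majority", always_majority),
                     ("mixed_model", mixed_model)]:
    p, r, f1 = precision_recall_f1(y_true, y_pred, positive="min")
    print(f"{name}: accuracy={accuracy(y_true, y_pred):.2f} "
          f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Both toy models reach the same accuracy, but only the minority-class F1 separates the model that ignores the minority class from the one that detects part of it.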
