Abstract

Imbalanced data classification is a major issue in data mining. Researchers have proposed many solutions to the imbalanced data problem, which fall broadly into two categories: data-level methods and algorithm-level methods. Data-level methods adjust the class distribution, whereas algorithm-level methods create a new algorithm or modify an existing one. The imbalanced data classification problem can be addressed by sampling, random oversampling, random undersampling, resampling, and SMOTE (Synthetic Minority Over-sampling Technique). Resampling approaches include k-means clustering, density-based clustering, neural networks, and ensembles. However, no single algorithm or method is able to remove bias in data classification; integrating kernel methods with sampling methods, sampling with boosting methods, or kernel-based methods with Support Vector Machines (SVM) therefore needs to be pursued to a much greater extent to achieve the desired accuracy and performance. The main objective of this paper is to review sampling strategies based on sampling and resampling methods and to improve the understanding of learning from class-imbalanced data. It also explains the objectives of the models used by several researchers and emphasizes their performance and outcomes.
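As an illustration of what adjusting the class distribution at the data level means, the following is a minimal sketch of random oversampling and random undersampling. It is not taken from the paper: the function name `random_resample`, its parameters, and the toy data are assumptions made for this example, and it uses only the Python standard library.

```python
import random
from collections import Counter, defaultdict

def random_resample(samples, labels, mode="over", seed=0):
    """Balance classes by random over- or undersampling (hypothetical helper)."""
    rng = random.Random(seed)

    # Group the samples by class label.
    by_class = defaultdict(list)
    for x, label in zip(samples, labels):
        by_class[label].append(x)

    # Target size: the largest class when oversampling, the smallest otherwise.
    sizes = [len(items) for items in by_class.values()]
    target = max(sizes) if mode == "over" else min(sizes)

    out_x, out_y = [], []
    for label, items in by_class.items():
        if len(items) < target:
            # Oversampling: keep all originals and add random duplicates.
            chosen = items + rng.choices(items, k=target - len(items))
        else:
            # Undersampling (or already at target): random subset, no replacement.
            chosen = rng.sample(items, target)
        out_x.extend(chosen)
        out_y.extend([label] * target)
    return out_x, out_y

# Toy data (assumed): 8 majority samples (class 0) and 2 minority samples (class 1).
samples = list(range(10))
labels = [0] * 8 + [1] * 2

x_over, y_over = random_resample(samples, labels, mode="over")
x_under, y_under = random_resample(samples, labels, mode="under")
print(Counter(y_over), Counter(y_under))
```

Random oversampling only duplicates existing minority samples, while SMOTE instead synthesizes new minority points by interpolating between neighbouring minority samples; that step is not shown in this sketch.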
