Abstract

Learning from imbalanced data is an active research area that continues to develop, motivating researchers to seek efficient and adaptive methods for real-world problems. In both machine learning and data mining, researchers are developing methods to address imbalanced datasets and the practical challenges they pose. Uneven class distribution in a dataset degrades the performance of data mining and machine learning approaches. As machine learning and data mining continue to advance and are combined with big data, deeper insight is needed into the nature of learning from imbalanced data, and new challenges are emerging from this development. Of the available approaches, the hybrid approach is more popular than either the algorithm-level or the data-level approach alone. Classifiers are found to be biased toward the majority class, which affects the decision-making task and the overall accuracy of classification. The ensemble method is an efficient technique for dealing with uneven data distributions. This paper presents an overview of class imbalance problems, solutions for handling them, and open issues and challenges in learning from imbalanced datasets. An experiment conducted on one dataset shows that an ensemble technique combined with data-level methods gives good results. This hybrid method can be applied in many real-life applications, such as software defect prediction, behavior analysis, intrusion detection, and medical diagnosis. The paper also provides research directions for learning from imbalanced datasets.
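To make the hybrid idea concrete, the following is a minimal sketch, not the paper's experimental setup, of one possible combination of a data-level method (random undersampling of the majority class) with an ensemble (majority voting over simple threshold classifiers). The toy one-feature dataset, the function names, the ensemble size, and the assumption that the minority class (label 1) has the larger feature values are all illustrative choices that do not come from the paper.

```python
import random

def undersample(X, y, rng):
    """Data-level step: keep every minority point and an equal-sized
    random draw of majority points, giving a balanced sample."""
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    chosen = minority + rng.sample(majority, len(minority))
    return [X[i] for i in chosen], [y[i] for i in chosen]

def fit_stump(X, y):
    """Hypothetical base learner: threshold halfway between the class means.
    Assumes class 1 has the larger feature values."""
    mean1 = sum(x for x, label in zip(X, y) if label == 1) / y.count(1)
    mean0 = sum(x for x, label in zip(X, y) if label == 0) / y.count(0)
    return (mean0 + mean1) / 2

def fit_ensemble(X, y, n_members=11, seed=0):
    """Ensemble step: train each member on its own balanced sample."""
    rng = random.Random(seed)
    return [fit_stump(*undersample(X, y, rng)) for _ in range(n_members)]

def predict(thresholds, x):
    """Majority vote of the threshold classifiers."""
    votes = sum(1 for t in thresholds if x > t)
    return 1 if 2 * votes > len(thresholds) else 0

def accuracy(y_true, y_pred):
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

def recall(y_true, y_pred):
    """Fraction of minority (label 1) points that are detected."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return hits / sum(y_true)

# Toy data (illustrative only): 95 majority points in [0, 1), 5 minority points near 2.
X = [i / 95 for i in range(95)] + [2.0 + 0.1 * j for j in range(5)]
y = [0] * 95 + [1] * 5

# A classifier that always predicts the majority class looks accurate
# while missing every minority example.
baseline_pred = [0] * len(y)

thresholds = fit_ensemble(X, y)
ensemble_pred = [predict(thresholds, x) for x in X]

print("baseline accuracy:", accuracy(y, baseline_pred))
print("baseline minority recall:", recall(y, baseline_pred))
print("ensemble minority recall:", recall(y, ensemble_pred))
```

The baseline shows why overall accuracy alone is misleading under class imbalance, which is the majority-class bias the abstract refers to. The toy data is cleanly separable, so the result says nothing about real-world performance; it only shows how the two steps fit together.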
