Abstract

The rapid growth of data poses challenges in storing, managing, and analyzing it. Data that is left unprocessed and unanalyzed offers only limited benefit to its owner. In many cases, the data being analyzed is imbalanced. A natural example is financial fraud detection, where non-fraudulent transactions usually far outnumber fraudulent ones. This imbalance can degrade the accuracy and performance of machine learning classification models: many such models tend to learn the more general patterns of the majority class and, as a result, may overlook patterns in the minority class. Numerous studies have addressed the problem of imbalanced data. The objective of this systematic literature review is to present the latest developments in the application cases, methods, and evaluation techniques for handling imbalanced data. The review identifies new methods and is expected to give researchers more options, so that imbalanced data can be handled properly and classification models can produce unbiased, accurate, and consistent results.
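
The effect of imbalance on evaluation can be illustrated with a small sketch. The figures below are hypothetical and are not taken from the review: 990 legitimate and 10 fraudulent transactions, scored against a trivial baseline that always predicts the majority class. The helper functions `accuracy` and `recall` are written here for illustration only.

```python
# Hypothetical toy data (an assumption for illustration, not from the review):
# 990 legitimate transactions (label 0) and 10 fraudulent ones (label 1).
labels = [0] * 990 + [1] * 10

# A trivial baseline that always predicts the majority class.
predictions = [0] * len(labels)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positive cases that were predicted positive."""
    true_pos = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == positive for t in y_true)
    return true_pos / actual_pos if actual_pos else 0.0


acc = accuracy(labels, predictions)
rec = recall(labels, predictions)
print(f"accuracy={acc:.3f}, minority recall={rec:.3f}")
# prints "accuracy=0.990, minority recall=0.000"
```

Under these assumed numbers the baseline reaches 99% accuracy while detecting none of the fraudulent transactions, which suggests why accuracy alone can be a misleading measure on imbalanced data.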
