Abstract
Logistic discrimination is a well-known statistical method that has been used successfully in many practical applications, including medical diagnosis and personal credit assessment. In this paper, we apply the model to the class-imbalance problem, also referred to as the skewed or rare-class problem, in which one class (the negative or majority class) has many more instances than the other (the positive or minority class). Traditional logistic discrimination pursues high accuracy under the implicit assumption that all classes are of similar size, so positive-class instances are often overlooked and misclassified as negative. To account for class imbalance, we revisit two basic measures for imbalanced problems, g-mean and f-measure, and design two new cost functions, a g-mean based metric (GM) and an f-measure based metric (FM), to supervise the learning of the logistic discrimination parameters. Like g-mean, GM estimates the geometric mean of the recall of the positive and negative classes; like f-measure, FM is a harmonic mean of the recall and precision of the positive class. Experiments on UCI data sets show that the proposed method has a significant advantage over state-of-the-art classification methods on all metrics used in this paper: accuracy, recall, f-measure and g-mean.
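To make the two underlying measures concrete, the following minimal Python sketch computes g-mean and f-measure from confusion-matrix counts. It illustrates only the standard evaluation measures, not the paper's GM and FM cost functions, whose exact form is not given in this abstract. The function names, the argument names (`tp`, `fn`, `tn`, `fp`), the choice of the balanced F1 variant, and the example counts are all assumptions made for illustration.

```python
import math


def g_mean(tp, fn, tn, fp):
    """Geometric mean of positive-class recall and negative-class recall."""
    recall_pos = tp / (tp + fn) if (tp + fn) else 0.0
    recall_neg = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(recall_pos * recall_neg)


def f_measure(tp, fn, fp):
    """Harmonic mean of positive-class precision and recall (assumed F1 form)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def accuracy(tp, fn, tn, fp):
    """Fraction of all instances classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)


# Hypothetical skewed data set: 10 positives, 990 negatives.
# A classifier that labels everything negative looks accurate
# but scores zero on both imbalance-aware measures.
all_negative = dict(tp=0, fn=10, tn=990, fp=0)
print(accuracy(**all_negative))            # high accuracy
print(g_mean(**all_negative))              # 0.0
print(f_measure(tp=0, fn=10, fp=0))        # 0.0
```

In this hypothetical case the always-negative classifier reaches 99% accuracy while both g-mean and f-measure are zero, which is the failure mode that motivates optimizing for these measures instead of accuracy.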