Abstract

In this paper, we investigate the performance of statistical, mathematical programming and heuristic linear models for cost‐sensitive classification. We compare five cost‐sensitive techniques: Fisher's discriminant analysis (DA), asymmetric misclassification cost mixed integer programming (AMC‐MIP), cost‐sensitive support vector machine (CS‐SVM), a hybrid support vector machine and mixed integer programming technique (SVMIP), and a heuristic cost‐sensitive genetic algorithm (CGA). Using simulated datasets with varying group overlaps, data distributions and class biases, together with real‐world datasets from the financial and medical domains, we compare the five techniques on overall holdout sample misclassification cost. The results on simulated datasets indicate that when group overlap is low and the data distribution is exponential, DA appears to provide superior performance. In all other simulated settings, CS‐SVM provides superior performance. For real‐world datasets from the financial domain, CGA and AMC‐MIP hold a slight edge over the two SVM‐based classifiers. For medical datasets with mixed continuous and discrete attributes, however, the SVM classifiers perform better than the heuristic (CGA) and AMC‐MIP classifiers. The SVMIP model is the least computationally efficient of the five and performs poorly.
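The comparison criterion named above, overall holdout sample misclassification cost, can be illustrated with a minimal Python sketch. The abstract does not give the exact definition or the cost values used in the study, so the 0/1 label encoding, the function name `total_misclassification_cost` and the parameters `cost_fp` and `cost_fn` are all assumptions made for illustration.

```python
from typing import Sequence


def total_misclassification_cost(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    cost_fp: float,
    cost_fn: float,
) -> float:
    """Sum asymmetric misclassification costs over a holdout sample.

    Assumes binary labels encoded as 0 and 1 (an illustrative choice,
    not taken from the paper). cost_fp is charged when a true 0 is
    predicted as 1; cost_fn is charged when a true 1 is predicted as 0.
    Correct predictions cost nothing.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    total = 0.0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 0 and predicted == 1:
            total += cost_fp
        elif actual == 1 and predicted == 0:
            total += cost_fn
    return total


# Hypothetical holdout labels and predictions, for illustration only.
holdout_true = [0, 0, 1, 1, 1]
holdout_pred = [0, 1, 1, 0, 0]
cost = total_misclassification_cost(holdout_true, holdout_pred, cost_fp=1.0, cost_fn=5.0)
```

Under a criterion of this kind, a classifier with a lower error rate can still rank worse if its errors fall mostly in the more expensive class, which is why the techniques are compared on cost rather than accuracy.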

