Abstract
In this work, we propose two novel classifiers for multi-class classification problems using mathematical programming optimisation techniques. We adopt the hyper box-based classifier of Xu and Papageorgiou (2009), which iteratively constructs hyper boxes to enclose samples of different classes. We first propose a new solution procedure that updates the sample weights at each iteration, biasing the model towards the difficult samples in the next iteration and thereby achieving a better final solution. Through a number of real-world data classification problems, we demonstrate that the proposed refined classifier delivers consistently good classification performance, outperforming the original hyper box classifier and a number of other state-of-the-art classifiers. Furthermore, we introduce a simple data space partition method to reduce the computational cost of the proposed sample re-weighting hyper box classifier. The method partitions the original dataset into two disjoint regions and then trains a sample re-weighting hyper box classifier for each region. On several real-world datasets, we demonstrate that the partition method considerably reduces the computational cost while maintaining the level of prediction accuracy.
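The partition step can be pictured with a short sketch. The abstract does not state the split criterion, so the rule below (a median split on the feature with the widest range) and the helper `fit_srw_hb` are purely illustrative assumptions, not the paper's actual procedure:

```python
import numpy as np

def partition_train(X, y, fit_srw_hb):
    """Split the data into two disjoint regions and train one SRW_HB
    classifier per region. The split rule used here (median of the
    feature with the widest range) is an illustrative assumption; the
    paper's actual partition criterion may differ."""
    f = int(np.argmax(X.max(axis=0) - X.min(axis=0)))  # widest-range feature
    t = float(np.median(X[:, f]))                      # split threshold
    left = X[:, f] <= t
    return f, t, (fit_srw_hb(X[left], y[left]),
                  fit_srw_hb(X[~left], y[~left]))

def predict(x, f, t, models):
    """Route a new sample to the classifier trained on its region."""
    return models[0].predict(x) if x[f] <= t else models[1].predict(x)
```

Because each regional model is trained on roughly half the samples, the mixed-integer programmes it must solve are smaller, which is where the reported reduction in computational cost comes from.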
Highlights
Given a set of samples, each described by measurable features and labelled with a pre-determined class, data classification concerns identifying the pattern within the sample data and predicting the class labels of new samples.
All the mathematical programming-based classification methods, including the sample re-weighting hyper box classifier (SRW_HB), the hyper box classifier (HB), and the approaches proposed by Gehrlein (1986) and Bal and Orkcu (2011), are implemented in the General Algebraic Modeling System (GAMS) 24.1 (GAMS Development Corporation, 2013) and solved with the CPLEX 12.3 solver on a 2.40 GHz CPU.
We demonstrate that the proposed SRW_HB classifier, which modifies the traditional HB classifier by updating the misclassification costs of samples with type 2 errors after each iteration, gives overall better prediction accuracy than a number of state-of-the-art classifiers (the update loop is sketched below).
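A minimal sketch of that iterative re-weighting loop follows. The hyper-box training routine `fit_hyper_boxes` and the multiplicative cost factor `beta` are assumptions standing in for the paper's MILP model and exact update rule:

```python
import numpy as np

def srw_hb(X, y, fit_hyper_boxes, n_iter=10, beta=2.0):
    """Sample re-weighting hyper-box training loop (sketch).
    fit_hyper_boxes and the multiplicative factor beta are assumptions;
    the paper defines the exact MILP model and update rule."""
    w = np.ones(len(y))                   # uniform initial costs
    best, best_err = None, np.inf
    for _ in range(n_iter):
        model = fit_hyper_boxes(X, y, w)  # solve MILP with current costs
        wrong = model.predict(X) != y     # samples still misclassified
        if wrong.mean() < best_err:       # keep the best iterate seen
            best, best_err = model, wrong.mean()
        w[wrong] *= beta                  # raise cost of difficult samples
    return best
```

Raising the misclassification costs of the still-misclassified samples makes the optimiser favour them in the next iteration, which is the mechanism the highlight describes.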
Summary
Given a set of samples, each described by measurable features and labelled with a pre-determined class, data classification concerns identifying the pattern within the sample data and predicting the class labels of new samples. In hyperplane-based classifiers such as support vector machines, the balance between the distance of the constructed hyperplane to the different classes of samples and the amount of misclassification is controlled by a user-specified trade-off parameter. Neural networks, despite their capacity to tackle datasets with non-linear and complex decision boundaries, require choosing the number of hidden layers, the number of neurons in each hidden layer, and the activation function, which amounts to a difficult optimisation problem and limits the generality of the method (Hunter, Hao, Pukish, Kolbusz, & Wilamowski, 2012). The structure of the network, i.e. the number of layers, the number of neurons per layer, and the type of activation function, is usually specified by the user, which reduces training a neural network classifier to tuning the weights of the connections between consecutive layers of neurons so as to minimise the classification error. Training a neural network is known to be time consuming and can only guarantee local optimality.
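To make the last point concrete, the minimal one-hidden-layer network below, trained by plain gradient descent, shows how training reduces to tuning the connection weights W1 and W2 once the user has fixed the architecture, and why only a local minimum of the error is guaranteed. All sizes, the sigmoid activation, and the learning rate are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

def train_mlp(X, y, hidden=8, lr=0.1, epochs=500):
    """One-hidden-layer network (sketch). The architecture is fixed up
    front; gradient descent only adjusts the connection weights, and
    converges to a local, not necessarily global, error minimum."""
    n, d = X.shape
    W1 = rng.normal(scale=0.5, size=(d, hidden))
    W2 = rng.normal(scale=0.5, size=(hidden, 1))
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    for _ in range(epochs):
        h = sig(X @ W1)                                  # hidden activations
        p = sig(h @ W2)                                  # predicted probability
        dz2 = (p - y.reshape(-1, 1)) / n * p * (1 - p)   # output-layer error
        dz1 = (dz2 @ W2.T) * h * (1 - h)                 # back-propagated error
        W2 -= lr * h.T @ dz2                             # tune layer-2 weights
        W1 -= lr * X.T @ dz1                             # tune layer-1 weights
    return W1, W2
```

The non-convexity of this objective in (W1, W2) is what limits gradient-based training to local optimality, in contrast to the mathematical programming formulations considered in the paper.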