To cope with classification problems involving large datasets, we propose a new mathematical programming algorithm that extends the clustering-based polyhedral conic functions approach. Despite the high classification efficiency of polyhedral conic functions, their realization previously required a nested implementation of k-means and conic function generation, whose computational load depends on the number of data points. In the proposed algorithm, an efficient data reduction method is applied in the k-means phase, prior to the conic function generation step. The new method not only improves the computational efficiency of the successful conic function classifier but also helps avoid model over-fitting by producing fewer (but more representative) conic functions.
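For orientation, the following minimal sketch shows how a set of polyhedral conic functions might be evaluated once it has been generated. It assumes the form commonly used in the polyhedral conic function literature, g(x) = w·(x − c) + ξ‖x − c‖₁ − γ, with the vertex c taken, for example, from a k-means cluster center. The function names, the parameter layout, and the decision rule (a point is assigned to the target class when at least one function is non-positive) are illustrative assumptions, not the authors' implementation, and the data reduction step itself is not shown.

```python
def pcf_value(x, center, w, xi, gamma):
    """Evaluate g(x) = w.(x - c) + xi * ||x - c||_1 - gamma (assumed PCF form)."""
    diff = [xj - cj for xj, cj in zip(x, center)]
    linear = sum(wj * dj for wj, dj in zip(w, diff))  # w.(x - c)
    l1_norm = sum(abs(dj) for dj in diff)             # ||x - c||_1
    return linear + xi * l1_norm - gamma


def classify(x, pcfs):
    """Return +1 if any conic function is non-positive at x, else -1.

    `pcfs` is a list of (center, w, xi, gamma) tuples; this layout is a
    hypothetical convention chosen for the sketch.
    """
    score = min(pcf_value(x, *params) for params in pcfs)
    return 1 if score <= 0 else -1


# One illustrative function centered at the origin: g(x) = ||x||_1 - 1.
example_pcfs = [((0.0, 0.0), (0.0, 0.0), 1.0, 1.0)]
inside = classify((0.2, 0.3), example_pcfs)   # point inside the unit L1 ball
outside = classify((2.0, 2.0), example_pcfs)  # point outside it
```

Because the decision takes the minimum over all functions, fewer functions mean fewer evaluations per point at prediction time, which is consistent with the efficiency argument above.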