The Minimal Complexity Machine (MCM) is a kernel-based learning model that yields very sparse models with performance comparable to or better than that of Support Vector Machines (SVMs). However, kernel optimization for the MCM has not yet been explored. Prior work has shown that a data-dependent kernel helps improve generalization. We present results on data-dependent optimized kernels for the MCM and for a large-scale MCM variant. Results on benchmark datasets demonstrate both model sparsity and improved generalization.