Abstract
This paper concerns the classification task in discrete attribute spaces, but considers it in a more fundamental framework: the learning of Boolean functions. Its purpose is to present a new learning algorithm for Boolean functions, called the Boolean kernel classifier (BKC), which employs capacity control based on Boolean kernels. BKC uses support vector machines (SVMs) as its learning engine, and Boolean kernels serve primarily to run SVMs in feature spaces spanned by conjunctions of Boolean literals. A second important role of Boolean kernels is to control the size of the hypothesis space so as to avoid overfitting. After applying an SVM to learn a classifier f in a feature space H induced by a Boolean kernel, BKC uses another Boolean kernel to compute the projection f^k of f onto the subspace H_k of H spanned by conjunctions of length at most k. By evaluating the accuracy of f^k on the training data for each k, BKC determines the smallest k such that f^k is as accurate as f, and then learns another classifier f' in H_k, which is expected to have lower error on unseen data. An empirical study on learning randomly generated Boolean functions shows that the capacity control is effective and that BKC outperforms C4.5 and naive Bayes classifiers.
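The projection step can be illustrated with a minimal sketch. The abstract does not give the kernel formulas, so the code below assumes one natural choice: a kernel that counts the conjunctions of literals satisfied by both inputs. If two Boolean vectors agree in s positions, exactly 2^s - 1 non-empty conjunctions are true for both, and the number of those with length at most k is the sum of C(s, j) for j = 1..k. All function names, the sign convention, and the decision to keep the bias unchanged under projection are assumptions made for illustration, not details confirmed by the paper. The final retraining of f' in H_k is not shown.

```python
from math import comb


def agreements(u, v):
    """Number of positions where two Boolean vectors take the same value."""
    return sum(1 for a, b in zip(u, v) if a == b)


def dnf_kernel(u, v):
    """Count the non-empty conjunctions of literals true for both u and v."""
    return 2 ** agreements(u, v) - 1


def k_dnf_kernel(u, v, k):
    """Count the conjunctions of length 1..k true for both u and v."""
    s = agreements(u, v)
    # comb(s, j) is 0 when j > s, so k may exceed s safely.
    return sum(comb(s, j) for j in range(1, k + 1))


def projected_decision(x, support, coef, bias, k):
    """Evaluate f^k(x) = sum_i coef_i * K_k(x_i, x) + bias.

    `coef` holds the signed dual coefficients (alpha_i * y_i) of a
    classifier already trained with the full kernel. Keeping `bias`
    unchanged is an assumption of this sketch.
    """
    return sum(c * k_dnf_kernel(sv, x, k) for sv, c in zip(support, coef)) + bias


def smallest_sufficient_k(X, y, support, coef, bias):
    """Smallest k whose projection f^k matches the training accuracy of f."""
    n = len(X[0])

    def accuracy(k):
        preds = [1 if projected_decision(x, support, coef, bias, k) >= 0 else -1
                 for x in X]
        return sum(p == t for p, t in zip(preds, y)) / len(y)

    full = accuracy(n)  # k = n recovers the full conjunction-counting kernel
    for k in range(1, n + 1):
        if accuracy(k) >= full:
            return k
    return n
```

With a precomputed Gram matrix built from `dnf_kernel`, the dual coefficients could come from any SVM solver that accepts custom kernels; the search over k then needs only kernel evaluations, never an explicit enumeration of conjunctions.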