Practical Conditions for Effectiveness of the Universum Learning

Vladimir Cherkassky, Wuyang Dai, Sauptik Dhar

doi:10.1109/tnn.2011.2157522

Abstract

Many applications of machine learning involve analysis of sparse high-dimensional data, in which the number of input features is larger than the number of data samples. Standard inductive learning methods may not be sufficient for such data, which motivates nonstandard learning settings. This paper investigates a new learning methodology called learning through contradictions, or Universum support vector machine (U-SVM). U-SVM incorporates a priori knowledge about application data, in the form of additional Universum samples, into the learning process. The paper examines possible advantages of U-SVM over standard SVM and describes practical conditions necessary for U-SVM to be effective. These conditions are based on analysis of the univariate histograms of projections of training samples onto the normal direction vector of the (standard) SVM decision boundary. Several empirical comparisons are presented to illustrate the practical utility of the proposed approach.
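
The histogram of projections mentioned in the abstract can be illustrated with a minimal sketch. It assumes a linear SVM has already been trained and is summarized by a weight vector w and bias b, and it uses the decision-function value f(x) = w·x + b as the (scaled) signed projection of a sample onto the normal direction. The toy samples, the values of w and b, the bin width, and the helper names projections and histogram are all hypothetical and are not taken from the paper.

```python
from collections import Counter
import math


def projections(samples, w, b):
    """Return f(x) = w.x + b for each sample: a scaled, signed
    projection onto the normal direction of the decision boundary."""
    return [sum(wi * xi for wi, xi in zip(w, x)) + b for x in samples]


def histogram(values, bin_width=0.5):
    """Count values per bin; the bin keyed by left edge e covers
    the half-open interval [e, e + bin_width)."""
    counts = Counter(math.floor(v / bin_width) for v in values)
    return {k * bin_width: counts[k] for k in sorted(counts)}


# Hypothetical toy data and a hypothetical, already-trained linear model.
X = [(2.0, 1.0), (1.5, 0.5), (-1.0, -1.0), (-2.0, 0.0), (0.2, -0.1)]
w, b = (1.0, 1.0), -0.5
bin_width = 0.5

proj = projections(X, w, b)
hist = histogram(proj, bin_width)

# Print a simple text histogram, one row per non-empty bin.
for left_edge, count in hist.items():
    print(f"[{left_edge:+.1f}, {left_edge + bin_width:+.1f}): {'#' * count}")
```

In practice the projections would come from a model fitted on the actual training data, and the same function could be applied to Universum samples so that the two histograms can be inspected side by side.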

Full Text