Abstract

Statistical learning theory provides a formal criterion for learning a concept from examples. The theory directly addresses the trade-off between empirical fit and generalization. In practice, this leads to the structural risk minimization principle, under which one minimizes a bound on the overall risk functional. For learning linear discriminant functions, this bound is governed by the minimum of two terms: the dimension and the inverse of the margin. A popular and powerful learning mechanism, the support vector machine, focuses on maximizing the margin. We compare this approach to methods that focus on minimizing the dimensionality, which, coincidentally, also fulfills another useful criterion: the minimum description length principle.
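
For concreteness, one standard form of such a bound (a sketch in the usual VC notation, following Vapnik's bound for gap-tolerant classifiers; the exact bound used in the paper may differ) states that, with probability at least 1 - \eta over \ell training examples,

\[
R(f) \;\le\; R_{\mathrm{emp}}(f) \;+\; \sqrt{\frac{h\left(\ln\frac{2\ell}{h} + 1\right) - \ln\frac{\eta}{4}}{\ell}},
\qquad
h \;\le\; \min\!\left(\left\lceil \frac{D^2}{M^2} \right\rceil,\; d\right) + 1,
\]

where h is the VC dimension, d the input dimension, M the margin, and D the diameter of the smallest ball enclosing the data. The min term makes the trade-off explicit: the bound tightens if one either reduces the dimension d or enlarges the margin M.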
