Abstract

In this paper, we deal with the classical Statistical Learning Theory's problem of bounding, with high probability, the true risk $L(h)$ of a hypothesis $h$ chosen from a set $\mathcal{H}$ of $m$ hypotheses. The Union Bound (UB) allows one to state that $\mathbb{P}\{\forall h \in \mathcal{H}: L(h) \leq \hat{L}(h) + \varepsilon(\delta_h)\} \geq 1 - \delta$, where $\hat{L}(h)$ is the empirical error, if it is possible to prove that $\mathbb{P}\{L(h) \geq \hat{L}(h) + \varepsilon(\delta)\} \leq \delta$ when $h$, $\delta$, and $\varepsilon(\delta)$ are chosen before seeing the data, such that $\delta_h \in (0,1)$ and $\sum_{h \in \mathcal{H}} \delta_h = \delta$. If no a priori information is available, the $\delta_h$ are set to $\delta/m$, namely equally distributed. This approach gives poor results since, as a matter of fact, a learning procedure targets just particular hypotheses, namely hypotheses with small empirical error, disregarding the others. In this work we set the $\delta_h$ and $\varepsilon(\delta_h)$ in a distribution-dependent way, increasing the probability of being chosen for functions with small true risk. We will call this proposal Distribution-Dependent Weighted UB (DDWUB) and we will retrieve the sufficient conditions on the choice of $\delta_h$ and $\varepsilon(\delta_h)$ under which DDWUB outperforms or, in the worst case, degenerates into UB. Furthermore, theoretical and numerical results will show the applicability, the validity, and the potentiality of DDWUB.
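For completeness, the UB argument can be spelled out in a few lines; the Hoeffding form of $\varepsilon$ below is an illustrative assumption (any per-hypothesis concentration inequality of the same shape would do):

    % Union bound over the finite set H, |H| = m, with per-hypothesis budgets delta_h.
    \begin{align*}
    &\mathbb{P}\{L(h) \geq \hat{L}(h) + \varepsilon(\delta_h)\} \leq \delta_h
      && \text{(fixed $h$, concentration)}\\
    &\mathbb{P}\{\exists h \in \mathcal{H}: L(h) \geq \hat{L}(h) + \varepsilon(\delta_h)\}
      \leq \sum_{h \in \mathcal{H}} \delta_h = \delta
      && \text{(union over $\mathcal{H}$)}\\
    &\varepsilon(\delta_h) = \sqrt{\frac{\ln(1/\delta_h)}{2n}}
      \;\overset{\delta_h = \delta/m}{=}\; \sqrt{\frac{\ln(m/\delta)}{2n}}
      && \text{(Hoeffding, $[0,1]$ loss, $n$ samples)}
    \end{align*}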

Highlights

  • Statistical learning theory [1,2,3,4] deals with the problem of understanding and estimating the performance of a statistical learning procedure

  • When the hypothesis space is composed of an arbitrary finite number of hypotheses, and no additional information is provided, the evaluation of the total risk is usually made with the Union Bound (UB) [2,7,8]

  • In this work we derive, for an arbitrary finite hypothesis space, a new, fully empirical upper bound on the generalization error of the hypothesis with minimal training error (a numerical illustration of the uniform baseline follows this list)
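As a point of reference for the uniform case, the following sketch evaluates the classical UB numerically; the function name and the Hoeffding-style deviation are illustrative assumptions, not the paper's construction:

    import math

    def hoeffding_eps(delta_h: float, n: int) -> float:
        """Hoeffding deviation sqrt(ln(1/delta_h) / (2n)) for a [0, 1]-bounded loss."""
        return math.sqrt(math.log(1.0 / delta_h) / (2.0 * n))

    # Uniform UB: split the confidence budget equally, delta_h = delta / m,
    # so every hypothesis pays the same ln(m)/(2n) penalty inside the square root.
    m, n, delta = 1000, 500, 0.05
    print(hoeffding_eps(delta / m, n))  # ~0.0995: L(h) <= L_hat(h) + 0.0995 for all h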


Summary

Introduction

Statistical learning theory [1,2,3,4] deals with the problem of understanding and estimating the performance of a statistical learning procedure. Weighting more heavily the risk associated with useful choices leads to tighter bounds on the generalization error of the hypotheses that will be selected by the algorithm (hypotheses characterized by small empirical error) and looser estimates over the others (hypotheses characterized by high empirical error). This approach is mainly theoretical, since the weights must be chosen before seeing the data and we cannot set them without a priori knowledge about the problem. It is surely possible to consider even more general data-independent functions for defining the weights, but we think that our definition is general enough to contemplate a wide variety of cases. At this point, the proposed DDWUB for bounding the generalization error of a hypothesis chosen from a finite set of possible ones can be stated. In the Appendices (see Appendices A–C), known results, proofs, and technicalities are reported for completeness.
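To make the weighting mechanism concrete, here is a minimal sketch, assuming a Hoeffding-style deviation and a toy geometric weighting (both illustrative assumptions; the paper's DDWUB chooses the weights in a distribution-dependent way, not by this fixed scheme). Nonuniform weights tighten the bound on favored hypotheses at the price of looser bounds on the others, and uniform weights recover the classical UB:

    import math

    def hoeffding_eps(delta_h: float, n: int) -> float:
        """Hoeffding deviation sqrt(ln(1/delta_h) / (2n)) for a [0, 1]-bounded loss."""
        return math.sqrt(math.log(1.0 / delta_h) / (2.0 * n))

    def weighted_ub(weights, delta, n):
        """Per-hypothesis deviations eps(delta_h) with delta_h = weights[h] * delta.

        The weights are nonnegative and sum to one, so sum_h delta_h = delta and
        the union bound still holds simultaneously with confidence 1 - delta.
        """
        assert abs(sum(weights) - 1.0) < 1e-9
        return [hoeffding_eps(w * delta, n) for w in weights]

    m, n, delta = 1000, 500, 0.05

    uniform = weighted_ub([1.0 / m] * m, delta, n)

    # Toy nonuniform choice: geometric weights favoring the first hypotheses,
    # a stand-in for "hypotheses likely to have small true risk".
    raw = [0.5 ** (h + 1) for h in range(m)]
    total = sum(raw)
    weighted = weighted_ub([r / total for r in raw], delta, n)

    print(uniform[0], weighted[0])    # favored h:    eps shrinks (~0.100 -> ~0.061)
    print(uniform[-1], weighted[-1])  # disfavored h: eps grows   (~0.100 -> ~0.834)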

Distribution-Dependent Weighted Union Bound
From Theory to Practice
Observation
Closed Form Results
Numerical Results
The Importance of γ and θ
What About the Computable Shell Decomposition Bounds?
Improving the Computable Shell Decomposition Bounds
Conclusions and Discussion
