Abstract

This paper develops a new computational model for learning stochastic rules, called the PAD (Probably Almost Discriminative)-learning model, based on statistical hypothesis testing theory. The model deals with the problem of designing a discrimination algorithm that tests whether or not a given test sequence of (instance, label) pairs has come from a given stochastic rule \(P^*\). Here the composite hypothesis \(\tilde P\) is unknown except that it belongs to a given class \(\mathcal{C}\). In this model, we propose a new discrimination algorithm based on the MDL (Minimum Description Length) principle, and then derive upper bounds on the least test sample size required by the algorithm to guarantee that the two types of error probabilities are less than \(\delta_1\) and \(\delta_2\), respectively, provided that the distance between the two rules to be discriminated is at least \(\varepsilon\). For the parametric case, where \(\mathcal{C}\) is a parametric class, this paper shows that an upper bound on the test sample size is given by \(O\bigl(\tfrac{1}{\varepsilon}\ln\tfrac{1}{\delta_1} + \tfrac{1}{\varepsilon^2}\ln\tfrac{1}{\delta_2} + \tfrac{\tilde k}{\varepsilon} + \tfrac{\ell(\tilde M)}{\varepsilon}\bigr)\). Here \(\tilde k\) is the number of real-valued parameters of the composite hypothesis \(\tilde P\), and \(\ell(\tilde M)\) is the description length of the countable model for \(\tilde P\). Further, this paper shows that the MDL-based discrimination algorithm performs well in the sense of sample complexity efficiency, by comparing it with other information-criteria-based discrimination algorithms.
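As an illustration only (this is not the paper's algorithm; the Bernoulli setting, the candidate grid, and the `margin` parameter are assumptions made for the sketch), an MDL-style discrimination rule can be caricatured as comparing the codelength of the test sequence under the null rule \(P^*\) against the best two-part codelength (model cost plus data cost) achievable within the class \(\mathcal{C}\):

```python
import math

def codelength(p, xs):
    """Codelength in nats of a binary sequence xs under a Bernoulli(p) model."""
    p = min(max(p, 1e-9), 1.0 - 1e-9)  # guard against log(0)
    ones = sum(xs)
    return -(ones * math.log(p) + (len(xs) - ones) * math.log(1.0 - p))

def mdl_discriminate(xs, p_star, candidates, model_len, margin):
    """Accept the null hypothesis 'xs came from Bernoulli(p_star)' iff its
    codelength does not exceed the best two-part codelength within the
    candidate class by more than `margin` nats."""
    null_len = codelength(p_star, xs)
    best_alt = min(model_len + codelength(q, xs) for q in candidates)
    return null_len <= best_alt + margin

grid = [i / 10 for i in range(1, 10)]   # crude stand-in for a parametric class
fair = [1, 0] * 50                      # consistent with Bernoulli(0.5)
skewed = [1] * 90 + [0] * 10            # consistent with Bernoulli(0.9)
print(mdl_discriminate(fair, 0.5, grid, 3.0, 2.0))    # → True  (accept null)
print(mdl_discriminate(skewed, 0.5, grid, 3.0, 2.0))  # → False (reject null)
```

In the paper's framework the acceptance margin would be set from \(\varepsilon\), \(\delta_1\), and \(\delta_2\) to control the two error probabilities; here it is simply a fixed constant for demonstration.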
This paper also shows how to transform any stochastic PAC (Probably Approximately Correct)-learning algorithm into a PAD-learning algorithm. For the non-parametric case, where \(\mathcal{C}\) is a non-parametric class but the discrimination algorithm uses a parametric class, this paper demonstrates that the sample complexity bound for the MDL-based discrimination algorithm is essentially related to Barron and Cover's index of resolvability. This sample complexity bound offers a new view of the relationship between the index of resolvability and the MDL principle from the PAD-learning viewpoint.
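For reference, the index of resolvability mentioned above is standardly defined (following Barron and Cover) for a true distribution \(P^*\), sample size \(n\), and a countable list of candidate models \(Q\) with description lengths \(\ell(Q)\) as

\[
R_n(P^*) \;=\; \min_{Q}\left\{ \frac{\ell(Q)}{n} + D(P^* \,\|\, Q) \right\},
\]

where \(D(\cdot\,\|\,\cdot)\) denotes the Kullback–Leibler divergence. It balances model description cost against approximation error, which is precisely the two-part trade-off an MDL-based discriminator exploits.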
