Abstract

Estimating the error probability is of primordial importance for classifier selection. The method explored in the previous chapter attempts to solve this problem by using a testing sequence to obtain a reliable holdout estimate. The independence of the testing and training sequences leads to a rather straightforward analysis. For good performance, the testing sequence has to be sufficiently large (although we often get away with testing sequences as small as about $\log n$). When data are expensive, this constitutes a waste. Assume that we do not split the data and use the same sequence for both training and testing. Often dangerous, this strategy nevertheless works if the class of rules from which we select is sufficiently restricted. The error estimate in this case is appropriately called the resubstitution estimate, and it will be denoted by $L_n^{(R)}$. This chapter explores its virtues and pitfalls. A third error estimate, the deleted estimate, is discussed in the next chapter. Estimates based upon other paradigms are treated briefly in Chapter 31.
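To make the contrast concrete, here is a minimal sketch (not from the chapter itself) comparing a holdout estimate with the resubstitution estimate $L_n^{(R)}$ on synthetic data. The Gaussian class-conditional distributions, the sample size, and the choice of the 1-nearest-neighbor rule are all illustrative assumptions; 1-NN is picked deliberately because its resubstitution estimate is zero by construction, an extreme instance of the optimistic bias that arises when the class of rules is too rich.

```python
# A minimal sketch contrasting holdout and resubstitution error estimates.
# The data, classifier, and split are illustrative assumptions, not the
# chapter's construction.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

# Synthetic two-class sample: each class a 2-d Gaussian (assumed example).
n = 200
X = np.vstack([rng.normal(0.0, 1.0, (n // 2, 2)),
               rng.normal(1.5, 1.0, (n // 2, 2))])
y = np.repeat([0, 1], n // 2)

# Holdout estimate: train on one half, evaluate on the independent half.
X_train, X_test = X[::2], X[1::2]
y_train, y_test = y[::2], y[1::2]
clf = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)
holdout_error = np.mean(clf.predict(X_test) != y_test)

# Resubstitution estimate L_n^(R): evaluate on the training data itself.
# For the 1-NN rule this is 0 almost surely, since every training point
# is its own nearest neighbor -- the optimistic bias in its purest form.
resub_error = np.mean(clf.predict(X_train) != y_train)

print(f"holdout estimate:        {holdout_error:.3f}")
print(f"resubstitution estimate: {resub_error:.3f}")
```

Had the rule been drawn from a suitably restricted class (say, a single fixed linear discriminant instead of 1-NN), the two estimates would typically be close, which is precisely the regime in which the abstract says resubstitution works.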
