It is often proposed (R. P. W. Duin, IEEE Trans. Comput. 25 (1976), 1175–1179; J. D. F. Habbema, J. Hermans, and J. Remme, "Compstat 1978" (Corsten and Hermans, Eds.), pp. 178–185; "Compstat 1974" (G. Bruckman, Ed.), pp. 101–110; [16], [17]) that Kullback-Leibler loss or likelihood cross-validation be used to select the window size when a kernel density estimate is constructed for purposes of discrimination. Some numerical work (E. F. Schuster and G. G. Gregory, "Fifteenth Annual Symposium on the Interface of Computer Science and Statistics" (W. F. Eddy, Ed.), pp. 295–298, Springer-Verlag, New York, 1981) argues against this proposal, but a major theoretical contribution (Y. S. Chow, S. Geman, and L. D. Wu, Ann. Statist. 11 (1983), 25–38) demonstrates consistency in the important case of compactly supported kernels. In the present paper we argue that in the context of Kullback-Leibler loss and likelihood cross-validation, compactly supported kernels are an unwise choice. They can result in unnecessarily large loss, and can lead to infinite loss when likelihood cross-validation is used to select window size. Compactly supported kernels often dictate that window size be chosen by trading off one part of the variance component of loss against the other, with scant regard for bias; compare the classical theory, where minimum loss is achieved by trading off variance against bias.
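The infinite-loss phenomenon is easy to see numerically. The following is a minimal illustrative sketch (not taken from the paper; the kernels, bandwidth, and simulated data are our own assumptions) of the leave-one-out likelihood cross-validation criterion. With a compactly supported Epanechnikov kernel, any observation lying farther than one bandwidth from every other data point has leave-one-out density zero, so the criterion is negatively infinite; a Gaussian kernel, whose support is the whole line, keeps the criterion finite for every positive bandwidth.

    import numpy as np

    def loo_likelihood_cv(x, h, kernel):
        """Leave-one-out log-likelihood CV score for bandwidth h.

        x: 1-D array of observations; kernel: density K integrating to 1.
        Returns sum_i log fhat_{-i}(x_i), the quantity maximized over h.
        """
        n = len(x)
        score = 0.0
        for i in range(n):
            u = (x[i] - np.delete(x, i)) / h
            fhat_i = kernel(u).sum() / ((n - 1) * h)  # leave-one-out estimate at x_i
            score += np.log(fhat_i)  # -inf whenever fhat_i == 0
        return score

    # Two kernels: unbounded vs. compact support (assumed for illustration).
    gauss = lambda u: np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    epan = lambda u: np.where(np.abs(u) < 1.0, 0.75 * (1.0 - u**2), 0.0)

    rng = np.random.default_rng(0)
    x = np.concatenate([rng.normal(0.0, 1.0, 50), [8.0]])  # one isolated point

    # The isolated point at 8.0 is more than h = 0.5 from all other points,
    # so the Epanechnikov leave-one-out density there is exactly zero and
    # the criterion is -inf; the Gaussian score stays finite.
    with np.errstate(divide="ignore"):
        print(loo_likelihood_cv(x, 0.5, epan))   # -inf
    print(loo_likelihood_cv(x, 0.5, gauss))      # finite

In this sketch the compact kernel forces the criterion to -inf for any bandwidth smaller than the largest nearest-neighbor gap, which illustrates why likelihood cross-validation can behave pathologically with compactly supported kernels.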