Abstract

We combine a new data model, in which the random classification is subject to rather weak restrictions based on the Mammen–Tsybakov small margin conditions [E. Mammen and A.B. Tsybakov, Ann. Statist. 27 (1999) 1808–1829; A.B. Tsybakov, Ann. Statist. 32 (2004) 135–166], with the statistical query (SQ) model due to Kearns [M.J. Kearns, J. ACM 45 (1998) 983–1006] into what we refer to as the PAC + SQ model. We generalize the class conditional constant noise (CCCN) model introduced by Decatur [S.E. Decatur, in ICML ’97: Proc. of the Fourteenth Int. Conf. on Machine Learn. Morgan Kaufmann Publishers Inc. San Francisco, CA, USA (1997) 83–91] to a noise model orthogonal to a set of query functions. We show that every polynomial time PAC + SQ learning algorithm can be efficiently simulated provided that, given the target concept, the random noise rate is orthogonal to the query functions used by the algorithm. Furthermore, we extend the constant-partition classification noise (CPCN) model due to Decatur [S.E. Decatur, in ICML ’97: Proc. of the Fourteenth Int. Conf. on Machine Learn. Morgan Kaufmann Publishers Inc. San Francisco, CA, USA (1997) 83–91] to what we call the constant-partition piecewise orthogonal (CPPO) noise model. We show how statistical queries can be simulated in the CPPO scenario, provided that the partition is known to the learner. Finally, we show how PAC + SQ simulators can be used in practice in the noise model orthogonal to the query space by presenting two examples from bioinformatics and software engineering. In this way, we demonstrate that our new noise model is realistic.
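
To make the idea of simulating a statistical query from noisy labels concrete, the following is a minimal sketch of the simplest special case only: binary labels flipped independently with a known constant rate eta < 1/2. It is not the orthogonal or CPPO construction of the paper. It relies on the identity E_noisy[chi(x, y)] = (1 - 2 eta) E[chi(x, f(x))] + eta E[chi(x, 0) + chi(x, 1)], which follows directly from the flipping process. The names `corrected_query_estimate` and `demo`, the threshold target, and the query function are all hypothetical choices made for illustration.

```python
import random


def corrected_query_estimate(samples, chi, eta):
    """Estimate E[chi(x, f(x))] from (x, noisy_label) pairs.

    Assumption: each label was flipped independently with a known,
    constant rate eta < 1/2 (the classical setting, not the paper's
    more general noise models).
    """
    if not 0.0 <= eta < 0.5:
        raise ValueError("eta must lie in [0, 1/2)")
    n = len(samples)
    # Empirical mean of the query on the noisy labels.
    noisy_mean = sum(chi(x, y) for x, y in samples) / n
    # Label-independent part, computable without trusting any label.
    label_free = sum(chi(x, 0) + chi(x, 1) for x, _ in samples) / n
    # Invert: noisy_mean = (1 - 2*eta) * P + eta * label_free.
    return (noisy_mean - eta * label_free) / (1.0 - 2.0 * eta)


def demo(n=50_000, eta=0.2, seed=0):
    """Toy check on uniform x in [0, 1) with a hypothetical target."""
    rng = random.Random(seed)
    target = lambda x: int(x > 0.3)             # hypothetical concept
    chi = lambda x, y: int(y == 1 and x > 0.5)  # noise-free mean is 0.5
    samples = []
    for _ in range(n):
        x = rng.random()
        y = target(x)
        if rng.random() < eta:  # flip the label with probability eta
            y = 1 - y
        samples.append((x, y))
    return corrected_query_estimate(samples, chi, eta)


if __name__ == "__main__":
    print(demo())
```

In this toy setup the noise-free value of the query is 0.5, so the corrected estimate should land close to it, whereas the uncorrected noisy mean would be biased towards 0.4.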
