Abstract

We introduce a new approach for designing computationally efficient learning algorithms that are tolerant to noise, and we demonstrate its effectiveness by designing algorithms with improved noise-tolerance guarantees for learning linear separators. We consider both the malicious noise model of Valiant [1985] and Kearns and Li [1988] and the adversarial label noise model of Kearns, Schapire, and Sellie [1994]. For malicious noise, where the adversary can corrupt both the label and the features, we provide a polynomial-time algorithm for learning linear separators in ℝ^d under isotropic log-concave distributions that can tolerate a nearly information-theoretically optimal noise rate of η = Ω(ϵ), improving on the Ω(ϵ^3/log^2(d/ϵ)) noise tolerance of Klivans et al. [2009a]. In the case that the distribution is uniform over the unit ball, this improves on the Ω(ϵ/d^{1/4}) noise tolerance of Kalai et al. [2005] and the Ω(ϵ^2/log(d/ϵ)) of Klivans et al. [2009a]. For the adversarial label noise model, where the distribution over the feature vectors is unchanged and the overall probability of a noisy label is constrained to be at most η, we also give a polynomial-time algorithm for learning linear separators in ℝ^d under isotropic log-concave distributions that can handle a noise rate of η = Ω(ϵ). In the case of the uniform distribution, this improves on the results of Kalai et al. [2005], which either required runtime super-exponential in 1/ϵ (ours is polynomial in 1/ϵ) or tolerated less noise. Our algorithms are also efficient in the active learning setting, where learning algorithms only receive the classifications of examples when they ask for them. We show that, in this model, our algorithms achieve a label complexity whose dependence on the error parameter ϵ is polylogarithmic (and thus exponentially better than that of any passive algorithm). This provides the first polynomial-time active learning algorithm for learning linear separators in the presence of malicious noise or adversarial label noise. Our algorithms and analysis combine several ingredients, including aggressive localization, minimization of a progressively rescaled hinge loss, and a novel localized and soft outlier removal procedure. We use localization techniques (previously used for obtaining better sample complexity results) to obtain better noise-tolerant polynomial-time algorithms.
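To make the three named ingredients concrete, the following Python sketch shows how aggressive localization and a progressively rescaled hinge loss can fit together on synthetic data. It is a minimal illustration under stated assumptions, not the paper's algorithm: the Gaussian marginal, the band-width and rescaling schedule, the subgradient solver, and especially the hard norm-based filter standing in for the paper's localized soft outlier removal are all simplifications introduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(n, d, w_star, eta):
    """Isotropic Gaussian marginal (one example of an isotropic
    log-concave distribution); labels come from w_star, with an
    eta fraction flipped on the points nearest the boundary --
    a simple stand-in adversary, chosen for illustration only."""
    X = rng.standard_normal((n, d))
    y = np.sign(X @ w_star)
    flip = np.argsort(np.abs(X @ w_star))[: int(eta * n)]
    y[flip] = -y[flip]
    return X, y

def hinge_minimize(X, y, w0, tau, steps=200):
    """Projected subgradient descent on the tau-rescaled hinge loss
    (1/n) * sum max(0, 1 - y<w,x>/tau), constrained to ||w|| <= 1."""
    w, lr = w0.copy(), 0.1 * tau  # step size scaled with tau (a heuristic)
    for _ in range(steps):
        active = (y * (X @ w) / tau) < 1.0   # points with hinge subgradient
        grad = -(y[active, None] * X[active]).sum(axis=0) / (tau * len(y))
        w -= lr * grad
        w /= max(1.0, np.linalg.norm(w))     # project onto the unit ball
    return w

def localized_hinge_learner(X, y, rounds=5, b0=1.0):
    """Each round restricts attention to a band around the current
    separator (aggressive localization), filters the band, and
    re-minimizes a hinge loss whose scale shrinks with the band
    (progressive rescaling). Constants are illustrative."""
    d = X.shape[1]
    w = hinge_minimize(X, y, np.ones(d) / np.sqrt(d), tau=1.0)
    b = tau = b0
    for _ in range(rounds):
        in_band = np.abs(X @ w) <= b
        Xb, yb = X[in_band], y[in_band]
        # Stand-in for the paper's localized *soft* outlier removal:
        # hard-drop band points whose squared norm is far above the mean.
        keep = (Xb**2).sum(axis=1) <= 4 * (Xb**2).sum(axis=1).mean()
        w = hinge_minimize(Xb[keep], yb[keep], w, tau)
        b, tau = b / 2, tau / 2
    return w

d = 20
w_star = np.zeros(d); w_star[0] = 1.0
X, y = sample(20000, d, w_star, eta=0.05)
w_hat = localized_hinge_learner(X, y)
print("disagreement with target:",
      np.mean(np.sign(X @ w_hat) != np.sign(X @ w_star)))
```

The design point the sketch tries to convey is the coupling of the two schedules: as the band width b shrinks, the hinge scale τ shrinks with it, so the loss stays informative on the ever-narrower region where the current hypothesis and the target can still disagree.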
