Robust Coreset Construction for Distributed Machine Learning

Hanlin Lu, Ming-Ju Li, Kevin S Chan, Shiqiang Wang, Ting He, Vijaykrishnan Narayanan

doi:10.1109/jsac.2020.3000373

Abstract

A coreset, which is a summary of the original dataset in the form of a small weighted set in the same sample space, provides a promising approach to enable machine learning over distributed data. Although viewed as a proxy of the original dataset, each coreset is only designed to approximate the cost function of a specific machine learning problem, and thus different coresets are often required to solve different machine learning problems, increasing the communication overhead. We resolve this dilemma by developing robust coreset construction algorithms that can support a variety of machine learning problems. Motivated by empirical evidence that suitably weighted k-clustering centers provide a robust coreset, we harden the observation by establishing theoretical conditions under which the coreset provides a guaranteed approximation for a broad range of machine learning problems, and by developing both centralized and distributed algorithms to generate coresets satisfying the conditions. The robustness of the proposed algorithms is verified through extensive experiments on diverse datasets with respect to both supervised and unsupervised learning problems.
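The motivating observation in the abstract, that weighted k-clustering centers can stand in for the full dataset, can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes plain Lloyd-style k-means, assumes each center is weighted by its cluster size, and uses a sum-of-squared-distances cost to a single query point as the example cost function. All function names (`kmeans_coreset`, `weighted_cost`, `full_cost`) and the synthetic data are hypothetical and chosen only for illustration.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def kmeans_coreset(points, k, iters=20, seed=0):
    """Return (centers, weights): k-means centers weighted by cluster size.

    Assumption: 'suitably weighted' is taken here to mean cluster cardinality.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialize from k distinct data points
    clusters = []
    for _ in range(iters):
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[j].append(p)
        # Update step: move each center to its cluster mean
        # (an empty cluster keeps its previous center and gets weight 0).
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    weights = [len(c) for c in clusters]
    return centers, weights


def weighted_cost(centers, weights, query):
    """Cost evaluated on the coreset: weighted squared distances to the query."""
    return sum(w * sq_dist(c, query) for c, w in zip(centers, weights))


def full_cost(points, query):
    """Cost evaluated on the full dataset: squared distances to the query."""
    return sum(sq_dist(p, query) for p in points)


# Synthetic data: two Gaussian blobs of 200 points each (illustrative only).
data_rng = random.Random(1)
points = [
    (data_rng.gauss(cx, 0.5), data_rng.gauss(cy, 0.5))
    for cx, cy in [(0, 0), (5, 5)]
    for _ in range(200)
]
centers, weights = kmeans_coreset(points, k=10)
query = (2.0, 3.0)
print("coreset size:", len(centers), "of", len(points), "points")
print("cost on coreset:", weighted_cost(centers, weights, query))
print("cost on full data:", full_cost(points, query))
```

In this sketch only the 10 weighted centers would need to be communicated instead of all 400 points. Because each center is the mean of its cluster, the coreset cost for this particular cost function differs from the full cost by the within-cluster variance, which shrinks as k grows. The paper's conditions and guarantees for other learning problems are not reproduced here.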

Journal: IEEE Journal on Selected Areas in Communications
Publication Date: Oct 1, 2020
Citations: 16
License type: publisher-specific-oa
