Online Censoring for Large-Scale Regressions with Application to Streaming Big Data.

Dimitris Berberidis, Georgios B Giannakis, Vassilis Kekatos

doi:10.1109/tsp.2016.2546225

Abstract

On par with data-intensive applications, the sheer size of modern linear regression problems creates an ever-growing demand for efficient solvers. Fortunately, a significant percentage of the data accrued can be omitted while maintaining a certain quality of statistical inference with an affordable computational budget. This work introduces means of identifying and omitting less informative observations in an online and data-adaptive fashion. Given streaming data, the related maximum-likelihood estimator is sequentially found using first- and second-order stochastic approximation algorithms. These schemes are well suited when data are inherently censored or when the aim is to save communication overhead in decentralized learning setups. In a different operational scenario, the task of joint censoring and estimation is put forth to solve large-scale linear regressions in a centralized setup. Novel online algorithms are developed enjoying simple closed-form updates and provable (non)asymptotic convergence guarantees. To attain desired censoring patterns and levels of dimensionality reduction, thresholding rules are investigated too. Numerical tests on real and synthetic datasets corroborate the efficacy of the proposed data-adaptive methods compared to data-agnostic random projection-based alternatives.
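
The idea of skipping less informative observations in a first-order online update can be illustrated with a small sketch. This is not the authors' algorithm, whose details the abstract does not give. It assumes one plausible censoring rule: an observation is skipped when its residual under the current estimate is at most `tau` noise standard deviations. The function name `censored_lms`, the parameters `tau`, `step`, and `sigma`, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def censored_lms(X, y, tau, step, sigma=1.0):
    """LMS-style update that skips observations with small residuals.

    Assumed rule (illustrative, not taken from the paper): observation n is
    censored when |y_n - x_n^T theta| <= tau * sigma.
    Returns the final estimate and the number of observations actually used.
    """
    theta = np.zeros(X.shape[1])
    used = 0
    for x_n, y_n in zip(X, y):
        residual = y_n - x_n @ theta
        if abs(residual) <= tau * sigma:
            continue  # censored: the current estimate already explains this point
        theta = theta + step * residual * x_n  # stochastic gradient step
        used += 1
    return theta, used

# Synthetic linear model (assumed for the demo only).
rng = np.random.default_rng(0)
n, p = 5000, 5
theta_true = rng.standard_normal(p)
X = rng.standard_normal((n, p))
y = X @ theta_true + 0.1 * rng.standard_normal(n)

theta_hat, used = censored_lms(X, y, tau=1.5, step=0.05, sigma=0.1)
print(f"used {used} of {n} observations")
print("estimation error:", np.linalg.norm(theta_hat - theta_true))
```

In this sketch a larger `tau` censors more data, trading estimation accuracy for fewer updates. The abstract's thresholding rules address how to choose such a threshold to reach a desired censoring level; that selection is not reproduced here.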

Journal: IEEE transactions on signal processing : a publication of the IEEE Signal Processing Society
Publication Date: Mar 23, 2016
Citations: 90

Similar Papers

Lightweight Metric Computation for Distributed Massive Data Streams
Emmanuelle Anceaume ... Yann Busnel
01 Jan 2017

IQ-Paths: Predictably High Performance Data Streams across Dynamic Network Overlays
Zhongtang Cai ... V Kumar
10 Jul 2006

ATLAS: A Small but Complete SQL Extension for Data Mining and Data Streams
Haixun Wang ... Carlo Zaniolo
Proceedings 2003 VLDB Conference
01 Jan 2003

Online Tensor Decomposition and Imputation for Count Data
Chang Ye ... Gonzalo Mateos
01 Jun 2019
