Subspace exploration: Bounds on Projected Frequency Estimation.

Graham Cormode, Charlie Dickens, David P Woodruff

doi:10.1145/3452021.3458312

Abstract

Given an n × d dimensional dataset A, a projection query specifies a subset C ⊆ [d] of columns, which yields a new n × |C| array. We study the space complexity of computing data analysis functions over such subspaces, including heavy hitters and norms, when the subspaces are revealed only after observing the data. We show that this important class of problems is typically hard: for many problems, we show 2^{Ω(d)} lower bounds. However, we present upper bounds which demonstrate space dependency better than 2^d. That is, for c, c' ∈ (0, 1) and a parameter N = 2^d, an N^c-approximation can be obtained in space min(N^{c'}, n), showing that it is possible to improve on the naïve approach of keeping information for all 2^d subsets of d columns. Our results are based on careful constructions of instances using coding theory and novel combinatorial reductions that exhibit such space-approximation tradeoffs.
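To make the definitions in the abstract concrete, the following is a minimal sketch of a projection query and of frequency estimation over the projected rows. It is an exact, offline computation that stores the whole dataset, not the space-bounded algorithms the paper studies. The function names (`project`, `projected_frequencies`, `heavy_hitters`, `all_column_subsets`), the toy array `A`, and the threshold parameter `phi` are illustrative assumptions and do not come from the paper.

```python
from collections import Counter
from itertools import combinations


def project(rows, cols):
    """Apply a projection query: keep only the columns in C for every row."""
    return [tuple(row[j] for j in cols) for row in rows]


def projected_frequencies(rows, cols):
    """Count how often each distinct projected row occurs (exact, not sketched)."""
    return Counter(project(rows, cols))


def heavy_hitters(rows, cols, phi):
    """Return projected rows whose count is at least phi * n.

    The fractional threshold phi is an assumption made for illustration.
    """
    n = len(rows)
    freq = projected_frequencies(rows, cols)
    return {pattern: count for pattern, count in freq.items() if count >= phi * n}


def all_column_subsets(d):
    """Enumerate every subset C of [d]; there are 2^d of them."""
    return [c for k in range(d + 1) for c in combinations(range(d), k)]


# Toy dataset with n = 4 rows and d = 3 columns (hypothetical values).
A = [
    (0, 1, 0),
    (0, 1, 1),
    (1, 1, 0),
    (0, 0, 0),
]

freq = projected_frequencies(A, [0, 1])   # query C = {0, 1}
hh = heavy_hitters(A, [0, 1], phi=0.5)    # patterns covering at least half the rows
num_subsets = len(all_column_subsets(3))  # size of the naive "store every subset" table
```

Here the query C = {0, 1} is answered only because the full array is kept. The naïve alternative mentioned in the abstract would precompute a summary for each of the `num_subsets` = 2^d column subsets, which is the cost the paper's bounds are measured against.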

Journal: Proceedings of the ... ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems
Publication Date: Jun 20, 2021
Citations: 1

Similar Papers

Zooming, a Practical Strategy for Improving the Quality of Multidimensional NMR Spectra
Zsolt Zolnai ... Nenad Juranić
Journal of Magnetic Resonance, Series A | VOL. 119
01 Mar 1996

Spectroscopic optical coherence tomography with graphics processing unit based analysis of three dimensional data sets
Volker Jaedicke ... Martin R Hofmann
21 Feb 2013

OpenCyto: an open source infrastructure for scalable, robust, reproducible, and automated, end-to-end flow cytometry data analysis.
Greg Finak ... Raphael Gottardo
PLoS computational biology | VOL. 10
28 Aug 2014

Targeted Projection Pursuit for Interactive Exploration of High-Dimensional Data Sets
Joe Faith
01 Jul 2007
