Sign-Full Random Projections

Ping Li

doi:10.1609/aaai.v33i01.33014205

Abstract

The method of 1-bit (“sign-sign”) random projections has been a popular tool for efficient search and machine learning on large datasets. Given two D-dimensional data vectors $u, v \in \mathbb{R}^D$, one can generate $x = \sum_{i=1}^{D} u_i r_i$ and $y = \sum_{i=1}^{D} v_i r_i$, where $r_i \sim N(0, 1)$ i.i.d. Then one can estimate the cosine similarity $\rho$ from $\mathrm{sgn}(x)$ and $\mathrm{sgn}(y)$. In this paper, we study a series of estimators for “sign-full” random projections. First we prove $E(\mathrm{sgn}(x)\,y) = \sqrt{2/\pi}\,\rho$, which provides an estimator for $\rho$. Interestingly, this estimator can be substantially improved by normalizing $y$. Then we study estimators based on $E(y_- 1_{x \geq 0} + y_+ 1_{x < 0})$ and its normalized version. We analyze the theoretical limit (using the MLE) and conclude that, among the proposed estimators, no single estimator achieves (or comes close to) the theoretically optimal asymptotic variance over the entire range of $\rho$. On the other hand, the estimators can be combined to achieve a variance close to that of the MLE. In applications such as near neighbor search, duplicate detection, and kNN classification, the training data are first transformed via random projections, and then only the signs of the projected data points (i.e., the $\mathrm{sgn}(x)$) are stored. The original training data are discarded. When a new data point arrives, we apply random projections, but we do not necessarily need to quantize the projected data (i.e., the $y$) to 1 bit. Therefore, sign-full random projections can be practically useful, and their accuracy gain over sign-sign projections comes at essentially no additional cost.
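The estimators described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the paper's reference implementation. It assumes unit-norm vectors u and v, so that the inner product equals the cosine similarity ρ and E(sgn(x)y) = √(2/π)ρ holds as stated. The function names, the helper `make_unit_pair`, and the particular normalization (dividing by the empirical root-mean-square of y) are assumptions made for illustration; the paper's exact normalized estimator may differ.

```python
import numpy as np


def make_unit_pair(rho, dim, rng):
    """Hypothetical helper: two unit-norm vectors with cosine similarity rho."""
    u = rng.standard_normal(dim)
    u /= np.linalg.norm(u)
    w = rng.standard_normal(dim)
    w -= (w @ u) * u              # remove the component along u
    w /= np.linalg.norm(w)
    v = rho * u + np.sqrt(1.0 - rho ** 2) * w
    return u, v


def simulate_projections(u, v, k, rng):
    """Project u and v onto the same k directions with iid N(0, 1) entries."""
    r = rng.standard_normal((u.shape[0], k))
    return u @ r, v @ r


def estimate_sign_sign(x, y):
    """1-bit baseline, using P(sgn(x) = sgn(y)) = 1 - arccos(rho) / pi."""
    p_hat = np.mean(np.sign(x) == np.sign(y))
    return np.cos(np.pi * (1.0 - p_hat))


def estimate_sign_full(x, y):
    """Sign-full estimate from E(sgn(x) y) = sqrt(2 / pi) * rho (unit-norm v)."""
    return np.sqrt(np.pi / 2.0) * np.mean(np.sign(x) * y)


def estimate_sign_full_normalized(x, y):
    """Assumed normalization: rescale y by its empirical root-mean-square."""
    k = len(y)
    return np.sqrt(np.pi / 2.0) * np.sum(np.sign(x) * y) / np.sqrt(k * np.sum(y ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    u, v = make_unit_pair(rho=0.8, dim=64, rng=rng)
    x, y = simulate_projections(u, v, k=20000, rng=rng)
    # Only sgn(x) would be stored for the training point; y stays full precision.
    print("sign-sign:           ", estimate_sign_sign(x, y))
    print("sign-full:           ", estimate_sign_full(x, y))
    print("sign-full normalized:", estimate_sign_full_normalized(x, y))
```

Note that only `np.sign(x)` enters the two sign-full estimators, which mirrors the storage scenario in the abstract: the stored side keeps one bit per projection, while the query side keeps its unquantized projections.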


Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Jul 17, 2019
Citations: 7

Similar Papers

Large-scale machine learning for classification and search
01 Jan 2012

Dimensionality Reduction for Sparse Subspace Clustering
01 Jan 2015

Random Projections for Large-Scale Regression
Gian-Andrea Thanei ... Christina Heinze
01 Jan 2017

Embedding Random Projections in Regularized Gradient Boosting Machines
Pierluigi Casale ... Petia Radeva
01 Jan 2010
