Abstract

Matrix decompositions are fundamental tools in applied mathematics, statistical computing, and machine learning. Low-rank matrix decompositions in particular are vital and widely used for data analysis, dimensionality reduction, and data compression. Massive datasets, however, pose a computational challenge for traditional algorithms, placing significant constraints on both memory and processing power. Recently, the powerful concept of randomness has been introduced as a strategy to ease this computational load. The essential idea of probabilistic algorithms is to employ some amount of randomness in order to derive a smaller matrix from a high-dimensional data matrix; the smaller matrix is then used to compute the desired low-rank approximation. Such algorithms are shown to be computationally efficient for approximating matrices with low-rank structure. We present the R package rsvd and provide a tutorial introduction to randomized matrix decompositions. Specifically, we discuss randomized routines for the singular value decomposition, (robust) principal component analysis, the interpolative decomposition, and the CUR decomposition. Several examples demonstrate the routines and show their computational advantage over other methods implemented in R.
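To make this essential idea concrete, the two-stage scheme can be sketched in a few lines of R. This is a minimal illustration under our own naming (the helper rand_svd() and its oversampling parameter p are not part of the rsvd package):

```r
# Minimal sketch of a basic randomized SVD: project A onto a random
# test matrix to obtain a small sketch, then decompose the sketch.
rand_svd <- function(A, k, p = 10) {
  n <- ncol(A)
  l <- min(k + p, n)                   # target rank plus oversampling
  Omega <- matrix(rnorm(n * l), n, l)  # Gaussian random test matrix
  Y <- A %*% Omega                     # sample the column space of A
  Q <- qr.Q(qr(Y))                     # orthonormal basis for the samples
  B <- crossprod(Q, A)                 # small l-by-n matrix: t(Q) %*% A
  s <- svd(B, nu = k, nv = k)          # cheap deterministic SVD
  list(u = Q %*% s$u, d = s$d[1:k], v = s$v)
}
```

The approximation is then A ≈ u diag(d) vᵀ. The package routines refine this basic scheme, for instance with power iterations for matrices whose singular values decay slowly.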

Highlights

  • In the era of “big data”, vast amounts of data are being collected and curated in the form of arrays across the social, physical, engineering, biological, and ecological sciences

  • The rpca() function provides an efficient routine for computing the dominant principal components using Algorithm 6 (a usage sketch follows this list)

  • Candès, Li, Ma, and Wright (2011) proved that it is possible to exactly separate such a data matrix A ∈ ℝ^{m×n} into its low-rank and sparse components, under rather broad assumptions. This is achieved by solving a convex optimization problem called principal component pursuit (PCP): minimize ‖L‖∗ + λ‖S‖₁ subject to L + S = A (see the sketch after this list)
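
The following usage sketch ties these highlights together. The simulated data, the chosen rank k = 5, and the corruption pattern are our own illustrative assumptions, so consult the package documentation for the actual defaults of rpca() and rrpca():

```r
library(rsvd)

# Randomized PCA: dominant principal components of a data matrix.
X <- matrix(rnorm(500 * 100), 500, 100)
pca <- rpca(X, k = 5)                 # first five principal components
summary(pca)

# Randomized robust PCA: separate a corrupted matrix into its
# low-rank component L and sparse component S via PCP.
L0 <- matrix(rnorm(500 * 5), 500, 5) %*% matrix(rnorm(5 * 100), 5, 100)
S0 <- matrix(0, 500, 100)
S0[sample(length(S0), 500)] <- 10     # sparse gross corruptions
fit <- rrpca(L0 + S0)
# fit$L and fit$S approximate the low-rank and sparse components.
```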


Summary

Introduction

In the era of “big data”, vast amounts of data are being collected and curated in the form of arrays across the social, physical, engineering, biological, and ecological sciences. Analysis of these data relies on a variety of matrix decomposition methods that seek to exploit the low-rank features exhibited by high-dimensional data. Despite our ever-increasing computational power, the emergence of large-scale datasets has severely challenged our ability to analyze data using traditional matrix algorithms. The computationally expensive singular value decomposition (SVD) is the most ubiquitous method for dimensionality reduction, data processing, and compression. The concept of randomness has recently been demonstrated to be an effective strategy for easing the computational demands of low-rank approximations from matrix decompositions such as the SVD, allowing for a scalable architecture for modern “big data” applications. Throughout this paper, we make the following assumption: the data matrix to be approximated has low-rank structure, i.e., its rank is smaller than the ambient dimension of the measurement space.
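
To fix ideas, the following self-contained sketch (with dimensions and rank chosen arbitrarily for illustration) constructs a matrix satisfying this low-rank assumption and compares the package's randomized routine against the deterministic svd() from base R:

```r
library(rsvd)

m <- 3000; n <- 2000; k <- 20
A <- matrix(rnorm(m * k), m, k) %*% matrix(rnorm(k * n), k, n)  # rank-20

system.time(s_full <- svd(A))          # full deterministic SVD
system.time(s_rand <- rsvd(A, k = k))  # randomized truncated SVD

# Relative Frobenius-norm error of the rank-k reconstruction.
A_hat <- s_rand$u %*% diag(s_rand$d) %*% t(s_rand$v)
norm(A - A_hat, "F") / norm(A, "F")
```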

Randomness as a computational strategy
Motivation and contributions
Organization
Notation
Probabilistic framework for low-rank approximations
The generic randomized algorithm
Improved randomized algorithm
Random test matrices
Randomized singular value decompositions
Brief historical overview
Conceptual overview
Randomized algorithm
Theoretical performance
Existing functionality for SVD in R
SVD example
Computational performance
Randomized principal component analysis
Existing functionality for PCA in R
PCA example
Randomized robust principal component analysis
The inexact augmented Lagrange multiplier method
Existing functionality for robust PCA in R
Robust PCA example
Additional functionality
Randomized CUR decomposition
Randomized interpolative decomposition
Conclusion