Adaptive random Fourier features with Metropolis sampling

Aku Kammonen, Mattias Sandberg, Jonas Kiessling, Petr Plecháč, Anders Szepessy

doi:10.3934/fods.2020014

Abstract

The supervised learning problem of determining a neural network approximation $ \mathbb{R}^d\ni x\mapsto\sum_{k = 1}^K\hat\beta_k e^{\mathrm{i}\omega_k\cdot x} $ with one hidden layer is studied as a random Fourier features algorithm. The Fourier features, i.e., the frequencies $ \omega_k\in\mathbb{R}^d $, are sampled using an adaptive Metropolis sampler. The Metropolis test accepts a proposal frequency $ \omega_k' $, with corresponding amplitude $ \hat\beta_k' $, with probability $ \min\big\{1, (|\hat\beta_k'|/|\hat\beta_k|)^{\gamma}\big\} $, where the positive parameter $ \gamma $ is determined by minimizing the approximation error for given computational work. This adaptive, non-parametric stochastic method leads asymptotically, as $ K\to\infty $, to equidistributed amplitudes $ |\hat\beta_k| $, analogous to deterministic adaptive algorithms for differential equations. The equidistributed amplitudes are shown to correspond asymptotically to the optimal density for independent samples in random Fourier features methods. Numerical evidence demonstrates the approximation properties and efficiency of the proposed algorithm, which is tested both on synthetic data and on a real-world high-dimensional benchmark.
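To make the Metropolis test in the abstract concrete, the following is a minimal NumPy sketch of one way such a sampler could be organised. It is an illustration, not the authors' implementation. Several ingredients are assumptions that the abstract does not specify: the Gaussian random-walk proposal with step size `delta`, the Tikhonov-regularised least-squares fit of the amplitudes (parameter `lam`), the choice to refit all amplitudes after every sweep, the zero initialisation of the frequencies, and the hypothetical function names `fit_amplitudes`, `adaptive_rff` and `predict`.

```python
import numpy as np


def fit_amplitudes(x, y, omega, lam=1e-3):
    """Regularised least-squares amplitudes for fixed frequencies.

    x: (N, d) inputs, y: (N,) targets, omega: (K, d) frequencies.
    The ridge term lam is an assumption of this sketch.
    """
    S = np.exp(1j * x @ omega.T)                      # (N, K) features e^{i omega_k . x}
    A = S.conj().T @ S + lam * len(x) * np.eye(omega.shape[0])
    return np.linalg.solve(A, S.conj().T @ y)         # (K,) complex amplitudes


def adaptive_rff(x, y, K=32, gamma=3.0, delta=0.5, n_iter=200, lam=1e-3, seed=0):
    """Sample frequencies with a Metropolis test on the amplitude ratio."""
    rng = np.random.default_rng(seed)
    omega = np.zeros((K, x.shape[1]))                 # assumed initialisation
    beta = fit_amplitudes(x, y, omega, lam)
    for _ in range(n_iter):
        # Assumed proposal: Gaussian random walk around the current frequencies.
        omega_prop = omega + delta * rng.standard_normal(omega.shape)
        beta_prop = fit_amplitudes(x, y, omega_prop, lam)
        # Accept omega_k' with probability min{1, (|beta_k'| / |beta_k|)^gamma}.
        # For u uniform on [0, 1) this equals the event u * |beta_k|^gamma < |beta_k'|^gamma,
        # which avoids dividing by a vanishing amplitude.
        u = rng.uniform(size=K)
        accept = u * np.abs(beta) ** gamma < np.abs(beta_prop) ** gamma
        omega[accept] = omega_prop[accept]
        # Refit all amplitudes for the updated frequency set.
        beta = fit_amplitudes(x, y, omega, lam)
    return omega, beta


def predict(x, omega, beta):
    """Evaluate sum_k beta_k e^{i omega_k . x} at the rows of x."""
    return np.exp(1j * x @ omega.T) @ beta
```

Calling `adaptive_rff(x, y)` on inputs of shape `(N, d)` and targets of shape `(N,)` returns the sampled frequencies and fitted amplitudes, and `predict(x_new, omega, beta)` evaluates the resulting one-hidden-layer approximation. The values of `gamma`, `delta` and `lam` here are placeholders and would need tuning for any particular data set.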
