Adaptive random Fourier features training stabilized by resampling with applications in image regression


Similar Papers
  • Research Article
  • Cited by 8
  • 10.1609/aaai.v33i01.33014229
Learning Adaptive Random Features
  • Jul 17, 2019
  • Proceedings of the AAAI Conference on Artificial Intelligence
  • Yanjun Li + 3 more

Random Fourier features are a powerful framework for approximating shift-invariant kernels with Monte Carlo integration, which has drawn considerable interest in scaling up kernel-based learning, dimensionality reduction, and information retrieval. In the literature, many sampling schemes have been proposed to improve the approximation performance. However, an interesting theoretical and algorithmic challenge remains: how can the design of random Fourier features be optimized to achieve good kernel approximation on any input data using a low spectral sampling rate? In this paper, we propose to compute more adaptive random Fourier features with optimized spectral samples (wj's) and feature weights (pj's). The learning scheme not only significantly reduces the spectral sampling rate needed for accurate kernel approximation, but also allows joint optimization with any supervised learning framework. We establish generalization bounds using Rademacher complexity, and demonstrate advantages over previous methods. Moreover, our experiments show that the empirical kernel approximation provides effective regularization for supervised learning.

  • Research Article
  • Cited by 13
  • 10.3934/fods.2020014
Adaptive random Fourier features with Metropolis sampling
  • Jan 1, 2020
  • Foundations of Data Science
  • Aku Kammonen + 4 more

The supervised learning problem to determine a neural network approximation $\mathbb{R}^d\ni x\mapsto\sum_{k=1}^K\hat\beta_k e^{\mathrm{i}\omega_k\cdot x}$ with one hidden layer is studied as a random Fourier features algorithm. The Fourier features, i.e., the frequencies $\omega_k\in\mathbb{R}^d$, are sampled using an adaptive Metropolis sampler. The Metropolis test accepts proposal frequencies $\omega_k'$, having corresponding amplitudes $\hat\beta_k'$, with the probability $\min\big\{1, (|\hat\beta_k'|/|\hat\beta_k|)^\gamma\big\}$, for a certain positive parameter $\gamma$, determined by minimizing the approximation error for given computational work. This adaptive, non-parametric stochastic method leads asymptotically, as $K\to\infty$, to equidistributed amplitudes $|\hat\beta_k|$, analogous to deterministic adaptive algorithms for differential equations. The equidistributed amplitudes are shown to asymptotically correspond to the optimal density for independent samples in random Fourier features methods. Numerical evidence is provided in order to demonstrate the approximation properties and efficiency of the proposed algorithm. The algorithm is tested both on synthetic data and a real-world high-dimensional benchmark.
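The acceptance rule in this abstract is concrete enough to sketch in code. Below is a minimal, hypothetical Python illustration (not the authors' implementation): frequencies are perturbed one at a time by a random walk, amplitudes are refit by regularized least squares, and a proposal is accepted with probability min{1, (|β_k'|/|β_k|)^γ}. The function names, the random-walk proposal, and the refit details are my own assumptions.

```python
import numpy as np

def fit_amplitudes(omegas, X, y, lam=1e-6):
    """Least-squares amplitudes beta for f(x) ~ sum_k beta_k exp(i w_k . x)."""
    S = np.exp(1j * X @ omegas.T)                      # (n, K) Fourier feature matrix
    A = S.conj().T @ S + lam * np.eye(len(omegas))     # regularised normal equations
    return np.linalg.solve(A, S.conj().T @ y.astype(complex))

def adaptive_metropolis_rff(X, y, K=32, n_iter=200, gamma=3.0, step=0.5, seed=0):
    """Adaptive Metropolis resampling of frequencies, accepting on amplitude ratio."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    omegas = rng.standard_normal((K, d))
    beta = fit_amplitudes(omegas, X, y)
    for _ in range(n_iter):
        k = rng.integers(K)
        proposal = omegas.copy()
        proposal[k] += step * rng.standard_normal(d)   # random-walk proposal for w_k
        beta_prop = fit_amplitudes(proposal, X, y)
        # Metropolis test: accept with prob min{1, (|beta_k'| / |beta_k|)^gamma}
        ratio = (abs(beta_prop[k]) / (abs(beta[k]) + 1e-12)) ** gamma
        if rng.random() < min(1.0, ratio):
            omegas, beta = proposal, beta_prop
    return omegas, beta
```

Fitting the amplitudes by regularized least squares after every proposal is the expensive part; the paper's actual scheme may update amplitudes differently.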

  • Research Article
  • Cited by 5
  • 10.1098/rspa.2021.0236
Wind field reconstruction with adaptive random Fourier features
  • Nov 1, 2021
  • Proceedings. Mathematical, Physical, and Engineering Sciences
  • Jonas Kiessling + 2 more

We investigate the use of spatial interpolation methods for reconstructing the horizontal near-surface wind field given a sparse set of measurements. In particular, the random Fourier features method is compared with a set of benchmark methods including kriging and inverse distance weighting. Random Fourier features is a linear model approximating the velocity field, with randomly sampled frequencies and amplitudes trained to minimize a loss function. We include a physically motivated divergence penalty on the reconstructed field, as well as a penalty on its Sobolev norm. We derive a bound on the generalization error and a sampling density that minimizes the bound. We then devise an adaptive Metropolis–Hastings algorithm for sampling the frequencies from the optimal distribution. In our experiments, our random Fourier features model outperforms the benchmark models.

  • Conference Article
  • Cited by 33
  • 10.1145/2939672.2939794
Revisiting Random Binning Features
  • Aug 13, 2016
  • Lingfei Wu + 3 more

Kernel methods have been developed as one of the standard approaches for nonlinear learning, but they do not scale to large data sets due to their quadratic complexity in the number of samples. A number of kernel approximation methods have thus been proposed in recent years, among which the random features method gains much popularity due to its simplicity and direct reduction of a nonlinear problem to a linear one. Different random feature functions have since been proposed to approximate a variety of kernel functions. Among them the Random Binning (RB) feature, proposed in the first random-feature paper [21], has drawn much less attention than the Random Fourier (RF) feature also proposed in [21]. In this work, we observe that the RB features, with the right choice of optimization solver, could be orders of magnitude more efficient than other random features and kernel approximation methods under the same accuracy requirement. We thus propose the first analysis of RB from the perspective of optimization, which, by interpreting RB as Randomized Block Coordinate Descent in an infinite-dimensional space, gives a faster convergence rate compared to that of other random features. In particular, we show that by drawing R random grids with at least κ non-empty bins per grid in expectation, the RB method achieves a convergence rate of O(1/(κR)), which not only sharpens its O(1/√R) rate from Monte Carlo analysis, but also shows a κ-times speedup over other random features under the same analysis framework. In addition, we demonstrate another advantage of RB in the L1-regularized setting, where, unlike other random features, an RB-based Coordinate Descent solver can be parallelized with guaranteed speedup proportional to κ. Our extensive experiments demonstrate the superior performance of the RB features over other random features and kernel approximation methods.
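For reference, random binning works by drawing random grids whose cell widths follow the density δ·k''(δ); for the Laplacian kernel k(r) = exp(−γr) this is a Gamma(2, 1/γ) distribution, and the fraction of grids in which two points land in the same cell estimates the kernel value. The following is my own minimal illustration of that collision-probability view, not the paper's code:

```python
import numpy as np

def random_binning_same_bin(x, y, R=2000, gamma=1.0, seed=0):
    """Fraction of R random grids placing x and y in the same bin.
    Unbiased estimate of the Laplacian kernel exp(-gamma * sum_i |x_i - y_i|)."""
    rng = np.random.default_rng(seed)
    d = len(x)
    same = 0
    for _ in range(R):
        # Per-dimension grid widths from delta * k''(delta) ~ Gamma(2, 1/gamma)
        delta = rng.gamma(2.0, 1.0 / gamma, d)
        u = rng.uniform(0, delta)                      # random grid offset
        same += np.array_equal(np.floor((x - u) / delta),
                               np.floor((y - u) / delta))
    return same / R
```

In the actual RB feature map, each (grid, bin) pair becomes one sparse binary feature, so the inner product of two feature vectors equals this same-bin fraction.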

  • Research Article
  • Cited by 2
  • 10.1080/00949655.2023.2182304
Iteratively reweighted least square for kernel expectile regression with random features
  • Mar 8, 2023
  • Journal of Statistical Computation and Simulation
  • Yue Cui + 1 more

To overcome the computational burden of quadratic programming in kernel expectile regression (KER), the iteratively reweighted least squares (IRLS) technique was introduced in the literature, resulting in IRLS-KER. However, for nonlinear models, IRLS-KER involves operations with matrices and vectors of the same size as the training set. Thus, as the training set becomes large, nonlinear IRLS-KER needs a long training time and large memory. To further alleviate the training cost, this paper projects the original data into a low-dimensional space via random Fourier features. The inner product of the random Fourier features of two data points is approximately the same as the kernel function evaluated at these two data points. Hence, it is possible to use a linear model in the new low-dimensional space to approximate the original nonlinear model, and consequently, time- and memory-efficient linear training algorithms can be applied. This paper applies the idea of random Fourier features to IRLS-KER, and our test results on simulated and real-world datasets show that the introduction of random Fourier features makes IRLS-KER achieve similar prediction accuracy to the original nonlinear version with substantially higher time efficiency.
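The kernel-approximation property this abstract relies on is easy to demonstrate: for the Gaussian kernel, the inner product of cosine random features concentrates around the exact kernel value as the number of features grows. A minimal sketch (my own illustration, not the paper's implementation):

```python
import numpy as np

def rff_map(X, n_features=500, sigma=1.0, seed=0):
    """Random Fourier feature map z with z(x).z(y) ~ exp(-||x-y||^2 / (2 sigma^2))."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.standard_normal((d, n_features)) / sigma   # spectral samples of the Gaussian
    b = rng.uniform(0, 2 * np.pi, n_features)          # random phases
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

# Compare the feature-map inner product against the exact Gaussian kernel.
rng = np.random.default_rng(1)
x, y = rng.standard_normal(3), rng.standard_normal(3)
Z = rff_map(np.vstack([x, y]), n_features=5000)
approx = Z[0] @ Z[1]
exact = np.exp(-np.sum((x - y) ** 2) / 2.0)
```

Once the data are mapped through `rff_map`, any linear solver (such as IRLS on the projected data) stands in for the nonlinear kernel model.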

  • Conference Article
  • Cited by 40
  • 10.24963/ijcai.2017/354
Large-scale Online Kernel Learning with Random Feature Reparameterization
  • Aug 1, 2017
  • Tu Dinh Nguyen + 3 more

A typical online kernel learning method faces two fundamental issues: the complexity of dealing with a huge number of observed data points (a.k.a. the curse of kernelization) and the difficulty of learning kernel parameters, which are often assumed to be fixed. Random Fourier features are a recent and effective approach to address the former by approximating the shift-invariant kernel function via Bochner's theorem, allowing the model to be maintained directly in a random feature space with a fixed dimension, so the model size remains constant w.r.t. the data size. We further introduce in this paper the reparameterized random feature (RRF), a random feature framework for large-scale online kernel learning that addresses both aforementioned challenges. Our initial intuition comes from the so-called "reparameterization trick" [Kingma et al., 2014]: lifting the source of randomness of the Fourier components to another space that can be independently sampled, so that the stochastic gradient of the kernel parameters can be analytically derived. We develop a well-founded underlying theory for our method, including a general way to reparameterize the kernel and a new, tighter error bound on the approximation quality. This view further inspires a direct application of stochastic gradient descent for updating our model in an online learning setting. We then conducted extensive experiments on several large-scale datasets, demonstrating that our work achieves state-of-the-art performance in both learning efficacy and efficiency.

  • Research Article
  • Cited by 13
  • 10.1145/2611378
On the Sample Complexity of Random Fourier Features for Online Learning
  • Jun 1, 2014
  • ACM Transactions on Knowledge Discovery from Data
  • Ming Lin + 2 more

We study the sample complexity of random Fourier features for online kernel learning, that is, the number of random Fourier features required to achieve good generalization performance. We show that when the loss function is strongly convex and smooth, online kernel learning with random Fourier features can achieve an O(log T / T) bound for the excess risk with only O(1/λ^2) random Fourier features, where T is the number of training examples and λ is the modulus of strong convexity. This is a significant improvement compared to the existing result for batch kernel learning, which requires O(T) random Fourier features to achieve a generalization bound of O(1/√T). Our empirical study verifies that online kernel learning with a limited number of random Fourier features can achieve generalization performance similar to online learning using the full kernel matrix. We also present an enhanced online learning algorithm with random Fourier features that improves classification performance via multiple passes over the training examples and a partial average.
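The setting above can be sketched as SGD in a fixed-dimension random feature space, with the 1/(λt) step size that is standard for λ-strongly-convex objectives. This is my own hedged illustration with a squared loss, not the paper's algorithm or its enhanced variant:

```python
import numpy as np

def online_rff_sgd(stream, d, D=200, lam=0.1, sigma=1.0, seed=0):
    """Online SGD on an l2-regularised squared loss in a D-dimensional RFF space."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((d, D)) / sigma        # spectral samples for a Gaussian kernel
    b = rng.uniform(0, 2 * np.pi, D)
    theta = np.zeros(D)
    for t, (x, y) in enumerate(stream, start=1):
        z = np.sqrt(2.0 / D) * np.cos(x @ W + b)   # fixed-dimension feature map
        grad = (z @ theta - y) * z + lam * theta   # gradient of the regularised loss
        theta -= grad / (lam * t)                  # 1/(lam*t) step for strong convexity
    return theta, W, b

def predict(theta, W, b, x):
    D = len(theta)
    return np.sqrt(2.0 / D) * np.cos(x @ W + b) @ theta
```

The model size is D regardless of how many examples the stream delivers, which is the point of the sample-complexity analysis: D need only scale like 1/λ^2.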

  • Research Article
  • Cited by 10
  • 10.1016/j.asoc.2021.107724
Random Fourier feature-based fuzzy clustering with [formula omitted]-Laplacian regularization
  • Jul 24, 2021
  • Applied Soft Computing
  • Yingxu Wang + 5 more


  • Research Article
  • Cited by 16
  • 10.1609/aaai.v34i04.5920
Random Fourier Features via Fast Surrogate Leverage Weighted Sampling
  • Apr 3, 2020
  • Proceedings of the AAAI Conference on Artificial Intelligence
  • Fanghui Liu + 4 more

In this paper, we propose a fast surrogate leverage weighted sampling strategy to generate refined random Fourier features for kernel approximation. Compared to the current state-of-the-art method that uses the leverage weighted scheme (Li et al. 2019), our new strategy is simpler and more effective. It uses kernel alignment to guide the sampling process and avoids the matrix inversion operator when computing the leverage function. Given n observations and s random features, our strategy can reduce the time complexity of sampling from O(ns^2 + s^3) to O(ns^2), while achieving comparable (or even slightly better) prediction performance when applied to kernel ridge regression (KRR). In addition, we provide theoretical guarantees on the generalization performance of our approach, and in particular characterize the number of random features required to achieve statistical guarantees in KRR. Experiments on several benchmark datasets demonstrate that our algorithm achieves comparable prediction performance and takes less time than (Li et al. 2019).

  • Research Article
  • 10.1609/aaai.v36i8.20803
TRF: Learning Kernels with Tuned Random Features
  • Jun 28, 2022
  • Proceedings of the AAAI Conference on Artificial Intelligence
  • Alistair Shilton + 4 more

Random Fourier features (RFF) are a popular set of tools for constructing low-dimensional approximations of translation-invariant kernels, allowing kernel methods to be scaled to big data. Apart from their computational advantages, by working in the spectral domain random Fourier features expose the translation-invariant kernel as a density function that may, in principle, be manipulated directly to tune the kernel. In this paper we propose selecting the density function from a reproducing kernel Hilbert space, allowing us to search the space of all translation-invariant kernels. Our approach, which we call tuned random features (TRF), achieves this by approximating the density function as the RKHS-norm regularised least-squares best fit to an unknown "true" optimal density function, resulting in an RFF formulation where kernel selection is reduced to regularised risk minimisation with a novel regulariser. We derive bounds on the Rademacher complexity for our method, showing that our random features approximation converges to optimal kernel selection in the large N, D limit. Finally, we present experimental results for a variety of real-world learning problems, demonstrating the performance of our approach compared to comparable methods.

  • Research Article
  • 10.1016/j.neunet.2024.107091
Improved analysis of supervised learning in the RKHS with random features: Beyond least squares.
  • Apr 1, 2025
  • Neural networks : the official journal of the International Neural Network Society
  • Jiamin Liu + 2 more


  • Research Article
  • 10.1609/aaai.v32i1.11674
Alternating Circulant Random Features for Semigroup Kernels
  • Apr 29, 2018
  • Proceedings of the AAAI Conference on Artificial Intelligence
  • Yusuke Mukuta + 2 more

The random features method is an efficient method to approximate the kernel function. In this paper, we propose novel random features called "alternating circulant random features,'' which consist of a random mixture of independent random structured matrices. Existing fast random features exploit random sign flipping to reduce the correlation between features. Sign flipping works well on random Fourier features for real-valued shift-invariant kernels because the corresponding weight distribution is symmetric. However, this method cannot be applied to random Laplace features directly because the distribution is not symmetric. The method proposed herein yields alternating circulant random features, with the correlation between features being reduced through the random sampling of weights from multiple independent random structured matrices instead of via random sign flipping. The proposed method facilitates rapid calculation by employing structured matrices. In addition, the weight distribution is preserved because sign flipping is not implemented. The performance of the proposed alternating circulant random features method is theoretically and empirically evaluated.
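The rapid calculation that structured matrices enable comes from a standard identity: multiplying by a circulant matrix is a circular convolution, computable via the FFT in O(n log n) instead of O(n^2). A minimal illustration of that building block (my own, not the paper's code):

```python
import numpy as np

def circulant_matvec(c, x):
    """Multiply the circulant matrix whose first column is c by x, via the FFT.
    Equivalent to C @ x with C[i, j] = c[(i - j) % n], in O(n log n)."""
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))
```

Structured random feature methods apply products like this (composed with sign flips or, as proposed here, alternation over several independent circulant matrices) in place of a dense Gaussian projection.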

  • Conference Article
  • Cited by 1
  • 10.1109/ijcnn52387.2021.9533863
Towards Unbiased Random Features with Lower Variance For Stationary Indefinite Kernels
  • Jul 18, 2021
  • Qin Luo + 3 more

Random Fourier features (RFF) demonstrate well-appreciated performance in kernel approximation for large-scale settings but restrict kernels to be stationary and positive definite. For non-stationary kernels, the corresponding RFF can be converted to those for stationary indefinite kernels when the inputs are restricted to the unit sphere. Numerous methods provide accessible ways to approximate stationary but indefinite kernels. However, they are either biased or possess large variance. In this article, we propose the generalized orthogonal random features, an unbiased estimator with lower variance. Experimental results on various datasets and kernels verify that our algorithm achieves lower variance and approximation error compared with existing kernel approximation methods. With better approximation of the originally selected kernels, improved classification accuracy and regression ability are obtained with our approximation algorithm in the framework of support vector machines and regression.

  • Research Article
  • 10.17485/ijst/2016/v9i47/97564
Polarity Identification of Aspect based Sentiment Reviews
  • Dec 28, 2016
  • Indian journal of science and technology
  • S Thara + 1 more

Objectives: To show the effectiveness of using random Fourier features in detecting sentiment polarities. Methods/Statistical Analysis: The paper identifies the sentiment polarity of laptop and restaurant datasets for three polarity categories: positive, negative, and neutral. It provides experimental comparisons with conventional machine learning methods for detecting review polarities. Several articles have shown the effectiveness of random Fourier features in classification problems. The present paper prepares random Fourier features corresponding to the polarity data. A regularized least squares strategy is adopted to fit a model and to perform the polarity detection task. Findings: Experiments were performed with 10-fold cross-validation. The proposed method with random Fourier features gives 80% accuracy, improving over conventional classifiers. Initially the features are mapped to a lower dimension (chosen manually) and the corresponding random Fourier features are obtained. The experiments are evaluated using precision, recall, and F-measure. Application/Improvements: The method presented in this paper shows that aspect-based polarity detection can be improved by choosing suitable features and mapping to a lower dimension.

  • Conference Article
  • Cited by 4
  • 10.1109/ijcnn.2018.8489498
A Unified Framework of Random Feature KLMS Algorithms and Convergence Analysis
  • Jul 1, 2018
  • Jiyao Dong + 2 more

Random feature kernel least mean square (RF-KLMS) algorithms, like the random Fourier feature KLMS (RFF-KLMS), can effectively reduce the computation and storage burdens of the KLMS algorithm during updates. However, little work has been done on the convergence analysis of such algorithms. To this end, in this paper, we present a unified framework of RF-KLMS algorithms and, based on it, a universal model for convergence analysis. As two examples, the RFF-KLMS and the random Gaussian feature KLMS (RGF-KLMS) are discussed in detail. Simulations demonstrate the validity of the theoretical analysis. Index Terms: kernel least mean square, random feature, universal model, convergence analysis.
