Abstract

Density estimation is one of the core problems in statistics. Nevertheless, existing techniques such as maximum likelihood estimation are computationally inefficient for complex parametric families because the normalizing constant is intractable. For this reason, interest in score matching has grown, since it does not depend on the normalizing constant. However, the score matching estimator is consistent only for distributions supported on the whole space. One approach to restoring consistency is to add noise to the input data, known as Denoising Score Matching. In this work we build a computationally efficient algorithm for density estimation using the kernel exponential family as the model distribution; this choice is motivated by the richness of this class of densities. To avoid computing the intractable normalizing constant, we use the Denoising Score Matching objective, and we address the computational complexity by applying a Random Fourier Features-based approximation of the kernel function. We derive an exact analytical expression for this case, which allows dropping additional regularization terms based on higher-order derivatives, as they are already implicitly included. Moreover, the obtained expression depends explicitly on the noise variance, so the validation loss can be used directly to tune the noise level. Along with benchmark experiments, the method was tested on various synthetic distributions to study its behavior in different cases. The empirical study shows quality comparable to competing approaches, while the proposed method is computationally faster, which enables scaling to complex high-dimensional data.
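The Random Fourier Features approximation mentioned above replaces an exact kernel evaluation with an inner product of finite-dimensional random feature maps. The sketch below illustrates the standard construction of Rahimi and Recht for the Gaussian (RBF) kernel; all function and parameter names are illustrative, not taken from the paper.

```python
import numpy as np

def rff_features(X, n_features=100, sigma=1.0, seed=None):
    """Random Fourier features z(x) such that z(x) @ z(y) approximates
    the Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 * sigma**2)).
    Illustrative sketch; names and defaults are assumptions."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Frequencies are drawn from the kernel's spectral density N(0, sigma^-2 I),
    # phases uniformly on [0, 2*pi) (Rahimi & Recht, 2007).
    W = rng.normal(scale=1.0 / sigma, size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

# The feature inner product converges to the exact kernel as n_features grows.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
Z = rff_features(X, n_features=20000, sigma=1.0, seed=1)
K_approx = Z @ Z.T
K_exact = np.exp(-((X[:, None] - X[None, :]) ** 2).sum(-1) / 2.0)
print(np.abs(K_approx - K_exact).max())  # shrinks as O(1/sqrt(n_features))
```

Because the feature map is explicit and finite-dimensional, downstream linear algebra scales with the number of features rather than the number of data points, which is the source of the computational savings claimed above.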

Highlights

  • One of the core problems in statistics is density estimation

  • We start by rewriting the expression for the Denoising Score Matching objective (4), following the same derivation as in [11], with the difference that our objective function is the convolution of the usual score matching objective with the noise distribution

  • The error bounds for score matching with Random Fourier Features (RFF) are given by the following theorem

Summary

INTRODUCTION

One of the core problems in statistics is density estimation. The most well-known approach is Maximum Likelihood Estimation (MLE), and there are numerous developments of this idea [2]–[6]. Another important part of density estimation is the choice of the class of models in which to search for the solution. To address the computational complexity issue, the authors of [13], [14] propose a Nyström-type approximation of the kernel function. To tackle the convergence issue, we optimize the convolution of the score matching objective with symmetric noise. The derived expression of the loss function explicitly contains the noise parameters, which allows us to use simple gradient-based approaches to tune them.
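The denoising variant of score matching discussed above can be made concrete with the classical identity of Vincent (2011): for Gaussian corruption x̃ = x + σε, matching the model score at x̃ to the corruption-kernel score −(x̃ − x)/σ² is, up to a constant, equivalent to the score matching objective on the noise-smoothed density. The following is a generic Monte Carlo sketch of that objective, not the paper's closed-form kernel expression; the function name and signature are assumptions.

```python
import numpy as np

def dsm_loss(score_fn, X, sigma, seed=None):
    """Denoising score matching loss with Gaussian noise (Vincent, 2011).
    Corrupts X, then regresses the model score at the noisy points onto
    the score of the corruption kernel, -(x_tilde - x) / sigma**2.
    Illustrative sketch; not the paper's analytical expression."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(size=X.shape)
    X_tilde = X + sigma * eps
    target = -(X_tilde - X) / sigma**2
    diff = score_fn(X_tilde) - target
    return 0.5 * np.mean(np.sum(diff**2, axis=1))

# For standard normal data the true score is s(x) = -x, which should score
# better (lower loss) than a mismatched candidate such as s(x) = -2x.
rng = np.random.default_rng(0)
X = rng.normal(size=(10000, 2))
loss_true = dsm_loss(lambda x: -x, X, sigma=0.1, seed=1)
loss_bad = dsm_loss(lambda x: -2.0 * x, X, sigma=0.1, seed=1)
print(loss_true, loss_bad)
```

Since `sigma` appears explicitly in the loss, it can in principle be treated as a tunable parameter and selected by validation loss, which is the property the introduction highlights.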

SCORE MATCHING
KERNEL EXPONENTIAL FAMILY
DENOISING SCORE MATCHING IN RKHS
RFF FOR DENOISING SCORE MATCHING
DISCUSSION
RESULTS
CONCLUSION
Fisher divergence
PROOF FOR THE ERROR BOUNDS OF SCORE MATCHING WITH RFF
DERIVATION OF H AND h FOR GAUSSIAN NOISE
DERIVATION OF H AND h FOR ARC-COSINE KERNEL