Abstract

We implement a simple method to accelerate convergence to the steady state and to enhance the mixing rate of the stochastic gradient Langevin method. The ordinary stochastic gradient method is based on mini-batch learning to reduce the computational cost when the amount of data is extraordinarily large. The stochasticity of the gradient can be mitigated by the injection of Gaussian noise, which yields the stochastic gradient Langevin method; this method can be used for Bayesian posterior sampling. However, the performance of the stochastic gradient Langevin method depends on the mixing rate of the stochastic dynamics. In this study, we propose violating the detailed balance condition to enhance the mixing rate. Recent studies have revealed that violating the detailed balance condition accelerates convergence to a stationary state and reduces the correlation time between samplings. We implement this violation of the detailed balance condition in the stochastic gradient Langevin method and test the method on a simple model to demonstrate its performance.
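For concreteness, the stochastic gradient Langevin update in the conventional form introduced by Welling and Teh can be written as below; the notation here is the standard one rather than the paper's own.

```latex
% Conventional stochastic gradient Langevin update (Welling & Teh):
% theta_t: parameters, N: total number of data, n: mini-batch size,
% eps_t: step size, eta_t: injected Gaussian noise.
\begin{equation}
  \Delta\theta_t =
    \frac{\epsilon_t}{2}\left(
      \nabla\log p(\theta_t)
      + \frac{N}{n}\sum_{i=1}^{n}\nabla\log p(x_{t_i}\mid\theta_t)
    \right) + \eta_t,
  \qquad \eta_t \sim \mathcal{N}(0,\epsilon_t),
\end{equation}
% with step sizes decreased so that \sum_t \epsilon_t = \infty
% and \sum_t \epsilon_t^2 < \infty.
```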

Highlights

  • Since massive amounts of data can be acquired from various sources, the importance of so-called big-data analysis is rapidly increasing

  • In this study, we propose the application of the Ohzeki-Ichiki method to the stochastic gradient Langevin method

  • The Ohzeki-Ichiki method violates the detailed balance condition (DBC), which is a sufficient condition for convergence to a stationary state, and shows remarkably faster attainment of the stationary state than the standard equilibrium dynamics satisfying the DBC


Introduction

Since massive amounts of data can be acquired from various sources, the importance of so-called big-data analysis is rapidly increasing. Two sources of randomness appear in this setting: the stochasticity of the mini-batch gradient and the Gaussian noise injected into the dynamics. The former stochasticity is a resultant property of reducing the computational cost for large-scale data; in the latter, the noise makes the trajectory of the parameters converge to the full posterior distribution rather than just the maximum a posteriori mode. Welling and Teh proposed the combination of Langevin dynamics with the stochastic gradient method, i.e., the stochastic gradient Langevin method, to generate the posterior distribution when learning from large-scale data [16]. By decreasing the step size gradually, the injected noise becomes dominant and the effective dynamics converge to the Langevin equation with the exact gradient. An increase in the convergence speed improves the performance and reduces the computational cost of the method. This fact motivates the study of the stochastic gradient Langevin method from the point of view of nonequilibrium statistical physics. We introduce accelerated stochastic dynamics with faster convergence to a stationary state.
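As a rough illustration of the scheme described above, the following sketch implements the stochastic gradient Langevin update for a toy Gaussian-mean model; the model, the variable names, and the step-size schedule are illustrative assumptions rather than details taken from the paper.

```python
# Minimal sketch of stochastic gradient Langevin dynamics (SGLD) for a toy
# Gaussian-mean model; all names and settings here are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: N observations from a unit-variance Gaussian with unknown mean.
N, true_mean = 10_000, 2.0
data = rng.normal(true_mean, 1.0, size=N)

def grad_log_prior(theta):
    # Standard-normal prior N(0, 1): d/dtheta log p(theta) = -theta.
    return -theta

def grad_log_lik(theta, batch):
    # Unit-variance Gaussian likelihood: d/dtheta log p(x | theta) = x - theta.
    return np.sum(batch - theta)

def sgld_step(theta, step, batch):
    # Mini-batch estimate of the full-data gradient, rescaled by N / |batch|.
    grad = grad_log_prior(theta) + (N / len(batch)) * grad_log_lik(theta, batch)
    # Injected Gaussian noise with variance equal to the step size.
    return theta + 0.5 * step * grad + rng.normal(0.0, np.sqrt(step))

theta, samples = 0.0, []
for t in range(5_000):
    # Polynomially decreasing step size: sum(eps) diverges, sum(eps^2) converges.
    step = 1e-4 * (t + 10.0) ** (-0.55)
    batch = rng.choice(data, size=100, replace=False)
    theta = sgld_step(theta, step, batch)
    samples.append(theta)

print("posterior mean estimate:", np.mean(samples[1000:]))
```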

Langevin equation and its corresponding Fokker-Planck equation
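The derivations of this section are not reproduced here, but the standard overdamped Langevin equation and its Fokker-Planck counterpart, which the section title refers to, take the conventional forms below; the paper's own notation may differ in detail.

```latex
% Standard overdamped Langevin dynamics and its Fokker-Planck equation
% (conventional forms; the paper's own notation may differ).
\begin{align}
  \mathrm{d}x_t &= -\nabla U(x_t)\,\mathrm{d}t + \sqrt{2T}\,\mathrm{d}W_t,\\
  \frac{\partial P(x,t)}{\partial t}
    &= \nabla\cdot\bigl(\nabla U(x)\,P(x,t) + T\,\nabla P(x,t)\bigr),
\end{align}
% whose stationary solution is the Gibbs distribution
\begin{equation}
  P_{\mathrm{ss}}(x) \propto e^{-U(x)/T}.
\end{equation}
```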
Ohzeki-Ichiki method for the replicated system
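The Ohzeki-Ichiki construction itself is not reproduced here. As a hedged illustration of the general idea of sampling without detailed balance, the sketch below adds a rotational drift built from a constant antisymmetric matrix A to an unadjusted Langevin update. For a constant antisymmetric A, the extra drift is divergence-free under the Gibbs measure, so the target distribution is preserved while the DBC is broken; the 2-D Gaussian target, gamma, and all variable names are illustrative assumptions, not the paper's exact construction.

```python
# Illustrative irreversible Langevin sampler: an extra drift gamma * (A @ grad),
# with A antisymmetric, breaks detailed balance without changing the target.
import numpy as np

rng = np.random.default_rng(1)

# Toy target: a zero-mean, anisotropic two-dimensional Gaussian.
cov = np.array([[1.0, 0.8],
                [0.8, 2.0]])
prec = np.linalg.inv(cov)

def grad_log_target(x):
    # Gradient of log N(0, cov) up to a constant: -prec @ x.
    return -prec @ x

# Constant antisymmetric matrix; gamma controls how strongly the DBC is broken.
A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])
gamma = 1.0

def irreversible_step(x, step):
    g = grad_log_target(x)
    drift = g + gamma * (A @ g)              # reversible part + rotational part
    noise = rng.normal(0.0, np.sqrt(2.0 * step), size=x.shape)
    return x + step * drift + noise

x, samples = np.zeros(2), []
for _ in range(100_000):
    x = irreversible_step(x, 1e-2)
    samples.append(x.copy())

# After discarding burn-in, the empirical covariance should approximate cov
# (up to discretization bias), despite the irreversible drift.
samples = np.array(samples[10_000:])
print("empirical covariance:\n", np.cov(samples.T))
```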
