Abstract
We implement a simple method to accelerate convergence to the steady state and enhance the mixing rate of the stochastic gradient Langevin method. The ordinary stochastic gradient method is based on mini-batch learning to reduce the computational cost when the amount of data is extraordinarily large. The stochasticity of the gradient can be mitigated by the injection of Gaussian noise, which yields the stochastic gradient Langevin method; this method can be used for Bayesian posterior sampling. However, the performance of the stochastic gradient Langevin method depends on the mixing rate of the stochastic dynamics. In this study, we propose violating the detailed balance condition to enhance the mixing rate. Recent studies have revealed that violating the detailed balance condition accelerates the convergence to a stationary state and reduces the correlation time between samplings. We implement this violation of the detailed balance condition in the stochastic gradient Langevin method and test our method on a simple model to demonstrate its performance.
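For concreteness, the update rule of the stochastic gradient Langevin method, in the form popularized by Welling and Teh, can be sketched as follows; the notation here (step size $\epsilon_t$, mini-batch size $n$, data set size $N$) is ours and is not taken from the abstract:

```latex
\Delta\theta_t = \frac{\epsilon_t}{2}\left(\nabla_\theta \log p(\theta_t)
  + \frac{N}{n}\sum_{i=1}^{n}\nabla_\theta \log p(x_{t_i}\mid\theta_t)\right) + \eta_t,
\qquad \eta_t \sim \mathcal{N}(0,\epsilon_t),
```

where the sum runs over a mini-batch of size $n$ and the injected Gaussian noise $\eta_t$ is what turns mini-batch gradient ascent on the log posterior into approximate posterior sampling.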
Highlights
Since massive amounts of data can be acquired from various sources, the importance of the so-called big-data analysis is rapidly increasing.
In this study, we proposed the application of the Ohzeki-Ichiki method to the stochastic gradient Langevin method.
The Ohzeki-Ichiki method violates the detailed balance condition (DBC), which is a sufficient condition to ensure convergence to a stationary state, and shows remarkable performance in attaining a stationary state compared to the standard equilibrium case under the DBC.
Summary
Since massive amounts of data can be acquired from various sources, the importance of the so-called big-data analysis is rapidly increasing. The stochastic gradient Langevin method involves two kinds of stochasticity: the stochasticity of the mini-batch gradient and the injected Gaussian noise. The former is a resultant property that reduces the computational cost for large-scale data. The latter makes the trajectory of the parameters converge to the full posterior distribution rather than just the maximum a posteriori mode. Welling and Teh have proposed the combination of Langevin dynamics with the stochastic gradient method, i.e., the stochastic gradient Langevin method, to generate the posterior distribution for learning from large-scale data [16]. By decreasing the step size gradually, the injected noise will become dominant and the effective dynamics will converge to the Langevin equation with the exact gradient. An increase in the convergence speed improves the performance and reduces the computational cost of the method. This fact motivates the study of the stochastic gradient Langevin method from the point of view of nonequilibrium statistical physics. We introduce accelerated stochastic dynamics with faster convergence to a stationary state.
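To illustrate how detailed balance can be broken without changing the target distribution, the following is a minimal Python sketch, not the authors' implementation: it adds a skew-symmetric rotation of the gradient (one standard construction for an irreversible drift) to a plain Langevin update on a two-dimensional Gaussian target. The matrix A, the strength gamma, and the step size are illustrative assumptions.

```python
import numpy as np

# Target: zero-mean Gaussian with precision matrix P, i.e. U(x) = 0.5 * x^T P x.
P = np.array([[2.0, 0.3],
              [0.3, 1.0]])

def grad_U(x):
    return P @ x

# Skew-symmetric matrix: the extra drift -gamma * A @ grad_U(x) is divergence-free
# with respect to exp(-U), so it breaks detailed balance without changing the
# stationary distribution (illustrative choice, not taken from the paper).
A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])
gamma = 1.0   # strength of the detailed-balance-violating drift (assumed)
eps = 1e-3    # step size (assumed)
rng = np.random.default_rng(0)

x = np.zeros(2)
samples = []
for t in range(50_000):
    g = grad_U(x)
    drift = -(np.eye(2) + gamma * A) @ g   # reversible part + irreversible part
    x = x + eps * drift + np.sqrt(2.0 * eps) * rng.standard_normal(2)
    samples.append(x.copy())

# Sample covariance after burn-in should approach inv(P).
print(np.cov(np.array(samples[10_000:]).T))
```

With gamma = 0 the update reduces to ordinary reversible Langevin dynamics; a nonzero gamma adds a probability current that leaves exp(-U) stationary while violating detailed balance, which is the mechanism exploited to shorten the relaxation time.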