Numerical aspects of large deviations
An introduction to numerical large-deviation sampling is provided. First, direct biasing with a known distribution is explained. As a simple example, the Bernoulli process is used throughout the text. Next, Markov chain Monte Carlo (MCMC) simulations are introduced. In particular, the Metropolis-Hastings algorithm is explained. As a first implementation of MCMC, sampling of the plain Bernoulli model is shown. Next, an exponential bias is used for the same model, which allows one to obtain the tails of the distribution of a measurable quantity. This approach is generalized to MCMC simulations, where the states are vectors of U(0,1) random entries. This allows one to use the exponential or any other bias to access the large-deviation properties of rather arbitrary random processes. Finally, some recent research applications to study more complex models are discussed.
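The biased sampling described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the single-entry redraw move, and the sign convention of the bias weight, exp(theta * S), are assumptions. The state is a vector of U(0,1) numbers, the measured quantity S is the number of entries below p (the number of successes in the Bernoulli process), and a Metropolis step accepts a redrawn entry with probability min(1, exp(theta * dS)).

```python
import math
import random


def count_successes(xi, p):
    """Number of entries of the U(0,1) vector xi that count as Bernoulli successes."""
    return sum(1 for x in xi if x < p)


def biased_mcmc(n, p, theta, steps, rng):
    """Metropolis sampling of xi in [0,1]^n with weight proportional to exp(theta * S(xi)).

    theta = 0 reproduces the plain Bernoulli model; theta > 0 pushes the chain
    towards large S, theta < 0 towards small S (assumed sign convention).
    Returns the value of S after every step.
    """
    xi = [rng.random() for _ in range(n)]
    s = count_successes(xi, p)
    samples = []
    for _ in range(steps):
        i = rng.randrange(n)        # pick one entry of the state vector
        new_x = rng.random()        # propose a fresh U(0,1) value (symmetric move)
        s_new = s - (xi[i] < p) + (new_x < p)
        log_ratio = theta * (s_new - s)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            xi[i] = new_x           # accept the move
            s = s_new
        samples.append(s)
    return samples


if __name__ == "__main__":
    rng = random.Random(0)
    samples = biased_mcmc(n=20, p=0.3, theta=1.0, steps=50000, rng=rng)
    print("mean S under bias:", sum(samples) / len(samples))
```

Under these assumptions, a histogram of the biased samples, multiplied by exp(-theta * S) and normalized, estimates the unbiased distribution of S in the region the biased chain visits, which is how the tails become accessible.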
- Research Article
60
- 10.1016/j.geoderma.2011.03.011
- May 25, 2011
- Geoderma
Confronting uncertainty in model-based geostatistics using Markov Chain Monte Carlo simulation
- Research Article
53
- 10.1111/j.1365-2966.2008.14385.x
- May 23, 2008
- Monthly Notices of the Royal Astronomical Society
Retrieval of orbital parameters of extrasolar planets poses considerable statistical challenges. Due to sparse sampling, measurement errors, parameter degeneracy and modelling limitations, there are no unique values of basic parameters, such as period and eccentricity. Here, we estimate the orbital parameters from radial velocity data in a Bayesian framework by utilizing Markov Chain Monte Carlo (MCMC) simulations with the Metropolis–Hastings algorithm. We follow a methodology recently proposed by Gregory and Ford. Our implementation of MCMC is based on the object-oriented approach outlined by Graves. We make our resulting code, exofit, publicly available with this paper. It can search for either one or two planets as illustrated on mock data. As an example we re-analysed the orbital solution of companions to HD 187085 and HD 159868 from the published radial velocity data. We confirm the degeneracy reported for orbital parameters of the companion to HD 187085, and show that a low-eccentricity orbit is more probable for this planet. For HD 159868, we obtained a slightly different orbital solution and a relatively high ‘noise’ factor indicating the presence of an unaccounted signal in the radial velocity data. exofit is designed in such a way that it can be extended for a variety of probability models, including different Bayesian priors.
- Research Article
54
- 10.1111/2041-210x.13727
- Oct 15, 2021
- Methods in Ecology and Evolution
Posterior distributions are commonly approximated by samples produced from a Markov chain Monte Carlo (MCMC) simulation. Every MCMC simulation has to be checked for convergence, that is, that sufficiently many samples have been obtained and that these samples indeed represent the true posterior distribution. Here we develop and test different approaches for convergence assessment in phylogenetics. We analytically derive a threshold for a minimum effective sample size (ESS) of 625. We observe that only the initial sequence estimator provides robust ESS estimates for common types of MCMC simulations (autocorrelated samples, adaptive MCMC, Metropolis‐coupled MCMC). We show that standard ESS computation can be applied to phylogenetic trees if the tree samples are converted into traces of absence/presence of splits. Convergence in distribution between replicated MCMC runs can be assessed with the Kolmogorov–Smirnov test. The commonly used potential scale reduction factor (PSRF) is biased when applied to skewed posterior distributions. Additionally, we show how the distribution of differences in split frequencies can be computed exactly, akin to standard exact tests, and show that it depends on the true frequency of a split. Hence, the average standard deviation of split frequencies is too simplistic and the expected difference based on the 95% quantile should be used instead to check for convergence in split frequencies. We implemented the methods described here in the open‐source R package Convenience ( https://github.com/lfabreti/convenience ), which allows users to easily test for convergence using output from standard phylogenetic inference software.
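To make the ESS check concrete, the following sketch estimates the effective sample size of a single numeric trace and compares it with the threshold of 625 stated in the abstract. The abstract names an initial sequence estimator without spelling it out; this example assumes the initial positive sequence variant (summing autocorrelations in adjacent pairs and stopping at the first non-positive pair), and the function name `effective_sample_size` is hypothetical. It is not the implementation in the Convenience package.

```python
import random


def effective_sample_size(chain):
    """Estimate the ESS of a 1-D trace with an initial positive sequence estimator.

    Autocorrelations are summed in pairs (lag 2k and lag 2k+1) until a pair
    sum is no longer positive. The integrated autocorrelation time is then
    tau = 2 * (sum of pair sums) - 1, and ESS = n / tau.
    """
    n = len(chain)
    mean = sum(chain) / n
    centered = [v - mean for v in chain]

    def autocov(lag):
        return sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n

    var = autocov(0)
    if var == 0.0:
        return float(n)  # constant trace: fall back to the raw sample count

    pair_total = 0.0
    lag = 0
    while lag + 1 < n:
        pair = (autocov(lag) + autocov(lag + 1)) / var
        if pair <= 0.0:
            break        # noise dominates from here on, stop summing
        pair_total += pair
        lag += 2

    tau = 2.0 * pair_total - 1.0
    if tau <= 0.0:
        return float(n)  # fallback for strongly anti-correlated traces
    return n / tau


if __name__ == "__main__":
    rng = random.Random(1)
    x, trace = 0.0, []
    for _ in range(20000):
        x = 0.9 * x + rng.gauss(0.0, 1.0)  # autocorrelated toy trace
        trace.append(x)
    ess = effective_sample_size(trace)
    print("ESS:", round(ess), "passes threshold:", ess >= 625)
```

For tree samples, the same function could be applied to each 0/1 trace of split presence, as the abstract suggests, although that conversion is not shown here.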
- Research Article
2
- 10.1080/03610910600591917
- Jul 1, 2006
- Communications in Statistics - Simulation and Computation
Models for geostatistical data introduce spatial dependence in the covariance matrix of location-specific random effects. This is usually defined to be a parametric function of the distances between locations. Bayesian formulations of such models overcome asymptotic inference and estimation problems involved in maximum likelihood-based approaches and can be fitted using Markov chain Monte Carlo (MCMC) simulation. The MCMC implementation, however, requires repeated inversions of the covariance matrix, which makes the problem computationally intensive, especially for a large number of locations. In the present work, we propose to convert the spatial covariance matrix to a sparse matrix and compare a number of numerical algorithms especially suited to the MCMC framework in order to accelerate large matrix inversion. The algorithms are assessed empirically on simulated datasets of different size and sparsity. We conclude that the band solver applied after ordering the distance matrix substantially reduces the computational time for inverting covariance matrices.
- Research Article
38
- 10.1016/j.jsv.2014.10.002
- Feb 16, 2015
- Journal of Sound and Vibration
The estimation of time-invariant parameters of noisy nonlinear oscillatory systems
- Research Article
21
- 10.1016/j.gca.2016.12.040
- Jan 11, 2017
- Geochimica et Cosmochimica Acta
An introduction of Markov chain Monte Carlo method to geochemical inverse problems: Reading melting parameters from REE abundances in abyssal peridotites
- Research Article
37
- 10.1103/physreve.101.053312
- May 28, 2020
- Physical Review E
Autoregressive neural networks are emerging as a powerful computational tool to solve relevant problems in classical and quantum mechanics. One of their appealing functionalities is that, after they have learned a probability distribution from a dataset, they allow exact and efficient sampling of typical system configurations. Here we employ a neural autoregressive distribution estimator (NADE) to boost Markov chain Monte Carlo (MCMC) simulations of a paradigmatic classical model of spin-glass theory, namely, the two-dimensional Edwards-Anderson Hamiltonian. We show that a NADE can be trained to accurately mimic the Boltzmann distribution using unsupervised learning from system configurations generated using standard MCMC algorithms. The trained NADE is then employed as a smart proposal distribution for the Metropolis-Hastings algorithm. This allows us to perform efficient MCMC simulations, which provide unbiased results even if the expectation value corresponding to the probability distribution learned by the NADE is not exact. Notably, we implement a sequential tempering procedure, whereby a NADE trained at a higher temperature is iteratively employed as the proposal distribution in an MCMC simulation run at a slightly lower temperature. This allows one to efficiently simulate the spin-glass model even in the low-temperature regime, avoiding the divergent correlation times that plague MCMC simulations driven by local-update algorithms. Furthermore, we show that the NADE-driven simulations quickly sample ground-state configurations, paving the way to their future utilization to tackle binary optimization problems.
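The claim that an imperfect learned proposal still gives unbiased results follows from the Metropolis-Hastings acceptance rule for state-independent proposals, min(1, pi(x') q(x) / (pi(x) q(x'))). The sketch below illustrates this on a toy two-spin system. It is an assumption-laden stand-in: a fixed, deliberately inexact probability table plays the role of the trained NADE, and the names `independence_mh`, `boltzmann_weight` and `proposal` are hypothetical.

```python
import math
import random


def independence_mh(target_weight, proposal_probs, steps, rng):
    """Metropolis-Hastings with a state-independent proposal distribution q.

    target_weight(state) returns the unnormalized target probability pi(state).
    proposal_probs maps every state to q(state) > 0.
    Returns the visit frequency of each state.
    """
    states = list(proposal_probs)
    weights = [proposal_probs[s] for s in states]
    current = rng.choices(states, weights=weights)[0]
    counts = {s: 0 for s in states}
    for _ in range(steps):
        candidate = rng.choices(states, weights=weights)[0]
        # The q factors correct for the mismatch between proposal and target.
        ratio = (target_weight(candidate) * proposal_probs[current]) / (
            target_weight(current) * proposal_probs[candidate]
        )
        if rng.random() < ratio:
            current = candidate
        counts[current] += 1
    return {s: c / steps for s, c in counts.items()}


def boltzmann_weight(state, coupling=1.0, beta=1.0):
    """Unnormalized Boltzmann weight of two spins with energy -coupling * s1 * s2."""
    s1, s2 = state
    return math.exp(beta * coupling * s1 * s2)


# Stand-in for a learned model: roughly right, but not the exact Boltzmann distribution.
proposal = {(1, 1): 0.40, (1, -1): 0.10, (-1, 1): 0.15, (-1, -1): 0.35}

if __name__ == "__main__":
    freqs = independence_mh(boltzmann_weight, proposal, 100000, random.Random(3))
    for state, freq in freqs.items():
        print(state, round(freq, 3))
```

The closer the proposal is to the target, the higher the acceptance rate and the shorter the correlation time, which is the motivation for training the proposal on configurations from the target distribution.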
- Research Article
192
- 10.1093/ije/dyt043
- Apr 1, 2013
- International Journal of Epidemiology
Markov Chain Monte Carlo (MCMC) methods are increasingly popular among epidemiologists. The reason for this may in part be that MCMC offers an appealing approach to handling some difficult types of analyses. Additionally, MCMC methods are those most commonly used for Bayesian analysis. However, epidemiologists are still largely unfamiliar with MCMC. They may lack familiarity either with the implementation of MCMC or with interpretation of the resultant output. As with tutorials outlining the calculus behind maximum likelihood in previous decades, a simple description of the machinery of MCMC is needed. We provide an introduction to conducting analyses with MCMC, and show that, given the same data and under certain model specifications, the results of an MCMC simulation match those of methods based on standard maximum-likelihood estimation (MLE). In addition, we highlight examples of instances in which MCMC approaches to data analysis provide a clear advantage over MLE. We hope that this brief tutorial will encourage epidemiologists to consider MCMC approaches as part of their analytic tool-kit.
- News Article
- 10.1136/bmj.a708
- Jul 8, 2008
- BMJ
Summary: Posterior distributions are commonly approximated by samples produced from a Markov chain Monte Carlo (MCMC) simulation. Every MCMC simulation has to be checked for convergence, i.e., that sufficiently many...
- Research Article
14
- 10.1063/1.3519056
- Feb 17, 2011
- The Journal of Chemical Physics
Characterizing the conformations of proteins in the transition state ensemble (TSE) is important for studying protein folding. A promising approach pioneered by Vendruscolo et al. [Nature (London) 409, 641 (2001)] to study the TSE is to generate conformations that satisfy all constraints imposed by the experimentally measured φ values that provide information about the native likeness of the transition states. Faísca et al. [J. Chem. Phys. 129, 095108 (2008)] generated conformations of the TSE based on the criterion that, starting from a TS conformation, the probabilities of folding and unfolding are about equal through Markov Chain Monte Carlo (MCMC) simulations. In this study, we use the constrained sequential Monte Carlo method [Lin et al., J. Chem. Phys. 129, 094101 (2008); Zhang et al. Proteins 66, 61 (2007)] to generate TSE conformations of acylphosphatase of 98 residues that satisfy the φ-value constraints, as well as the criterion that each conformation has a folding probability of 0.5 by Monte Carlo simulations. We adopt a two-stage process and first generate 5000 contact maps satisfying the φ-value constraints. Each contact map is then used to generate 1000 properly weighted conformations. After clustering similar conformations, we obtain a set of properly weighted samples of 4185 candidate clusters. A representative conformation of each of these clusters is then selected and 50 runs of Markov chain Monte Carlo (MCMC) simulation are carried out using a regrowth move set. We then select a subset of 1501 conformations that have equal probabilities to fold and to unfold as the set of TSE. These 1501 samples characterize well the distribution of transition state ensemble conformations of acylphosphatase. Compared with previous studies, our approach can access a much wider conformational space and can objectively generate conformations that satisfy the φ-value constraints and the criterion of 0.5 folding probability without bias.
In contrast to previous studies, our results show that transition state conformations are very diverse and are far from nativelike when measured in cartesian root-mean-square deviation (cRMSD): the average cRMSD between TSE conformations and the native structure is 9.4 Å for this short protein, instead of 6 Å reported in previous studies. In addition, we found that the average fraction of native contacts in the TSE is 0.37, with enrichment in native-like β-sheets and a shortage of long range contacts, suggesting such contacts form at a later stage of folding. We further calculate the first passage time of folding of TSE conformations through calculation of physical time associated with the regrowth moves in MCMC simulation through mapping such moves to a Markovian state model, whose transition time was obtained by Langevin dynamics simulations. Our results indicate that despite the large structural diversity of the TSE, they are characterized by similar folding time. Our approach is general and can be used to study TSE in other macromolecules.
- Research Article
2
- 10.1088/1755-1315/580/1/012030
- Oct 1, 2020
- IOP Conference Series: Earth and Environmental Science
Reliability assessment plays a vital role in bridge health monitoring (BHM). The analysis of inspection and monitoring data, such as numerical, image and video data, gives poor results because there is no efficient reliability assessment method. This paper analyses the effect of applying the Markov Chain Monte Carlo (MCMC) simulation method. The subset simulation method is used to analyse events with small failure probability. Furthermore, a reliability assessment process based on MCMC simulation with the Metropolis-Hastings algorithm (MHA) is proposed. The advantage of this method is that it improves the efficiency and accuracy of reliability assessment based on BHM data.
- Conference Article
5
- 10.1109/ipdpsw.2010.5470689
- Apr 1, 2010
The increasing availability of multi-core and multiprocessor architectures provides new opportunities for improving the performance of many computer simulations. Markov Chain Monte Carlo (MCMC) simulations are widely used for approximate counting problems, Bayesian inference and as a means for estimating very high-dimensional integrals. As such, MCMC has had a wide variety of applications in fields including computational biology and physics, financial econometrics, machine learning and image processing. One method for improving the performance of Markov Chain Monte Carlo simulations is to use SMP machines to perform ‘speculative moves’, reducing the runtime whilst producing statistically identical results to conventional sequential implementations. In this paper we examine the circumstances under which the original speculative moves method performs poorly, and consider how some of the situations can be addressed by refining the implementation. We extend the technique to perform Markov Chains speculatively, expanding the range of algorithms that may be accelerated by speculative execution to those with non-uniform move processing times. By simulating program runs we can predict the theoretical reduction in runtime that may be achieved by this technique. We compare how efficiently different architectures perform in using this method, and present experiments that demonstrate a runtime reduction of up to 35–42% where using conventional speculative moves would result in execution as slow as, if not slower than, sequential processing.
- Book Chapter
9
- 10.1007/978-3-319-09507-3_47
- Nov 30, 2014
There are over 10,000 rail bridges in Australia that were made of different materials and constructed in different years. Without a systematic approach to decision making, managing thousands of bridges has become a real challenge for rail bridge engineers. Developing the most suitable deterioration models is essential in order to implement a comprehensive Bridge Management System (BMS). In State Based Markov Deterioration (SBMD) modeling, the main task is to estimate Transition Probability Matrices (TPMs). In this study, the Markov Chain Monte Carlo (MCMC) simulation method is utilized to estimate TPMs of railway bridge elements by overcoming some limitations of conventional and nonlinear optimization-based TPM estimation methods. The bridge inventory data over 15 years of 1,000 Australian railway bridges were reviewed and contributing factors for railway bridge deterioration were identified. MCMC simulation models were applied at bridge network level. Results show that TPMs corresponding to critical bridge elements can be obtained by the Metropolis-Hastings algorithm (MHA), coded in MATLAB and run until it converges to stationary transition probability distributions. The predicted condition state distributions of the selected bridge element group were tested by statistical hypothesis tests to validate the suitability of the bridge deterioration models developed.
- Research Article
- 10.17977/um055v2i2p7-13
- Jan 1, 2021
Correct prediction of claims frequency and claims severity is very important in the insurance business for determining the outstanding claims reserve that an insurance company should prepare. One approach which may be used to predict a future value is the Bayesian approach. This approach combines the sample and the prior information. The information is used to construct the posterior distribution and to determine the estimate of the parameters. However, in this approach, integrations of functions with high dimensions are often encountered. In this thesis, a Markov Chain Monte Carlo (MCMC) simulation with the Gibbs sampling algorithm is used to solve the problem. The MCMC simulation uses the ergodic property of Markov chains. For an ergodic Markov chain, a stationary distribution, which is the target distribution, is obtained. The MCMC simulation is applied to a hierarchical Poisson model. The OpenBUGS software is used to carry out the tasks. The MCMC simulation of the hierarchical Poisson model can predict the claims frequency.
- Research Article
- 10.17977/um055v2i22021p7-13
- Jun 11, 2021
- Jurnal Kajian Matematika dan Aplikasinya (JKMA)
Correct prediction of claims frequency and claims severity is very important in the insurance business for determining the outstanding claims reserve that an insurance company should prepare. One approach which may be used to predict a future value is the Bayesian approach. This approach combines the sample and the prior information. The information is used to construct the posterior distribution and to determine the estimate of the parameters. However, in this approach, integrations of functions with high dimensions are often encountered. In this thesis, a Markov Chain Monte Carlo (MCMC) simulation with the Gibbs sampling algorithm is used to solve the problem. The MCMC simulation uses the ergodic property of Markov chains. For an ergodic Markov chain, a stationary distribution, which is the target distribution, is obtained. The MCMC simulation is applied to a hierarchical Poisson model. The OpenBUGS software is used to carry out the tasks. The MCMC simulation of the hierarchical Poisson model can predict the claims frequency.