A note on auxiliary mixture sampling for Bayesian Poisson models

Abstract

Bayesian hierarchical Poisson models are an essential tool for analyzing count data. However, designing efficient algorithms to sample from the posterior distribution of the target parameters remains a challenging task. Auxiliary mixture sampling algorithms have been proposed for this purpose. They involve two steps of data augmentation: the first leverages the theory of Poisson processes, and the second approximates the residual distribution of the resulting model through a mixture of Gaussian distributions. In this way, an approximate Gibbs sampler can be implemented. This strategy is particularly beneficial for latent Gaussian models, as it allows one to exploit the sparsity of the precision matrix associated with the random effects and to efficiently incorporate linear constraints. In this paper, we focus on the accuracy of the approximation step, highlighting scenarios where the mixture fails to represent the true underlying distribution accurately, leading to a lack of convergence in the algorithm. We outline key features to monitor in order to assess whether the approximation performs as intended. Building on this, we propose a robust version of the auxiliary mixture sampling algorithm. Our approach includes mechanisms for detecting approximation failures and introduces an enhanced approximation of the right tail of the auxiliary variable distribution, supplemented by a Metropolis-Hastings correction step when needed. Finally, we evaluate the proposed algorithm together with the original mixture sampling algorithms on both simulated and real datasets.
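
As a rough illustration of the two augmentation steps described in the abstract, the sketch below is not the authors' algorithm. It assumes the simplest textbook setting: a Poisson process with rate lam has exponential inter-arrival times tau, so -log(tau) = log(lam) + eps, where eps follows a standard Gumbel distribution. The function names, the rate 3.5, and the threshold 5.0 are hypothetical choices made for this example. A single moment-matched Gaussian stands in for the mixture, only to show why the right tail of the residual is hard to approximate.

```python
import math
import numpy as np

EULER_GAMMA = 0.5772156649015329  # mean of the standard Gumbel distribution


def simulate_residuals(lam, n, rng):
    """Draw inter-arrival times tau ~ Exp(lam) and return eps = -log(tau) - log(lam)."""
    xi = rng.exponential(scale=1.0, size=n)  # unit-rate exponential draws
    tau = xi / lam                           # inter-arrival times of a rate-lam process
    return -np.log(tau) - math.log(lam)      # residual of the linearised model


def gumbel_right_tail(t):
    """Exact P(eps > t) for the standard Gumbel residual."""
    return 1.0 - math.exp(-math.exp(-t))


def gaussian_right_tail(t, mean, sd):
    """P(Z > t) for a Gaussian with the given mean and standard deviation."""
    return 0.5 * math.erfc((t - mean) / (sd * math.sqrt(2.0)))


rng = np.random.default_rng(0)
eps = simulate_residuals(lam=3.5, n=200_000, rng=rng)

# The residual distribution does not depend on lam: mean gamma, variance pi^2 / 6.
mean_eps = float(eps.mean())
var_eps = float(eps.var())

# Compare the exact right tail with a single moment-matched Gaussian.
gumbel_tail = gumbel_right_tail(5.0)
gauss_tail = gaussian_right_tail(5.0, EULER_GAMMA, math.pi / math.sqrt(6.0))
print(mean_eps, var_eps, gumbel_tail, gauss_tail)
```

Under these assumptions, the single Gaussian assigns far less mass to large residuals than the exact distribution does. A multi-component mixture narrows that gap, and the abstract's enhanced right-tail approximation and Metropolis-Hastings correction address the cases where a gap remains.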

Similar Papers
  • Research Article
  • Citations: 3
  • 10.1016/j.jspi.2020.01.007
On classical and Bayesian asymptotics in stochastic differential equations with random effects having mixture normal distributions
  • Feb 3, 2020
  • Journal of Statistical Planning and Inference
  • Trisha Maitra + 1 more

  • Research Article
  • Citations: 59
  • 10.1002/2013wr014372
Probabilistic prediction of cyanobacteria abundance in a Korean reservoir using a Bayesian Poisson model
  • Mar 1, 2014
  • Water Resources Research
  • Yoonkyung Cha + 4 more

There have been increasing reports of harmful algal blooms (HABs) worldwide. However, the factors that influence cyanobacteria dominance and HAB formation can be site‐specific and idiosyncratic, making prediction challenging. The drivers of cyanobacteria blooms in Lake Paldang, South Korea, the summer climate of which is strongly affected by the East Asian monsoon, may differ from those in well‐studied North American lakes. Using the observational data sampled during the growing season in 2007–2011, a Bayesian hurdle Poisson model was developed to predict cyanobacteria abundance in the lake. The model allowed cyanobacteria absence (zero count) and nonzero cyanobacteria counts to be modeled as functions of different environmental factors. The model predictions demonstrated that the principal factor that determines the success of cyanobacteria was temperature. Combined with high temperature, increased residence time indicated by low outflow rates appeared to increase the probability of cyanobacteria occurrence. A stable water column, represented by low suspended solids, and high temperature were the requirements for high abundance of cyanobacteria. Our model results had management implications; the model can be used to forecast cyanobacteria watch or alert levels probabilistically and develop mitigation strategies of cyanobacteria blooms.

  • Research Article
  • 10.3389/fpubh.2025.1563392
Daily meal frequency and its associated factors among children aged 6–23 months in Ethiopia: a Bayesian hierarchical Poisson model
  • Jul 18, 2025
  • Frontiers in Public Health
  • Dejen Kahsay Asgedom + 2 more

Background: Inadequate feeding frequency during the early childhood period is responsible for more than two-thirds of global child deaths. Evidence on the rate of daily meal frequency among infants and young children at the national level is crucial for developing targeted interventions to improve feeding practices. Hence, this study aimed to identify factors associated with the rate of daily meal frequency (DMF) among children aged 6–23 months in Ethiopia. Methods: We retrieved secondary data from the Kids record (KR) of the Ethiopian Mini Demographic and Health Survey (MDHS) dataset. A total of 1,264 children aged 6–23 months were included in the study. A Bayesian hierarchical Poisson model was employed. Model convergence was checked via Rhat, effective sample size, density plots, trace plots, and autocorrelation plots, and all the results were confirmed. We used the widely applicable information criterion (WAIC) and leave-one-out cross-validation (LOO) for model comparison. The model parameters were estimated via Hamiltonian Monte Carlo (HMC), a Markov chain Monte Carlo (MCMC) simulation technique, and its extension, the no-U-turn sampler (NUTS). An adjusted incidence rate ratio (AIRR) with a 95% credible interval (CrI) in the multivariable model was used to select variables that had a significant association with the rate of daily meal frequency. The data were analyzed via R software version 4.3.1. Results: The mean and standard deviation of the DMF were 3.36 and 1.60, respectively. The rate of DMF was 1.17 times greater (AIRR = 1.17, 95% CrI: 0.997, 1.381) in children whose mothers had a secondary/higher educational level than in those whose mothers had no education. Children currently being breastfed had a 10% lower rate of DMF (AIRR = 0.88, 95% CrI: 0.798, 0.979) than those who were not currently breastfeeding. Compared with children between the ages of 6–8 months, those between 9 and 11 months (AIRR = 1.55, 95% CrI: 1.374, 1.754), 12–17 months (AIRR = 1.72, 95% CrI: 1.543, 1.911), and 18–23 months (AIRR = 1.90, 95% CrI: 1.692, 2.125) had 55, 72 and 90% higher rates of DMF, respectively. Living in the Afar region (IRR = 0.77, 95% CrI: 0.615, 0.982), Somalia (AIRR = 0.83, 95% CrI: 0.682, 1.01), Benishangul (AIRR = 0.8, 95% CrI: 0.639, 0.994), Southern Nations, Nationalities, and Peoples' Region (SNNPR) (AIRR = 0.73, 95% CrI: 0.596, 0.894), and (AIRR = 0.73, 95% CrI: 0.572, 0.925) decreased the daily meal frequency by 33, 17, 20, 27 and 27%, respectively, compared with that of children from Tigray. Conclusion and recommendation: The rate of DMF was low in Ethiopia and exhibited a significant clustering pattern across the country. These findings stress the need for tailored interventions addressing regional inequities, promoting age-specific nutrition, supporting maternal education, and empowering working women to improve children’s nutritional intake and ensure more equitable access to meals across Ethiopia.

  • Research Article
  • Citations: 25
  • 10.1177/0962280211414853
Bayesian hierarchical Poisson models with a hidden Markov structure for the detection of influenza epidemic outbreaks
  • Aug 25, 2011
  • Statistical Methods in Medical Research
  • D Conesa + 3 more

Considerable effort has been devoted to the development of statistical algorithms for the automated monitoring of influenza surveillance data. In this article, we introduce a framework of models for the early detection of the onset of an influenza epidemic which is applicable to different kinds of surveillance data. In particular, the process of the observed cases is modelled via a Bayesian Hierarchical Poisson model in which the intensity parameter is a function of the incidence rate. The key point is to consider this incidence rate as a normal distribution in which both parameters (mean and variance) are modelled differently, depending on whether the system is in an epidemic or non-epidemic phase. To do so, we propose a hidden Markov model in which the transition between both phases is modelled as a function of the epidemic state of the previous week. Different options for modelling the rates are described, including the option of modelling the mean at each phase as autoregressive processes of order 0, 1 or 2. Bayesian inference is carried out to provide the probability of being in an epidemic state at any given moment. The methodology is applied to various influenza data sets. The results indicate that our methods outperform previous approaches in terms of sensitivity, specificity and timeliness.

  • Research Article
  • Citations: 17
  • 10.1080/02640414.2015.1039462
Relative age and birthplace effect in Japanese professional sports: a quantitative evaluation using a Bayesian hierarchical Poisson model
  • Apr 28, 2015
  • Journal of Sports Sciences
  • Hideaki Ishigami

Relative age effect (RAE) in sports has been well documented. Recent studies investigate the effect of birthplace in addition to the RAE. The first objective of this study was to show the magnitude of the RAE in two major professional sports in Japan, baseball and soccer. Second, we examined the birthplace effect and compared its magnitude with that of the RAE. The effect sizes were estimated using a Bayesian hierarchical Poisson model with the number of players as dependent variable. The RAEs were 9.0% and 7.7% per month for soccer and baseball, respectively. These estimates imply that children born in the first month of a school year have about three times greater chance of becoming a professional player than those born in the last month of the year. Over half of the difference in likelihoods of becoming a professional player between birthplaces was accounted for by weather conditions, with the likelihood decreasing by 1% per snow day. An effect of population size was not detected in the data. By investigating different samples, we demonstrated that using quarterly data leads to underestimation and that the age range of sampled athletes should be set carefully.

  • Research Article
  • 10.1016/0022-4375(83)90029-4
Relationships between road accidents and hourly traffic flow — II. Probabilistic approach
  • Sep 1, 1983
  • Journal of Safety Research
  • A Cedar

  • Research Article
  • Citations: 127
  • 10.1016/j.aap.2013.04.025
Multi-level Bayesian analyses for single- and multi-vehicle freeway crashes
  • May 10, 2013
  • Accident Analysis & Prevention
  • Rongjie Yu + 1 more

  • Research Article
  • Citations: 89
  • 10.1016/j.csda.2006.10.006
Auxiliary mixture sampling with applications to logistic models
  • Nov 2, 2006
  • Computational Statistics & Data Analysis
  • Sylvia Frühwirth-Schnatter + 1 more

  • Research Article
  • Citations: 20
  • 10.1002/sim.5457
Bayesian spatial modeling of HIV mortality via zero‐inflated Poisson models
  • Jul 16, 2012
  • Statistics in Medicine
  • Muzaffer Musal + 1 more

In this paper, we investigate the effects of poverty and inequality on the number of HIV-related deaths in 62 New York counties via Bayesian zero-inflated Poisson models that exhibit spatial dependence. We quantify inequality via the Theil index and poverty via the ratios of two Census 2000 variables, the number of people under the poverty line and the number of people for whom poverty status is determined, in each Zip Code Tabulation Area. The purpose of this study was to investigate the effects of inequality and poverty in addition to spatial dependence between neighboring regions on HIV mortality rate, which can lead to improved health resource allocation decisions. In modeling county-specific HIV counts, we propose Bayesian zero-inflated Poisson models whose rates are functions of both covariate and spatial/random effects. To show how the proposed models work, we used three different publicly available data sets: TIGER Shapefiles, Census 2000, and mortality index files. In addition, we introduce parameter estimation issues of Bayesian zero-inflated Poisson models and discuss MCMC method implications.

  • Research Article
  • Citations: 19
  • 10.1177/1471082x14524676
A zero-inflated overdispersed hierarchical Poisson model
  • Aug 26, 2014
  • Statistical Modelling
  • Wondwosen Kassahun + 4 more

Count data are most commonly modeled using the Poisson model, or by one of its many extensions. Such extensions are needed for a variety of reasons: (1) a hierarchical structure in the data, e.g., due to clustering, the collection of repeated measurements of the outcome, etc.; (2) the occurrence of overdispersion (or underdispersion), meaning that the variability encountered in the data is not equal to the mean, as prescribed by the Poisson distribution; and (3) the occurrence of extra zeros beyond what a Poisson model allows. The first issue is often accommodated through the inclusion of random subject-specific effects. Though not always, one conventionally assumes such random effects to be normally distributed. Overdispersion is often dealt with through a model developed for this purpose, such as, for example, the negative-binomial model for count data. This can be conceived through a random Poisson parameter. Excess zeros are regularly accounted for using so-called zero-inflated models, which combine either a Poisson or negative-binomial model with an atom at zero. The novelty of this article is that it combines all these features. The work builds upon the modelling framework defined by Molenberghs et al. (2010) in which clustering and overdispersion are accommodated through two separate sets of random effects in a generalized linear model.

  • Research Article
  • Citations: 55
  • 10.1371/journal.pcbi.0020006
Seriation in Paleontological Data Using Markov Chain Monte Carlo Methods
  • Feb 1, 2006
  • PLoS Computational Biology
  • Kai Puolamäki + 2 more

Given a collection of fossil sites with data about the taxa that occur in each site, the task in biochronology is to find good estimates for the ages or ordering of sites. We describe a full probabilistic model for fossil data. The parameters of the model are natural: the ordering of the sites, the origination and extinction times for each taxon, and the probabilities of different types of errors. We show that the posterior distributions of these parameters can be estimated reliably by using Markov chain Monte Carlo techniques. The posterior distributions of the model parameters can be used to answer many different questions about the data, including seriation (finding the best ordering of the sites) and outlier detection. We demonstrate the usefulness of the model and estimation method on synthetic data and on real data on large late Cenozoic mammals. As an example, for the sites with large number of occurrences of common genera, our methods give orderings, whose correlation with geochronologic ages is 0.95.

  • Research Article
  • Citations: 2
  • 10.1109/access.2022.3209232
Bayesian Inference for Thermal Model of Synchronous Generator—Part I: Parameter Estimation
  • Jan 1, 2022
  • IEEE Access
  • Madhusudhan Pandey + 1 more

Due to the increasing injection of intermittent power sources (solar+wind) into a common grid, dispatchable sources such as hydro power should be able to help reduce the variability in load and the variability in generation caused by the intermittent sources. A hydro generator should be able to operate short-term beyond its thermal capability limit. This requires the monitoring of internal temperatures in the hydro generator. In this paper, a thermal model of an air-cooled synchronous generator is presented, emphasizing the various aspects of parameter estimation and identifiability using Bayesian inference. Inferences are drawn from the posterior distributions of the parameters and initial conditions, dispersion (spreading) of particles and sampling efficiency, practical parameter identifiability, and model mismatch with experiments. Results show extremely narrow parameter distributions. It is early to generalize about the posterior distribution of air-related and metal-related parameters of the air-cooled synchronous generator based on the single experimental data presented here.

  • Research Article
  • Citations: 89
  • 10.2136/sssaj2002.1740
Validity of First‐Order Approximations to Describe Parameter Uncertainty in Soil Hydrologic Models
  • Nov 1, 2002
  • Soil Science Society of America Journal
  • Jasper A Vrugt + 1 more

Model nonlinearity and parameter interdependence violate the use of a first‐order approximation to obtain exact confidence intervals of parameters in soil hydrologic models. In this study, the posterior distribution of parameters in soil water retention and hydraulic conductivity functions is examined using observed water retention data and a laboratory transient multistep outflow experiment. Parameter uncertainties obtained with traditional first‐order approximations and uniform grid sampling strategies were compared with those obtained using the Metropolis algorithm, a Markov Chain Monte Carlo (MCMC) sampler. A diagnostic measure, based on multiple sequences generated in parallel, was used to test whether convergence of the Metropolis sampler to the posterior distribution had been achieved. Most significantly, as the Metropolis algorithm can cope with rough response surfaces generated by the objective function used, it not only successfully infers the multivariate posterior probability distribution of the model parameters, but also provides valuable insights in parameter interdependence in the full parameter space.

  • Research Article
  • Citations: 3
  • 10.1016/j.sigpro.2019.02.020
An augmented sequential MCMC procedure for particle based learning in dynamical systems
  • Feb 19, 2019
  • Signal Processing
  • Muhammad Javvad Ur Rehman + 2 more

  • Conference Article
  • 10.1109/apct55107.2022.00010
Characterization of Parameter Uncertainty in SWAT Model using MCMC Bayesian Framework: The Case of Naryn River Basin
  • Jan 1, 2022
  • C Chen + 3 more

Analysis of parameter uncertainty in distributed watershed models is a worldwide challenge. In this study, the Differential Evolution Adaptive Metropolis (DREAM) technique is developed to analyse the uncertainty of Soil and Water Assessment Tool (SWAT) model parameters. SWAT provides the basic hydrologic simulation, and the DREAM algorithm is employed to approximate the posterior distributions of model parameters with Bayesian inference. DREAM is then used to capture the uncertainty and implications of parameters in the Naryn River Basin (in Central Asia). The posterior distribution of parameters is obtained. Results show that: (i) the posterior sampling results of the DREAM algorithm are satisfactory; (ii) concentrated precipitation during the rainy season generates more runoff; (iii) more precipitation exists in the form of snowfall.
