Simultaneous transformation and rounding (STAR) models for integer-valued data

Daniel R Kowal, Antonio Canale

doi:10.1214/20-ejs1707

Abstract

We propose a simple yet powerful framework for modeling integer-valued data, such as counts, scores, and rounded data. The data-generating process is defined by Simultaneously Transforming and Rounding (STAR) a continuous-valued process, which produces a flexible family of integer-valued distributions capable of modeling zero-inflation, bounded or censored data, and over- or underdispersion. The transformation is modeled as unknown for greater distributional flexibility, while the rounding operation ensures a coherent integer-valued data-generating process. An efficient MCMC algorithm is developed for posterior inference and provides a mechanism for adaptation of successful Bayesian models and algorithms for continuous data to the integer-valued data setting. Using the STAR framework, we design a new Bayesian Additive Regression Tree model for integer-valued data, which demonstrates impressive predictive distribution accuracy for both synthetic data and a large healthcare utilization dataset. For interpretable regression-based inference, we develop a STAR additive model, which offers greater flexibility and scalability than existing integer-valued models. The STAR additive model is applied to study the recent decline in Amazon river dolphins.

Highlights

MSC 2010 subject classifications: Primary 62F15, 62G08; secondary 62M20
We focus on conditionally Gaussian regression models, but the STAR framework applies more broadly.
Post hoc rounding ignores the discrete nature of the data in model-fitting, and introduces a disconnect between the fitted model and the model used for prediction. STAR avoids this issue, and maintains the benefits of using well-known models for continuous data while producing a coherent integer-valued predictive distribution.

Summary

Simultaneously transforming and rounding

The STAR framework of (1)-(2) defines an integer-valued process for y. The transformation g, which we model as unknown, endows the integer-valued process y with greater distributional flexibility, yet leaves model (3) unchanged. This construction allows seamless integration of Bayesian models and algorithms for continuous data of the common form (3) into the integer-valued STAR framework, with efficient posterior inference available via a general MCMC algorithm (Section 2.3). Post hoc rounding ignores the discrete nature of the data in model-fitting, and introduces a disconnect between the fitted model and the model used for prediction. STAR avoids this issue, and maintains the benefits of using well-known models for continuous data while producing a coherent integer-valued predictive distribution.
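
As a rough illustration of the transform-then-round construction described above, the following Python sketch simulates integer-valued data and evaluates the implied probability mass function. Several choices here are assumptions made for illustration and are not taken from this page: the transformation is fixed to g(t) = log(t) (the paper treats g as unknown), the rounding operator is taken to be the floor function with negative values mapped to 0, the continuous model is a plain Gaussian with constant mean and variance, and all function names are hypothetical.

```python
import math
import random


def g_inverse(z):
    """Inverse of the ASSUMED transformation g(t) = log(t)."""
    return math.exp(z)


def round_down(y_star):
    """ASSUMED rounding operator: floor, with negative values mapped to 0."""
    return max(0, math.floor(y_star))


def simulate_star(mu, sigma, n, rng):
    """Draw n integers: z* ~ N(mu, sigma^2), y* = g^{-1}(z*), y = round_down(y*)."""
    return [round_down(g_inverse(rng.gauss(mu, sigma))) for _ in range(n)]


def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def star_pmf(j, mu, sigma):
    """P(y = j) = P(g(j) <= z* < g(j + 1)); with g = log, g(0) is -infinity."""
    upper = normal_cdf((math.log(j + 1) - mu) / sigma)
    lower = 0.0 if j == 0 else normal_cdf((math.log(j) - mu) / sigma)
    return upper - lower


if __name__ == "__main__":
    rng = random.Random(0)
    draws = simulate_star(mu=1.0, sigma=0.5, n=20000, rng=rng)
    # Compare the empirical frequency of zeros with the model probability.
    print(draws.count(0) / len(draws), star_pmf(0, 1.0, 0.5))
```

Because the probability of each integer is the Gaussian mass of an interval on the transformed scale, the fitted continuous model and the integer-valued predictive distribution are the same object, which is the coherence that post hoc rounding lacks. Under these assumptions, all of the Gaussian mass below g(1) = 0 is assigned to y = 0, which gives one simple route to zero-inflation.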

The transformation g

Model properties

Posterior inference

Regression modeling with STAR

Additive models

Bayesian additive regression trees

Simulation studies

Linear mean functions

Nonlinear mean functions

Predicting the demand for healthcare utilization

Modeling the decline in Amazon river dolphins

Evaluating point accuracy for synthetic data

Model and MCMC diagnostics for the dolphins data

Results


Journal: Electronic Journal of Statistics	Publication Date: Jan 1, 2020
Citations: 14	License type: cc-by
