Sequential adaptive design for jump regression estimation

Chiwoo Park, Peihua Qiu, Jennifer Carpena-Núñez, Rahul Rao, Michael Susner, Benji Maruyama

doi:10.1080/24725854.2021.1988770

Abstract

Selecting input variables or design points for statistical models has been of great interest in adaptive design and active learning. Motivated by two scientific examples, this article presents a strategy for selecting the design points of a regression model when the underlying regression function is discontinuous. The first example concerns accelerating imaging speed in high-resolution material imaging, and the second concerns using sequential design to map a chemical phase diagram. In both examples, the underlying regression functions have discontinuities, and thus many existing design optimization approaches cannot be used, as they assume a continuous regression function. Although some existing adaptive design strategies developed from treed regression models can handle discontinuities, the related Bayesian model estimation approaches rely on computationally expensive Markov chain Monte Carlo algorithms for posterior inference and the subsequent design point selection. These algorithms may not be applicable to the first motivating example, which requires the computation to be faster than the original imaging speed. In addition, the treed models are based on domain partitioning and are inefficient when the discontinuities occur at complex sub-domain boundaries. In this article, we propose a simple and effective adaptive design strategy for regression analysis with discontinuities. After some statistical properties of the estimated regression model are derived for the fixed-design case, a new criterion for sequentially selecting the design points is suggested. The suggested sequential design selection procedure is then evaluated in a comprehensive simulation study and demonstrated on the two motivating examples.

Full Text