A key goal of exoplanet spectroscopy is to measure atmospheric properties, such as abundances of chemical species, in order to connect them to our understanding of atmospheric physics and planet formation. In this new era of high-quality JWST data, it is paramount that these measurement methods are robust. When comparing atmospheric models to observations, multiple candidate models may produce reasonable fits to the data. Typically, conclusions are reached by selecting the best-performing model according to some metric. This approach ignores model uncertainty in favor of specific model assumptions, potentially leading to measured atmospheric properties that are overconfident and/or incorrect. In this paper, we compare three ensemble methods for addressing model uncertainty by combining posterior distributions from multiple analyses: Bayesian model averaging, a variant of Bayesian model averaging using leave-one-out predictive densities, and stacking of predictive distributions. We demonstrate these methods by fitting the Hubble Space Telescope (HST) + Spitzer transmission spectrum of the hot Jupiter HD 209458b using models with different cloud and haze prescriptions. All of our ensemble methods lead to uncertainties on retrieved parameters that are larger but more realistic and consistent with physical and chemical expectations. Because previous analyses have typically not accounted for model uncertainty, uncertainties of retrieved parameters from HST spectra have likely been underreported. We recommend stacking as the most robust model combination method. Our methods can be used to combine results from independent retrieval codes and from different models within one code. They are also widely applicable to other exoplanet analysis processes, such as combining results from different data reductions.
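
To make the three combination schemes named above concrete, the following is a minimal sketch of how model weights might be computed and then used to mix posterior samples. It is not the implementation used in the paper. All function names and input layouts are hypothetical: `log_evidence` is assumed to hold one log marginal likelihood per model (with equal prior model probabilities), and `lpd_loo` is assumed to be an array of shape (n_points, n_models) holding pointwise leave-one-out log predictive densities. The stacking weights are found here with a simple fixed-point (EM-style) iteration on the log score, chosen only to keep the sketch dependency-free; a constrained optimizer would serve equally well.

```python
import numpy as np


def _normalize_log_weights(log_w):
    # Turn unnormalized log weights into weights that sum to 1,
    # subtracting the maximum first for numerical stability.
    log_w = np.asarray(log_w, dtype=float)
    w = np.exp(log_w - log_w.max())
    return w / w.sum()


def bma_weights(log_evidence):
    # Bayesian model averaging: weights proportional to the model evidence
    # (assumes equal prior probability for every model).
    return _normalize_log_weights(log_evidence)


def pseudo_bma_weights(lpd_loo):
    # BMA variant: replace the log evidence with the summed
    # leave-one-out log predictive density of each model.
    elpd = np.asarray(lpd_loo, dtype=float).sum(axis=0)
    return _normalize_log_weights(elpd)


def stacking_weights(lpd_loo, n_iter=2000):
    # Stacking: choose simplex weights that maximize
    # sum_i log( sum_k w_k * p_k(y_i | y_-i) ).
    # Assumption: a fixed-point (EM-style) update is used as the optimizer.
    lpd_loo = np.asarray(lpd_loo, dtype=float)
    # Rescaling each row leaves the update unchanged and avoids underflow.
    dens = np.exp(lpd_loo - lpd_loo.max(axis=1, keepdims=True))
    w = np.full(dens.shape[1], 1.0 / dens.shape[1])
    for _ in range(n_iter):
        resp = w * dens
        resp /= resp.sum(axis=1, keepdims=True)
        w = resp.mean(axis=0)
    return w


def mix_posteriors(samples_per_model, weights, n_draws, seed=0):
    # Build a combined posterior by resampling each model's posterior
    # samples in proportion to its weight.
    rng = np.random.default_rng(seed)
    counts = rng.multinomial(n_draws, weights)
    draws = []
    for samples, count in zip(samples_per_model, counts):
        samples = np.asarray(samples)
        draws.append(samples[rng.integers(0, len(samples), size=count)])
    return np.concatenate(draws, axis=0)
```

Given posterior samples of the same parameters from each retrieval, a call such as `mix_posteriors(samples_per_model, stacking_weights(lpd_loo), n_draws=10000)` would return samples from the weighted mixture, whose spread reflects both within-model and between-model uncertainty.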
