Abstract

Scientific records of temperature and precipitation have been kept for several hundred years, but for many areas, only a shorter record exists. To understand climate change, there is a need for rigorous statistical reconstructions of the paleoclimate using proxy data. Paleoclimate proxy data are often sparse, noisy, indirect measurements of the climate process of interest, making each proxy uniquely challenging to model statistically. We reconstruct spatially explicit temperature surfaces from sparse and noisy measurements recorded at historical United States military forts and other observer stations from 1820 to 1894. One common method for reconstructing the paleoclimate from proxy data is principal component regression (PCR). With PCR, one learns a statistical relationship between the paleoclimate proxy data and a set of climate observations that are used as patterns for potential reconstruction scenarios. We explore PCR in a Bayesian hierarchical framework, extending classical PCR in a variety of ways. First, we model the latent principal components probabilistically, accounting for measurement error in the observational data. Next, we extend our method to better accommodate outliers that occur in the proxy data. Finally, we explore alternatives to the truncation of lower-order principal components using different regularization techniques. One fundamental challenge in paleoclimate reconstruction efforts is the lack of out-of-sample data for predictive validation. Cross-validation is of potential value, but is computationally expensive and potentially sensitive to outliers in sparse data scenarios. To overcome the limitations that a lack of out-of-sample records presents, we test our methods using a simulation study, applying proper scoring rules including a computationally efficient approximation to leave-one-out cross-validation using the log score to validate model performance. The result of our analysis is a spatially explicit reconstruction of spatio-temporal temperature from a very sparse historical record.
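
As a rough illustration of the classical PCR idea described in the abstract, the following minimal sketch learns spatial patterns from a complete analog data set and regresses a sparse set of observations on those patterns to fill in the full field. It is not the authors' Bayesian implementation; the function name `pcr_reconstruct` and its arguments are hypothetical, and plain least squares stands in for the hierarchical model.

```python
import numpy as np

def pcr_reconstruct(X, y_obs, obs_idx, k):
    """Classical PCR sketch (hypothetical helper, not the paper's code).

    X       : (n_years, n_sites) complete analog temperature fields
    y_obs   : observations available at the sites listed in obs_idx
    obs_idx : integer indices of the observed sites
    k       : number of leading principal components to keep
    """
    mu = X.mean(axis=0)
    # Rows of Vt are the spatial patterns (eigenvectors) of the centered analog data.
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    V = Vt[:k].T  # (n_sites, k)
    # Regress the sparse, centered observations on the patterns at the observed sites.
    coefs, *_ = np.linalg.lstsq(V[obs_idx], y_obs - mu[obs_idx], rcond=None)
    # Project the fitted coefficients back onto every site.
    return mu + V @ coefs
```

Calling `pcr_reconstruct(X, y_obs, obs_idx, k=2)` returns one value per analog site. Truncating at `k` components is the step the abstract proposes to replace with regularization.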

Highlights

  • There is a need for accurate estimates of paleoclimate, especially temperature and precipitation, to better understand how climate has changed in the past

  • Student’s t specifications of the principal component regression (PCR) and probabilistic principal component regression (pPCR) models accommodate potentially outlying measurements of mid-day July temperature in the historical observer data that may have arisen from non-standardized data collection

  • In our paleoclimate reconstruction method, inclusion of lower-order principal components is important, especially if there are climate signals that are slowly varying or show up occasionally. If these uncommon processes appear in the lesser eigenvectors of the current-era analog data, these signals would be discarded by truncation as high-frequency noise
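
The contrast between truncating lower-order components and regularizing them can be sketched as follows. Ridge-type shrinkage is used here only as one familiar example of regularization; the source does not say which techniques the paper adopts, and both function names are hypothetical.

```python
import numpy as np

def truncated_coefs(A, r, k):
    """Classical PCR: fit only the first k patterns, force the rest to zero."""
    coefs = np.zeros(A.shape[1])
    coefs[:k] = np.linalg.lstsq(A[:, :k], r, rcond=None)[0]
    return coefs

def ridge_coefs(A, r, lam):
    """Assumed alternative: keep every pattern but shrink all coefficients.

    A   : (n_obs, p) patterns evaluated at the observed sites
    r   : (n_obs,) centered observations
    lam : positive penalty; larger values shrink more strongly
    """
    p = A.shape[1]
    # For lam > 0 the system is positive definite, even when n_obs < p.
    return np.linalg.solve(A.T @ A + lam * np.eye(p), A.T @ r)
```

Under truncation, any signal carried by components beyond `k` is lost entirely. Under shrinkage, those components stay in the model with damped coefficients.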


Summary

Introduction

There is a need for accurate estimates of paleoclimate, especially temperature and precipitation, to better understand how climate has changed in the past. Aligning the data sources in time was more complicated because the historical observer station data are highly irregular, whereas the current-era analog data are monthly mean mid-day temperatures. Data are generally represented by Latin letters and parameters are written in Greek letters. Using this notation, the linear mixed model for estimating daily historical observer mean mid-day July temperature is $y_{itj}(s) = l_i \beta + b(s) \alpha + \eta_{it} + \eta_i + \eta_t + \varepsilon_{itj}(s)$. To facilitate parameter estimation in the presence of sparse data, the calibration model borrows strength among days, sites, and years within the historical observer data for the month of July, reducing the influence of measurement error and improving prediction of the mid-day diurnal temperature curve.
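
To make the additive structure of the linear mixed model concrete, the sketch below simulates data from a model of the same form. The summary does not define the terms, so every interpretation here is an assumption: i indexes sites, t years, and j days; s is the hour of observation; l_i is a single site covariate; and b(s) is an intercept plus one diurnal harmonic. All numeric values are invented for illustration.

```python
import numpy as np

def simulate_observer_temps(n_sites=5, n_years=4, n_days=31, seed=0):
    """Simulate y_itj(s) = l_i*beta + b(s)*alpha + eta_it + eta_i + eta_t + eps."""
    rng = np.random.default_rng(seed)
    beta = -0.8                         # hypothetical covariate effect
    alpha = np.array([25.0, 3.0, 1.0])  # hypothetical basis coefficients
    l = rng.uniform(-1.0, 1.0, n_sites)                # assumed site covariate l_i
    eta_i = rng.normal(0.0, 1.0, n_sites)              # site random effect
    eta_t = rng.normal(0.0, 0.5, n_years)              # year random effect
    eta_it = rng.normal(0.0, 0.3, (n_sites, n_years))  # site-by-year effect
    s = rng.uniform(10.0, 16.0, (n_sites, n_years, n_days))  # observation hour
    w = 2.0 * np.pi * s / 24.0
    # Assumed basis b(s): intercept plus one harmonic of the 24-hour cycle.
    diurnal = alpha[0] + alpha[1] * np.cos(w) + alpha[2] * np.sin(w)
    mean = (l[:, None, None] * beta + diurnal + eta_it[:, :, None]
            + eta_i[:, None, None] + eta_t[None, :, None])
    y = mean + rng.normal(0.0, 1.0, mean.shape)  # measurement error eps_itj(s)
    return y, mean
```

Because the site, year, and site-by-year effects are shared across many daily readings, a fit to such data can pool information across days, sites, and years, which is the sense in which the calibration model borrows strength.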

Modeling outline
Principal component regression
Probabilistic principal component regression
Robust regression
Posterior distribution
Scoring rules
Simulation
Observer station data reconstruction
Findings
Conclusions