Abstract

It has been almost 50 years since York published an exact and general solution for the best-fit straight line to independent points with normally distributed errors in both x and y. York's solution is highly cited in the geophysical literature but almost unknown outside of it, so that there has been no ebb in the tide of books and papers wrestling with the problem. Much of the post-1969 literature on straight-line fitting has sown confusion not merely by its content but by its very existence. The optimal least-squares fit is already known; the problem is already solved. Here we introduce the non-specialist reader to York's solution and demonstrate its application in the interesting case of the isotopic mixing line, an analytical tool widely used to determine the isotopic signature of trace gas sources for the study of biogeochemical cycles. The most commonly known linear regression methods – ordinary least-squares regression (OLS), geometric mean regression (GMR), and orthogonal distance regression (ODR) – have each been recommended as the best method for fitting isotopic mixing lines. In fact, OLS, GMR, and ODR are all special cases of York's solution that are valid only under particular measurement conditions, and those conditions do not hold in general for isotopic mixing lines. Using Monte Carlo simulations, we quantify the biases in OLS, GMR, and ODR under various conditions and show that York's general – and convenient – solution is always the least biased.

Highlights

  • A common analytical task in the physical sciences is to find the true straight-line relationship underlying independently measured points with normally distributed measurement errors in both the ordinate y and abscissa x.

  • If the points are independent of one another and their errors are normally distributed, the problem can be treated by least-squares estimation (LSE), which is equivalent to maximum likelihood estimation (MLE) (Myung, 2003) in this situation.

  • We have omitted our Miller–Tans results from Table 1 because there were no significant differences between the Miller–Tans and Keeling results for the York and ordinary least-squares regression (OLS) methods and because the geometric mean regression (GMR) results were very poor for both plot types.

Summary

Introduction

A common analytical task in the physical sciences is to find the true straight-line relationship underlying independently measured points with normally distributed measurement errors in both the ordinate y and abscissa x. The literature on this topic is profuse, but much of it was outdated even before it was written. The nature of the measurement error might vary from point to point (a situation known as heteroscedasticity), and the error in x_i might be correlated with that in y_i, where the subscript i specifies a measurement pair (i.e., a data point). Such correlation can arise, for example, if both x and y are derived from the same quantity or measured by the same apparatus, and can be described by a (Pearson's) correlation coefficient r that might vary from point to point. Most of the literature on straight-line fitting concerns least-squares estimation (LSE), as it is appropriate to the vast majority of straight-line fitting problems.
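To make the setting above concrete, the following is a minimal sketch of an iterative fit that accepts point-by-point errors in x and y and a point-by-point error correlation r. It is not taken from this text: it follows the iterative form of York's solution as we understand it from York et al. (2004), and the function name `york_fit`, the argument names `sx`, `sy`, and `r`, the OLS starting slope, and the stopping rule are all illustrative assumptions.

```python
import numpy as np


def york_fit(x, y, sx, sy, r=0.0, tol=1e-12, max_iter=100):
    """Return (intercept, slope) of a straight line fitted with errors in x and y.

    sx, sy: standard deviations of the x and y errors (scalar or per point),
            assumed strictly positive.
    r:      correlation between the x and y errors (scalar or per point).
    Hypothetical helper for illustration; standard errors are not computed.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sx = np.broadcast_to(np.asarray(sx, dtype=float), x.shape)
    sy = np.broadcast_to(np.asarray(sy, dtype=float), x.shape)
    r = np.broadcast_to(np.asarray(r, dtype=float), x.shape)

    wx = 1.0 / sx**2           # weight of each x measurement
    wy = 1.0 / sy**2           # weight of each y measurement
    alpha = np.sqrt(wx * wy)

    def centroid(b):
        # Combined weight of each point for slope b, and the weighted means.
        W = wx * wy / (wx + b**2 * wy - 2.0 * b * r * alpha)
        return W, np.sum(W * x) / np.sum(W), np.sum(W * y) / np.sum(W)

    b = np.polyfit(x, y, 1)[0]  # assumed starting guess: the OLS slope
    for _ in range(max_iter):
        W, xbar, ybar = centroid(b)
        U, V = x - xbar, y - ybar
        beta = W * (U / wy + b * V / wx - (b * U + V) * r / alpha)
        b_new = np.sum(W * beta * V) / np.sum(W * beta * U)
        converged = abs(b_new - b) <= tol * max(1.0, abs(b_new))
        b = b_new
        if converged:
            break

    # Recompute the centroid with the final slope to get the intercept.
    _, xbar, ybar = centroid(b)
    return ybar - b * xbar, b
```

In this sketch, letting `sx` become negligible relative to a constant `sy` (with `r = 0`) makes every point's weight equal and the slope update collapse to the ordinary least-squares expression, which is one way to see OLS as a special case of the more general fit.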

