Abstract

The calculation of error bars for quantities of interest in computational chemistry comes in two forms: (1) determining the confidence of a prediction, for instance of the property of a molecule; (2) assessing uncertainty in measuring the difference between properties, for instance between performance metrics of two or more computational approaches. While the previous paper in this series concentrated on the first of these, this second paper focuses on comparison, i.e. how to calculate differences between methods in an accurate and statistically valid manner. Described within are classical statistical approaches for comparing widely used metrics such as enrichment, area under the curve and Pearson’s product-moment coefficient, as well as generic measures. These are considered over single and multiple sets of data and for two or more methods that evince either independent or correlated behavior. General issues concerning significance testing and confidence limits from a Bayesian perspective are discussed, along with size-of-effect aspects of evaluation.

Highlights

  • Part One of this paper [1] focused on the calculation of error bars, or confidence limits for measured or calculated quantities

  • Procedures are presented for comparing metrics in common use in computational chemistry both when covariance, i.e. correlation, is important and when it is not

  • This paper focuses on parametric statistics for two reasons: (1) they are more powerful than non-parametric statistics, e.g. they give better, tighter error bounds; (2) for the quantities of interest here, such as differences in properties, they are often more robustly applicable than is commonly thought
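The role of covariance mentioned in the highlights can be illustrated with a minimal Python sketch. It relies on the standard identity var(A - B) = var(A) + var(B) - 2 cov(A, B); the function names and the scores for methods A and B are hypothetical and are not taken from the paper.

```python
import math
import statistics


def sample_covariance(a, b):
    """Unbiased sample covariance of two equal-length sequences."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def diff_standard_error(a, b, use_covariance):
    """Standard error of mean(a) - mean(b) for equal-sized samples.

    With use_covariance=True the covariance term is subtracted, which is
    appropriate when both methods were run on the same systems.
    With use_covariance=False the methods are treated as independent.
    """
    n = len(a)
    var_diff = statistics.variance(a) + statistics.variance(b)
    if use_covariance:
        var_diff -= 2.0 * sample_covariance(a, b)
    return math.sqrt(var_diff / n)


# Hypothetical performance scores of two methods on the same five systems.
scores_a = [0.70, 0.80, 0.60, 0.90, 0.75]
scores_b = [0.65, 0.78, 0.55, 0.86, 0.70]

se_independent = diff_standard_error(scores_a, scores_b, use_covariance=False)
se_correlated = diff_standard_error(scores_a, scores_b, use_covariance=True)
print(f"independent: {se_independent:.4f}, correlated: {se_correlated:.4f}")
```

When the two methods track each other across systems, as in this made-up data, the covariance term shrinks the error bar on the difference considerably, so ignoring it would understate how distinguishable the methods are.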


Summary

Introduction

Part One of this paper [1] focused on the calculation of error bars, or confidence limits, for measured or calculated quantities. If the error bars of two methods do not overlap, we can say the methods are statistically different (left panel) at the level of significance represented by the error bars, but the converse is not true: overlapping error bars do not imply that the methods are statistically equivalent. In the statistical literature, accounting for the correlation between two methods in this way is often referred to as the paired Student t test [2]; it is described further in the discussion of small sample sizes. Suppose we ignore any knowledge of the expected variance of the experiment, i.e. we have to estimate the standard deviation from the three measurements. From Part One we know that for N = 3 we are required to use the Student t-distribution. In this case the correct t statistic would be:

$$\frac{(N_A - 1)\,\mathrm{var}(A) + (N_B - 1)\,\mathrm{var}(B)}{N_A + N_B - 2}$$
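If the expression above is read as the pooled variance of two samples of sizes N_A and N_B, it slots into the standard two-sample Student t statistic. The sketch below is a minimal illustration under that assumption, not the paper's own implementation; the function name `pooled_t_statistic` and the measurements are hypothetical.

```python
import math
import statistics


def pooled_t_statistic(a, b):
    """Two-sample Student t statistic using a pooled variance estimate.

    Assumes both samples are independent and share a common variance.
    Returns the t value and the degrees of freedom, N_A + N_B - 2.
    """
    n_a, n_b = len(a), len(b)
    dof = n_a + n_b - 2
    # Pooled variance: each sample variance weighted by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * statistics.variance(a) + (n_b - 1) * statistics.variance(b)
    ) / dof
    # Standard error of the difference between the two sample means.
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    t_value = (statistics.mean(a) - statistics.mean(b)) / std_err
    return t_value, dof


# Hypothetical measurements: three for method A, four for method B.
measurements_a = [1.0, 2.0, 3.0]
measurements_b = [2.0, 3.0, 4.0, 5.0]

t_value, dof = pooled_t_statistic(measurements_a, measurements_b)
print(f"t = {t_value:.3f} with {dof} degrees of freedom")
```

The resulting t value would then be compared against the Student t-distribution with the returned degrees of freedom, which matters most for small samples such as N = 3.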
$$\langle x \rangle_{i}^{\mathrm{local}} - \langle x \rangle^{\mathrm{global}}$$
Correlated methods
Findings
Conclusions