Abstract

In machine learning or scientific computing, model performance is measured with an objective function. But why choose one objective over another? According to the information‐theoretic paradigm, the “best” objective function is whichever minimizes information loss. To evaluate different objectives, transform them into likelihoods. The ratios of these likelihoods represent how strongly we should prefer one objective over another, and the log of that ratio represents the relative information loss (or gain) from one objective to another. In plain terms, minimizing information loss is equivalent to minimizing uncertainty, as well as maximizing probability and general utility. We argue that this paradigm is well‐suited to models that have many uses and no definite utility, such as the complex Earth system models used to understand the effects of climate change. Furthermore, the benefits of “maximizing information and general utility” extend beyond model accuracy to other important considerations, including how efficiently the model calibrates, how well it generalizes, and how well it compresses data.
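As a rough illustration of the likelihood-ratio comparison described above, the sketch below scores one set of model residuals under two familiar objectives. It rests on assumptions not stated in the abstract: mean squared error is paired with a Gaussian likelihood and mean absolute error with a Laplace likelihood (a standard correspondence), each scale parameter is set to its maximum-likelihood value, and all function names are hypothetical.

```python
import math

def gaussian_log_likelihood(residuals):
    """Log-likelihood implied by mean squared error (assumed Gaussian errors).

    The variance is set to its maximum-likelihood estimate, mean(r^2).
    Assumes at least one residual is nonzero.
    """
    n = len(residuals)
    variance = sum(r * r for r in residuals) / n
    return -0.5 * n * (math.log(2.0 * math.pi * variance) + 1.0)

def laplace_log_likelihood(residuals):
    """Log-likelihood implied by mean absolute error (assumed Laplace errors).

    The scale is set to its maximum-likelihood estimate, mean(|r|).
    Assumes at least one residual is nonzero.
    """
    n = len(residuals)
    scale = sum(abs(r) for r in residuals) / n
    return -n * (math.log(2.0 * scale) + 1.0)

def information_gain_bits(log_lik_a, log_lik_b):
    """Log of the likelihood ratio of objective A to objective B, in bits.

    Positive values favor A; negative values favor B.
    """
    return (log_lik_a - log_lik_b) / math.log(2.0)

# Hypothetical residuals (observed minus simulated), for illustration only.
residuals = [0.3, -1.2, 0.8, -0.1, 2.5, -0.4]

ll_mse = gaussian_log_likelihood(residuals)
ll_mae = laplace_log_likelihood(residuals)
print("Gaussian (MSE) log-likelihood:", ll_mse)
print("Laplace (MAE) log-likelihood:", ll_mae)
print("Information gain of MSE over MAE (bits):", information_gain_bits(ll_mse, ll_mae))
```

Under these assumptions, a positive gain means the residuals are better described by the Gaussian error model, so the squared-error objective loses less information on this data; a negative gain favors the absolute-error objective.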