Abstract

Information criteria have had a profound impact on modern ecological science. They allow researchers to estimate which probabilistic approximating models are closest to the generating process. Unfortunately, comparing information criteria does not tell the analyst how good the best model is. In this work, we show that this shortcoming can be resolved by extending the geometric interpretation of Hirotugu Akaike’s original work. Standard information criterion analysis considers only the divergence of each model from the generating process and ignores the estimable divergence relationships amongst the approximating models themselves. We show that, using both sets of divergences and an estimator of the negative self-entropy (neg-selfentropy), a model space can be constructed that includes an estimated location for the generating process. Thus, an analyst can determine not only which model is closest to the generating process, but also how close the best approximating model is to it. Properties of the generating process estimated from these projections are more accurate than those estimated by model averaging. We illustrate our findings and methods in detail with two ecological examples, for which we use and test two different neg-selfentropy estimators. The applications of our proposed model projection in model space extend to all areas of science where model selection is done through information criteria.

Highlights

  • Recent decades have witnessed a remarkable growth of statistical ecology as a discipline, and today, stochastic models of complex ecological processes are the hallmark of the most salient publications in ecology (e.g., Leibold et al., 2004; Gravel et al., 2016; Zeng and Rodrigo, 2018)

  • Akaike’s fourth critical insight was to note that a Law of Large Numbers (LLN) approximation of the Kullback-Leibler divergence between the true, generating stochastic process and a statistical model is minimized by evaluating the candidate model at its maximum likelihood estimates

  • Once all these components are computed, the system of Equation (9) can be solved with non-linear optimization. We coded this solution in the R function MP.coords, found in the file MPcalctools.R. This function takes as input the estimated neg-crossentropies between all models, an estimate of Sgg (the neg-selfentropy of the generating process), and the vectors of estimated neg-crossentropies Sgfi and Sfig. It outputs the (r + 1) × (r + 1) matrix of symmetrized KL divergences and the results of the Non-Metric Multidimensional Scaling (NMDS): the coordinates of every model in a two-dimensional space, the estimated location M of the orthogonal projection of g onto that plane, and the estimate of h
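The first step of the computation described above, assembling a symmetrized KL divergence matrix from neg-crossentropies and embedding the models in a plane, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' MP.coords function: it assumes the neg-crossentropy is defined as S_ij = E_{f_i}[log f_j(X)], so that KL(f_i, f_j) = S_ii - S_ij; it swaps in classical (Torgerson) multidimensional scaling for NMDS; it treats the symmetrized divergences as squared distances; and it does not solve the system of Equation (9) or estimate the projection M or the height h. The function names `symmetrized_kl` and `classical_mds` are hypothetical.

```python
import numpy as np

def symmetrized_kl(S):
    """Symmetrized KL matrix from neg-crossentropies.

    Assumes S[i, j] = E_{f_i}[log f_j(X)], with the neg-selfentropies
    S[i, i] on the diagonal, so that KL(f_i, f_j) = S[i, i] - S[i, j].
    """
    S = np.asarray(S, dtype=float)
    kl = np.diag(S)[:, None] - S   # kl[i, j] = KL(f_i, f_j)
    return kl + kl.T               # KL(f_i, f_j) + KL(f_j, f_i)

def classical_mds(sq_dist, k=2):
    """Classical (Torgerson) MDS; a stand-in for NMDS, not the same method.

    sq_dist is interpreted as a matrix of squared distances.
    """
    n = sq_dist.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ sq_dist @ J                 # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:k]         # k largest eigenvalues
    vals = np.clip(vals[order], 0.0, None)     # guard against tiny negatives
    return vecs[:, order] * np.sqrt(vals)

# Toy check with unit-variance normal models N(mu_i, 1), for which
# E_{f_i}[log f_j] = -0.5*log(2*pi) - 0.5*(1 + (mu_i - mu_j)**2).
mus = np.array([0.0, 1.0, 3.0])
S = -0.5 * np.log(2 * np.pi) - 0.5 * (1.0 + (mus[:, None] - mus[None, :]) ** 2)

skl = symmetrized_kl(S)            # equals (mu_i - mu_j)**2 for these models
coords = classical_mds(skl, k=2)   # one row of 2-D coordinates per model
```

In this toy case the symmetrized divergence between two unit-variance normals reduces to the squared difference of their means, so the embedded points sit on a line with the same spacing as the means. With estimated neg-crossentropies from real models the divergences need not behave like squared Euclidean distances, which is one reason a non-metric method such as NMDS is the more natural choice.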


Summary

INTRODUCTION

Recent decades have witnessed a remarkable growth of statistical ecology as a discipline, and today, stochastic models of complex ecological processes are the hallmark of the most salient publications in ecology (e.g., Leibold et al., 2004; Gravel et al., 2016; Zeng and Rodrigo, 2018). Our solution was motivated by the conceptualization of models as objects in a multi-dimensional space as well as an extension of the geometrical thinking that Akaike used so brilliantly in his 1973 paper introducing the AIC. Although thinking of models and the generating mechanism as objects with a specific location in space is mathematically challenging, this exercise may prove useful for studying the adequacy of another common statistical practice in multi-model inference: model averaging. The answer to both questions above (i.e., the error rates of multi-model selection under misspecification and when an analyst should perform model averaging) could be readily explored. These questions are the main motivation behind the work presented here.

THE AIC AND A NATURAL GEOMETRIC EXTENSION
Insight 2
Insight 3
Insight 4
Insight 5
Insight 6
The Problem of Multiple Models
EXAMPLES
An Application in Animal Behavior
An Ecosystems Ecology Application
Testing the Non-parametric Estimation of Sgg
Findings
DISCUSSION
