Abstract

Variable selection is one of the most important and controversial issues in modern data analysis. In the study of relationships between biological communities and environmental conditions, it is especially important because it guides decisions about environmental management. In a case study from the Great Lakes, we use Bayesian Model Averaging (BMA) to select interesting subsets of environmental variables (such as water chemistry and depth) that can affect the abundance of benthic microinvertebrate taxa. We implement BMA for a multivariate technique called Canonical Correspondence Analysis (CCA) and use the results to represent sites, species and selected environmental variables on ordination diagrams (biplots), along with “error bars” representing uncertainty due to both sampling variability and model selection. BMA gives data analysts an efficient tool for discovering promising models and for estimating their posterior probabilities via Markov chain Monte Carlo (MCMC). These probabilities then serve as weights for model-averaged predictions and for estimates of the parameters of interest. As a result, variance components due to model selection can be estimated and accounted for, which conventional data analysis does not do. We adopt an approach to BMA called Markov Chain Monte Carlo Model Composition (MC3; Madigan and Raftery, 1994), and we implement the BMA methodology by treating CCA within a general framework of reduced rank regression, for which we develop a Bayes Information Criterion (BIC) approximation to posterior model probabilities in the spirit of MC3.
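To make the weighting step described above concrete, the following is a minimal sketch, not the authors' implementation. It assumes equal prior probabilities across candidate models, so that each posterior model probability is approximately proportional to exp(-BIC/2). It then splits the variance of a model-averaged estimate into a within-model part (sampling variability) and a between-model part (model selection). The function names and the numeric inputs are hypothetical, chosen only for illustration.

```python
import math


def bic_model_weights(bics):
    """Approximate posterior model probabilities from BIC values.

    Assumption: every model has the same prior probability, so
    P(M_k | data) is approximately proportional to exp(-BIC_k / 2).
    """
    best = min(bics)  # subtract the minimum for numerical stability
    unnormalized = [math.exp(-(b - best) / 2.0) for b in bics]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def model_averaged_estimate(estimates, weights):
    """Weighted average of per-model estimates of the same quantity."""
    return sum(w * e for w, e in zip(weights, estimates))


def model_averaged_variance(estimates, variances, weights):
    """Total variance = within-model part + between-model part."""
    mean = model_averaged_estimate(estimates, weights)
    within = sum(w * v for w, v in zip(weights, variances))
    between = sum(w * (e - mean) ** 2 for w, e in zip(weights, estimates))
    return within + between


# Hypothetical inputs: three candidate models for one parameter.
bics = [100.0, 102.0, 110.0]     # illustrative BIC values
estimates = [1.0, 1.5, 0.2]      # illustrative per-model estimates
variances = [0.04, 0.05, 0.10]   # illustrative per-model variances

weights = bic_model_weights(bics)
averaged = model_averaged_estimate(estimates, weights)
total_var = model_averaged_variance(estimates, variances, weights)
```

In an MC3-style search, the weights would be estimated from the models visited by the chain instead of from an exhaustive list; the averaging step itself is unchanged.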
