Understanding variation in mollusk shell size through time can help interpret changes in past marine environments and in human subsistence behavior. Recent studies have provided means to estimate total shell length for California mussel (Mytilus californianus) from a single morphological measurement using simple linear regression. However, shell morphology may differ between environmentally diverse areas and through time, which could reduce the reliability of single-variable models when measurements of shells harvested from a different geographic region are used to predict total shell length for archaeological samples. In this study, we examined the performance of models built from three site-specific samples of contemporary mussel beds on Santa Rosa Island, southern California, and of models built from a pooled sample. The data were split into training (80%) and testing (20%) subsets to construct and validate eight multivariable models using a best subset regression approach. We then examined the performance of our models on each subset, evaluating measures of bias, accuracy, coverage, and precision. We also evaluated the models against an archaeological sample (CA-SRI-138) as an additional external validation sample. For each sample, we considered a model with all measurements and a model using only those measurements likely to be observable in fragmentary assemblages. We also developed single-variable models for comparison with the multivariable models. Multivariable models consistently performed better than single-variable models. The models using the pooled sample were also more accurate and more precise, and exhibited less bias, than the site-specific models. Each site-specific model selected a different subset of variables and indicated a range of variable importance, which is captured in the increased generalizability of the multivariable, pooled-sample models when estimating total shell length for both contemporary and archaeological samples.
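The 80/20 split and best subset regression described above can be illustrated with a minimal sketch. It is not the study's code, and it rests on several assumptions: the data are synthetic (four hypothetical measurement columns, of which only columns 0 and 2 drive the simulated shell length), candidate subsets are ranked by BIC on the training subset (the abstract does not state the selection criterion), and only bias (mean error) and RMSE are computed on the testing subset, leaving coverage and precision aside.

```python
import math
import random
from itertools import combinations

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def fit_ols(rows, y, cols):
    """Ordinary least squares with an intercept, via the normal equations."""
    D = [[1.0] + [row[c] for c in cols] for row in rows]
    p = len(cols) + 1
    XtX = [[sum(d[i] * d[j] for d in D) for j in range(p)] for i in range(p)]
    Xty = [sum(d[i] * t for d, t in zip(D, y)) for i in range(p)]
    return solve(XtX, Xty)

def predict(coef, row, cols):
    """Predicted response for one row, using only the selected columns."""
    return coef[0] + sum(b * row[c] for b, c in zip(coef[1:], cols))

def bic(rows, y, cols, coef):
    """BIC of a Gaussian linear model (assumed selection criterion)."""
    n = len(y)
    rss = sum((predict(coef, r, cols) - t) ** 2 for r, t in zip(rows, y))
    return n * math.log(rss / n) + (len(cols) + 1) * math.log(n)

def best_subset(rows, y, n_vars):
    """Fit every non-empty subset of predictors and keep the lowest-BIC one."""
    best = None
    for k in range(1, n_vars + 1):
        for cols in combinations(range(n_vars), k):
            coef = fit_ols(rows, y, cols)
            score = bic(rows, y, cols, coef)
            if best is None or score < best[0]:
                best = (score, cols, coef)
    return best[1], best[2]

# Synthetic stand-in data (NOT mussel measurements): 4 hypothetical predictors.
random.seed(0)
n, n_vars = 200, 4
rows = [[random.gauss(0, 1) for _ in range(n_vars)] for _ in range(n)]
y = [50.0 + 8.0 * r[0] + 3.0 * r[2] + random.gauss(0, 1) for r in rows]

# 80% training / 20% testing split on shuffled indices.
idx = list(range(n))
random.shuffle(idx)
n_train = int(0.8 * n)
train, test = idx[:n_train], idx[n_train:]
X_tr, y_tr = [rows[i] for i in train], [y[i] for i in train]
X_te, y_te = [rows[i] for i in test], [y[i] for i in test]

# Select on the training subset, then evaluate on the held-out subset.
cols, coef = best_subset(X_tr, y_tr, n_vars)
errors = [predict(coef, r, cols) - t for r, t in zip(X_te, y_te)]
bias = sum(errors) / len(errors)                            # mean signed error
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))  # accuracy
print("selected columns:", cols)
print("test bias:", round(bias, 3), "test RMSE:", round(rmse, 3))
```

Exhaustive enumeration is practical here because a handful of shell measurements yields only a few dozen candidate subsets; with many more predictors, a dedicated statistics package would be the more sensible choice.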