Abstract
Soils and sediments can be distinguished based on “composite fingerprints”, i.e., sets of physical and chemical properties that are suitable for discrimination. At present, statistical stepwise variable selection methods are frequently applied to identify composite fingerprints, although they have been seriously criticized. Here, we test regularized logistic regression (RLR) as an alternative approach in the context of a reservoir siltation study where the post-dam facies is to be distinguished from the pre-dam facies. The pre- and post-dam facies of four reservoirs located in the Kruger National Park were examined with respect to grain size composition, color, and content of calcium-lactate leachable phosphorus (P_CAL). A composite fingerprint was identified by applying RLR to training data. The fitted regression model was then used to classify samples not included in the training dataset. For comparison, variable selection was performed with stepwise discriminant function analysis (DFA) and samples were classified by applying linear discriminant analysis (LDA). Both approaches were validated by comparing field interpretation and classification results. The analysis was extended with Monte Carlo simulations and synthetic datasets to quantify uncertainties and to enhance the method comparison. RLR and stepwise DFA both identify grain size parameters and P_CAL content as particularly useful for the facies discrimination. Both approaches lead to ≤3% and 5% misclassifications when a potential sampling bias is neglected and taken into account, respectively. RLR outperforms stepwise DFA/LDA in Monte Carlo simulations, although misclassification rates do not differ significantly (p = 0.84). RLR uses on average 12% fewer fingerprint properties.
Moreover, RLR-derived probabilities of group membership represent a more reliable measure of classification conclusiveness than probabilities calculated from LDA, as is evident in significantly lower (p < 0.001) probability residuals for misclassified samples. Stepwise DFA/LDA yields lower misclassification rates than RLR when the data are multivariate normal within each group and the within-group covariance matrices are equal. RLR is an innovative tool for the discrimination of sediment facies in reservoirs and, more generally, for studies requiring the discrimination of soils and sediments. Although stepwise procedures will in practice often perform similarly well, we discourage their use for the identification of composite fingerprints because of the risk of suboptimal variable selection involving variables with spurious discriminatory power.
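The workflow summarized above (fit a regularized logistic regression to training data, read the composite fingerprint from the retained coefficients, then classify held-out samples by their probability of group membership) can be illustrated with a minimal sketch. Several things here are assumptions rather than details from the study: the abstract does not state the penalty type, so an L1 (lasso) penalty fitted by proximal gradient descent is used; the data are synthetic stand-ins with two informative and three noise properties; and all function names, the penalty weight `lam`, and the 0/1 coding of the facies are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero, clip at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_l1_logistic(X, y, lam=0.1, step=0.3, n_iter=500):
    """Minimise mean log-loss + lam * ||w||_1 by proximal gradient descent."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            residual = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += residual
            for j in range(p):
                grad_w[j] += residual * xi[j]
        b -= step * grad_b / n  # the intercept is not penalised
        w = [soft_threshold(wj - step * gj / n, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


def predict_proba(X, w, b):
    """Probability of membership in group 1 for each sample."""
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]


def make_samples(n, rng):
    """Synthetic stand-in data: properties 0-1 differ between groups, 2-4 are noise."""
    X, y = [], []
    for i in range(n):
        label = i % 2  # 0 = "pre-dam", 1 = "post-dam" (hypothetical coding)
        shift = 1.5 if label else -1.5
        X.append([rng.gauss(shift, 1.0), rng.gauss(shift, 1.0),
                  rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)])
        y.append(label)
    return X, y


rng = random.Random(0)
X_train, y_train = make_samples(200, rng)
X_test, y_test = make_samples(100, rng)

w, b = fit_l1_logistic(X_train, y_train)
fingerprint = [j for j, wj in enumerate(w) if wj != 0.0]  # retained properties
probs = predict_proba(X_test, w, b)
predicted = [1 if p >= 0.5 else 0 for p in probs]
accuracy = sum(int(a == t) for a, t in zip(predicted, y_test)) / len(y_test)

print("retained property indices:", fingerprint)
print("held-out accuracy:", accuracy)
```

Properties whose coefficients are shrunk exactly to zero drop out of the fingerprint, so selection and model fitting happen in a single step rather than through a stepwise search. In practice the properties would typically be standardized first and the penalty weight chosen by cross-validation; both steps are omitted here for brevity.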