Abstract

In QSAR analysis in the environmental sciences, adverse effects of chemicals released to the environment are modelled and predicted as a function of the chemical properties of the pollutants. Usually, the set of compounds under study contains several classes of substances, i.e., it is a more or less strongly clustered set. It is then necessary to ensure that the selected training set comprises compounds representing all of those chemical classes. Multivariate design in the principal properties of the compound classes is usually appropriate for selecting a meaningful training set. However, with clustered data, often seen in environmental chemistry and toxicology, a single multivariate design may be suboptimal, because it risks ignoring small classes with few members and selecting training set compounds only from the largest classes. In this paper, a procedure for training set selection that recognizes clustering is proposed. When non-selective biological or environmental responses are modelled, local multivariate designs are constructed within each cluster (class). The compounds chosen by the local designs are then united in the overall training set, which will thus contain members from all clusters. Our illustration deals with a set of 66 compounds, categorized into five classes, for which the soil sorption coefficient is available. The training set selection is discussed, followed by multivariate QSAR modelling, model validation and interpretation, and predictions for the test set.
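
The cluster-wise selection described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the design type, so the code assumes a simple stand-in in which each class is mean-centred, projected onto its first two principal components (via SVD), and represented by the compound nearest the centre plus the compounds at the extremes of each component. The function names `local_design` and `select_training_set`, the class sizes, and the random descriptor matrix are all hypothetical and do not reproduce the paper's data.

```python
import numpy as np

def local_design(X, n_components=2):
    """Return row indices of X picked by a crude design in local PCA scores.

    Assumption: centre point + min/max along each component stands in
    for a proper statistical design (e.g. a factorial design).
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # mean-centre within the class
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    k = min(n_components, U.shape[1])
    scores = U[:, :k] * S[:k]                    # PCA scores of this class
    chosen = {int(np.argmin((scores ** 2).sum(axis=1)))}  # nearest to centre
    for j in range(k):
        chosen.add(int(np.argmin(scores[:, j])))  # low extreme of component j
        chosen.add(int(np.argmax(scores[:, j])))  # high extreme of component j
    return sorted(chosen)

def select_training_set(X, labels, n_components=2):
    """Build one local design per class and unite the selections."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    training = []
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)       # global indices of the class
        local = local_design(X[idx], n_components)
        training.extend(int(idx[i]) for i in local)
    return sorted(training)

# Synthetic, hypothetical data: four classes of very unequal size.
rng = np.random.default_rng(0)
sizes = [40, 15, 5, 3]
labels = np.repeat(np.arange(len(sizes)), sizes)
X = rng.normal(size=(labels.size, 4)) + labels[:, None] * 5.0

train = select_training_set(X, labels)
test = sorted(set(range(labels.size)) - set(train))
```

Because each class receives its own design, even the smallest class contributes at least one compound to `train`, whereas a single global design could draw all of its points from the largest classes. Each class contributes at most five compounds here (one centre point and two extremes per component), and the remaining compounds form the test set.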

