Abstract
This paper highlights the use of advanced nonlinear modeling and subset selection techniques in constructing a predictive model for the genotoxicity of amines. The factors required for a reliable model were each considered carefully. Chemicals were represented by a large number of CODESSA descriptors. The full sample was divided into a training set and a test set using principal component analysis (PCA). Six descriptors selected by the best multi-linear regression (BMLR) method in the CODESSA program were used as inputs to build nonlinear models with advanced statistical learning methods such as support vector machine (SVM) and projection pursuit regression (PPR). The models were validated in three ways: internal cross-validation (CV), a test set, and an independent validation set. The analysis shows that the nonlinear models produced better results than the linear models, and that the PPR model outperformed the rest, in the order PPR > SVM > linear SVM ≥ BMLR. In addition, the relationships between the descriptors and the mutagenic behavior of the compounds are discussed.
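The abstract does not state how PCA was used to divide the sample, so the following is only a minimal sketch of one common approach, given here for illustration: compounds are ranked by their score on the first principal component and every fourth one is assigned to the test set, so that the test compounds stay within the descriptor range spanned by the training set. The function names `pca_scores` and `split_by_pc1`, the `every=4` ratio, and the synthetic 40 × 6 descriptor matrix are all assumptions, not the authors' procedure or data.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project the rows of X onto the leading principal components."""
    # Autoscale each descriptor column (zero mean, unit variance).
    Xc = (X - X.mean(axis=0)) / X.std(axis=0)
    # SVD of the scaled matrix; the rows of Vt are the principal axes.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def split_by_pc1(X, every=4):
    """Hypothetical split rule: rank by PC1, send every `every`-th compound to the test set."""
    order = np.argsort(pca_scores(X, 1)[:, 0])
    # Start inside the ranking so the PC1 extremes remain in the training set.
    test_idx = order[every // 2::every]
    train_idx = np.setdiff1d(order, test_idx)
    return train_idx, test_idx

# Synthetic stand-in for a descriptor matrix: 40 compounds, 6 descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 6))
train_idx, test_idx = split_by_pc1(X)
```

Keeping the extreme PC1 scores in the training set means the test set checks interpolation rather than extrapolation; other PCA-based schemes (for example, selecting from clusters in the score plot) would be equally consistent with the abstract.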
More From: Collection of Czechoslovak Chemical Communications