Abstract

The major problem with applying principal component regression (PCR) in QSAR studies is that the model extracts its eigenvectors solely from the descriptor matrix, so the resulting components are not necessarily well correlated with biological activity. This article describes a novel segmented approach to PCR (SPCAR), in which the descriptors are first divided into blocks and principal component analysis (PCA) is then applied to each segment to extract the significant principal components (PCs). In this way, PCs carrying useful information are separated from those carrying redundant information. A linear regression analysis based on stepwise variable selection is then used to relate the informative extracted PCs to biological activity. The proposed method was first applied to model the aqueous toxicity of aliphatic compounds, and the effect of the number of segments on its predictive ability was investigated. Finally, a correlation analysis was performed to identify the descriptors that contribute significantly to the selected PCs and to aqueous toxicity. The method was further validated on the Selwood data set, which consists of 31 compounds and 53 descriptors. A comparison between the conventional PCR algorithm and SPCAR reveals the superiority of the latter. For the external prediction set, SPCAR met all the requirements to be considered a predictive model, whereas PCR did not. In addition, the models obtained by SPCAR were compared with those reported previously.
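
The workflow summarized above (segment the descriptors, run PCA on each segment, then select PCs stepwise for a linear model) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `segment_pcs` and `forward_select`, the contiguous equal-sized segmentation, the 95% cumulative-variance cutoff (`var_kept`), the forward-selection stopping rule (`min_gain`, `max_terms`), and the synthetic data are all assumptions made for illustration, since the abstract does not specify these details.

```python
import numpy as np

def segment_pcs(X, n_segments, var_kept=0.95):
    """PCA per descriptor block; returns the retained PC scores side by side.

    Assumption: segments are contiguous, near-equal column blocks, and each
    block keeps enough PCs to reach `var_kept` cumulative variance.
    """
    scores = []
    for cols in np.array_split(np.arange(X.shape[1]), n_segments):
        block = X[:, cols] - X[:, cols].mean(axis=0)      # mean-centre block
        U, s, _ = np.linalg.svd(block, full_matrices=False)
        explained = s**2 / np.sum(s**2)
        k = int(np.searchsorted(np.cumsum(explained), var_kept)) + 1
        k = min(k, len(s))
        scores.append(U[:, :k] * s[:k])                   # PC scores of block
    return np.hstack(scores)

def forward_select(T, y, max_terms=5, min_gain=0.01):
    """Greedy forward selection of PC columns for an ordinary least-squares fit.

    Assumption: a PC is added only if it lowers the residual sum of squares
    by at least `min_gain` of the total sum of squares.
    """
    n = len(y)
    tss = np.sum((y - y.mean()) ** 2)
    best_rss, chosen = tss, []
    while len(chosen) < max_terms:
        trials = []
        for j in range(T.shape[1]):
            if j in chosen:
                continue
            A = np.column_stack([np.ones(n), T[:, chosen + [j]]])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            trials.append((np.sum((y - A @ coef) ** 2), j))
        if not trials:
            break
        rss, j = min(trials)
        if (best_rss - rss) / tss < min_gain:
            break
        chosen.append(j)
        best_rss = rss
    return chosen, float(1.0 - best_rss / tss)

# Synthetic demonstration data (not from the paper): 40 "compounds",
# 12 "descriptors", activity driven by two descriptors in the first segment.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 12))
y = X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.normal(size=40)

T = segment_pcs(X, n_segments=3)
selected, r2 = forward_select(T, y)
print("selected PC columns:", selected, "training R^2:", round(r2, 3))
```

Rerunning the sketch with different values of `n_segments` mirrors, in a simplified way, the study of how the number of segments affects the model; a real assessment would use an external prediction set rather than the training R² shown here.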
