Abstract
Three types of chemotherapeutic agents (antibacterials, antineoplastics, and antifungals) registered in the MDL Drug Data Report (MDDR) database were used as the training data set, and a classification study was performed using the following seven methods: principal component analysis–linear discriminant analysis (PCA-LDA), soft independent modeling by class analogy (SIMCA), partial least squares 2 (PLS2), artificial neural networks (ANNs), the nearest neighbor method (NN), a combined method of Ward clustering and NN (W-NN), and a combined method of genetic algorithms (GAs) and NN (GA-NN). The number of correctly classified samples decreased in the following order: NN, ANNs, GA-NN, SIMCA, PLS2, W-NN, and PCA-LDA. Using these models, a prediction study was then performed on a test set consisting of drugs registered in the Comprehensive Medicinal Chemistry (CMC) database. The number of correctly predicted samples decreased in the following order: NN, GA-NN, W-NN, SIMCA, PCA-LDA, ANNs, and PLS2. NN gave the best model in terms of both classification and prediction, whereas overfitting was observed in ANNs and PLS2: both ranked higher in classification of the training set than in prediction of the test set. Although the fitness and predictiveness of GA-NN and W-NN were inferior to those of NN, the predictiveness of these two methods was superior to that of PCA-LDA, SIMCA, ANNs, and PLS2.
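As a rough illustration of the NN approach and of the "number of correctly predicted samples" criterion used to rank the methods, the following is a minimal sketch only. It assumes a 1-nearest-neighbor rule with Euclidean distance; the abstract does not specify the distance measure or the molecular descriptors used, so both are assumptions here. The function names and the two-dimensional toy vectors are hypothetical and are not taken from the MDDR or CMC databases.

```python
import math

def nearest_neighbor_predict(train_X, train_y, query):
    """Return the class label of the training sample closest to `query`.

    Euclidean distance is an assumption; the abstract does not name a metric.
    """
    best_index = min(range(len(train_X)),
                     key=lambda i: math.dist(train_X[i], query))
    return train_y[best_index]

def count_correct(train_X, train_y, test_X, test_y):
    """Count test samples whose predicted class matches the known class."""
    return sum(
        nearest_neighbor_predict(train_X, train_y, x) == y
        for x, y in zip(test_X, test_y)
    )

# Toy 2-D descriptor vectors, invented for illustration (not real drug data).
train_X = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.8), (0.0, 5.0), (0.3, 5.2)]
train_y = ["antibacterial", "antibacterial",
           "antineoplastic", "antineoplastic",
           "antifungal", "antifungal"]

# Hypothetical held-out samples with known classes.
test_X = [(0.1, 0.2), (4.9, 5.1), (0.1, 4.7), (2.4, 2.4)]
test_y = ["antibacterial", "antineoplastic", "antifungal", "antineoplastic"]

n_correct = count_correct(train_X, train_y, test_X, test_y)
print(f"{n_correct} of {len(test_X)} test samples predicted correctly")
```

In this toy setting, the last test sample lies between the three clusters and is assigned to the wrong class, which shows how the count of correct predictions on an external set can fall below what the training data alone would suggest.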