Abstract
The rodent carcinogenicity dataset was compiled from the Carcinogenic Potency Database (CPDBAS) and was used to build classification quantitative structure-activity relationship (QSAR) models for predicting carcinogenicity, based on the counter-propagation artificial neural network (CP ANN) algorithm. The models were developed within the EU-funded project CAESAR for regulatory use. The dataset contains the following information: general information about the chemicals (ID, chemical name, and CASRN), molecular structure information (SDF files and SMILES), and carcinogenic (toxicological) property information: carcinogenic potency (TD50_Rat_mg; carcinogen/noncarcinogen) and structural alerts (SA) for carcinogenicity based on mechanistic data. The molecular structure information was used to calculate molecular descriptors (254 MDL and 784 Dragon descriptors), which were then used in predictive QSAR modeling. The dataset presented in the paper can be used in future research in oncology, ecology, or chemical risk assessment.
Highlights
Rodent carcinogenicity datasets were used to build models to predict carcinogenicity within EC-funded project CAESAR (Project no. 022674 (SSPI)) [1]
The rodent carcinogenicity dataset was compiled from the Carcinogenic Potency Database (CPDBAS) and was used to build classification quantitative structure-activity relationship (QSAR) models for predicting carcinogenicity, based on the counter-propagation artificial neural network (CP ANN) algorithm. The models were developed within the EU-funded project CAESAR for regulatory use. The dataset contains the following information: general information about the chemicals (ID, chemical name, and CASRN), molecular structure information (SDF files and Simplified Molecular Input Line Entry System (SMILES)), and carcinogenic property information: carcinogenic potency (TD50_Rat_mg; carcinogen/noncarcinogen) and structural alerts (SA) for carcinogenicity based on mechanistic data.
The molecular structure information was used to calculate molecular descriptors (254 MDL and 784 Dragon descriptors), which were then used in predictive QSAR modeling. The dataset presented in the paper can be used in future research in oncology, ecology, or chemical risk assessment.
Summary
Laboratory of Chemometrics, National Institute of Chemistry, Hajdrihova 19, 1001 Ljubljana, Slovenia.
The rodent carcinogenicity dataset was compiled from the Carcinogenic Potency Database (CPDBAS) and was used to build classification quantitative structure-activity relationship (QSAR) models for predicting carcinogenicity, based on the counter-propagation artificial neural network (CP ANN) algorithm. The models were developed within the EU-funded project CAESAR for regulatory use. The dataset contains the following information: general information about the chemicals (ID, chemical name, and CASRN), molecular structure information (SDF files and SMILES), and carcinogenic (toxicological) property information: carcinogenic potency (TD50_Rat_mg; carcinogen/noncarcinogen) and structural alerts (SA) for carcinogenicity based on mechanistic data. The molecular structure information was used to calculate molecular descriptors (254 MDL and 784 Dragon descriptors), which were then used in predictive QSAR modeling. The dataset presented in the paper can be used in future research in oncology, ecology, or chemical risk assessment.