Abstract

Valid and predictive models for classifying Ames mutagenicity have been developed using conformal prediction. The models are Random Forest models built on signature molecular descriptors. The investigation indicates that excluding the not-strongly mutagenic compounds (class B) increases the validity for mutagenic compounds for models trained on both public data and data from the Division of Genetics and Mutagenesis, National Institute of Health Sciences of Japan (DGM/NIHS), and less so for models trained only on DGM/NIHS data. The models based on the combined data give valid predictions only for the majority, non-mutagenic class, whereas the models based only on DGM/NIHS data are valid for both classes, i.e. mutagenic and non-mutagenic compounds. These results demonstrate the importance of data consistency, manifested in the superior predictive quality and validity of the models based only on DGM/NIHS data compared with models that combine these data with public data sources.
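
To make the notion of per-class validity concrete, the following is a minimal sketch of class-conditional (Mondrian) conformal prediction. It is an illustration under stated assumptions, not the procedure used in the study: the abstract does not specify the conformal variant or the nonconformity measure, so the sketch assumes a nonconformity score of one minus the class probability from some underlying classifier (for example a Random Forest). The function names, the calibration scores, and the significance level are all hypothetical.

```python
from typing import Dict, List, Set


def mondrian_p_values(cal_scores: Dict[str, List[float]],
                      test_scores: Dict[str, float]) -> Dict[str, float]:
    """Class-conditional conformal p-values.

    cal_scores[label]  : nonconformity scores of calibration compounds whose
                         true class is `label` (assumed: 1 - class probability).
    test_scores[label] : nonconformity score of the new compound if it were
                         assigned `label`.
    """
    p_values = {}
    for label, alpha in test_scores.items():
        scores = cal_scores[label]
        # Count calibration compounds of the same class that are at least
        # as nonconforming as the new compound, then apply the +1 correction.
        n_at_least = sum(1 for s in scores if s >= alpha)
        p_values[label] = (n_at_least + 1) / (len(scores) + 1)
    return p_values


def prediction_set(p_values: Dict[str, float], significance: float) -> Set[str]:
    """Keep every label whose p-value exceeds the significance level."""
    return {label for label, p in p_values.items() if p > significance}


def class_error_rate(true_labels: List[str],
                     predicted_sets: List[Set[str]],
                     label: str) -> float:
    """Fraction of compounds of class `label` whose true label is missing
    from the prediction set. A model is valid for that class when this
    rate does not exceed the chosen significance level."""
    sets_for_class = [s for t, s in zip(true_labels, predicted_sets) if t == label]
    errors = sum(1 for s in sets_for_class if label not in s)
    return errors / len(sets_for_class)


# Hypothetical calibration scores, for illustration only.
cal = {
    "mutagenic": [0.1, 0.3, 0.5, 0.7],
    "non-mutagenic": [0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.9],
}
# New compound with an assumed class probability of 0.8 for "mutagenic".
new_compound = {"mutagenic": 0.2, "non-mutagenic": 0.8}

p = mondrian_p_values(cal, new_compound)
labels = prediction_set(p, significance=0.25)
```

Because each class is calibrated only against its own compounds, the error rate can be checked separately for the mutagenic and non-mutagenic classes, which is the sense in which a model can be valid for one class but not the other.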
