In this study, we used an innovative quantitative read-across (RA) structure-activity relationship (q-RASAR) approach to predict the bioconcentration factor (BCF) values of a diverse range of organic compounds, based on a dataset of 575 compounds tested according to Organisation for Economic Co-operation and Development Test Guideline 305 for bioaccumulation in fish. We first constructed the q-RASAR model using partial least squares regression, which yielded promising statistical results for the training set (R2 = 0.71, Q2LOO = 0.68, mean absolute error [MAE]training = 0.54). The model was further validated on the test set (Q2F1 = 0.77, Q2F2 = 0.75, MAEtest = 0.51). We then applied the q-RASAR method with other regression-based supervised machine-learning algorithms, which also gave favourable results for the training and test sets. All models exhibited R2 and Q2F1 values exceeding 0.7, Q2LOO values greater than 0.6, and low MAE values, indicating high model quality and predictive capability for new, untested chemical substances. These findings underscore the value of the RASAR method in enhancing predictivity for new chemicals, independent of the specific algorithm, owing to the incorporation of similarity functions in the RASAR descriptors.
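The external-validation statistics quoted above (Q2F1, Q2F2, MAE) can be illustrated with a minimal sketch that assumes the definitions commonly used in QSAR validation: Q2F1 scales the test-set prediction error by the spread of the test responses around the training-set mean, and Q2F2 scales it by their spread around the test-set mean. The function names and the toy values below are hypothetical and are not taken from the study's data or software.

```python
from statistics import mean


def q2_f1(y_train, y_test, y_pred):
    """Q2F1: 1 - PRESS / sum of squared deviations of test values from the TRAINING mean."""
    train_mean = mean(y_train)
    press = sum((y - p) ** 2 for y, p in zip(y_test, y_pred))
    ss_ref = sum((y - train_mean) ** 2 for y in y_test)
    return 1 - press / ss_ref


def q2_f2(y_test, y_pred):
    """Q2F2: 1 - PRESS / sum of squared deviations of test values from the TEST mean."""
    test_mean = mean(y_test)
    press = sum((y - p) ** 2 for y, p in zip(y_test, y_pred))
    ss_ref = sum((y - test_mean) ** 2 for y in y_test)
    return 1 - press / ss_ref


def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted responses."""
    return mean(abs(y - p) for y, p in zip(y_true, y_pred))


# Toy, made-up response values (illustrative only, not from the paper).
y_train = [1.0, 2.0, 3.0]
y_test = [2.0, 4.0]
y_pred = [2.5, 3.5]

print(q2_f1(y_train, y_test, y_pred))  # 0.875
print(q2_f2(y_test, y_pred))           # 0.75
print(mae(y_test, y_pred))             # 0.5
```

Because the two Q2 variants differ only in the reference mean, they diverge when the test-set responses are centred differently from the training-set responses, which is one reason both are often reported together.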