The rate and extent of biodegradation of petroleum hydrocarbons in different aquatic environments are important to characterize. Biodegradation is thought to be the major route by which petroleum hydrocarbons are removed from the environment. The present study develops predictive quantitative structure-property relationship (QSPR) models for the primary biodegradation half-life of petroleum hydrocarbons; the models may be used to forecast the biodegradation half-life of untested petroleum hydrocarbons that fall within the models' applicability domain. The models use easily computable two-dimensional (2D) descriptors to identify the structural features that matter for the biodegradation of petroleum hydrocarbons in freshwater (dataset 1), temperate seawater (dataset 2), and arctic seawater (dataset 3). All the developed models follow OECD guidelines. We used double cross-validation, best subset selection, and partial least squares tools for model development. In addition, the small dataset modeler tool was used successfully for the dataset with very few compounds (dataset 3, with 17 compounds), for which dataset division was not possible. The resulting models are robust, predictive, and mechanistically interpretable based on both internal and external validation metrics (R2 range of 0.605-0.959, Q2(LOO) range of 0.509-0.904, and Q2F1 range of 0.526-0.959). The intelligent consensus predictor tool was used to improve the prediction quality for test set compounds, and it gave better results than the individual partial least squares models on several metrics (Q2F1 = 0.808 and Q2F2 = 0.805 for dataset 1 in freshwater).
Molecular size and hydrophilic factor for freshwater, frequency of two carbon atoms at topological distance 4 for temperate seawater, and electronegative atom count relative to size for arctic seawater were found to be the most significant descriptors responsible for the regulation of biodegradation half-life of petroleum hydrocarbons.
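For readers unfamiliar with the external validation metrics quoted above, the following is a minimal sketch of how Q2F1 and Q2F2 are conventionally defined in the QSAR/QSPR literature. It is not the authors' code: the function names and the numeric values are hypothetical illustrations and are not taken from the study's datasets. Both metrics compare the squared prediction errors on the test set with the spread of the observed test values, measured around the training-set mean (Q2F1) or the test-set mean (Q2F2).

```python
from statistics import mean


def q2_f1(y_test, y_pred, y_train):
    """Q2F1 = 1 - PRESS / sum((y_test - mean(y_train))^2)."""
    # PRESS: sum of squared prediction errors on the test set
    press = sum((obs - pred) ** 2 for obs, pred in zip(y_test, y_pred))
    train_mean = mean(y_train)
    ss_train_ref = sum((obs - train_mean) ** 2 for obs in y_test)
    return 1.0 - press / ss_train_ref


def q2_f2(y_test, y_pred):
    """Q2F2 = 1 - PRESS / sum((y_test - mean(y_test))^2)."""
    press = sum((obs - pred) ** 2 for obs, pred in zip(y_test, y_pred))
    test_mean = mean(y_test)
    ss_test_ref = sum((obs - test_mean) ** 2 for obs in y_test)
    return 1.0 - press / ss_test_ref


# Hypothetical values for illustration only (not data from the study)
example_train = [1.0, 2.0, 3.0]
example_test = [1.0, 3.0, 5.0]
example_pred = [1.5, 2.5, 4.5]

example_q2_f1 = q2_f1(example_test, example_pred, example_train)
example_q2_f2 = q2_f2(example_test, example_pred)
```

Because the two metrics differ only in the reference mean, they come out close to each other when the training and test sets have similar mean responses.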