High-throughput screening (HTS) is widely applied in fields ranging from drug discovery to clinical diagnostics and toxicity assessment. Firefly luciferase is commonly used in HTS as a reporter to monitor the effect of chemical compounds on the activity of a specific target or pathway. However, the false positive rate of luciferase-based HTS is relatively high, because artifacts and promiscuous compounds that interact directly with the luciferase reporter enzyme are often identified as active compounds (hits). A rapid screening method is therefore needed to identify compounds that directly inhibit luciferase activity. In this study, a virtual screening (VS) classification model called MIEC-GBDT (MIEC: Molecular Interaction Energy Components; GBDT: Gradient Boosting Decision Tree) was developed to distinguish luciferase inhibitors from non-inhibitors. The MIECs, calculated by Molecular Mechanics/Generalized Born Surface Area (MM/GBSA) free energy decomposition, were used to energetically characterize the binding pattern of each small molecule at the active site of luciferase, and the GBDT algorithm was then employed to construct classifiers based on the MIECs. Predictions on the test set show that the optimized MIEC-GBDT model outperformed both molecular docking and MM/GBSA rescoring. The best MIEC-GBDT model, based on MIECs with the energy terms ΔG_ele, ΔG_vdW, ΔG_GB, and ΔG_SA, achieved prediction accuracies of 87.2% and 90.3% for the inhibitors and non-inhibitors in the test sets, respectively. Moreover, energetic analysis of the key residues suggests that their contributions to the binding of inhibitors are quite different from their contributions to the binding of non-inhibitors. These results suggest that the MIEC-GBDT model is reliable and can serve as a powerful tool for identifying potential interference compounds in luciferase-based HTS experiments.
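The workflow described in the abstract (per-residue energy components flattened into a feature vector, then a gradient boosting classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn's `GradientBoostingClassifier` as the GBDT, and the energies, labels, residue count, dataset size, and hyperparameters are all synthetic or hypothetical placeholders standing in for real MM/GBSA decomposition output.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

ENERGY_TERMS = ("ele", "vdW", "GB", "SA")  # the four terms named in the abstract
N_RESIDUES = 10   # hypothetical number of active-site residues
N_LIGANDS = 400   # hypothetical dataset size

rng = np.random.default_rng(0)

# Synthetic stand-in for MM/GBSA per-residue decomposition:
# energies[i, r, t] = energy term t contributed by residue r for ligand i.
energies = rng.normal(0.0, 1.0, size=(N_LIGANDS, N_RESIDUES, len(ENERGY_TERMS)))
labels = rng.integers(0, 2, size=N_LIGANDS)  # 1 = inhibitor, 0 = non-inhibitor

# Make the toy problem learnable: synthetic inhibitors get a more favourable
# vdW term at the first two residues (an arbitrary choice for illustration).
shift = np.zeros((N_RESIDUES, len(ENERGY_TERMS)))
shift[:2, ENERGY_TERMS.index("vdW")] = -1.5
energies += labels[:, None, None] * shift

# Flatten each ligand's residue-by-term matrix into one MIEC feature vector.
X = energies.reshape(N_LIGANDS, -1)

X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.25, stratify=labels, random_state=0
)

# Hyperparameters are placeholders, not the optimized values from the study.
model = GradientBoostingClassifier(
    n_estimators=200, max_depth=3, learning_rate=0.1, random_state=0
)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Per-class accuracy: the fraction of each class that is predicted correctly.
inhibitor_acc = recall_score(y_test, y_pred, pos_label=1)
non_inhibitor_acc = recall_score(y_test, y_pred, pos_label=0)
print(f"inhibitors: {inhibitor_acc:.3f}, non-inhibitors: {non_inhibitor_acc:.3f}")
```

Reporting accuracy separately for each class, as the abstract does, avoids a single overall figure hiding poor performance on the minority class. The numbers printed here reflect only the synthetic data and say nothing about the published results.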