Cytochrome P450 3A4 (CYP3A4), a prominent member of the P450 enzyme superfamily, plays a crucial role in metabolizing various xenobiotics, including over 50% of clinically significant drugs. Evaluating CYP3A4 inhibition before drug approval is essential to avoiding potentially harmful pharmacokinetic drug-drug interactions (DDIs) and adverse drug reactions (ADRs). Despite the development of several CYP inhibitor prediction models, the primary approach for screening CYP inhibitors still relies on experimental methods. This might stem from the limitations of existing models, which only provide deterministic classification outcomes instead of precise inhibition intensity (e.g., IC50) and often suffer from inadequate prediction reliability. To address this challenge, we propose an uncertainty-guided regression model to accurately predict the IC50 values of anti-CYP3A4 activities. First, a comprehensive data set of CYP3A4 inhibitors was compiled, consisting of 27,045 compounds with classification labels, including 4395 compounds with explicit IC50 values. Second, by integrating the predictions of the classification model trained on a larger data set and introducing an evidential uncertainty method to rank prediction confidence, we obtained a high-precision and reliable regression model. Finally, we use the evidential uncertainty values as a trustworthy indicator to perform a virtual screening of an in-house compound set. The in vitro experiment results revealed that this new indicator significantly improved the hit ratio and reduced false positives among the top-ranked compounds. Specifically, among the top 20 compounds ranked with uncertainty, 15 compounds were identified as novel CYP3A4 inhibitors, and three of them exhibited activities less than 1 μM. In summary, our findings highlight the effectiveness of incorporating uncertainty in compound screening, providing a promising strategy for drug discovery and development.
Read full abstract