Renal secretion plays an important role in the excretion of drugs by the kidney. Two major transporters known to be highly involved in renal secretion are MATE1/2-K and OCT2, the former of which is closely associated with drug-drug interactions. Among published in silico models for MATE inhibitors, a previous model obtained an ROC-AUC value of 0.78 using high-throughput percentage inhibition data [J. Med. Chem. 2013, 56(3), 781-795], which we aimed to improve upon here using a combined fingerprint and physics-based approach. To this end, we collected 225 publicly available compounds with pIC50 values against MATE1. On the one hand, we applied a physics-based approach using an AlphaFold protein structure, from which we obtained MM-GB/SA scores for those compounds. On the other hand, we built Random Forest (RF) and message passing neural network models using extended-connectivity fingerprints with diameter 4 (ECFP4) and chemical structures as graphs, respectively, with MM-GB/SA scores also included as input variables. In a five-fold cross-validation with a separate test set, the best predictivity on the hold-out test set was observed for the RF model (including ECFP4 and MM-GB/SA data), with an ROC-AUC of 0.833 ± 0.036, whereas that of the MM-GB/SA regression model was 0.742. However, the performance of the MM-GB/SA model did not depend on the particular chemical space being predicted. Additionally, via structural interaction fingerprint analysis, we found that the residues interacting with inhibitors were identical to those interacting with noninhibitors, including substrates: Gln49, Trp274, Tyr277, Tyr299, Ile303, and Tyr306. These similar binding modes are consistent with the experimental observation that an inhibitor shows similar IC50 values when different substrates are used; in practice, this relieves experimental scientists of the need to choose among substrates for MATE1.
Hence, we were able to build highly predictive classification models for MATE1 inhibitory activity using both ECFP4 and MM-GB/SA scores as input features, and these models are fit for purpose in the drug discovery process.
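As an illustration of the evaluation metric reported above, the following minimal sketch shows how an ROC-AUC can be computed from binary activity labels and model scores, and how per-fold values can be summarized as mean ± standard deviation. It is not the authors' code: the function names `roc_auc` and `summarize_folds`, the toy labels and scores, and the use of the sample standard deviation are all assumptions made for illustration.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """ROC-AUC as the probability that a randomly chosen active (label 1)
    receives a higher score than a randomly chosen inactive (label 0).
    Ties count as 0.5 (the Mann-Whitney U identity)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive compound")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def summarize_folds(fold_aucs):
    """Mean and sample standard deviation over cross-validation folds.
    Whether the source reports a sample or population SD is not stated;
    sample SD is an assumption here."""
    return mean(fold_aucs), stdev(fold_aucs)


# Toy data only (hypothetical, not from the study): 1 = inhibitor, 0 = noninhibitor.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.7, 0.6, 0.2, 0.4, 0.5]
toy_auc = roc_auc(toy_labels, toy_scores)
```

If a binding-energy score such as MM-GB/SA is used directly as the ranking variable, it would typically be negated first (for example `roc_auc(labels, [-s for s in energies])`), assuming that more negative values indicate stronger predicted binding.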