Abstract

Sodium-dependent glucose co-transporter 1 (SGLT1) is a solute carrier responsible for active glucose absorption. SGLT1 is present in both the renal tubules and the small intestine. In contrast, the closely related sodium-dependent glucose co-transporter 2 (SGLT2), a protein that is targeted in the treatment of type 2 diabetes, is only expressed in the renal tubules. Although dual inhibitors of SGLT1 and SGLT2 have been developed, no drugs on the market are aimed at decreasing dietary glucose uptake by SGLT1 in the gastrointestinal tract. Here we aim to identify SGLT1 inhibitors in silico by applying a machine learning approach that does not require structural information, which is absent for SGLT1. We applied proteochemometrics by incorporating compound- and protein-based information into random forest models. We obtained a predictive model with a sensitivity of 0.64 ± 0.06, specificity of 0.93 ± 0.01, positive predictive value of 0.47 ± 0.07, negative predictive value of 0.96 ± 0.01, and Matthews correlation coefficient of 0.49 ± 0.05. After model training, we applied the model in virtual screening to identify novel SGLT1 inhibitors. Of the 77 tested compounds, 30 were experimentally confirmed to inhibit SGLT1 in vitro, giving a hit rate of 39% with activities in the low micromolar range. Moreover, the hit compounds included novel molecules, as reflected by their low similarity to the training set (< 0.3). In conclusion, proteochemometric modeling of SGLT1 is a viable strategy for identifying active small molecules. This method may therefore also be applied to detect novel small molecules for other transporter proteins.
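The performance measures quoted above all derive from a binary confusion matrix. The sketch below shows how they are computed. The counts passed in are hypothetical values chosen only so that the outputs land near the reported means; they are not the study's actual confusion matrix. The function name `classification_metrics` is likewise illustrative. The hit rate is recomputed from the 30 confirmed actives out of 77 tested compounds.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification measures from confusion-matrix counts."""
    def safe_div(num, den):
        # Return 0.0 rather than raising when a class is empty.
        return num / den if den else 0.0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": safe_div(tp, tp + fn),   # true positive rate
        "specificity": safe_div(tn, tn + fp),   # true negative rate
        "ppv": safe_div(tp, tp + fp),           # positive predictive value
        "npv": safe_div(tn, tn + fn),           # negative predictive value
        "mcc": safe_div(tp * tn - fp * fn, mcc_den),  # Matthews correlation coefficient
    }


# Hypothetical counts (assumption, for illustration only).
metrics = classification_metrics(tp=64, fp=70, tn=930, fn=36)

# Hit rate of the virtual screen: 30 confirmed actives out of 77 tested compounds.
hit_rate = 30 / 77

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
print(f"hit rate: {hit_rate:.0%}")
```

With an imbalanced data set such as this hypothetical one (far more inactives than actives), specificity and negative predictive value can be high while the positive predictive value stays modest, which is why the Matthews correlation coefficient is reported alongside them.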

Highlights

  • Sodium-dependent glucose co-transporters, or sodium-glucose linked transporters (SGLTs), are solute carriers (SLCs) that are responsible for glucose (re)absorption

  • The public dataset encompassed 2063 data points and 1683 unique compounds, of which 886 compounds had measured human SGLT1 (hSGLT1) activities. This set was supplemented with an in-house dataset of 2007 molecules previously screened for hSGLT1 and human SGLT2 (hSGLT2) inhibition [Oranje et al., manuscript in preparation]

  • The data derived from ChEMBL was compared to the in-house dataset: the in-house dataset contained an additional 2005 hSGLT1 activities and 140 hSGLT2 activities, which were not present in the public dataset


Summary

Introduction

Sodium-dependent glucose co-transporters, or sodium-glucose linked transporters (SGLTs), are solute carriers (SLCs) that are responsible for glucose (re)absorption. The publicly available compound database ChEMBL includes ligand–protein binding information for multiple SGLTs [13,14,15], allowing the use of statistical modeling techniques such as quantitative structure–activity relationship (QSAR) analysis and proteochemometrics (PCM) [16]. These techniques, which make use of machine learning, do not require protein structural information and can be applied in the context of SLCs. Although ligand-based pharmacophore modeling, QSAR, and PCM have only been applied to a few SLCs [17, 18], these techniques are well established for other drug targets, including membrane proteins such as G protein-coupled receptors [19,20,21].
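What distinguishes PCM from QSAR is that each training example describes a compound–protein pair rather than a compound alone. The sketch below shows one way such an input row could be assembled, by concatenating a compound descriptor with a protein descriptor. All names and values here are hypothetical: the four-bit "fingerprints" and two-value protein descriptors are toy stand-ins, not the descriptors used in the study.

```python
from itertools import product


def pcm_row(compound_descriptor, protein_descriptor):
    """Concatenate compound and protein descriptors into one PCM feature vector."""
    return list(compound_descriptor) + list(protein_descriptor)


# Toy compound fingerprints (hypothetical bit vectors).
compounds = {
    "cmpd_A": [1, 0, 1, 1],
    "cmpd_B": [0, 1, 0, 1],
}

# Toy protein descriptors (hypothetical numeric features per transporter).
proteins = {
    "hSGLT1": [0.2, -1.3],
    "hSGLT2": [0.4, -0.9],
}

# One feature row per compound-protein pair.
rows = {
    (c, p): pcm_row(compounds[c], proteins[p])
    for c, p in product(compounds, proteins)
}

for pair, features in rows.items():
    print(pair, features)
```

Rows of this form, each paired with a measured activity label, could then be passed to a standard learner such as a random forest. Because the protein part of the vector differs between hSGLT1 and hSGLT2, a single model can learn from activity data on both transporters.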
