Abstract

Drug toxicity has become a critical problem that imposes heavy medical and economic burdens. Acquired long QT syndrome (acLQTS) is a cardiac ion channel disease caused by drugs that block the hERG channel. It is therefore necessary to avoid cardiotoxicity in drug design, and computational models have been widely used to address this problem. In this study, we collected a hERG inhibitor dataset containing 8671 compounds and featurized them with traditional molecular fingerprints (Baseline2D, ECFP4, PropertyFP, and 3DFP) and the newly proposed molecular dynamics fingerprint (MDFP). Regression models were then built with four machine learning algorithms, using each of these fingerprints as well as the combined multi-dimensional molecular fingerprint (MultiFP). Cross-validation and validation on an independent test dataset show that the best model is the consensus of the four algorithms with MultiFP. This model outperforms recently published methods for hERG cardiotoxicity prediction, with an RMSE of 0.531 and an R² of 0.653 on the test dataset. Feature importance analysis and correlation analysis identified novel structural features and molecular dynamics features that are highly associated with hERG inhibition. Our findings provide new insight into multi-dimensional molecular fingerprints and consensus models for hERG cardiotoxicity prediction.
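
The pipeline summarized above (combining fingerprints into one feature matrix, averaging the predictions of several regressors, and scoring with RMSE and R²) can be illustrated with a minimal sketch. The abstract does not state how the consensus is formed, so the unweighted mean used here is an assumption, and the function names `combine_fingerprints`, `consensus_predict`, `rmse`, and `r2` are hypothetical helpers rather than the authors' code.

```python
import numpy as np


def combine_fingerprints(blocks):
    """Concatenate per-fingerprint feature matrices column-wise.

    Each block has shape (n_compounds, n_features_i). This mimics a
    MultiFP-style combined descriptor (assumed to be plain concatenation).
    """
    return np.hstack([np.asarray(b, dtype=float) for b in blocks])


def consensus_predict(predictions):
    """Unweighted mean over models (rows = models, columns = compounds).

    Assumption: the consensus is a simple average of model outputs.
    """
    return np.mean(np.asarray(predictions, dtype=float), axis=0)


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


if __name__ == "__main__":
    # Toy values for illustration only; not data from the study.
    y_obs = [5.0, 6.0, 7.0]
    per_model = [
        [5.1, 6.3, 6.8],
        [4.9, 5.9, 7.1],
        [5.2, 6.1, 7.3],
        [4.8, 6.0, 6.9],
    ]
    y_cons = consensus_predict(per_model)
    print("consensus:", y_cons)
    print("RMSE:", rmse(y_obs, y_cons))
    print("R2:", r2(y_obs, y_cons))
```

In practice the rows of `per_model` would come from four fitted regressors evaluated on the same held-out compounds, and the two metrics would be computed on the independent test set.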
