Abstract
Virtual screening relies heavily on compound databases, but commercial databases are a poor basis for it because they contain patent-protected compounds as well as compounds with high toxicity and side effects. This paper therefore uses generative recurrent neural networks (RNNs) with long short-term memory (LSTM) cells to learn the properties of the drug compounds in DrugBank, with the aim of building a new virtual screening database of compounds with drug-like properties. The method yields a database of 26,316 compounds. To evaluate its potential, the database is subjected to a series of analyses covering chemical space, ADME properties, compound fragmentation, and synthesizability. The results indicate that the database has good drug-like properties and relatively novel backbones. Its potential in virtual screening is then tested further: docking and binding free energy calculations yield a series of hit compounds with completely new backbones.
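As an illustration of how compounds might be prepared for a generative LSTM, the following minimal Python sketch encodes SMILES strings at the character level as next-character prediction pairs. The abstract does not state the molecular representation or tokenization used, so the SMILES format, the start and end markers, the function names, and the three example molecules are all assumptions made for illustration. A purely character-level scheme also splits two-letter atoms such as Cl and Br, which a real pipeline would likely handle with a dedicated tokenizer.

```python
# Hypothetical sketch: character-level encoding of SMILES strings for
# next-character prediction with an LSTM. The markers, function names,
# and example molecules are assumptions, not taken from the paper.

START, END = "^", "\n"  # assumed sequence markers; neither occurs in SMILES


def build_vocab(smiles_list):
    """Map every character in the corpus (plus the markers) to an index."""
    chars = sorted(set("".join(smiles_list)) | {START, END})
    return {ch: i for i, ch in enumerate(chars)}


def encode(smiles, vocab):
    """Return (input_ids, target_ids), with targets shifted by one position."""
    seq = [vocab[ch] for ch in START + smiles + END]
    return seq[:-1], seq[1:]


def decode(ids, vocab):
    """Turn indices back into a SMILES string, dropping the markers."""
    inverse = {i: ch for ch, i in vocab.items()}
    return "".join(inverse[i] for i in ids).strip(START + END)


corpus = ["CCO", "c1ccccc1", "CC(=O)O"]  # ethanol, benzene, acetic acid
vocab = build_vocab(corpus)
inputs, targets = encode("CCO", vocab)
```

At each position the network would see `inputs[t]` and be trained to predict `targets[t]`; sampling from the trained model one character at a time until the end marker appears would then produce a new candidate string, which still has to be checked for chemical validity.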