Abstract

De novo drug design refers to designing new drug molecules from scratch using computational methods. In contrast to computational approaches that primarily modify existing molecules, designing from scratch enables the exploration of new chemical space and the potential discovery of novel molecules with enhanced properties. In this research, we propose a model that uses the Generative Pre-trained Transformer (GPT) architecture with relative attention for de novo drug design. GPT is a language model built on the transformer architecture that predicts the next word or token in a given sequence. Representing molecules with SMILES notation enables next-token prediction techniques to be applied to de novo drug design. GPT uses attention mechanisms to capture the dependencies and relationships between tokens in a sequence, allowing the model to focus on the most relevant information when processing the input. Relative attention is a variant of the attention mechanism that captures the relative distances and relationships between tokens in the input sequence. In the standard attention mechanism, positional information is typically encoded using fixed position embeddings; in relative attention, positional information is supplied dynamically during the attention calculation by incorporating relative positional encodings, enabling the model to quickly learn the syntax of new, unseen tokens. Relative attention thus allows the GPT model to better capture the relative positions of tokens in a sequence, which is particularly useful when dataset sizes are limited or when generating target-specific drugs. The proposed model was trained on benchmark datasets, and its performance was compared with other generative models. We show that relative attention and transfer learning can enable the GPT model to generate molecules with improved validity, uniqueness, and novelty in the context of de novo drug design. To illustrate the effectiveness of relative attention, the model was also fine-tuned via transfer learning on three target-specific datasets, and its performance was compared with standard attention.
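To make the distinction between fixed position embeddings and relative attention concrete, the sketch below shows a minimal causal self-attention layer with learned relative positional encodings in the spirit of Shaw et al. (2018). It is an illustrative assumption, not the authors' implementation; the class name RelativeSelfAttention and parameters such as d_model and max_rel_dist are hypothetical.

# Minimal sketch (assumed, not the paper's code): causal self-attention
# with relative positional encodings, PyTorch.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RelativeSelfAttention(nn.Module):
    def __init__(self, d_model: int, max_rel_dist: int):
        super().__init__()
        self.d_model = d_model
        self.max_rel_dist = max_rel_dist
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        # One learned embedding per clipped relative distance in
        # [-max_rel_dist, +max_rel_dist]; injected during the attention
        # calculation instead of adding fixed position embeddings to the input.
        self.rel_emb = nn.Embedding(2 * max_rel_dist + 1, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) token embeddings, no positions added
        b, n, d = x.shape
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)

        # Relative distance matrix, clipped and shifted to valid indices.
        pos = torch.arange(n, device=x.device)
        rel = pos[None, :] - pos[:, None]                                   # (n, n)
        rel = rel.clamp(-self.max_rel_dist, self.max_rel_dist) + self.max_rel_dist
        r = self.rel_emb(rel)                                               # (n, n, d)

        # Content-content plus content-position scores.
        scores = q @ k.transpose(-2, -1)                                    # (b, n, n)
        scores = scores + torch.einsum("bnd,nmd->bnm", q, r)
        scores = scores / d ** 0.5

        # Causal mask: each SMILES token attends only to earlier tokens (GPT-style).
        mask = torch.triu(torch.ones(n, n, device=x.device, dtype=torch.bool), 1)
        scores = scores.masked_fill(mask, float("-inf"))

        attn = F.softmax(scores, dim=-1)
        return attn @ v                                                     # (b, n, d)

# Example: a batch of 2 tokenized SMILES sequences of length 16, embedding dim 64.
layer = RelativeSelfAttention(d_model=64, max_rel_dist=8)
out = layer(torch.randn(2, 16, 64))
print(out.shape)  # torch.Size([2, 16, 64])

Because the positional signal enters through relative distances rather than absolute indices, the same learned offsets apply regardless of where a token appears in the sequence, which is the property the abstract attributes to faster adaptation on small or target-specific datasets.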
