Abstract
Artificial neural networks (ANNs) can be applied to identify and classify prospective drug candidates, including complex compounds such as lipopeptides, from their SMILES string representations. The networks are trained on SMILES strings, which are predictive of structural identity; the trained ANNs are capable of correctly classifying all compounds, substructures and their analogues, distinguishing drugs by their atomic organization to support lead optimization in drug discovery. The ability of the trained ANN models to recognize and classify analogous compounds was tested on similar compounds that had not been used for training, and the models classified the validation set correctly. The best result was achieved with 10 hidden layers. The R² value is 0.90586 for training, 0.99508 for testing and 0.94151 for validation; the overall R² across all sets is 0.89456. Model performance is reported by plotting mean squared error (MSE) against epoch over 21 epochs. After 21 iterations and 6 validation checks, the gradient was 798.1735. The result is a model for identifying and classifying lipopeptides from their SMILES annotation that efficiently classifies similar compounds and supports decision making in analogue-based drug discovery. This will help lead optimization studies aimed at predicting potential anticancer and antimicrobial lipopeptide-based therapeutics.
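The fit statistics quoted in the abstract can be computed from paired observed and predicted values. The following is a minimal sketch, assuming R² denotes the usual coefficient of determination (the source does not state which definition was used). The function names `mse` and `r_squared` are hypothetical, and the numbers are illustrative only, not data from the study.

```python
def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Illustrative values only (not data from the study).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(mse(observed, predicted))
print(r_squared(observed, predicted))
```

Evaluating both functions separately on the training, validation and testing subsets would give per-subset figures of the kind reported above.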
Highlights
Alongside in vitro and in vivo studies, in silico analysis, such as machine learning for predicting the chemical properties of compounds, has become an efficient approach in chemical analysis
The artificial neural network (ANN) was trained with 70%, 15% and 15% of the 22 lipopeptides used for training, validation and testing, respectively
The experimental results were found to be close to the values predicted by the neural network
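The 70%/15%/15% partition of the 22 lipopeptides can be sketched as a shuffled index split. This is a minimal sketch under stated assumptions: the source does not report the exact subset sizes or how the split was randomized, so the rounding rule, the fixed seed and the function name `split_indices` are all hypothetical. With this rounding, 22 compounds yield 15 training, 3 validation and 4 testing items.

```python
import random


def split_indices(n_items, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle item indices and split them into train/validation/test lists."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # fixed seed (assumption) for repeatability
    n_train = round(n_items * train_frac)
    n_val = round(n_items * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test


train, val, test = split_indices(22)
print(len(train), len(val), len(test))
```

Because 70% and 15% of 22 are not whole numbers, any implementation has to choose a rounding rule; assigning the remainder to the test set is one option among several.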
Summary
Alongside in vitro and in vivo studies, in silico analysis, such as machine learning for predicting the chemical properties of compounds, has become an efficient approach in chemical analysis. One example is the prediction of protein-ligand interactions, which facilitates the identification of novel compounds by screening lead compounds during drug discovery [1]. The SDF format is an extension of the MOL format and is used to store more than one compound in a single file [1].
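To illustrate how a single SDF file can hold several compounds, the sketch below splits SDF text into per-compound records using the `$$$$` line that terminates each record in the format. It is a minimal sketch, not the authors' pipeline: the function name `split_sdf_records` is hypothetical, and the toy input uses abbreviated placeholder records rather than valid molecule data.

```python
def split_sdf_records(sdf_text):
    """Split SDF text into one string per compound record.

    Each record in an SDF file ends with a line containing only '$$$$'.
    """
    records = []
    current = []
    for line in sdf_text.splitlines():
        if line.strip() == "$$$$":
            records.append("\n".join(current))
            current = []
        else:
            current.append(line)
    if any(line.strip() for line in current):
        # Tolerate a final record that lacks the trailing delimiter.
        records.append("\n".join(current))
    return records


# Toy input: two abbreviated placeholder records, not valid molecule data.
example = "compound_A\n...mol block...\n$$$$\ncompound_B\n...mol block...\n$$$$\n"
records = split_sdf_records(example)
print(len(records))
```

For real work, a cheminformatics toolkit would normally parse each record into a molecule object; the plain-text split here only shows the multi-compound layout.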