Artificial Neural Networks can be applied for the identification and classification of prospective drug candidates such as complex compounds, including lipopeptide, based on their SMILES string representation. The training of neural networks is done with SMILES strings, which are predictive of structural identification; the ANNs are efficient of correctly classifying all compounds, substructures and their analogues distinguishing the drugs based upon atomic organization to obtain lead optimization in drug discovery. The proficiency of the trained ANN models in recognizing and classifying the analogous compounds was tested for analysis of similar compounds, which were not taken previously for training and achieved results with correct classification in the validation set. The best result was achieved with 10 numbers of hidden layers. The R2 value for training is 0.90586; the R2 value for testing is 0.99508; the R2 value after validation is 0.94151; the final value of R2 for total sets is 0.89456. The graphs are plotted between 21 epochs and mean square error (MSE) to report the performance of the model. The value of 798.1735 for the gradient of the curve after 21 iterations and 6 validation checks was obtained. A successful model was developed for the identification and classification of lipopeptides from their SMILES annotation that efficiently classifies similar compounds and supports in decision making for analogue-based drug discovery. This will help in appropriate lead optimization studies for the prediction of potential anticancer and antimicrobial lipopeptide-based therapeutics.
Read full abstract