Abstract

Quantifying molecular similarity is one of the major tasks in virtual screening. Many similarity measures have been proposed for this purpose, some of them derived from document and text retrieval, because measures that perform well in document retrieval often also perform well in virtual screening. In this work, we propose a similarity measure for ligand-based virtual screening that is derived from a text-processing similarity measure and adapted to suit virtual screening; we call it the Adapted Similarity Measure of Text Processing (ASMTP). To evaluate and test ASMTP, we conducted several experiments on two benchmark datasets: the Maximum Unbiased Validation (MUV) and the MDL Drug Data Report (MDDR). In each experiment, 10 reference structures were chosen at random from each class as queries, and the results were evaluated by recall at the 1% and 5% cut-offs. The overall results are compared with those of other similarity methods, including the Tanimoto coefficient, which is considered the conventional, standard similarity coefficient for fingerprint-based similarity calculations. The results show that ASMTP improves the performance of ligand-based virtual screening and outperforms the Tanimoto coefficient and the other methods.
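To illustrate the baseline named in the abstract: the Tanimoto coefficient between two binary fingerprints is the number of bits set in both, divided by the number of bits set in either. The minimal Python sketch below assumes fingerprints are stored as sets of on-bit indices. The function names and toy data are hypothetical and not taken from the paper, and ASMTP itself is not shown because its formula is not given in this section.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    common = len(a & b)                  # bits set in both fingerprints
    union = len(a) + len(b) - common     # bits set in at least one fingerprint
    return common / union if union else 0.0


def rank_by_similarity(query: set, database: dict) -> list:
    """Return database ids ordered from most to least similar to the query."""
    return sorted(
        database,
        key=lambda mol_id: tanimoto(query, database[mol_id]),
        reverse=True,
    )


# Toy data (hypothetical): three molecules with tiny fingerprints.
toy_database = {"a": {1, 2, 3}, "b": {7, 8}, "c": {1, 2}}
toy_ranking = rank_by_similarity({1, 2, 3}, toy_database)
```

Ranking the whole database against a reference structure in this way is the basic operation of a similarity search; any other coefficient could be swapped in for `tanimoto` without changing the ranking step.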

Highlights

  • Chemoinformatics has attracted growing attention in recent years and has become an active multidisciplinary research area that covers many aspects of chemistry and drug discovery using a range of tools and technologies

  • The proposed Adapted Similarity Measure of Text Processing (ASMTP) algorithm is derived from text processing, as we found that most algorithms developed for processing textual databases can also be used for processing chemical structure databases [1,19]

  • The results of the ten reference queries are averaged at the 1% and 5% recall cut-offs, and the procedure is repeated for all databases
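The evaluation procedure described above can be sketched as follows. This is a minimal sketch that assumes recall at a cut-off means the fraction of a class's active molecules found in the top 1% or 5% of the ranked database, averaged over the reference queries. The function names, the rounding of the cut-off size, and the toy data are assumptions made for illustration, not details taken from the paper.

```python
def recall_at_cutoff(ranking: list, actives: set, fraction: float) -> float:
    """Fraction of the actives that appear in the top `fraction` of a ranked list."""
    # Assumption: the cut-off size is rounded to the nearest whole molecule,
    # with at least one molecule always inspected.
    top_n = max(1, round(fraction * len(ranking)))
    found = sum(1 for mol_id in ranking[:top_n] if mol_id in actives)
    return found / len(actives) if actives else 0.0


def mean_recall(rankings: list, actives: set, fraction: float) -> float:
    """Average recall at one cut-off over the rankings of several reference queries."""
    return sum(recall_at_cutoff(r, actives, fraction) for r in rankings) / len(rankings)


# Toy data (hypothetical): a database of 100 molecule ids and one activity class.
forward = list(range(100))
backward = forward[::-1]
toy_actives = {0, 3, 50, 99}

recall_1 = recall_at_cutoff(forward, toy_actives, 0.01)
recall_5 = recall_at_cutoff(forward, toy_actives, 0.05)
mean_5 = mean_recall([forward, backward], toy_actives, 0.05)
```

In a full experiment, `rankings` would hold one ranked list per reference structure (ten per class in the setup described above), and the averages would be computed separately for each activity class and each dataset.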


Summary

Introduction

Chemoinformatics has attracted growing attention in recent years and has become an active multidisciplinary research area that covers many aspects of chemistry and drug discovery using a range of tools and technologies. Virtual screening (VS) is one of the important processes for discovering new ligands on the basis of biological structure. It has many definitions, one of which is: “Use of high-performance computing to analyze large databases of chemical compounds in order to identify possible drug candidates” [3]. Unlike high-throughput screening (HTS), VS experiments are not carried out in a chemical laboratory, and the compounds do not need to exist physically, because the screening is performed virtually by computer programs and methods. VS relies on computational methods that search molecular databases and identify the molecular structures most likely to bind to a drug target, typically a protein.

