Abstract
Background
It is known that no individual similarity measure always gives the best recall of active molecular structures across all types of activity classes. The effectiveness of ligand-based virtual screening approaches can be enhanced by using data fusion, which can be implemented in two different ways: group fusion and similarity fusion. Similarity fusion involves searching with multiple similarity measures; the similarity scores, or rankings, from each measure are then combined to obtain the final ranking of the compounds in the database.
Results
The Condorcet fusion method was examined. This approach combines the outputs of similarity searches from eleven association and distance similarity coefficients; the winning measure for each class of molecules, as determined by Condorcet fusion, was then chosen as the best search method for that class. The proposed method was evaluated using the recall of active molecules retrieved in the top 5% of the ranking, together with significance tests. The MDL Drug Data Report (MDDR), Maximum Unbiased Validation (MUV) and Directory of Useful Decoys (DUD) data sets were used for the experiments and were represented by 2D fingerprints.
Conclusions
Simulated virtual screening experiments with the two standard data sets show that Condorcet fusion provides a very simple way of improving ligand-based virtual screening, especially when the active molecules being sought have the lowest degree of structural heterogeneity. However, the effectiveness of Condorcet fusion increased slightly when activity sets of high structural diversity were being sought.
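To make the idea of Condorcet fusion and the top-5% recall measure more concrete, the following is a minimal sketch, not the authors' implementation. It assumes each similarity coefficient contributes one ranking of the same candidates (best first), and it orders candidates by the number of pairwise majority contests they win, which is a Copeland-style simplification chosen here for illustration. The names `condorcet_fuse` and `recall_at_top` are hypothetical.

```python
from itertools import combinations


def condorcet_fuse(rankings):
    """Fuse several rankings (best first) using pairwise majority votes.

    Assumption: candidates are ordered by how many pairwise contests they
    win; ties are broken alphabetically. This is an illustrative choice.
    """
    candidates = sorted(set().union(*rankings))
    # Position of each candidate in each ranking; unranked items count as last.
    positions = [{c: i for i, c in enumerate(r)} for r in rankings]
    wins = {c: 0 for c in candidates}
    for a, b in combinations(candidates, 2):
        a_votes = sum(1 for p in positions if p.get(a, len(p)) < p.get(b, len(p)))
        b_votes = sum(1 for p in positions if p.get(b, len(p)) < p.get(a, len(p)))
        if a_votes > b_votes:
            wins[a] += 1
        elif b_votes > a_votes:
            wins[b] += 1
    return sorted(candidates, key=lambda c: (-wins[c], c))


def recall_at_top(ranking, actives, fraction=0.05):
    """Fraction of known actives found in the top `fraction` of a ranking."""
    cutoff = max(1, round(len(ranking) * fraction))
    return len(set(ranking[:cutoff]) & set(actives)) / len(actives)


# Three hypothetical rankings of the same three candidates.
rankings = [["m1", "m2", "m3"], ["m2", "m1", "m3"], ["m1", "m3", "m2"]]
print(condorcet_fuse(rankings))  # ['m1', 'm2', 'm3']
```

Here `m1` beats both other candidates in a majority of rankings, so it is the Condorcet winner. The same pairwise scheme could be applied to similarity measures rather than molecules, depending on what is being voted on.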
Highlights
It is known that any individual similarity measure will not always give the best recall of active molecule structure for all types of activity classes
Data fusion has been used to combine the results of structure-based and ligand-based approaches to virtual screening [15], with the fused results outperforming any single method in the ranking of activities
The first screening system was based on the Tanimoto (TAN) coefficient, which has been used in ligand-based virtual screening for many years and is considered a reference standard
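As a rough illustration of the Tanimoto coefficient mentioned above, the sketch below computes it for binary fingerprints stored as sets of "on" bit indices. The function name `tanimoto`, the set representation, and the convention of returning 0.0 for two empty fingerprints are assumptions made for this example, not details taken from the paper.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient for binary fingerprints given as sets of on-bits.

    Computed as |A & B| / |A | B|. Two empty fingerprints return 0.0,
    which is a convention assumed here.
    """
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        return 0.0
    return len(a & b) / union


# Rank a small hypothetical database against a query fingerprint.
query = {1, 2, 3}
database = {"mol_a": {2, 3, 4}, "mol_b": {1, 2, 3}, "mol_c": {7, 8}}
ranked = sorted(database, key=lambda m: tanimoto(query, database[m]), reverse=True)
print(ranked)  # ['mol_b', 'mol_a', 'mol_c']
```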
Summary
It is known that no individual similarity measure always gives the best recall of active molecular structures across all types of activity classes. The effectiveness of ligand-based virtual screening approaches can be enhanced by using data fusion. The similarity scores, or rankings, from each similarity measure are combined to obtain the final ranking of the compounds in the database. Many virtual screening (VS) approaches have been implemented for searching chemical databases, such as substructure search, similarity search, docking and QSAR. A more realistic approach to enhancing the effectiveness of ligand-based virtual screening is the use of data fusion [10], or consensus scoring as it is called in the structure-based virtual screening literature [11]. Data fusion has been used to combine the results of structure-based and ligand-based approaches to virtual screening [15], with the fused results outperforming any single method in the ranking of activities. The latest reviews on using fusion in ligand-based virtual screening can be found in [16,17].
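The combination of rankings from several similarity measures can be sketched with a simple rank-sum rule. This is one common fusion rule used here purely for illustration; it is not the Condorcet method studied in the paper. The sketch assumes every measure scores the same compounds and that higher scores mean greater similarity (distance coefficients would need their order reversed first). The name `rank_sum_fuse` is hypothetical.

```python
def rank_sum_fuse(score_tables):
    """Fuse per-measure similarity scores by summing each compound's ranks.

    score_tables: list of dicts mapping compound id -> similarity score,
    where a higher score means more similar (an assumption of this sketch).
    Returns compound ids ordered from best to worst fused rank.
    """
    totals = {}
    for scores in score_tables:
        # Rank compounds within this measure, best score first.
        ordered = sorted(scores, key=lambda c: (-scores[c], c))
        for rank, compound in enumerate(ordered, start=1):
            totals[compound] = totals.get(compound, 0) + rank
    # A lower summed rank is better; ties are broken alphabetically.
    return sorted(totals, key=lambda c: (totals[c], c))


# Scores from three hypothetical similarity measures.
tables = [
    {"a": 0.9, "b": 0.5, "c": 0.1},
    {"a": 0.4, "b": 0.8, "c": 0.2},
    {"a": 0.7, "b": 0.6, "c": 0.3},
]
print(rank_sum_fuse(tables))  # ['a', 'b', 'c']
```

Working on ranks rather than raw scores avoids having to normalise coefficients that live on different scales, at the cost of discarding the size of the score differences.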