Abstract
Background: Ligand-based virtual screening using molecular shape is an important tool for researchers who wish to find novel chemical scaffolds in compound libraries. The Ultrafast Shape Recognition (USR) algorithm is capable of screening millions of compounds and is therefore suitable for use in a web service. The algorithm, however, is agnostic of atom types and cannot discriminate between compounds with similar shape but distinct pharmacophoric features. To solve this problem, an extension of USR called USRCAT has been developed that includes pharmacophoric information whilst retaining the performance benefits of the original method.
Results: The USRCAT extension is shown to outperform the traditional USR method in a retrospective virtual screening benchmark. A relational database implementation is also described that is capable of screening a million conformers in milliseconds and allows the inclusion of complex query parameters.
Conclusions: USRCAT provides a solution to the lack of atom type information in the USR algorithm. Researchers who wish to use ligand-based virtual screening to discover new hits, particularly those with only limited resources, will benefit the most. Online chemical databases that offer a shape-based similarity method might also benefit from using USRCAT because of its accuracy and performance. The source code is freely available and can easily be modified to fit specific needs.
Highlights
Ligand-based virtual screening using molecular shape is an important tool for researchers who wish to find novel chemical scaffolds in compound libraries
A comparison on a heterogeneous data set immediately shows the benefits of the Ultrafast Shape Recognition with CREDO Atom Types (USRCAT) extension
Aromaticity was implemented in USRCAT as a pharmacophoric subset because Ultrafast Shape Recognition (USR) was unable to discriminate between long, chain-like molecules such as certain heteropeptides and long alkyl chains in particular
Summary
Ligand-based virtual screening using molecular shape is an important tool for researchers who wish to find novel chemical scaffolds in compound libraries. Alignment-based methods relying on molecular superposition retain almost all of the shape information of a molecule but do not encode shape in a numerical form. Although they are computationally expensive, they enable precise geometric comparison of surface features such as polarity and hydrophobicity, as well as chirality, at the same time. Moment-based methods such as USR instead encode molecular shape numerically: time-consuming molecular alignments can be omitted (in most cases) and, since numerical representations can be conveniently stored in a database, the screening of very large conformer databases becomes feasible. Such moment-based methods do not usually retain any of the pharmacophoric information of the encoded molecule; additional measures are often necessary to ensure that retrieved molecules have chemical properties similar to those of the reference. Ligand-based virtual screening relying on shape alone was found to deliver performance comparable to protein-ligand docking [2,3].
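The moment-based encoding described above can be illustrated with a short sketch. It is not the authors' implementation: it assumes the commonly described USR scheme (distance distributions to four reference points, each summarised by three moments, giving 12 numbers) and a USRCAT-style extension that repeats the encoding for atom subsets such as hydrophobic, aromatic, acceptor and donor atoms. All function names, the exact normalisation of the moments, and the zero-filling of empty subsets are assumptions made for illustration.

```python
import math

def moments(values):
    """Mean, standard deviation and signed cube root of the third central moment.

    The exact normalisation of the moments is an assumption of this sketch.
    """
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return [mean, math.sqrt(var), skew]

def reference_points(coords):
    """Centroid, atom closest to it, atom farthest from it, atom farthest from that one."""
    n = len(coords)
    ctd = tuple(sum(c[i] for c in coords) / n for i in range(3))
    cst = min(coords, key=lambda c: math.dist(c, ctd))
    fct = max(coords, key=lambda c: math.dist(c, ctd))
    ftf = max(coords, key=lambda c: math.dist(c, fct))
    return [ctd, cst, fct, ftf]

def usr_descriptor(coords, refs=None):
    """12 numbers: three moments of the atom distances to each of four reference points."""
    refs = refs or reference_points(coords)
    desc = []
    for ref in refs:
        desc.extend(moments([math.dist(c, ref) for c in coords]))
    return desc

def usrcat_like_descriptor(coords, subsets):
    """All-atom descriptor followed by one 12-number block per atom subset.

    `subsets` is a list of atom-index lists (hypothetical input; real atom
    typing would come from a cheminformatics toolkit). Reference points are
    taken from all atoms; empty subsets are zero-filled (an assumption).
    """
    refs = reference_points(coords)
    desc = usr_descriptor(coords, refs)
    for indices in subsets:
        sub = [coords[i] for i in indices]
        desc.extend(usr_descriptor(sub, refs) if sub else [0.0] * 12)
    return desc

def similarity(a, b):
    """Inverse of one plus the mean absolute difference between two descriptors."""
    return 1.0 / (1.0 + sum(abs(x - y) for x, y in zip(a, b)) / len(a))
```

Because each conformer is reduced to a fixed-length vector of numbers, comparing two conformers needs no alignment, and the vectors can be stored as ordinary columns in a database table.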