Abstract

It has been reported recently that consensus scoring, which combines multiple scoring functions in binding affinity estimation, leads to higher hit-rates in virtual library screening studies. This method appears to be largely independent of the target receptor, the docking program, and even the scoring functions under investigation. Here we present an idealized computer experiment to explore how consensus scoring works. A hypothetical set of 5000 compounds is used to represent a chemical library under screening. The binding affinities of all its member compounds are assigned by mimicking a real situation. Based on the assumption that the error of a scoring function is a random number drawn from a normal distribution, the predicted binding affinities are generated by adding such a random number to the "observed" binding affinities. The relationship between the hit-rates and the number of scoring functions employed in scoring is then investigated. The performance of several typical ranking strategies for a consensus scoring procedure is also explored. Our results demonstrate that consensus scoring outperforms any single scoring function for a simple statistical reason: the mean of repeated samplings tends to be closer to the true value. Our results also suggest that a moderate number of scoring functions, three or four, is sufficient for the purpose of consensus scoring. As for the ranking strategy, both the rank-by-number and the rank-by-rank strategies work more effectively than the rank-by-vote strategy.
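The idealized experiment described above can be sketched in a short simulation. This is a minimal illustration, not the authors' actual code: the library size (5000) and the rank-by-number consensus (ranking by the mean predicted score) come from the abstract, while the affinity distribution, the error standard deviation, and the definition of a "hit" as a top-2% compound are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

N_COMPOUNDS = 5000     # hypothetical library size (from the abstract)
TOP_FRACTION = 0.02    # "hits" = top 2% by true affinity (assumption)
NOISE_SD = 1.5         # std. dev. of each scoring function's error (assumption)

# "Observed" binding affinities; the distribution is an assumption
true_affinity = rng.normal(loc=-7.0, scale=2.0, size=N_COMPOUNDS)

n_top = int(N_COMPOUNDS * TOP_FRACTION)
# most negative affinity = strongest binder
true_hits = set(np.argsort(true_affinity)[:n_top])

def hit_rate(n_functions: int, n_trials: int = 200) -> float:
    """Average hit-rate when consensus-scoring with n_functions noisy
    scoring functions (rank-by-number: rank by the mean predicted score)."""
    rates = []
    for _ in range(n_trials):
        # each scoring function = true affinity + independent Gaussian error
        predictions = true_affinity + rng.normal(
            0.0, NOISE_SD, size=(n_functions, N_COMPOUNDS))
        consensus = predictions.mean(axis=0)
        selected = set(np.argsort(consensus)[:n_top])
        rates.append(len(selected & true_hits) / n_top)
    return float(np.mean(rates))

for k in (1, 2, 3, 4, 6):
    print(f"{k} scoring function(s): hit-rate = {hit_rate(k):.2f}")
```

Averaging k independent errors shrinks the effective error standard deviation by a factor of sqrt(k), so the hit-rate climbs quickly from one to three or four functions and then flattens, which is consistent with the paper's conclusion that three or four scoring functions suffice.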
