Abstract

A new benchmark 20-bead HP model protein sequence (on a square lattice), which has 17 distinct but degenerate global minimum (GM) energy structures, has been studied using a genetic algorithm (GA). The relative probabilities of finding particular GM conformations are determined and related to the theoretical probability of generating these structures using a recoil growth constructor operator. It is found that for longer successful GA runs, the GM probability distribution is generally very different from the constructor probability, as the other GA operators have had time to overcome any initial bias in the originally generated population of structures. Structural and metric relationships (e.g., Hamming distances) between the 17 distinct GM are investigated and used, in conjunction with data on the connectivities of the GM and the pathways that link them, to explain the GM probability distributions obtained by the GA. Searches in which the sequence is defined in the normal (forward) direction are compared with searches in which it is defined in the reverse direction. The ease of finding mirror-image solutions is also compared. Finally, this approach is applied to rationalize the ease or difficulty of finding the GM for a number of standard benchmark HP sequences on the square lattice. It is shown that the relative probabilities of finding particular members of a set of degenerate global minima depend critically on the topography of the energy landscape in the vicinity of the GM, the connections and distances between the GM, and the nature of the operators used in the chosen search method.
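
As a minimal illustration of the quantities mentioned above, the following Python sketch evaluates the standard HP contact energy of a square-lattice conformation, the Hamming distance between two conformations, and a mirror image. It is not the authors' code. The absolute-direction move encoding (`U`, `D`, `L`, `R`), the function names, and the position-by-position definition of the Hamming distance are all assumptions made for this example; the paper may use a different encoding or metric.

```python
# Illustrative sketch only: the encoding and the helper names are assumptions.
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def fold(moves):
    """Turn a move string into lattice coordinates, starting at the origin."""
    x, y = 0, 0
    coords = [(x, y)]
    for m in moves:
        dx, dy = MOVES[m]
        x, y = x + dx, y + dy
        coords.append((x, y))
    return coords


def hp_energy(sequence, moves):
    """Return -(number of non-bonded H-H contacts), or None if the walk overlaps."""
    if len(moves) != len(sequence) - 1:
        raise ValueError("need exactly one move per bond")
    coords = fold(moves)
    if len(set(coords)) != len(coords):
        return None  # not self-avoiding, so not a valid conformation
    index_at = {c: i for i, c in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each lattice contact is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts


def hamming(a, b):
    """Number of positions at which two equal-length move strings differ."""
    if len(a) != len(b):
        raise ValueError("move strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def mirror(moves):
    """Reflect a conformation in the x-axis by swapping U and D."""
    return moves.translate(str.maketrans("UD", "DU"))
```

For the toy sequence `HPPH`, the folded conformation `RUL` has energy -1, its mirror image `RDL` has the same energy, and the two move strings lie at Hamming distance 1 under this encoding.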
