Abstract

The notion of a measure representation of protein sequences is introduced, based on the detailed HP model. Multifractal analysis and detrended fluctuation analysis are then performed on the measure representations of a large number of long protein sequences. The measure representations, together with the values of the D_q spectra and the related C_q curves, indicate that these protein sequences are not completely random. The values of the exponent from the detrended fluctuation analysis show that the K-strings, in the ordering used by the measure representation, exhibit strong long-range correlation. For substrings of length K = 5, the D_q spectra of all proteins studied are multifractal-like and sufficiently smooth for the C_q curves to be meaningful. The C_q curves of all proteins resemble a classical phase transition at a critical point. An iterated function system (IFS) model is found to simulate the measure representation of protein sequences very well. The estimated values of the parameters in the IFS model suggest that non-polar residues and uncharged polar residues play a more important role than other kinds of residues in the protein folding process.
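
As a rough illustration of the measure representation described above, the following Python sketch maps each residue to one of four classes, reads every overlapping K-string as a base-4 number, and records its relative frequency. The abstract does not specify the construction, so the details here are assumptions: the grouping of the 20 amino acids into non-polar, negative polar, uncharged polar, and positive polar classes, the digit assigned to each class, the resulting ordering of K-strings, and the names `CLASS_OF` and `measure_representation` are all hypothetical choices made for this example.

```python
from collections import Counter

# ASSUMPTION: a four-class grouping of the 20 amino acids (one-letter codes)
# and the digit given to each class. Neither is stated in the abstract.
CLASS_GROUPS = [
    "AILMFPWV",  # non-polar           -> digit 0
    "DE",        # negative polar      -> digit 1
    "NCQGSTY",   # uncharged polar     -> digit 2
    "RHK",       # positive polar      -> digit 3
]
CLASS_OF = {res: digit for digit, group in enumerate(CLASS_GROUPS) for res in group}


def measure_representation(seq, k):
    """Return a list of 4**k relative frequencies, one per K-string.

    Entry j is the fraction of overlapping K-strings whose base-4 value is j,
    so the list can be read as a histogram over 4**k equal subintervals of [0, 1).
    Residues outside the 20 standard one-letter codes are skipped.
    """
    digits = [CLASS_OF[r] for r in seq.upper() if r in CLASS_OF]
    total = len(digits) - k + 1
    if k < 1 or total < 1:
        raise ValueError("sequence is too short for the requested K")

    counts = Counter()
    for i in range(total):
        index = 0
        for d in digits[i:i + k]:
            index = index * 4 + d  # interpret the K-string as a base-4 number
        counts[index] += 1

    return [counts.get(j, 0) / total for j in range(4 ** k)]
```

Calling `measure_representation(sequence, 5)` would give the 1024-bin histogram corresponding to the K = 5 case mentioned in the abstract, under the assumptions listed above.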
