Abstract

The nature of changes involved in crossed-sequence scale and inner-sequence scale is very challenging in protein biology. This study is a new attempt to assess with a phenomenological approach the non-stationary and nonlinear fluctuation of changes encountered in protein sequence. We have computed fluctuations from an encoded amino acid index dataset using cumulative sum technique and extracted the departure from the linear trend found in each protein sequence. For inner-sequence analysis, we found that the fluctuations of changes statistically follow a −5/3 Kolmogorov power and behave like an incremental Brownian process. The pattern of the changes in the inner sequence seems to be monofractal in essence and to be bounded between Hurst exponent [1/3,1/2] range, which respectively corresponds to the Kolmogorov and Brownian monofractal process. In addition, the changes in the inner sequence exhibit moderate complexity and chaos, which seems to be coherent with the monofractal and stochastic process highlighted previously in the study. The crossed-sequence changes analysis was achieved using an external parameter, which is the activity available for each protein sequence, and some results obtained for the inner sequence, specifically the drift and Kolmogorov complexity spectrum. We found a significant linear relationship between activity changes and drift changes, and also between activity and Kolmogorov complexity. An analysis of the mean square displacement of trajectories in the bivariate space (drift, activity) and (Kolmogorov complexity spectrum, activity) seems to present a superdiffusive law with a 1.6 power law value.

Highlights

  • From the information viewpoint, a protein sequence can be considered as a distribution of successive symbols extracted with a rule from a dictionary

  • We found for each protein sequence a power law between a range of [0.69 0.99] and a coefficient of variation less than 7%, which reveals that the fluctuations of normalized detrended cumulative sum (NDCS)

  • Following the comparison with the surrogated and shuffled data generated from the original data, we found that the NDCS of D PRIFT index changes for 242 protein sequences used in this study include stochastic and moderate chaotic processes and show apparent embedding between the Kolmogorov (H = 1/3) and Brownian (H = 1/2) monofractal processes

Read more

Summary

Introduction

A protein sequence can be considered as a distribution of successive symbols extracted with a rule from a dictionary. It means that the protein sequence is encoded to a set of symbol combinations. There is a huge variety of combinations of symbols to encode a protein sequence in the real world. It is well-known that the molecular mechanism (stability, structure function, disorder) is often triggered by complex interactions [1,2,3]. There are numerous encoder models that try to reflect the reality accurately

Objectives
Methods
Results
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call