Abstract

Background: The covariation of two sites in a protein is often used as a measure of the degree of their coevolution. Many methods have been developed to quantify the covariation, and most of them are based on residues' position-specific frequencies using the mutual information (MI) model.

Results: In this paper, we propose several new measures that incorporate new biological constraints in quantifying the covariation. The first measure is the mutual information with the amino acid background distribution (MIB), which incorporates the amino acid background distribution into the marginal distribution of the MI model. The modification is made to remove the effect of amino acid evolutionary pressure in measuring covariation. The second measure is the mutual information of residues' physicochemical properties (MIP), which is used to measure the covariation of the physicochemical properties of two sites. The third measure, called MIBP, is obtained by applying residues' physicochemical properties to the MIB model. Moreover, the scores of our new measures are applied to a robust indicator, conn(k), to find the covariation signal of each site.

Conclusions: We find that incorporating the amino acid background distribution is effective in removing the effect of evolutionary pressure on amino acids. Thus the MIB measure captures more biological background information about the coevolution of residues. In addition, our analysis reveals that the covariation of physicochemical properties is a new aspect of coevolution information.

Highlights

  • The covariation of two sites in a protein is often used as the degree of their coevolution

  • In order to compare our new methods with existing ones, all chosen methods, namely mutual information (MI), H2r, ELSC, MI with amino acid background distribution (MIB), mutual information of residues' physicochemical properties (MIP) and MIBP, are tested on four multiple sequence alignments (MSAs)

  • Some sites predicted by MIP and MIBP are predicted by MI, but not by MIB. These results demonstrate that the classification of amino acids and the physicochemical properties differ in how they depict the MSA


Summary

Introduction

The covariation of two sites in a protein is often used as a measure of the degree of their coevolution. In order to quantify the covariation of two sites in a given MSA, many computational methods have been developed in recent years. These methods can be divided into two groups: parametric methods and assumptions strictly matching known biological constraints. Another widely used nonparametric method is ELSC, which applies a perturbation-based method to calculate explicitly the likelihood of evolutionary covariance in MSAs [26]. Although many biological constraints have been used in measuring covariation, the amino acid background distribution and residues' physicochemical properties have been ignored in previous methods.
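
To make the MI model named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It computes plain mutual information between two MSA columns from their position-specific frequencies, and then repeats the calculation after mapping residues to physicochemical classes, which is the general idea behind a property-based measure such as MIP. The function names and the four-class grouping `ILLUSTRATIVE_CLASSES` are assumptions made for illustration only; the paper's own property classification and the exact MIB formula are not given in this summary and are not reproduced here.

```python
from collections import Counter
from math import log2

# Hypothetical, coarse residue grouping used only for illustration.
# It is NOT the classification used in the paper.
ILLUSTRATIVE_CLASSES = {
    "hydrophobic": "AVLIMFWC",
    "polar": "STNQYGP",
    "positive": "KRH",
    "negative": "DE",
}
RESIDUE_TO_CLASS = {
    residue: name
    for name, residues in ILLUSTRATIVE_CLASSES.items()
    for residue in residues
}


def mutual_information(col_x, col_y):
    """Plain MI (in bits) between two alignment columns.

    Each column is a sequence of symbols, one per aligned sequence.
    Frequencies are estimated by simple counting, with no pseudocounts.
    """
    if len(col_x) != len(col_y):
        raise ValueError("columns must contain the same number of sequences")
    n = len(col_x)
    if n == 0:
        raise ValueError("columns must not be empty")
    count_x = Counter(col_x)
    count_y = Counter(col_y)
    count_xy = Counter(zip(col_x, col_y))
    mi = 0.0
    for (a, b), count in count_xy.items():
        p_ab = count / n          # joint frequency of the pair (a, b)
        p_a = count_x[a] / n      # marginal frequency at the first site
        p_b = count_y[b] / n      # marginal frequency at the second site
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


def to_property_classes(column):
    """Map residues to illustrative classes; gaps and unknowns become 'other'."""
    return [RESIDUE_TO_CLASS.get(residue, "other") for residue in column]


def property_mutual_information(col_x, col_y):
    """MI computed on property classes instead of residue identities."""
    return mutual_information(to_property_classes(col_x),
                              to_property_classes(col_y))
```

As a usage example, `mutual_information("AVKR", "LIDE")` scores the residue-level covariation of two four-sequence columns, while `property_mutual_information("AVKR", "LIDE")` scores the same columns after collapsing residues into classes, so the two values can differ even on identical input.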
