Prediction of protein inter-domain linkers using compositional index and simulated annealing

Maad Shatnawi, Nazar Zaki

doi:10.1145/2464576.2482740

Abstract

Protein chains are typically large and consist of multiple domains, which are difficult and computationally expensive to characterize using experimental methods. Accurate and reliable prediction of protein domain boundaries is therefore often the first step in both experimental and computational protein research. In this paper, we propose a straightforward yet effective method that predicts inter-domain linker segments from amino acid sequence information alone, using an amino acid compositional index. Each amino acid in the protein sequence is represented by a compositional index, which is derived by combining the difference in amino acid occurrences between domains and linker segments in the training protein sequences with the amino acid composition information. We then employ simulated annealing to improve the prediction by finding the optimal set of threshold values that separate domains from inter-domain linkers. The performance of the proposed method is compared with that of current approaches on two protein sequence datasets. Experimental results show that the proposed method outperforms state-of-the-art methods for inter-domain linker prediction.
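
The two ingredients named in the abstract, a per-residue compositional index and a threshold tuned by simulated annealing, can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not give the index formula, so the log ratio of linker to domain frequencies used here is an assumption, as are the sliding-window smoothing, the use of a single threshold instead of a set of thresholds, accuracy as the objective, and every function name, parameter, and toy sequence.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def compositional_index(sequences, labels, pseudocount=1.0):
    """Assumed index: log(linker frequency / domain frequency) per amino acid."""
    linker = {a: pseudocount for a in AMINO_ACIDS}
    domain = {a: pseudocount for a in AMINO_ACIDS}
    for seq, lab in zip(sequences, labels):
        for aa, tag in zip(seq, lab):
            if aa in linker:
                (linker if tag == "L" else domain)[aa] += 1
    n_l, n_d = sum(linker.values()), sum(domain.values())
    return {a: math.log((linker[a] / n_l) / (domain[a] / n_d)) for a in AMINO_ACIDS}

def smoothed_profile(seq, index, window=5):
    """Average the index over a sliding window centred on each residue."""
    half = window // 2
    scores = [index.get(aa, 0.0) for aa in seq]
    profile = []
    for i in range(len(seq)):
        chunk = scores[max(0, i - half): i + half + 1]
        profile.append(sum(chunk) / len(chunk))
    return profile

def accuracy(profiles, labels, threshold):
    """Fraction of residues where (score > threshold) matches the linker label."""
    correct = total = 0
    for prof, lab in zip(profiles, labels):
        for score, tag in zip(prof, lab):
            correct += (score > threshold) == (tag == "L")
            total += 1
    return correct / total

def anneal_threshold(profiles, labels, start=0.0, temp=1.0,
                     cooling=0.95, steps=500, seed=0):
    """Simulated annealing over one threshold (a simplification of the paper's set)."""
    rng = random.Random(seed)
    current, current_score = start, accuracy(profiles, labels, start)
    best, best_score = current, current_score
    for _ in range(steps):
        candidate = current + rng.gauss(0.0, 0.2)  # random neighbour
        score = accuracy(profiles, labels, candidate)
        delta = score - current_score
        # Always accept improvements; accept worse moves with probability exp(delta / T).
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, score
            if score > best_score:
                best, best_score = candidate, score
        temp = max(temp * cooling, 1e-6)  # geometric cooling schedule
    return best, best_score

# Hypothetical toy data: "D" marks a domain residue, "L" a linker residue.
train_seqs = ["MKVLAAGSPSGSPTEVLKA", "AVLKEGSGSPPTLLVKA"]
train_labels = ["DDDDDDLLLLLLLDDDDDD", "DDDDDLLLLLLDDDDDD"]

index = compositional_index(train_seqs, train_labels)
profiles = [smoothed_profile(s, index) for s in train_seqs]
best_threshold, best_score = anneal_threshold(profiles, train_labels)
```

Residues whose smoothed score exceeds `best_threshold` would be labelled as linker in this sketch. A real evaluation would tune the threshold on training proteins and report results on held-out sequences, as the abstract's comparison on two datasets implies.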

Full Text