Abstract
Protein chains are typically large and consist of multiple domains, which are difficult and computationally expensive to characterize using experimental methods. Accurate and reliable prediction of protein domain boundaries is therefore often the initial step in both experimental and computational protein research. In this paper, we propose a straightforward yet effective method that predicts inter-domain linker segments using an amino acid compositional index computed from the amino acid sequence. Each amino acid in the protein sequence is represented by a compositional index, which is deduced from a combination of the amino acid composition information and the difference in amino acid occurrences between domains and linker segments in the training protein sequences. Further, we employ simulated annealing to improve the prediction by finding the optimal set of threshold values that separate domains from inter-domain linkers. The performance of the proposed method is compared to current approaches on two protein sequence datasets. Experimental results show that the proposed method outperforms state-of-the-art methods for inter-domain linker prediction.
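The two steps described above can be illustrated with a minimal sketch. The abstract does not give the exact formula, so everything here is an assumption made for illustration: the index is taken as the negative log-ratio of linker to domain residue frequencies, the profile is a simple sliding-window average, a single threshold stands in for the "set of threshold values", and per-residue accuracy serves as the objective. All function names and parameters (`compositional_index`, `smoothed_profile`, `anneal_threshold`, `pseudocount`, `window`, `start`) are hypothetical.

```python
import math
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def compositional_index(domain_seqs, linker_seqs, pseudocount=1.0):
    """Assumed index: -ln(f_linker / f_domain) per residue type.

    Negative values mark residues that are more frequent in linkers.
    The pseudocount avoids division by zero for unseen residues.
    """
    dom = Counter("".join(domain_seqs))
    lnk = Counter("".join(linker_seqs))
    extra = pseudocount * len(AMINO_ACIDS)
    dom_total = sum(dom[a] for a in AMINO_ACIDS) + extra
    lnk_total = sum(lnk[a] for a in AMINO_ACIDS) + extra
    index = {}
    for a in AMINO_ACIDS:
        f_dom = (dom[a] + pseudocount) / dom_total
        f_lnk = (lnk[a] + pseudocount) / lnk_total
        index[a] = -math.log(f_lnk / f_dom)
    return index


def smoothed_profile(seq, index, window=3):
    """Average the per-residue index over a centered sliding window."""
    half = window // 2
    scores = [index.get(a, 0.0) for a in seq]  # unknown residues score 0
    profile = []
    for i in range(len(scores)):
        lo, hi = max(0, i - half), min(len(scores), i + half + 1)
        profile.append(sum(scores[lo:hi]) / (hi - lo))
    return profile


def accuracy(profile, labels, threshold):
    """Fraction of residues classified correctly (True = linker)."""
    preds = [p < threshold for p in profile]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def anneal_threshold(profile, labels, start=0.0, steps=2000,
                     t0=1.0, cooling=0.995, seed=0):
    """Simulated annealing over a single threshold (a simplification)."""
    rng = random.Random(seed)
    current = best = start
    cur_score = best_score = accuracy(profile, labels, current)
    temp = t0
    for _ in range(steps):
        cand = current + rng.gauss(0.0, 0.2)  # random local move
        score = accuracy(profile, labels, cand)
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature cools.
        if score >= cur_score or rng.random() < math.exp((score - cur_score) / temp):
            current, cur_score = cand, score
        if cur_score > best_score:
            best, best_score = current, cur_score
        temp *= cooling
    return best, best_score


# Toy usage with made-up sequences (not data from the paper).
index = compositional_index(["AAAAVVVVLLLL"], ["GGGGPPPPSSSS"])
seq = "AAAAAGGGGGAAAAA"
labels = [c == "G" for c in seq]  # pretend the G run is the linker
profile = smoothed_profile(seq, index)
threshold, score = anneal_threshold(profile, labels)
```

Residues whose smoothed index falls below the returned threshold would be labeled as linker in this sketch. Extending it to several thresholds would mean perturbing a vector of values in each annealing step rather than a single number.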