Abstract

In this paper, we use network approaches to analyze the relations between protein sequence features for the top hierarchical classes of CATH and SCOP. We use fundamental connectivity measures, namely correlation (CR), normalized mutual information rate (nMIR), and transfer entropy (TE), to analyze the pairwise relationships between the protein sequence features, and we use centrality measures to analyze the weighted networks constructed from the resulting relationship matrices. The centrality analysis reveals both commonalities and differences between the protein 3D structural classes. All top hierarchical classes of CATH and SCOP show strong non-deterministic interactions for the composition and arrangement features of Cysteine (C), Methionine (M), and Tryptophan (W), and for the arrangement features of Histidine (H). The structural classes differ in their centrality distributions and in which features are significant.
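The pipeline summarized above (pairwise relationship matrix, then weighted network, then centrality) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses only the correlation measure, takes the absolute correlation as the edge weight, and computes strength and eigenvector centrality as two example centrality measures. The feature names and values in `toy` are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def weight_matrix(features):
    """Symmetric edge weights |r_ij| between features, with a zero diagonal."""
    names = list(features)
    k = len(names)
    w = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            r = abs(pearson(features[names[i]], features[names[j]]))
            w[i][j] = w[j][i] = r
    return names, w

def strength(w):
    """Weighted degree: sum of edge weights attached to each node."""
    return [sum(row) for row in w]

def eigenvector_centrality(w, iters=200, tol=1e-10):
    """Leading eigenvector of the weight matrix via power iteration."""
    k = len(w)
    v = [1.0 / k] * k
    for _ in range(iters):
        nxt = [sum(w[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:  # network with no edges: nothing to rank
            return v
        nxt = [x / norm for x in nxt]
        if max(abs(a - b) for a, b in zip(nxt, v)) < tol:
            return nxt
        v = nxt
    return v

# Hypothetical composition features for five proteins (illustrative values only).
toy = {
    "comp_C": [0.01, 0.02, 0.03, 0.04, 0.05],
    "comp_M": [0.02, 0.04, 0.06, 0.08, 0.10],
    "comp_W": [0.05, 0.01, 0.04, 0.02, 0.03],
}
names, w = weight_matrix(toy)
s = strength(w)
ev = eigenvector_centrality(w)
for name, si, ei in zip(names, s, ev):
    print(f"{name}: strength={si:.3f} eigenvector={ei:.3f}")
```

In this toy network, `comp_C` and `comp_M` are perfectly correlated and so end up more central than `comp_W`. A directed measure such as transfer entropy would give an asymmetric weight matrix, which this symmetric sketch does not handle.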

Highlights

  • Proteins vary in their sequences, structures, and functions; their structures are encoded by their sequences, while their functions are determined by their structures [1,2,3,4,5,6,7,8]

  • We use classic centrality measures to analyze weighted networks constructed from the pairwise relations between the sequence features, and Welch's t-tests to identify the features that are significant for the different types of protein 3D structures; we find both similarities and differences between the structure types

  • We test the CATH and SCOP data with 0
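As a hedged illustration of the Welch's t-test step mentioned in the highlights, the sketch below computes the t statistic and the Welch-Satterthwaite degrees of freedom for two samples with unequal variances. The two samples stand in for the centrality scores of one feature in two structural classes; their values are hypothetical. Obtaining a p-value additionally requires the Student t distribution, for example via `scipy.stats.ttest_ind(a, b, equal_var=False)`.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    # Squared standard errors of the two sample means (sample variance / n).
    va, vb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

# Hypothetical centrality scores of one feature in two structural classes.
class_a = [2.0, 4.0, 6.0]
class_b = [1.0, 2.0, 3.0]
t_stat, dof = welch_t(class_a, class_b)
print(f"t = {t_stat:.4f}, df = {dof:.4f}")
```

Unlike Student's t-test, this form does not pool the two variances, which is why the degrees of freedom are generally not an integer.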

Introduction

Proteins vary in their sequences, structures, and functions; their structures are encoded by their sequences, while their functions are determined by their structures [1,2,3,4,5,6,7,8]. Many studies have used protein sequence homology to predict the spatial structures of proteins [1]. Other spatial structure prediction methods include homology modelling, threading, and ab initio methods [1]. Popular protein structure prediction servers include SWISS-MODEL [15], RaptorX [16], ROBETTA [17], and I-TASSER [18]. These methods predict protein 3D structures from their sequences. Edler and Grassmann have proposed a protein fold classification method based on feed-forward neural networks (FFN) [20]. Huang et al. have introduced three novel ideas for multiclass
