Abstract

Protein-protein interaction (PPI) networks are important because they provide clues about the functions of individual proteins and enable system-level analyses of cellular processes. Predicting hub proteins, the highly connected proteins in PPI networks, is a challenging computational problem. This paper proposes a method for predicting hub proteins from sequence information with 76% accuracy, 84% sensitivity and 71% specificity. The method uses a biodiversity measure, the Shannon index, together with an amino acid attribute, Transfer Free Energy to Surface (TFES), to distinguish hub proteins from non-hub proteins. In addition, an analysis of disorderliness in hub proteins revealed that some amino acids are more abundant in hub proteins than in non-hub proteins.
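The accuracy, sensitivity and specificity quoted above are presumably the standard confusion-matrix quantities, with "hub" as the positive class. The sketch below only illustrates those definitions; the function name and the example counts are hypothetical and are not taken from the paper's data.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary-classification metrics, treating 'hub' as the positive class."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,   # fraction of all proteins classified correctly
        "sensitivity": tp / (tp + fn),   # fraction of true hubs predicted as hubs
        "specificity": tn / (tn + fp),   # fraction of true non-hubs predicted as non-hubs
    }

# Illustrative counts only (NOT the paper's results).
example = classification_metrics(tp=8, fp=3, tn=7, fn=2)
print(example)
```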

Highlights

  • In bioinformatics, protein data are among the most important data sets

  • A method is proposed for predicting whether a target is a hub protein in a protein-protein interaction (PPI) network; it gives more than 76% accuracy on the data set

  • The method was applied to a random set of data from eukaryotes and prokaryotes, obtained from a literature survey, and was found to have nearly the same accuracy. The proposed method relies primarily on the Shannon index to map a protein sequence to a numerical value
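As a rough illustration of how a Shannon index can map a protein sequence to a single number, the sketch below computes H = -Σ p_i ln p_i over the residue frequencies of a sequence. This is the textbook form of the index; the exact formulation used in the paper (for example, how it is combined with TFES values, or which logarithm base is used) is not given in this excerpt, so the function name and these choices are assumptions.

```python
import math
from collections import Counter

def shannon_index(sequence: str) -> float:
    """Textbook Shannon index H = -sum(p_i * ln(p_i)) over residue frequencies.

    Assumption: natural logarithm and plain residue counts; the paper's
    exact variant may differ.
    """
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence must be non-empty")
    counts = Counter(seq)          # occurrences of each amino acid letter
    total = len(seq)
    return -sum((n / total) * math.log(n / total) for n in counts.values())

# A sequence with a single residue type has zero diversity;
# four equally frequent residues give ln(4).
print(shannon_index("AAAA"))
print(shannon_index("ACDE"))
```

A higher value indicates a more even mix of residue types in the sequence; a lower value indicates that a few residues dominate.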

Summary

Introduction

In bioinformatics, protein data are among the most important data sets. Proteins are the workhorses of life, and their importance can be understood from the following sentence: “Right time, right place and right quantity of protein production makes one healthy”. The last column is the difference between the percentages of non-hub and hub sequences satisfying the required condition. It can be seen from the table that, for non-hub proteins, the percentage of sequences satisfying the given condition is highest for the amino acid ‘Y’ and lowest for ‘V’. The columns of this table are similar to those of the previous table. It can be seen from the table that, for both non-hub and hub proteins, the percentage of sequences satisfying the given condition is highest for the amino acid ‘H’ and lowest for ‘V’.

