Abstract

Background: Functional annotation of rapidly amassing nucleotide and protein sequences presents a challenging task for modern bioinformatics. This is particularly true for protein families sharing extremely low sequence identity, as is the case for lipocalins, a family of proteins with varied functions and great diversity at the sequence level, yet conserved structures.

Results: In the present study we propose an SVM-based method for identifying lipocalin protein sequences. The SVM models were trained on input features generated from amino acid, dipeptide and secondary structure compositions as well as PSSM profiles. The model derived using both PSSM and secondary structure emerged as the best model in the study. Apart from achieving a high prediction accuracy (>90% in leave-one-out), lipocalinpred correctly differentiates closely related fatty acid-binding proteins and triabins as non-lipocalins.

Conclusion: The method offers a promising approach as a lipocalin prediction tool, complementing PROSITE, Pfam and homology modelling methods.
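To make the composition-based input features concrete, the following is a minimal sketch of how amino acid and dipeptide compositions might be computed for a single sequence. It is an illustration under stated assumptions, not the authors' implementation: the function names, the alphabet ordering, and the choice to skip non-standard residues are all hypothetical. Secondary structure composition and PSSM profiles require external predictors and are not shown.

```python
from itertools import product

# The 20 standard amino acids; the ordering here is an arbitrary assumption.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(seq):
    """Return a 20-element vector of residue fractions.

    Non-standard residues (e.g. 'X') are ignored; this is an assumption
    of the sketch, not something stated in the source.
    """
    residues = [aa for aa in seq.upper() if aa in AMINO_ACIDS]
    total = len(residues)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [residues.count(aa) / total for aa in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return a 400-element vector of adjacent-pair fractions.

    Pairs containing a non-standard residue are skipped (assumption).
    """
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    seq = seq.upper()
    for first, second in zip(seq, seq[1:]):
        pair = first + second
        if pair in counts:
            counts[pair] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(counts)
    return [counts[key] / total for key in counts]


if __name__ == "__main__":
    toy_sequence = "ACCA"  # toy input, not a real lipocalin
    features = amino_acid_composition(toy_sequence) + dipeptide_composition(toy_sequence)
    print(len(features))  # 20 + 400 features
```

A fixed-length vector like this can be handed to any SVM library regardless of the sequence length, which is the main reason composition features are convenient for families with low sequence identity.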

Highlights

  • Each sequence served as the query sequence once while the remaining formed the database, with the procedure iterating on each sequence

  • With the tremendous increase in the number of sequences accumulating from different sequencing projects, the number of such sequences may be substantially high because of the absence of any lipocalin hit in similarity-based searches
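The leave-one-out procedure mentioned in the highlights, where each sequence serves once as the query while the rest form the database, can be sketched as a simple generator. This is only an illustration: the function name and the toy sequence identifiers are hypothetical, and the actual search or classification step applied to each split is not shown.

```python
def leave_one_out(sequences):
    """Yield (query, database) pairs, holding out each sequence exactly once."""
    for index, query in enumerate(sequences):
        # Everything except the held-out sequence forms the database.
        database = sequences[:index] + sequences[index + 1:]
        yield query, database


if __name__ == "__main__":
    toy_ids = ["seq_a", "seq_b", "seq_c"]  # placeholder identifiers
    for query, database in leave_one_out(toy_ids):
        print(query, database)
```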

Summary

Introduction

Functional annotation of rapidly amassing nucleotide and protein sequences presents a challenging task for modern bioinformatics. The lipocalins belong to the calycin superfamily, along with fatty acid binding proteins (FABPs), avidins, metallo-protease inhibitors and triabins. In contrast to their poor sequence similarity (identity falling below 20% for paralogs), lipocalins share a highly conserved three-dimensional structure. The space between the two β-sheets of the barrel defines an internal apolar binding cavity with high structural plasticity, consisting of four structurally hypervariable peptide loops, mounted on the barrel. Lipocalins are divided into two groups according to the presence of three structurally conserved regions (SCRs). The more divergent, outlier lipocalins share only one or two SCRs [1].
