Abstract

BackgroundBioluminescent proteins (BLPs) widely exist in many living organisms. As BLPs are featured by the capability of emitting lights, they can be served as biomarkers and easily detected in biomedical research, such as gene expression analysis and signal transduction pathways. Therefore, accurate identification of BLPs is important for disease diagnosis and biomedical engineering. In this paper, we propose a novel accurate sequence-based method named PredBLP (Prediction of BioLuminescent Proteins) to predict BLPs.ResultsWe collect a series of sequence-derived features, which have been proved to be involved in the structure and function of BLPs. These features include amino acid composition, dipeptide composition, sequence motifs and physicochemical properties. We further prove that the combination of four types of features outperforms any other combinations or individual features. To remove potential irrelevant or redundant features, we also introduce Fisher Markov Selector together with Sequential Backward Selection strategy to select the optimal feature subsets. Additionally, we design a lineage-specific scheme, which is proved to be more effective than traditional universal approaches.ConclusionExperiment on benchmark datasets proves the robustness of PredBLP. We demonstrate that lineage-specific models significantly outperform universal ones. We also test the generalization capability of PredBLP based on independent testing datasets as well as newly deposited BLPs in UniProt. PredBLP is proved to be able to exceed many state-of-art methods. A web server named PredBLP, which implements the proposed method, is free available for academic use.

Highlights

  • Bioluminescent proteins (BLPs) widely exist in many living organisms

  • To investigate the amino acid preference of BLPs, we calculate the features of amino acid composition (AAC) for BLPs and non-BLPs respectively

  • We empirically demonstrate that amino acid compositions with relative difference higher than 0.25% are discriminatory

Read more

Summary

Introduction

Bioluminescent proteins (BLPs) widely exist in many living organisms. As BLPs are featured by the capability of emitting lights, they can be served as biomarkers and detected in biomedical research, such as gene expression analysis and signal transduction pathways. We propose a novel accurate sequence-based method named PredBLP (Prediction of BioLuminescent Proteins) to predict BLPs. Bioluminescence is a special process of chemiluminescence, which is common in many living organisms across the lineages of bacteria, eukaryota and archaea [1]. Bioluminescent proteins (BLPs), with the capability of emitting light by converting chemical energy to light energy, play a critical role in bioluminescence [2, 3]. Employed as highly sensitive labels, they are enormously useful in non-invasive in-vivo biomedical research, such as gene expression analyses [4] and signal transduction pathways [5].

Methods
Results
Conclusion

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.