Abstract
Background
Bioluminescent proteins (BLPs) are found in many living organisms. Because BLPs emit light, they can serve as biomarkers and are easily detected in biomedical research, such as gene expression analysis and the study of signal transduction pathways. Accurate identification of BLPs is therefore important for disease diagnosis and biomedical engineering. In this paper, we propose a novel, accurate sequence-based method named PredBLP (Prediction of BioLuminescent Proteins) to predict BLPs.
Results
We collect a series of sequence-derived features that have been shown to be involved in the structure and function of BLPs. These features include amino acid composition, dipeptide composition, sequence motifs and physicochemical properties. We further show that combining all four feature types outperforms any other combination or any individual feature type. To remove potentially irrelevant or redundant features, we also introduce the Fisher Markov Selector together with a Sequential Backward Selection strategy to select the optimal feature subsets. Additionally, we design a lineage-specific scheme, which proves more effective than traditional universal approaches.
Conclusion
Experiments on benchmark datasets demonstrate the robustness of PredBLP. We demonstrate that lineage-specific models significantly outperform universal ones. We also test the generalization capability of PredBLP on independent testing datasets as well as on newly deposited BLPs in UniProt. PredBLP outperforms many state-of-the-art methods. A web server named PredBLP, which implements the proposed method, is freely available for academic use.
Highlights
Bioluminescent proteins (BLPs) are found in many living organisms
To investigate the amino acid preference of BLPs, we calculate amino acid composition (AAC) features for BLPs and non-BLPs separately
We empirically demonstrate that amino acid compositions with a relative difference greater than 0.25% are discriminatory
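The AAC comparison described in the highlights can be illustrated with a minimal sketch. It is not the PredBLP implementation: the function names are hypothetical, and the "relative difference" is assumed here to be the difference between the mean percentage compositions of the two classes, which the text above does not define precisely.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac(sequence):
    """Percentage composition of each standard amino acid in one sequence."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}


def mean_aac(sequences):
    """Average the per-sequence AAC profiles over a set of sequences."""
    profiles = [aac(s) for s in sequences]
    return {aa: sum(p[aa] for p in profiles) / len(profiles) for aa in AMINO_ACIDS}


def discriminatory_residues(blps, non_blps, threshold=0.25):
    """Residues whose mean composition differs by more than `threshold`
    percentage points between the two classes (assumed interpretation)."""
    pos, neg = mean_aac(blps), mean_aac(non_blps)
    return {
        aa: pos[aa] - neg[aa]
        for aa in AMINO_ACIDS
        if abs(pos[aa] - neg[aa]) > threshold
    }
```

Calling `discriminatory_residues(blp_sequences, non_blp_sequences)` on two lists of protein sequences returns a dictionary mapping each selected residue to its signed difference; positive values indicate residues enriched in the first set.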
Summary
Bioluminescent proteins (BLPs) are found in many living organisms. Because BLPs emit light, they can serve as biomarkers and be detected in biomedical research, such as gene expression analysis and the study of signal transduction pathways. We propose a novel, accurate sequence-based method named PredBLP (Prediction of BioLuminescent Proteins) to predict BLPs. Bioluminescence is a special form of chemiluminescence that is common in many living organisms across the lineages of bacteria, eukaryota and archaea [1]. BLPs, which emit light by converting chemical energy into light energy, play a critical role in bioluminescence [2, 3]. Employed as highly sensitive labels, they are enormously useful in non-invasive in vivo biomedical research, such as gene expression analyses [4] and studies of signal transduction pathways [5].