Abstract

Background: Protein subcellular localization plays a crucial role in understanding cell function. Proteins need to be in the right place at the right time and combine with the corresponding molecules to fulfill their functions. Furthermore, prediction of protein subcellular location can guide drug design and development by indicating potential molecular targets, and it plays an essential role in genome annotation. Current image-based protein subcellular localization has three common drawbacks: obsolete datasets whose label information has not been updated, stereotypical feature descriptors based on the spatial domain or grey level, and single-function prediction algorithms that are limited to handling single-label databases.

Results: In this paper, a novel human protein subcellular localization prediction model, MIC_Locator, is proposed. First, the latest datasets are collected and collated as the benchmark dataset for training the prediction model, replacing obsolete data. Second, Fourier transformation, Riesz transformation, Log-Gabor filter and intensity coding strategy are employed to obtain frequency features based on the three components of the monogenic signal at different frequency scales. Third, a chained prediction model is proposed to handle multi-label rather than single-label datasets. The experimental results show that MIC_Locator achieves 60.56% subset accuracy and outperforms the majority of existing prediction models, and that the frequency feature and intensity coding strategy help improve classification accuracy.

Conclusions: Our results demonstrate that the frequency feature is more beneficial for improving model performance than features extracted from the spatial domain, and that the MIC_Locator proposed in this paper can accelerate the validation of protein annotation, the understanding of protein function, and proteomics research.
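The subset accuracy reported above is a strict multi-label metric: a sample counts as correct only when its entire predicted label set matches the true label set. The following is a minimal sketch of that metric, not the authors' evaluation code; the function name `subset_accuracy` and the toy label matrices are assumptions made for illustration.

```python
def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label vector equals the true one exactly.

    y_true, y_pred: sequences of equal-length 0/1 label vectors,
    one vector per sample (hypothetical toy format).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must contain the same number of samples")
    if not y_true:
        raise ValueError("at least one sample is required")
    exact_matches = sum(
        1 for t, p in zip(y_true, y_pred) if list(t) == list(p)
    )
    return exact_matches / len(y_true)


# Toy example with three hypothetical locations per sample.
y_true = [[1, 0, 0], [0, 1, 1], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1]]

# The second sample misses one of its two labels, so it counts as wrong.
score = subset_accuracy(y_true, y_pred)
print(score)
```

Because a partially correct prediction scores zero, subset accuracy is usually lower than per-label accuracy on the same predictions.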

Highlights

  • Protein subcellular localization plays a crucial role in understanding cell function

  • To capture the essential local properties of an IHC image, Fourier transformation, Riesz transformation, Log-Gabor filter and intensity coding strategy are employed to obtain frequency features based on the three components of the monogenic signal at several frequency scales. A 2-dimensional fast Fourier transform converts the target protein channel from the spatial domain into the frequency domain, and the Riesz transformation [46] is then applied to obtain two frequency responses in orthogonal directions [47]

  • The benchmark involves seven subcellular locations: “Cytosol”, “Endoplasmic reticulum”, “Golgi apparatus”, “Nucleoli”, “Mitochondria”, “Nucleus” and “Vesicles”
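The frequency-domain pipeline named in the highlights (2-D FFT, Log-Gabor band-pass filtering, Riesz transformation) can be sketched with NumPy as follows. This is an illustrative sketch, not the MIC_Locator implementation. It assumes the three monogenic components are local amplitude, local phase and local orientation, as is common in the monogenic signal literature. The function names, the `wavelength` and `sigma_on_f` parameters and their default values are assumptions, the sign convention of the Riesz transfer functions varies between references, and the intensity coding step is omitted.

```python
import numpy as np


def log_gabor_radial(shape, wavelength, sigma_on_f=0.55):
    """Radial Log-Gabor band-pass filter on the FFT frequency grid (assumed parameters)."""
    rows, cols = shape
    u = np.fft.fftfreq(cols)            # horizontal frequencies, cycles per pixel
    v = np.fft.fftfreq(rows)            # vertical frequencies, cycles per pixel
    U, V = np.meshgrid(u, v)            # both have shape (rows, cols)
    radius = np.sqrt(U ** 2 + V ** 2)
    radius[0, 0] = 1.0                  # placeholder to avoid log(0) at the DC term
    f0 = 1.0 / wavelength               # centre frequency of the band
    lg = np.exp(-(np.log(radius / f0)) ** 2 / (2.0 * np.log(sigma_on_f) ** 2))
    lg[0, 0] = 0.0                      # a Log-Gabor filter has no DC response
    return lg, U, V, radius


def monogenic_components(image, wavelength=8.0, sigma_on_f=0.55):
    """Return local amplitude, phase and orientation at one frequency scale."""
    image = np.asarray(image, dtype=float)
    lg, U, V, radius = log_gabor_radial(image.shape, wavelength, sigma_on_f)
    F = np.fft.fft2(image)              # spatial domain -> frequency domain

    # Riesz transfer functions: two responses in orthogonal directions.
    H1 = 1j * U / radius
    H2 = 1j * V / radius

    f = np.real(np.fft.ifft2(F * lg))        # band-passed image (even part)
    h1 = np.real(np.fft.ifft2(F * lg * H1))  # Riesz response along x
    h2 = np.real(np.fft.ifft2(F * lg * H2))  # Riesz response along y

    odd = np.sqrt(h1 ** 2 + h2 ** 2)
    amplitude = np.sqrt(f ** 2 + odd ** 2)
    phase = np.arctan2(odd, f)               # lies in [0, pi]
    orientation = np.arctan2(h2, h1)
    return amplitude, phase, orientation


# Usage on a synthetic image standing in for a protein channel.
rng = np.random.default_rng(0)
image = rng.random((32, 32))
amplitude, phase, orientation = monogenic_components(image)

# Several frequency scales can be obtained by varying the wavelength.
scales = [monogenic_components(image, wavelength=w) for w in (4.0, 8.0, 16.0)]
```

Each scale yields three maps of the same size as the input, which a later encoding step could turn into a feature vector.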


Summary

Introduction

Protein subcellular localization plays a crucial role in understanding cell function. Prediction of protein subcellular location can guide drug design and development by indicating potential molecular targets, and it plays an essential role in genome annotation. Identifying the subcellular locations of proteins can improve our understanding of their functions, mechanisms of molecular interaction, genome annotation and identification of drug targets [1, 2]. Proteins synthesized by the ribosome must be transported to their corresponding subcellular locations to fulfill their functions. Understanding protein subcellular localization can greatly improve target identification during drug discovery. Although traditional protein subcellular location annotation is derived from biological experiments in the wet laboratory, computational models offer an attractive complement to these time-consuming and laborious experimental methods [6, 7].
