Abstract
The Golgi apparatus is a major membrane-bound organelle of eukaryotic cells, made up of a series of flattened, stacked pouches called cisternae. It is responsible for modifying, transporting, and packaging proteins and lipids into membrane-bound vesicles for delivery to targeted destinations, and it is a central organelle mediating protein and lipid transport within eukaryotic cells. Functional defects of the Golgi apparatus are associated with many neurodegenerative diseases, such as Parkinson's and Alzheimer's diseases. Golgi-resident proteins play an important role in the processing performed by the Golgi apparatus, which includes storing, packaging, and dispatching proteins. Identifying sub-Golgi protein types can help researchers develop more effective therapies and drugs for diseases that result from disorders of Golgi-resident proteins. In this paper, we propose a computational model that uses machine learning to discriminate cis-Golgi proteins from trans-Golgi proteins. First, we use PseKNC, K-separated Bigrams, and PsePSSM as feature extraction techniques, and then we select the optimal features among those identified by PseKNC with the AdaBoost classifier. To create a balanced dataset from the imbalanced set of Golgi proteins, we use the Random-SMOTE oversampling approach. Finally, we employ the SVM algorithm to distinguish cis-Golgi proteins from trans-Golgi proteins. The proposed method achieves accuracies of 96.5%, 96.5%, and 96.9% under jackknife cross-validation, independent testing, and 10-fold cross-validation, respectively, exceeding the performance of previous related work.
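The balancing step mentioned in the abstract can be illustrated with a small sketch. The abstract does not spell out the Random-SMOTE procedure, so the code below follows the commonly described formulation (a synthetic point is drawn inside the triangle formed by a minority sample and two other randomly chosen minority samples) and should be read as an assumption, not as the authors' implementation. The function names `random_smote` and `balance` are hypothetical, and the feature vectors are plain lists of floats.

```python
import random


def random_smote(minority, n_new, seed=0):
    """Create n_new synthetic minority-class vectors (Random-SMOTE-style sketch)."""
    if len(minority) < 3:
        raise ValueError("need at least three minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        # Pick a root sample x and two other distinct minority samples y1, y2.
        x, y1, y2 = rng.sample(minority, 3)
        # Temporary point on the segment between y1 and y2.
        a = rng.random()
        temp = [p + a * (q - p) for p, q in zip(y1, y2)]
        # New point on the segment between x and the temporary point.
        b = rng.random()
        synthetic.append([xi + b * (ti - xi) for xi, ti in zip(x, temp)])
    return synthetic


def balance(majority, minority, seed=0):
    """Return the minority class padded with synthetic points to match the majority size."""
    n_new = max(0, len(majority) - len(minority))
    if n_new == 0:
        return list(minority)
    return list(minority) + random_smote(minority, n_new, seed)
```

Because every synthetic point is a convex combination of three real minority samples, it stays inside the region already occupied by the minority class. In a pipeline like the one described, oversampling would normally be applied to the training data only, so that synthetic points do not leak into the test set.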
Highlights
The Golgi apparatus, also known as the Golgi complex or Golgi body, is the central organelle that mediates protein and lipid transport within eukaryotic cells [1]
Typical animal cells may have fewer and larger Golgi apparatus units, while plant cells may contain as many as hundreds of smaller ones
No two protein sequences in the training dataset share more than 40% pairwise identity, and no two protein sequences in the test dataset share more than 25% pairwise identity
Summary
The Golgi apparatus, also known as the Golgi complex or Golgi body, is the central organelle that mediates protein and lipid transport within eukaryotic cells [1]. It is located close to both the rough endoplasmic reticulum (ER) and the nucleus. The number of Golgi apparatus bodies within a single eukaryotic cell varies. The Golgi apparatus receives proteins and lipids from the rough ER and modifies, sorts, concentrates, and packs them into sealed droplets called vesicles before sending them out to the cytoplasm. The Golgi apparatus is composed of a series of compartments called cisternae, which are fused and