Abstract

Many organelles inside and outside a living cell depend on the proper functioning of the Golgi apparatus to operate normally. Its malfunction may lead to many inheritable diseases, such as diabetes and cancer. It is therefore crucial to detect abnormal behavior of the Golgi apparatus early. Accurate discrimination of cis-Golgi from trans-Golgi proteins helps researchers identify the role of Golgi proteins in various diseases and assists pharmacists in drug development. In this work, various hybrid models of Bi-Profile Bayes, Bigram PSSM, Di-Peptide Composition, and Split Amphiphilic Pseudo Amino Acid Composition, combined with the SMOTE oversampling technique, are employed to discriminate Golgi protein types. Multiple linear Support Vector Machines are used to exploit the discriminative power of these models. The proposed prediction system, Golgi-predictor, achieved promising results compared with existing state-of-the-art techniques. Under 10-fold cross-validation, the proposed system achieved an accuracy of 97.6%, sensitivity of 98.8%, specificity of 96.5%, G-mean of 97.6%, MCC of 0.95, and F-score of 0.97. Under jackknife cross-validation, the corresponding values for accuracy, sensitivity, specificity, G-mean, MCC, and F-score are 96.5%, 97.8%, 95.2%, 96.4%, 0.93, and 0.96, respectively. Moreover, in independent dataset testing, Golgi-predictor demonstrated a significant improvement in performance over other techniques.
The proposed methodology aims to support drug designers in the pharmaceutical industry and to assist researchers in bioinformatics and computational biology in better predicting the behavior of Golgi proteins.
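
The six measures reported above can all be derived from the four counts of a binary confusion matrix. The following is a minimal sketch, assuming the standard textbook definitions of these measures; the function name `binary_metrics`, the choice of which Golgi class counts as "positive", and the example counts are illustrative assumptions and are not taken from the paper.

```python
import math

def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute six evaluation measures from confusion-matrix counts.

    Assumes every denominator is non-zero (each class is present and
    at least one sample is predicted positive).
    """
    sensitivity = tp / (tp + fn)                # true positive rate (recall)
    specificity = tn / (tn + fp)                # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    precision = tp / (tp + fp)
    g_mean = math.sqrt(sensitivity * specificity)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    # Matthews correlation coefficient, in the range [-1, 1]
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "g_mean": g_mean,
        "mcc": mcc,
        "f_score": f_score,
    }

# Hypothetical counts, chosen only for illustration (not from the paper).
m = binary_metrics(tp=90, fn=10, tn=80, fp=20)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

As a consistency check on the reported 10-fold figures, the geometric mean of a sensitivity of 98.8% and a specificity of 96.5% is approximately 97.6%, which agrees with the reported G-mean.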

Highlights

  • Cells in different organisms, as well as in different parts of the same organism, perform unique functions and possess distinctive features

  • This paper presents a majority-voting-based ensemble of multiple Support Vector Machines (SVMs) that exploits the discriminative power of various hybrid models constructed from Bi-Profile Bayes (BPB), Bigram Position Specific Scoring Matrix (PSSM), Di-Peptide Composition (DPC), and Split Amphiphilic Pseudo Amino Acid Composition (Amph-PseAAC) features

  • We propose hybrid models, used in conjunction with the SMOTE oversampling technique, that are exploited by a linear SVM for classification
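
The majority-voting step mentioned in the highlights can be sketched as follows. This is only an illustration of plain (unweighted) majority voting over the label predictions of several classifiers; the function name `majority_vote`, the three hypothetical models, and their predictions are assumptions made for the example, and the paper's actual ensemble may differ in its details.

```python
from collections import Counter

def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sample.

    predictions_per_model: one list of labels per model, all of the
    same length. With an odd number of models and two classes, a tie
    cannot occur.
    """
    combined = []
    # zip(*...) groups the votes that all models cast for the same sample
    for votes in zip(*predictions_per_model):
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined

# Hypothetical predictions from three classifiers, e.g. one linear SVM
# per hybrid feature model, for four protein sequences.
preds = [
    ["cis", "trans", "cis", "trans"],
    ["cis", "cis",   "cis", "trans"],
    ["trans", "trans", "cis", "cis"],
]
ensemble_labels = majority_vote(preds)
print(ensemble_labels)
```

Using an odd number of base classifiers keeps the two-class vote free of ties, so no tie-breaking rule is needed in this sketch.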


Summary

Introduction

Cells in different organisms, as well as in different parts of the same organism, perform unique functions and possess distinctive features. The eukaryotic cell holds a defined nucleus, genetic material in the form of DNA, and several other organelles, including cytoplasm, mitochondria, lysosomes, endoplasmic reticulum, and Golgi apparatus, that help it carry out activities such as digestion, movement, and reproduction [1]. The Golgi apparatus is among the essential organelles and is composed of flattened sacs [2]. It further processes proteins and lipids received from the endoplasmic reticulum [3] and packages them for transport, through secretory vesicles, to the exterior of the cell or to other locations within the same cell. This further processing of proteins and lipids inside the Golgi apparatus happens systematically.
