Deep Learning Algorithm for Detection of Protein Remote Homology

Fahriye Gemci, Ulus Cevik, Turgay Ibrikci

doi:10.32604/csse.2023.032706

Abstract

This study uses computer algorithms to detect remote homologous proteins, a significant problem in bioinformatics. In this experimental study, structural classification of proteins (SCOP) 1.53, the SCOP benchmark, and a newly created SCOP protein database derived from structural classification of proteins, extended (SCOPe) 2.07 were used to detect remote homologous proteins. Features were extracted from the protein sequences in these databases with the n-gram method followed by Term Frequency-Inverse Document Frequency (TF-IDF) weighting. A smoothing process was then applied to the extracted features to avoid misclassification. Finally, the proteins with balanced features were classified as remote homologs using the proposed deep learning architecture. Using both negative and positive protein instances, the novel deep learning architecture detected remote homologous proteins with a mean accuracy of 89.13% and a mean relative operating characteristic (ROC) score of 88.39%. The experiment demonstrated the following: 1) successful detection of remote homology is promising for discovering new proteins and thus for drug discovery in medicine; 2) natural language processing (NLP) techniques can be used successfully in bioinformatics; 3) choosing the correct n-value in the n-gram process is important; 4) a classification problem needs negative as well as positive instances; 5) processes such as smoothing are effective for classification accuracy on an imbalanced dataset; and 6) the deep learning architecture gives better results than the support vector machine (SVM) model on the smoothed data for detecting protein remote homology.
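
The feature-extraction step described above (n-grams followed by TF-IDF weighting) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `ngrams` and `tfidf_features`, the toy sequences, the choice n = 3, and the specific formulas (term frequency as a relative count, idf = log(N / df)) are all assumptions made for illustration, since the abstract does not state them.

```python
import math
from collections import Counter


def ngrams(sequence, n):
    """Return the overlapping n-grams (k-mers) of a protein sequence."""
    return [sequence[i:i + n] for i in range(len(sequence) - n + 1)]


def tfidf_features(sequences, n=3):
    """Compute TF-IDF weights for the n-grams of each sequence.

    Assumed formulas (textbook variants, not taken from the paper):
    tf = count / total n-grams in the sequence, idf = log(N / df).
    """
    docs = [Counter(ngrams(seq, n)) for seq in sequences]
    n_docs = len(docs)

    # Document frequency: number of sequences containing each n-gram.
    df = Counter()
    for doc in docs:
        df.update(doc.keys())

    features = []
    for doc in docs:
        total = sum(doc.values())
        features.append({
            gram: (count / total) * math.log(n_docs / df[gram])
            for gram, count in doc.items()
        })
    return features


# Toy sequences, made up for illustration (not taken from SCOP).
toy = ["MKVLAAGMKV", "MKVLSTGQ", "GGSTAAGL"]
weights = tfidf_features(toy, n=3)
```

Each entry of `weights` maps the n-grams of one sequence to their TF-IDF weights. An n-gram that appears in every sequence gets weight 0 under this idf definition, which is one reason the choice of n matters: short n-grams are shared by almost all proteins, while long ones rarely repeat.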

Journal: Computer Systems Science and Engineering
Publication Date: Jan 1, 2023
Citations: 1
License type: cc-by

Similar Papers

Remote protein homology detection and fold recognition using two-layer support vector machine classifiers
Hilmi M Muda ... Razib M Othman
Computers in Biology and Medicine | VOL. 41
25 Jun 2011

Some Effects of Negative Instances on the Formation of Simple Concepts
Janellen Huttenlocher
Psychological Reports | VOL. 11
01 Aug 1962

Effects of negative instances in concept acquisition using a verbal learning task.
Robert D Tennyson
Journal of Educational Psychology | VOL. 64
01 Jan 1973

Latent Semantic Analysis- and Hierarchical Clustering-Based Method for Detecting Remote Protein Homology
Tianjiao Zhang ... Yadong Wang
13 Jun 2016
