Prediction of protein secondary structure based on an improved channel attention and multiscale convolution module.

Abstract

Predicting protein secondary structure is a key problem in protein science. Protein secondary structure prediction (PSSP) aims to construct a function that maps an amino acid sequence to its secondary structure, so that the secondary structure can be obtained from the sequence alone. Driven by deep learning, PSSP accuracy has improved greatly in recent years. To explore a new technique for PSSP, this study introduces the concept of an adversarial game into secondary structure prediction and proposes a prediction model based on a conditional generative adversarial network (GAN). We introduce a new multiscale convolution module and an improved channel attention (ICA) module into the generator to generate the secondary structure, and design a discriminator that competes with the generator so that the model learns the complicated features of proteins. We then propose a PSSP method based on the multiscale convolution module and the ICA module. The experimental results indicate that the conditional GAN-based protein secondary structure prediction (CGAN-PSSP) model is workable and worthy of further study because of the strong feature-learning ability of adversarial learning.
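
The abstract does not specify the internals of the multiscale convolution or ICA modules, so the following is only a minimal sketch of the general idea, not the authors' architecture. It assumes a single numeric feature track per residue, fixed (unlearned) averaging kernels of three widths, and a simplified squeeze-and-excitation-style gate in place of the paper's ICA module. The names `conv1d_same`, `multiscale_conv`, and `channel_attention` are hypothetical.

```python
import math

def conv1d_same(x, kernel):
    """Slide an odd-width kernel over x with zero padding, keeping the length of x."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(kernel[j] * padded[i + j] for j in range(k)) for i in range(len(x))]

def multiscale_conv(x, kernels):
    """Apply kernels of different widths to the same input: one output channel per kernel."""
    return [conv1d_same(x, kernel) for kernel in kernels]

def channel_attention(channels):
    """Simplified channel attention (assumption: no learned weights).

    Squeeze: average each channel over the sequence.
    Excite:  pass the average through a sigmoid to get a gate in (0, 1).
    Scale:   multiply every value in the channel by its gate.
    """
    gates = [1.0 / (1.0 + math.exp(-sum(c) / len(c))) for c in channels]
    scaled = [[g * v for v in c] for g, c in zip(gates, channels)]
    return scaled, gates

# Hypothetical per-residue feature track and three receptive-field widths (3, 5, 7).
features = [0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0]
kernels = [[1.0 / w] * w for w in (3, 5, 7)]
channels = multiscale_conv(features, kernels)
reweighted, gates = channel_attention(channels)
```

In a trained network the kernels and the gating function would be learned, and the reweighted channels would feed the rest of the generator; here they are fixed so the data flow is easy to follow.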

Citations
Showing 10 of 12 papers
  • Research Article
  • Cite Count Icon 4
  • 10.1080/07391102.2024.2314264
Structural, functional, molecular docking analysis of a hypothetical protein from Talaromyces marneffei and its molecular dynamic simulation: an in-silico approach
  • Feb 3, 2024
  • Journal of Biomolecular Structure and Dynamics
  • Md Masudur Rahman Munna + 3 more

Talaromyces marneffei (formerly Penicillium marneffei) is an endemic pathogenic fungus in Southern China and Southeast Asia. It can cause disease in patients with travel-related exposure to this organism and high morbidity and mortality in patients with acquired immune deficiency syndrome (AIDS). In this study, we analyzed the structure and function of a hypothetical protein from T. marneffei using several bioinformatics tools and servers to unveil novel pharmacological targets and design a peptide vaccine against specific epitopes. A total of seven functional epitopes were screened on the protein, and ‘STGVDMWSV’ was the most antigenic, non-allergenic and non-toxic. Molecular docking showed stronger affinity between the CTL epitope ‘STGVDMWSV’ and the MHC I allele HLA-A*02:01, with a higher docking score of −234.98 kcal/mol, and the complex showed stable interactions during a 100 ns molecular dynamics simulation. Overall, the results of this study revealed that this hypothetical protein is crucial for comprehending biochemical and physiological pathways and identifying novel therapeutic targets for human health.

  • Research Article
  • 10.1016/j.compbiomed.2025.110457
DCBLSTM-Deep Convolutional Bidirectional Long Short-Term Memory neural network for Q8 secondary protein structure prediction.
  • Sep 1, 2025
  • Computers in biology and medicine
  • Suvidhi Banthia + 3 more

  • Research Article
  • Cite Count Icon 1
  • 10.18502/ijpa.v18i3.13753
In Silico Vaccine Design and Expression of the Multi-Component Protein Candidate against the Toxoplasma gondii Parasite from MIC13, GRA1, and SAG1 Antigens.
  • Oct 4, 2023
  • Iranian Journal of Parasitology
  • Zahra Hosseininejad + 7 more

We aimed to design a B and T cell recombinant protein vaccine of Toxoplasma gondii using an in silico approach. MIC13 plays an important role in spreading the parasite in the host body. GRA1 causes the persistence of the parasite in the parasitophorous vacuole. SAG1 plays a role in host-cell adhesion and cell invasion. Amino acid positions 73-272 from MIC13, 71-190 from GRA1, and 101-300 from SAG1 were selected and joined with linker A(EAAAK)A. The structures, antigenicity, allergenicity, physicochemical properties, as well as codon optimization and mRNA structure of this recombinant protein, called MGS1, were predicted using bioinformatics servers. The designed structure was synthesized and then cloned in pET28a (+) plasmid and transformed into Escherichia coli BL21. The number of amino acids in this antigen was 555, and its antigenicity was estimated to be 0.6340. SDS-PAGE and Western blotting confirmed gene expression and successful production of the protein with a molecular weight of 59.56 kDa. This protein will be used in our future studies as an anti-Toxoplasma vaccine candidate in animal models. In silico methods are efficient for understanding information about proteins, selecting immunogenic epitopes, and finally producing recombinant proteins, as well as reducing the time and cost of vaccine design.

  • Research Article
  • Cite Count Icon 3
  • 10.1007/s00521-024-09822-8
An improved multi-scale convolutional neural network with gated recurrent neural network model for protein secondary structure prediction
  • May 13, 2024
  • Neural Computing and Applications
  • Vrushali Bongirwar + 1 more

  • Research Article
  • Cite Count Icon 3
  • 10.1007/978-1-0716-4213-9_1
Recent Advances in Computational Prediction of Secondary and Supersecondary Structures from Protein Sequences.
  • Nov 14, 2024
  • Methods in molecular biology (Clifton, N.J.)
  • Jian Zhang + 4 more

The secondary structures (SSs) and supersecondary structures (SSSs) underlie the three-dimensional structure of proteins. Prediction of the SSs and SSSs from protein sequences enjoys high levels of use and finds numerous applications in the development of a broad range of other bioinformatics tools. Numerous sequence-based predictors of SS and SSS were developed and published in recent years. We survey and analyze 45 SS predictors that were released since 2018, focusing on their inputs, predictive models, scope of their prediction, and availability. We also review 32 sequence-based SSS predictors, which primarily focus on predicting coiled coils and beta-hairpins and which include five methods that were published since 2018. A substantial majority of these predictive tools rely on machine learning models, including a variety of deep neural network architectures. They also frequently use evolutionary sequence profiles. We discuss details of several modern SS and SSS predictors that are currently available to users and which were published in higher impact venues.

  • Research Article
  • 10.1038/s41598-025-17513-0
Combining knowledge distillation and neural networks to predict protein secondary structure
  • Aug 31, 2025
  • Scientific Reports
  • Lufei Zhao + 3 more

The secondary structure of a protein serves as the foundation for constructing its three-dimensional (3D) structure, which in turn is critical for determining its function and role in biological processes. Therefore, accurately predicting secondary structure not only facilitates the understanding of a protein’s 3D conformation but also provides essential insights into its interactions, functional mechanisms, and potential applications in biomedical research. Deep learning models are particularly effective in protein secondary structure prediction because of their ability to process complex sequence data and extract meaningful patterns, thereby increasing prediction accuracy and efficiency. This study proposes a combined model, ITBM-KD, which integrates an improved temporal convolutional network (TCN), bidirectional recurrent neural network (BiRNN), and multilayer perceptron (MLP) to increase the accuracy of protein secondary structure prediction for octapeptides and tripeptides. By combining one-hot encoding, word vector representation of physicochemical properties, and knowledge distillation with the ProtT5 model, the proposed model achieves excellent performance on multiple datasets. To evaluate its effectiveness, two classic datasets, TS115 and CB513, containing 115 and 513 protein datasets, respectively, were used. In addition, 15,078 protein data points collected from the PDB database from June 6, 2018, to June 6, 2020, were used to further verify the robustness and generalizability of the model. This study improves prediction accuracy and provides an essential model for understanding protein structure and function, especially in resource-limited settings.

  • Book Chapter
  • Cite Count Icon 2
  • 10.1007/978-981-99-9621-6_22
AI-Assisted Methods for Protein Structure Prediction and Analysis
  • Jan 1, 2024
  • Divya Goel + 2 more

  • Research Article
  • Cite Count Icon 1
  • 10.1109/embc40787.2023.10340202
A Reliable Approach for Fabricating Tissue-Mimicking Phantoms with Designated Dielectric Properties from 16 MHz to 3 GHz.
  • Jul 24, 2023
  • Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference
  • Guofang Xu + 6 more

Tissue-mimicking dielectric phantoms are widely used to mimic the relative permittivity and conductivity of human tissues in various medical applications. The combination of artificial materials determines the characteristics of dielectric phantoms. However, a method that reliably determines the composition of artificial materials for designated values of dielectric properties and frequency is still lacking. In this work, we propose a method that easily determines the composition of phantoms to mimic human tissues from 16 MHz to 3 GHz.

  • Research Article
  • Cite Count Icon 4
  • 10.1038/s41598-024-67403-0
Prediction of protein secondary structure by the improved TCN-BiLSTM-MHA model with knowledge distillation
  • Jul 17, 2024
  • Scientific Reports
  • Lufei Zhao + 4 more

Secondary structure prediction is a key step in understanding protein function and biological properties and is highly important in the fields of new drug development, disease treatment, bioengineering, etc. Accurately predicting the secondary structure of proteins helps to reveal how proteins are folded and how they function in cells. The application of deep learning models in protein structure prediction is particularly important because of their ability to process complex sequence information and extract meaningful patterns and features, thus significantly improving the accuracy and efficiency of prediction. In this study, a combined model integrating an improved temporal convolutional network (TCN), bidirectional long short-term memory (BiLSTM), and a multi-head attention (MHA) mechanism is proposed to enhance the accuracy of protein prediction in both eight-state and three-state structures. One-hot encoding features and word vector representations of physicochemical properties are incorporated. A significant emphasis is placed on knowledge distillation techniques utilizing the ProtT5 pretrained model, leading to performance improvements. The improved TCN, achieved through multiscale fusion and bidirectional operations, allows for better extraction of amino acid sequence features than traditional TCN models. The model demonstrated excellent prediction performance on multiple datasets. For the TS115, CB513 and PDB (2018–2020) datasets, the prediction accuracy of the eight-state structure of the six datasets in this paper reached 88.2%, 84.9%, and 95.3%, respectively, and the prediction accuracy of the three-state structure reached 91.3%, 90.3%, and 96.8%, respectively. This study not only improves the accuracy of protein secondary structure prediction but also provides a valuable tool for understanding protein structure and function, one that is particularly applicable to resource-constrained contexts.
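
The abstract above mentions knowledge distillation from a pretrained model but does not state the loss that was used. As a hedged illustration only, the sketch below shows one widely used formulation: the KL divergence between temperature-softened teacher and student class distributions for a single residue, scaled by the squared temperature. The function names, the temperature value, and the three-class logits are assumptions, not details from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    top = max(scaled)
    exps = [math.exp(z - top) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by temperature squared."""
    p = softmax(teacher_logits, temperature)   # soft targets from the teacher
    q = softmax(student_logits, temperature)   # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

# Hypothetical three-state (helix, strand, coil) logits for one residue.
teacher = [2.5, 0.3, -1.0]
student = [1.0, 0.8, -0.2]
loss = distillation_loss(student, teacher)
```

In practice this term is usually averaged over all residues and combined with the ordinary cross-entropy against the true labels; the weighting between the two is a tunable choice.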

  • Book Chapter
  • Cite Count Icon 5
  • 10.1016/b978-0-443-22299-3.00014-1
Chapter 14 - Generative adversarial networks in protein and ligand structure generation: a case study
  • Jan 1, 2024
  • Deep Learning Applications in Translational Bioinformatics
  • Syed Aslah Ahmad Faizi + 3 more

Similar Papers
  • Research Article
  • Cite Count Icon 8
  • 10.1007/s00500-022-06783-9
OneHotEncoding and LSTM-based deep learning models for protein secondary structure prediction
  • Feb 12, 2022
  • Soft Computing
  • Vamsidhar Enireddy + 2 more

Protein secondary structure (PSS) prediction has emerged as a hot topic in bioinformatics. PSS helps to predict the tertiary structure and to understand protein structures, which in turn helps in designing various drugs. The existing PSS prediction techniques are capable of achieving a Q3 accuracy of nearly 80% and have shown no improvement so far. In this paper, we propose a novel technique that uses amino acid sequences alone as the input feature, and the respective feature vector matrix is passed through a deep learning model (DLM) for PSS prediction. We use OneHotEncoding and LSTM (Long Short-Term Memory) techniques to predict PSS, which helps to achieve higher accuracy. The OneHotEncoder is used to extract the local contexts of amino acid sequences, and the LSTM captures the long-distance interdependencies among amino acids. The overall implementation is carried out in MATLAB 2020a. The performance of this model is evaluated in terms of precision, recall, F1-score, and the accuracy of both Q3 and Q8 secondary structure predictions. The proposed scheme achieved Q3 accuracies of 86.54, 85.2 and 85.7% and Q8 accuracies of 77.8, 72.5 and 74.9% on the CullPDB, CASP10, and CASP11 benchmark datasets, respectively. Advantages of the proposed scheme include reduced computation time and better accuracy than other baseline models for PSS prediction.
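
The one-hot input and the Q3/Q8 accuracy measures named in the abstract above can be illustrated with a short sketch. The paper's own implementation is in MATLAB and is not shown in the source, so everything here is an assumption for illustration: the alphabet ordering, the all-zero row for unknown residues, and the commonly used reduction of the eight DSSP states to three (H, G, I to helix; E, B to strand; everything else to coil). Other reductions exist.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues; the ordering is arbitrary

def one_hot(sequence):
    """Encode each residue as a 20-dimensional indicator vector."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    rows = []
    for aa in sequence:
        row = [0] * len(AMINO_ACIDS)
        if aa in index:          # unknown residues (e.g. X) stay all-zero
            row[index[aa]] = 1
        rows.append(row)
    return rows

# One common 8-state to 3-state reduction (assumption; alternatives are in use).
Q8_TO_Q3 = {"H": "H", "G": "H", "I": "H", "E": "E", "B": "E"}

def q8_to_q3(labels):
    """Map 8-state labels to helix (H), strand (E) or coil (C)."""
    return "".join(Q8_TO_Q3.get(state, "C") for state in labels)

def accuracy(predicted, observed):
    """Fraction of residues whose predicted state matches the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("label strings must have equal length")
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)

encoded = one_hot("ACDX")
q8_acc = accuracy("HGEET-", "HHEES-")                      # compared on 8-state labels
q3_acc = accuracy(q8_to_q3("HGEET-"), q8_to_q3("HHEES-"))  # compared after reduction
```

Because several eight-state labels collapse into the same three-state label, Q3 accuracy for a given prediction is never lower than its Q8 accuracy under this mapping.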

  • Book Chapter
  • Cite Count Icon 4
  • 10.1007/978-3-642-04759-6_5
Data Mining for Protein Secondary Structure Prediction
  • Jan 1, 2009
  • Haitao Cheng + 3 more

Accurate protein secondary structure prediction from the amino acid sequence is essential for almost all theoretical and experimental studies on protein structure and function. After a brief discussion of application of data mining for optimization of crystallization conditions for target proteins we show that data mining of structural fragments of proteins from known structures in the protein data bank (PDB) significantly improves the accuracy of secondary structure predictions. The original method was proposed by us a few years ago and was termed fragment database mining (FDM) (Cheng H, Sen TZ, Kloczkowski A, Margaritis D, Jernigan RL (2005) Prediction of protein secondary structure by mining structural fragment database. Polymer 46:4314–4321). This method gives excellent accuracy for predictions if similar sequence fragments are available in our library of structural fragments, but is less successful if such fragments are absent in the fragments database. Recently we have improved secondary structure predictions further by combining FDM with classical GOR V (Kloczkowski A, Ting KL, Jernigan RL, Garnier J (2002a) Combining the GOR V algorithm with evolutionary information for protein secondary structure prediction from amino acid sequence. Proteins 49:154–66; Sen TZ, Jernigan RL, Garnier J, Kloczkowski A (2005) GOR V server for protein secondary structure prediction. Bioinformatics 21:2787–8) predictions to form a combined method, so-called consensus database mining (CDM) (Sen TZ, Cheng H, Kloczkowski A, Jernigan RL (2006) A Consensus Data Mining secondary structure prediction by combining GOR V and Fragment Database Mining. Protein Sci 15:2499–506). FDM mines the structural segments of PDB, and utilizes structural information from the matching sequence fragments for the prediction of protein secondary structures. 
By combining it with the GOR V secondary structure prediction method, which is based on information theory and Bayesian statistics, coupled with evolutionary information from multiple sequence alignments (MSA), our CDM method guarantees improved accuracies of prediction. Additionally, with the constant growth in the number of new protein structures and folds in the PDB, the accuracy of the CDM method is clearly expected to increase in future. We have developed a publicly available CDM server (Cheng H, Sen TZ, Jernigan RL, Kloczkowski A (2007) Consensus Data Mining (CDM) Protein Secondary Structure Prediction Server: combining GOR V and Fragment Database Mining (FDM). Bioinformatics 23:2628–30) (http://gor.bb.iastate.edu/cdm) for protein secondary structure prediction.

  • Research Article
  • Cite Count Icon 65
  • 10.1016/j.knosys.2016.11.015
Protein secondary structure prediction by using deep learning method
  • Nov 17, 2016
  • Knowledge-Based Systems
  • Yangxu Wang + 2 more

  • Research Article
  • Cite Count Icon 117
  • 10.1186/1471-2105-8-201
Accurate prediction of protein secondary structure and solvent accessibility by consensus combiners of sequence and structure information.
  • Jun 14, 2007
  • BMC Bioinformatics
  • Gianluca Pollastri + 3 more

Background: Structural properties of proteins such as secondary structure and solvent accessibility contribute to three-dimensional structure prediction, not only in the ab initio case but also when homology information to known structures is available. Structural properties are also routinely used in protein analysis even when homology is available, largely because homology modelling is lower throughput than, say, secondary structure prediction. Nonetheless, predictors of secondary structure and solvent accessibility are virtually always ab initio. Results: Here we develop high-throughput machine learning systems for the prediction of protein secondary structure and solvent accessibility that exploit homology to proteins of known structure, where available, in the form of simple structural frequency profiles extracted from sets of PDB templates. We compare these systems to their state-of-the-art ab initio counterparts, and with a number of baselines in which secondary structures and solvent accessibilities are extracted directly from the templates. We show that structural information from templates greatly improves secondary structure and solvent accessibility prediction quality, and that, on average, the systems significantly enrich the information contained in the templates. For sequence similarity exceeding 30%, secondary structure prediction quality is approximately 90%, close to its theoretical maximum, and 2-class solvent accessibility roughly 85%. Gains are robust with respect to template selection noise, and significant for marginal sequence similarity and for short alignments, supporting the claim that these improved predictions may prove beneficial beyond the case in which clear homology is available. Conclusion: The predictive systems are publicly available.

  • Abstract
  • 10.1016/j.bpj.2017.11.2393
Combining Prediction of Protein Aggregation Propensities with Prediction of Other One-Dimensional Properties
  • Feb 1, 2018
  • Biophysical Journal
  • Andrzej Kloczkowski + 4 more

  • Conference Article
  • Cite Count Icon 8
  • 10.23919/indiacom54597.2022.9763114
Improving Prediction of Protein Secondary Structures using Attention-enhanced Deep Neural Networks
  • Mar 23, 2022
  • Mukhtar Ahmad Sofi + 1 more

Protein secondary structure prediction is one of the hot research topics in computational biology. Accurate prediction of protein secondary structures provides insights into drug discovery and enzyme design. In addition, it plays an instrumental role in identifying structural classes, protein folds, and three-dimensional structure. However, the experimental determination of protein secondary structures is laborious and costly. The field therefore relies heavily on computational techniques for the prediction of secondary structures. In recent years, deep neural networks have been used extensively for protein secondary structure prediction. However, deep learning models that focus on extracting local dependencies of a protein sequence have difficulty extracting non-local dependencies effectively. Although LSTM recurrent neural networks addressed the problem of handling long-range dependencies, these models suffer from vanishing gradients, exploding gradients and shallow layers. Moreover, these models fail to capture dependencies that are very long. In this paper, we propose an attention-augmented deep CNN-LSTM method to circumvent the issues faced by LSTM RNNs. Our proposed model is able to efficiently capture both local and long-range dependencies to enhance the prediction of secondary structures. Experiments were conducted on the CB6133, CB513, CASP10 and CASP11 benchmark datasets. The experimental results indicate that the performance of our method is better than that of the baseline methods.

  • Research Article
  • Cite Count Icon 21
  • 10.1093/bioinformatics/9.2.147
Prediction of protein secondary structures by a neural network.
  • Jan 1, 1993
  • Bioinformatics
  • Fumiyoshi Sasagawa + 1 more

We have studied the prediction of globular protein secondary structures by neural networks. Protein secondary structures are allocated to amino acid residues using Kabsch and Sander's dictionary of protein secondary structures and the neural network is taught the protein secondary structures. The input layer of the neural network allows sequences of residues including 20 amino acids, chain break, B, X and Z. We consider classifying secondary structures into groups of 3, 4 and 8. In each case, we calculate the percentage of correct predictions. We discuss the effect of overlearning on the protein secondary structure prediction. In addition, we include the application of a neural network with a modular architecture to prediction of protein secondary structures. We compare the results from neural networks with a modular architecture and with a simple three-layer structure.
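
The abstract above describes an input layer that accepts 20 amino acids plus chain break, B, X and Z, i.e. 24 symbols per position. A minimal sketch of how such a windowed input vector might be built follows. The window width, the use of `/` for a chain break, the all-zero padding beyond the chain ends, and the name `encode_window` are all assumptions for illustration; the source does not give these details.

```python
# 20 standard residues plus the ambiguity codes B, X, Z and a chain-break symbol.
# "/" as the chain-break symbol is a hypothetical choice for this sketch.
SYMBOLS = list("ACDEFGHIKLMNPQRSTVWY") + ["B", "X", "Z", "/"]

def encode_window(sequence, center, half_width=2):
    """Concatenate 24-dimensional indicator vectors for a window around `center`.

    Positions that fall outside the sequence are left as all-zero vectors.
    """
    vector = []
    for pos in range(center - half_width, center + half_width + 1):
        unit = [0] * len(SYMBOLS)
        if 0 <= pos < len(sequence):
            unit[SYMBOLS.index(sequence[pos])] = 1
        vector.extend(unit)
    return vector

# A window of 5 residues gives 5 * 24 = 120 input units for the central residue.
inputs = encode_window("MKVBA/GX", center=3)
```

The network would then be trained to output the secondary structure class of the central residue from this vector, sliding the window along the chain one residue at a time.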

  • Research Article
  • Cite Count Icon 7
  • 10.1371/journal.pone.0254555
The influence of dataset homology and a rigorous evaluation strategy on protein secondary structure prediction.
  • Jul 14, 2021
  • PloS one
  • Teng-Ruei Chen + 3 more

The secondary structure prediction (SSP) of proteins has long been an essential structural biology technique with various applications. Despite its vital role in many research and industrial fields, in recent years, as the accuracy of state-of-the-art secondary structure predictors approaches the theoretical upper limit, SSP has been considered no longer challenging or too challenging to make advances. With the belief that the substantial improvement of SSP will move forward many fields depending on it, we conducted this study, which focused on three issues that have not been noticed or thoroughly examined yet but may have affected the reliability of the evaluation of previous SSP algorithms. These issues are all about the sequence homology between or within the developmental and evaluation datasets. We thus designed many different homology layouts of datasets to train and evaluate SSP prediction models. Multiple repeats were performed in each experiment by random sampling. The conclusions obtained with small experimental datasets were verified with large-scale datasets using state-of-the-art SSP algorithms. Very different from the long-established assumption, we discover that the sequence homology between query datasets for training, testing, and independent tests exerts little influence on SSP accuracy. Besides, the sequence homology redundancy between or within most datasets would make the accuracy of an SSP algorithm overestimated, while the redundancy within the reference dataset for extracting predictive features would make the accuracy underestimated. Since the overestimating effects are more significant than the underestimating effect, the accuracy of some SSP methods might have been overestimated. 
Based on the discoveries, we propose a rigorous procedure for developing SSP algorithms and making reliable evaluations, hoping to bring substantial improvements to future SSP methods and benefit all research and application fields relying on accurate prediction of protein secondary structures.

  • Book Chapter
  • Cite Count Icon 27
  • 10.1007/978-3-319-12883-2_19
Secondary and Tertiary Structure Prediction of Proteins: A Bioinformatic Approach
  • Nov 30, 2014
  • Minu Kesheri + 3 more

Correct prediction of secondary and tertiary structure of proteins is one of the major challenges in bioinformatics/computational biological research. Predicting the correct secondary structure is the key to predict a good/satisfactory tertiary structure of the protein which not only helps in prediction of protein function but also in prediction of sub-cellular localization. This chapter aims to explain the different algorithms and methodologies, which are used in secondary structure prediction. Similarly, tertiary structure prediction has also emerged as one of developing areas of bioinformatics/computational biological research owing to the large gap between the available number of protein sequences and the known experimentally solved structures. Because of time and cost intensive experimental methods, experimentally determined structures are not available for vast majority of the available protein sequences present in public domain databases. The primary aim of this chapter is to offer a detailed conceptual insight to the algorithms used for protein secondary and tertiary structure prediction. This chapter systematically illustrates flowchart for selecting the most accurate prediction algorithm among different categories for the target sequence against three categories of tertiary structure prediction methods. Out of the three methods, homology modeling which is considered as most reliable method is discussed in detail followed by strengths and limitations for each of these categories. This chapter also explains different practical and conceptual problems, obstructing the high accuracy of the protein structure in each of the steps for all the three methods of tertiary structure prediction. The popular hybrid methodologies which further club together a number of features such as structural alignments, solvent accessibility and secondary structure information are also discussed. 
Moreover, this chapter elucidates about the Meta-servers that generate consensus result from many servers to build a protein model of high accuracy. Lastly, scope for further research in order to bridge existing gaps and for developing better secondary and tertiary structure prediction algorithms is also highlighted.

  • Conference Article
  • Cite Count Icon 5
  • 10.1109/cibcb.2016.7758118
Protein secondary structure prediction through a novel framework of secondary structure transition sites and new encoding schemes
  • Oct 1, 2016
  • Masood Zamani + 1 more

In this paper, we propose an ab initio two-stage protein secondary structure (PSS) prediction model through a novel framework of PSS transition site prediction by using Artificial Neural Networks (ANNs) and Genetic Programming (GP). In the proposed classifier, protein sequences are encoded by new amino acid encoding schemes derived from genetic Codon mappings, Clustering and Information theory. In the first stage, sequence segments are mapped to regions in the Ramachandran map (2D-plot), and weight scores are computed by using statistical information derived from clusters. In addition, score vectors are constructed for the mapped regions using the weight scores and PSS transition sites. The score vectors have fewer dimensions compared to those of commonly used encoding schemes and protein profile. In the second stage, a two-tier classifier is employed based on an ANN and a GP method. The performance of the two-stage classifier is compared to the state-of-the-art cascaded Machine Learning methods which commonly employ ANNs. The prediction method is examined with the latest dataset of nonhomologous protein sequences, PISCES [1]. The experimental results and statistical analyses indicate a significantly higher distribution of Q3 scores, approximately 7% with p-value < 0.001, in comparison to that of cascaded ANN architectures. PSS transition sites are valuable information about the topological property of protein sequences and incorporating the information improves the overall performance of the PSS prediction model.

  • Research Article
  • Cite Count Icon 3
  • 10.1007/s13721-021-00304-8
Swarm optimization-based neural network model for secondary structure prediction of proteins
  • Apr 30, 2021
  • Network Modeling Analysis in Health Informatics and Bioinformatics
  • Sana Akbar + 2 more

Proteins form the basis of all major life processes. The functionality of a protein is a direct consequence of its underlying structure. Protein structure prediction thus serves to ascertain the function of similar or dissimilar proteins. Secondary structure prediction paves the way for 3D structures, which eventually determine protein properties. It also aims to suggest probable structures for proteins whose structures remain undiscovered. Although experimental approaches have been quite effective in determining protein secondary structure from the amino acid sequence, they are often cumbersome and time-intensive to carry out in vitro. Hence, computational approaches are required to predict secondary structures for the diverse amino acids constituting these proteins. However, the available computational models fail to achieve good prediction accuracy because they model the sequence-structure relationship inadequately. The dearth of global exploration-based methods further makes them ineffective in catering to the evolving proteomic data. Accordingly, particle swarm optimization (PSO) has been explored to propose a neural network model for protein secondary structure prediction (PSSP). Six standard datasets, namely PSS504, RS126, EVA6, CB396, Manesh, and CB513, have been used for training and testing the neural network. The proposed model is evaluated on the basis of its Q3 accuracy, precision, and recall. Cross-validation with 10, 20, 30, and 40 folds, in combination with sensitivity analysis, has been carried out to verify the results. The proposed model is found to outperform most of the existing models, with an average Q3 accuracy above 81% for PSSP.

  • Conference Article
  • 10.1109/icmlc.2005.1527519
Application of PBIL algorithm to prediction of protein secondary structure
  • Jan 1, 2005
  • Bing-Yao Jin + 3 more

Prediction of protein secondary structure has remained an unresolved problem in bioinformatics for over thirty years. Numerous methods have been developed to address this problem, but the results of most of them are not satisfactory. The Chou-Fasman method is simple, straightforward, and instructive to biologists and chemists, although its prediction accuracy is not as good as that of some newly developed learning algorithms such as neural networks and SVMs. This article presents the first attempt to predict protein secondary structure by means of the PBIL (population-based incremental learning) algorithm. The idea is to predict the secondary structure with statistically optimal functions based on rules derived from the sequence-structure data. These rules, as part of the optimal or tabu functions, are quite important to the success of this algorithm. The concept of the probability of a secondary structure corresponding to the amino acids in a sequence has been successfully applied to calculating the optimal function, providing a new approach to the prediction of protein secondary structure.

  • Conference Article
  • Cite Count Icon 4
  • 10.1109/cibcb.2015.7300327
Protein secondary structure prediction using an evolutionary computation method and clustering
  • Aug 1, 2015
  • Masood Zamani + 1 more

In this paper, we evaluated the performance of an evolutionary-based protein secondary structure (PSS) prediction model which uses the information of amino acid sequences extracted by a clustering technique. The dimension of the classifier's inputs is reduced using a k-means clustering method on sequence segments. The proposed PSS classifier is based on a Genetic Programming (GP) approach that uses IF rules for a multi-target classifier. The GP classifier is evaluated by using protein sequences and the sequence information obtained from the k-means clustering. The GP prediction model's performance is compared with that of feed-forward artificial neural networks (ANNs) and support vector machines (SVMs). The prediction methods are examined with two protein datasets, RS126 and CB513. The performance of the three classification models is measured according to Q3 and segment overlap (SOV) scores. The prediction models which use clustered data achieve on average 2% higher prediction accuracy than those using sequence data. In addition, the experimental results indicate that the GP model's prediction scores are on average 3% higher than those of the ANN and SVM models, whether amino acid sequences or clustered information are used.
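
Several of the abstracts above report Q3 scores. As a minimal sketch, assuming the standard definition of Q3 as the fraction of residues whose three-state label (H for helix, E for strand, C for coil) is predicted correctly, the score can be computed as shown below. The function name `q3_accuracy` and the example strings are hypothetical illustrations and are not taken from any of the listed papers.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Return the fraction of residues whose three-state label matches.

    Both arguments are per-residue label strings over H (helix),
    E (strand) and C (coil). This is an illustrative helper, not code
    from any of the papers listed here.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    # Count positions where the predicted state equals the observed state.
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical six-residue example: only the fifth residue is mispredicted.
score = q3_accuracy("HHHECC", "HHHEEC")
```

The segment overlap (SOV) score mentioned alongside Q3 evaluates agreement at the level of structural segments rather than single residues, so it needs a separate computation that this sketch does not cover.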

  • Abstract
  • 10.1016/j.bpj.2016.11.1100
Prediction of Protein Aggregation Propensities using GOR Method
  • Feb 1, 2017
  • Biophysical Journal
  • Maksim Kouza + 5 more

More from: Frontiers in bioengineering and biotechnology
  • Research Article
  • 10.3389/fbioe.2025.1728779
Editorial: Effect of mechanical loading on the tendon for tissue engineering approaches
  • Nov 6, 2025
  • Frontiers in Bioengineering and Biotechnology
  • Clemens Gögele + 4 more

  • Research Article
  • 10.3389/fbioe.2025.1703902
Recent progress of nanomaterials for diagnosis and treatment of rejection in heart transplantation
  • Nov 6, 2025
  • Frontiers in Bioengineering and Biotechnology
  • Guangyin Li + 8 more

  • Research Article
  • 10.3389/fbioe.2025.1655295
A motion capture protocol for the kinematic analysis of transfemoral and transtibial sprinters
  • Nov 4, 2025
  • Frontiers in Bioengineering and Biotechnology
  • Roberto Di Marco + 7 more

  • Research Article
  • 10.3389/fbioe.2025.1664917
A finite element analysis on the execution effects of two novel distalization sequences in clear aligners
  • Nov 3, 2025
  • Frontiers in Bioengineering and Biotechnology
  • Hongcheng Xing + 2 more

  • Research Article
  • 10.3389/fbioe.2025.1693678
Parametrized statistical appearance and shape modelling strategy to predict proximal and diaphyseal femoral fractures
  • Nov 3, 2025
  • Frontiers in Bioengineering and Biotechnology
  • Özgür Cebeci + 2 more

  • Research Article
  • 10.3389/fbioe.2025.1657653
Biofabrication of 3D-printed, pre-cross-linked alginate dialdehyde–gelatin (ADA–GEL) scaffolds for an in vivo metastatic arteriovenous loop tumor model
  • Nov 3, 2025
  • Frontiers in Bioengineering and Biotechnology
  • Evelin Sandor + 13 more

  • Research Article
  • 10.3389/fbioe.2025.1641709
Umbilical cord mesenchymal stem cell-derived exosomes promote wound healing and skin regeneration via the regulation of inflammation and angiogenesis
  • Nov 3, 2025
  • Frontiers in Bioengineering and Biotechnology
  • Yulin Yang + 10 more

  • Research Article
  • 10.3389/fbioe.2025.1702899
Pickering emulsion loaded with total flavonoids from Dracocephalum moldavica L. potentially promotes angiogenesis in the ischemic penumbra after cerebral ischemia reperfusion
  • Oct 31, 2025
  • Frontiers in Bioengineering and Biotechnology
  • Tianyi Gao + 8 more

  • Research Article
  • 10.3389/fbioe.2025.1646500
Bone regeneration during osteoporosis: a translational in vivo monitoring of callus mechanical parameters
  • Oct 29, 2025
  • Frontiers in Bioengineering and Biotechnology
  • Juan J Toscano-Angulo + 8 more

  • Research Article
  • 10.3389/fbioe.2025.1656421
Parametric bionic hand-inspired optimization of femoral condylar prosthesis attachment surfaces
  • Oct 29, 2025
  • Frontiers in Bioengineering and Biotechnology
  • Lin Wang + 3 more
