Machine Learning Framework: Predicting Protein Structural Features

Pramod Kumar, Subarna Roy, Vandana Mishra

doi:10.1007/978-981-10-7455-4_8

Abstract

Structural biology is a challenging scientific discipline that aims to uncover the topologies and shapes of biomolecules and macromolecules, namely DNA, RNA, and proteins. Proteins are large macromolecules consisting of one or more chains of amino acids, with the residues of each chain joined linearly by peptide bonds. Proteins are essential to organisms and take part in virtually all biological processes of cells. They catalyze biochemical reactions (enzymes), carry out key roles in cellular processes, and act as structural constituents, signaling molecules, and molecular machines of every biological system. They are responsible for immune responses, can store molecules (e.g., casein and ovalbumin store amino acids), and are even responsible for cell mechanics (e.g., actin and myosin). Protein structure prediction is a difficult task and one of the fundamental problems in computational biology, structural science, and structural biology. The problem can be divided into four levels: (1) one-dimensional (1D) prediction of structural features along the linear chain of amino acids; (2) two-dimensional (2D) prediction of spatial arrangements between amino acids; (3) three-dimensional (3D) prediction of the tertiary structure of a protein; and (4) four-dimensional (4D) prediction of the quaternary structure of multiprotein complexes. Researchers have recently applied a variety of data mining methods, scripting-based tools, and machine learning tools to protein structure prediction. In this chapter, we provide a comprehensive overview of protein structure and use different data mining and machine learning algorithms for protein structure prediction.

Full Text