Abstract
Background
The structure conservation in various α-helix subclasses reveals the sequence- and context-dependent factors that cause distortions in the α-helix. The sequence-structure relationship in these subclasses can be used to predict structural variations in an α-helix purely from its sequence. We train a support vector machine (SVM) with a dot product kernel function to discriminate between regular and non-regular α-helices purely from their sequences, which are represented by various overall and position-specific propensities of amino acids.
Results
We characterize the structural distortions in five α-helix subclasses. The sequence-structure correlation in the subclasses reveals that increased propensities of proline, histidine, serine, aspartic acid and aromatic amino acids are responsible for the distortions in the regular α-helix. The N-terminus of the regular α-helix prefers neutral and acidic polar amino acids, while the C-terminus prefers basic polar amino acids. Proline is preferred in the first turn of the regular α-helix, while it also tends to produce the kinked and curved subclasses. The SVM discriminates between the regular α-helix and the rest with a precision of 80.97% and a recall of 88.05%.
Conclusions
The correlation between structural variation in helices and their sequences is manifested by the performance of the SVM trained on sequence features. The results presented here are useful for the computational design of helices and for predicting structural perturbations in a helix purely from its sequence.
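The idea of representing a helix sequence by position-specific propensities and comparing sequences with a dot product kernel can be sketched as follows. This is a minimal illustration, not the authors' implementation: the propensity definition (a smoothed frequency ratio against a background set), the pseudocount, the function names and the toy octapeptides are all assumptions made for the example.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def position_propensities(subclass_seqs, background_seqs, pseudocount=1.0):
    """Assumed definition: table[pos][aa] is the smoothed frequency of aa at
    pos in the subclass divided by its smoothed frequency in the background."""
    length = len(subclass_seqs[0])
    k = pseudocount * len(AMINO_ACIDS)
    table = []
    for pos in range(length):
        sub = Counter(seq[pos] for seq in subclass_seqs)
        bg = Counter(seq[pos] for seq in background_seqs)
        row = {}
        for aa in AMINO_ACIDS:
            p_sub = (sub[aa] + pseudocount) / (len(subclass_seqs) + k)
            p_bg = (bg[aa] + pseudocount) / (len(background_seqs) + k)
            row[aa] = p_sub / p_bg
        table.append(row)
    return table

def encode(seq, table):
    """Map a sequence to a vector of its per-position propensities."""
    return [table[pos][aa] for pos, aa in enumerate(seq)]

def dot_kernel(x, y):
    """Dot product (linear) kernel between two feature vectors."""
    return sum(a * b for a, b in zip(x, y))

# Hypothetical octapeptides, invented purely for illustration.
regular = ["AEELLKKA", "EELLKKAL"]
background = regular + ["APELLKKA", "AEELPKKG"]

table = position_propensities(regular, background)
x = encode("AEELLKKA", table)
y = encode("APELLKKA", table)
similarity = dot_kernel(x, y)
```

A kernel value such as `similarity` is what a linear SVM would operate on; any real feature set would combine several overall and position-specific propensities rather than the single table used here.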
Highlights
The structure conservation in various α-helix subclasses reveals the sequence- and context-dependent factors causing distortions in the α-helix
The input dataset for Gaussian mixture modeling contains approximately 0.4 million octapeptide helices drawn from the ASTRAL 95 dataset [11] based on the criteria defined in [12]
The prediction of structural variations in the helices based on their sequences using Support Vector Machines (SVMs) with an accuracy of 84.51% is the novel feature of the work
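For readers comparing the reported precision, recall and accuracy figures, the standard definitions of these metrics can be written out as a short sketch. The confusion-matrix counts below are invented for illustration and are not taken from the paper; the function name is likewise hypothetical.

```python
def precision_recall_accuracy(tp, fp, fn, tn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)                  # predicted positives that are correct
    recall = tp / (tp + fn)                     # actual positives that are found
    accuracy = (tp + tn) / (tp + fp + fn + tn)  # all predictions that are correct
    return precision, recall, accuracy

# Made-up counts, with "regular helix" treated as the positive class.
precision, recall, accuracy = precision_recall_accuracy(tp=8, fp=2, fn=2, tn=8)
```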
Summary
The structure conservation in various α-helix subclasses reveals the sequence- and context-dependent factors that cause distortions in the α-helix. The sequence-structure relationship in these subclasses can be used to predict structural variations in an α-helix purely from its sequence. We train a support vector machine (SVM) with a dot product kernel function to discriminate between regular and non-regular α-helices purely from their sequences, which are represented by various overall and position-specific propensities of amino acids. The α-helix, first described by Pauling in 1951 [1], is the most important structural element in proteins. Helices in proteins can be classified as left-handed or right-handed. Right-handed α-helices are found more frequently in proteins than their left-handed counterparts [1]. The right-handed α-helix is a regular structure with backbone torsion angles of φ = -63° and ψ = -43° [1,2,3]. The perturbations in the helix geometry give rise to different