Abstract
Background: Amyloid fibrillar aggregates of proteins or polypeptides are known to be associated with many human diseases. Recent studies suggest that short protein regions trigger this aggregation. Identifying these short peptides is therefore critical for understanding the diseases and for finding potential therapeutic targets.

Results: We propose a method named Pafig (Prediction of amyloid fibril-forming segments), based on support vector machines, to identify the hexapeptides associated with amyloid fibrillar aggregates. The features of Pafig were obtained by a two-round selection from AAindex. In a 10-fold cross-validation test on the Hexpepset dataset, Pafig achieved an overall accuracy of 81% and a Matthews correlation coefficient of 0.63. Pafig was then used to predict the potential fibril-forming hexapeptides among all 64,000,000 possible hexapeptides. Approximately 5.08% of these hexapeptides showed a high aggregation propensity. Among the predicted fibril-forming hexapeptides, alanine, phenylalanine, isoleucine, leucine and valine occurred at higher frequencies, whereas aspartic acid, glutamic acid, histidine, lysine, arginine and proline occurred at lower frequencies.

Conclusion: The performance of Pafig indicates that it is a powerful tool for identifying the hexapeptides associated with fibrillar aggregates and that it will be useful for large-scale analysis of proteomic data.
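The two performance figures quoted in the abstract, overall accuracy and Matthews correlation coefficient (MCC), are both computed from a binary confusion matrix, and the 64,000,000 figure is the number of length-6 sequences over the 20 standard amino acids. The Python sketch below only illustrates those definitions. The function names are hypothetical, and any counts passed to them are illustrative inputs, not results from the paper.

```python
import math

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal total is zero, a common convention
    (an assumption here, not something stated in the source).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Number of distinct hexapeptides: 20 choices at each of 6 positions.
n_hexapeptides = len(AMINO_ACIDS) ** 6
```

MCC ranges from -1 to 1 and, unlike accuracy, stays informative when the positive and negative classes differ in size, which is one reason it is commonly reported alongside accuracy.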
Highlights
Amyloid fibrillar aggregates of proteins or polypeptides are known to be associated with many human diseases.
We propose Pafig (Prediction of amyloid fibril-forming segments), a method based on a support vector machine (SVM) [23,24], to identify fibril-forming segments in proteins.
(1) Every hexapeptide in the Hexpepset dataset was encoded by each physicochemical property.
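As a rough illustration of the encoding step, the Python sketch below maps each residue of a hexapeptide to its value on one physicochemical scale, giving a six-dimensional numeric vector that a classifier such as an SVM could take as input. The Kyte-Doolittle hydropathy scale is used purely as a stand-in, since the source does not say which AAindex properties were selected. The function names and the sliding-window helper are likewise assumptions for illustration, not the authors' implementation.

```python
# Kyte-Doolittle hydropathy values, used here only as an example scale;
# the properties Pafig actually selected from AAindex are not given here.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def encode_hexapeptide(peptide, scale=KYTE_DOOLITTLE):
    """Return one scale value per residue for a six-residue peptide."""
    if len(peptide) != 6:
        raise ValueError("expected a peptide of exactly six residues")
    return [scale[residue] for residue in peptide.upper()]


def sliding_hexapeptides(sequence):
    """List every overlapping six-residue window of a protein sequence."""
    return [sequence[i:i + 6] for i in range(len(sequence) - 5)]


# Encode each window of a short, made-up sequence.
vectors = [encode_hexapeptide(w) for w in sliding_hexapeptides("ACDEFGHI")]
```

Repeating the encoding with a different scale dictionary yields a different feature vector for the same peptide, which is what makes it possible to compare properties against one another during feature selection.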
Summary
Amyloid fibrillar aggregates of proteins or polypeptides are known to be associated with many human diseases. As reviewed recently [13], there are two types of computational approaches used to investigate the aggregation propensity of peptides or proteins and to identify the segments most prone to form fibrils ("hot spots"). The second approach combines atomistic simulations of a protein segment with the microcrystal structure of short fibril-forming peptides to gain insight into aggregation propensity [1,20,21,22]. This approach may help to elucidate the structural details of ordered aggregates. In addition to the approaches described above, a sequence pattern obtained by saturation mutagenesis analysis [11] has been proposed to identify amyloidogenic stretches in proteins.