Abstract

This paper introduces a new method for high-throughput protein structure determination. The method is based on spotting proteins as microarrays at a density of ca. 2000-4000 samples per cm² and recording their Fourier transform infrared (FTIR) spectra by FTIR imaging. The paper also introduces a new protein library, cSP92, which contains 92 well-characterized proteins and was designed to cover the structural space as completely as possible, in terms of both secondary structures and higher-level structures. Ascending stepwise linear regression (ASLR), partial least squares (PLS) regression, and support vector machines (SVM) were used to correlate spectral characteristics with secondary structure features. ASLR generally provides better results than PLS and SVM. Secondary structure prediction is as good for protein microarray spectra as for the reference attenuated total reflection spectra recorded on the same samples, which validates the high-throughput microarray approach. Repeated double cross-validation shows that the approach is suitable for high-accuracy determination of protein secondary structure: with ASLR, the root mean square standard error in the cross-validation is 4.9 ± 1.1% for α-helix, 4.6 ± 0.8% for β-sheet, and 6.3 ± 2.2% for the "other" structures.
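
To make the regression step concrete, the following is a minimal sketch of ascending (forward) stepwise linear regression scored by a cross-validated root mean square error. It is not the authors' implementation. The function names, the greedy residual-sum-of-squares selection criterion, the number of selected features, and the synthetic data standing in for spectra and secondary structure content are all assumptions made for illustration. It also uses a single k-fold loop rather than the repeated double cross-validation reported above.

```python
import numpy as np


def forward_stepwise(X, y, max_features=3):
    """Greedy forward selection: at each step, add the column of X that
    most reduces the residual sum of squares of a least-squares fit."""
    n, p = X.shape
    selected = []
    for _ in range(min(max_features, p)):
        best_j, best_rss = None, np.inf
        for j in range(p):
            if j in selected:
                continue
            # Design matrix: intercept plus already selected columns plus candidate j.
            A = np.column_stack([np.ones(n), X[:, selected + [j]]])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            rss = float(np.sum((y - A @ coef) ** 2))
            if rss < best_rss:
                best_j, best_rss = j, rss
        selected.append(best_j)
    # Refit once on the final set of selected columns.
    A = np.column_stack([np.ones(n), X[:, selected]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return selected, coef


def predict(X, selected, coef):
    """Apply a fitted model: intercept plus weighted selected columns."""
    return coef[0] + X[:, selected] @ coef[1:]


def cv_rmse(X, y, k=5, max_features=3, seed=0):
    """Plain k-fold cross-validated RMSE; feature selection is redone in
    every fold so held-out samples never influence the chosen columns."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    sq_err = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        sel, coef = forward_stepwise(X[train], y[train], max_features)
        sq_err.append((y[fold] - predict(X[fold], sel, coef)) ** 2)
    return float(np.sqrt(np.mean(np.concatenate(sq_err))))


# Synthetic stand-in data (hypothetical): 92 "proteins", 20 "spectral
# channels", and a target that depends on channels 3 and 11 only.
rng = np.random.default_rng(42)
X = rng.normal(size=(92, 20))
y = 30.0 + 5.0 * X[:, 3] - 4.0 * X[:, 11] + rng.normal(scale=1.0, size=92)

selected, coef = forward_stepwise(X, y, max_features=2)
rmse = cv_rmse(X, y, k=5, max_features=2)
print("selected channels:", sorted(selected))
print("cross-validated RMSE:", round(rmse, 2))
```

Redoing the selection inside each fold is the design choice that matters here: choosing channels on the full data set and only then cross-validating would make the reported error optimistic.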
