Abstract

Proteomics is a valuable tool for establishing and comparing the protein content of defined tissues, cell types, or subcellular structures. Its use in non-model species is currently limited because peptide identification depends critically on sequence databases. In this study, we explored whether a preliminary cDNA database for the non-model species Pisum sativum, created from a small number of massively parallel pyrosequencing (MPSS) runs, is suitable for proteomics, and compared it to comprehensive cDNA databases from Medicago truncatula and Arabidopsis thaliana created by Sanger sequencing. Each database was used to identify proteins from a pea leaf chloroplast envelope preparation. The pea database identified more proteins with higher accuracy, even though its sequence quality was low and its contigs were short compared to the databases from model species. Although the number of proteins identified with non-species-specific databases could potentially be increased by lowering the threshold for successful protein identification, this strategy markedly increases the number of wrongly identified proteins. The identification rate with non-species-specific databases correlated with spectral abundance but not with the predicted membrane helix content; strong conservation is necessary but not sufficient for protein identification with a non-species-specific database. We conclude that massively parallel sequencing of cDNAs substantially increases the power of proteomics in non-model species.
