Abstract

Proteins are vital in all biological systems as they constitute the main structural and functional components of cells. Recent advances in mass spectrometry have brought the promise of complete proteomics by helping draft the human proteome. Yet, this commonly used protein sequencing technique has fundamental limitations in sensitivity. Here we propose a method for single-molecule (SM) protein sequencing. A major challenge lies in the fact that proteins are composed of 20 different amino acids, which demands 20 molecular reporters. We computationally demonstrate that it suffices to measure only two types of amino acids to identify proteins and suggest an experimental scheme using SM fluorescence. When achieved, this highly sensitive approach will result in a paradigm shift in proteomics, with major impact in the biological and medical sciences.

Highlights

  • Unique to protein sequencing is that a protein can be identified using incomplete information with reference to proteomic databases

  • As the median length of a protein ranges from 270 to 350 amino acids, it is not difficult to choose two amino acid types that appear more than 15 times in each protein

  • We used a canonical human proteome database based on UniProt release 2014.04 [11]


Summary

Introduction

Unique to protein sequencing is that a protein can be identified using incomplete information with reference to proteomic databases. Consider a 2 bit fingerprinting scheme in which only two types of amino acids are labeled (figure 1). A consecutive read of 15 labeled amino acids is sufficient to identify up to 2^15 = 32 768 unique protein sequences.
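
The counting argument above can be illustrated with a minimal Python sketch. It reduces a sequence to the ordered string of its labeled residues and counts how many distinct reads of a given length two labels allow. The function names, the toy sequence, and the choice of cysteine (C) and lysine (K) as the two labeled types are assumptions made for illustration only, since this section does not name the two amino acids.

```python
def fingerprint(sequence: str, labeled: str = "CK") -> str:
    """Keep only the labeled residues, in order; all others are invisible.

    `labeled` defaults to C and K purely as an illustrative assumption.
    """
    return "".join(residue for residue in sequence if residue in labeled)


def distinct_reads(read_length: int, n_labels: int = 2) -> int:
    """Number of distinct label strings of the given length."""
    return n_labels ** read_length


toy_sequence = "MKCAKLLCKE"  # made-up sequence, not a real protein
print(fingerprint(toy_sequence))  # KCKCK
print(distinct_reads(15))         # 32768
```

Two labels give one of two outcomes per read position, so a read of 15 labeled residues can take at most 2^15 distinct values. Whether real proteins spread evenly over those values depends on the proteome and is not addressed by this sketch.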
