Global detection of human variants and isoforms by deep proteome sequencing

Jesse G. Meyer, Benjamin J. Blencowe, Alexander S. Hebert, Michael S. Westphall, Joshua J. Coon, Dain R. Brademan, Jürgen Cox, Harald Marx, Pavel Sinitcyn, Alicia L. Richards, Robert J. Weatheritt, Evgenia Shishkova

doi:10.1038/s41587-023-01714-x

Abstract

An average shotgun proteomics experiment detects approximately 10,000 human proteins from a single sample. However, individual proteins are typically identified by peptide sequences representing a small fraction of their total amino acids. Hence, an average shotgun experiment fails to distinguish different protein variants and isoforms. Deeper proteome sequencing is therefore required for the global discovery of protein isoforms. Using six different human cell lines, six proteases, deep fractionation and three tandem mass spectrometry fragmentation methods, we identify a million unique peptides from 17,717 protein groups, with a median sequence coverage of approximately 80%. Direct comparison with RNA expression data provides evidence for the translation of most nonsynonymous variants. We also hypothesize that undetected variants likely arise from mutation-induced protein instability. We further observe comparable detection rates for exon–exon junction peptides representing constitutive and alternative splicing events. Our dataset represents a resource for proteoform discovery and provides direct evidence that most frame-preserving alternatively spliced isoforms are translated.

Full Text