Abstract

Rigorous validation of the amino acid sequence is fundamental in the characterization of original and biosimilar protein biopharmaceuticals. Widely accepted workflows are based on bottom-up mass spectrometry, and they often require multiple techniques and significant manual work. Here, we demonstrate that optimization of a set of tandem mass spectrometry (MS/MS) collision energies and automated combination of all available information in the measurements can increase the sequence validated by one technique close to the inherent limits. We created software (called "Serac") that consumes results of the Mascot database search engine and identifies the amino acids validated by bottom-up MS/MS experiments using the most rigorous, industrially acceptable definition of sequence coverage (we term this "confirmed sequence coverage"). The software can combine spectra at the level of amino acids or fragment ions to exploit complementarity, provides full transparency to justify validation, and reduces manual effort. With its help, we investigated the collision energy dependence of confirmed sequence coverage of individual peptides and full proteins on trypsin-digested monoclonal antibody samples (rituximab and trastuzumab). We found the energy dependence to be modest, but we demonstrated the benefit of using spectra taken at multiple energies. We describe a workflow based on 2-3 LC-MS/MS runs, carefully selected collision energies, and a fragment ion level combination, which yields ∼85% confirmed sequence coverage, 25%-30% above that from a basic proteomics protocol. Further increase can mainly be expected from alternative digestion enzymes or fragmentation techniques, which can be seamlessly integrated into the processing, thereby allowing effortless validation of full sequences.

Highlights

  • The past decades have witnessed an immense growth in the production and usage of biopharmaceuticals

  • Middle-down workflows seem to be effective, where recent developments have allowed sequence coverages up to 90% to be achieved,[12−16] while the record value for top-down methods remains only ∼40%−55%.[15−20] The protocols routinely used for full sequence validation (100% sequence coverage) in the pharma industry are still exclusively based on bottom-up liquid chromatography/tandem mass spectrometry (LC-MS/MS) experiments.[2,3]

  • We base our confirmed sequence coverage (CSC) analysis only on the fragment ions considered for scoring by Mascot, i.e., on only those that can be clearly distinguished from experimental noise
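The abstract distinguishes combining spectra at the amino acid level from combining them at the fragment ion level. The sketch below illustrates why the second can validate more residues. It is not the Serac implementation: the function names are hypothetical, and the rule used here (a residue counts as confirmed when the backbone cleavage sites on both of its sides are evidenced by fragment ions, with the peptide termini treated as known) is an assumption made for illustration, since this excerpt does not spell out the exact definition.

```python
from typing import Iterable, Set


def confirmed_residues(cleavages: Set[int], length: int) -> Set[int]:
    """Return the 1-based residues confirmed by a set of cleavage sites.

    Cleavage site k lies between residue k and residue k + 1.
    ASSUMPTION (illustrative rule, not taken from the paper): residue i is
    confirmed when sites i - 1 and i are both evidenced; the peptide termini
    (sites 0 and `length`) are treated as known.
    """
    sites = cleavages | {0, length}
    return {i for i in range(1, length + 1) if (i - 1) in sites and i in sites}


def combine_amino_acid_level(spectra: Iterable[Set[int]], length: int) -> Set[int]:
    """Confirm residues in each spectrum separately, then take the union."""
    confirmed: Set[int] = set()
    for cleavages in spectra:
        confirmed |= confirmed_residues(cleavages, length)
    return confirmed


def combine_fragment_level(spectra: Iterable[Set[int]], length: int) -> Set[int]:
    """Pool the cleavage sites of all spectra first, then confirm residues."""
    pooled: Set[int] = set().union(*spectra)
    return confirmed_residues(pooled, length)


def coverage(confirmed: Set[int], length: int) -> float:
    """Fraction of residues confirmed."""
    return len(confirmed) / length


# Hypothetical 6-residue peptide measured at two collision energies.
low_energy = {1, 2}    # cleavage sites evidenced in the first spectrum
high_energy = {3, 5}   # cleavage sites evidenced in the second spectrum

aa_level = combine_amino_acid_level([low_energy, high_energy], 6)
ion_level = combine_fragment_level([low_energy, high_energy], 6)
print(sorted(aa_level), coverage(aa_level, 6))
print(sorted(ion_level), coverage(ion_level, 6))
```

In this toy case residue 3 is flanked by site 2 from one spectrum and site 3 from the other, so it is confirmed only when the fragment ions are pooled before the rule is applied. Under this rule the fragment-level result always contains the amino-acid-level result.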

Summary

Introduction

The past decades have witnessed an immense growth in the production and usage of biopharmaceuticals. Middle-down workflows seem to be effective, where recent developments have allowed sequence coverages up to 90% to be achieved,[12−16] while the record value for top-down methods remains only ∼40%−55%.[15−20] The protocols routinely used for full sequence validation (100% sequence coverage) in the pharma industry are still exclusively based on bottom-up liquid chromatography/tandem mass spectrometry (LC-MS/MS) experiments.[2,3] The combined analysis of several digests prepared with different proteases usually leads to full sequence coverage at the amino acid level.[21] Recently, new digestion protocols have been worked out, such as the “extended bottom-up protocol” using secreted aspartic protease 9, and a pepsin-containing membrane for controlled mAb digestion.[22,23] The former may keep artifacts at a lower level, while the latter can vary the size of the resulting peptides.
