The Next Generation of Protein Sequencing and Analysis Methods.

Abstract

Advances in protein sequencing and analysis are poised to transform proteomics through an ability to link sequence, structure, and function at scale, thereby accelerating biological discovery and biomedical innovation. However, interrogating proteins is uniquely challenging because they cannot be amplified, are composed of complex chemical structures, and exist across a vast landscape of proteoforms. Techniques such as mass spectrometry typically drive large-scale proteomics studies; however, a new generation of technologies is pushing the boundaries, promising new features such as de novo, single-molecule, and/or higher-throughput sequencing and analysis. While many strategies are still in an early stage, a few modalities, such as fluorosequencing, single-molecule sequencing, digital proteomics mapping, and nanopore-based protein sequencing, have now reached or are thought to be nearing commercial implementation. In this review, we evaluate the mechanisms, current progress, and remaining challenges of these technologies while also highlighting how recent innovations are converging toward a new generation of proteomic technologies.

Similar Papers
  • Research Article
  • Cite Count Icon 1
  • 10.54097/hset.v40i.6699
The Emerging Landscape and Application of Protein Sequencing
  • Mar 29, 2023
  • Highlights in Science, Engineering and Technology
  • Hao Xu

Proteins play an indispensable role in the cells and tissues of all living organisms, including the human body. Determining the primary structure of a protein, known as protein sequencing, is an important subject in life science and medicine. Scientists have used chemical reactions such as Edman degradation as well as instrumental analysis such as mass spectrometry. These mainstream methods determine protein sequences effectively and are widely used. Newer methods such as nanopore sequencing, on the other hand, offer single-molecule sensitivity and show great promise. The development of protein sequencing technology will help people better understand the laws of life activities and achieve early diagnosis and precise treatment of diseases. This paper briefly summarizes the traditional protein sequencing methods and focuses on the new generation of sequencing technology represented by nanopore sequencing. It then compares their advantages and disadvantages to identify future research directions. Cheaper, higher-throughput, and more sensitive protein sequencing methods and instruments are expected to be developed and popularized in the near future.

  • Research Article
  • Cite Count Icon 2
  • 10.1093/annonc/mdq664
Innovative technology for cancer risk analysis
  • Jan 1, 2011
  • Annals of Oncology
  • S Tommas + 3 more

  • Research Article
  • Cite Count Icon 5
  • 10.1002/smtd.202000695
Nanopore Confinement for Single‐Molecule Measurement of Proteins
  • Nov 1, 2020
  • Small Methods
  • Yi‐Tao Long

  • Research Article
  • Cite Count Icon 14
  • 10.18358/np-30-4-i320
Generations of DNA Sequencing Methods (Review)
  • Nov 30, 2020
  • NAUCHNOE PRIBOROSTROENIE
  • A G Borodinov + 4 more

Several decades have passed since the development of the revolutionary DNA sequencing method by Frederick Sanger and his colleagues. After the Human Genome Project, the time interval between sequencing technologies began to shrink, while the volume of scientific knowledge continued to grow exponentially. Following Sanger sequencing, considered as the first generation, new generations of DNA sequencing were consistently introduced into practice. Advances in next generation sequencing (NGS) technologies have contributed significantly to this trend by reducing costs and generating massive sequencing data. To date, there are three generations of sequencing technologies. Second generation sequencing, which is currently the most commonly used NGS technology, consists of library preparation, amplification and sequencing steps, while in third generation sequencing, individual nucleic acids are sequenced directly to avoid bias and have higher throughput. The development of new generations of sequencing has made it possible to overcome the limitations of traditional DNA sequencing methods and has found application in a wide range of projects in molecular biology. On the other hand, with the development of next generation technologies, many technical problems arise that need to be deeply analyzed and solved. Each generation and sequencing platform, due to its methodological approach, has specific advantages and disadvantages that determine suitability for certain applications. Thus, the assessment of these characteristics, limitations and potential applications helps to shape the directions for further research on sequencing technologies.

  • Research Article
  • Cite Count Icon 5
  • 10.1002/cbic.202400824
Towards Next Generation Protein Sequencing.
  • Dec 18, 2024
  • Chembiochem : a European journal of chemical biology
  • Yakun Yi + 3 more

Understanding the structure and function of proteins is a critical objective in the life sciences. Protein sequencing, a central aspect of this endeavor, was first accomplished through Edman degradation in the 1950s. Since the late 20th century, mass spectrometry has emerged as a prominent method for protein sequencing. In recent years, single-molecule technologies have increasingly been applied to this field, yielding numerous innovative results. Among these, nanopore sensing has proven to be a reliable single-molecule technology, enabling advancements in amino acid recognition, short peptide differentiation, and peptide sequence reading. These developments are set to elevate protein sequencing technology to new heights. The next generation of protein sequencing technologies is anticipated to revolutionize our understanding of molecular mechanisms in biological processes and significantly enhance clinical diagnostics and treatments.

  • Research Article
  • 10.4233/uuid:894b72df-6a38-4e6b-b2d2-1dd740ba92db
Peptide Fingerprinting Using Single-Molecule Fluorescence
  • Jan 10, 2017
  • Research Repository (Delft University of Technology)
  • Jetty Van Ginkel

Proteins belong to the most important molecules in living organisms. They function as messengers, transporters and catalysts, and provide cells and tissues with structure. The expression profile of proteins is rich in information, which can be used, for example, in diagnosing diseases. Therefore proteomics, the large scale study of proteins, can give us valuable information on molecular pathways and state of health. As a result, proteomics has the potential to transform personalized medicine. Recent advances in mass spectrometry have led to a draft of the human proteome. With current mass spectrometry based techniques, these types of large scale studies remain an enormous effort. Therefore, there is a great need for breakthrough technologies to push proteomics from fundamental research into the clinic. Genomics has benefitted from fast and inexpensive emerging single-molecule techniques. We envision similar effects for single-molecule protein sequencing. In this thesis we present our technology that will allow us to analyze protein expression profiles of samples as small as a single cell with large dynamic range. Back in 2011, when this project was initiated, there was hardly any literature available on this topic. However, the past years more research groups openly shifted their focus to single-molecule protein sequencing. In Chapter 1, we give an overview of recent efforts to establish single-molecule protein sequencing. The foremost reason for the absence of highly sensitive and high-throughput protein sequencing techniques is the complexity of primary protein structures compared to DNA/RNA molecules. Where DNA and RNA consist of four unique building blocks, proteins are built from 20 distinctive amino acids. Independent of the read out method of choice, this requires the detection of 20 distinguishable signals. A non-trivial challenge. 
Fortunately, a limited number of proteins occur compared to the theoretical number that could be created using 20 unique building blocks. While the exact number of protein coding genes in the human genome is still under debate, the number is believed to be roughly 20,000, resulting in a number of protein products that is finite. This, together with protein databases such as UniProt, allows for an alternative way of identifying protein sequences. Rather than detecting every single element, as is essential for DNA sequencing, we choose to focus on detecting the sequence of a subset of elements.
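
The database-matching idea in this abstract (reading only a subset of residues and looking the result up among known proteins) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the choice of cysteine (C) and lysine (K) as the detected residues, the function names `fingerprint` and `identify`, and the three made-up sequences in `TOY_DATABASE`. It is not the thesis's actual pipeline, and it ignores measurement noise, missed labels, and digestion.

```python
from typing import Dict, List


def fingerprint(sequence: str, labelled: str = "CK") -> str:
    """Keep only the labelled residues, preserving their order in the sequence."""
    return "".join(aa for aa in sequence if aa in labelled)


def identify(read: str, database: Dict[str, str], labelled: str = "CK") -> List[str]:
    """Return the names of all database entries whose fingerprint equals the read."""
    return [
        name
        for name, seq in database.items()
        if fingerprint(seq, labelled) == read
    ]


# Hypothetical toy database; these are not real protein sequences.
TOY_DATABASE = {
    "protA": "MKCAAKLC",
    "protB": "MCCKAALK",
    "protC": "MAKKLCGC",
}

if __name__ == "__main__":
    # A read containing only the order of C and K residues.
    print(identify("KCKC", TOY_DATABASE))
```

In this toy case each sequence yields a distinct C/K fingerprint, so a two-letter read is enough to pick out one entry. With a real proteome, several proteins could share a fingerprint, which is why `identify` returns a list rather than a single name.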

  • Research Article
  • 10.1021/acs.jproteome.5c01187
A Selective History of Protein Sequencing and the Impact of Donald F. Hunt on Protein Sequencing by Tandem Mass Spectrometry
  • Apr 28, 2026
  • Journal of proteome research
  • John R Yates

This review is a selective history of protein sequencing through the year 2000, intended to illuminate just how challenging it was to establish the methods that enabled protein analysis, especially protein analysis by mass spectrometry. My goal is to provide context to the development and impact of mass spectrometry-based methods on protein sequencing over the last 75 years and to highlight the significant research critical to the development of protein sequencing methods. Because the chemistry of proteins is so diverse, an easy “one size fits all” protein sequencing method (such as the methods available for sequencing DNA) was not feasible. The original protein sequencing methods pursued by Sanger, Du Vigneaud, and Edman relied on chemical methods to decipher amino acid sequence, but in 1959 Klaus Biemann used mass spectrometry to analyze di- and tripeptides, thus introducing the use of mass spectrometry to the protein sequencing field. Donald F. Hunt of the University of Virginia was the first to develop tandem mass spectrometry-based methods for sequencing peptides and proteins, and many of his methods are in use to this day. Hunt made landmark contributions in mass spectrometry technology development and applied his tandem mass spectrometry methods widely across biology, particularly in the field of immunology. The aim of this review is to provide a historical context for Hunt’s contributions (as well as other mass spectrometrists!) to protein sequencing and proteomics and to honor Dr. Hunt’s contributions upon his retirement.

  • Research Article
  • Cite Count Icon 15
  • 10.1016/0065-2571(84)90023-2
Probing the infra-structure of thymidylate synthase and deoxycytidylate deaminase
  • Jan 1, 1984
  • Advances in Enzyme Regulation
  • Frank Maley + 2 more

  • Research Article
  • Cite Count Icon 267
  • 10.1128/ec.3.4.955-965.2004
Proteomic analysis of Candida albicans cell walls reveals covalently bound carbohydrate-active enzymes and adhesins.
  • Aug 1, 2004
  • Eukaryotic Cell
  • Piet W J De Groot + 7 more

Covalently linked cell wall proteins (CWPs) of the dimorphic fungus Candida albicans are implicated in virulence. We have carried out a comprehensive proteomic analysis of the covalently linked CWPs in exponential-phase yeast cells. Proteins were liberated from sodium dodecyl sulfate (SDS)-extracted cell walls and analyzed using immunological and advanced protein sequencing (liquid chromatography-tandem mass spectrometry [LC/MS/MS]) methods. HF-pyridine and NaOH were used to chemically release glycosylphosphatidylinositol-dependent proteins (GPI proteins) and mild alkali-sensitive proteins, respectively. In addition, to release both classes of CWPs simultaneously, cell walls were digested enzymatically with a recombinant beta-1,3-glucanase. Using LC/MS/MS, we identified 14 proteins, of which only 1 protein, Cht2p, has been previously identified in cell wall extracts by using protein sequencing methods. The 14 identified CWPs include 12 GPI proteins and 2 mild alkali-sensitive proteins. Nonsecretory proteins were absent in our cell wall preparations. The proteins identified included several functional categories: (i) five CWPs are predicted carbohydrate-active enzymes (Cht2p, Crh11p, Pga4p, Phr1p, and Scw1p); (ii) Als1p and Als4p are believed to be adhesion proteins. In addition, Pga24p shows similarity to the flocculins of baker's yeast. (iii) Sod4p/Pga2p is a putative superoxide dismutase and is possibly involved in counteracting host defense reactions. The precise roles of the other CWPs (Ecm33.3p, Pir1p, Pga29p, Rbt5p, and Ssr1p) are unknown. These results indicate that a substantial number of the covalently linked CWPs of C. albicans are actively involved in cell wall remodeling and expansion and in host-pathogen interactions.

  • Research Article
  • 10.54097/a43yhk04
Progress in the Application of CRISPR/Cas Family Mediated Third-generation Sequencing Technology
  • Sep 14, 2024
  • Academic Journal of Science and Technology
  • Xizhen Chen

The first generation sequencing technology based on the dideoxynucleotide (ddNTP) chain termination method proposed by Sanger was gradually eliminated due to its high cost, low sequencing read length and cumbersome process. The second generation of sequencing technology, called High-throughput sequencing (HTS), which was then developed, still has the problem of fixed read length. In recent years, the third generation sequencing technology represented by SMRT technology and Nanopore sequencing technology has gradually become popular. Compared with the previous two generations of sequencing technology, the most significant advantage of the third generation sequencing technology is its ability to carry out single molecule sequencing. In this process, the infinite length of nucleic acid sequence can be determined theoretically without the help of PCR amplification. This paper first introduces the basic principles, advantages and disadvantages of third-generation sequencing, and introduces in detail the CRISPR/Cas family-mediated SMRT technology and Nanopore sequencing technology in the third-generation sequencing technology. Finally, the research progress and prospects of the combination of the third generation sequencing technology and gene editing technology in the future are analyzed.

  • Research Article
  • Cite Count Icon 2
  • 10.1371/journal.pone.0238625.r008
Computational assessment of the feasibility of protonation-based protein sequencing
  • Sep 11, 2020
  • PLoS ONE
  • Giles Miclotte + 3 more

Recent advances in DNA sequencing methods revolutionized biology by providing highly accurate reads, with high throughput or high read length. These read data are being used in many biological and medical applications. Modern DNA sequencing methods have no equivalent in protein sequencing, severely limiting the widespread application of protein data. Recently, several optical protein sequencing methods have been proposed that rely on the fluorescent labeling of amino acids. Here, we introduce the reprotonation-deprotonation protein sequencing method. Unlike other methods, this proposed technique relies on the measurement of an electrical signal and requires no fluorescent labeling. In reprotonation-deprotonation protein sequencing, the terminal amino acid is identified through its unique protonation signal, and by repeatedly cleaving the terminal amino acids one-by-one, each amino acid in the peptide is measured. By means of simulations, we show that, given a reference database of known proteins, reprotonation-deprotonation sequencing has the potential to correctly identify proteins in a sample. Our simulations provide target values for the signal-to-noise ratios that sensor devices need to attain in order to detect reprotonation-deprotonation events, as well as suitable pH values and required measurement times per amino acid. For instance, an SNR of 10 is required for a 61.71% proteome recovery rate with 100 ms measurement time per amino acid.

  • Research Article
  • Cite Count Icon 14
  • 10.1371/journal.pone.0238625
Computational assessment of the feasibility of protonation-based protein sequencing.
  • Sep 11, 2020
  • PLOS ONE
  • Giles Miclotte + 2 more

  • Research Article
  • 10.14806/ej.21.a.815
Biobanking for the future, how to prepare for the next generation of Next Generation Sequencing
  • Mar 25, 2015
  • EMBnet.journal
  • Tomas Klingström

The technology watch of the BBMRI-LPC 2015 will focus on the pre-analytical, ethical and data management issues that may prevent biobanks from providing samples that fully take advantage of the next generation of sequencing technology. Biobanks are a long term commitment and by preparing for the next generation of technology, biobanks provide researchers with immediate access to material for testing scientific theories that would otherwise take years or decades to evaluate.

  • Supplementary Content
  • Cite Count Icon 62
  • 10.3389/fmicb.2023.1043967
Portable nanopore-sequencing technology: Trends in development and applications
  • Feb 1, 2023
  • Frontiers in Microbiology
  • Pin Chen + 13 more

Sequencing technology is the most commonly used technology in molecular biology research and an essential pillar for the development and applications of molecular biology. Since 1977, when the first generation of sequencing technology opened the door to interpreting the genetic code, sequencing technology has been developing for three generations. It has applications in all aspects of life and scientific research, such as disease diagnosis, drug target discovery, pathological research, species protection, and SARS-CoV-2 detection. However, the first- and second-generation sequencing technology relied on fluorescence detection systems and DNA polymerization enzyme systems, which increased the cost of sequencing technology and limited its scope of applications. The third-generation sequencing technology performs PCR-free and single-molecule sequencing, but it still depends on the fluorescence detection device. To break through these limitations, researchers have made arduous efforts to develop a new advanced portable sequencing technology represented by nanopore sequencing. Nanopore technology has the advantages of small size and convenient portability, independent of biochemical reagents, and direct reading using physical methods. This paper reviews the research and development process of nanopore sequencing technology (NST) from the laboratory to commercially viable tools; discusses the main types of nanopore sequencing technologies and their various applications in solving a wide range of real-world problems. In addition, the paper collates the analysis tools necessary for performing different processing tasks in nanopore sequencing. Finally, we highlight the challenges of NST and its future research and application directions.

  • Research Article
  • Cite Count Icon 2
  • 10.1360/tb-2021-0066
Recent advances in protein sequencing
  • May 14, 2021
  • Chinese Science Bulletin
  • Houkai Chen + 2 more

The function or dysfunction of proteins depends on their primary structures, and protein sequencing, which provides key information on protein-related biological processes and disease, plays an important role in biological, biomedical, and clinical research and applications. To obtain precise protein sequences, researchers have developed different methods over the past few decades, both conventional and new. The former include Edman degradation and mass spectrometry (MS), and the latter include single-molecule detection, nanopores, and other recently developed techniques. In the 1960s, classic Edman degradation was first developed for sequencing protein molecules from the N-terminus using a cyclic chemical reaction. Solid-phase and gas-phase Edman degradation were developed afterwards and still play a significant role in modern technologies. This review discusses the principle and limits of Edman degradation. It also discusses the advantages and shortcomings of MS-based approaches, which are the current standard methods for protein sequencing applications. Single-molecule approaches could revolutionize proteomics by providing the sensitivity needed for low-abundance protein detection and single-cell proteomics. In single-molecule nucleic acid sequencing, the four bases of DNA/RNA can be detected effectively using label-free or fluorescence labelling strategies. However, it is still a challenge to label and analyze all twenty kinds of amino acid residues. Sensitive optical detection has been used for high-throughput protein sequencing with fluorescence labelling. In this approach, selected residues of the peptides are labelled and the C-terminus is anchored onto a glass substrate. The N-terminus is degraded through Edman cycles, and the sequence is then inferred from the wide-field fluorescence signals.
This method has the potential for large-scale, sensitive, and parallel detection, and its principle and characteristic features are discussed in detail. Nanopores, both biological and solid-state, have emerged as powerful tools for protein sequencing. A nanopore provides a single-molecule sensing interface and a controlled nano-confined space, enabling ultimate sensitivity and high spatiotemporal resolution. Nanopore-based technologies rely on the interaction between functional groups and the nanopore, which modulates the ionic current. Information about the peptides can be obtained by monitoring the ionic current responses. Arrayed nanopores have the potential for high-throughput detection of low-abundance targets. The approach is still at an early stage of development, and some challenges need to be addressed. As a “fingerprint” signal, the Raman spectrum is an ideal candidate for protein sequencing. However, very weak signals can significantly restrict its application, especially at low concentrations of the target molecule. Surface-enhanced Raman spectroscopy (SERS) can enhance the Raman signal to achieve detection at the single-molecule scale. The combination of SERS and nanopores has demonstrated label-free detection of ten kinds of amino acids, offering a new strategy for protein sequencing. Compared with the weak Raman signal, fluorescence signals are more accessible, even at the single-molecule level. Several molecular dynamics (MD) simulations are discussed that show the possibility of sequencing fluorescence-labelled proteins within a nanopore. Nevertheless, some drawbacks need to be addressed, especially the high cost of nanopore fabrication and the translocation of proteins through a pore.
Finally, this review discusses future challenges and summarizes recent efforts to break the bottleneck of current protein sequencing, which should promote the development of medical treatment, disease diagnosis, and related fields.
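
The fluorescence-labelling readout described in this abstract (label selected residues, remove one N-terminal residue per Edman cycle, watch for drops in fluorescence) can be sketched in idealized form. The Python below is an illustrative assumption, not the method of any cited paper. The function name `fluorosequence` and the choice of C and K as labelled residues are hypothetical, and the model assumes every cycle succeeds, every label is detected, and the anchored C-terminal residue behaves like any other.

```python
from typing import Dict, List


def fluorosequence(peptide: str, labelled: str = "CK") -> Dict[str, List[int]]:
    """Map each labelled residue type to the Edman cycles at which its dye is lost.

    Cycle n removes the n-th residue counted from the N-terminus (1-based),
    so a labelled residue at position n causes a fluorescence drop at cycle n.
    """
    drops: Dict[str, List[int]] = {aa: [] for aa in labelled}
    for cycle, aa in enumerate(peptide, start=1):
        if aa in drops:
            drops[aa].append(cycle)
    return drops


if __name__ == "__main__":
    # Made-up peptide, written N-terminus first.
    print(fluorosequence("GCAKLC"))
```

The resulting pattern of drop cycles is a partial, position-resolved view of the peptide rather than its full sequence, which is why such readouts are typically interpreted against a reference database.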
