Abstract

In the fall of 2001, the editors of BioTechniques decided to launch a new quarterly Supplement Series to address the critical issues currently facing the biological research laboratory. It was more than fitting that they chose to devote the first edition to the emerging field of Computational Proteomics. The goal was to gather experts in the fields of bioinformatics and proteomics to explore the role of these technologies in drug discovery. The emphasis was on evaluating the bottleneck created by the vast amount of protein data being generated and the different strategies being developed to overcome it and accelerate the underlying process of identifying drug targets. It seems appropriate to introduce this publication with a brief historical background on the development of the science of proteomics, followed by a short commentary on each article appearing in this issue.

In 1945, Fred Sanger developed one of the first methods for protein characterization by amino acid sequence analysis, using fluorodinitrobenzene, as part of his first Nobel Prize-winning work on the structure of insulin. A more efficient approach that used sequential degradation of residues from the amino terminus with phenylisothiocyanate was described in 1950 by Pehr Edman. Although a large number of protein sequences have been determined using automated “Edman degradation” over the years, the method is quite time-consuming and not amenable to large-scale identification of proteins and their primary structures. In the meantime, the advances in nucleic acid sequencing techniques and the ease of deducing protein sequences from DNA sequences were reflected in an article published in Nature in 1978 entitled “The decline and fall of protein chemistry?” (4). As a young graduate student sequencing proteins under the supervision of the Nobel Prize-winning protein chemist Rodney Porter at the University of Oxford, I was obviously quite disappointed to read that.
I did not have enough courage to ask him about the utility of becoming a protein chemist, but posed the question to his PhD supervisor, Fred Sanger. He assured me that, given the way developments were then taking place in recombinant DNA technology, we should in our time see advances in protein chemistry that had not even been contemplated. He commented that we probably would be dealing with novel technology capable of characterizing proteins present in very trace amounts. I vividly remembered his words when, fifteen years later, in 1995, it took me less than five minutes to analyze the tryptic digest of a protein band from a gel using a mass spectrometer and identify the protein by correlative sequence database searching. In most instances, that would now take less than a minute, with the added luxury of a completely automated high-throughput process. The feasibility of such methods was first presented in a poster at the 1989 Protein Society Symposium in Seattle by W.J. Henzel, J.T. Stults, and C. Watanabe of Genentech, and in another presented two years later at the Baltimore symposium by J.R. Yates, P.R. Griffin, T. Hunkapiller, S. Speicher, and L.E. Hood of Caltech (3). Since these contributions were not included in any published proceedings, knowledge of the new method gained only limited circulation. It was not until 1993 that similar studies from five different groups appeared in the scientific literature.

The word proteome, indicating the proteins expressed by a genome, was coined by Marc Wilkins and colleagues, first used in late 1994 at the Siena 2-D Electrophoresis meeting, and appeared in the published literature for the first time in July 1995 (5). It is interesting to note that in the same month, the first complete sequence of the genome of a living organism, Haemophilus influenzae Rd, was published. In the interim period, the complete sequences of several genomes have been determined, and the finished human genome will be available in the near future.
However, our understanding of the thousands to hundreds of thousands of proteins encoded by these genomes is lagging far behind. Since most cellular processes occur at the protein level, a thorough comprehension of the proteome, i.e., the pattern of proteins produced by a specific cell under a particular set of conditions, would definitely accelerate drug, vaccine, and diagnostic target discovery. Our current understanding of the human genome has led to the notion that there are fewer genes than originally thought, and this has become a strong driving force behind proteomics. Plans to elucidate the human proteome have been discussed at a meeting organized by the recently created Human Proteome Organization or HUPO (www.hupo.org) (1). In fact, the effort to produce an index of all human proteins, the human protein index, dates back to 1981 (2).

The field of proteomics has rapidly expanded and includes diverse technologies like 2-D gel and mass spectrometry-based methods for protein profiling, protein microarrays, yeast two-hybrid projects, protein-protein interaction pathways and cell signaling networks, high-throughput protein structural studies using mass spectrometry, nuclear magnetic resonance, and X-ray crystallography, and high-throughput computational methods for protein 3-D structure as well as function determination. Two-dimensional (2-D) gel electrophoresis and mass spectrometry-based methods are currently the major experimental technologies for large-scale as well as high-throughput analysis of proteomes focused on protein identification and their qualitative and quantitative comparison. In the review article, Deb Chakravarti, Bulbul Chakravarti, and Ioannis Moutsatsos have addressed the state-of-the-art software and informatics tools for proteome profiling using these techniques.
These authors have drawn attention to the lack of significant advances in the development of informatics and software tools to support the analysis and management of the huge amounts of data generated in the process and the integration of such information with other sources of protein structural and functional information. In this context, the authors have discussed the importance of relational databases for protein profiling data management.

Proteome profiling is a very powerful tool in clinical medicine for identification of diagnostic markers. Clinical applications of proteomics can also provide information on drug targets, the mechanism of drug action, and drug-mediated toxicity. Eric Fung and Cynthia Enderwick have provided an elegant background of the ProteinChip technology based on the surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF)
