Biology in silico – a mixed bag: Computational Methods in Molecular Biology (New Comprehensive Biochemistry Vol. 32) edited by S. L. Salzberg, D. B. Searls and S. Kasif

L Aravind

doi:10.1016/s0968-0004(99)01450-4

Abstract

Elsevier, 1998. US$59.00 (xxvi + 371 pages) ISBN 0 444 50204 1

The past decade has seen the spectacular rise of yet another ‘flavor’ of biology – computational biology (or bioinformatics). As a computational biologist, I have always felt that this discipline has the potential to bridge the gap between several disparate aspects of biological research. However, given that the founders of this field hail from backgrounds as diverse as computer science, the physical sciences and biology, there is a certain degree of heterogeneity in its practice. It has even led to the question of whether computational biology can be considered a coherent discipline or merely a bag of support tools to aid the experimental biologist in the data-rich environment of the 1990s. This question might also occur to the cursory reader of this book, although, admittedly, the title of Computational Methods in Molecular Biology probably justifies the lack of a unified approach. An attempt at gelling the discrete parts is made by the well-written introductory or tutorial chapters by Searls and Salzberg, which, to some extent, also offer the basic conceptual introduction requisite for further reading. In spite of this, the reader is at the mercy of the variable explanatory skills of the individual authors in the subsequent chapters. Of note are the chapters on hidden Markov models in sequence analysis by Krogh and on splice-site finding by Burge, which provide good introductory accounts of these topics.

Undoubtedly, gene screening and biopolymer sequence analysis are two of the most important aspects of computational biology. Although a promising overview of comparative techniques in sequence analysis is offered in the chapter by States and Reisdorf, the details receive insufficient attention. There is a particular deficit in the description of the methodologies of generating and evaluating multiple sequence alignments and evolutionary reconstructions based on protein sequences.
Further, the tome already feels the adversity of the rapid burgeoning of the field – PSI-BLAST, a redoubtable addition to the sequence analysis armamentarium, made its advent after the book was compiled. In terms of gene prediction, the coverage is more complete but would have benefited from a more extensive discussion of homology-based techniques that could augment the basic statistical and machine-learning-based techniques. These two basic aspects of computational biology are linked to the major challenges that face today’s researchers – genome analysis and annotation. This area receives the attention of a single chapter that describes certain aspects, but, unfortunately, this hardly does justice to the entire range of issues in this area, including protein function prediction and reconstruction of an organism’s biology based on its genome sequence.

The third important area of ‘in silico’ biology is protein structure prediction, protein structure comparison and related issues. About a third of this volume is devoted to this, and several chapters are focused specifically on protein structure prediction using threading and on ligand docking. The chapter by Jones in particular provides a good basic introduction to the problem and the different possible approaches, including his own method, which met with considerable success in the 1994 edition of the Critical Assessment of Techniques for Protein Structure Prediction (CASP) contest. Apart from this, the section on structural aspects of computational biology lacks important topics such as secondary structure prediction and protein structure comparison and alignment. It is becoming increasingly clear that establishment of homology with known structures is likely to be the most powerful means of large-scale fold recognition.
Unfortunately, the book is already out of date in this respect.

With information now available in unprecedented quantity and phylogenetic range, computational biology is becoming increasingly relevant in all branches of the life sciences. Even the most myopic graduate student and bench biologist are certain to see the need to educate themselves in this field. What is the relevance of this book in the education of such aspirants? It is certainly no introductory textbook, given the vastly different quality and relevance of the various chapters, as well as the number of lacunae in the key subjects. Nor does it serve as a protocol book that guides the user through practical applications. Nevertheless, a student in the middle of his or her learning curve could make successful use of some of the chapters as a platform to explore certain topics in greater detail. The book could also be of some use as reference material for those interested in theoretical aspects of particular methodologies of gene prediction and protein structure studies.
