Abstract

On February 12, 2001, an unprecedented collection of papers describing the initial sequencing and analysis of the human genome was published in Nature and Science (International Human Genome Sequencing Consortium 2001; Venter et al. 2001). Although much celebration and press attention had surrounded the earlier announcement in June 2000 of the coverage of the vast majority of the human genome sequence in working draft form, the publications in February 2001 carried with them the kind of satisfying scientific significance that laborers in the genome fields had longed for: the full description of the methods used to determine the letters of over 90% of the human instruction book, and a host of surprising revelations from the analysis of its contents. This brief essay represents a personal reflection on how we got here, and where we are going.

The achievement of these landmarks, coming years ahead of the original schedule, was only possible because of the advances begun early in the preceding decade, reflecting the polyphonic set of interconnected goals that the planners of the Human Genome Project (HGP) wisely included as part of the original master plan. Science traditionally operates by the process of researchers standing on the shoulders of those who came before, and that has certainly been true for the HGP. Building detailed genetic and physical maps, developing better, cheaper, and faster technologies for handling DNA, and mapping and sequencing the more modest-sized genomes of model organisms were all critical stepping stones on the path to initiating the large-scale sequencing of the human genome.

Pilot efforts to sequence the human genome began in the mid-1990s. When the International Human Genome Sequencing Consortium met for the first time in Bermuda in 1996, there was a sense of excitement, but the magnitude of the task at hand was sobering: throughput was too low, costs were too high, and technology was still immature.
Despite that anxiety, the assembled scientific leaders from several countries at that meeting endorsed the importance of high quality sequence, and made one of the most crucial decisions of the genome era: immediate data release. Led by John Sulston and Bob Waterston, who had adopted this same policy for the sequence of C. elegans, the assembled sequencing center directors unanimously adopted a statement that all assembled contigs greater than 1 or 2 kb would be placed in public databases within 24 hours. The argument was simple: The sequence would only benefit the public fully if it could be understood, and that required making it immediately available so that all the creative minds of the planet could work on it. The establishment of this principle was one of the defining moments of the HGP.

Over the next three years the rate-limiting steps of large-scale sequencing began to yield to creative innovations. The genome centers implemented major improvements in library production, template preparation, and laboratory information management, so that less and less human intervention was required in the main production pipelines. The advent of capillary sequencing machines from Amersham and ABD provided a much-needed boost in efficiency. Much has also been made of the appearance of a commercial entity on the scene in May 1998 (Celera Genomics) as an additional nudge to the HGP. Whereas it is fair to say that the resulting sense of competition provided an additional incentive to the genome centers, it would be misguided to say that the HGP was previously operating in a relaxed fashion, or that the significant advances in throughput would otherwise not have happened. After all, most of those advances were born of previous accomplishments of the HGP itself.

From my perspective, a major turning point occurred in Houston in February 1999.
The largest NIH-funded centers (at the Whitehead Institute, Washington University, and Baylor College of Medicine) had just undergone rigorous peer review of their proposals to scale up sequencing throughput and were about to receive a significant increase in funding. The Sanger Centre in Hinxton (UK) and the Joint Genome Institute (Walnut Creek, CA) of the Department of Energy were also scaling up production rapidly. An experiment carried out the preceding summer had documented the high degree of utility of a “working draft” human genome sequence; the half-dozen labs that compared draft and finished sequences found that the draft could answer most of the scientific queries they posed (although it was harder to work with), and suggested that the majority of the HGP’s efforts might well be devoted to obtaining working-draft coverage of the genome as quickly as possible, as long as the commitment to finishing was not diminished. Accordingly, the sequencing plans for the NIH and DOE that were

E-MAIL: fc23a@nih.gov; FAX (301) 402-0837. Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.1898.

Commentary
