Abstract
The February 15th, 2001 announcement of the draft human genome sequence was the culmination of a momentous undertaking. The analyses of this sequence (International Human Genome Sequencing Consortium 2001; Venter et al. 2001) predicted a surprisingly modest 31,000 genes for Homo sapiens (although this number has yet to be finalized), as compared with estimates as high as 140,000 genes just a few years ago (Fields et al. 1994). Given this, we narrowly top the list for the eukaryotic genomes that have been completely sequenced (Table 1). Even though sequencing the human genome may be merely a first pass at a deeper understanding of our biology, one fact stands out as demanding an immediate explanation: Why do humans have so few genes? The assumptions and chauvinism implicit in this question (that humans are vastly more complex than the other fully sequenced eukaryotes and should therefore have a commensurately larger suite of genes) are difficult to argue clearly and may be even more difficult to justify biologically (McShea 1996). Still, it is hard to deny our intuitive perception that the number of genes in a genome should be roughly correlated with complexity and that organismal complexity can be ranked as yeast < nematodes < flies < humans (we reserve judgment on the relative position of the “green fly,” Arabidopsis). However, the number of genes in the genomes of these organisms does not match our naive expectation. This disjunction between the number of genes and organismal complexity, which we call the “G-value paradox,” parallels the finding during the 1950s that the physical size of genomes does not correlate with organismal complexity, a relationship known as the C-value paradox (Cavalier-Smith 1985; Appendix and Table 1). The finding that much of the genome contains noncoding repeats and “junk” DNA seemed to resolve the C-value paradox.
Implicitly, this resolution rested on the assumption that once noncoding DNA was taken into account, the total number of genes would then correlate with organismal complexity (Cavalier-Smith 1985). However, the published G values of the completely sequenced eukaryotes make it clear that we have not yet resolved the C-value paradox; it has merely given way to the G-value paradox. Just as the discovery of noncoding DNA seemed to resolve the C-value paradox, so a few simple observations may in time resolve the G-value paradox. These observations all attempt to give more value to each of our genes and thus to give us a more accurate genomic predictor of organismal complexity by identifying the true measure of information encoded by a genome, the “I-value” (Appendix; the concept of “information” with respect to genes is itself highly debated, see Oyama 1985 and Sarkar 1998, but a philosophical resolution to this issue is beyond the scope of this article). Some of the observations we discuss here have been offered on their own as the explanation for our modest number of genes (Davidson 2001; Hanke et al. 1999; Szathmary et al. 2001), whereas some have been invoked in combination (International Human Genome Sequencing Consortium 2001; Petsko 2001). These observations indicate that the evolution of organismal complexity will typically involve changes in the genome that are subtler than simply adding genes. The C-value paradox was resolved by a plea to the G value; a resolution of the G-value paradox may be offered by a plea to the I value. However, what if no measure of genomic information content, no matter how precise, correctly predicts organismal complexity? Our last observation will attempt to undermine the basic assumption that organismal complexity somehow corresponds to even a refined measure of genomic complexity.