Abstract

The past thirty years have witnessed a renaissance in biology as advances in technology contributed to discoveries at ever-greater orders of magnitude. One of the primary reasons for this revolution has been the advancement of technologies that allow high-throughput discovery and processing of data. This accomplishment has placed volumes of data in the realm of “discovery” science. A milestone in this period came with the complete sequencing of several microbial genomes, followed by the sequencing of the first multicellular organism, Caenorhabditis elegans, and eventually that of humans and various model organisms, such as Drosophila melanogaster. The edifice of the genetic code fell by wedding a biological technique developed by Sanger, known as shotgun sequencing (Sanger et al., 1977), with computational techniques utilizing high-speed computers. Without the advances in computer chips and processors, at a pace defined by Moore’s law (Moore, 1965), sequencing would have been dramatically slower and would not have brought about the age of bioinformatics, a symbiosis of biological data, large amounts of information, and computer science. The hypothesis that genome size is related to organism complexity is quickly discarded when comparing Homo sapiens, which has a genome of only 3.1 billion base pairs (Olivier et al., 2001; Venter et al., 2001), to other organisms. Estimates for the marbled lungfish, Protopterus aethiopicus, suggest 133 billion base pairs (Pedersen, 1971), making it the largest vertebrate genome, while, to date, the lowly amoeba, Amoeba dubia, is estimated to have the largest genome overall at 670 billion base pairs (McGrath & Katz, 2004). However, large genomes may be a liability, as suggested in the plant world, where Paris japonica, which has a genome of approximately 150 billion base pairs (Pellicer et al., 2010), grows more slowly and is more sensitive to changes in the environment (Vinogradov, 2003).
In vertebrates, there appears to be an inverse correlation between genome size and brain size (Andrews & Gregory, 2009); thus, complexity may lie with other factors, such as epigenetics and protein interactions. While estimates of the human gene number range from 20,000 to 30,000, these genes may encode over 500,000 proteins. The proteome of a cell can range from several thousand proteins in prokaryotes to over 10,000 in eukaryotes. These numbers are
