Abstract
The identification of regulatory sequences in animal genomes remains a significant challenge. Comparative genomic methods that use patterns of evolutionary conservation to identify non-coding sequences with regulatory function have yielded many new vertebrate enhancers. However, these methods have not contributed significantly to the identification of regulatory sequences in sequenced invertebrate taxa. We demonstrate here that this differential success, which is often attributed to fundamental differences in the nature of vertebrate and invertebrate regulatory sequences, is instead primarily a product of the relatively small size of sequenced invertebrate genomes. We sequenced and compared loci involved in early embryonic patterning from four species of true fruit flies (family Tephritidae) that have genomes four to six times larger than that of Drosophila melanogaster. Unlike in Drosophila, where virtually all non-coding DNA is highly conserved, blocks of conserved non-coding sequence in tephritids are flanked by large stretches of poorly conserved sequence, similar to what is observed in vertebrate genomes. We tested the activities of nine conserved non-coding sequences flanking the even-skipped gene of the tephritid Ceratitis capitata in transgenic D. melanogaster embryos; six of these sequences drove patterns that recapitulate those of known D. melanogaster enhancers. In contrast, none of the three non-conserved tephritid non-coding sequences that we tested drove expression in D. melanogaster embryos. Based on the landscape of non-coding conservation in tephritids, and our initial success in using conservation in tephritids to identify D. melanogaster regulatory sequences, we suggest that comparison of tephritid genomes may provide a systematic means to annotate the non-coding portion of the D. melanogaster genome.
We also propose that large genomes be given more consideration in the selection of species for comparative genomics projects, to provide increased power to detect functional non-coding DNAs and to provide a less biased view of the evolution and function of animal genomes.
Highlights
Animal genomes differ considerably in size, ranging from 20 million to over 100 billion basepairs [1], with significant variation between even closely related species. This diversity is reflected in sequenced animal genomes, which currently range from the nematode Meloidogyne incognita at around 80 Mb to humans at around 3.2 Gb, with a marked difference in the sizes of sequenced genomes of invertebrates and vertebrates.
The difference in locus size is roughly proportional to the difference in genome size, and the larger size of tephritid loci is primarily due to increases in the size of introns and intergenic regions, and not of coding DNA (Table 2).
The Value of Big Genomes in Comparative Genomics: When we began working with tephritid genomes, we viewed their large size as an annoyance that necessitated the screening of an unusually large number of clones to identify genes of interest.
Summary
Animal genomes differ considerably in size, ranging from 20 million to over 100 billion basepairs [1], with significant variation between even closely related species (see Figure 1). This diversity is reflected in sequenced animal genomes, which currently range from the nematode Meloidogyne incognita at around 80 Mb to humans at around 3.2 Gb, with a marked difference in the sizes of sequenced genomes of invertebrates (most are smaller than 250 Mb) and vertebrates (most are larger than 1 Gb). No tetrapods are known to have genomes smaller than 1 Gb, while most large invertebrate taxa contain species with far smaller genomes. It is still not clear why these differences exist, although several explanations have been proposed [2,3]. These broad trends in genome size do not fully account for the bias in the sizes of sequenced genomes.