Abstract

DNA sequences that are exactly conserved over long evolutionary time scales have been observed in a variety of taxa. Such sequences are likely under strong functional constraint and they have been useful in the field of comparative genomics for identifying genome regions with regulatory function. A potential new application for these ultra-conserved elements (UCEs) has emerged in the development of gene drives to control mosquito populations. Many gene drives work by recognizing and inserting at a specific target sequence in the genome, often imposing a reproductive load as a consequence. They can therefore select for target sequence variants that provide resistance to the drive. Focusing on highly conserved, highly constrained sequences lowers the probability that variant, gene drive-resistant alleles can be tolerated. Here, we search for conserved sequences of 18 bp and over in an alignment of 21 Anopheles genomes, spanning an evolutionary timescale of 100 million years, and characterize the resulting sequences according to their location and function. Over 8000 UCEs were found across the alignment, with a maximum length of 164 bp. Length-corrected gene ontology analysis revealed that genes containing Anopheles UCEs were over-represented in categories with structural or nucleotide-binding functions. Known insect transcription factor binding sites were found in 48% of intergenic Anopheles UCEs. When we looked at the genome sequences of 1142 wild-caught mosquitoes, we found that 15% of the Anopheles UCEs contained no polymorphisms. Our list of Anopheles UCEs should provide a valuable starting point for the selection and testing of new targets for gene-drive modification in the mosquitoes that transmit malaria.
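The search described above (exactly conserved stretches of 18 bp or more across an alignment) can be illustrated with a minimal sketch. This is not the authors' pipeline, which the abstract does not describe. The function name `find_conserved_runs` and the toy sequences are hypothetical, and the sketch assumes the alignment is already loaded as equal-length strings, with gaps and ambiguous bases treated as breaking conservation.

```python
from typing import List, Tuple


def find_conserved_runs(aligned: List[str], min_len: int = 18) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column ranges in which every aligned
    sequence carries the same unambiguous base (A, C, G or T) for at
    least `min_len` consecutive columns."""
    if not aligned:
        return []
    width = len(aligned[0])
    if any(len(seq) != width for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")

    runs = []
    run_start = None  # column where the current conserved run began
    for col in range(width):
        column = {seq[col].upper() for seq in aligned}
        # Conserved only if one distinct symbol and it is a real base;
        # gaps ('-') and ambiguity codes ('N') end the run (an assumption).
        conserved = len(column) == 1 and column <= set("ACGT")
        if conserved:
            if run_start is None:
                run_start = col
        else:
            if run_start is not None and col - run_start >= min_len:
                runs.append((run_start, col))
            run_start = None
    # Close a run that extends to the end of the alignment.
    if run_start is not None and width - run_start >= min_len:
        runs.append((run_start, width))
    return runs


# Toy alignment (hypothetical data); column 4 differs in the second sequence.
toy = ["ACGTACGTAA", "ACGTTCGTAA", "ACGTACGTAA"]
print(find_conserved_runs(toy, min_len=4))  # [(0, 4), (5, 10)]
```

With the default `min_len=18` the function mirrors the length threshold stated in the abstract. A real whole-genome alignment would also need coordinate mapping back to a reference assembly, which this sketch omits.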

Highlights

  • DNA sequences that are highly conserved over long evolutionary timescales have been identified in many organisms

  • When we looked at the genome sequences of 1142 wild-caught mosquitoes, we found that 15% of the Anopheles ultra-conserved elements (UCEs) contained no polymorphisms

  • While it is still unclear why sequences might be conserved at the nucleotide level over long evolutionary timescales, it has been shown that UCEs: (1) are often involved in regulating the transcription of genes, especially essential genes involved in development (e.g. Visel et al 2008); (2) may have a role in chromosomal structure (e.g. Chiang et al 2008); and (3) are sometimes non-coding RNA genes (e.g. Kern et al 2015)

Introduction

DNA sequences that are highly conserved over long evolutionary timescales have been identified in many organisms. Some of these sequences show complete conservation at the nucleotide level and are often known as ultra-conserved elements (UCEs). While it is still unclear why sequences might be conserved at the nucleotide level over long evolutionary timescales, it has been shown that UCEs: (1) are often involved in regulating the transcription of genes, especially essential genes involved in development (e.g. Visel et al 2008); (2) may have a role in chromosomal structure (e.g. Chiang et al 2008); and (3) are sometimes non-coding RNA genes (e.g. Kern et al 2015). Even UCEs in protein-coding regions may have multi-functional roles (Warnefors et al 2016). UCEs can act as probes to facilitate genomic sequencing of non-model organisms using sequence-capture methods (Faircloth et al 2012). Alterations in UCEs have been associated with human cancers (e.g. Calin et al 2007; Lin et al 2012).
