Abstract

Background

Anopheles funestus is one of the 3 most consequential and widespread vectors of human malaria in tropical Africa. However, the lack of a high-quality reference genome has hindered the association of phenotypic traits with their genetic basis in this important mosquito.

Findings

Here we present a new high-quality A. funestus reference genome (AfunF3) assembled using 240× coverage of long-read single-molecule sequencing for contigging, combined with 100× coverage of short-read Hi-C data for chromosome scaffolding. The assembled contigs total 446 Mbp of sequence and contain substantial duplication due to alternative alleles present in the sequenced pool of mosquitos from the FUMOZ colony. Using alignment and depth-of-coverage information, these contigs were deduplicated to a 211 Mbp primary assembly, which is closer to the expected haploid genome size of 250 Mbp. This primary assembly consists of 1,053 contigs organized into 3 chromosome-scale scaffolds with an N50 contig size of 632 kbp and an N50 scaffold size of 93.811 Mbp, representing a 100-fold improvement in continuity versus the current reference assembly, AfunF1.

Conclusion

This highly contiguous and complete A. funestus reference genome assembly will serve as an improved basis for future studies of genomic variation and organization in this important disease vector.

Highlights

  • Many insect genomes remain a challenge to assemble, and mosquito genomes have proven difficult owing to their repeat content and structurally dynamic genomes

  • Using alignment and depth-of-coverage information, these contigs were deduplicated to a 211 megabase pairs (Mbp) primary assembly, which is closer to the expected haploid genome size of 250 Mbp

  • These alternative alleles likely derive from natural variation circulating within the sequenced FUMOZ colony, as the DNA from a pool of adult mosquitoes was required for Pacific Biosciences (PacBio) library preparation


Summary

Introduction and background

Many insect genomes remain a challenge to assemble, and mosquito genomes have proven difficult owing to their repeat content and structurally dynamic genomes. When inbreeding is not possible, the sequenced pool of individuals can carry population variation that fragments the resulting assembly. In this case, instead of assembling a single genome, the assembler must reconstruct some unknown number of variant haplotypes. An initial assembly of the long-read data alone (AfunF3 contigs) yielded a contig N50 size of 94.05 kbp (the N50 is the length such that 50% of assembled bases are in contigs of this size or greater) and extensive haplotype separation, as evidenced by an inflated assembly size of 446.04 Mbp and a high rate of core gene duplications (48%) as measured by BUSCO [42]. These alternative alleles likely derive from natural variation circulating within the sequenced FUMOZ colony, as the DNA from a pool of adult mosquitoes was required for Pacific Biosciences (PacBio) library preparation. These results demonstrate the greater continuity of the updated assembly, which provides sequence-resolved reconstructions of many A. funestus intergenic regions for the first time.
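The N50 definition given above can be made concrete with a short sketch. This is an illustration only, not the tooling used for the assembly: the function name `n50` and the toy contig lengths are assumptions chosen for the example, and the lengths are not taken from AfunF3.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a set of sequence lengths.

    The N50 is the length L such that sequences of length >= L
    together contain at least 50% of all assembled bases.
    """
    ordered = sorted(lengths, reverse=True)  # longest sequence first
    total = sum(ordered)
    if total <= 0:
        raise ValueError("need at least one sequence of positive length")

    running = 0
    for length in ordered:
        running += length
        # Stop at the first sequence that brings the running sum to half
        # of the total; integer arithmetic avoids floating-point rounding.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for a positive total")


# Hypothetical toy contig lengths (arbitrary units), for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # prints 70: 80 + 70 = 150 is half of the 300 total
```

Because the statistic depends only on the sorted lengths, the same function applies to contigs or to scaffolds. A fragmented assembly with many short contigs yields a low N50 even when its total size is large.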

