All forms of genetic variation originate from new mutations, making it crucial to understand their rates and mechanisms. Here, we use long-read PacBio sequencing to investigate de novo mutations that accumulated in 12 mouse lines, derived from three commonly used inbred strains (C3H, C57BL/6, and FVB) and maintained for 8-15 generations in a mutation accumulation (MA) experiment. We built chromosome-level genome assemblies based on the MA line founders' genomes, and then employed a combination of read-based and assembly-based methods to call the complete spectrum of new mutations. On average, ~45 mutations arise per haploid genome per generation; just over half of them (54%) are insertions and deletions shorter than 50 bp (indels). The remainder are single-nucleotide mutations (SNMs, 44%) and large structural mutations (SMs, 2%). We found that the degree of DNA repetitiveness is positively correlated with SNM and indel rates, and that a substantial fraction of SMs can be explained by homology-dependent mechanisms associated with repeat sequences. Most indels (90%) can be attributed to microsatellite contractions and expansions, and there is a marked bias towards 4 bp indels. Among the different types of SMs, tandem repeat mutations have the highest mutation rate, followed by insertions of transposable elements (TEs). We uncover a rich landscape of active TEs, with notable differences in their spectrum among MA lines and strains, as well as a high rate of gene retroposition. Our study offers novel insights into mammalian genome evolution and highlights the importance of repetitive elements in shaping genomic diversity.
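As a rough illustration of the arithmetic behind these proportions, the sketch below converts the quoted shares into approximate per-class mutation counts per haploid genome per generation. It uses only the rounded figures stated in the abstract (~45 mutations; 54% indels, 44% SNMs, 2% SMs), so the results are approximate; the function and variable names are hypothetical and are not taken from the study's own analysis.

```python
# Rounded figures quoted in the abstract; results are therefore approximate.
TOTAL_PER_HAPLOID_GENOME_PER_GENERATION = 45
CLASS_SHARES = {"indel": 0.54, "SNM": 0.44, "SM": 0.02}


def per_class_counts(total, shares):
    """Split a total mutation count into per-class counts by share.

    Hypothetical helper for illustration only.
    """
    # The shares should cover the whole spectrum, i.e. sum to 1.
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("class shares must sum to 1")
    return {name: total * share for name, share in shares.items()}


counts = per_class_counts(TOTAL_PER_HAPLOID_GENOME_PER_GENERATION, CLASS_SHARES)
for name, value in counts.items():
    print(f"{name}: ~{value:.1f} per haploid genome per generation")
```

Under these rounded inputs, this works out to roughly 24.3 indels, 19.8 SNMs, and 0.9 SMs per haploid genome per generation.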