Abstract

Pseudogenes are ideal markers of genome remodelling. In turn, the mouse is an ideal platform for studying them, particularly with the recent availability of strain-sequencing and transcriptional data. Here, combining both manual curation and automatic pipelines, we present a genome-wide annotation of the pseudogenes in the mouse reference genome and 18 inbred mouse strains (available via the mouse.pseudogene.org resource). We also annotate 165 unitary pseudogenes in mouse and 303 in human. The overall pseudogene repertoire in mouse is similar to that in human in terms of size, biotype distribution, and family composition (e.g. with GAPDH and ribosomal proteins being the largest families). Notable differences arise in the pseudogene age distribution, with multiple retro-transpositional bursts in mouse evolutionary history and only one in human. Furthermore, in each strain about a fifth of all pseudogenes are unique, reflecting strain-specific evolution. Finally, we find that ~15% of the mouse pseudogenes are transcribed, and that highly transcribed parent genes tend to give rise to many processed pseudogenes.

Highlights

  • Pseudogenes are ideal markers of genome remodelling

  • We provide the latest updates on the pseudogene annotation for both the mouse and human reference genomes, with a particular emphasis on the identification of new unitary pseudogenes

  • As pseudogene assignments are highly dependent on the quality of the protein-coding annotation, the manually curated set provides a high-quality lower bound with respect to the true number of pseudogenes in the mouse genome, while the automatic annotation informs on the upper limit of the pseudogene complement size (Fig. 1c)


Summary

Introduction

Pseudogenes are ideal markers of genome remodelling. In turn, the mouse is an ideal platform for studying them, with the recent availability of strain-sequencing and transcriptional data. Mice have frequently been used as a model organism for studying human diseases due to their experimental tractability and similarities in their genetic makeup with humans[6]. This has resulted in the development of mouse models of specific diseases and the generation of knockout mice to recapitulate phenotypes associated with loss-of-function (LOF) mutations observed in humans. Following an inbreeding process for at least 20 sequential generations, these mouse strains are homozygous at most loci and show a high level of consistency at genomic and phenotypic levels[10]. This helps minimise a number of problems raised by the genetic variation between research animals[11]. If the pseudogenised or disabled allele is rare, one usually refers to this as an LOF event on a functional gene. Such pseudogenes represent disablements that have occurred on a more recent time scale. These are mutations that are not fixed in the population and are a
