In mammals, the adult testis is the tissue with the highest diversity in gene expression. Much of that diversity is attributed to germ cells, primarily meiotic spermatocytes and postmeiotic haploid spermatids. Exploiting a newly developed cell purification method, we profiled the transcriptomes of these postmitotic germ cells in mice. We used a de novo transcriptome assembly approach and identified thousands of novel expressed transcripts characterized by features distinct from those of known genes. Novel loci tend to be short, monoexonic, and expressed at low levels. Most novel genes have arisen recently in evolutionary time and possess low coding potential. Nonetheless, we identify several novel protein-coding genes harboring open reading frames that encode proteins containing matches to conserved protein domains. Analysis of mass-spectrometry data from adult mouse testes confirms protein production from several of these novel genes. We also examine overlap between transcripts and repetitive elements. Although distinct families of repeats are expressed with differing temporal dynamics during spermatogenesis, we do not observe a general mode of regulation wherein repeats drive expression of nonrepetitive sequences in a cell type-specific manner. Finally, we observe many fairly long antisense transcripts originating from canonical gene promoters, pointing to pervasive bidirectional promoter activity during spermatogenesis that is distinct from, and more frequent than, that in somatic cells.