Abstract

Short interspersed nuclear elements (SINEs) are a class of nonautonomous retrotransposons that can multiply and spread within a genome. SINEs serve as reliable markers for assessing genetic variability in dairy populations and help determine breed composition precisely. However, SINEs in the bovine genome remain insufficiently characterized, which hinders understanding of their impact on dairy production and health. This study aimed to systematically identify, classify, and characterize SINEs in the bovine reference genome (ARS-UCD1.3). Multiple de novo identification tools, including RepeatModeler, SINE_scan, LongRepMarker, and SINE-finder, were used for SINE identification and characterization. A total of 1.6 million SINE copies were identified, constituting approximately 10.8% of the genome. The SINEs were grouped into seven families based on their structural attributes: BosSINEL, BosSINE1, BosSINE2, BosSINE3, BosSINE4, BosSINE5, and BosSINE6. BosSINE1 is a novel family, whereas the remaining six were previously documented in Repbase. BosSINE3 is the most abundant family, with 254,586 complete copies (copies covering over 80% of the length of the consensus sequence), followed by BosSINEL (164,538 copies), BosSINE2 (162,207 copies), and BosSINE1 (150,799 copies). The remaining families are less abundant, each with no more than 100,000 copies in the genome. Structural comparisons reveal distinctive features of each family. Notably, BosSINEL lacks a tRNA-related region, including boxA and boxB, which is the hallmark element of the classic SINE structure according to SINEBase, whereas all other families contain tRNA-related regions. BosSINE1, the longest at 328 bp (excluding poly-A), has two tRNA-related regions, which distinguishes it from the other families.
BosSINEL, BosSINE1, BosSINE2, and BosSINE3 contain long interspersed nuclear element (LINE)-related regions, suggesting that they retrotranspose by using the enzymatic machinery encoded by LINEs. BosSINE6, the shortest at 109 bp, has a GC-rich region that gives it a high GC content of 67%, and BosSINE4 contains an AT-rich region. Evolutionary analysis showed that BosSINEL is the youngest SINE family in the bovine genome; it may still be active and may contribute to genetic diversity associated with differences in dairy production and health performance. In contrast, BosSINE1 to BosSINE6 are evolutionarily older and are likely fixed in the bovine genome with diminished or lost activity. In summary, this study provides a comprehensive profile of SINEs in the bovine genome, describing both their structural and evolutionary characteristics. Particularly noteworthy is the identification of BosSINEL as the youngest SINE family, which suggests that it remains active in the bovine genome. Because lineage-specific transposable elements (TEs), such as Bov-A2 elements, have been reported to regulate interferon-inducible gene expression in cattle, our data warrant further investigation of the potential of SINEs as markers for improving dairy production and health.
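
Two quantities quoted above, the "complete copy" criterion (over 80% of the consensus length) and GC content, can be illustrated with a minimal Python sketch. This is not the study's pipeline. The function names are hypothetical, and the sketch assumes that completeness is judged by comparing a copy's length with the consensus length, which the abstract does not specify.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G/C bases among unambiguous A/C/G/T bases."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0  # no unambiguous bases, so GC content is undefined; report 0
    return (seq.count("G") + seq.count("C")) / acgt


def is_complete_copy(copy_len: int, consensus_len: int,
                     min_fraction: float = 0.8) -> bool:
    """Return True when a copy spans more than min_fraction of the consensus.

    Assumption: "over 80%" is read as a strict inequality on copy length.
    """
    return copy_len > min_fraction * consensus_len
```

Under these assumptions, a copy of the 328 bp BosSINE1 consensus would need at least 263 bp to count as complete, since 80% of 328 is 262.4.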