Plant long noncoding RNAs (lncRNAs) exhibit features such as tissue-specific expression, spatiotemporal regulation, and stress responsiveness. Although diverse studies support a regulatory role for lncRNAs in model plants, our knowledge of lncRNAs in crops is limited. We applied a custom pipeline to a dataset of more than 1000 RNA-seq samples from nine representative species of the family Cucurbitaceae and predicted 91 209 nonredundant lncRNAs. The lncRNAs were grouped into three confidence levels and classified by genomic context as intergenic, natural antisense (NAT), intronic, or sense-overlapping. Compared with protein-coding genes, lncRNAs were, on average, expressed at lower levels and displayed significantly higher specificity across tissues, developmental stages, and stress responses. Evolutionary analysis indicated higher positional conservation than sequence conservation, probably linked to conserved modular motifs within syntenic lncRNAs. Moreover, the expression of intergenic and natural antisense lncRNAs correlated positively with that of their closest and parental genes, respectively. For intergenic lncRNAs, the correlation decreased with distance to the neighboring gene, suggesting that their potential cis-regulatory effect acts over a short range. Furthermore, analysis of developmental studies showed that a conserved NAT-lncRNA family is differentially expressed in coordination with its cognate sense protein-coding genes. These genes encode proteins associated with phloem development, which points to the potential involvement of some of the identified lncRNAs in this developmental process. We expect this extensive inventory to be a valuable resource for further research on the regulatory mechanisms mediated by lncRNAs in cucurbits.