Abstract

Paulownia catalpifolia is an important, fast-growing timber species known for its high density, good color and beautiful texture. However, few transcriptomic and genetic studies have been conducted on P. catalpifolia. In this study, single-molecule real-time sequencing was applied to obtain the full-length transcriptome of P. catalpifolia leaves subjected to varying degrees of drought stress. The sequencing data were then used to search for microsatellites, or simple sequence repeats (SSRs). A total of 28.83 Gb of data were generated; 25,969 high-quality (HQ) transcripts with an average length of 1624 bp were acquired after removing redundant reads, and 25,602 HQ transcripts (98.59%) were annotated using public databases. Among the HQ transcripts, 16,722 intact coding sequences, 149 long non-coding RNAs and 179 alternative splicing events were predicted. A total of 7367 SSR loci were distributed across 6293 HQ transcripts, of which 763 were complex SSRs and 6604 were complete SSRs. The SSR appearance frequency was 28.37%, and the average distribution distance was 5.59 kb. Among the 6604 complete SSR loci, 1–3 nucleotide repeats were dominant, accounting for 97.85% of the total SSR loci, with mono-, di- and tri-nucleotide repeats making up 44.68%, 33.86% and 19.31%, respectively. We detected 112 repeat motifs, of which A/T (42.64%), AG/CT (12.22%), GA/TC (9.63%), GAA/TTC (1.57%) and CCA/TGG (1.54%) were the most common among the mono-, di- and tri-nucleotide repeats. The SSRs ranged from 10 to 88 bp in length, and 4997 (75.67%) were ≤ 20 bp. This study provides a novel full-length transcriptome reference for P. catalpifolia and will facilitate the identification of germplasm resources and the breeding of new drought-resistant P. catalpifolia varieties.
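As an illustration of the SSR search and the frequency figure reported above, the following is a minimal Python sketch, not the pipeline used in the study. The abstract does not name the detection tool or its thresholds, so the minimum repeat counts in `MIN_REPEATS` are an assumption (values commonly used as defaults in microsatellite finders), and the function names `find_ssrs`, `is_primitive` and `ssr_frequency` are hypothetical. The sketch only handles perfect repeats, not the complex SSRs mentioned above.

```python
import re

# Assumed minimum repeat counts per motif length (1-6 bp).
# These thresholds are NOT taken from the paper; they are illustrative.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'AA', 'ATAT')."""
    k = len(motif)
    return not any(
        k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-bp motif followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs such as 'AA' that are really shorter repeats.
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


def ssr_frequency(n_loci, n_transcripts):
    """SSR appearance frequency as a percentage: loci per 100 transcripts."""
    return 100.0 * n_loci / n_transcripts


if __name__ == "__main__":
    # Toy sequence with one mono-, one di- and one tri-nucleotide repeat.
    toy = "GGC" + "A" * 12 + "CGT" + "AG" * 7 + "TTC" + "GAA" * 5 + "C"
    for start, motif, count in find_ssrs(toy):
        print(start, motif, count)
    # Figures from the abstract: 7367 SSR loci, 25,969 HQ transcripts.
    print(round(ssr_frequency(7367, 25969), 2))
```

With the counts given in the abstract, `ssr_frequency(7367, 25969)` reproduces the reported 28.37%, which suggests the frequency is the number of SSR loci divided by the number of HQ transcripts.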

Highlights

  • A total of 30,953 transcripts were obtained after clustering and removal of redundant sequences using the PacBio single-molecule real-time (SMRT) Link Cluster tool, of which 30,928 were HQ transcripts with ≥ 99% accuracy and full-length read support ≥ 2 (Fig. 1c)

  • A total of 28.83 Gb of sequencing data were obtained, including 349,745 full-length non-chimeric (FLNC) reads, a number similar to that reported for Rhododendron lapponicum [39]

Summary

Introduction

Paulownia catalpifolia is an important, fast-growing timber species known for its high density, color and texture. Few transcriptomic and genetic studies have been conducted on P. catalpifolia. Single-molecule real-time sequencing technology was applied to obtain the full-length transcriptome of P. catalpifolia leaves treated with varying degrees of drought stress. The sequencing data were used to search for microsatellites, or simple sequence repeats (SSRs). A total of 28.83 Gb of data were generated; 25,969 high-quality (HQ) transcripts with an average length of 1624 bp were acquired after removing redundant reads, and 25,602 HQ transcripts (98.59%) were annotated using public databases. Paulownia catalpifolia is a typical and important species of the genus Paulownia in northern China; it exhibits some drought resistance and is renowned for its high density, good color, and beautiful texture. High-quality and drought-resistant P. catalpifolia varieties are urgently needed. Microsatellites, known as simple sequence repeats (SSRs), are DNA sequences consisting of continuously repeating motifs, which are composed of 1–6 b
