Abstract

Obtaining the chloroplast (cp) genome sequence is necessary for studying its physiological roles in plants. However, traditional sequencing methods make cp genome sequences difficult to obtain because template preparation is complex. Next-generation sequencing technology can produce large volumes of genome sequence data. A pipeline that assembles next-generation sequence reads with an optimized k-mer length is therefore essential for recovering whole cp genome sequences. Adjusting other parameters is also important, especially for cp genome assembly. In this study, we developed a pipeline to generate the cp genome of Quercus spinosa. With Quercus rubra as the reference, we achieved a coverage of 97.75% after optimizing k-mer length and other parameters. The efficiency of the pipeline makes it a useful method for cp genome construction in plants. It also offers a useful perspective for analyzing cp genome characteristics and evolution.

Highlights

  • Chloroplast genome resources, both in model organisms and non-model organisms, have been extensively used in molecular ecology and evolution studies[1,2]

  • Our results indicated that: (1) the minimum read length had little effect on the result of assembly; and (2) when k-mer size was set to 81 with a sequence read length of 100 base pairs, the assembly of the cp genome of Quercus spinosa was most complete

  • The Q. spinosa sequencing reads used to build the cp genome were downloaded from the NCBI SRA database under accession number SRP061187

Summary

Introduction

Chloroplast (cp) genome resources, in both model and non-model organisms, have been extensively used in molecular ecology and evolution studies[1,2]. Our results indicated that: (1) the minimum read length had little effect on the assembly result; and (2) when k-mer size was set to 81 with a sequence read length of 100 base pairs, the assembly of the cp genome of Quercus spinosa was most complete. Because we used three values (17, 21 and 25) for the minimum read length and 42 values (the odd numbers from 17 to 99) for k-mer length, the 3 × 42 parameter combinations gave 126 contigs in total.
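The size of this parameter grid can be checked with a short sketch. The values come from the text above; the variable names are illustrative assumptions and are not taken from the authors' pipeline, and no assembler is invoked here.

```python
from itertools import product

# Values stated in the text; the names are hypothetical, chosen for illustration.
min_read_lengths = [17, 21, 25]
kmer_lengths = list(range(17, 100, 2))  # odd k-mer lengths from 17 to 99

# One assembly run per (minimum read length, k-mer length) pair.
grid = list(product(min_read_lengths, kmer_lengths))

print(len(kmer_lengths), "k-mer lengths,", len(grid), "parameter combinations")
```

Each pair in `grid` corresponds to one assembly run, so the pair with k-mer length 81 reported as most complete is one of the combinations enumerated here.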

