Abstract

Background: Structural variations (SVs) have been reported to play an important role in genetic diversity and trait regulation. Many computer algorithms detecting SVs have recently been developed, but the use of multiple algorithms to detect high-confidence SVs has not been studied. The most suitable sequencing depth for detecting SVs in pear is also not known.

Results: In this study, a pipeline to detect SVs using next-generation and long-read sequencing data was constructed. The performances of seven types of SV detection software using next-generation sequencing (NGS) data and two types of software using long-read sequencing data (SVIM and Sniffles), which are based on different algorithms, were compared. Of the nine software packages evaluated, SVIM identified the most SVs, and Sniffles detected SVs with the highest accuracy (>90%). When the results from multiple SV detection tools were combined, the SVs identified by both MetaSV and IMR/DENOM, which use NGS data, were more accurate than those identified by both SVIM and Sniffles, with mean accuracies of 98.7% and 96.5%, respectively. The software packages using long-read sequencing data required fewer CPU cores and less memory and ran faster than those using NGS data. In addition, according to the performances of assembly-based algorithms using NGS data, we found that a sequencing depth of 50× is appropriate for detecting SVs in the pear genome.

Conclusion: This study provides strong evidence that more than one SV detection software package, each based on a different algorithm, should be used to detect SVs with higher confidence, and that long-read sequencing data are better than NGS data for SV detection. The SV detection pipeline that we have established will facilitate the study of diversity in other crops.
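
The idea of keeping only SVs reported by two callers can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the `SV` record, the function names, and the 50% reciprocal-overlap threshold are all assumptions chosen for illustration, and the coordinates are invented.

```python
from typing import NamedTuple


class SV(NamedTuple):
    """Hypothetical minimal SV record (not a format used by any caller)."""
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    svtype: str  # e.g. "DEL", "INV", "DUP"


def reciprocal_overlap(a: SV, b: SV) -> float:
    """Shared length as a fraction of the longer call; 0.0 if not comparable."""
    if a.chrom != b.chrom or a.svtype != b.svtype:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    # Dividing by the longer call equals min(shared/len_a, shared/len_b).
    return shared / max(a.end - a.start, b.end - b.start)


def shared_calls(calls_a, calls_b, min_overlap=0.5):
    """Keep calls from calls_a matched by at least one call in calls_b."""
    return [a for a in calls_a
            if any(reciprocal_overlap(a, b) >= min_overlap for b in calls_b)]


# Invented example calls from two hypothetical callers.
caller_a = [SV("chr1", 1000, 2000, "DEL"), SV("chr1", 5000, 5200, "DEL")]
caller_b = [SV("chr1", 1100, 2050, "DEL"), SV("chr2", 5000, 5200, "DEL")]
consensus = shared_calls(caller_a, caller_b)
```

Here only the first deletion on `chr1` is kept, because the second call has no counterpart on the same chromosome. Calls recorded as a single position (for example, insertions with `start == end`) have no length to overlap, so they would need a distance-based match instead.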

Highlights

  • Structural variations (SVs) have been reported to play an important role in genetic diversity and trait regulation

  • Structural variants (SVs) between ‘Yali’ and the reference genome were detected using different algorithms and sequencing data. Depending on the performance of the nine SV callers, which are based on different algorithms (Table 1), up to eight types of SVs were detected in the ‘Yali’ genome: insertions, deletions, inversions, duplications, translocations, multiple nucleotide polymorphisms (MNPs), CTXs and ITXs

  • BreakDancer, DELLY, Lumpy and MetaSV were more sensitive in detecting large deletions, whereas Pindel was more sensitive in detecting small SVs (Fig. 1); this is because read-pair algorithms are less sensitive to small SVs whose size falls below the standard deviation of the insert size [14, 25, 27, 28]
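
The last highlight can be made concrete with a small Python sketch of the read-pair signal. It is a simplification under stated assumptions, not the logic of any of the callers named above: the function names, the three-standard-deviation cutoff, and the library values (mean insert size 500 bp, standard deviation 50 bp) are all hypothetical.

```python
def apparent_insert(true_insert, deletion_len):
    """A deletion in the sample makes the pair map further apart on the reference."""
    return true_insert + deletion_len


def is_discordant(observed_insert, mean_insert, sd_insert, n_sd=3.0):
    """Flag a pair whose mapped insert size exceeds mean + n_sd * sd (assumed cutoff)."""
    return observed_insert > mean_insert + n_sd * sd_insert


# Hypothetical library: mean insert size 500 bp, standard deviation 50 bp.
MEAN, SD = 500, 50

# A 40 bp deletion shifts the mapped insert size by less than one SD,
# so the pair is indistinguishable from normal library variation.
small_del_flagged = is_discordant(apparent_insert(500, 40), MEAN, SD)

# A 1000 bp deletion pushes the mapped insert size far beyond the cutoff.
large_del_flagged = is_discordant(apparent_insert(500, 1000), MEAN, SD)
```

With these assumed values the 40 bp deletion is not flagged and the 1000 bp deletion is, which matches the observation that read-pair methods favour large deletions.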

Summary

Introduction

Structural variants (SVs), which include deletions, insertions, inversions, duplications and translocations, are defined as chromosomal rearrangements larger than 50 nucleotides [1] and have been reported to play an important role in genetic diversity and trait regulation. Insertions and duplications are called unbalanced SVs because they give rise to copy number variants (CNVs), while inversions and translocations are called balanced SVs [2]. SVs play an important role in biological processes, and their identification is crucial for studying human genetic diversity, gene and genome variants, evolution and disease [3, 4]. In plants, SVs such as insertions, deletions and CNVs have been shown to contribute to natural variation and have played a significant role in the differentiation of complex traits, domestication, evolution and adaptation [8, 9]. The study of single nucleotide polymorphisms (SNPs), InDels and CNVs in tomato revealed introgressions from wild species and the mosaic structure of the genomes of
