Sampling Variation of RAD-Seq Data from Diploid and Tetraploid Potato (Solanum tuberosum L.).

Zhenyu Dang, Zewei Luo, Qin Tao, Yuxin Zhang, Lin Wang, Jixuan Yang, Fengjun Zhang

doi:10.3390/plants10020319

Open Access

https://doi.org/10.3390/plants10020319

Abstract

New sequencing technology enables identification of genome-wide sequence-based variants at a population level and at a competitively low cost. Sequence variant-based molecular markers have motivated enormous interest in population and quantitative genetic analyses. Generating the sequence data involves a sophisticated experimental process that carries substantial non-biological variation. Statistically, the sequencing process amounts to sampling DNA fragments from an individual sequence. Sampling variation in sequence data generation is therefore a key statistical property, and adequate knowledge of it is needed for any downstream analysis of the data and for implementing statistically appropriate methods. This paper reports a thorough investigation into modeling the sampling variation of sequence data from optimized RAD-seq (restriction site-associated DNA sequencing) experiments with two parents and their offspring in diploid and autotetraploid potato (Solanum tuberosum L.). The analysis shows significant overdispersion in the sampling variation of the sequence data relative to that expected under the multinomial distribution, which is widely assumed in the literature, and provides statistical methods for modeling the variation and calculating the model parameters that may be easily implemented with real sequence datasets. The optimized design of the RAD-seq experiments enabled effective control of the presence of undesirable chloroplast DNA and RNA genes in the sequence data generated.
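To make the idea of dispersion beyond the multinomial concrete, the following is a minimal sketch and not the authors' method. It assumes a biallelic site, where the multinomial reduces to a binomial, and uses a beta-binomial as one hypothetical overdispersed alternative. The function names, the read depth of 30, and the correlation parameter `rho` are illustrative assumptions that do not come from the paper.

```python
import random


def simulate_ref_counts(n_sites, depth, p, rho, rng):
    """Simulate reference-allele read counts at n_sites sites.

    rho = 0 gives plain binomial sampling; rho > 0 gives a beta-binomial
    whose variance is inflated by the factor 1 + (depth - 1) * rho.
    (Hypothetical model chosen for illustration only.)
    """
    counts = []
    for _ in range(n_sites):
        if rho > 0:
            # Beta parameters giving mean p and intra-class correlation rho
            a = p * (1 - rho) / rho
            b = (1 - p) * (1 - rho) / rho
            q = rng.betavariate(a, b)
        else:
            q = p
        counts.append(sum(rng.random() < q for _ in range(depth)))
    return counts


def dispersion_ratio(ref_counts, depths):
    """Pearson chi-square divided by its degrees of freedom.

    Values near 1 are consistent with binomial sampling; values well
    above 1 indicate overdispersion.
    """
    p_hat = sum(ref_counts) / sum(depths)
    chi2 = sum(
        (k - n * p_hat) ** 2 / (n * p_hat * (1 - p_hat))
        for k, n in zip(ref_counts, depths)
    )
    return chi2 / (len(depths) - 1)


if __name__ == "__main__":
    rng = random.Random(2021)
    depths = [30] * 2000
    binomial = simulate_ref_counts(2000, 30, 0.5, 0.0, rng)
    overdispersed = simulate_ref_counts(2000, 30, 0.5, 0.1, rng)
    print("binomial ratio:", round(dispersion_ratio(binomial, depths), 2))
    print("beta-binomial ratio:", round(dispersion_ratio(overdispersed, depths), 2))
```

Under these assumptions the binomial counts give a ratio close to 1, while the beta-binomial counts give a ratio near 1 + (30 - 1) × 0.1 = 3.9. A ratio well above 1 in real allele-specific read counts would be one simple signal of the kind of extra-multinomial variation the abstract describes.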

Highlights

Development of next-generation sequencing technology (NGS) has enabled the identification of sequence variant-based genetic molecular markers at a genome-wide scale, at a population level, and at a very competitive cost in comparison to traditional DNA molecular markers such as restriction fragment length polymorphisms (RFLPs), amplified fragment length polymorphisms (AFLPs), and single-nucleotide polymorphisms (SNPs) [1,2].
Although genotyping by sequencing (GBS) is relatively straightforward in diploid species, serious consideration must be given to several major sources of variation in collecting and processing the sequencing data for accurate identification of allele-specific sequencing reads [5].
The variation may be biological or nonbiological in nature, and it may be associated with technical issues such as errors in the process of sequencing library construction, sequencing errors, and errors stemming from data processing [5,7,8,9].

Summary

Introduction

Development of next-generation sequencing technology (NGS) has enabled the identification of sequence variant-based genetic molecular markers at a genome-wide scale, at a population level, and at a very competitive cost in comparison to traditional DNA molecular markers such as restriction fragment length polymorphisms (RFLPs), amplified fragment length polymorphisms (AFLPs), and single-nucleotide polymorphisms (SNPs) [1,2]. This has motivated great interest in genotyping by sequencing (GBS) for population and quantitative genetic analyses in diploid and tetraploid species [3]. Among the variation sources discussed in the literature, sampling variation is the ultimate and key statistical property of sequencing data, and it is essential information for the reliability of modeling and of any downstream analysis with the data. Earlier work pointed out that the “messy” hexaploid sequence data may involve dispersion over standard independent distributions, but little is known about the extent to which the data deviate from a specific distribution or about the form of the statistical distribution the data follow. We discuss how the sampling variation pattern predicted from the analysis may influence quantitative genetic analysis that uses next-generation genomic sequence data.

Sequence Data Collected

Preliminary Bioinformatic Analysis of the RAD-Seq Data

Materials and Methods

Construction of RAD-Seq Libraries

Preliminary Processing of the Sequence Data

Identifying SNPs from the Sequence Data

Calling Polymorphic Sites and Genotype at the Identified Sites

Sampling Distributions of Sequence Data

Journal: Plants (Basel, Switzerland)
Publication Date: Feb 7, 2021
Citations: 2
License type: CC BY 4.0
