Abstract

BackgroundGenotyping-by-sequencing (GBS) is becoming an attractive alternative to array-based methods for genotyping individuals for a large number of single nucleotide polymorphisms (SNPs). Costs can be lowered by reducing the mean sequencing depth, but this results in genotype calls of lower quality. A common analysis strategy is to filter SNPs to just those with sufficient depth, thereby greatly reducing the number of SNPs available. We investigate methods for estimating relatedness using GBS data, including results of low depth, using theoretical calculation, simulation and application to a real data set.ResultsWe show that unbiased estimates of relatedness can be obtained by using only those SNPs with genotype calls in both individuals. The expected value of this estimator is independent of the SNP depth in each individual, under a model of genotype calling that includes the special case of the two alleles being read at random. In contrast, the estimator of self-relatedness does depend on the SNP depth, and we provide a modification to provide unbiased estimates of self-relatedness. We refer to these methods of estimation as kinship using GBS with depth adjustment (KGD). The estimators can be calculated using matrix methods, which allow efficient computation. Simulation results were consistent with the methods being unbiased, and suggest that the optimal sequencing depth is around 2–4 for relatedness between individuals and 5–10 for self-relatedness. Application to a real data set revealed that some SNP filtering may still be necessary, for the exclusion of SNPs which did not behave in a Mendelian fashion. A simple graphical method (a ‘fin plot’) is given to illustrate this issue and to guide filtering parameters.ConclusionWe provide a method which gives unbiased estimates of relatedness, based on SNPs assayed by GBS, which accounts for the depth (including zero depth) of the genotype calls. This allows GBS to be applied at read depths which can be chosen to optimise the information obtained. SNPs with excess heterozygosity, often due to (partial) polyploidy or other duplications can be filtered based on a simple graphical method.Electronic supplementary materialThe online version of this article (doi:10.1186/s12864-015-2252-3) contains supplementary material, which is available to authorized users.

Highlights

  • Genotyping-by-sequencing (GBS) is becoming an attractive alternative to array-based methods for genotyping individuals for a large number of single nucleotide polymorphisms (SNPs)

  • We have considered the estimation of relatedness using a popular method

  • We argued that the divisor of each element in the genomic relationship matrix (GRM) should only use SNPs that are in the corresponding calculation for the numerator

Read more

Summary

Introduction

Genotyping-by-sequencing (GBS) is becoming an attractive alternative to array-based methods for genotyping individuals for a large number of single nucleotide polymorphisms (SNPs). The first applications of GWAS and GS have relied on genome-wide data generated using ‘SNP chips’—assays that give genotype results for a pre-defined set of many thousands of single nucleotide polymorphisms. An alternative method that is gaining popularity is to use genotype calls on the basis of generation sequencing. This is usually implemented by sequencing a subset of the genome using restriction enzymes and size selection, so that more reads (i.e., higher read depth) of the sequenced regions can be obtained for a given sequencing effort [3]. It is likely that much of the genome will be in linkage disequilibrium with SNPs assayed by this technique

Methods
Results
Discussion
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call