Reference-free detection of isolated SNPs.

Raluca Uricaru, Guillaume Rizk, Olivier Plantard, Claire Lemaitre, Rayan Chikhi, Elsa Quillery, Pierre Peterlongo, Vincent Lacroix

doi:10.1093/nar/gku1187

Abstract

Detecting single nucleotide polymorphisms (SNPs) between genomes is becoming a routine task with next-generation sequencing. Generally, SNP detection methods use a reference genome. As non-model organisms are increasingly investigated, the need for reference-free methods has been amplified. Most of the existing reference-free methods have fundamental limitations: they can only call SNPs between exactly two datasets, and/or they require a prohibitive amount of computational resources. The method we propose, discoSnp, detects both heterozygous and homozygous isolated SNPs from any number of read datasets, without a reference genome, and with very low memory and time footprints (billions of reads can be analyzed with a standard desktop computer). To facilitate downstream genotyping analyses, discoSnp ranks predictions and outputs quality and coverage per allele. Compared to finding isolated SNPs using a state-of-the-art assembly and mapping approach, discoSnp requires significantly fewer computational resources, shows similar precision/recall values, and its highly ranked predictions are less likely to be false positives. An experimental validation was conducted on an arthropod species (the tick Ixodes ricinus) on which de novo sequencing was performed. Among the predicted SNPs that were tested, 96% were successfully genotyped and truly exhibited polymorphism.

Highlights

Assessing the genetic differences between individuals within a species or between chromosomes of an individual is a fundamental task in many aspects of biology.
Results presented in this paper show that discoSnp outperforms other reference-free single nucleotide polymorphism (SNP) detection methods in terms of resources, type and number of input datasets, and quality of the ranking of predicted isolated SNPs.
We propose experiments that aim at (i) assessing the quality of discoSnp results on simulated datasets, in comparison with state-of-the-art reference-free SNP detection methods; and (ii) showing how discoSnp performs on real data, with biological validation.

Summary

Introduction

Assessing the genetic differences between individuals within a species or between chromosomes of an individual is a fundamental task in many aspects of biology. This is increasingly feasible with next-generation sequencing technologies, as individuals from virtually any species can be sequenced at a modest cost. To be amplified by polymerase chain reaction (PCR), a SNP must not be surrounded by other sources of polymorphism, i.e. other SNPs, indels or structural variants. Such isolated SNPs must lie at least k nucleotides away, both to the left and to the right, from any other polymorphism, k being one of the main parameters of a SNP detection tool.
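The isolation criterion above can be illustrated with a minimal sketch. It is not discoSnp's algorithm, which works directly on reads without a reference; it only restates the definition on a list of already known variant positions. The function name `isolated_snps`, the `(position, kind)` input format, and the choice to measure distance as the difference between positions are all assumptions made for this illustration.

```python
from typing import List, Tuple


def isolated_snps(variants: List[Tuple[int, str]], k: int) -> List[int]:
    """Return positions of SNPs lying at least k nucleotides from any other variant.

    variants: hypothetical (position, kind) pairs, where kind is e.g.
    "snp", "indel" or "sv". Each variant is treated as a single point.
    """
    ordered = sorted(variants)  # sort by position so neighbours are adjacent
    isolated = []
    for i, (pos, kind) in enumerate(ordered):
        if kind != "snp":
            continue  # only SNPs can be reported as isolated SNPs
        # A SNP with no neighbour on one side is unconstrained on that side.
        left_ok = i == 0 or pos - ordered[i - 1][0] >= k
        right_ok = i == len(ordered) - 1 or ordered[i + 1][0] - pos >= k
        if left_ok and right_ok:
            isolated.append(pos)
    return isolated


# Illustrative positions only; these are not data from the paper.
variants = [(100, "snp"), (110, "snp"), (200, "snp"), (260, "indel"), (400, "snp")]
print(isolated_snps(variants, k=31))  # [200, 400]
```

With k = 31, the SNPs at positions 100 and 110 are rejected because they are only 10 nucleotides apart, while the SNP at 200 is kept because its nearest neighbours (110 and the indel at 260) are both at least 31 nucleotides away.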

Journal: Nucleic Acids Research	Publication Date: Nov 17, 2014
Citations: 84	License type: CC BY 4.0