SNP-PHAGE – High throughput SNP discovery pipeline

Lakshmi K Matukumalli, David L Hyten, Ik-Young Choi, John J Grefenstette, Perry B Cregan, Curtis P Van Tassell

doi:10.1186/1471-2105-7-468

Open Access

Abstract

Background

Single nucleotide polymorphisms (SNPs) as defined here are single base sequence changes or short insertion/deletions between or within individuals of a given species. As a result of their abundance and the availability of high throughput analysis technologies, SNP markers have begun to replace traditional markers such as restriction fragment length polymorphisms (RFLPs), amplified fragment length polymorphisms (AFLPs) and simple sequence repeats (SSRs, or microsatellites) for fine mapping and association studies in several species. For SNP discovery from chromatogram data, several bioinformatics programs have to be combined to generate an analysis pipeline. Results have to be stored in a relational database to facilitate interrogation through queries or to generate data for further analyses such as determination of linkage disequilibrium and identification of common haplotypes. Although these tasks are routinely performed by several groups, an integrated open source SNP discovery pipeline that can be easily adapted by new groups interested in SNP marker development is currently unavailable.

Results

We developed SNP-PHAGE (SNP discovery Pipeline with additional features for identification of common haplotypes within a sequence tagged site (Haplotype Analysis) and GenBank (-dbSNP) submissions). This tool was applied to sequence traces from diverse soybean genotypes to discover over 10,000 SNPs. The package was developed on a UNIX/Linux platform, is written in Perl and uses a MySQL database. Scripts to generate a user-friendly web interface are also provided, with common queries for preliminary data analysis. A machine learning tool developed by this group for increasing the efficiency of SNP discovery is integrated into the package as an optional feature. The SNP-PHAGE package is being made available open source at http://bfgl.anri.barc.usda.gov/ML/snp-phage/.

Conclusion

SNP-PHAGE provides a bioinformatics solution for high throughput SNP discovery, identification of common haplotypes within an amplicon, and GenBank (dbSNP) submissions. SNP selection and visualization are aided through a user-friendly web interface. This tool is useful for analyzing sequence tagged sites (STSs) of genomic sequences, and the software can serve as a starting point for groups interested in developing SNP markers.
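The abstract notes that SNP calls are stored in a relational database so that later analyses, such as finding common haplotypes within a sequence tagged site, can be run as queries. The sketch below illustrates that idea only; it is not the SNP-PHAGE schema or code. It assumes Python with an in-memory SQLite database as a stand-in for the Perl and MySQL setup the package uses, and the table name `snp_call`, the column names, the STS identifier `sts_001` and the genotype labels `G1` to `G4` are all hypothetical. It also assumes every genotype has exactly one allele call at every SNP position in the STS.

```python
import sqlite3
from collections import Counter


def build_db(calls):
    """Load (sts_id, position, genotype, allele) rows into an in-memory database.

    SQLite is used here only as a stand-in; the table layout is hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE snp_call ("
        " sts_id TEXT, position INTEGER, genotype TEXT, allele TEXT,"
        " PRIMARY KEY (sts_id, position, genotype))"
    )
    conn.executemany("INSERT INTO snp_call VALUES (?, ?, ?, ?)", calls)
    return conn


def haplotype_counts(conn, sts_id):
    """Count haplotypes within one STS.

    A haplotype is taken to be the string of alleles a genotype carries,
    ordered by SNP position. Assumes no missing calls.
    """
    rows = conn.execute(
        "SELECT genotype, allele FROM snp_call"
        " WHERE sts_id = ? ORDER BY genotype, position",
        (sts_id,),
    )
    per_genotype = {}
    for genotype, allele in rows:
        per_genotype[genotype] = per_genotype.get(genotype, "") + allele
    return Counter(per_genotype.values())


# Hypothetical example data: two SNP positions in one STS, four genotypes.
example_calls = [
    ("sts_001", 101, "G1", "A"), ("sts_001", 245, "G1", "C"),
    ("sts_001", 101, "G2", "A"), ("sts_001", 245, "G2", "C"),
    ("sts_001", 101, "G3", "G"), ("sts_001", 245, "G3", "T"),
    ("sts_001", 101, "G4", "A"), ("sts_001", 245, "G4", "T"),
]

conn = build_db(example_calls)
counts = haplotype_counts(conn, "sts_001")
print(counts.most_common(1))  # the most frequent haplotype and its count
```

Keeping one row per call (STS, position, genotype) means the same table can feed other analyses the abstract mentions, such as linkage disequilibrium, without changing how the data are stored.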

Highlights

Single nucleotide polymorphisms (SNPs) as defined here are single base sequence changes or short insertion/deletions between or within individuals of a given species.
Additional prerequisites for installing this application are minimal: only the tools needed for base calling, sequence assembly and polymorphism detection are required.
Polymorphism discovery and validation require a balance between sensitivity and specificity.

Summary

Results

We developed SNP-PHAGE (SNP discovery Pipeline with additional features for identification of common haplotypes within a sequence tagged site (Haplotype Analysis) and GenBank (-dbSNP) submissions). This tool was applied to sequence traces from diverse soybean genotypes to discover over 10,000 SNPs. The package was developed on a UNIX/Linux platform, is written in Perl and uses a MySQL database. A machine learning tool developed by this group for increasing the efficiency of SNP discovery is integrated into the package as an optional feature. The SNP-PHAGE package is being made available open source at http://bfgl.anri.barc.usda.gov/ML/snp-phage/.


Journal: BMC Bioinformatics
Publication Date: Oct 23, 2006
Citations: 56
License type: cc-by
