Abstract

Background

A typical bacterial pathogen genome mapping project can identify thousands of single nucleotide polymorphisms (SNPs). Interpreting SNP data is complex, and it is difficult to conceptualise the data contained within the large flat files that are the typical output of most SNP calling algorithms. One solution to this problem is to construct a database that can be queried using simple commands, so that SNP interrogation and output are both easy and comprehensible.

Results

Here we present snp-search, a tool that manages SNP data and allows it to be manipulated and searched. After creation of a SNP database from a VCF file, snp-search can be used to convert the selected SNP data into FASTA sequences, construct phylogenies, look for unique SNPs, and output contextual information about each SNP. The FASTA output from snp-search is particularly useful for the generation of robust phylogenetic trees that are based on SNP differences across the conserved positions in whole genomes. Queries can be designed to answer critical genomic questions such as the association of SNPs with particular phenotypes.

Conclusions

snp-search is a tool that manages SNP data and outputs useful information which can be used to test important biological hypotheses.

Highlights

  • A typical bacterial pathogen genome mapping project can identify thousands of single nucleotide polymorphisms (SNPs)

  • While these tools have excelled at the management of the SNP data, less attention has been directed towards simple processing and searching of the stored data

  • The Variant Call Format (VCF) was an initiative of the 1000 Genomes Project [8], and VCF files are output by many SNP calling programs such as dbSNPs [9], Samtools mpileup [10] and GATK [11]


Summary

Introduction

A typical bacterial pathogen genome mapping project can identify thousands of single nucleotide polymorphisms (SNPs). There is a need for a tool that extracts SNPs from VCF files, stores them in a simple database and provides multiple output options for analysis.
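
To make the extract, store and output workflow concrete, the following is a minimal Python sketch. It is not snp-search's own implementation: the table layout, the function names (`load_vcf`, `unique_snps`, `snp_fasta`), the strain names and the tiny VCF are all invented for illustration, and it assumes haploid genotype calls with a call for every strain at every position.

```python
import sqlite3

# Invented three-strain VCF; positions, alleles and strain names are illustrative only.
VCF_TEXT = """\
##fileformat=VCFv4.1
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tstrainA\tstrainB\tstrainC
chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0\t1\t1
chr1\t250\t.\tC\tT\t60\tPASS\t.\tGT\t1\t0\t0
chr1\t400\t.\tG\tA\t45\tPASS\t.\tGT\t0\t0\t1
"""


def load_vcf(conn, vcf_text):
    """Store one row per (position, strain) with the base that strain carries."""
    conn.execute(
        "CREATE TABLE alleles "
        "(chrom TEXT, pos INTEGER, strain TEXT, base TEXT, is_alt INTEGER)"
    )
    strains = []
    for line in vcf_text.splitlines():
        if line.startswith("##") or not line.strip():
            continue  # skip meta-information and blank lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            strains = fields[9:]  # sample columns follow the FORMAT column
            continue
        chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        bases = [ref] + alt.split(",")  # allele index 0 is REF, 1.. are ALT
        for strain, sample in zip(strains, fields[9:]):
            gt = sample.split(":")[0]
            if not gt.isdigit():
                continue  # assumption: ignore missing or non-haploid calls
            idx = int(gt)
            conn.execute(
                "INSERT INTO alleles VALUES (?, ?, ?, ?, ?)",
                (chrom, pos, strain, bases[idx], int(idx > 0)),
            )


def unique_snps(conn, strain):
    """Positions where the given strain is the only one with a non-reference allele."""
    rows = conn.execute(
        "SELECT pos FROM alleles WHERE is_alt = 1 "
        "GROUP BY chrom, pos "
        "HAVING COUNT(*) = 1 AND MAX(strain) = ? "
        "ORDER BY pos",
        (strain,),
    )
    return [row[0] for row in rows]


def snp_fasta(conn):
    """Concatenate each strain's bases at the stored positions into FASTA records."""
    rows = conn.execute(
        "SELECT strain, base FROM alleles ORDER BY strain, chrom, pos"
    )
    seqs = {}
    for strain, base in rows:
        seqs[strain] = seqs.get(strain, "") + base
    return "".join(f">{name}\n{seq}\n" for name, seq in seqs.items())


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    load_vcf(conn, VCF_TEXT)
    print(unique_snps(conn, "strainA"))
    print(snp_fasta(conn), end="")
```

Storing one row per position and strain keeps both queries short. A sequence built this way contains only the variable positions, so it is suited to SNP-based tree building rather than to anything that needs the full genome sequence.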

