Abstract

Background: Next generation sequencing (NGS) technologies have substantially increased sequence output while dramatically reducing costs. In addition to its use in whole genome sequencing, the 454 GS-FLX platform is becoming a widely used tool for biodiversity surveys based on amplicon sequencing. Using NGS for biodiversity surveys requires software tools that perform quality control, trim the sequence reads, remove PCR primers, and generate input files for downstream analyses. A user-friendly software utility that carries out these steps is still lacking.

Findings: We developed CANGS (Cleaning and Analyzing Next Generation Sequences), a flexible and user-friendly integrated software utility designed for amplicon-based biodiversity surveys using the 454 sequencing platform. CANGS filters low quality sequences, removes PCR primers, filters singletons, identifies barcodes, and generates input files for downstream analyses. The downstream analyses rely either on third party software (e.g., rarefaction analyses) or on CANGS-specific scripts. The latter include modules that link each 454 sequence to the name of the closest taxonomic reference retrieved from the NCBI database and report the sequence divergence between them. The software can be easily adapted to sequencing projects with different amplicon sizes, primer sequences, and quality thresholds, which makes it especially useful for non-bioinformaticians.

Conclusion: CANGS clips PCR primers, filters low quality sequences, links sequences to NCBI taxonomy, and provides input files for common rarefaction analysis software programs. CANGS is written in Perl, runs on Mac OS X and Linux, and is available at http://i122server.vu-wien.ac.at/pop/software.html
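The cleaning steps listed in the Findings (quality filtering, barcode identification, primer removal, singleton filtering) can be illustrated with a minimal Python sketch. This is not the CANGS implementation, which is written in Perl; the function names, the mean-quality criterion, the thresholds, and the assumption that each read starts with an exact barcode followed by an exact forward primer are all hypothetical choices made for illustration.

```python
from collections import Counter


def mean_quality(quals):
    """Average quality score of a read (0.0 for an empty read)."""
    return sum(quals) / len(quals) if quals else 0.0


def clean_read(seq, quals, barcodes, fwd_primer, min_mean_q=20, min_len=5):
    """Return (sample, trimmed_sequence), or None if the read is rejected.

    Assumptions (hypothetical, not taken from CANGS):
    - a read is low quality if its mean score is below min_mean_q;
    - the read starts with an exact barcode followed by the exact primer.
    """
    if mean_quality(quals) < min_mean_q:
        return None  # quality filter
    for barcode, sample in barcodes.items():
        prefix = barcode + fwd_primer
        if seq.startswith(prefix):
            insert = seq[len(prefix):]  # clip barcode and primer
            return (sample, insert) if len(insert) >= min_len else None
    return None  # no known barcode/primer at the start of the read


def drop_singletons(sequences):
    """Keep only sequences that occur more than once."""
    counts = Counter(sequences)
    return [s for s in sequences if counts[s] > 1]


if __name__ == "__main__":
    barcodes = {"ACGT": "sample_A"}   # hypothetical barcode-to-sample map
    primer = "GGAT"                   # hypothetical forward primer
    reads = [
        ("ACGTGGATTCAAGGTT", [30] * 16),
        ("ACGTGGATTCAAGGTT", [30] * 16),
        ("ACGTGGATTCAAGGAA", [30] * 16),  # unique after trimming
        ("ACGTGGATTCAAGGTT", [10] * 16),  # low quality
        ("TTTTGGATTCAAGGTT", [30] * 16),  # unknown barcode
    ]
    cleaned = [clean_read(s, q, barcodes, primer) for s, q in reads]
    kept = [c[1] for c in cleaned if c is not None]
    print(drop_singletons(kept))
```

A real pipeline would typically tolerate mismatches in barcodes and primers and might trim by position-wise quality rather than rejecting whole reads; exact matching is used here only to keep the sketch short.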

Highlights

  • Next generation sequencing (NGS) technologies have substantially increased sequence output while dramatically reducing costs

  • CANGS clips PCR primers, filters low quality sequences, links sequences to NCBI taxonomy, and provides input files for common rarefaction analysis software programs

  • CANGS is written in Perl and runs on Mac OS X/Linux and is available at http://i122server.vu-wien.ac.at/pop/software.html


Summary

Conclusion

CANGS is a user-friendly tool for primer clipping and quality filtering of 454 sequences.

Availability & requirements

Project name: CANGS (Cleaning and Analyzing 454 GS-FLX sequences).

Additional file 2: Input test data set for CANGS. This file contains 454 GS-FLX reads in FASTA format and a quality score file, provided as a sample input data set for running all modules of the CANGS utility. File: http://www.biomedcentral.com/content/supplementary/1756-0500-3-3S2.ZIP

User manual (PDF): describes how to set up the working environment for this software and how to use the modules of the CANGS utility. File: http://www.biomedcentral.com/content/supplementary/1756-0500-3-3S3.PDF
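The sample input pairs a FASTA file of reads with a separate quality score file. As a rough illustration of how such a pair can be read together, the Python sketch below assumes the common layout in which both files use `>identifier` header lines and the quality file lists space-separated integer scores, one per base. The function names are hypothetical and the code is not part of CANGS.

```python
def parse_records(lines, sep=""):
    """Yield (identifier, payload) pairs from FASTA-style lines.

    Payload lines belonging to one record are joined with `sep`.
    The identifier is the first whitespace-delimited token of the header.
    """
    ident, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if ident is not None:
                yield ident, sep.join(chunks)
            header = line[1:].split()
            ident, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if ident is not None:
        yield ident, sep.join(chunks)


def pair_reads(fasta_lines, qual_lines):
    """Map identifier -> (sequence, scores), keeping only consistent pairs.

    A read is dropped if it is missing from the FASTA input or if the
    number of scores differs from the sequence length.
    """
    seqs = dict(parse_records(fasta_lines))
    paired = {}
    for ident, text in parse_records(qual_lines, sep=" "):
        scores = [int(x) for x in text.split()]
        seq = seqs.get(ident)
        if seq is not None and len(seq) == len(scores):
            paired[ident] = (seq, scores)
    return paired
```

Both arguments can be any iterable of lines, so open file handles for the unpacked test data can be passed directly, for example `pair_reads(open(fasta_path), open(qual_path))` with paths of your choosing.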
