Abstract
Background: Genomic features of whole genome sequences emerging from various sequencing and annotation projects are represented and stored in several formats. Among these formats, GFF (Generic/General Feature Format) has emerged as a widely accepted, portable flat file format for storing genome annotations. With increasing interest in genome annotation projects and in secondary and meta-analysis, there is a need for efficient tools to extract sequences of interest from GFF files.
Findings: We have developed GFF-Ex to automate feature-based extraction of sequences from a GFF file. In addition to extracting sequences for the features described within a feature file, GFF-Ex assigns boundaries for features that are not explicitly specified in the GFF format (introns, intergenic regions, and regions upstream of genes) and exports the corresponding primary sequence information into predefined, feature-specific output files. The GFF-Ex package consists of several UNIX shell and Perl scripts.
Conclusions: Compared to other available GFF parsers, GFF-Ex is a simpler tool that permits sequence retrieval based on additional inferred features. GFF-Ex can also be integrated with any genome annotation or analysis pipeline. GFF-Ex is freely available at http://bioinfo.icgeb.res.in/gff.
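The idea of assigning boundaries to features that a GFF file does not list explicitly can be illustrated with a short Python sketch. This is not the GFF-Ex implementation (which the abstract describes as shell and Perl scripts); the function names, the fixed upstream length, and the example coordinates are all hypothetical. It assumes 1-based, inclusive coordinates, treats introns as the gaps between the exons of one transcript, treats intergenic regions as the gaps between genes on one sequence, and takes the upstream region as a fixed-length window before the gene start on its own strand.

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based and inclusive

# Translation table used to complement a DNA string (hypothetical helper).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def gaps_between(intervals: List[Interval]) -> List[Interval]:
    """Return the uncovered gaps between intervals, merging overlaps."""
    gaps: List[Interval] = []
    prev_end: Optional[int] = None
    for start, end in sorted(intervals):
        if prev_end is not None and start > prev_end + 1:
            gaps.append((prev_end + 1, start - 1))
        prev_end = end if prev_end is None else max(prev_end, end)
    return gaps


def upstream_region(gene: Interval, strand: str, length: int,
                    seq_len: int) -> Optional[Interval]:
    """Return a window of up to `length` bases upstream of a gene."""
    start, end = gene
    if strand == "+":
        # Upstream lies before the gene start, clipped at position 1.
        return (max(1, start - length), start - 1) if start > 1 else None
    # On the minus strand, upstream lies after the gene end.
    return (end + 1, min(seq_len, end + length)) if end < seq_len else None


def extract(seq: str, region: Interval, strand: str = "+") -> str:
    """Slice a region out of a sequence; reverse-complement if on '-'."""
    start, end = region
    sub = seq[start - 1:end]
    if strand == "-":
        sub = sub.translate(COMPLEMENT)[::-1]
    return sub


# Made-up coordinates for illustration only.
exons = [(101, 200), (301, 400), (501, 650)]
genes = [(101, 650), (1001, 1500)]

introns = gaps_between(exons)        # gaps between consecutive exons
intergenic = gaps_between(genes)     # gaps between consecutive genes
up_plus = upstream_region(genes[0], "+", 50, 2000)
up_minus = upstream_region(genes[1], "-", 50, 2000)
```

Computing gaps on sorted, merged intervals keeps the same helper usable for both introns and intergenic regions; a real tool would additionally group exons by parent transcript and features by sequence identifier before doing so.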
Highlights
Genomic features of whole genome sequences emerging from various sequencing and annotation projects are represented and stored in several formats
GFF-Ex can be integrated with any genome annotation or analysis pipeline
General Feature Format/Generic Feature Format (GFF) is a flat file data format widely used for storing genome annotations, describing sequence-based annotations of a genome
Summary
Genomic features of whole genome sequences emerging from various sequencing and annotation projects are represented and stored in several formats. General Feature Format/Generic Feature Format (GFF) (http://www.sanger.ac.uk/resources/software/gff/spec.html) is a flat file data format widely used for storing genome annotations, describing sequence-based annotations of a genome. Compared to other available GFF parsers, GFF-Ex is a simpler tool that permits sequence retrieval based on additional inferred features. GFF-Ex can be integrated with any genome annotation or analysis pipeline.
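To make the flat file layout concrete, the following Python sketch reads one GFF record into named fields. It assumes the common nine tab-separated columns (sequence identifier, source, feature type, start, end, score, strand, phase or frame, attributes) and leaves the attributes column unparsed, because its syntax differs between GFF versions. The `Feature` type, the `parse_gff_line` function, and the sample record are hypothetical and are not part of GFF-Ex.

```python
from typing import NamedTuple, Optional


class Feature(NamedTuple):
    seqid: str
    source: str
    type: str
    start: int       # 1-based, inclusive
    end: int         # 1-based, inclusive
    score: str       # "." when absent
    strand: str      # "+", "-", or "."
    phase: str       # called "frame" in older GFF versions
    attributes: str  # kept raw; syntax depends on the GFF version


def parse_gff_line(line: str) -> Optional[Feature]:
    """Parse one GFF line; return None for blank and comment lines."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    cols = line.split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    return Feature(cols[0], cols[1], cols[2], int(cols[3]), int(cols[4]),
                   cols[5], cols[6], cols[7], cols[8])


# Made-up record for illustration only.
record = "chr1\texample\texon\t101\t200\t.\t+\t.\tID=exon1;Parent=mRNA1"
feature = parse_gff_line(record)
header = parse_gff_line("##gff-version 3")
```

Filtering parsed records by their `type` field is then enough to select, for example, all exons of a sequence before extracting the corresponding subsequences.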