Abstract

Background: Genomic features of whole genome sequences emerging from various sequencing and annotation projects are represented and stored in several formats. Amongst these formats, the GFF (Generic/General Feature Format) has emerged as a widely accepted, portable and successfully used flat file format for genome annotation storage. With an increasing interest in genome annotation projects and secondary and meta-analysis, there is a need for efficient tools to extract sequences of interest from GFF files.

Findings: We have developed GFF-Ex to automate feature-based extraction of sequences from a GFF file. In addition to automated sequence extraction of the features described within a feature file, GFF-Ex also assigns boundaries for features that are not explicitly specified in the GFF format (introns, intergenic regions, regions upstream of genes), and exports the corresponding primary sequence information into predefined feature-specific output files. The GFF-Ex package consists of several UNIX Shell and PERL scripts.

Conclusions: Compared to other available GFF parsers, GFF-Ex is a simpler tool, which permits sequence retrieval based on additional inferred features. GFF-Ex can also be integrated with any genome annotation or analysis pipeline. GFF-Ex is freely available at http://bioinfo.icgeb.res.in/gff.
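The idea of assigning boundaries to features that a GFF file does not list explicitly can be illustrated with a short Python sketch. This is not the GFF-Ex implementation (which the abstract describes as Shell and PERL scripts); the function names `gaps_between` and `upstream_region`, and the parameters `length` and `seq_len`, are hypothetical and chosen only for illustration. The sketch assumes GFF-style coordinates (1-based, inclusive), treats introns as the gaps between the exons of one gene, and treats intergenic regions as the gaps between genes on the same sequence.

```python
from typing import List, Optional, Tuple

# (start, end), 1-based and inclusive, as in GFF coordinates
Interval = Tuple[int, int]


def gaps_between(intervals: List[Interval]) -> List[Interval]:
    """Return the uncovered gaps between intervals on one sequence.

    Applied to the exons of a single gene this yields candidate introns;
    applied to the genes of one sequence it yields candidate intergenic
    regions (the stretches before the first and after the last interval
    are not reported here).
    """
    gaps: List[Interval] = []
    prev_end: Optional[int] = None
    for start, end in sorted(intervals):
        # A gap exists only if at least one base lies between the intervals.
        if prev_end is not None and start > prev_end + 1:
            gaps.append((prev_end + 1, start - 1))
        # Track the furthest end seen so overlapping intervals are merged.
        prev_end = end if prev_end is None else max(prev_end, end)
    return gaps


def upstream_region(gene: Interval, strand: str, length: int,
                    seq_len: int) -> Optional[Interval]:
    """Return up to `length` bases upstream of a gene, clipped to the sequence.

    Upstream lies before the start on the '+' strand and after the end on
    the '-' strand. Returns None when no upstream base is available.
    """
    start, end = gene
    if strand == "+":
        lo, hi = max(1, start - length), start - 1
    else:
        lo, hi = end + 1, min(seq_len, end + length)
    return (lo, hi) if lo <= hi else None
```

For example, `gaps_between([(1, 100), (201, 300)])` gives the single gap `(101, 200)`. The resulting intervals would then be used to slice the genome sequence, remembering to convert from 1-based inclusive coordinates to the indexing convention of the language in use.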

Highlights

  • Genomic features of whole genome sequences emerging from various sequencing and annotation projects are represented and stored in several formats

  • GFF-Ex can be integrated with any genome annotation or analysis pipeline

  • General Feature Format/Generic Feature Format (GFF) is a flat file data format widely used for storing genome annotations, describing sequence-based annotations of a genome


Summary

Introduction

Genomic features of whole genome sequences emerging from various sequencing and annotation projects are represented and stored in several formats. General Feature Format/Generic Feature Format (GFF) (http://www.sanger.ac.uk/resources/software/gff/spec.html) is a flat file data format widely used for storing sequence-based annotations of a genome.
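To make the flat file layout concrete, the following Python sketch reads one GFF record. It assumes the commonly used nine-column, tab-separated layout (sequence name, source, feature, start, end, score, strand, frame, attributes) with "." marking an empty score or frame; files that omit the attributes column or deviate in other ways would need extra handling. The names `GffFeature` and `parse_gff_line` are hypothetical and are not part of GFF-Ex.

```python
from typing import NamedTuple, Optional


class GffFeature(NamedTuple):
    seqid: str
    source: str
    feature: str
    start: int             # 1-based, inclusive
    end: int               # 1-based, inclusive
    score: Optional[float]
    strand: str            # "+", "-" or "."
    frame: Optional[int]   # called "phase" in GFF3
    attributes: str        # left unparsed; its syntax differs between GFF versions


def parse_gff_line(line: str) -> Optional[GffFeature]:
    """Parse one GFF line; return None for blank lines and '#' comment lines."""
    line = line.rstrip("\r\n")
    if not line or line.startswith("#"):
        return None
    cols = line.split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    seqid, source, feature, start, end, score, strand, frame, attrs = cols
    return GffFeature(
        seqid, source, feature,
        int(start), int(end),
        None if score == "." else float(score),
        strand,
        None if frame == "." else int(frame),
        attrs,
    )
```

Filtering the parsed records on the `feature` column (for example keeping only `exon` or `gene` rows) is then enough to drive a feature-based extraction step.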

