AutoFACT: An Automatic Functional Annotation and Classification Tool

Liisa B Koski, Michael W Gray, Gertraud Burger, B Franz Lang

doi:10.1186/1471-2105-6-151


Open Access


Abstract

Background: Assignment of function to new molecular sequence data is an essential step in genomics projects. The usual process involves similarity searches of a given sequence against one or more databases, an arduous process for large datasets.

Results: We present AutoFACT, a fully automated and customizable annotation tool that assigns biologically informative functions to a sequence. Key features of this tool are that it (1) analyzes nucleotide and protein sequence data; (2) determines the most informative functional description by combining multiple BLAST reports from several user-selected databases; (3) assigns putative metabolic pathways, functional classes, enzyme classes, GeneOntology terms and locus names; and (4) generates output in HTML, text and GFF formats for the user's convenience. We have compared AutoFACT to four well-established annotation pipelines. The error rate of functional annotation is estimated to be only 1–2%. Comparison of AutoFACT to the traditional top-BLAST-hit annotation method shows that our procedure increases the number of functionally informative annotations by approximately 50%.

Conclusion: AutoFACT will serve as a useful annotation tool for smaller sequencing groups lacking dedicated bioinformatics staff. It is implemented in PERL and runs on LINUX/UNIX platforms. AutoFACT is available at .

Highlights

Assignment of function to new molecular sequence data is an essential step in genomics projects
The bit score is derived from the raw alignment score, taking into account the statistical properties of the scoring system used
Figure 3: Comparison of AutoFACT annotations across four phylogenetically diverse organisms previously annotated by well-established automatic pipelines (legend: AutoFACT annotation is similar; AutoFACT annotation is 'Unassigned protein'; AutoFACT annotation differs)
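
The normalized score mentioned in the highlights appears to be the BLAST bit score, the quantity AutoFACT uses as its cutoff. As a rough sketch, assuming the standard Karlin-Altschul normalization, the bit score S' follows from the raw score S as S' = (lambda * S - ln K) / ln 2. The function name and the example parameter values below are illustrative assumptions and are not taken from the paper.

```python
import math


def bit_score(raw_score: float, lam: float, k: float) -> float:
    """Convert a raw alignment score to a bit score.

    Assumes the standard Karlin-Altschul form:
        S' = (lambda * S - ln K) / ln 2
    where lambda and K are the statistical parameters of the scoring system.
    """
    if lam <= 0 or k <= 0:
        raise ValueError("lambda and K must be positive")
    return (lam * raw_score - math.log(k)) / math.log(2)


if __name__ == "__main__":
    # Illustrative parameters only (assumed, roughly typical of a gapped
    # BLOSUM62 search); real values come from the BLAST report footer.
    print(round(bit_score(100, lam=0.267, k=0.041), 2))
```

Because lambda and K are folded into the result, bit scores from searches run with different scoring systems can be compared against a single cutoff.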

Summary

Results

Methodology

AutoFACT takes a single FASTA-formatted sequence file as input, automatically recognizes the sequence type as nucleotide or protein and proceeds to ask the user for preferences regarding which databases to use, the order of database importance and bit score cutoff. If there are no matches to UniRef terms, the informative terms from the informative hit of the database (nr, in this example) are queried in the same way as above, until a functionally informative description line has been assigned to the sequence.

AutoFACT yields an ~50% increase in informative annotations compared to top BLAST hits against NCBI's nr and the UniRef databases.

[...] AutoFACT annotated as 'unassigned protein', either because the only BLAST hits were to other human sequences or because the informative terms could not be matched across database sources. Because AutoFACT considered hits to Saccharomyces cerevisiae as 'uninformative', 6/10 sequences were classified as '[domain name]-containing proteins'.

AutoFACT annotations for each organism mentioned above can be viewed at http://megasun.bch.umontreal.ca/Software/AutoFACT.htm
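
The methodology in this summary (sequence-type detection, databases ranked by importance, a bit score cutoff, and a fallback to the next database when no informative description is found) can be illustrated with a small sketch. This is not AutoFACT's code, which is written in PERL. The 90% nucleotide threshold, the list of uninformative terms, the default cutoff and all names are hypothetical, and the matching of informative terms across databases is deliberately left out.

```python
from dataclasses import dataclass

# Hypothetical keyword list; AutoFACT's real list of uninformative terms is not given here.
UNINFORMATIVE_TERMS = ("hypothetical", "unknown", "unnamed", "uncharacterized")


@dataclass
class Hit:
    database: str      # e.g. "uniref90" or "nr"
    description: str   # description line of the BLAST hit
    bit_score: float


def guess_sequence_type(seq: str) -> str:
    """Heuristic (assumed): nucleotide if at least 90% of letters are A/C/G/T/U/N."""
    letters = [c for c in seq.upper() if c.isalpha()]
    if not letters:
        raise ValueError("empty sequence")
    nucleotide_like = sum(c in "ACGTUN" for c in letters)
    return "nucleotide" if nucleotide_like / len(letters) >= 0.9 else "protein"


def is_informative(description: str) -> bool:
    """A description is informative if it contains none of the flagged terms."""
    lowered = description.lower()
    return not any(term in lowered for term in UNINFORMATIVE_TERMS)


def annotate(hits: list[Hit], db_order: list[str], bit_cutoff: float = 40.0) -> str:
    """Walk the databases in order of importance and return the description of
    the best informative hit at or above the cutoff; fall back to the next
    database if none is found. The default cutoff is a placeholder."""
    for db in db_order:
        candidates = [
            h for h in hits
            if h.database == db and h.bit_score >= bit_cutoff and is_informative(h.description)
        ]
        if candidates:
            return max(candidates, key=lambda h: h.bit_score).description
    return "unassigned protein"


if __name__ == "__main__":
    hits = [
        Hit("uniref90", "hypothetical protein", 250.0),
        Hit("nr", "ATP synthase subunit alpha", 120.0),
        Hit("nr", "DNA polymerase", 30.0),
    ]
    print(guess_sequence_type("ATGCGTACGTTAGC"))
    print(annotate(hits, db_order=["uniref90", "nr"]))
```

In this sketch the top-scoring UniRef hit is skipped because its description is uninformative, which is the kind of case where a top-BLAST-hit approach would yield no useful annotation.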




Journal: BMC Bioinformatics	Publication Date: Jun 16, 2005
Citations: 211	License type: cc-by


