StandEnA: a customizable workflow for standardized annotation and generating a presence-absence matrix of proteins.

Fatma Chafra, Felipe Borim Correa, Faith Oni, Ulisses Nunes Da Rocha, Özlen Konu Karakayalı, Peter F Stadler

doi:10.1093/bioadv/vbad069

Abstract

Several genome annotation tools standardize annotation outputs for comparability. During standardization, however, these tools do not allow user-friendly customization of annotation databases, which limits their flexibility and applicability in downstream analysis. StandEnA is a user-friendly command-line tool for Linux that facilitates the generation of custom databases by retrieving protein sequences from multiple databases. Directed by a user-defined list of standard names, StandEnA retrieves synonyms and uses them to search for the corresponding sequences in a set of public databases. The resulting custom databases are used in prokaryotic genome annotation to generate standardized presence-absence matrices and reference files containing standard database identifiers. To showcase StandEnA, we applied it to six metagenome-assembled genomes to analyze three different pathways. StandEnA is open-source software available at https://github.com/mdsufz/StandEnA. Supplementary data are available at Bioinformatics Advances online.
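The idea of a standardized presence-absence matrix can be illustrated with a minimal Python sketch. This is not StandEnA's code or interface: the function name `presence_absence`, the input layout, and the example genomes, protein names, and synonyms are all hypothetical, chosen only to show how synonyms might be collapsed onto user-defined standard names before recording presence (1) or absence (0) per genome.

```python
def presence_absence(hits, standard_names, synonyms):
    """Build a genome x protein presence-absence matrix.

    hits: dict mapping genome ID -> list of annotated protein names
    standard_names: list of user-defined standard names (matrix columns)
    synonyms: dict mapping standard name -> list of alternative names
    All three inputs are hypothetical illustrations, not StandEnA formats.
    """
    # Map every standard name and synonym (case-insensitive) to its standard name.
    lookup = {}
    for std in standard_names:
        lookup[std.lower()] = std
        for syn in synonyms.get(std, []):
            lookup[syn.lower()] = std

    matrix = {}
    for genome, names in hits.items():
        # Standard names found in this genome under any known synonym.
        found = {lookup[n.lower()] for n in names if n.lower() in lookup}
        matrix[genome] = {std: int(std in found) for std in standard_names}
    return matrix


# Hypothetical example data.
standard_names = ["nitrogenase", "urease"]
synonyms = {"nitrogenase": ["NifH"]}
hits = {
    "MAG1": ["nifH", "hypothetical protein"],
    "MAG2": ["Urease"],
}

matrix = presence_absence(hits, standard_names, synonyms)
for genome, row in matrix.items():
    print(genome, row)
```

In this sketch, an annotation counts as present only when its name matches a standard name or a listed synonym exactly (ignoring case); a real workflow would rely on sequence searches against the custom database rather than name matching.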

Journal: Bioinformatics Advances
Publication Date: Jan 5, 2023
License type: CC BY 4.0

