Abstract

We present RVDB-prot, the protein counterpart of the nucleic acid reference virus database RVDB. Protein databases enable more sensitive protein sequence comparisons. Like its nucleic acid counterpart, RVDB-prot aims to provide reliable, accurately annotated, unique entries, and it also includes a database of Hidden Markov Model (HMM) protein profiles for distant protein homology searches.

Highlights

  • Any reports and responses or comments on the article can be found at the end of the article

  • We detail where the protein sequences encoded by the nucleic acid sequences are found

  • UniProtKB[1] contains numerous viral sequences (4 497 049 in total, of which 17 008 (0.38%) are reviewed) that could, as with NCBI/nr, increase computation time when thousands of sequences have to be analyzed concomitantly, as is routine in metagenomics analyses


Summary

23 Apr 2019

The need for a better, well-annotated and comprehensive public viral database that can be used to identify viruses by high-throughput sequencing led Goodacre et al. to propose their Reference Viral DataBase (RVDB)[2]. This database consists of a collection of all currently known viral genomes and virus-related nucleic acid sequences retrieved from NCBI/nr or RefSeq. It relies on a dedicated reviewing process, both manual and computational, and its contents are updated four times per year. The reviewing process eliminates a large number of unwanted non-viral sequences, such as cloning vectors, endogenous sequences, and sequences that were wrongly annotated as viral but are of cellular origin. This high level of curation makes RVDB attractive to the virology research community, and version 19.0 was released in June 2020. We describe the conversion from the nucleotide version of RVDB to the protein version, RVDB-prot, as well as the clustering process leading to the HMM profiles.
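To make the computational side of such a reviewing step more concrete, here is a minimal Python sketch of a keyword screen on FASTA headers. It is an illustration only and not the RVDB pipeline: the keyword list `EXCLUDE_KEYWORDS` and the function names `parse_fasta`, `keep_entry` and `filter_fasta` are hypothetical, and the actual RVDB screening rules and manual review are not described in this summary.

```python
# Hypothetical keyword list for illustration; the real RVDB screening
# criteria are not given in this text.
EXCLUDE_KEYWORDS = ("cloning vector", "expression vector", "endogenous")


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def keep_entry(header, exclude=EXCLUDE_KEYWORDS):
    """Return True if the header contains none of the excluded keywords."""
    lowered = header.lower()
    return not any(keyword in lowered for keyword in exclude)


def filter_fasta(text, exclude=EXCLUDE_KEYWORDS):
    """Return the (header, sequence) pairs that pass the keyword screen."""
    return [(h, s) for h, s in parse_fasta(text) if keep_entry(h, exclude)]


if __name__ == "__main__":
    sample = (
        ">seq1 Hypothetical virus A capsid protein gene\nATGGCC\nTTTAAA\n"
        ">seq2 Cloning vector pExample, complete sequence\nGGGCCC\n"
    )
    for header, sequence in filter_fasta(sample):
        print(header, len(sequence))
```

A keyword screen of this kind can only catch entries whose annotation already reveals their nature. Sequences wrongly annotated as viral would not be caught this way, which is consistent with the reviewing process combining computational and manual steps.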

Methods
UniProt Consortium
Eddy SR
12. Bigot T