ViruSurf: an integrated database to investigate viral sequences.

Arif Canakoglu, Anna Bernasconi, Tommaso Alfonsi, Stefano Ceri, Damianos P Melidis, Pietro Pinoli

doi:10.1093/nar/gkaa846


Open Access

https://doi.org/10.1093/nar/gkaa846


Abstract

ViruSurf, available at http://gmql.eu/virusurf/, is a large public database of viral sequences with integrated and curated metadata from heterogeneous sources (RefSeq, GenBank, COG-UK and NMDC); it also exposes computed nucleotide and amino acid variants, called from the original sequences. A GISAID-specific ViruSurf database, available at http://gmql.eu/virusurf_gisaid/, offers a subset of these functionalities. Given the current pandemic outbreak, SARS-CoV-2 data are collected from all four sources; ViruSurf also contains other virus species harmful to humans, including SARS-CoV, MERS-CoV, Ebola and Dengue. The database is centered on sequences, described along their biological, technological and organizational dimensions. In addition, the analytical dimension characterizes each sequence in terms of its annotations and variants. The web interface makes it simple to express complex search queries: arbitrary queries can freely combine conditions on attributes from the four dimensions and extract the resulting sequences. Several example queries on the database confirm and possibly improve results from recent research papers; results can be recomputed over time and upon selected populations. Effective search over large and curated sequence data may enable faster responses to future threats that could arise from new viruses.

Highlights

The pandemic outbreak of the coronavirus disease COVID-19, caused by the virus species SARS-CoV-2, has created unprecedented attention toward the genetic mechanisms of viruses.
All tables have a numerical sequential primary key (PK), conventionally named using the table name and the postfix ‘_id’, and indicated as PK in Figure 1; foreign keys (FK) indicate the relationships from a non-key attribute to a primary key attribute of a different table.
The web interface of ViruSurf is composed of four sections, numbered in Figure 4: [1] the menu bar, for accessing services, documentation and query utilities; [2] the search interface over metadata attributes; [3] the search interface over annotations and nucleotide/amino acid variants; [4] the result visualization section, showing resulting sequences with their metadata.
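
The key-naming convention and the combined metadata/variant search described in the highlights can be sketched roughly as follows. This is a minimal illustration using Python's built-in sqlite3 module, not the actual ViruSurf schema or query engine: the table names, columns, accession labels and rows are all hypothetical and invented for the example.

```python
import sqlite3

# Hypothetical, heavily simplified tables. Names and columns are illustrative
# assumptions, not the real ViruSurf schema. Each primary key follows the
# convention "table name + _id"; foreign keys point to another table's PK.
SCHEMA = """
CREATE TABLE virus (
    virus_id INTEGER PRIMARY KEY,
    species_name TEXT NOT NULL
);
CREATE TABLE sequence (
    sequence_id INTEGER PRIMARY KEY,
    virus_id INTEGER NOT NULL REFERENCES virus (virus_id),
    accession TEXT NOT NULL,
    country TEXT
);
CREATE TABLE aminoacid_variant (
    aminoacid_variant_id INTEGER PRIMARY KEY,
    sequence_id INTEGER NOT NULL REFERENCES sequence (sequence_id),
    protein TEXT NOT NULL,
    position INTEGER NOT NULL,
    original TEXT NOT NULL,
    alternative TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database filled with invented demo rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO virus VALUES (1, 'SARS-CoV-2')")
    conn.executemany(
        "INSERT INTO sequence VALUES (?, ?, ?, ?)",
        [
            (1, 1, "DEMO-0001", "Italy"),
            (2, 1, "DEMO-0002", "Italy"),
            (3, 1, "DEMO-0003", "Germany"),
        ],
    )
    conn.executemany(
        "INSERT INTO aminoacid_variant VALUES (?, ?, ?, ?, ?, ?)",
        [
            (1, 1, "S", 614, "D", "G"),
            (2, 3, "S", 614, "D", "G"),
        ],
    )
    conn.commit()
    return conn


def search(conn, species, country, protein, position, alternative):
    """Combine metadata conditions with a condition on amino acid variants."""
    query = """
        SELECT DISTINCT s.accession
        FROM sequence AS s
        JOIN virus AS v ON v.virus_id = s.virus_id
        JOIN aminoacid_variant AS a ON a.sequence_id = s.sequence_id
        WHERE v.species_name = ? AND s.country = ?
          AND a.protein = ? AND a.position = ? AND a.alternative = ?
        ORDER BY s.accession
    """
    rows = conn.execute(query, (species, country, protein, position, alternative))
    return [row[0] for row in rows]
```

Calling `search(build_demo_db(), "SARS-CoV-2", "Italy", "S", 614, "G")` returns the accessions of the demo sequences that satisfy both the metadata conditions (species, country) and the variant condition, mirroring how sections [2] and [3] of the interface can be combined.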

Summary

Introduction

The pandemic outbreak of the coronavirus disease COVID-19, caused by the virus species SARS-CoV-2, has created unprecedented attention toward the genetic mechanisms of viruses. The sudden outbreak has shown that the research community is generally unprepared to face pandemic crises in a number of respects, including the availability of well-organized databases and search systems. We respond to this urgent need with a novel integrated database and search system that collects and curates virus sequences with their properties. We are driven by the Viral Conceptual Model (VCM) for virus sequences [1], which was recently developed by interviewing a variety of experts on the various aspects of virus research (including clinicians, epidemiologists, and drug and vaccine developers). Variants are extracted by performing data analysis and include both nucleotide variants (computed with respect to the reference sequence for the specific species), with their impact, and amino acid variants related to the genes.
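
As a rough illustration of what a nucleotide variant "with respect to the reference sequence" means, the sketch below compares a sample against a reference, position by position. It is a toy under strong assumptions, not the ViruSurf pipeline: both sequences are assumed to be already aligned to the same length, only single-base substitutions are reported (no insertions or deletions, no impact annotation), and the function name `call_substitutions` is hypothetical.

```python
def call_substitutions(reference, aligned_sample):
    """Return (1-based position, reference base, sample base) for each mismatch.

    Assumes both strings are already aligned and of equal length. Positions
    where either base is unknown ('N') or a gap ('-') are skipped.
    """
    if len(reference) != len(aligned_sample):
        raise ValueError("sequences must be aligned to the same length")

    skipped = {"N", "-"}
    variants = []
    pairs = zip(reference.upper(), aligned_sample.upper())
    for position, (ref_base, alt_base) in enumerate(pairs, start=1):
        if ref_base in skipped or alt_base in skipped:
            continue
        if ref_base != alt_base:
            variants.append((position, ref_base, alt_base))
    return variants
```

For example, comparing the reference `"ACGTAC"` with the sample `"ACTTAN"` reports a single substitution at position 3 (G to T); the final position is ignored because the sample base is unknown.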



Journal: Nucleic Acids Research	Publication Date: Oct 12, 2020
Citations: 40	License type: CC BY 4.0
