Abstract

Domain-specific databases are essential resources for the biomedical community, leveraging expert knowledge to curate published literature and provide access to referenced data and knowledge. The limited scope of these databases, however, poses important challenges for their infrastructure, visibility, funding and usefulness to the broader scientific community. CollecTF is a community-oriented database documenting experimentally validated transcription factor (TF)-binding sites in the Bacteria domain. In its quest to become a community resource for the annotation of transcriptional regulatory elements in bacterial genomes, CollecTF aims to move away from the conventional data-repository paradigm of domain-specific databases. Through the adoption of well-established ontologies, identifiers and collaborations, CollecTF has also progressively become a portal for the annotation and submission of information on transcriptional regulatory elements to major biological sequence resources (RefSeq, UniProtKB and the Gene Ontology Consortium). This fundamental change in database conception capitalizes on the domain-specific knowledge of contributing communities to provide high-quality annotations, while leveraging the availability of stable information hubs to promote long-term access and give the data high visibility. As a submission portal, CollecTF generates TF-binding site information through direct annotation of RefSeq genome records, definition of TF-based regulatory networks in UniProtKB entries and submission of functional annotations to the Gene Ontology. As a database, CollecTF provides enhanced search and browsing, targeted data exports, binding motif analysis tools and integration with motif discovery and search platforms. This innovative approach will allow CollecTF to focus its limited resources on the generation of high-quality information and the provision of specialized access to the data.

Database URL: http://www.collectf.org/

Highlights

  • Biological databases have rapidly become a cornerstone of modern biology, centralizing access to knowledge and data to facilitate and often guide experimental and computational research across all biological science disciplines

  • Beyond major coordinated resources hosted by federal institutions, such as the National Center for Biotechnology Information (NCBI) or the European Bioinformatics Institute (EMBL-EBI), the biological database arena is dominated by domain-specific databases [1,2,3,4]

  • These databases aggregate a community of researchers devoted to the highly specific annotation of a particular facet of biology and have become an essential resource for biomedical research in many different ways


Summary

Introduction

Biological databases have rapidly become a cornerstone of modern biology, centralizing access to knowledge and data to facilitate and often guide experimental and computational research across all biological science disciplines.

The CollecTF records implemented for UniProtKB entries contain detailed information on the sites bound by the TF, including their genomic location, the experimental evidence and literature sources, the genes regulated through the binding event and links to external databases providing additional information on the binding mechanism (e.g. the Protein Data Bank), the bound sites or their regulatory role (e.g. Gene Expression Omnibus) [25, 26] (Figure 3). Database users can search CollecTF for TF-binding motifs spanning an arbitrary number of bacterial clades. They can specify the level of experimental support for reported sites, ranging from broad groupings (e.g. in vitro techniques) to specific methods (e.g. DNase footprinting), allowing them to generate fully customized collections of binding sites for a TF of interest [12]. CollecTF also provides a TF-binding site search service that allows users to search genome assemblies, as well as integration with the MEME discovery suite as a reference database for TF-binding site search and motif discovery [17].
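The evidence-based filtering described above can be illustrated with a minimal sketch. It does not use the CollecTF API or its export format: the `Site` record, the `filter_sites` and `consensus` helpers, and the sequences and technique labels are all hypothetical placeholders, chosen only to show how a collection of sites restricted by experimental support could be turned into per-position counts for motif analysis.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """Hypothetical record for one reported binding site."""
    sequence: str   # aligned site sequence
    technique: str  # specific method, e.g. "DNase footprinting"
    category: str   # broad grouping, e.g. "in vitro"


# Made-up placeholder records, not data taken from CollecTF.
SITES = [
    Site("TACTGTATAT", "DNase footprinting", "in vitro"),
    Site("TACTGTACAT", "EMSA", "in vitro"),
    Site("AACTGTATAT", "ChIP-seq", "in vivo"),
    Site("TACTGTTTAT", "DNase footprinting", "in vitro"),
]


def filter_sites(sites, category=None, technique=None):
    """Keep sites matching a broad grouping and/or a specific method."""
    return [
        s for s in sites
        if (category is None or s.category == category)
        and (technique is None or s.technique == technique)
    ]


def position_counts(sites):
    """Count the bases observed at each position of equal-length sites."""
    if not sites:
        return []
    length = len(sites[0].sequence)
    if any(len(s.sequence) != length for s in sites):
        raise ValueError("all sites must have the same length")
    return [Counter(s.sequence[i] for s in sites) for i in range(length)]


def consensus(sites):
    """Most frequent base per position (ties follow first-seen order)."""
    return "".join(c.most_common(1)[0][0] for c in position_counts(sites))


if __name__ == "__main__":
    in_vitro = filter_sites(SITES, category="in vitro")
    print(len(in_vitro), consensus(in_vitro))
```

Filtering by `category` corresponds to the broad groupings mentioned in the text, while `technique` narrows the collection to a single method; the resulting counts are the usual starting point for building a position-specific scoring matrix.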

