Abstract

In the future, we hope to see an open and thriving data market in which users can find and select data from a wide range of data providers. In such an open access market, data are products that must be packaged accordingly. Increasingly, eCommerce sellers present heterogeneous product lines to buyers using faceted browsing. Using this approach we have developed the Ontogrator platform, which allows for rapid retrieval of data in a way that would be familiar to any online shopper. Drawing on Knowledge Organization Systems (KOS), especially ontologies, Ontogrator uses text mining to mark up data and faceted browsing to help users navigate, query and retrieve data. Ontogrator offers the potential to impact scientific research in two major ways: 1) by significantly improving the retrieval of relevant information; and 2) by significantly reducing the time required to compose standard database queries and assemble information for further research. Here we present a pilot implementation developed in collaboration with the Genomic Standards Consortium (GSC) that includes content from the StrainInfo, GOLD, CAMERA, SILVA and PubMed databases. This implementation demonstrates the power of ontogration and highlights that the usefulness of this approach is fully dependent on the quality of both the data and the KOS (ontologies) used. Ideally, the use and further expansion of this collaborative system will help to surface issues associated with the underlying quality of annotation and could lead to a systematic means for accessing integrated data resources.

Highlights

  • The field of molecular biology is a data-intensive discipline, which can largely be attributed to recent advancements in 'omics technologies [1].

  • An instance of Ontogrator for the Genomic Standards Consortium (GSC): To demonstrate the utility of faceted browsing as applied to biological data, we have worked within the Genomic Standards Consortium (GSC) [6,13] to produce an instance of the Ontogrator system populated with content from genomic, metagenomic, marker gene sequence and culture collection databases [14].

  • Using Ontogrator to explore marked up data: In the screenshot of the Ontogrator online interface shown in Figure 2, the initial view of the ontograted resources shows a default data source and the root terms of the different ontologies.


Summary

Introduction

The field of molecular biology is a data-intensive discipline, which can largely be attributed to recent advancements in 'omics technologies [1].

Faceted Browsing

Here we explore an approach, that of faceted browsing, for pulling together and viewing biological data resources in a new way. This approach has been successfully used in eCommerce for managing the exploration of large and complex search spaces. Individual products are placed under multiple classification hierarchies and can be viewed by users in a multitude of ways. This method is prevalent in Web sites that have extensive product catalogues, such as iTunes and Amazon, where items are described by their key attributes like price, manufacturer/publisher or genre.
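To make the idea of placing items under multiple classification hierarchies concrete, the following is a minimal sketch of faceted filtering. It is not Ontogrator's implementation: the records, the facet names (`habitat`, `source`) and the helper functions `browse` and `facet_counts` are all hypothetical and chosen only for illustration. Each record carries one path per facet, and selecting a term keeps every record at or below that term in the hierarchy.

```python
from collections import Counter

# Hypothetical records (invented for illustration). Each record is tagged
# under several independent facets; a facet value is a path in a
# classification hierarchy, written root first.
RECORDS = [
    {"id": "rec1", "facets": {"habitat": ("aquatic", "marine"), "source": ("GOLD",)}},
    {"id": "rec2", "facets": {"habitat": ("aquatic", "freshwater"), "source": ("CAMERA",)}},
    {"id": "rec3", "facets": {"habitat": ("terrestrial", "soil"), "source": ("GOLD",)}},
]


def matches(record, selection):
    """Return True if the record sits at or below every selected facet term."""
    for facet, prefix in selection.items():
        path = record["facets"].get(facet, ())
        # A record matches when the selected term is a prefix of its path.
        if path[:len(prefix)] != tuple(prefix):
            return False
    return True


def browse(records, selection):
    """Keep only the records that satisfy all selected facet terms."""
    return [r for r in records if matches(r, selection)]


def facet_counts(records, facet, depth=1):
    """Count records under each term of one facet at the given depth."""
    counts = Counter()
    for r in records:
        path = r["facets"].get(facet, ())
        if len(path) >= depth:
            counts[path[:depth]] += 1
    return counts


if __name__ == "__main__":
    aquatic = browse(RECORDS, {"habitat": ("aquatic",)})
    print([r["id"] for r in aquatic])
    print(facet_counts(aquatic, "source"))
```

Selecting a term in one facet narrows the result set, and the counts recomputed for the other facets tell the user which further refinements are still available, which is the behaviour familiar from online shop catalogues.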
