Abstract

Background: Although many biological databases are applying semantic web technologies, meaningful biological hypothesis testing cannot be easily achieved. Database-driven, high-throughput genomic hypothesis testing requires both the ability to obtain semantically relevant experimental data and the ability to perform relevant statistical tests on the retrieved data. Tissue Microarray (TMA) data are semantically rich and contain many biologically important hypotheses waiting for high-throughput conclusions.

Methods: An application-specific ontology was developed for managing TMA and DNA microarray databases with semantic web technologies. Data were represented in the Resource Description Framework (RDF) according to the framework of the ontology. Applications for hypothesis testing on TMA data (Xperanto-RDF) were designed and implemented by (1) formulating the syntactic and semantic structures of the hypotheses derived from TMA experiments, (2) formulating SPARQL queries that reflect the semantic structures of the hypotheses, and (3) performing statistical tests on the result sets returned by the SPARQL queries.

Results: When a user designs a hypothesis in Xperanto-RDF and submits it, the hypothesis can be tested against the TMA experimental data stored in Xperanto-RDF. When we evaluated four previously validated hypotheses as an illustration, all of them were supported by Xperanto-RDF.

Conclusions: We demonstrated the utility of high-throughput biological hypothesis testing. We believe that this approach can benefit preliminary investigation before highly controlled experiments are performed.
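The three steps named in the Methods (hypothesis structure, SPARQL query, statistical test) can be sketched roughly as follows. This is a minimal illustration, not the Xperanto-RDF implementation: the `Hypothesis` class, the `tma:` prefix, and every predicate name are invented for the example. The abstract does not say which statistical test is used, so a two-sided Fisher's exact test stands in as an assumption. The rows passed to `contingency` are in-memory stand-ins for a result set that a SPARQL endpoint would return.

```python
from dataclasses import dataclass
from math import comb


@dataclass
class Hypothesis:
    """Step 1 (hypothetical structure): is `marker` positivity associated with `outcome`?"""
    marker: str   # name of an antibody/marker
    outcome: str  # histology label of interest


def build_sparql(h: Hypothesis) -> str:
    """Step 2: turn the hypothesis into a SPARQL query string.

    The tma: prefix and all predicate names are assumptions made for this sketch.
    """
    return f"""
PREFIX tma: <http://example.org/tma#>
SELECT ?core ?positive ?histology WHERE {{
  ?core tma:stainedWith "{h.marker}" ;
        tma:isPositive ?positive ;
        tma:histology ?histology .
}}"""


def contingency(rows, outcome):
    """Tabulate (positive, histology) rows into a 2x2 table [[a, b], [c, d]]."""
    table = [[0, 0], [0, 0]]
    for positive, histology in rows:
        r = 0 if positive else 1
        c = 0 if histology == outcome else 1
        table[r][c] += 1
    return table


def fisher_exact_two_sided(table):
    """Step 3: two-sided Fisher's exact test for a 2x2 table."""
    (a, b), (c, d) = table
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
```

In use, one would build a `Hypothesis`, send the string from `build_sparql` to a SPARQL endpoint, reduce the bindings to `(positive, histology)` pairs, and pass the resulting table to `fisher_exact_two_sided` to obtain a p-value.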

Highlights

  • Although many biological databases are applying semantic web technologies, meaningful biological hypothesis testing cannot be easily achieved

  • Use case of integrated Tissue Microarray (TMA) and DNA microarray database: We designed a use case on the integrated DNA microarray and TMA database, which was to find the distribution of histologies among the TMA cores in which the intensities of the antibodies corresponding to the markers of interest in the DNA microarray were high or low

  • Xperanto-Resource Description Framework (RDF): As described above, Xperanto-RDF was implemented on top of existing TMA and DNA microarray databases

  • Data from these two databases were represented as RDF triples by a mediator system such that users could have the benefits of semantic web technologies as well as those of relational databases (RDBs)
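The use case and the mediator idea in the highlights above can be illustrated with a small sketch. Everything here is hypothetical: the row layout, the predicate names (`stainedWith`, `intensity`, `histology`), the sample values, and the intensity threshold are assumptions for illustration, not the schema of the actual databases. The sketch also assumes one antibody per core, which a real TMA database need not satisfy.

```python
from collections import Counter

# Hypothetical relational rows: (core_id, antibody, intensity, histology).
ROWS = [
    ("core1", "abX", 3, "histA"),
    ("core2", "abX", 0, "histB"),
    ("core3", "abX", 2, "histA"),
    ("core4", "abX", 1, "histB"),
]


def rows_to_triples(rows):
    """Mediator step: expose each relational row as (subject, predicate, object) triples."""
    triples = []
    for core, antibody, intensity, histology in rows:
        triples.append((core, "stainedWith", antibody))
        triples.append((core, "intensity", intensity))
        triples.append((core, "histology", histology))
    return triples


def histology_by_intensity(triples, antibody, threshold):
    """Count histologies among cores whose intensity for `antibody` is high vs. low."""
    stained = {s for s, p, o in triples if p == "stainedWith" and o == antibody}
    intensity = {s: o for s, p, o in triples if p == "intensity"}
    histology = {s: o for s, p, o in triples if p == "histology"}
    counts = {"high": Counter(), "low": Counter()}
    for core in stained:
        # A core counts as "high" when its intensity reaches the chosen threshold.
        group = "high" if intensity[core] >= threshold else "low"
        counts[group][histology[core]] += 1
    return counts
```

Calling `histology_by_intensity(rows_to_triples(ROWS), "abX", 2)` returns one histology tally for the high-intensity cores and one for the low-intensity cores. Those two tallies are the distributions the use case asks for.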

Summary

Introduction

Although many biological databases are applying semantic web technologies, meaningful biological hypothesis testing cannot be easily achieved. Biological databases have become essential resources for biologists in their daily research by providing information about biological facts and experimental results and procedures, and by providing management tools for the obtained data. Because these biological databases are designed for specific purposes, and independently managed, and metadata are not biological databases provide integrated data structure for knowledge management by applying semantic web technologies [1,2,3,4,6,9]. In spite of the benefits of semantic web technologies, these databases cannot directly answer biologists' biologically meaningful questions or hypotheses. Semantic web technologies are still beneficial in those applications.

Methods
Results
Discussion
Conclusion
