Abstract

Advanced genomic data mining.

Highlights

  • As data banks increase their size, one of the current challenges in bioinformatics is to be able to query them in a sensible way

  • Data mining is vital to bioinformatics because it lets users go beyond simply browsing genome browsers, such as Ensembl [1,2] or the UCSC Genome Browser [3], and address specific questions: for example, what the results obtained with a microarray platform mean biologically, or how to identify a short motif upstream of a gene

  • Galaxy [5] provides a set of tools that can retrieve data from the Table Browser (Table Browser and BioMart will be explained below), facilitating complex queries that require multiple joins (Figure 2)


Summary

Introduction

As data banks increase their size, one of the current challenges in bioinformatics is to be able to query them in a sensible way. The generic query system has shifted toward a federated approach that has been deployed for several biological databases and has become a component of the Generic Model Organism Database (GMOD) project. In this contribution, we provide some solutions for data mining; we focus on advanced ways of interacting with BioMart using other applications to retrieve information through different platforms such as Galaxy [5] and the biomaRt package of BioConductor [11,12]. Many of these tools also interact with the UCSC Table Browser and take similar approaches using the UCSC system.
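As an illustration of what interacting with BioMart from another application can look like, the following minimal sketch builds a BioMart-style XML query in Python without contacting any server. It is not taken from the article: the helper name build_biomart_query is hypothetical, and the dataset, filter, and attribute names (hsapiens_gene_ensembl, chromosome_name, ensembl_gene_id, external_gene_name) are assumed examples that may differ between BioMart releases.

```python
import xml.etree.ElementTree as ET


def build_biomart_query(dataset, attributes, filters=None):
    """Return a BioMart-style XML query string for a single dataset.

    Hypothetical helper for illustration only; it builds the query text
    and does not send it anywhere.
    """
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",        # ask for tab-separated output
        "header": "0",             # no header row
        "uniqueRows": "1",         # drop duplicate rows
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    # Filters restrict which records are returned.
    for name, value in (filters or {}).items():
        ET.SubElement(ds, "Filter", {"name": name, "value": str(value)})
    # Attributes select which columns are returned.
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    body = ET.tostring(query, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n<!DOCTYPE Query>\n' + body


# Assumed example names; check them against the BioMart server in use.
example_query = build_biomart_query(
    dataset="hsapiens_gene_ensembl",
    attributes=["ensembl_gene_id", "external_gene_name"],
    filters={"chromosome_name": "1"},
)
print(example_query)
```

A string of this kind would typically be submitted as the query parameter of a BioMart web service; the exact endpoint depends on the server, so it is deliberately left out here.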

BioMart Web Interface
BioMart Central Server BioConductor
Other Data Mining Tools
Conclusions
Accession Numbers Used in the Text