Abstract
The trans-genomic query (TGQ) problem--enabling the free query of biological information, even across genomes--is a central challenge facing bioinformatics. Solutions to this problem can alter the nature of the field, moving it beyond the jungle of data integration and expanding the number and scope of questions that can be answered. An alignment table is a binary relationship on locations (sequence segments). An important special case of alignment tables are hit tables ? tables of pairs of highly similar segments produced by alignment tools like BLAST. However, alignment tables also include general binary relationships, and can represent any useful connection between sequence locations. They can be curated, and provide a high-quality queryable backbone of connections between biological information. Alignment tables thus can be a natural foundation for TGQ, as they permit a central part of the TGQ problem to be reduced to purely technical problems involving tables of locations.Key challenges in implementing alignment tables include efficient representation and indexing of sequence locations. We define a location datatype that can be incorporated naturally into common off-the-shelf database systems. We also describe an implementation of alignment tables in BLASTGRES, an extension of the open-source POSTGRESQL database system that provides indexing and operators on locations required for querying alignment tables. This paper also reviews several successful large-scale applications of alignment tables for Trans-Genomic Query. Tables with millions of alignments have been used in queries about alternative splicing, an area of genomic analysis concerning the way in which a single gene can yield multiple transcripts. Comparative genomics is a large potential application area for TGQ and alignment tables.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
More From: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.