Abstract
The vast increase in genomic sequence data reveals the limitations of single-computer systems in terms of both processing and storage capacity. The only realistic way to cope with this growth in the near future is to use several computers rather than a single one. Keeping up with the steady growth of genomic sequence data is likely to require systems with an increasingly large number of nodes. For this purpose, solutions based on PC clusters are attractive because they offer good performance for the price and provide reasonable hardware scalability. Nevertheless, building such a system raises at least three important concerns: (1) the whole system, including the database management, must be scalable and accommodate a very large number of server nodes (say, tens of thousands); (2) the system must tolerate the failure of individual nodes, because such failures become more frequent as the number of nodes increases; and (3) the whole system should appear to clients as a single database image. In this context, we sketch the overall architecture of an autonomous, scalable genome database that should satisfy the requirements mentioned above. The system consists of partially independent databases, with a management system that provides a single database image. Our prototype system uses XML to store information and Java servlet/applet technology for autonomous data management and remote access. As an illustration, we have tested the prototype on BLAST search results between mouse cDNA and human chromosome sequence.