Abstract

The revolution in sequencing technologies has made it possible to sequence human genomes quickly and at very low cost, leading to exponential growth in the availability of whole-genome sequences. However, a complete understanding of our genome and its association with cancer remains a distant goal. Researchers are working to detect new variants and establish their association with diseases, which creates the need to aggregate this Big Data on a common, standard, scalable platform. In this work, a database named Enlightenment has been implemented that makes genomic data integrated from eight public databases, together with DNA sequencing profiles of H. sapiens, available on a single platform. Annotated results covering cancer-specific biomarkers, pharmacogenetic biomarkers and their association with variability in drug response, and DNA profiles along with novel copy number variants are computed, stored, and made accessible through a web interface. To overcome the challenge of storing and processing NGS-based whole-genome DNA sequences, Enlightenment has been extended and deployed on HBase, a flexible and horizontally scalable database distributed over a Hadoop cluster. This deployment would enable the integration of other omics data into the database, lighting the path towards the eradication of cancer.

