Abstract

Genomic next-generation sequencing (NGS) and third-generation sequencing (TGS) have a broad range of applications in the life sciences, such as non-invasive prenatal testing (NIPT), ctDNA testing for non-invasive personalized tumor therapy, whole genome sequencing (WGS), whole exome sequencing (WES), and RNA sequencing. A high-throughput genome sequencing instrument can sequence thousands of samples in parallel, generating dozens of terabytes of genomic data in one day. Storing and analyzing petabytes of genomic data is becoming a serious challenge for much of the biomedical research community. In this paper, we present the design and implementation of a genomics cloud that scales storage and computing capacity flexibly and provides many easy-to-use genomics analysis tools for customers. We describe the technical solution for building this genomics cloud based on CWL/WDL, Docker, DAG, NAS, and an object storage system. The implemented cloud platform also frees scientists from the burden of building high-performance clusters, managing millions of genomic data files, and scripting genomics analysis pipelines.
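
To illustrate the role a DAG (directed acyclic graph) plays in such a platform, the following is a minimal sketch of how pipeline steps might be ordered and grouped for parallel execution. It is not the paper's implementation: the step names and the `PIPELINE` mapping are hypothetical, and only Python's standard `graphlib` module is used.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each step maps to the set of steps it depends on.
# The step names are illustrative and do not come from the paper.
PIPELINE = {
    "qc": set(),
    "align": {"qc"},
    "sort": {"align"},
    "call_snv": {"sort"},
    "call_sv": {"sort"},
    "annotate": {"call_snv", "call_sv"},
}


def execution_order(dag):
    """Return one valid serial order in which the steps can run."""
    return list(TopologicalSorter(dag).static_order())


def parallel_batches(dag):
    """Group steps into batches; steps within a batch have no mutual dependencies."""
    sorter = TopologicalSorter(dag)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished, unlocking dependents
    return batches


if __name__ == "__main__":
    for i, batch in enumerate(parallel_batches(PIPELINE), start=1):
        print(f"batch {i}: {', '.join(batch)}")
```

In a real deployment, each step would presumably run in its own Docker container and be described in CWL or WDL rather than in a Python dictionary; the sketch only shows the dependency ordering.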
