Abstract

A tremendous amount of whole-genome sequencing data has been provided by large consortium projects such as TCGA (The Cancer Genome Atlas) and COSMIC, creating great opportunities for functional gene research and for uncovering cancer-associated mechanisms. While the existing web servers are valuable and widely used, many whole-genome analysis functions urgently needed by experimental biologists are still not adequately addressed. We therefore developed a cloud-based platform, CG (ClickGene), for do-it-yourself analysis of users’ private in-house data or public genome data without any software installation or system configuration. The CG platform provides key interactive and customizable functions, including Bee-swarm plot, linear regression analysis, Mountain plot, Directional Manhattan plot, Deflection plot and Volcano plot. Using these tools, global profiles or individual gene distributions for expression and copy number variation (CNV) analyses can be generated with only mouse clicks. The easy accessibility of such comprehensive pan-cancer genome analysis greatly facilitates data mining in a wide range of research areas, such as therapeutic discovery. CG thus fills the gap between big cancer genomics data and the delivery of integrated knowledge to end-users, helping to unleash the value of current data resources. More importantly, unlike other R-based web platforms, CG was developed with Dubbo, a cloud distributed service governance framework for the global transfer of ‘big data’ streams. CG runs on an independent cloud server, which ensures steady global accessibility. More than 2 years of operation have shown that end-users can create advanced plots for hundreds of whole-genome data sets within seconds, anytime and anywhere. CG is available at http://www.clickgenome.org/.

Highlights

  • The rapid development of next-generation sequencing and array-based profiling methods generates large quantities of diverse types of genomic data [1]

  • We developed CG (ClickGene), a cloud-based platform, to deliver fast and customizable functions that complement existing tools

  • To quantify similarity, we introduced three other popular scores besides Dynamic Time Warping (DTW)

Summary

Introduction

The rapid development of next-generation sequencing and array-based profiling methods generates large quantities of diverse types of genomic data [1]. Public data portals such as TCGA (The Cancer Genome Atlas) and COSMIC [2] provide more and more genome data in different formats and file types. While all of these enable researchers to study the genome at unprecedented resolution, analyzing the data presupposes computer skills and mathematical/statistical techniques. Working out how to download the proper data is a very time-consuming process, and it is not reasonable to expect a given researcher to become familiar with all of these ways of downloading and analyzing data. Existing tools include, for example, Tablet [3], BamView [4], IGV [5], MethylMix [6], GISTIC [7], Web-TCGA [8], TCGA-assembler [9], cBioPortal [10], GEPIA [11] and the UCSC Cancer Genomics Browser [12].

