ClickGene: an open cloud-based platform for big pan-cancer data genome-wide association study, visualization and exploration

Jia-Hao Bi, Yi-Fan Tong, Xing-Feng Yang, Adi F Gazdar, Zhe-Wei Qiu, John Minna, Kai Song

doi:10.1186/s13040-019-0202-3


Open Access

https://doi.org/10.1186/s13040-019-0202-3


Abstract

A tremendous amount of whole-genome sequencing data has been provided by large consortium projects such as TCGA (The Cancer Genome Atlas) and COSMIC, which creates great opportunities for functional gene research and for uncovering cancer-associated mechanisms. While the existing web servers are valuable and widely used, many whole-genome analysis functions urgently needed by experimental biologists are still not adequately addressed. A cloud-based platform named CG (ClickGene) was therefore developed for do-it-yourself analysis of users' private in-house data or public genome data, with no software installation or system configuration required. The CG platform provides key interactive and customizable functions, including Bee-swarm plot, linear regression analyses, Mountain plot, Directional Manhattan plot, Deflection plot and Volcano plot. Using these tools, global profiles or individual gene distributions for expression and copy number variation (CNV) analyses can be generated with mouse clicks alone. The easy accessibility of such comprehensive pan-cancer genome analysis greatly facilitates data mining in a wide range of research areas, such as therapeutic discovery. CG therefore fills the gap between big cancer genomics data and the delivery of integrated knowledge to end-users, helping to unleash the value of current data resources. More importantly, unlike other R-based web platforms, the CG platform was developed with Dubbo, a cloud distributed service governance framework for the global transfer of 'big data' streams. CG runs on an independent cloud server, which ensures steady global accessibility. More than 2 years of operation have shown that end-users can create advanced plots for hundreds of whole-genome datasets through CG within seconds, anytime and anywhere. CG is available at http://www.clickgenome.org/.

Highlights

The rapid development of next-generation sequencing and array-based profiling methods generates large quantities of diverse types of genomic data [1]
We developed the cloud-based CG (ClickGene) platform to deliver fast and customizable functionality that complements the existing tools
Besides Dynamic Time Warping (DTW), we introduced three other popular scores to quantify the similarity
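
The DTW similarity mentioned in the highlights can be illustrated with a minimal sketch. This is the textbook dynamic-programming formulation with absolute difference as the local cost, not necessarily the variant used in CG; the function name `dtw_distance` and the example profiles are hypothetical, and this excerpt does not specify which profiles the platform compares.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook DTW distance between two numeric sequences.

    Assumption: the local cost is the absolute difference |a_i - b_j|.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays on the same point
                cost[i][j - 1],      # b advances, a stays on the same point
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]


# Hypothetical profiles: the second repeats one value of the first.
profile_a = [1.0, 2.0, 3.0]
profile_b = [1.0, 2.0, 2.0, 3.0]
distance = dtw_distance(profile_a, profile_b)
```

Because DTW allows one point to align with several points of the other sequence, two profiles that differ only by a local stretch receive a distance of zero here, whereas a point-by-point comparison would not even be defined for sequences of unequal length.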

Summary

Introduction

The rapid development of next-generation sequencing and array-based profiling methods generates large quantities of diverse types of genomic data [1]. Public data portals like TCGA (The Cancer Genome Atlas) and COSMIC [2] provide more and more genome data in different formats and file types. While all of these enable researchers to study the genome at unprecedented resolution, analyzing the data presupposes computer skills and mathematical/statistical techniques. Figuring out how to download the proper data is a very time-consuming process, and it is not reasonable to expect a given researcher to become familiar with every approach to data downloading and analysis. Existing tools include, for example, Tablet [3], BamView [4], IGV [5], MethylMix [6], GISTIC [7], Web-TCGA [8], TCGA-assembler [9], cBioPortal [10], GEPIA [11] and the UCSC Cancer Genomics Browser [12].

Methods

Results

Discussion

Conclusion


Journal: BioData Mining	Publication Date: Jun 26, 2019
Citations: 13	License type: open-access
