Abstract

With the advancement of high-throughput technologies, gene expression profiles in cell lines and clinical samples are widely available in the public domain for research. However, a challenge arises when trying to perform a systematic and comprehensive analysis across independent datasets. To address this issue, we developed a web-based system, CellExpress, for analyzing the gene expression levels in more than 4000 cancer cell lines and clinical samples obtained from public datasets and user-submitted data. First, a normalization algorithm can be utilized to reduce the systematic biases across independent datasets. Next, a similarity assessment of gene expression profiles can be achieved through a dynamic dot plot, along with a distance matrix obtained from principal component analysis. Subsequently, differentially expressed genes can be visualized using hierarchical clustering. Several statistical tests and analytical algorithms are implemented in the system for dissecting gene expression changes based on the groupings defined by users. Lastly, users are able to upload their own microarray and/or next-generation sequencing data to perform a comparison of their gene expression patterns, which can help classify user data, such as stem cells, into different tissue types. In conclusion, CellExpress is a user-friendly tool that provides a comprehensive analysis of gene expression levels in both cell lines and clinical samples. The website is freely available at http://cellexpress.cgm.ntu.edu.tw/. Source code is available at https://github.com/LeeYiFang/Carkinos under the MIT License. Database URL: http://cellexpress.cgm.ntu.edu.tw/
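The first two steps described above (normalizing independent datasets, then comparing samples through a distance matrix) can be illustrated with a minimal sketch. It assumes quantile normalization as the normalization algorithm and Euclidean distance as the similarity measure; the abstract names neither, and the PCA projection that CellExpress applies before computing distances is omitted here. The function names and the toy expression values are hypothetical and are not taken from the CellExpress source code.

```python
from math import sqrt


def quantile_normalize(samples):
    """Force every sample onto the same value distribution.

    samples: list of equal-length lists, one list per sample, with genes
    in the same order in every list. Ties are broken by gene order, which
    is a simplification of the usual tie-averaging rule.
    """
    n_genes = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # Mean expression at each rank, taken across all samples.
    rank_means = [
        sum(s[rank] for s in sorted_samples) / len(samples)
        for rank in range(n_genes)
    ]
    normalized = []
    for s in samples:
        # Gene indices ordered by ascending expression in this sample.
        order = sorted(range(n_genes), key=lambda g: s[g])
        out = [0.0] * n_genes
        for rank, gene in enumerate(order):
            out[gene] = rank_means[rank]
        normalized.append(out)
    return normalized


def distance_matrix(samples):
    """Pairwise Euclidean distances between samples (assumed metric)."""
    n = len(samples)
    return [
        [
            sqrt(sum((a - b) ** 2 for a, b in zip(samples[i], samples[j])))
            for j in range(n)
        ]
        for i in range(n)
    ]


# Toy data: two samples measured on very different scales (made-up values).
cell_line = [5.0, 2.0, 3.0]
user_sample = [40.0, 10.0, 60.0]

normalized = quantile_normalize([cell_line, user_sample])
distances = distance_matrix(normalized)
```

After normalization both samples share the same set of values, so the remaining distance reflects differences in gene ranking rather than differences in measurement scale between datasets.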

Highlights

  • Cell lines play an important role as a model for conducting biological experiments and developing new therapies in cancer studies [1, 2]

  • An interactive principal component analysis (PCA) plot is embedded to help users evaluate the similarity of gene expression profiles in cell lines versus clinical samples

  • Both microarray and next-generation sequencing (NGS) data uploaded by the user can be analyzed in the CellExpress system and compared with cell lines or clinical samples

Introduction

Cell lines play an important role as a model for conducting biological experiments and developing new therapies in cancer studies [1, 2]. Gene expression profiles differ widely across cancer cell lines, even among lines classified under the same organ type. With the advancement of high-throughput technologies, such as microarrays and next-generation sequencing (NGS), researchers are able to identify cancer cell lines with unusual expression patterns. Doing so requires a systematic and comprehensive system for analyzing gene expression levels across different cell lines. By using the CellExpress system, researchers can identify cell lines that are unsuitable for further functional studies.
