Abstract

BackgroundRNA-seq is a tool for measuring gene expression and is commonly used to identify differentially expressed genes (DEGs). Gene clustering is used to classify DEGs with similar expression patterns for the subsequent analyses of data from experiments such as time-courses or multi-group comparisons. However, gene clustering has rarely been used for analyzing simple two-group data or differential expression (DE). In this study, we report that a model-based clustering algorithm implemented in an R package, MBCluster.Seq, can also be used for DE analysis.ResultsThe input data originally used by MBCluster.Seq is DEGs, and the proposed method (called MBCdeg) uses all genes for the analysis. The method uses posterior probabilities of genes assigned to a cluster displaying non-DEG pattern for overall gene ranking. We compared the performance of MBCdeg with conventional R packages such as edgeR, DESeq2, and TCC that are specialized for DE analysis using simulated and real data. Our results showed that MBCdeg outperformed other methods when the proportion of DEG (PDEG) was less than 50%. However, the DEG identification using MBCdeg was less consistent than with conventional methods. We compared the effects of different normalization algorithms using MBCdeg, and performed an analysis using MBCdeg in combination with a robust normalization algorithm (called DEGES) that was not implemented in MBCluster.Seq. The new analysis method showed greater stability than using the original MBCdeg with the default normalization algorithm.ConclusionsMBCdeg with DEGES normalization can be used in the identification of DEGs when the PDEG is relatively low. As the method is based on gene clustering, the DE result includes information on which expression pattern the gene belongs to. The new method may be useful for the analysis of time-course and multi-group data, where the classification of expression patterns is often required.

Highlights

  • RNA-seq is a tool for measuring gene expression and is commonly used to identify differentially expressed genes (DEGs)

  • Various sophisticated algorithms for gene clustering are available [30,31,32], we focus our analysis on the R package MBCluster.Seq [28], as its framework is compatible with differential expression (DE) analysis and it is comparable with other R packages dedicated to detecting DEGs

  • Proposed method for identification of DEGs based on gene clustering In this study, we devised a new analysis method that uses the functionalities of the MBCluster.Seq package

Read more

Summary

Introduction

RNA-seq is a tool for measuring gene expression and is commonly used to identify differentially expressed genes (DEGs). RNA-seq is commonly used to obtain genome-wide expression data for genes [1, 2] and to identify differentially expressed genes (DEGs) for different groups or conditions [3, 4]. Obtaining a ranked gene list for the degree of differential expression (DE) is a starting point for gaining biological insights into the groups being compared [26]. Several analysis methods such as gene ontology and the construction of co-expression networks have been used to examine the biological mechanisms underlying DEGs [25].

Methods
Results
Discussion
Conclusion

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.