Abstract
Background: MicroRNAs (miRNAs) have great potential as tumor biomarkers and therapeutic targets. With the rapid development of high-throughput experimental technology, gene expression experiments have become increasingly specialized and diversified. The resulting complex data structures pose a great challenge for biomarker identification. Meanwhile, current statistical and machine learning methods for detecting biomarkers suffer from low reliability and biased criteria.

Results: This study aims to select combinatorial miRNA biomarkers, which have higher sensitivity and specificity than single-gene biomarkers. To avoid exhaustive search and redundant information, miRNAs are first clustered, and combinations of representative cluster members are then assessed as potential biomarkers. Both the criterion for partitioning clusters and the criterion for selecting representative members are based on Fisher linear discriminant analysis (FDA). The FDA-based criterion is shown to be superior to three other criteria in selecting representative members, and it is also effective at refining clusters. In a comparison with eight common feature selection methods, this clustering-based method performs best with regard to the discriminative ability of the selected biomarkers.

Conclusions: Our experimental results demonstrate that the clustering-based method can identify miRNA combinatorial biomarkers with high accuracy and efficiency. Our method and data are available to the public upon request.
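The cluster-then-combine idea described above can be illustrated with a minimal sketch. It assumes a two-class problem, takes the cluster assignment as given, and uses the standard two-class Fisher criterion (squared mean difference relative to within-class scatter) as the score. The function names, the ridge term, and the choice of one representative per cluster are illustrative assumptions, not the authors' implementation, whose exact criterion is not given in this abstract.

```python
from itertools import combinations

import numpy as np


def fisher_score(X, y):
    """Maximal two-class Fisher criterion for the feature columns of X.

    Returns d^T Sw^{-1} d, where d is the difference of class means and
    Sw is the pooled within-class scatter matrix. y must hold labels 0/1.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]  # treat a single feature as one column
    y = np.asarray(y)
    X0, X1 = X[y == 0], X[y == 1]
    c0 = X0 - X0.mean(axis=0)
    c1 = X1 - X1.mean(axis=0)
    diff = X1.mean(axis=0) - X0.mean(axis=0)
    Sw = c0.T @ c0 + c1.T @ c1
    Sw += 1e-8 * np.eye(X.shape[1])  # small ridge (assumption) for stability
    return float(diff @ np.linalg.solve(Sw, diff))


def pick_representatives(X, y, cluster_labels):
    """Pick, per cluster, the feature index with the highest Fisher score."""
    X = np.asarray(X, dtype=float)
    cluster_labels = np.asarray(cluster_labels)
    reps = []
    for c in np.unique(cluster_labels):
        members = np.flatnonzero(cluster_labels == c)
        scores = [fisher_score(X[:, j], y) for j in members]
        reps.append(int(members[int(np.argmax(scores))]))
    return reps


def best_combination(X, y, reps, size=2):
    """Score every combination of representatives and return the best one."""
    X = np.asarray(X, dtype=float)
    return max(
        combinations(reps, size),
        key=lambda idx: fisher_score(X[:, list(idx)], y),
    )
```

Because only one representative per cluster enters the combination step, the number of candidate combinations grows with the number of clusters rather than with the number of miRNAs, which is the point of clustering first.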
Highlights
MicroRNAs have great potential as tumor biomarkers and therapeutic targets
The goal of this study differs slightly from that of the aforementioned literature in that we aim to develop an efficient method for identifying miRNA combinatorial biomarkers, rather than large feature subsets that are hard to interpret biologically
GSE40525 was classified into two categories according to tumor and peri-tumor status, while GSE22220 was divided into two categories according to estrogen receptor (ER) status
Summary
MicroRNAs (miRNAs) have great potential as tumor biomarkers and therapeutic targets. Current statistical and machine learning methods for detecting biomarkers suffer from low reliability and biased criteria. Numerous studies have demonstrated that miRNAs can act as oncogenes or tumor suppressors in various cancer types [1, 2]. To search for biomarkers, differential gene expression analysis is performed and genes are ranked according to certain criteria. A variety of statistical methods have been applied to gene expression analysis. Fold change has been used as an initial metric for measuring the significance of the change in expression levels between two groups of samples [7], and the t-test [8] is a widely used statistical method for selecting differentially expressed genes. Single-gene biomarkers are often unreliable or insufficient for distinguishing subtypes or conditions of complex diseases.
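As a rough illustration of the two single-gene metrics mentioned above, the following sketch computes a log2 fold change and a t statistic per gene and ranks genes by the absolute t value. It assumes raw-scale (not log-transformed) expression values, uses Welch's unequal-variance form of the t statistic, and adds a pseudocount to avoid taking the log of zero; all of these choices and all names are assumptions for illustration, not details taken from the cited methods.

```python
import math
from statistics import mean, variance


def log2_fold_change(group_a, group_b, pseudocount=1.0):
    """log2 ratio of mean expression, group_b over group_a.

    The pseudocount is an assumption that guards against log(0).
    """
    return math.log2((mean(group_b) + pseudocount) / (mean(group_a) + pseudocount))


def welch_t(group_a, group_b):
    """Welch's t statistic comparing group_b to group_a.

    Requires at least two samples per group and nonzero variance overall.
    """
    se = math.sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_b) - mean(group_a)) / se


def rank_genes(expr_a, expr_b):
    """Rank genes by |t|, largest first.

    expr_a and expr_b map a gene name to its expression values in each group.
    Returns a list of (gene, (log2_fold_change, t_statistic)) pairs.
    """
    stats = {
        gene: (
            log2_fold_change(expr_a[gene], expr_b[gene]),
            welch_t(expr_a[gene], expr_b[gene]),
        )
        for gene in expr_a
    }
    return sorted(stats.items(), key=lambda kv: abs(kv[1][1]), reverse=True)
```

A ranking of this kind evaluates each gene in isolation, which is why it cannot capture the joint discriminative power that combinatorial biomarkers are meant to provide.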