Abstract

Background: Correctly classifying the subtypes of cancer is of great significance for the in-depth study of cancer pathogenesis and the realization of personalized treatment for cancer patients. In recent years, classification of cancer subtypes using deep neural networks and gene expression data has gradually become a research hotspot. However, most classifiers may face overfitting and low classification accuracy when dealing with small sample size and high-dimensional biology data.

Results: In this paper, a laminar augmented cascading flexible neural forest (LACFNForest) model was proposed to complete the classification of cancer subtypes. This model is a cascading flexible neural forest using deep flexible neural forest (DFNForest) as the base classifier. A hierarchical broadening ensemble method was proposed, which ensures the robustness of classification results and avoids the waste of model structure and function as much as possible. We also introduced an output judgment mechanism to each layer of the forest to reduce the computational complexity of the model. The deep neural forest was extended to the densely connected deep neural forest to improve the prediction results. The experiments on RNA-seq gene expression data showed that LACFNForest has better performance in the classification of cancer subtypes compared to the conventional methods.

Conclusion: The LACFNForest model effectively improves the accuracy of cancer subtype classification with good robustness. It provides a new approach for the ensemble learning of classifiers in terms of structural design.
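The abstract names two structural ideas, dense connections between cascade layers and a per-layer output judgment, without giving their details here. The following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes that "densely connected" means each layer receives the original features plus the class-probability outputs of all earlier layers, and that "output judgment" means a sample leaves the cascade once a layer's top class probability reaches a threshold. `CentroidClassifier` is a hypothetical stand-in for DFNForest, and `fit_cascade`, `predict_cascade`, and `threshold` are illustrative names. The sketch also augments features with in-sample probabilities, a simplification.

```python
import math
from collections import defaultdict


class CentroidClassifier:
    """Hypothetical stand-in base learner (NOT a DFNForest): nearest centroid."""

    def fit(self, X, y):
        sums, counts = {}, defaultdict(int)
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for j, value in enumerate(row):
                acc[j] += value
            counts[label] += 1
        self.classes_ = sorted(sums)
        self.centroids_ = {
            c: [s / counts[c] for s in sums[c]] for c in self.classes_
        }
        return self

    def predict_proba(self, X):
        out = []
        for row in X:
            # Softmax over negative distances to each class centroid.
            scores = [-math.dist(row, self.centroids_[c]) for c in self.classes_]
            top = max(scores)
            exps = [math.exp(s - top) for s in scores]
            total = sum(exps)
            out.append([e / total for e in exps])
        return out


def fit_cascade(X, y, n_layers=3):
    """Train layers in sequence; each sees all earlier outputs (assumed dense link)."""
    layers = []
    feats = [list(row) for row in X]
    for _ in range(n_layers):
        clf = CentroidClassifier().fit(feats, y)
        layers.append(clf)
        probs = clf.predict_proba(feats)
        # Append this layer's outputs to everything accumulated so far.
        feats = [f + p for f, p in zip(feats, probs)]
    return layers


def predict_cascade(layers, X, threshold=0.9):
    """Return (label, exit_depth) per sample; confident samples stop early."""
    feats = [list(row) for row in X]
    results = [None] * len(feats)
    active = list(range(len(feats)))
    for depth, clf in enumerate(layers):
        probs = clf.predict_proba([feats[i] for i in active])
        is_last = depth == len(layers) - 1
        remaining = []
        for i, p in zip(active, probs):
            best = max(range(len(p)), key=p.__getitem__)
            if p[best] >= threshold or is_last:
                # Assumed output judgment: accept and skip deeper layers.
                results[i] = (clf.classes_[best], depth)
            else:
                feats[i] = feats[i] + p
                remaining.append(i)
        active = remaining
        if not active:
            break
    return results


# Toy usage: two well-separated classes plus one ambiguous query point.
X_train = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
y_train = ["a", "a", "b", "b"]
cascade = fit_cascade(X_train, y_train, n_layers=3)
predictions = predict_cascade(cascade, [[1.0, 1.0], [9.0, 9.0], [5.0, 5.5]])
```

In this toy setup, samples that exit at the first layer never pay for the deeper ones, which is the kind of saving the abstract attributes to the output judgment mechanism. The threshold value is arbitrary here and would need tuning on real data.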

Highlights

  • Classifying the subtypes of cancer is of great significance for the in-depth study of cancer pathogenesis and the realization of personalized treatment for cancer patients

  • Datasets and parameters: The data used in this paper are RNA-seq gene expression data obtained from The Cancer Genome Atlas (TCGA) [24]

  • To allow a direct comparison with other classifier models, the experiments used datasets for the same three cancer types as in [25]: breast invasive cancer (BRCA), glioblastoma multiforme (GBM) and lung cancer (LUNG)

Summary

Introduction

Classifying the subtypes of cancer is of great significance for the in-depth study of cancer pathogenesis and the realization of personalized treatment for cancer patients. Objective and accurate subtype classification enables doctors to correctly understand the pathogenesis and primary location of a cancer, which is of great significance to the study of [...]. As high-throughput sequencing technologies have evolved, many types of molecular biology data have accumulated rapidly, although the growth is mainly in the feature dimension of the data rather than in the number of samples. Traditional analysis methods struggle to meet the requirements of such high-dimensional data, so the use of bioinformatics to process genetic information at the molecular level has received increasing attention in recent years [6, 7]. Given the comprehensiveness of gene expression data and its high correlation with cancer, it is highly feasible to use gene expression data to develop classification models for cancer subtypes [8].
