Abstract

A feature selection technique should, in theory, reliably extract pertinent features, identify non-linear feature interactions, scale linearly with the number of features and dimensions, and permit the integration of known sparsity structure. Identifying the machine learning (ML) algorithm that performs best across varied data distributions can be challenging, because not all ML algorithms are created equal, even though many are well suited to a given task. The heterogeneous ensemble feature selection (HETR-EFS) technique learns to combine the feature subsets provided by base feature selectors in an ensemble. Similarly, the stacked ensemble (SE) technique learns how best to combine base classifier models to form a strong model. This study sought to identify the combination of SE classification model and EFS technique that fits best as a prognostic model for classifying head and neck squamous cell carcinoma (HNSCC) recurrence patterns when the same ML classifiers are used for both EFS and SE learning. Four SE classification models were developed on feature subsets from various EFS techniques, each using a gradient boosting machine (GBM) meta-classifier. The first used two base classifiers: GBM and distributed random forest (DRF). The second used three: GBM, DRF, and deep neural network (DNN). The third used four: GBM, DRF, DNN, and generalized linear model (GLM). The fourth used five: GBM, DRF, DNN, GLM, and naïve Bayes (NB). The results showed that the SE technique with five base classifiers performed better on the heterogeneous ensemble feature (HETR-EF) subset than on the other EF subsets, and that this combination also outperformed the other SE techniques, whether those were applied to HETR-EFs or to the other feature subsets. Thus, learning the SE technique with five base classifiers on HETR-EFs is clinically appropriate as a prognostic model for classifying and predicting HNSCC patients’ recurrence data.
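
The idea of combining feature subsets from several base selectors can be illustrated with a minimal sketch. The abstract does not state how HETR-EFS aggregates the subsets, so the vote-count rule below is an assumption, and the function name `combine_feature_subsets`, the `min_votes` threshold, and the feature names are all hypothetical.

```python
from collections import Counter


def combine_feature_subsets(subsets, min_votes):
    """Keep features selected by at least `min_votes` base selectors.

    Assumed aggregation rule (simple vote counting); the paper's actual
    HETR-EFS combination step may differ.
    """
    # Count each feature once per selector, even if a selector lists it twice.
    votes = Counter(f for subset in subsets for f in set(subset))
    # Rank by descending vote count, then by name for a deterministic order.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [feature for feature, n in ranked if n >= min_votes]


# Hypothetical feature subsets, one per base selector (illustrative only).
subsets = {
    "GBM": ["age", "tumor_stage", "hpv_status"],
    "DRF": ["tumor_stage", "hpv_status", "smoking"],
    "GLM": ["age", "tumor_stage"],
}

selected = combine_feature_subsets(subsets.values(), min_votes=2)
print(selected)
```

Raising `min_votes` yields a smaller subset on which more selectors agree, while lowering it approaches the union of all subsets. The resulting list would then be the input feature set for the stacked classifiers.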
