Dictionary learning LASSO for feature selection with application to hepatocellular carcinoma grading using contrast enhanced magnetic resonance imaging.

Lei Lei, Zujun Hou, Cong Wang, Bao-Lin Ye, Ying-Long He, Jian-Peng Yuan, Pan Wang, Li-Xin Du

doi:10.3389/fonc.2023.1123493

Abstract

The successful use of machine learning (ML) for medical diagnosis has prompted many applications in cancer image analysis. For hepatocellular carcinoma (HCC) grading in particular, there has been a surge of interest in ML-based selection of discriminative features from high-dimensional magnetic resonance imaging (MRI) radiomics data. The least absolute shrinkage and selection operator (LASSO), one of the most commonly used ML-based selection methods, identifies essential features through a linear representation between input features and output labels, which gives it high discriminative power. However, most LASSO methods operate directly on the original training data rather than exploiting the most informative features of the radiomics data for HCC grading. To overcome this limitation, this study makes a first attempt at a feature selection method that combines LASSO with dictionary learning: a dictionary is learned from the training features, using the Fisher ratio to maximize the discriminative information they carry, so that feature selection is both accurate and discriminative. Specifically, based on its Fisher ratio score, each radiomic feature is assigned to one of two groups: a high-information group or a low-information group. A dictionary is then learned through an optimal mapping matrix that enhances the high-information part and suppresses the low-information part for the task of HCC grading. Finally, the most discriminative features are selected according to the LASSO coefficients computed on the learned dictionary. Experimental results with two classifiers, k-nearest neighbors (KNN) and support vector machine (SVM), showed that the proposed method yielded accuracy gains and compared favorably with five other state-of-the-practice feature selection methods.
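The grouping step described in the abstract can be illustrated with a minimal sketch. It assumes the common two-class form of the Fisher ratio, (mu0 - mu1)^2 / (var0 + var1); the abstract does not state the exact definition or the split rule used in the paper, so the function names, the `threshold` parameter, and the toy data below are all hypothetical. The dictionary learning and LASSO steps are not shown.

```python
from statistics import mean, pvariance


def fisher_ratio(values, labels):
    """Two-class Fisher ratio of one feature (assumed form):
    squared difference of class means over the sum of class variances."""
    class0 = [v for v, y in zip(values, labels) if y == 0]
    class1 = [v for v, y in zip(values, labels) if y == 1]
    between = (mean(class0) - mean(class1)) ** 2
    within = pvariance(class0) + pvariance(class1)
    # A feature with zero within-class variance separates perfectly
    # whenever the class means differ.
    return between / within if within > 0 else float("inf")


def split_by_information(X, labels, threshold):
    """Return (scores, high, low): per-feature Fisher ratios and the
    indices of features at/above and below the (hypothetical) threshold."""
    n_features = len(X[0])
    scores = [
        fisher_ratio([row[j] for row in X], labels) for j in range(n_features)
    ]
    high = [j for j, s in enumerate(scores) if s >= threshold]
    low = [j for j, s in enumerate(scores) if s < threshold]
    return scores, high, low


# Toy data: 6 samples, 2 features. Feature 0 separates the classes,
# feature 1 has identical class means and carries no class information.
X = [
    [1.0, 5.0],
    [1.2, 3.0],
    [0.8, 4.0],
    [3.0, 4.5],
    [3.2, 3.5],
    [2.8, 4.0],
]
labels = [0, 0, 0, 1, 1, 1]

scores, high, low = split_by_information(X, labels, threshold=1.0)
print("Fisher ratios:", scores)
print("high-information features:", high)
print("low-information features:", low)
```

On this toy data, feature 0 lands in the high-information group and feature 1 in the low-information group. In practice the threshold would need to be chosen from the data, for example as a quantile of the scores; that choice is an assumption here, not something the abstract specifies.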

Full Text