Gain Ratio Feature Selection Research Articles

Currently, there is an increased need for employing machine learning (ML) and data mining in the healthcare system domain, applications of which play a pivotal role in providing beneficial knowledge to society by utilizing the available data. There are disease risk prediction models that either do not use feature selection or use traditional feature selection models—filter, wrapper, and embedded—which have limitations such as, not using ML, using single evaluation metric, and working on imbalanced datasets. Apart from this, there is a great scope for enhancement in the prediction performance. To address these issues, an advanced hybrid ensemble gain ratio feature selection (AHEG-FS) model is proposed, which consists of four major feature selection techniques: an ensemble feature selection, a gain ratio feature selection, a backward feature elimination, and an area under the curve (AUC)—an additional evaluation metric of the novel feature reduction along with accuracy, for robust feature selection aimed at effective disease risk prediction. The subsets of important and highly ranked features are obtained with the first two techniques. Then, the proposed model is aligned with nine ML algorithms. Additionally, in the third and fourth techniques of the proposed model, the AUCs are evaluated for the aforementioned ML algorithms and the backward feature elimination is applied to remove the redundant features, resulting in the acquisition of the best subsets of highly contributing features that produce the highest precision results. Thus, four benchmarked heart disease datasets—Cleveland, Hungarian, Statlog, and Switzerland of the University of California, Irvine ML repository—are used, and the results are encouraging. The highest AUC and accuracy of 99.00% and 95.47%, respectively, are achieved, with 46.15% of features reduced. A 6.18% higher accuracy than recent studies was achieved with convergent speed.

Read full abstract

11114 Background: Lipids play roles in membrane structure, energy storage, and signal transduction as well as lung cancer. Lipidomics, a new technology aims to measure all the lipids in a cell, has not been applied to diagnostic test development for a variety of cancer types. Here, we adopt lipidomics as a means to identify plasma lipid markers for the early detection of lung cancer and complement CT-based methods for lung cancer screening. Methods: Using mass spectrometry, we profiled 390 individual lipids in a training discovery cohort comprised of cohorts that were either at “high-risk” for lung cancer (n=22) and squamous cell carcinoma at early stages (n=22). Cases had a minimum of two years clinical follow-up and were matched in terms of race, sex, age and smoking status. Gain ratio feature selection and local weighted classification model were employed to find the best training classifier, which was further validated against an additional cohort, including high-risk individuals (n= 20) and squamous cell carcinoma patients (n=17). Results: In the training discovery stage, we found 20 distinct lipids that were significantly distributed between high-risk and cases of squamous cell carcinoma. We further defined a two lipid marker panel had a training accuracy at 95.5% sensitivity, 90.9% specificity and 95.2% AUC (Area under ROC curve). The validation accuracy against the additional cohort is 100.0% sensitivity, 90.0% specificity and 99.0% AUC (Table). The power for sample size we used in both discovery training and validation stages were over 90%. Conclusions: Using lipidomics we identified two lipid markers capable of discerning cases of squamous cell carcinoma from individuals at high risk for lung cancer, with a high sensitivity, specificity and accuracy. The markers maybe further developed as a quick, safe blood test for early diagnosis of squamous cell lung cancer and reduce unnecessary follow-up imaging or invasive procedures. [Table: see text]

Read full abstract

Gain Ratio Feature Selection Research Articles

Articles published on Gain Ratio Feature Selection

A novel credit scoring system in financial institutions using artificial intelligence technology

Performance Analysis of LVQ 1 Using Feature Selection Gain Ratio for Sex Classification in Forensic Anthropology

A comparative study of metaheuristic-based machine learning classifiers using non-parametric tests for the detection of COPD severity grade

Advanced hybrid ensemble gain ratio feature selection model using machine learning for enhanced disease risk prediction

Cloud Computing-Based Framework for Breast Cancer Diagnosis Using Extreme Learning Machine.

Stress Detection via Keyboard Typing Behaviors by Using Smartphone Sensors and Machine Learning Techniques.

Effects of Machine Learning Approach in Flow-Based Anomaly Detection on Software-Defined Networking

Stock daily return prediction using expanded features and feature selection

Wavelet-based energy features for diagnosis of melanoma from dermoscopic images

An Ensemble Model for Classification of Attacks with Feature Selection based on KDD99 and NSL-KDD Data Set

Ensemble Learning Model for Diabetes Classification

Two lipids based on lipidomics as novel biomarkers for early detection of squamous cell lung cancer.

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Gain Ratio Feature Selection Research Articles

Articles published on Gain Ratio Feature Selection

A novel credit scoring system in financial institutions using artificial intelligence technology

Performance Analysis of LVQ 1 Using Feature Selection Gain Ratio for Sex Classification in Forensic Anthropology

A comparative study of metaheuristic-based machine learning classifiers using non-parametric tests for the detection of COPD severity grade

Advanced hybrid ensemble gain ratio feature selection model using machine learning for enhanced disease risk prediction

Cloud Computing-Based Framework for Breast Cancer Diagnosis Using Extreme Learning Machine.

Stress Detection via Keyboard Typing Behaviors by Using Smartphone Sensors and Machine Learning Techniques.

Effects of Machine Learning Approach in Flow-Based Anomaly Detection on Software-Defined Networking

Stock daily return prediction using expanded features and feature selection

Wavelet-based energy features for diagnosis of melanoma from dermoscopic images

An Ensemble Model for Classification of Attacks with Feature Selection based on KDD99 and NSL-KDD Data Set

Ensemble Learning Model for Diabetes Classification

Two lipids based on lipidomics as novel biomarkers for early detection of squamous cell lung cancer.