Data Mining Perspective Research Articles

Data mining is an analytical approach that helps solve many problems by extracting previously unknown, interesting, nontrivial, and potentially valuable information from massive datasets. Clustering in data mining splits or segments data items (points) into meaningful groups by placing together the items that are near each other according to some statistic. This paper covers various elements of clustering, such as algorithmic methodologies, applications, clustering assessment measures, and researcher-proposed enhancements together with their impact on data mining, to give a thorough grasp of clustering algorithms, their applications, and the advances achieved in the existing literature. The study includes a literature search for papers published between 1995 and 2023, covering conference and journal publications. It begins by outlining fundamental clustering techniques along with algorithm improvements, emphasizing their advantages and limitations in comparison to other clustering algorithms. It then investigates evaluation measures for clustering algorithms, with an emphasis on metrics used to gauge clustering quality, such as the F-measure and the Rand Index. It also addresses numerous methodologies proposed to increase the convergence speed, resilience, and accuracy of clustering, such as initialization procedures, distance measures, and optimization strategies. The work concludes by emphasizing clustering as an active research area driven by the need to identify significant patterns and structures in data, enhance knowledge acquisition, and improve decision making across different domains. This study aims to contribute to the broader knowledge base of data mining practitioners and researchers, facilitating informed decision making and fostering advancements in the field through a thorough analysis of algorithmic enhancements, clustering assessment metrics, and optimization strategies.
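The abstract names the Rand Index and the F-measure as clustering quality metrics. As a minimal sketch, assuming the common pair-counting definitions (the abstract does not say which variants the paper uses), both can be computed by comparing every pair of items under a reference labeling and a predicted clustering. The function names and the toy labels below are illustrative and not taken from the paper.

```python
from itertools import combinations


def pair_counts(labels_true, labels_pred):
    """Count item pairs by whether each labeling puts the pair together.

    Returns (tp, fp, fn, tn):
      tp: same group in both labelings
      fp: same group in the prediction only
      fn: same group in the reference only
      tn: different groups in both labelings
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    tp = fp = fn = tn = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_true and same_pred:
            tp += 1
        elif same_pred:
            fp += 1
        elif same_true:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def rand_index(labels_true, labels_pred):
    """Fraction of item pairs on which the two labelings agree."""
    tp, fp, fn, tn = pair_counts(labels_true, labels_pred)
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 1.0


def pairwise_f_measure(labels_true, labels_pred):
    """Harmonic mean of pairwise precision and recall (F1 over pairs)."""
    tp, fp, fn, _ = pair_counts(labels_true, labels_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


# Toy example (hypothetical labels): six items, three per reference group.
truth = [0, 0, 0, 1, 1, 1]
pred = [0, 0, 1, 1, 2, 2]
print(pair_counts(truth, pred))
print(round(rand_index(truth, pred), 4))
print(round(pairwise_f_measure(truth, pred), 4))
```

Both measures depend only on which items share a group, so relabeling the clusters (for example swapping 0 and 1) leaves the scores unchanged. The loop is quadratic in the number of items; for large datasets a contingency-table formulation would be the usual choice.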


Background: Identifying immunotherapy-related markers is important for screening target populations suitable for immunotherapy. Prognosis-related genes in early-stage lung cancer may also affect the tumor immune microenvironment, which in turn affects immunotherapy.

Results: We analyzed the differential genes affecting lung cancer patients receiving immunotherapy through the Cancer Treatment Response gene signature DataBase (CTR-DB), and set a threshold to obtain a total of 176 differential genes between response and non-response to immunotherapy. Functional enrichment analysis found that these differential genes were mainly involved in immune regulation-related pathways. An early-stage lung adenocarcinoma (LUAD) prognostic model was constructed from The Cancer Genome Atlas (TCGA) database, and three target genes (MMP12, NFE2, HOXC8) were screened to calculate the risk score of early-stage LUAD. The receiver operating characteristic (ROC) curve indicated that the model had good prognostic value, validation on datasets from the Gene Expression Omnibus (GEO) (GSE50081, GSE11969 and GSE42127) indicated that the model had good stability, and the risk score was correlated with immune infiltration to varying degrees. Multi-type survival analysis and immune infiltration analysis revealed that the transcriptome, methylation and copy number variation (CNV) levels of the three genes were correlated with patient prognosis and some tumor microenvironment (TME) components. Drug sensitivity analysis found that the three genes may affect sensitivity to some anti-tumor drugs. The mRNA expression of immune checkpoint-related genes differed significantly between the high and low groups of the three genes, and there may be a mutual regulatory network between immune checkpoint-related genes and the target genes. Tumor immune dysfunction and exclusion (TIDE) analysis found that the three genes were associated with immunotherapy response and may be potential predictors of it, consistent with the CTR-DB analysis.

Conclusions: From the perspective of data mining, this study suggests that MMP12, NFE2, and HOXC8 may be involved in tumor immune regulation and affect immunotherapy. They are expected to become markers of immunotherapy and are worthy of further experimental research.



Articles published on Data Mining Perspective

Recovering nested structures in networks: an evaluation of hierarchical clustering techniques

Hexagonal Boron Nitride Quantum Simulator: Prelude to Spin and Photonic Qubits.

Investigating explainable transfer learning for battery lifetime prediction under state transitions

Big data application and corporate investment decisions: Evidence from A-share listed companies in China

Voices in the digital storm: Unraveling online polarization with ChatGPT

Clustering Social Media Data for Marketing Strategies

New exploration of signal detection of Regional Risks from the perspective of data mining: a pharmacovigilance analysis based on spontaneous reporting data in Zhenjiang, China

Building information modelling-enabled multi-objective optimization for energy consumption parametric analysis in green buildings design using hybrid machine learning algorithms

Research on learning behavior patterns from the perspective of educational data mining: Evaluation, prediction and visualization

A Systematic Literature Review on Identifying Patterns Using Unsupervised Clustering Algorithms: A Data Mining Perspective

Attribution and Obfuscation of Neural Text Authorship: A Data Mining Perspective

Fake News Prediction

A Review on Fake News Detection using Machine Learning

Genomic and immunogenomic analysis of three prognostic signature genes in LUAD

Multi-objective optimal allocation of distributed generation considering the spatiotemporal correlation of wind-photovoltaic-load

A Data-Driven Smart Evaluation Framework for Teaching Effect Based on Fuzzy Comprehensive Analysis

Investigating the Material Properties of Nodular Cast Iron from a Data Mining Perspective

Multivariate Statistical Analysis of Quality Improvement Effect of Innovation and Entrepreneurship Education Based on Random Matrix Theory

Data mining in predictive maintenance systems: A taxonomy and systematic review

The Numeric Characteristics of Chinese A-Share Market Index Volatility, Model Simulation, and Forecasting

