Dimensional Data Research Articles

Big data has emerged in recent years as a pivotal asset in addressing oral health disparities. The term encompasses the vast pool of health care-related biomedical information drawn from diverse sources, such as claims data, patient registries, and electronic health records (EHRs). This critical review synthesizes the evidence, identifies gaps in knowledge, and discusses future implications of big data analytics for oral health disparities. Reports published in English from 2014 to 2023 and available in electronic databases were included if they studied associations between big data, social determinants of oral health, and oral health disparities. The databases searched were MEDLINE via PubMed, Google Scholar, and Web of Science. A total of 23 studies were included, all of them retrospective data analyses. The studies used a variety of big data sources, including EHRs, claims, and national or regional registries. The included studies were critically appraised using a framework of data quality dimensions with intrinsic values (data attributes) and contextual values (the information provided by the data, in this case on oral health disparities). Big data revealed disparities in oral health outcomes and dental care utilization by race, ethnicity, socioeconomic status, geographical location, insurance category, access to care, and other barriers to care. On the intrinsic dimension, none of the studies addressed or reported data missingness or data consistency. The contextual dimensions were clearly provided. From a value-added perspective, several studies provided novel information on racial oral health inequities, and several used more than one oral health disparities variable or a composite variable.
However, the conclusions of several studies rested on association-based analytics, and few studies used artificial intelligence approaches to understand population oral health inequities. Gaps were seen in study designs and causal analytics.
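
The review notes that none of the included studies reported data missingness. As a rough illustration of what such a report could look like, the sketch below computes the fraction of records in which each field is absent or blank. The field names (`race`, `insurance`, `zip_code`), the sample records, and the set of values treated as missing are all hypothetical; none of them come from the reviewed studies.

```python
def missingness_report(records, fields, missing_values=(None, "", "NA")):
    """Return, for each field, the fraction of records where it is missing.

    A field counts as missing when the key is absent or its value equals
    one of `missing_values` (an assumed convention, not a standard).
    """
    total = len(records)
    if total == 0:
        return {field: 0.0 for field in fields}
    report = {}
    for field in fields:
        missing = sum(1 for rec in records if rec.get(field) in missing_values)
        report[field] = missing / total
    return report


if __name__ == "__main__":
    # Hypothetical claims-style records, for illustration only.
    sample = [
        {"race": "A", "insurance": "public", "zip_code": "10001"},
        {"race": "", "insurance": "private", "zip_code": "10002"},
        {"race": "B", "insurance": None},
        {"race": "NA", "insurance": "public", "zip_code": "10004"},
    ]
    for field, rate in missingness_report(sample, ["race", "insurance", "zip_code"]).items():
        print(f"{field}: {rate:.0%} missing")
```

Reporting these rates alongside the disparity variables would let a reader judge whether an observed gap could be an artifact of uneven data capture.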

The Endoscopic Third Ventriculostomy Success Score (ETVSS) is a useful decision-making heuristic for estimating the probability of surgical success, traditionally defined as no repeat cerebrospinal fluid diversion surgery within 6 months. Nonetheless, the performance of the logistic regression (LR) model in the original 2009 study was modest, with an area under the receiver operating characteristic curve (AUROC) of 0.68. The authors sought to use a larger dataset to develop more accurate machine learning (ML) models for predicting endoscopic third ventriculostomy (ETV) success and to perform the largest validation of the ETVSS to date. The authors queried the MarketScan national database for the years 2005-2022 to identify patients younger than 18 years who underwent first-time ETV and subsequently had at least 6 months of continuous enrollment in the database. They collected data on predictors matching the original ETVSS: age, etiology of hydrocephalus, and history of any previous shunt placement. Next, they used 6 ML algorithms (LR, support vector classifier, random forest, k-nearest neighbors, extreme gradient boosting [XGBoost], and naive Bayes) to develop predictive models. Finally, they used nested cross-validation to compare the models' performance on unseen data. The authors identified 2047 patients who met the inclusion criteria, of whom 1261 (61.6%) underwent successful ETV. Most ML models performed similarly to the original ETVSS, which had an AUROC of 0.693 on the validation set and 0.661 (95% CI 0.600-0.722) on the test set. The authors' new LR model performed comparably, with an AUROC of 0.693 on both the validation and test sets (95% CI 0.633-0.754 on the test set). Among the more complex ML algorithms, XGBoost performed best, with AUROCs of 0.683 and 0.672 (95% CI 0.609-0.734) on the validation and test sets, respectively.
This is the largest external validation of the ETVSS, and it confirms the score's modest performance. More sophisticated ML algorithms do not meaningfully improve predictive performance over the ETVSS, which underscores the need for input data of higher utility, novelty, and dimensionality rather than changes in modeling strategy.
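
The results above are reported as AUROC values with 95% confidence intervals. The abstract does not say how those intervals were computed, so the sketch below is only one plausible approach: it computes AUROC as the probability that a randomly chosen successful case receives a higher predicted score than a randomly chosen failure, and estimates an interval with a percentile bootstrap (an assumption, not the authors' stated method). The function names and the toy labels and scores are hypothetical.

```python
import random


def auroc(labels, scores):
    """AUROC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUROC (assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contained a single class; AUROC is undefined
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    # Toy data: 1 = successful ETV, 0 = failure; scores are predicted probabilities.
    y = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
    p = [0.9, 0.8, 0.7, 0.65, 0.6, 0.5, 0.4, 0.35, 0.3, 0.2]
    print("AUROC:", auroc(y, p))
    print("95% CI:", bootstrap_ci(y, p))
```

The pairwise loop is quadratic in the number of cases, which is fine for a sketch; a rank-based formulation would be preferable for a cohort the size of the one described.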

Articles published on Dimensional Data

Least angle sparse principal component analysis for ultrahigh dimensional data

MinLinMo: a minimalist approach to variable selection and linear model prediction

A study on identifying representative trips for mobility service design

Quantum Distance Approximation for Persistence Diagrams

Intelligent aerodynamic modelling method for steady/unsteady flow fields of airfoils driven by flow field images based on modified U-Net neural network

A Storm Frame Optimization Method for Predicting and Warning the Safety Status of a Shearer

Research on a Passenger Flow Prediction Model Based on BWO-TCLS-Self-Attention

Optimal Multitask Linear Regression and Contextual Bandits under Sparse Heterogeneity

Automatic extraction of fine structural information in angle-resolved photoemission spectroscopy by multi-stage clustering algorithm

Image Processing Technique for Enhanced Combustion Efficiency of Wood Pellets

Big Data and Oral Health Disparities: A Critical Appraisal

Classification model for blast furnace status based on multi-source information

Data-driven approach for the classification of gas turbine faults

Revisiting the Endoscopic Third Ventriculostomy Success Score using machine learning: can we do better?

Robust Principal Component Analysis Based on Fuzzy Local Information Reservation

HMS-TENet: A hierarchical multi-scale topological enhanced network based on EEG and EOG for driver vigilance estimation

Evaluation of grounding grid corrosion extent based on laser-induced breakdown spectroscopy (LIBS) combined with machine learning

Another pipeline in local Partial Least Squares Regression (LPLS) methods: Assessing the impact of wavelet transform integration

Sensor attack online classification for UAVs using machine learning

Essential Number of Principal Components and Nearly Training-Free Model for Spectral Analysis
