Abstract

This paper proposes a novel algorithm for optimizing decision variables with respect to an outcome variable of interest in complex problems, such as those arising from big data. The proposed algorithm builds on the notion of Markov blankets in Bayesian networks to alleviate the computational challenges associated with optimization tasks in complex datasets. Through a case study, we apply the algorithm to optimize medication prescriptions for diabetic patients, who have different characteristics, suffer from multiple comorbidities, and take multiple medications concurrently. In particular, we demonstrate how the optimal combination of diabetic medications can be found by examining the comparative effectiveness of the medications among similar patients. The case study is based on 5 years of data for 19,223 diabetic patients. Our results indicate that certain patient characteristics (e.g., clinical and demographic features) influence optimal treatment decisions. Among patients examined, monotherapy with metformin was the most common optimal medication decision. The results are consistent with the relevant clinical guidelines and reports in the medical literature. The proposed algorithm obviates the need for knowledge of the whole Bayesian network model, which can be very complex in big data problems. The procedure can be applied to any complex Bayesian network with numerous features, multiple decision variables, and a target variable.

Highlights

  • MotivationComplex datasets arise in many disciplines, including health care, business, and engineering

  • Bayesian network (BN) model The learning procedure resulted in a BN with a total precision of 88.75% and area under receiver operating characteristic curve (AUC) of 71.15%, which together suggest an acceptable model fit

  • Concluding remarks This paper demonstrates how a BN could help in optimal decision-making when facing complex problems such as those arising from big data

Read more

Summary

Introduction

MotivationComplex datasets arise in many disciplines, including health care, business, and engineering. In problems involving data-driven optimization based on such complex datasets, it is difficult to characterize optimality. The difficulty results from the many variables in the data (including nonmodifiable features as well as modifiable decision variables) and the numerous sources that generate the data at high speed. The complexity of a dataset might affect the optimization task since relevant features might fail to be Hosseini et al J Big Data (2020) 7:26 matched, and considerations of all combinations of the decision variables might yield a computationally intractable problem. By leveraging machine learning (ML) and big data, we propose a new algorithm that identifies the optimal decision variables that affect the target variable while matching the relevant features. It is quite challenging to identify the optimal combination of medications for these patients

Methods
Results
Conclusion
Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call