Abstract
The use of machine learning (ML) to generate insights and make predictions on new records continues to expand within the medical community. Despite this progress, time series analysis has remained underexplored because of the complexity of the underlying techniques. In this study, we deployed a novel ML approach, called automated time series (AutoTS) machine learning, which automates data processing and applies a multitude of models to assess which one best forecasts future values. This rapid experimentation enables the selection of the most accurate model for time series prediction. Using the nationwide ICD-10 (International Classification of Diseases, Tenth Revision) dataset of hospitalized patients in Romania, we generated time series datasets for the period 2008–2018 and performed highly accurate AutoTS predictions for the ten deadliest diseases. Forecast results for the years 2019 and 2020 were generated at the NUTS 2 (Nomenclature of Territorial Units for Statistics) regional level. To our knowledge, this is the first study to perform time series forecasting of multiple diseases at a regional level using automated time series machine learning on a national ICD-10 dataset. The deployment of AutoTS technology can help decision makers implement targeted national health policies more efficiently.
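The core idea of fitting several candidate models and keeping the one that forecasts best can be illustrated with a minimal sketch. This is not the AutoTS system used in the study: the three baseline forecasters, the mean absolute error metric, the 12-step holdout, and all function names below are assumptions chosen only to show the selection step.

```python
from statistics import mean

def naive(train, horizon):
    """Repeat the last observed value."""
    return [train[-1]] * horizon

def seasonal_naive(train, horizon, season=12):
    """Repeat the value observed one season earlier (needs len(train) >= season)."""
    return [train[-season + (i % season)] for i in range(horizon)]

def drift(train, horizon):
    """Extend the straight line through the first and last observations."""
    slope = (train[-1] - train[0]) / (len(train) - 1)
    return [train[-1] + slope * (i + 1) for i in range(horizon)]

def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))

def select_best_model(series, horizon=12):
    """Hold out the last `horizon` points and score each candidate on them."""
    train, holdout = series[:-horizon], series[-horizon:]
    candidates = {"naive": naive, "seasonal_naive": seasonal_naive, "drift": drift}
    scores = {name: mae(holdout, model(train, horizon))
              for name, model in candidates.items()}
    best = min(scores, key=scores.get)  # lowest holdout error wins
    return best, scores

# Synthetic monthly counts with a repeating yearly pattern (made-up data).
pattern = [5, 3, 2, 1, 1, 2, 4, 6, 8, 9, 7, 6]
example = [100 + pattern[m % 12] for m in range(48)]
best_name, holdout_scores = select_best_model(example)
```

A production system would search a far larger model space and typically use several validation windows rather than a single holdout.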
Highlights
Accurate disease forecasts can help medical organizations take countermeasures and advance the preparedness of hospitals and the general population.
To perform time series forecasting, a series of data points in time order had to be prepared for each of the top 10 deadliest diseases, as defined by the WHO [41]. For this purpose, the corresponding ICD-10 codes for ischemic heart diseases, stroke, chronic obstructive pulmonary disease, lower respiratory infections, Alzheimer’s disease, lung cancer, diabetes mellitus, road injuries, diarrheal diseases, and tuberculosis (Table S1) were extracted from the complete ICD-10 dataset of hospitalized patients in Romania for the period 2008–2018.
Compared with the current literature, this is the first study to apply automated machine learning to time series forecasting on a nationwide ICD-10 database, forecasting multiple diseases at a regional level and using AutoML to select the most accurate of a multitude of models (Table S5).
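The data preparation described in the highlights, filtering hospitalization records by ICD-10 code and arranging them in time order per disease and region, might look roughly like the sketch below. The record layout (discharge date, ICD-10 code, NUTS 2 region), the monthly granularity, and the function names are assumptions. The code groups shown are standard ICD-10 categories for three of the ten diseases and serve only as an illustration; the mapping actually used in the study is the one given in its Table S1.

```python
from collections import Counter
from datetime import date

# Illustrative three-character ICD-10 categories (not the study's Table S1).
DISEASE_CODES = {
    "ischemic heart diseases": {"I20", "I21", "I22", "I23", "I24", "I25"},
    "diabetes mellitus": {"E10", "E11", "E12", "E13", "E14"},
    "tuberculosis": {"A15", "A16", "A17", "A18", "A19"},
}

def disease_for(icd10_code):
    """Map a full ICD-10 code such as 'I21.0' to a disease group, or None."""
    category = icd10_code[:3].upper()
    for disease, categories in DISEASE_CODES.items():
        if category in categories:
            return disease
    return None

def monthly_series(records, first_year, last_year):
    """Count cases per (disease, region) for every month in the period.

    `records` is an iterable of (discharge_date, icd10_code, region) tuples,
    a hypothetical layout chosen for this sketch.
    """
    counts = Counter()
    for discharge_date, icd10_code, region in records:
        disease = disease_for(icd10_code)
        if disease is not None and first_year <= discharge_date.year <= last_year:
            counts[(disease, region, discharge_date.year, discharge_date.month)] += 1
    months = [(y, m) for y in range(first_year, last_year + 1) for m in range(1, 13)]
    keys = {(d, r) for d, r, _, _ in counts}
    # Months without cases are filled with 0 so every series has the same length.
    return {key: [counts[(*key, y, m)] for y, m in months] for key in keys}

# Made-up records for two Romanian NUTS 2 regions.
sample_records = [
    (date(2008, 1, 5), "I21.0", "RO11"),
    (date(2008, 1, 20), "I25.1", "RO11"),
    (date(2008, 3, 2), "E11.9", "RO32"),
    (date(2008, 2, 1), "J18.9", "RO11"),  # not in the illustrative mapping
]
sample_series = monthly_series(sample_records, 2008, 2008)
```

Each resulting list is one time series in calendar order, ready to be passed to whichever forecasting models are being compared.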
Summary
Accurate disease forecasts can help medical organizations take countermeasures and advance the preparedness of hospitals and the general population. Deep learning (DL), a subset of ML, has been deployed extensively in recent years owing to increasing computer processing power and the availability of big data sets [2,3]. DL algorithms can perform highly complex computational analysis of massive amounts of labeled and unlabeled raw data [4]. While such DL applications are already widely used as diagnostic tools in disease prediction [5,6,7] and in clinical [8,9] or pathological image analysis [10,11], the current literature describes only limited ML deployment for time series forecasting [12].
More From: International Journal of Environmental Research and Public Health