Abstract

In this research, we develop ordinal decision-tree-based ensemble approaches in which an objective-based information gain measure is used to select the classifying attributes. We demonstrate the applicability of the approaches using AdaBoost and random forest algorithms for the task of classifying the regional daily growth factor of the spread of an epidemic based on a variety of explanatory factors. In such an application, some of the potential classification errors could have critical consequences. The classification tool will enable the spread of the epidemic to be tracked and controlled by yielding insights regarding the relationship between local containment measures and the daily growth factor. In order to benefit maximally from a variety of ordinal and non-ordinal algorithms, we also propose an ensemble majority voting approach to combine different algorithms into one model, thereby leveraging the strengths of each algorithm. We perform experiments in which the task is to classify the daily COVID-19 growth rate factor based on environmental factors and containment measures for 19 regions of Italy. We demonstrate that the ordinal algorithms outperform their non-ordinal counterparts with improvements in the range of 6–25% for a variety of common performance indices. The majority voting approach that combines ordinal and non-ordinal models yields a further improvement of between 3% and 10%.
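The majority voting idea mentioned above can be illustrated with a minimal sketch. It assumes each model has already produced one integer class label per sample; the function name `majority_vote`, the example prediction lists, and the tie-breaking rule (lowest label wins) are hypothetical choices for illustration and are not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one label per sample.

    predictions: list of lists, one inner list per model, all the same length.
    Ties are broken by taking the smallest tied label (an arbitrary assumption).
    """
    n_samples = len(predictions[0])
    combined = []
    for i in range(n_samples):
        # Collect the label each model assigned to sample i.
        votes = [model_preds[i] for model_preds in predictions]
        counts = Counter(votes)
        top = max(counts.values())
        winners = [label for label, c in counts.items() if c == top]
        combined.append(min(winners))
    return combined


# Hypothetical predictions from three models for four samples.
ordinal_forest = [0, 1, 2, 2]
ordinal_boost = [0, 1, 1, 2]
plain_cart = [1, 1, 2, 0]

combined = majority_vote([ordinal_forest, ordinal_boost, plain_cart])
print(combined)
```

With ordinal classes, other tie-breaking rules (for example, the median of the tied labels) may be more appropriate; the choice here only keeps the sketch short.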

Highlights

  • The main objectives of this study are fourfold: (i) to extend the weighted information gain measure such that the classification error can be measured from a statistical value that is not necessarily defined by a single class; (ii) to develop ordinal decision-tree-based ensemble approaches in which an objective-based information gain measure is used; (iii) to examine the advantage of combining ordinal decision-tree-based ensemble approaches with non-ordinal individual classifiers to leverage the strengths of each type of classifier; and (iv) to examine the ability to carry out multi-class identification of different levels of a daily growth factor using ordinal decision-tree-based ensemble approaches

  • This subsection compares the performance of the objective-based information gain (OBIG)-based ordinal CART, i.e., a single decision tree, with the popular non-ordinal CART

  • In this research we suggest an extension to the objective-based information gain (OBIG) measure that was proposed in [27,28] for selecting the attributes with the greatest explanatory value in a classification problem


Summary

Introduction

Mathematical modeling is widely used to predict the transmissibility and dynamic spread of an epidemic, while statistical analysis is often used to evaluate the effect of a variety of variables on epidemic transmission. The most commonly used mathematical models are those based on SIR/SEIR (susceptible, (exposed), infectious, and removed) differential equations [1,2,3,4]. These studies usually assume that data are available on the number of susceptible individuals and on the numbers of infections, deaths, and recoveries. Mecenas et al. [8], for example, described 17 recent studies that used these techniques to investigate the effect of weather variables on the spread of COVID-19 and SARS.
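For readers unfamiliar with the SIR formulation, the following is a minimal sketch of the standard SIR equations integrated with a simple forward Euler step. It is not drawn from the cited studies; the function name `simulate_sir`, the rates `beta` and `gamma`, the step size, and the initial population values are all hypothetical and chosen only for illustration.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Integrate the standard SIR model with forward Euler steps.

    dS/dt = -beta * S * I / N
    dI/dt =  beta * S * I / N - gamma * I
    dR/dt =  gamma * I

    Returns a list of (S, I, R) tuples, one per day, including day 0.
    """
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    steps_per_day = round(1 / dt)
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt
            new_removals = gamma * i * dt
            s -= new_infections
            i += new_infections - new_removals
            r += new_removals
        history.append((s, i, r))
    return history


# Hypothetical parameters: 1000 people, 10 initially infectious.
history = simulate_sir(beta=0.3, gamma=0.1, s0=990, i0=10, r0=0, days=60)
peak_day = max(range(len(history)), key=lambda d: history[d][1])
print("day of peak infections:", peak_day)
```

Forward Euler is used here only for brevity; a dedicated ODE solver would usually be preferred for quantitative work.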

