Abstract

Background/Objectives: Owing to the continuous growth of electronic records and recent advances in machine learning, various automated disease diagnosis tools have been developed and proposed in the healthcare sector. In the present study, an ensemble methodology using voting and boosting techniques is proposed for optimal feature selection and prediction on infants' data from India.

Methods/Analysis: For feature selection, the best-first search algorithm of the wrapper technique is used in addition to voting-boosting. The proposed ensemble combines heterogeneous classifiers: Random Forest, J48, JRip, CART and Stochastic Gradient Descent (SGD). The effectiveness of the proposed ensemble and of the single classifiers is investigated in terms of classification accuracy, precision, F-measure, recall, Matthews correlation coefficient (MCC) and precision-recall curve (PRC) area using varied k-fold cross-validation.

Findings: The results show that the proposed Voting-Boosting ensemble (k=15) outperforms the individual classifiers using the selected features.

Applications/Improvements: The proposed Voting-Boosting ensemble can be extended with more state-of-the-art classification approaches and applied to other healthcare datasets to enhance performance.

Keywords: Machine learning; ensemble; feature selection; wrapper; voting and boosting
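To make the evaluation described in the abstract concrete, the following is a minimal sketch of how the predictions of several classifiers might be combined by voting and then scored with some of the listed metrics. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the voting rule, so simple majority (hard) voting is assumed here; the function names `majority_vote` and `binary_metrics` and the toy labels are hypothetical; boosting, wrapper feature selection, k-fold splitting and PRC area are not covered.

```python
from collections import Counter
from math import sqrt


def majority_vote(votes_per_model):
    """Combine per-model label lists into one label per sample (hard voting).

    Assumption: plain majority voting; the paper's actual rule is not stated.
    """
    combined = []
    for sample_votes in zip(*votes_per_model):
        # most_common(1) returns the label with the highest count.
        combined.append(Counter(sample_votes).most_common(1)[0][0])
    return combined


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, F-measure and MCC for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "mcc": mcc,
    }


# Toy labels, invented purely for illustration (not data from the study).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
model_a = [1, 1, 0, 0, 0, 1, 1, 0]
model_b = [1, 0, 1, 0, 1, 0, 1, 0]
model_c = [1, 1, 1, 0, 0, 1, 0, 0]

ensemble = majority_vote([model_a, model_b, model_c])
scores = binary_metrics(y_true, ensemble)
```

In this toy case the vote corrects most individual errors but not the sample where two of the three models are wrong, which is why an ensemble is usually compared against its members on the same folds rather than assumed to be better.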

Highlights

  • India is one of the fastest growing economies and the second most populous country in the world, but poor health outcomes among infants have drawn global attention to India's health profile

  • Accuracy, precision, recall, F-measure, Matthews correlation coefficient (MCC) and precision-recall curve (PRC) area were compared without feature selection and ensembling and with feature selection and ensembling at varied k-fold cross-validation; the comparison is shown as bar graphs in Figures 3, 4, 5, 6, 7 and 8

  • Large amounts of data are analyzed by data mining and machine learning techniques, and various new methodologies and automated systems have been developed


Summary

Introduction

Sourabh et al / Indian Journal of Science and Technology 2020;13(22):2189–2202

India bears a large share of the disease burden, within which the Infant Mortality Rate (IMR) is a major hurdle for the government. IMR is a standard measure of deaths of infants under one year of age per 1,000 live births [1]. [...] from 81/1000 live births in 1990 to 34/1000 in 2016, and there was a total of 1.08 million deaths of under-5 children in 2016 [2]. India contributed 500,000, i.e. one-third, of the global deaths annually; most of these are vaccine-preventable deaths [3], and nearly 60 million children are malnourished every year [4]. Mortality rates are declining gradually worldwide, but it is critical to note that this reduction does not occur in the same way in all countries. The health of newborns is an important indicator in assessing the development of any society and a growing concern at the global level.
