Optimality Assessments of Classifiers on Single and Multi-labelled Obstetrics Outcome Classification Problems

Udoinyang G Inyang, Chuwkudi O, Funebi F, Samuel A, Ifiok J

doi:10.14569/ijacsa.2021.0120260

Abstract

Clinicians cannot state the outcome of a pregnancy exactly through conventional knowledge and methods, even as human knowledge continues to grow. Hence, several computational techniques have been adapted for precise pregnancy outcome (PO) prediction. Obstetric datasets for PO determination exist as single-label learning (SLL), multi-label learning (MLL) and multi-target prediction (MTP) problems. There is, however, no single classifier recommended as optimal for all three classification types. This work therefore identifies six widely used PO classifiers and investigates their performance in all three categories to find the best-performing classifier. An obstetric dataset, subjected to input rank analysis via Principal Component Analysis, yielded thirteen (13) significant features for the experiment. Accuracy, F1-measure and build/test time were used as evaluation metrics. Under the SLL configuration, decision tree (DT) had an average accuracy and F1 score of 89.23% and 88.23% respectively, with an average rank of 1.0. Under the MLL configuration, average accuracy (91.71%) and F1 score (94.28%) were highest for random forest (RF), which had an average test time rank of 1.0. Under MTP, DT had an average accuracy of 88.80% and an average F1 score of 71.13%, while the multi-layered perceptron (MLP) had the best time cost with an average rank value of 2.0. From the results, RF performs best in terms of accuracy and average rank value, while DT is the most efficient in terms of time cost. The comparative analysis of the global averages of the six base classifiers shows that RF is the best-performing algorithm, with an average accuracy of 87.3% across all three data setups in the study. MLP, on the other hand, had an unexpectedly high time cost, making it unsuitable for similar data classification tasks when time is the main criterion.
It is recommended that the classifier chosen be either RF or DT, depending on the application domain and whether time cost is a major consideration.
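
The abstract reports an "average rank" for each classifier alongside accuracy, F1 and time. The exact ranking procedure is not given here, so the following is only a minimal sketch of one plausible reading: rank the classifiers on each metric (1 = best, with time ranked in ascending order) and average the ranks. The function names, the classifier names `clf_a`, `clf_b`, `clf_c` and all numbers are hypothetical placeholders, not values from the study.

```python
from statistics import mean


def rank_scores(scores, higher_is_better=True):
    """Return {name: rank} with rank 1 = best; tied scores share the mean of their positions."""
    ordered = sorted(scores, key=scores.get, reverse=higher_is_better)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of classifiers tied with ordered[i].
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        shared = ((i + 1) + (j + 1)) / 2
        for name in ordered[i:j + 1]:
            ranks[name] = shared
        i = j + 1
    return ranks


def average_rank(metrics, lower_is_better=("test_time",)):
    """Average each classifier's per-metric rank; metrics in lower_is_better rank ascending."""
    per_metric = [
        rank_scores(values, higher_is_better=(metric not in lower_is_better))
        for metric, values in metrics.items()
    ]
    names = per_metric[0].keys()
    return {name: mean(r[name] for r in per_metric) for name in names}


# Placeholder values for illustration only; NOT results from the study.
toy_metrics = {
    "accuracy": {"clf_a": 0.90, "clf_b": 0.85, "clf_c": 0.80},
    "f1": {"clf_a": 0.88, "clf_b": 0.86, "clf_c": 0.70},
    "test_time": {"clf_a": 0.02, "clf_b": 0.05, "clf_c": 0.01},
}
avg = average_rank(toy_metrics)
best = min(avg, key=avg.get)  # classifier with the lowest (best) average rank
```

Under this reading, a classifier that is best on every metric obtains an average rank of exactly 1.0, which matches how the highlights describe DT in the SLL setup.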

Highlights

Machine Learning (ML), a fast-rising branch of artificial intelligence (AI), encompasses computer science, engineering, mathematical sciences, cognitive science and many more disciplines [1].
In the single label learning (SLL) setup, decision tree (DT) had the best accuracy, F1 score and test time, with an average rank of 1.0. It was followed by random forest (RF) in accuracy and SVM in F1 score, while the multi-layered perceptron (MLP) had the second-best time cost.
RF performed best, with the highest accuracy and F1 scores, followed by DT for accuracy and MLP for F1 measure.

Summary

Introduction

Machine Learning (ML), a fast-rising branch of artificial intelligence (AI), encompasses computer science, engineering, mathematical sciences, cognitive science and many more disciplines [1]. The advancement and wide application of ML are largely due to the availability of enormous data repositories and to its satisfactory and reliable performance in terms of accuracy and computational cost. It equips systems with the cognitive capability to understand the concepts of their environments by building models and functions, and to communicate their experiences through patterns. Classification is the most common and widely applied supervised ML (SML) approach. It aims to identify and assign a class membership to a new record from a set of predefined classes [4,6]. For example, in medical diagnosis, a laboratory test result might confirm the presence or absence of causative organisms in the tested patient’s sample, while the patient can concurrently suffer from more than two diseases.
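
The diagnosis example contrasts a single yes/no label with a patient who carries several labels at once. As a rough illustration of how multi-label targets can be encoded and scored, the sketch below turns label sets into binary indicator rows and computes a micro-averaged F1 and a subset (exact-match) accuracy. The class names and the toy records are hypothetical, and the choice of micro-averaging is an assumption; the text here does not say which averaging the study used.

```python
def to_indicator(label_sets, classes):
    """Encode each record's set of labels as a 0/1 row, one column per class."""
    return [[1 if c in labels else 0 for c in classes] for labels in label_sets]


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool true/false positives and false negatives over all cells."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t and p:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def subset_accuracy(y_true, y_pred):
    """Fraction of records whose whole label row is predicted exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical classes and records, for illustration only.
classes = ["condition_a", "condition_b", "condition_c"]
true_sets = [{"condition_a"}, {"condition_a", "condition_b"}, set()]
pred_sets = [{"condition_a"}, {"condition_a"}, {"condition_c"}]

y_true = to_indicator(true_sets, classes)
y_pred = to_indicator(pred_sets, classes)

f1 = micro_f1(y_true, y_pred)
acc = subset_accuracy(y_true, y_pred)
```

A single-label problem is the special case in which every row contains exactly one 1, so the same encoding covers both setups.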

Objectives

Methods

Results

Conclusion