Abstract

Artificial intelligence and its related machine learning technologies are constantly changing how organisations manage their business data in a dynamic environment of ubiquitous data sources and formats. Most organisations face the challenge of selecting appropriate machine learning models to extract insights from their existing business data, which may be unstructured and may vary in form, type, and size. Logistic regression, random forest, and decision tree were the three machine learning models selected for this paper’s preliminary experiments, which predict the likelihood of passengers surviving the Titanic disaster. Our investigation revealed that specific models are required to handle specific dataset types, in this case categorical datasets. The findings suggest that a logistic regression model can be recommended for use on a categorical dataset, based on its speed and the high prediction performance it obtained on the classification error metrics and confusion matrix. The selected models form part of a set of models currently being explored for the construction of hybrid machine learning models, which is beyond the scope of this paper.
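As an illustration of the evaluation criteria named above, the following minimal sketch shows how a binary confusion matrix and the usual classification metrics could be computed for a survival classifier. It is not the paper's code: the function names and the toy label lists are hypothetical, and the labels are made-up values rather than Titanic data or reported results.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred):
    """Count the four outcomes of a binary classifier (1 = survived, 0 = did not)."""
    counts = Counter(zip(y_true, y_pred))
    return {
        "tp": counts[(1, 1)],  # predicted survived, actually survived
        "fn": counts[(1, 0)],  # predicted not survived, actually survived
        "fp": counts[(0, 1)],  # predicted survived, actually did not
        "tn": counts[(0, 0)],  # predicted not survived, actually did not
    }


def classification_metrics(cm):
    """Derive accuracy, error rate, precision, recall and F1 from the counts."""
    tp, fn, fp, tn = cm["tp"], cm["fn"], cm["fp"], cm["tn"]
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels for illustration only (not taken from the paper).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

cm = confusion_matrix(y_true, y_pred)
metrics = classification_metrics(cm)
print(cm)
print(metrics)
```

Running the same two functions on the predictions of each candidate model, together with a timing of its training step, would give the kind of side-by-side comparison the abstract describes.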
