Background
Patients who exceed their expected length of hospital stay impose costs on healthcare stakeholders: fewer beds are available for new patients, nosocomial infections increase, and outcomes for many patients worsen because of multimorbidity after hospitalization.

Objectives
This paper develops a technique for predicting Extended Length of Hospital Stay (ELOHS) at preadmission, and for identifying its risk factors, using hospital data.

Methods
A total of 91,468 patient records from a private acute teaching hospital were used to develop a machine learning algorithm relying on Recursive Feature Elimination with Cross-Validation and an Extra Tree Classifier (RFECV-ETC). The study applied the Synthetic Minority Oversampling Technique (SMOTE) and tenfold cross-validation to determine the optimal features for predicting ELOHS, and used multivariate Logistic Regression (LR) to compute the risk factors and the Relative Risk (RR) of ELOHS at a 95% confidence level.

Results
An estimated 11.54% of patients have ELOHS, and the rate increases with age: patients aged < 18 years, 18–40 years, 40–65 years, and ≥ 65 years have ELOHS rates of 2.57%, 4.33%, 8.1%, and 15.18%, respectively. The RFECV-ETC algorithm predicted preadmission ELOHS with an accuracy of 89.3%. Age is a predominant risk factor for ELOHS: patients older than 90 years, PAG (> 90) {RR: 1.85 (1.34–2.56), P: < 0.001}, have a 6.23% and 23.3% higher likelihood of ELOHS than patients aged 80–90 years, PAG (80–90) {RR: 1.74 (1.34–2.38), P: < 0.001}, and 70–80 years, PAG (70–80) {RR: 1.5 (1.1–2.05), P: 0.011}, respectively. Patients in admission category ADC (US1) {RR: 3.64 (3.09–4.28), P: < 0.001} are 14.8% and 70.5% more prone to ELOHS than those in ADC (UC1) {RR: 3.17 (2.82–3.55), P: < 0.001} and ADC (EMG) {RR: 2.11 (1.93–2.31), P: < 0.001}, respectively. Patients from SES (low) {RR: 1.45 (1.24–1.71), P: < 0.001} are 13.3% and 45% more susceptible to ELOHS than those from SES (middle) and SES (high), respectively. Admission types (ADT) such as AS2, M2, NEWS, S2 and others {RR: 1.37–2.77 (1.25–6.19), P: < 0.001} also have a high likelihood of contributing to ELOHS, while distance to hospital (DTH) {RR: 0.64–0.75 (0.56–0.82), P: < 0.001}, Charlson Score (CCI) {RR: 0.31–0.68 (0.22–0.99), P: < 0.001–0.043} and some VMO specialties {RR: 0.08–0.69 (0.03–0.98), P: < 0.001–0.035} have limited influence on ELOHS.

Conclusions
Preadmission assessment of ELOHS helps identify patients who are likely to exceed their expected length of stay on admission, making it possible to improve patient management and outcomes.
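
The abstract does not state how the between-group percentages in the Results were derived. One plausible reading, offered here only as an illustration, is that each percentage is the ratio of two reported relative risks minus one. The Python sketch below applies that reading to the rounded RR values quoted above. The function and dictionary names are hypothetical, and the dictionary keys use ASCII hyphens. Under this assumption the sketch reproduces some of the reported figures (for example 23.3% and 14.8%), but not every reported percentage matches exactly, so it should not be taken as the authors' computation.

```python
# Relative risks (RR) copied from the Results, rounded as published.
# Keys are illustrative labels; ASCII hyphens replace the en dashes.
REPORTED_RR = {
    "PAG (> 90)": 1.85,
    "PAG (80-90)": 1.74,
    "PAG (70-80)": 1.50,
    "ADC (US1)": 3.64,
    "ADC (UC1)": 3.17,
    "ADC (EMG)": 2.11,
}


def percent_higher(rr_a: float, rr_b: float) -> float:
    """Return how much rr_a exceeds rr_b, in percent: (rr_a / rr_b - 1) * 100.

    Assumption: both RRs are measured against the same reference group,
    so their ratio compares the two groups directly.
    """
    if rr_a <= 0 or rr_b <= 0:
        raise ValueError("relative risks must be positive")
    return (rr_a / rr_b - 1.0) * 100.0


def compare(group_a: str, group_b: str) -> float:
    """Hypothetical helper: compare two reported groups, rounded to 0.1%."""
    return round(percent_higher(REPORTED_RR[group_a], REPORTED_RR[group_b]), 1)


if __name__ == "__main__":
    pairs = [
        ("PAG (> 90)", "PAG (70-80)"),
        ("ADC (US1)", "ADC (UC1)"),
    ]
    for group_a, group_b in pairs:
        print(f"{group_a} vs {group_b}: {compare(group_a, group_b)}% higher")
```

A comparison against the reference category reduces to the RR itself: with the reference at RR = 1, an RR of 1.45 corresponds to a 45% higher risk, which agrees with the SES (low) versus SES (high) figure if SES (high) is the reference group (an assumption, since the abstract does not name the reference).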