Abstract

Identifying infants who are small for gestational age (SGA) at an early stage could help physicians introduce interventions sooner. Machine learning (ML) is envisioned as a tool for identifying SGA infants, but it has not been widely studied in this field. To develop effective SGA prediction models, we conducted four groups of experiments that considered, respectively, basic ML methods, imbalanced data, feature selection and the time characteristics of variables. We used data on infants with gestational ages between 24 and 42 weeks, collected from 2010 to 2013. Support vector machine (SVM), random forest (RF), logistic regression (LR) and Sparse LR models were trained and evaluated with 10-fold cross-validation. Precision and the area under the receiver operating characteristic curve (AUC) were used as evaluation metrics. In each group, SVM and Sparse LR performed similarly well. LR without any sparsity penalty performed worst, possibly because of overfitting. When the handling of imbalanced data was combined with feature selection, the RF ensemble classifier performed best, reaching the highest AUC (0.8547) when expert knowledge was also used. In the other cases, RF performed worse than Sparse LR and SVM, possibly because its trees were fully grown.
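
The two evaluation metrics named above can be illustrated with a minimal sketch in plain Python. It is not the study's code: the function names `roc_auc` and `precision`, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration. AUC is computed here as the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as half.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    A tie between a positive and a negative counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def precision(labels: Sequence[int], predictions: Sequence[int]) -> float:
    """Fraction of predicted positives that are truly positive."""
    true_pos = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    false_pos = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    if true_pos + false_pos == 0:
        raise ValueError("precision is undefined when nothing is predicted positive")
    return true_pos / (true_pos + false_pos)


# Hypothetical, imbalanced toy data (1 = SGA, 0 = not SGA); not from the study.
toy_labels = [1, 0, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.2, 0.1, 0.1]

# Assumed decision threshold of 0.5 to turn scores into class predictions.
toy_predictions = [1 if s >= 0.5 else 0 for s in toy_scores]

print(f"AUC = {roc_auc(toy_labels, toy_scores):.4f}")
print(f"Precision = {precision(toy_labels, toy_predictions):.4f}")
```

On imbalanced data such as this, AUC does not depend on a decision threshold, whereas precision does, which is one reason the two metrics are often reported together.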
