Abstract

This work studies the problem of identifying risk factors for Small for Gestational Age (SGA) and building classifiers for SGA prediction. SGA infants have recently received increasing attention because the condition causes difficulties that can persist throughout their lives. Some researchers have studied the risk factors of SGA onset with traditional statistical methods, while others have used logistic regression (LR) to construct SGA prediction models. Meanwhile, machine learning has evolved and is envisioned as a tool potentially able to identify babies with SGA. This work tests several feature selection methods. Based on the risk factors obtained through them, it trains support vector machine, random forest, and LR models and evaluates them via 10-fold cross-validation in terms of precision and area under the receiver operating characteristic curve (AUC). The results show that, among the wrapper algorithms, sparse LR achieves the most effective feature selection. In addition, this work compares data-driven factors with knowledge-driven factors and shows that feature selection is necessary and effective. Among the trained classifiers, the LR model achieves the best performance on the data-driven factors.
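The evaluation protocol named above (10-fold cross-validation scored by precision and AUC) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the `fit` callback, the 0.5 decision threshold, and the choice to pool out-of-fold scores before computing the metrics are all assumptions made for illustration.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def precision(y_true, y_pred):
    """Fraction of predicted positives that are true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0


def roc_auc(y_true, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def cross_validate(X, y, fit, k=10, threshold=0.5, seed=0):
    """Collect out-of-fold scores over k folds, then compute pooled metrics.

    `fit(X_train, y_train)` is a hypothetical callback that returns a
    function mapping one feature vector to a score in [0, 1].
    Returns (precision, auc) computed on the pooled out-of-fold scores.
    """
    oof = [0.0] * len(y)
    for held_out in k_fold_indices(len(y), k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        score = fit([X[i] for i in train], [y[i] for i in train])
        for i in held_out:
            oof[i] = score(X[i])
    preds = [1 if s >= threshold else 0 for s in oof]
    return precision(y, preds), roc_auc(y, oof)
```

Any classifier can be plugged in through `fit`, for example an LR, support vector machine, or random forest wrapper restricted to a selected subset of risk factors. Pooling the out-of-fold scores avoids undefined AUC values in folds that happen to contain a single class; averaging per-fold metrics is an equally common alternative.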
