Abstract

The development of screening instruments for psychiatric disorders involves selecting items from a pool of items in existing questionnaires that assess clinical and behavioral phenotypes. A screening instrument should consist of only a few items and should classify cases and non-cases accurately. Variable/item selection methods such as the Least Absolute Shrinkage and Selection Operator (LASSO), Elastic Net, Classification and Regression Tree, Random Forest, and the two-sample t-test can be used in this context. Unlike the settings where variable selection methods are most commonly applied (e.g., ultra high-dimensional genetic or imaging data), psychiatric data usually have lower dimensions and are characterized by the following factors: correlations and possible interactions among predictors, unobservability of important variables (i.e., true variables not measured by the available questionnaires), the amount and pattern of missing values in the predictors, and the prevalence of cases in the training data. Using simulations, we investigate how these factors affect the performance of several variable selection methods and compare the methods with respect to selection performance and prediction error rate. Our results show that: (1) for complete data, LASSO and Elastic Net outperformed the other methods with respect to variable selection and future data prediction, and (2) for certain types of incomplete data, Random Forest induced bias in imputation, leading to incorrect ranking of variable importance. We propose the Imputed-LASSO, which combines Random Forest imputation with LASSO; this approach offsets the bias in Random Forest and offers a simple yet efficient item selection approach for missing data. As an illustration, we apply the methods to items from the standard version of the Autism Diagnostic Interview-Revised.
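
The two-stage idea behind the Imputed-LASSO (impute with a Random Forest, then select items with LASSO) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify software or tuning, so the sketch assumes scikit-learn, uses `IterativeImputer` with a `RandomForestRegressor` as a stand-in for Random Forest imputation, and uses cross-validated L1-penalized logistic regression as the LASSO step for a binary case/non-case outcome. The function names `simulate_items` and `imputed_lasso_select`, the simulated data, and all parameter values are hypothetical.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables IterativeImputer)
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LogisticRegressionCV
from sklearn.preprocessing import StandardScaler


def simulate_items(n=300, p=10, missing_rate=0.10, seed=0):
    """Hypothetical toy data: only items 0 and 1 drive case status."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, p))
    logits = 2.0 * X[:, 0] - 2.0 * X[:, 1]
    y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logits))).astype(int)
    # Assumption: values are missing completely at random.
    X_missing = X.copy()
    X_missing[rng.random((n, p)) < missing_rate] = np.nan
    return X_missing, y


def imputed_lasso_select(X, y, random_state=0):
    """Impute missing items with a random-forest-based imputer, then
    select items via cross-validated L1-penalized logistic regression."""
    imputer = IterativeImputer(
        estimator=RandomForestRegressor(n_estimators=20, random_state=random_state),
        max_iter=3,
        random_state=random_state,
    )
    X_imputed = imputer.fit_transform(X)
    # Standardize so the L1 penalty treats all items on the same scale.
    X_scaled = StandardScaler().fit_transform(X_imputed)
    model = LogisticRegressionCV(
        penalty="l1", solver="liblinear", Cs=10, cv=5, random_state=random_state
    )
    model.fit(X_scaled, y)
    # Items with nonzero coefficients are the selected items.
    selected = np.flatnonzero(model.coef_.ravel() != 0)
    return selected, model


if __name__ == "__main__":
    X, y = simulate_items()
    selected, _ = imputed_lasso_select(X, y)
    print("Selected item indices:", selected)
```

In this sketch the imputation model never sees the outcome, and the penalty strength is chosen by cross-validation on the imputed data; whether either choice matches the procedure in the paper is not stated in the abstract.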
