Abstract

Linear models are the most common predictive models for a continuous, discrete, or categorical response, and they often include interaction terms. With more than a few predictors, however, interactions tend to be neglected because they add too many terms to the model. In this paper, we propose a simulation-based tree method to detect the interactions that contribute to the predictions. We first bootstrap the observations and randomly choose a subset of the variables to build trees. The interactions between the roots and the corresponding leaves are collected, and the number of times each interaction appears is counted. To obtain a benchmark for how often each interaction appears in the trees, we replace the response values with randomly generated values and repeat the procedure. The interactions whose occurrence frequency exceeds the benchmark are added to the regression model. Finally, we select variables by running LASSO on the model containing the main effects and the detected interactions. In our experiments, the method performs well, especially on data sets with many interactions.
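
The counting and benchmarking steps can be illustrated with a short Python sketch. It is not the authors' implementation, and it rests on several assumptions the abstract does not specify: each tree is a depth-2 regression tree chosen by squared-error reduction, an interaction is recorded as the unordered pair (root variable, child split variable), the random responses are standard normal draws, and the benchmark is the largest count any pair reaches under the random response. All function names and the toy data are hypothetical, and the final LASSO step is omitted.

```python
import random
from collections import Counter

def sse(ys):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((v - m) ** 2 for v in ys)

def best_split(X, y, rows, features):
    """Return (feature, threshold) with the largest SSE reduction, or None."""
    base = sse([y[i] for i in rows])
    best, best_gain = None, 1e-12
    for j in features:
        values = sorted({X[i][j] for i in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y[i] for i in rows if X[i][j] <= t]
            right = [y[i] for i in rows if X[i][j] > t]
            gain = base - sse(left) - sse(right)
            if gain > best_gain:
                best, best_gain = (j, t), gain
    return best

def tree_pairs(X, y, rng, n_features):
    """Fit one depth-2 tree on a bootstrap sample; return its variable pairs."""
    n, p = len(X), len(X[0])
    rows = [rng.randrange(n) for _ in range(n)]       # bootstrap observations
    features = rng.sample(range(p), n_features)       # random variable subset
    root = best_split(X, y, rows, features)
    if root is None:
        return set()
    j, t = root
    sides = ([i for i in rows if X[i][j] <= t], [i for i in rows if X[i][j] > t])
    pairs = set()
    for side in sides:
        child = best_split(X, y, side, features)
        if child is not None and child[0] != j:
            pairs.add(tuple(sorted((j, child[0]))))   # unordered (root, child)
    return pairs

def count_pairs(X, y, rng, n_trees, n_features):
    """Count, over many trees, how many trees contain each variable pair."""
    counts = Counter()
    for _ in range(n_trees):
        counts.update(tree_pairs(X, y, rng, n_features))
    return counts

def detect_interactions(X, y, n_trees=100, n_features=4, seed=0):
    """Keep pairs that occur more often than the random-response benchmark."""
    rng = random.Random(seed)
    observed = count_pairs(X, y, rng, n_trees, n_features)
    y_null = [rng.gauss(0.0, 1.0) for _ in y]         # assumed null response
    null = count_pairs(X, y_null, rng, n_trees, n_features)
    benchmark = max(null.values(), default=0)         # assumed benchmark rule
    return sorted(pair for pair, c in observed.items() if c > benchmark)

# Toy data (hypothetical): five binary predictors, response driven by x0 * x1.
data_rng = random.Random(1)
X = [[data_rng.randint(0, 1) for _ in range(5)] for _ in range(200)]
y = [3.0 * row[0] * row[1] + data_rng.gauss(0.0, 0.1) for row in X]
found = detect_interactions(X, y)
print(found)
```

In a full pipeline, the product terms for the pairs in `found` would be added to the main effects and passed to a LASSO fit for variable selection, as the abstract describes.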
