Abstract

Hierarchical categorical data are commonly encountered in social science, genetics, and other fields. Interactions between variables in hierarchical structures complicate both modeling and prediction. We focus on high-dimensional linear models with hierarchical categorical variables and introduce an efficient method that offers computational advantages for high-dimensional categorical data. On the theoretical side, we establish the uniqueness of the solution and show that the proposed estimator converges to the least-squares solution with high probability. We also demonstrate the effectiveness of our method on simulated datasets and on two real-world datasets, a cancer-reg dataset and an adult dataset, where it outperforms competing approaches in terms of predictive accuracy, variable selection, and model complexity.
