Abstract

We consider the problem of both prediction and model selection in high-dimensional generalized linear models. Predictive performance can be improved by leveraging structural information among the predictors. In this paper, a graphical model-based doubly sparse regularized estimator is discussed for high-dimensional generalized linear models; the estimator utilizes the graph structure among the predictors. The graph information is incorporated node by node through a decomposed representation, and sparsity is encouraged both within and between the decomposed components. We propose an efficient iterative proximal algorithm to solve the optimization problem. Statistical convergence rates and selection consistency for the doubly sparse regularized estimator are established in the ultra-high dimensional setting. Specifically, we allow the dimensionality to grow exponentially with the sample size. We compare the estimator with existing methods through numerical analysis, including simulation studies and a microbiome data analysis.

Highlights

  • No prior work establishes finite-sample bounds on the estimation error or model selection consistency for graphical model-based doubly sparse generalized linear models

  • Model selection consistency is established for ultra-high dimensional graphical model-based doubly sparse GLMs

  • In simulation studies and a real data application, the method improves estimation, prediction, and model selection compared with regularization methods that do not use the predictors’ graphical structure


Introduction

Yu and Liu [46] proposed the sparse regression method incorporating graph structure (SRIG), which uses a node-wise, neighborhood-based penalty in which the penalty term is distributed over nodes rather than edges, and they developed an efficient computational method for solving it. We establish model selection consistency for ultra-high dimensional graphical model-based doubly sparse GLMs. In simulation studies and a real data application, we show that the method improves estimation, prediction, and model selection compared with regularization methods that do not use the predictors’ graphical structure.
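The within- and between-component sparsity can be illustrated with a small sketch. Assuming the doubly sparse penalty acts like a sparse-group-lasso penalty on each decomposed component V^(j) supported on the neighborhood of node j (an assumption for illustration; the paper's exact penalty and algorithm may differ), the per-component proximal map has a familiar closed form:

```python
import numpy as np

def prox_doubly_sparse(v, lam_group, lam_within, step):
    """Proximal operator of step * (lam_group * ||v||_2 + lam_within * ||v||_1).

    Hypothetical sketch of one proximal update for a single decomposed
    component; the function name and parameterization are illustrative,
    not the authors' exact algorithm.
    """
    # Within-component sparsity: elementwise soft-thresholding (l1 part).
    u = np.sign(v) * np.maximum(np.abs(v) - step * lam_within, 0.0)
    # Between-component sparsity: group soft-thresholding (l2 part);
    # the whole component is zeroed out when its norm is small enough,
    # which is how entire neighborhoods drop out of the model.
    norm = np.linalg.norm(u)
    if norm <= step * lam_group:
        return np.zeros_like(u)
    return (1.0 - step * lam_group / norm) * u
```

In an iterative proximal scheme of this kind, each component would take a gradient step on the GLM negative log-likelihood (restricted to its neighborhood coordinates) followed by this proximal map, with the coefficient vector recovered as the sum of the components.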

