Abstract

Empirical Bayes methods are well suited to selecting massive variables, which may be inter-connected through certain hierarchical structures, because of three attributes: incorporating prior information on model parameters, allowing data-driven hyperparameter values, and being free of tuning parameters. We propose an iterated conditional modes/medians (ICM/M) algorithm to implement empirical Bayes selection of massive variables while incorporating sparsity or more complicated a priori information. The iterated conditional modes are employed to obtain data-driven estimates of hyperparameters, and the iterated conditional medians are used to estimate the model coefficients and therefore enable the selection of massive variables. The ICM/M algorithm is computationally fast and easily extends empirical Bayes thresholding, which is adaptive to parameter sparsity, to complex data. Empirical studies suggest competitive performance of the proposed method, even in the simple case of selecting massive regression predictors.
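To make the mode/median alternation concrete, the following is a minimal Python sketch of an ICM/M-style cycle for linear regression under a simplified two-group prior with a point mass at zero and a Gaussian slab. This is only an illustration under stated assumptions: the published ICM/M algorithm uses empirical Bayes thresholding priors and can carry structured (e.g., Ising) priors, and all names here (icm_m, omega, tau) are illustrative, not taken from the paper.

# Hypothetical ICM/M-style sketch: conditional modes for hyperparameters,
# conditional medians for coefficients, under a spike-and-Gaussian-slab prior.
import numpy as np
from scipy.stats import norm

def mixture_median(p, m, s):
    """Median of the mixture (1 - p) * delta_0 + p * Normal(m, s^2)."""
    below = p * norm.cdf(-m / s)            # posterior mass strictly below 0
    if below >= 0.5:                         # median falls in the negative slab
        return m + s * norm.ppf(0.5 / p)
    if below + (1.0 - p) >= 0.5:             # point mass at 0 covers the median
        return 0.0
    return m + s * norm.ppf((0.5 - (1.0 - p)) / p)   # median in the positive slab

def icm_m(X, y, tau=1.0, n_iter=50):
    """Alternate conditional modes (omega, sigma2) and conditional medians (beta)."""
    n, p = X.shape
    beta = np.zeros(p)
    omega, sigma2 = 0.5, np.var(y)
    col_ss = (X ** 2).sum(axis=0)            # x_j' x_j for each predictor
    incl = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ beta + X[:, j] * beta[j]          # partial residual
            z = X[:, j] @ r / col_ss[j]                    # univariate estimate of beta_j
            s2_z = sigma2 / col_ss[j]                      # its sampling variance
            # conditional posterior inclusion probability under the two-group prior
            num = omega * norm.pdf(z, 0.0, np.sqrt(s2_z + tau ** 2))
            den = num + (1.0 - omega) * norm.pdf(z, 0.0, np.sqrt(s2_z))
            incl[j] = num / den
            m = z * tau ** 2 / (tau ** 2 + s2_z)           # slab posterior mean
            v = tau ** 2 * s2_z / (tau ** 2 + s2_z)        # slab posterior variance
            beta[j] = mixture_median(incl[j], m, np.sqrt(v))   # conditional median
        # conditional modes of the hyperparameters given the current beta
        omega = np.clip(np.mean(beta != 0.0), 1e-3, 1 - 1e-3)
        sigma2 = np.sum((y - X @ beta) ** 2) / n
    return beta, incl

Because each coefficient update reduces to a univariate posterior median with a closed form, one full sweep costs roughly one pass over the design matrix, which is what makes this style of coordinate-wise algorithm fast for a large number of predictors.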

Highlights

  • Selecting variables from a large number of candidate predictors is a challenging yet critical task in analyzing high-dimensional data

  • Many efforts have been devoted to selecting variables from massive candidates by incorporating rich a priori information accumulated from historical research or practices

  • We propose an iterated conditional modes/medians (ICM/M) algorithm for easy implementation and fast computation of empirical Bayes variable selection (EBVS)



Introduction

Selecting variables from a large number of candidate predictors is a challenging yet critical task in analyzing high-dimensional data. Because high-dimensional data usually come with relatively small sample sizes, successful variable selection demands appropriate incorporation of a priori information, most commonly the sparsity assumption that only a few predictors are truly associated with the response. Many methods have been developed to take full advantage of this sparsity assumption, mostly built upon thresholding procedures (Donoho and Johnstone, 1994); see Tibshirani (1996), Fan and Li (2001), and others. For graph-structured variables, Li and Li (2010) and Pan et al. (2010) proposed to use Laplacian matrices and Lγ norms, respectively. Li and Zhang (2010) and Stingo et al. (2011) both employed Bayesian approaches to incorporate structural information of the variables, both formulating Ising priors.
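For intuition on how such structural information can enter a prior, the snippet below evaluates an unnormalized Ising prior over inclusion indicators on a graph, so that neighbouring variables are encouraged to be selected together. The parameterization (a, b, adjacency matrix A) is a common textbook form used here purely as an assumption; it is not necessarily the exact formulation of Li and Zhang (2010) or Stingo et al. (2011).

# Illustrative (unnormalized) Ising prior over inclusion indicators gamma in {0,1}^p.
import numpy as np

def ising_log_prior(gamma, A, a=-2.0, b=0.5):
    """log pi(gamma) up to a normalizing constant:
    a * sum_j gamma_j + b * sum_{j~k} gamma_j * gamma_k."""
    gamma = np.asarray(gamma, dtype=float)
    sparsity_term = a * gamma.sum()                 # a < 0 favours sparse models
    smoothness_term = 0.5 * b * gamma @ A @ gamma   # each undirected edge counted once
    return sparsity_term + smoothness_term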


