Abstract

The banding estimator of Bickel and Levina and its tapering version of Cai, Zhang, and Zhou are important high-dimensional covariance estimators. Both estimators require a bandwidth parameter. We propose a bandwidth selector for the banding estimator that minimizes an empirical estimate of the expected squared Frobenius norm of the estimation error matrix. The ratio consistency of the bandwidth selector is established. We provide a lower bound for the coverage probability of the underlying bandwidth being contained in an interval around the bandwidth estimate. Extensions to bandwidth selection for the tapering estimator and threshold level selection for the thresholding covariance estimator are made. Numerical simulations and a case study on sonar spectrum data are conducted to demonstrate the proposed approaches. Supplementary materials for this article are available online.

Highlights

  • With advances in modern data collection technology, data of very high dimension are increasingly collected in scientific, socioeconomic and financial studies; examples include microarray data, next-generation sequencing data, recordings of large networks and financial observations of large portfolios

  • It is clear from the analysis in Bickel and Levina (2008a) and Cai et al. (2010) that the convergence rates of the banding and the tapering estimators depend critically on the band width k, whereas the band width k depends on the unknown index parameter α of the bandable classes

  • Before we present an algorithm to find an estimate of t0(Σ), we review the cross validation (CV) approach proposed in Bickel and Levina (2008b), which was designed to approximate the Frobenius risk ObjD(t, Σ)


Summary

Introduction

With advances in modern data collection technology, data of very high dimension are increasingly collected in scientific, socioeconomic and financial studies; examples include microarray data, next-generation sequencing data, recordings of large networks and financial observations of large portfolios. There have been advances in constructing consistent covariance estimators for high-dimensional data via regularization methods that involve thresholding or truncation. Bickel and Levina (2008a,b) and Cai et al. (2010) showed that the performance of these estimators is crucially dependent on the choice of the band width or the threshold level. In the cross-validation approaches of Bickel and Levina (2008a,b), one segment of the sample was used to estimate Σ and the other was employed to form cross-validation scores for the band width and the threshold level selection, respectively. The conventional sample covariance was used to estimate Σ in the first segment. This can adversely affect the performance of the band width or threshold level selection, because of the sample covariance's known defects under high dimensionality. Yi and Zou (2013) proposed a band width selector for the tapering estimator that minimizes the expected squared Frobenius norm of the estimation error matrix for Gaussian distributed data. Technical proofs are provided in the Appendix and the Supplementary Material.
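To make the split-sample idea concrete, the following is a minimal sketch of banding a sample covariance matrix and choosing the band width by cross-validation. It illustrates the cross-validation approach discussed above, not the selector proposed in this article. The function names (`sample_cov`, `band`, `cv_bandwidth`), the one-third split fraction, the number of random splits and the use of a squared Frobenius score are all assumptions made for illustration.

```python
import numpy as np


def sample_cov(X):
    """Conventional sample covariance of an (n, p) data matrix."""
    Xc = X - X.mean(axis=0)
    return Xc.T @ Xc / (X.shape[0] - 1)


def band(S, k):
    """Banding: keep entries with |i - j| <= k and set the rest to zero."""
    p = S.shape[0]
    idx = np.arange(p)
    lag = np.abs(np.subtract.outer(idx, idx))
    return np.where(lag <= k, S, 0.0)


def cv_bandwidth(X, ks, n_splits=20, frac=1 / 3, seed=0):
    """Pick k from ks by split-sample cross-validation (illustrative only).

    For each random split, the banded covariance of one segment is compared,
    in squared Frobenius norm, with the unregularized sample covariance of
    the other segment, which stands in for the unknown covariance.
    The split fraction and number of splits are assumptions.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    n1 = int(n * frac)
    scores = np.zeros(len(ks))
    for _ in range(n_splits):
        perm = rng.permutation(n)
        S_fit = sample_cov(X[perm[:n1]])     # segment that gets banded
        S_target = sample_cov(X[perm[n1:]])  # segment used as the proxy target
        for i, k in enumerate(ks):
            scores[i] += np.sum((band(S_fit, k) - S_target) ** 2)
    scores /= n_splits
    return ks[int(np.argmin(scores))], scores
```

A call such as `cv_bandwidth(X, list(range(p)))` returns the selected band width together with the averaged scores. Because the proxy target is itself a sample covariance, its noise under high dimensionality feeds directly into the scores, which is the weakness described in the paragraph above.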

Bandable Classes and Assumptions
Underlying Band Width
Consistent Band Width Estimator
Extension to Tapering Estimation
Extension to Thresholding Estimation
Simulation Results
Empirical Study
Discussion