Variance estimation in high-dimensional linear models

L H Dicker

doi:10.1093/biomet/ast065

Abstract

The residual variance and the proportion of explained variation are important quantities in many statistical models and model fitting procedures. They play an important role in regression diagnostics and model selection procedures, as well as in determining the performance limits in many problems. In this paper we propose new method-of-moments-based estimators for the residual variance, the proportion of explained variation and other related quantities, such as the ℓ2 signal strength. The proposed estimators are consistent and asymptotically normal in high-dimensional linear models with Gaussian predictors and errors, where the number of predictors d is proportional to the number of observations n; in fact, consistency holds even in settings where d/n → ∞. Existing results on residual variance estimation in high-dimensional linear models depend on sparsity in the underlying signal. Our results require no sparsity assumptions and imply that the residual variance and the proportion of explained variation can be consistently estimated even when d>n and the underlying signal itself is nonestimable. Numerical work suggests that some of our distributional assumptions may be relaxed. A real-data analysis involving gene expression data and single nucleotide polymorphism data illustrates the performance of the proposed methods.
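To make the method-of-moments idea concrete, here is a minimal sketch under a simplifying assumption that the abstract does not state: the rows of the n × d design matrix X are i.i.d. N(0, I_d), so the predictor covariance is the identity. Write y = Xβ + ε with ε ~ N(0, σ²I_n) and τ² = ||β||². Standard Wishart moments then give E||y||² = n(σ² + τ²) and E||Xᵀy||² = n(n + d + 1)τ² + ndσ², and solving these two equations for σ² and τ² yields unbiased estimators of both. The function names `mom_variance_estimates` and `simulate` are hypothetical, chosen for illustration, and the abstract alone does not confirm that the paper's estimators take exactly this form.

```python
import numpy as np


def mom_variance_estimates(X, y):
    """Moment-based estimates of (sigma^2, tau^2, r^2).

    ASSUMPTION (not taken from the abstract): rows of X are i.i.d. N(0, I_d),
    i.e. the predictor covariance is the identity matrix.
    """
    n, d = X.shape
    norm_y_sq = float(y @ y)          # ||y||^2
    xty = X.T @ y
    norm_xty_sq = float(xty @ xty)    # ||X^T y||^2
    scale = n * (n + 1)
    # Solve the two moment equations for sigma^2 and tau^2.
    sigma2_hat = ((d + n + 1) * norm_y_sq - norm_xty_sq) / scale
    tau2_hat = (norm_xty_sq - d * norm_y_sq) / scale
    # Proportion of explained variation: tau^2 / (sigma^2 + tau^2).
    r2_hat = tau2_hat / (sigma2_hat + tau2_hat)
    return sigma2_hat, tau2_hat, r2_hat


def simulate(n, d, sigma2, tau2, seed=0):
    """Draw (X, y) from a dense, non-sparse Gaussian linear model."""
    rng = np.random.default_rng(seed)
    beta = rng.standard_normal(d)
    beta *= np.sqrt(tau2) / np.linalg.norm(beta)   # enforce ||beta||^2 = tau2
    X = rng.standard_normal((n, d))
    y = X @ beta + np.sqrt(sigma2) * rng.standard_normal(n)
    return X, y
```

Calling `mom_variance_estimates(*simulate(2000, 4000, 1.0, 1.0))` exercises a d > n setting with a dense signal. Neither estimate is constrained to be non-negative in finite samples, so a practical version might truncate at zero. The two estimates always sum to ||y||²/n, which gives a quick sanity check.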

Journal: Biometrika
Publication Date: Mar 6, 2014
Citations: 102

Similar Papers

Optimal equivariant prediction for high-dimensional linear models with arbitrary predictor covariance
Lee H Dicker
Electronic Journal of Statistics | VOL. 7
01 Jan 2013

Analysis of Test Day Yield Data of Costa Rican Dairy Cattle
B Vargas ... J.A.M Van Arendonk
Journal of Dairy Science | VOL. 81
01 Jan 1998

Error density estimation in high-dimensional sparse linear model
Feng Zou ... Hengjian Cui
Annals of the Institute of Statistical Mathematics | VOL. 72
16 Nov 2018

G.ridge: An R Package for Generalized Ridge Regression for Sparse and High-Dimensional Linear Models
Takeshi Emura ... Hirofumi Michimae
Symmetry | VOL. 16
12 Feb 2024
