Abstract
In the general regression model $y_i = x_i'\beta + e_i$, for $i = 1, \cdots, n$ and $\beta \in {\bf R}^p$, the “regression quantile” $\hat{\beta}(\theta)$ estimates the coefficients of the linear regression function parallel to $x'\beta$ and roughly lying above a fraction $\theta$ of the data. As introduced by Koenker and Bassett [Econometrica, 46 (1978), pp. 33–50], these regression quantiles are analogous to order statistics and provide a natural and appealing approach to the analysis of the general linear model. Computation of $\hat{\beta}(\theta)$ can be expressed as a parametric linear programming problem with $J_n$ distinct extremal solutions as $\theta$ goes from zero to one. That is, there are $J_n$ breakpoints $\{\theta_j\}$, for $j = 1, \cdots, J_n$, such that $\hat{\beta}(\theta_j)$ is obtained from $\hat{\beta}(\theta_{j-1})$ by a single simplex pivot. Each $\hat{\beta}(\theta_j)$ is characterized by a specific subset of $p$ observations. Although no previous result restricts $J_n$ to be less than the upper bound $\binom{n}{p} = {\bf O}(n^p)$, practical experience suggests that $J_n$ grows roughly linearly with $n$. Here it is shown that, in fact, $J_n = {\bf O}(n \log n)$ in probability, where the distributional assumptions are those typical of multiple regression situations. The result is based on a probabilistic rather than combinatorial approach, which should have general application to the probabilistic behavior of the number of pivots in (parametric) linear programming problems. The conditions are, roughly, that the constraint coefficients form independent random vectors and that the number of variables is fixed while the number of constraints tends to infinity.
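To make the objects above concrete: $\hat{\beta}(\theta)$ is the Koenker–Bassett minimizer of the check-function objective $\sum_i \rho_\theta(y_i - x_i'b)$ with $\rho_\theta(u) = u(\theta - I(u < 0))$, which is equivalent to a linear program, and each extremal solution exactly fits a subset of $p$ observations. The following Python sketch (an illustration only, not the paper's parametric-pivot algorithm; the function names `rq_fit` and `count_solutions` are ours, and scanning a finite $\theta$ grid can only lower-bound $J_n$) assumes NumPy and SciPy are available.

```python
# Illustrative sketch only -- not the paper's algorithm. Estimates J_n by
# solving the quantile-regression LP on a grid of theta values and counting
# changes in the p-element subset of exactly fitted observations.
import numpy as np
from scipy.optimize import linprog

def rq_fit(X, y, theta):
    """Solve min_b sum_i rho_theta(y_i - x_i'b) as a linear program.

    Split b = b+ - b- and the residuals into u, v >= 0 with
    y = X b + u - v; the objective is theta*sum(u) + (1 - theta)*sum(v).
    """
    n, p = X.shape
    c = np.concatenate([np.zeros(2 * p),
                        theta * np.ones(n), (1 - theta) * np.ones(n)])
    A_eq = np.hstack([X, -X, np.eye(n), -np.eye(n)])
    res = linprog(c, A_eq=A_eq, b_eq=y, method="highs")  # variables >= 0 by default
    z = res.x
    return z[:p] - z[p:2 * p]

def count_solutions(X, y, grid=500, tol=1e-7):
    """Count distinct exact-fit subsets over a theta grid (a lower bound on J_n)."""
    count, last = 0, None
    for theta in np.linspace(0.5 / grid, 1 - 0.5 / grid, grid):
        b = rq_fit(X, y, theta)
        # Generically, the optimal basic solution interpolates exactly p points.
        basis = frozenset(np.flatnonzero(np.abs(y - X @ b) < tol))
        if basis != last:
            count, last = count + 1, basis
    return count

rng = np.random.default_rng(0)
n, p = 60, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.ones(p) + rng.normal(size=n)
print(count_solutions(X, y))  # grows roughly like n log n as n increases
```

Re-running this for increasing $n$ on simulated designs of this kind exhibits the near-linear growth that the abstract attributes to practical experience; a finer grid tightens the lower bound on $J_n$ at proportional computational cost.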