Abstract

The first part of the paper is dedicated to the construction of a nonparametric confidence interval for a conditional quantile whose level depends on the sample size. When this level tends to 0 or 1 as the sample size increases, the conditional quantile is said to be extreme and is located in the tail of the conditional distribution. The proposed confidence interval is constructed by approximating the distribution of the order statistics selected with a nearest neighbor approach by a Beta distribution. We show that its coverage probability converges to the preselected probability, and its accuracy is illustrated in a simulation study. When the dimension of the covariate increases, the coverage probability of the confidence interval can be very different from the nominal probability. This is a well-known consequence of data sparsity, especially in the tail of the distribution. In the second part, a dimension reduction procedure is proposed in order to select more appropriate nearest neighbors in the right tail of the distribution and, in turn, to obtain a better coverage probability for extreme conditional quantiles. This procedure is based on the Tail Conditional Independence assumption introduced in (Gardes, Extremes, pp. 57--95, 18(3), 2018).
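As a rough illustration of the idea described in the abstract, the following sketch builds an order-statistic confidence interval for a conditional quantile from the k nearest neighbors of a point x0. It relies on the classical fact that, for k i.i.d. continuous observations, F(Y_(j)) follows a Beta(j, k − j + 1) distribution. The function name `knn_quantile_ci`, the Euclidean distance, the equal-tailed choice of ranks, and the convention that p is a lower-tail level (P(Y ≤ q | X = x0) = p) are all assumptions made for this example; the paper's exact construction and level convention may differ.

```python
import numpy as np
from scipy.stats import beta


def knn_quantile_ci(X, Y, x0, p, k, gamma=0.95):
    """Order-statistic interval for the level-p conditional quantile at x0.

    Hypothetical helper: uses the k Euclidean nearest neighbors of x0 and
    treats their responses as approximately i.i.d. from the law of Y | X = x0.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    # Responses attached to the k nearest covariates, sorted increasingly.
    dist = np.linalg.norm(X - np.asarray(x0, dtype=float), axis=1)
    y_sorted = np.sort(Y[np.argsort(dist)[:k]])

    # P(Y_(j) <= q_p) = P(Beta(j, k - j + 1) <= p), decreasing in the rank j.
    ranks = np.arange(1, k + 1)
    prob_below = beta.cdf(p, ranks, k - ranks + 1)

    tail = (1.0 - gamma) / 2.0
    lower_ranks = ranks[prob_below >= 1.0 - tail]  # Y_(j) <= q_p w.p. >= 1 - tail
    upper_ranks = ranks[prob_below <= tail]        # Y_(j) >  q_p w.p. >= 1 - tail

    # If no admissible rank exists (level too extreme for this k),
    # the corresponding bound is left infinite.
    lo = y_sorted[lower_ranks.max() - 1] if lower_ranks.size else -np.inf
    hi = y_sorted[upper_ranks.min() - 1] if upper_ranks.size else np.inf
    return float(lo), float(hi)
```

When the level is very close to 0 or 1 relative to k, one of the bounds becomes infinite in this sketch, which reflects the lack of observations beyond the quantile in the selected neighborhood.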

Highlights

  • In the literature dedicated to dimension reduction, it is commonly assumed that there exists a function g0 : R^p → R such that X ⊥⊥ Y | g0(X) or, equivalently, such that the conditional distribution of Y given X is equal to the conditional distribution of Y given g0(X)

  • One way to quantify such a dispersion is to consider a Gini-type dispersion measure given, for a large threshold y ∈ R, by E[|g0(X) − g0(X∗)| | min(Y, Y∗) > y], where (X∗, Y∗) is an independent copy of (X, Y). This measure is estimated by replacing the expectation by its empirical counterpart and by taking for the threshold y the order statistic Y_{n−⌊nβn⌋,n}, where βn is a sequence tending to 0 as the sample size increases

  • Based on the condition (TCI) introduced in Gardes [10], we reduce the dimension of the covariate
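The empirical counterpart of the Gini-type dispersion measure mentioned in the highlights can be sketched as follows. The function name `tail_gini_dispersion`, the averaging over ordered pairs of distinct tail observations, and the use of a strict inequality at the threshold are assumptions made for this illustration; the normalization used in the paper may differ.

```python
import numpy as np


def tail_gini_dispersion(gX, Y, beta_n):
    """Empirical mean of |g(X_i) - g(X_j)| over pairs with min(Y_i, Y_j) > y.

    Hypothetical helper: gX holds the values g(X_i) of a candidate function g,
    and the threshold y is the order statistic Y_{n - floor(n * beta_n), n}.
    """
    gX = np.asarray(gX, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n = Y.size
    m = int(np.floor(n * beta_n))
    if m < 2 or m >= n:
        raise ValueError("need 2 <= floor(n * beta_n) < n")

    # Order statistic Y_{n-m,n} (1-indexed), i.e. index n - m - 1 after sorting.
    threshold = np.sort(Y)[n - m - 1]
    tail = gX[Y > threshold]
    t = tail.size
    if t < 2:
        raise ValueError("fewer than two observations above the threshold")

    # Sum over all ordered pairs; the diagonal contributes zero,
    # so dividing by t * (t - 1) averages over distinct pairs only.
    diffs = np.abs(tail[:, None] - tail[None, :])
    return float(diffs.sum() / (t * (t - 1)))
```

A small value indicates that the values of g at the covariates attached to the largest responses are tightly concentrated, while a constant g trivially gives 0.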


Summary

Introduction

Our second contribution is the proposal of a new data-driven procedure to find an appropriate distance to use in the nearest neighbors selection process. This distance is used in the nearest neighbors order statistics approach for the construction of confidence intervals for conditional quantiles with extreme levels α = αn → 0. To reach this goal, we start with the dimension reduction assumption introduced in Gardes [10].

Definition and main result
Illustration on simulated data
Selection of the nearest neighbors for large-dimensional covariates
Dimension reduction model
Estimation of g0
Chicago air pollution data set
Concluding remarks
Preliminary results
Proofs of main results
Main results
Proofs of the results