Let $\mathscr{F}$ be an arbitrary class of distribution functions and let $Y_1, Y_2, \cdots, Y_N$ be a sample of size $N$ from some $F \in \mathscr{F}$. For $0 < P, \gamma < 1$, a statistic $U_N = U(Y_1, \cdots, Y_N)$ is said to be a $(P, \gamma)$ upper tolerance limit for $F$ relative to $\mathscr{F}$ if \begin{equation*}\tag{0} P_F\lbrack 1 - F(U_N) \leqq P\rbrack \geqq \gamma\end{equation*} for all $F \in \mathscr{F}$. (The interval $(-\infty, U_N\rbrack$ would be called a $1 - P$ content tolerance limit at confidence level $\gamma$ in the terminology of [3].) For certain parametric classes of distributions, such as the normal family or the exponential family, tolerance limits are available for all sample sizes $N \geqq 2$ and all $0 < P, \gamma < 1$. (See, e.g., [5] for normal tolerance limits. In [4] exponential and double exponential tolerance limits are obtained based on the concept of exponential content.) The forms of the tolerance limits in these cases depend heavily upon the particular parametric class under consideration. Consequently, if tolerance limits are desired for larger classes of distributions, it is necessary to abandon these results in favor of statistics $U_N$ for which the distribution of $1 - F(U_N)$ can be appropriately bounded for all $F$'s in the larger class. Until now, the statistics used for this purpose have been certain of the sample order statistics, $X_K$, the $K$th smallest of $Y_1, Y_2, \cdots, Y_N$, $K = 1, 2, \cdots, N$. These are the traditional non-parametric tolerance limits. Fraser [2] and Robbins [7] have shown that they have desirable uniqueness and optimality properties when $\mathscr{F}$ is the class of distributions absolutely continuous with respect to Lebesgue measure. The non-parametric tolerance limits have one serious disadvantage; namely, for given $P$ and $\gamma$, there is a minimum sample size $N(P, \gamma, K)$ such that the condition $P_F\lbrack 1 - F(X_{N - K}) \leqq P\rbrack \geqq \gamma$ is met only if $N \geqq N(P, \gamma, K)$. Thus, in cases where sampling is very expensive and stringent requirements are made on the tolerance limit (small $P$ and large $\gamma$), or where the statistician is presented with a sample of size $N < N(P, \gamma, 0)$ without the possibility of obtaining additional observations, the only recourse has been to use parametric tolerance limits, with the distinct possibility that the limit obtained is meaningless.

In this paper we propose a compromise scheme whereby upper tolerance limits of the form $X_{N - K - j} + b(X_{N - K} - X_{N - K - j})$ are constructed which are valid for all $N \geqq 2$ and all $0 < P, \gamma < 1$. These limits satisfy (0) for the class, $\mathscr{F}_U$, of absolutely continuous distribution functions for which $\varphi = -\log (1 - F)$ is convex and thus earn the title log-convex (L.C.) tolerance limits. The class $\mathscr{F}_U$ has been studied by Barlow, Marshall and Proschan in [1] and was shown to contain most of the distributions commonly used as models in statistics, including the normal and exponential. The distributions in this class are called increasing hazard rate distributions in reliability theory because the instantaneous hazard rate $f/(1 - F)$ is non-decreasing on the support of the probability density $f$. We will elaborate on the properties of this class in Section 2.
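To make the sample-size restriction concrete: for the usual non-parametric upper limit with $K = 0$, i.e. $U_N = X_N$, the coverage probability is $P_F\lbrack 1 - F(X_N) \leqq P\rbrack = 1 - (1 - P)^N$ for every continuous $F$, so $N(P, \gamma, 0)$ is the smallest $N$ with $1 - (1 - P)^N \geqq \gamma$. The short sketch below is purely illustrative (the function name and the use of Python are ours, not the paper's).

```python
import math

def min_sample_size(P, gamma):
    """Smallest N for which the sample maximum X_N is a (P, gamma) upper
    tolerance limit for every continuous F, i.e. the smallest N with
    P_F[1 - F(X_N) <= P] = 1 - (1 - P)**N >= gamma."""
    return math.ceil(math.log(1 - gamma) / math.log(1 - P))

# For P = .05 and gamma = .95 this gives N = 59; samples much smaller than
# this must rely on parametric assumptions or on limits valid for N >= 2.
print(min_sample_size(0.05, 0.95))  # 59
```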
Lower $(P, \gamma)$ tolerance limits of the form $L_N = X_{K + j + 1} - b(X_{K + j + 1} - X_{K + 1})$, which satisfy $P_F\lbrack F(L_N) \leqq P\rbrack \geqq \gamma$ for the class $\mathscr{F}_L$ of absolutely continuous distributions for which $\psi = -\log F$ is convex, will be obtained in Section 3. It is thus of interest to note (in Section 2) that the important class of distributions possessing densities which are Polya frequency functions of order 2 $(PF_2)$ is contained in $\mathscr{F}_L \cap \mathscr{F}_U$; hence both upper and lower tolerance limits are available for members of this class. Tables of the $b$-factors needed for both upper and lower tolerance limits are given in Section 4 for $j = 1$ and all combinations of $P = .500, .250, .100, .050$ and $\gamma = .90, .95, .99$. The sample sizes range from $N = 2$ to the smallest value of $N$ for which the usual non-parametric tolerance limit can be used; at that sample size the L.C. tolerance limits coincide with the non-parametric ones. No questions of optimality are considered here; however, some comparisons of the L.C. tolerance limits with the usual normal and exponential tolerance limits are made by means of Monte Carlo sampling in Section 5, where some extensions of the theory are also considered.
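Purely as an illustration of how these limits are assembled from the order statistics, a minimal sketch follows; the value of $b$ used here is a placeholder, not one of the tabulated factors of Section 4, and the function names are ours.

```python
def lc_upper_limit(y, b, K=0, j=1):
    """Upper limit X_{N-K-j} + b*(X_{N-K} - X_{N-K-j}), where
    X_1 <= ... <= X_N are the order statistics of the sample y."""
    x = sorted(y)
    N = len(x)
    lo, hi = x[N - K - j - 1], x[N - K - 1]   # X_{N-K-j}, X_{N-K} (0-based)
    return lo + b * (hi - lo)

def lc_lower_limit(y, b, K=0, j=1):
    """Lower limit X_{K+j+1} - b*(X_{K+j+1} - X_{K+1})."""
    x = sorted(y)
    lo, hi = x[K], x[K + j]                   # X_{K+1}, X_{K+j+1} (0-based)
    return hi - b * (hi - lo)

# b = 4.0 is a purely hypothetical factor chosen for illustration only.
sample = [2.3, 1.7, 3.1, 2.8, 2.0]
print(lc_upper_limit(sample, b=4.0), lc_lower_limit(sample, b=4.0))
```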