Let $F$ be a (nonempty) set of distribution functions (d.f.'s) of random variables (r.v.'s) with zero means and positive, finite variances. Let $\mathfrak{F}(F)$ be the set of all sequences of independent r.v.'s (independent within each sequence) whose d.f.'s belong to $F$ but are not necessarily the same from term to term of the sequence. No assumptions are made on the interrelations between the joint probability spaces of the r.v.'s of different sequences of $\mathfrak{F}(F)$. Accordingly, dependence or independence between r.v.'s of different sequences need not be specified. A generic member of $\mathfrak{F}(F)$ will be denoted by $\epsilon = \{\epsilon_k; k = 1, 2, \cdots\}$, or, when we discuss sequences of members of $\mathfrak{F}(F)$, by $\epsilon(n) = \{\epsilon_{nk}; k = 1, 2, \cdots\}, n = 1, 2, \cdots$. In the following, $\mathfrak{F}(F)$ plays the role of a parameter space, the parameter points being $\epsilon, \epsilon(1), \epsilon(2), \cdots$. Since only the d.f.'s of a sequence of (independent) r.v.'s are relevant to the central limit theorem, instead of $\mathfrak{F}(F)$, the set of sequences of d.f.'s corresponding to the elements of $\mathfrak{F}(F)$ may also be regarded as the parameter space. It clearly is a map of $\mathfrak{F}(F)$. The inverse mapping subdivides $\mathfrak{F}(F)$ into certain equivalence classes. From elementary set theory it follows that $\mathfrak{F}(F)$ as well as its map has the cardinality of the continuum if $F$ contains more than one element. Further, let $\{a_{nk}; n = 1, 2, \cdots; k = 1, 2, \cdots\}$ be a double sequence of real constants, and let $\{k_n\}$ be a sequence of positive integers such that $a_{nk_n} \neq 0$ and $a_{nk} = 0$ for $k > k_n, n = 1, 2, \cdots$. 
We denote the variance of $\sum^{k_n}_{k=1} a_{nk}\epsilon_{nk}$ by \begin{equation*}\tag{1}B_n^2 = \sum^{k_n}_{k=1} a^2_{nk}\sigma^2_{nk}, \quad\sigma^2_{nk} = \operatorname{var} \epsilon_{nk},\end{equation*} and put \begin{equation*}\tag{2}\zeta_n = B_n^{-1} \sum^{k_n}_{k=1} a_{nk}\epsilon_{nk},\quad n = 1, 2, \cdots,\end{equation*} where $B_n = + (B_n^2)^{\frac{1}{2}}$. If all sequences $\epsilon(n)$ are identical we denote them by $\epsilon$ and write instead of (2) $\zeta_n(\epsilon) = B_n^{-1}(\epsilon) \sum^{k_n}_{k=1} a_{nk}\epsilon_k, \quad n = 1, 2, \cdots,$ which shows the dependence on the parameter $\epsilon$ more clearly; $B_n^2(\epsilon)$ is defined analogously to (1). This note deals with necessary and sufficient conditions on the set $F$ and on the double sequence $\{a_{nk}\}$ in order that the d.f.'s of the $\zeta_n$ tend to the standard normal d.f. (denoted by $\Phi(x)$) and that the $B^{-1}_na_{nk}\epsilon_{nk}$ are infinitesimal for every sequence of sequences $\epsilon(1), \epsilon(2), \cdots \in \mathfrak{F}(F)$ (Theorem 1). For the case of identical sequences $\epsilon(n) (= \epsilon)$, which is of particular interest in applications, a normal convergence theorem (Theorem 3) holding uniformly for $\epsilon$ on $\mathfrak{F}(F)$ is obtained under the same necessary and sufficient conditions as in Theorem 1. Theorem 3 yields, e.g., conditions for the asymptotic normality of the least squares estimators of the parameters in a linear regression with independent and not necessarily identically distributed errors whose d.f.'s are unknown but belong to a certain class $F$ (Eicker (1963)). From the necessity of the conditions it follows that these conditions are the best possible ones under the limited information about the error terms provided by the model assumptions. 
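As a purely numerical illustration of definitions (1) and (2), the following sketch simulates $\zeta_n$ for one fixed $n$. Everything specific in it is an assumption made for the example and is not taken from the note: the weights `a_row`, the standard deviations `sigmas`, and the two error laws (centered uniform and centered exponential) are hypothetical choices standing in for a class $F$. The helper `max_variance_share` computes $\max_k a^2_{nk}\sigma^2_{nk}/B_n^2$; by Chebyshev's inequality a small value of this ratio is sufficient for the summands $B^{-1}_na_{nk}\epsilon_{nk}$ to be uniformly small in probability, but it is not claimed to be the condition of Theorem 1.

```python
import math
import random


def b_n(a_row, sigmas):
    """B_n from (1): positive square root of sum_k a_nk^2 * sigma_nk^2."""
    return math.sqrt(sum(a * a * s * s for a, s in zip(a_row, sigmas)))


def max_variance_share(a_row, sigmas):
    """Largest single-term share of the variance, max_k a_nk^2 sigma_nk^2 / B_n^2."""
    total = b_n(a_row, sigmas) ** 2
    return max(a * a * s * s for a, s in zip(a_row, sigmas)) / total


def draw_error(sigma, kind, rng):
    """Draw one zero-mean error with standard deviation sigma (illustrative laws)."""
    if kind == "uniform":
        half_width = sigma * math.sqrt(3.0)  # Uniform(-h, h) has variance h^2 / 3
        return rng.uniform(-half_width, half_width)
    # Centered exponential: mean sigma and standard deviation sigma before centering.
    return rng.expovariate(1.0 / sigma) - sigma


def zeta_n(a_row, sigmas, kinds, rng):
    """One realization of zeta_n from (2)."""
    total = sum(a * draw_error(s, kind, rng)
                for a, s, kind in zip(a_row, sigmas, kinds))
    return total / b_n(a_row, sigmas)


def simulate(k_n=200, reps=4000, seed=0):
    """Return sample mean, sample variance, and the share of draws <= 1."""
    rng = random.Random(seed)
    ks = range(1, k_n + 1)
    a_row = [1.0 + 0.5 * math.sin(k) for k in ks]   # hypothetical weights, never zero
    sigmas = [1.0 if k % 2 else 2.0 for k in ks]    # hypothetical, non-identical scales
    kinds = ["uniform" if k % 3 == 0 else "exponential" for k in ks]
    draws = [zeta_n(a_row, sigmas, kinds, rng) for _ in range(reps)]
    mean = sum(draws) / reps
    var = sum((z - mean) ** 2 for z in draws) / (reps - 1)
    share = sum(1 for z in draws if z <= 1.0) / reps
    return mean, var, share


if __name__ == "__main__":
    mean, var, share = simulate()
    phi_1 = 0.5 * (1.0 + math.erf(1.0 / math.sqrt(2.0)))  # Phi(1)
    print(f"mean={mean:.3f} var={var:.3f} share<=1: {share:.3f} vs Phi(1)={phi_1:.3f}")
```

By construction $\zeta_n$ has mean zero and variance one, so the first two outputs only check the normalization; the comparison of the empirical share with $\Phi(1)$ is the part that speaks, informally, to approximate normality for this particular choice of weights and error laws.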
Frequently, when limit theorems for families of sequences of random variables are met in statistics and probability theory, the emphasis is on the uniformity of the convergence of the sequences with respect to the family parameter which assumes values in a given a priori set. (For an example, compare Parzen (1954), p. 38. That paper also cites some of the earlier publications on the subject.) The present note emphasizes not primarily the uniformity of the convergence, but the necessity of the conditions (including one on the parameter space $\mathfrak{F}(F)$) for the convergence on the parameter space. Accordingly, the set $F$ cannot be prescribed arbitrarily. The uniformity of the convergence on $\mathfrak{F}(F)$ is shown without difficulty to be implied by the ordinary convergence on $\mathfrak{F}(F)$. Another difference from earlier work may be seen in the particular structure of $\mathfrak{F}(F)$, which is comparable to an infinite-dimensional vector space. Although many of the earlier convergence theorems are (or can be) formulated for abstract parameter spaces, in the applications such as estimation theory these spaces are usually specialized to intervals on the line or in a finite-dimensional vector space.