Abstract

In many applications, the cumulative distribution function (cdf) $$F_{Q_N}$$ of $$Q_N$$, a positively weighted sum of N i.i.d. chi-squared random variables, is required. Although there is no known closed-form solution for $$F_{Q_N}$$, there are many good approximations. When computational efficiency is not an issue, Imhof's method provides a good solution. However, when both the accuracy of the approximation and the speed of its computation are a concern, there is no clear preferred choice. Previous comparisons between approximate methods could be considered insufficient. Furthermore, in streaming data applications, where the computation needs to be both sequential and efficient, only a few of the available methods may be suitable. Streaming data problems are becoming ubiquitous and provide the motivation for this paper. We develop a framework to enable a much more extensive comparison between approximate methods for computing the cdf of weighted sums of an arbitrary random variable. Utilising this framework, we perform a new and comprehensive analysis of four efficient approximate methods for computing $$F_{Q_N}$$. This analysis procedure is much more thorough and statistically valid than previous approaches described in the literature. A surprising result of this analysis is that the accuracy of these approximate methods increases with N.

Highlights

  • The cumulative distribution function $$F_{Q_N}$$ of a positively weighted sum $$Q_N$$ of i.i.d. $$\chi^2_1$$ random variables, $$Q_N = \sum_{i=1}^{N} d_i W_i^2, \quad d_i > 0, \quad W_i \sim \mathrm{N}(0, 1), \qquad (1)$$ has no known closed-form solution

  • An approximation of $$F_{Q_N}$$ is used in goodness-of-fit tests (Moore and Spruill 1975) and various other applications (Zhang and Chen 2007; Jayasuriya 1996; Bentler and Xie 2000)

  • The normal approximation is slightly faster than the Satterthwaite-Welch (SW) approximation, but is much less accurate
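To make the comparison in the highlights concrete, the following is a minimal sketch of the normal approximation and a two-moment gamma (Satterthwaite-Welch style) approximation to $$F_{Q_N}$$. It relies only on the facts that $$\mathrm{E}[Q_N] = \sum_i d_i$$ and $$\mathrm{Var}[Q_N] = 2 \sum_i d_i^2$$, so both approximations can be updated sequentially from two running sums. The names `RunningWeights`, `reg_lower_gamma`, `normal_cdf` and `sw_cdf` are hypothetical and chosen for illustration; this is not the authors' implementation, and the series used for the incomplete gamma function is a simple stand-in for a library routine.

```python
import math


class RunningWeights:
    """Hypothetical helper: stores only sum(d_i) and sum(d_i^2), not the d_i."""

    def __init__(self):
        self.s1 = 0.0  # running sum of d_i,   equal to E[Q_N]
        self.s2 = 0.0  # running sum of d_i^2, equal to Var[Q_N] / 2

    def update(self, d):
        """Fold one new positive weight into the running sums."""
        if d <= 0.0:
            raise ValueError("weights must be positive")
        self.s1 += d
        self.s2 += d * d


def reg_lower_gamma(a, x, tol=1e-12, max_terms=10_000):
    """Regularised lower incomplete gamma P(a, x) via its power series.

    Adequate for a sketch; convergence is slow when x is much larger than a.
    """
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= x / (a + n)
        total += term
        if term < tol * total:
            break
    return total * math.exp(a * math.log(x) - x - math.lgamma(a))


def normal_cdf(x, w):
    """Normal approximation: match the mean s1 and variance 2 * s2 of Q_N."""
    return 0.5 * (1.0 + math.erf((x - w.s1) / (2.0 * math.sqrt(w.s2))))


def sw_cdf(x, w):
    """Gamma approximation matching the first two moments of Q_N."""
    shape = w.s1 ** 2 / (2.0 * w.s2)
    scale = 2.0 * w.s2 / w.s1
    return reg_lower_gamma(shape, x / scale)


if __name__ == "__main__":
    w = RunningWeights()
    for d in (0.5, 1.0, 2.0, 4.0):  # illustrative weights, not from the paper
        w.update(d)
    print("normal:", normal_cdf(6.0, w))
    print("SW    :", sw_cdf(6.0, w))
```

When all weights are equal, $$Q_N$$ is a scaled chi-squared variable and the gamma approximation is exact, which gives a convenient sanity check. The normal approximation can assign positive probability to negative values of $$Q_N$$, which is one intuitive reason it is less accurate for small N.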


Summary

Introduction

In offline situations where computational resources are not an issue, Imhof's method (Imhof 1961), which inverts the characteristic function numerically, should be the preferred choice. It can be considered exact (Solomon and Stephens 1977; Johnson et al. 2002) since it provides error bounds and can be used to compute $$F_{Q_N}(x)$$, for some quantile value x, to within a desired precision. These methods all require the entire vector of coefficients $$(d_1, \dots, d_N)$$ to be stored in order to compute the

