Abstract

The R package crossrun computes the joint distribution of the number of crossings and the longest run in a sequence of independent Bernoulli observations. The main intended application is statistical process control, where the joint distribution may be used for systematic investigation, and possibly refinement, of existing rules for distinguishing between signal and noise. While the crossrun vignette is written to assist in practical use, this article gives a hands-on explanation of why the procedure works. The article also discusses limitations of the present version of crossrun and outlines ongoing work to address them. Grasping the basic ideas behind the implemented procedure is necessary both to understand these planned extensions and to see how the rules presently used in statistical process control, based on the number of crossings and the longest run, may be refined.

Highlights

  • The setting is defined by a number of independent observations from a Bernoulli distribution with the same success probability

  • The focus of the R package crossrun [2,3] is the joint distribution of number of crossings, C, and the length of the longest run, L, in random data sequences

  • The joint distributions for the number of crossings and the longest run may be used to investigate, and possibly refine, the Anhoej rules in statistical process control ([8], Table 1)

Summary

Introduction

The setting is defined by a number of independent observations from a Bernoulli distribution with the same success probability. The joint distribution of the number of crossings and the longest run is given conditional on these variables. The key to computing these probabilities is to recognize that, apart from the initial run of f-1 observations, the remaining n-(f-1) = n+1-f observations are independent and identically distributed Bernoulli observations with success probability p. They represent the same setting as all n observations, just with a shorter sequence. These n+1-f observations are conditional on a fixed value of their first observation, except that this fixed value is the opposite of the one in the entire sequence. By an induction argument following the iterative procedure, all these probabilities are integer multiples of 0.5^(n-1) and represent a partition of the binomial coefficients in the distribution of C by the values l = 1, … . The joint distributions for the number of crossings and the longest run may be used to investigate, and possibly refine, the Anhoej rules in statistical process control ([8], Table 1). The specificities may also be estimated by extensive and complicated simulations, as used in [9], but with lower precision.
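The partition property can be checked by brute force for a short sequence. The sketch below is not the iterative procedure implemented in crossrun, and it is written in Python rather than R. It assumes the symmetric case p = 0.5 and fixes the first observation at 1, so that each of the 2^(n-1) remaining sequences has probability 0.5^(n-1). The function names `crossings_and_longest_run` and `joint_counts` are hypothetical, chosen for illustration only.

```python
from collections import Counter
from itertools import product
from math import comb


def crossings_and_longest_run(seq):
    """Return (number of crossings C, longest run L) for a 0/1 sequence."""
    crossings = 0
    longest = current = 1
    for prev, cur in zip(seq, seq[1:]):
        if cur == prev:
            current += 1      # the current run continues
        else:
            crossings += 1    # a crossing starts a new run
            current = 1
        longest = max(longest, current)
    return crossings, longest


def joint_counts(n):
    """Count sequences of length n with first value 1, grouped by (C, L).

    Assumption: p = 0.5, so every sequence is equally likely and each
    count divided by 2**(n - 1) is a joint probability.
    """
    counts = Counter()
    for tail in product((0, 1), repeat=n - 1):
        counts[crossings_and_longest_run((1,) + tail)] += 1
    return counts


n = 10
counts = joint_counts(n)
for c in range(n):
    # For each c, the counts over l add up to the binomial coefficient.
    row_total = sum(k for (cc, _), k in counts.items() if cc == c)
    assert row_total == comb(n - 1, c)
    print(c, row_total)
```

Enumeration is only feasible for small n, since the number of sequences doubles with each added observation. It is useful here only as an independent check of the claim that the joint probabilities, multiplied by 2^(n-1), are integers that split each binomial coefficient by the longest run.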

Limitations and planned extensions
Conclusions