Abstract
An iterative outlier elimination procedure based on hypothesis testing, commonly known among geodesists as Iterative Data Snooping (IDS), is often used for the quality control of modern measurement systems in geodesy and surveying. The test statistic associated with IDS is the extreme normalised least-squares residual. It is well known in the literature that critical values (quantile values) of such a test statistic cannot be derived from well-known test distributions but must be computed numerically by means of Monte Carlo simulation. This paper provides the first results on Monte Carlo-based critical values under different scenarios of correlation between outlier statistics. From the Monte Carlo evaluation, we compute the probabilities of correct identification, missed detection, wrong exclusion, over-identification and statistical overlap associated with IDS in the presence of a single outlier. On the basis of these probability levels, we obtain the Minimal Detectable Bias (MDB) and Minimal Identifiable Bias (MIB) for cases in which IDS is in play. The MDB and MIB are sensitivity indicators for outlier detection and identification, respectively. The results show that there are circumstances in which the larger the Type I decision error (the smaller the critical value), the higher the rates of outlier detection but the lower the rates of outlier identification. In such a case, the larger the Type I error, the larger the ratio between the MIB and the MDB. We also highlight that an outlier becomes identifiable when the contributions of the measurements to the wrong exclusion rate decline simultaneously. In this case, we verify that the effect of the correlation between outlier statistics on the wrong exclusion rate becomes insignificant for a certain outlier magnitude, which increases the probability of identification.
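To make the decision classes concrete, the following is a minimal sketch of how their rates could be estimated by simulation for one outlier of a given magnitude. It is not the paper's implementation. It assumes uncorrelated observations with unit variance, a design matrix `A` for a simple line fit, and an illustrative critical value `k = 3.0` that is not taken from the paper. The function names (`w_stats`, `ids`, `classify`, `ids_rates`) are hypothetical. The split into "over_plus" (the outlier and at least one other observation excluded) and "over_minus" (several observations excluded, none of them the outlier) is an assumed reading of over-identification, and statistical overlap (tied test statistics) is not modelled.

```python
import numpy as np

def w_stats(A, y):
    """Normalised LS residuals, assuming unit-variance, uncorrelated observations."""
    n = A.shape[0]
    # Redundancy matrix R = I - A (A^T A)^{-1} A^T maps observations to residuals.
    R = np.eye(n) - A @ np.linalg.solve(A.T @ A, A.T)
    return (R @ y) / np.sqrt(np.diag(R))

def ids(A, y, k):
    """Return the indices excluded by iterative data snooping with critical value k."""
    active = list(range(A.shape[0]))
    excluded = []
    # Keep testing only while at least two redundant observations remain.
    while len(active) > A.shape[1] + 1:
        w = np.abs(w_stats(A[active], y[active]))
        j = int(np.argmax(w))
        if w[j] <= k:
            break  # extreme normalised residual accepted: stop iterating
        excluded.append(active.pop(j))  # exclude the most suspect observation
    return excluded

def classify(excluded, outlier_idx):
    """Map one IDS outcome to a decision class (assumed definitions, see lead-in)."""
    if not excluded:
        return "missed_detection"
    hit = outlier_idx in excluded
    if len(excluded) == 1:
        return "correct_identification" if hit else "wrong_exclusion"
    return "over_plus" if hit else "over_minus"

def ids_rates(A, outlier_idx, bias, k, n_trials=2000, seed=0):
    """Estimate the rate of each decision class for a single outlier of size `bias`."""
    rng = np.random.default_rng(seed)
    counts = dict.fromkeys(
        ["missed_detection", "correct_identification",
         "wrong_exclusion", "over_plus", "over_minus"], 0)
    for _ in range(n_trials):
        # The unknown parameters do not affect the residuals, so they are set to zero.
        y = rng.standard_normal(A.shape[0])
        y[outlier_idx] += bias
        counts[classify(ids(A, y, k), outlier_idx)] += 1
    return {name: c / n_trials for name, c in counts.items()}

# Illustrative setup: straight-line fit through 10 points, outlier of 6 sigma.
A = np.column_stack([np.ones(10), np.arange(10.0)])
rates = ids_rates(A, outlier_idx=4, bias=6.0, k=3.0)
print(rates)
```

Repeating the call over a grid of `bias` values and locating where the missed-detection rate or the correct-identification rate crosses a chosen threshold would give simulation-based counterparts of the MDB and MIB under these assumptions.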
Highlights
In recent years, outlier detection has been increasingly applied in sensor data processing [1,2,3,4,5,6,7,8,9]
On the basis of the probability levels associated with iterative data snooping (IDS), we show how to find the two sensitivity indicators, the Minimal Detectable Bias (MDB) and the Minimal Identifiable Bias (MIB), for IDS. These probability levels are the probability of correct identification (PCI), the probability of missed detection (PMD) or of correct detection (PCD), the probability of wrong exclusion (PWE), Pover+, Pover− and the probability of "statistical overlap" (Pol)
On the basis of the probability levels of IDS, the sensitivity indicators (the Minimal Detectable Bias (MDB) and the Minimal Identifiable Bias (MIB)) can be determined for a given measurement system
Summary
Outlier detection has been increasingly applied in sensor data processing [1,2,3,4,5,6,7,8,9]. Probability levels have already been described in the literature for the case in which data snooping is run once (i.e., a single round of estimation and testing), as well as for the case in which the outlier is parameterised in the model (see, e.g., [2,19,20,21,23,37,46,47]). For such cases, the probability of correct detection (PCD) and the probability of correct identification (PCI), together with their corresponding Minimal Detectable Bias (MDB) and Minimal Identifiable Bias (MIB), have already been described for data snooping [37,46]. The critical value is computed by Monte Carlo such that a user-defined Type I decision error α for IDS is ensured. We analyse the relationship between the sensitivity indicators MDB and MIB for IDS.
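As an illustration of how a critical value can be tied to a user-defined α, the sketch below simulates the extreme normalised least-squares residual under the null hypothesis (no outlier) and takes its empirical (1 − α) quantile. It is a simplified sketch, not the paper's procedure. It assumes uncorrelated observations with unit variance, a hypothetical design matrix `A` for a straight-line fit, and calibration on the first round of testing only. The function names are introduced here for illustration.

```python
import numpy as np

def normalised_residuals(A, e):
    """Normalised LS residuals for each row of e (one simulated error vector per row).

    Assumes unit-variance, uncorrelated observations, so w_i = e_hat_i / sqrt(r_ii).
    """
    n = A.shape[0]
    # Redundancy matrix R = I - A (A^T A)^{-1} A^T; it is symmetric, so e @ R = (R e^T)^T.
    R = np.eye(n) - A @ np.linalg.solve(A.T @ A, A.T)
    return (e @ R) / np.sqrt(np.diag(R))

def mc_critical_value(A, alpha, n_trials=100_000, seed=0):
    """Empirical (1 - alpha) quantile of max_i |w_i| under the null hypothesis."""
    rng = np.random.default_rng(seed)
    e = rng.standard_normal((n_trials, A.shape[0]))  # outlier-free random errors
    max_w = np.abs(normalised_residuals(A, e)).max(axis=1)
    return np.quantile(max_w, 1.0 - alpha)

# Illustrative setup: straight-line fit through 10 points, alpha = 5 %.
A = np.column_stack([np.ones(10), np.arange(10.0)])
k = mc_critical_value(A, alpha=0.05)
print(f"critical value for max |w|: {k:.3f}")
```

Because the residuals are correlated through the redundancy matrix, the simulated quantile depends on the geometry encoded in `A`, which is why it is obtained numerically here rather than read from a standard table.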