Abstract

The presence of outliers in geochemical data can impact the accuracy of grade models and influence the interpretation of mine assay data. Removal of outliers is therefore an important consideration in grade estimation work. This paper presents two sample truncation strategies which have been devised to reject outliers in multivariate geochemical data. In essence, a data-dependent threshold is applied to the robust distances of sorted samples to discard outliers within a given class. For robust distances based on the minimum covariance determinant (MCD) where sample deviations from the cluster centre are computed using robust estimates, the inverse chi-square cumulative distribution function is often used to compute the cutoff point, $\chi_{1-\alpha,\nu}$, under the assumption of multivariate normality. In this work, it has been observed that this approach consistently underestimates the true extent of outliers. The proposed alternatives consist of a geometric and an analytic approach. The former defines the sample truncation point as the knee of the robust distance curve in an approximately chi-square-distributed quantile–quantile plot. The latter uses the silhouette and likelihood functions to consider the degree of cohesion in the resultant inlier/outlier clusters. Both techniques significantly reduce the scatter amongst the samples retained in each domain/class. For validation, ensemble clustering based on t-distributed stochastic neighbour embedding (t-SNE) is used to study the outlier recall rate, the effects of feature selection, and spatial correlation with MCD-based outlier rejection. Visual and quantitative analyses show that the proposed methods are superior to the baseline method which rejects samples using chi-square critical values.
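
As a rough illustration of the baseline and of the geometric idea described above, the following minimal sketch computes MCD-based robust distances on synthetic data, applies the chi-square cutoff, and then applies a simple knee heuristic to the quantile-quantile curve. It is not the authors' implementation. The synthetic data, the function names (`robust_distances`, `chi2_cutoff`, `knee_cutoff`), the choice of $\alpha = 0.025$, and the knee rule itself (largest gap between the rescaled theoretical and empirical quantiles) are all assumptions made for illustration; the paper's actual knee detection may differ.

```python
import numpy as np
from scipy.stats import chi2
from sklearn.covariance import MinCovDet


def robust_distances(X, seed=0):
    """Robust Mahalanobis distance of each row of X, using MCD estimates."""
    mcd = MinCovDet(random_state=seed).fit(X)
    # MinCovDet.mahalanobis returns squared distances, so take the root.
    return np.sqrt(mcd.mahalanobis(X))


def chi2_cutoff(n_features, alpha=0.025):
    """Baseline cutoff: root of the chi-square quantile at 1 - alpha."""
    return np.sqrt(chi2.ppf(1.0 - alpha, df=n_features))


def knee_cutoff(d, n_features):
    """Assumed knee heuristic (illustrative only, not the paper's rule).

    Sorted distances are plotted against theoretical quantiles, both axes
    are rescaled to [0, 1], and the knee is taken as the point where the
    curve lies furthest below the diagonal joining its end points.
    """
    n = d.size
    d_sorted = np.sort(d)
    probs = (np.arange(1, n + 1) - 0.5) / n
    q = np.sqrt(chi2.ppf(probs, df=n_features))
    x = (q - q[0]) / (q[-1] - q[0])
    y = (d_sorted - d_sorted[0]) / (d_sorted[-1] - d_sorted[0])
    knee = int(np.argmax(x - y))
    return d_sorted[knee]


# Synthetic stand-in for one geochemical class: 300 inliers, 30 shifted outliers.
rng = np.random.default_rng(0)
inliers = rng.normal(size=(300, 3))
outliers = rng.normal(loc=6.0, size=(30, 3))
X = np.vstack([inliers, outliers])

d = robust_distances(X)
flag_chi2 = d > chi2_cutoff(X.shape[1])      # baseline rejection
flag_knee = d > knee_cutoff(d, X.shape[1])   # data-dependent rejection

print("rejected by chi-square cutoff:", int(flag_chi2.sum()))
print("rejected by knee heuristic:   ", int(flag_knee.sum()))
```

The chi-square cutoff depends only on the number of features and $\alpha$, whereas the knee-based threshold is read off the sorted distances themselves, which is what makes it data-dependent in the sense used above.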
