Efficient and robust analysis of interlaboratory studies

Jaap Molenaar, Wim P Cofino, Paul J.J.F Torfs

doi:10.1016/j.chemolab.2018.01.003


Open Access

https://doi.org/10.1016/j.chemolab.2018.01.003


Abstract

In this paper we present the ab initio derivation of an estimator for the mean and variance of a sample of data, such as those obtained from proficiency tests. This estimator has been used for some time in this kind of analysis, but a thorough derivation, together with a detailed analysis of its properties, has been missing until now. The estimator uses the information contained in data including uncertainty, represented via probability density functions (pdfs). An implementation of the approach is given that can be used if the uncertainty information is not available: the so-called normal distribution approach (NDA). The present estimation procedure is based on calculating the centroid of the ensemble of pdfs. This centroid is obtained by solving the eigenvalue problem for the so-called similarity matrix. Elements of this matrix measure the similarity (or overlap) between different pdfs in terms of the Bhattacharyya coefficient. Since solving an eigenvalue problem is a standard numerical task nowadays, the method is extremely fast. The first and second moments of the centroid pdf are used to obtain the mean and variance of the dataset.

The properties of the estimator are extensively analyzed. We derive its variance and show the connection between the present estimator and Principal Component Analysis. Furthermore, we study its behavior in several limiting cases, as met in data that are very coherent or very incoherent, and check its consistency. In particular, we investigate how sensitive the estimator is to outliers by examining its breakdown point. In the normal distribution approach the breakdown point of the estimator is shown to be optimal, i.e., 50%.

The largest eigenvalue(s) of the similarity matrix appear(s) to provide important information. If the largest eigenvalue is close to the dimension of the matrix, this indicates that the data are very coherent, that is, they lie close to each other with similar uncertainties. If there are two (or more) largest eigenvalues with (nearly) equal values, this indicates that the data fall into two (or more) clusters.
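The procedure outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each laboratory result is a normal pdf with a reported value and standard uncertainty, and it assumes the centroid pdf is the square of the leading-eigenvector combination of the square-root pdfs. The function name `centroid_estimate` and its arguments are hypothetical. The closed form used for the Bhattacharyya coefficient between two normal pdfs is standard.

```python
import numpy as np

def centroid_estimate(values, uncertainties):
    """Sketch: mean and variance from the centroid of normal pdfs.

    Assumption (not taken from the paper): lab i is N(m_i, s_i^2), and the
    centroid pdf is (sum_i c_i * sqrt(f_i))^2 / lambda, with c the leading
    unit eigenvector of the similarity matrix S and lambda its eigenvalue.
    """
    m = np.asarray(values, dtype=float)
    s2 = np.asarray(uncertainties, dtype=float) ** 2

    ssum = s2[:, None] + s2[None, :]   # s_i^2 + s_j^2
    sprod = s2[:, None] * s2[None, :]  # s_i^2 * s_j^2
    diff = m[:, None] - m[None, :]     # m_i - m_j

    # Bhattacharyya coefficient between N(m_i, s_i^2) and N(m_j, s_j^2).
    S = np.sqrt(2.0 * np.sqrt(sprod) / ssum) * np.exp(-diff**2 / (4.0 * ssum))

    # S is symmetric, so eigh applies; eigenvalues come back in ascending order.
    eigvals, eigvecs = np.linalg.eigh(S)
    lam, c = eigvals[-1], eigvecs[:, -1]

    # sqrt(f_i * f_j) is S_ij times a normal pdf with these moments.
    m_ij = (m[:, None] * s2[None, :] + m[None, :] * s2[:, None]) / ssum
    v_ij = 2.0 * sprod / ssum

    # Weights c_i c_j S_ij, normalised so that they sum to one.
    W = np.outer(c, c) * S
    W /= W.sum()

    mean = (W * m_ij).sum()
    var = (W * (v_ij + m_ij**2)).sum() - mean**2
    return mean, var, lam

if __name__ == "__main__":
    # Five coherent results plus one gross outlier (made-up numbers).
    x = [10.0, 10.2, 9.8, 10.1, 9.9, 50.0]
    u = [0.2] * 6
    mean, var, lam = centroid_estimate(x, u)
    print(f"mean={mean:.3f}  variance={var:.4f}  largest eigenvalue={lam:.3f}")
```

In this sketch the outlier has essentially zero overlap with the other pdfs, so it receives a negligible weight in the leading eigenvector. A largest eigenvalue close to the number of laboratories would correspond to the coherent case described in the abstract.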


Journal: Chemometrics and Intelligent Laboratory Systems
Publication Date: Feb 10, 2018
Citations: 12
License type: cc-by-nc-nd


Similar Papers


Algebraic polynomial system solving and applications
I.W.M Bleylevens
01 Jan 2009

GA-Fisher: A New LDA-Based Face Recognition Algorithm With Selection of Principal Components
W.-S Zheng ... P.C Yuen
IEEE Transactions on Systems, Man and Cybernetics, Part B (Cybernetics) | VOL. 35
01 Oct 2005

Large cumulant eigenvalue as a signature of exciton condensation
Anna O Schouten ... Leeann M Sager-Smith
Physical Review B | VOL. 105
29 Jun 2022

A modified PCA algorithm for face recognition
Lin Luo ... E.I Plotkin
04 May 2003
