Outliers detection using an iterative strategy for semi‐supervised learning

Flavia D Frumosu, Murat Kulahci

doi:10.1002/qre.2522

Abstract

As a direct consequence of production systems' digitalization, high‐frequency and high‐dimensional data has become more easily available. In terms of data analysis, latent structures‐based methods are often employed when analyzing multivariate and complex data. However, these methods are designed for supervised learning problems when sufficient labeled data are available. Particularly for fast production rates, quality characteristics data tend to be scarcer than available process data generated through multiple sensors and automated data collection schemes. One way to overcome the problem of scarce outputs is to employ semi‐supervised learning methods, which use both labeled and unlabeled data. It has been shown that it is advantageous to use a semi‐supervised approach in case of labeled data and unlabeled data coming from the same distribution. In real applications, there is a chance that unlabeled data contain outliers or even a drift in the process, which will affect the performance of the semi‐supervised methods. The research question addressed in this work is how to detect outliers in the unlabeled data set using the scarce labeled data set. An iterative strategy is proposed using a combined Hotelling's T² and Q statistics and applied using a semi‐supervised principal component regression (SS‐PCR) approach on both simulated and real data sets.
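The abstract does not spell out the iterative strategy, so the following Python sketch is only an illustration of the general idea, not the authors' procedure. It assumes a PCA model on standardized data, the textbook definitions of Hotelling's T² (sum of squared scores scaled by their variances) and Q (squared reconstruction error), empirical quantile limits taken from the labeled observations, and a simple refit loop. The function names, the `quantile` and `max_iter` parameters, and the stopping rule are all hypothetical, and the SS‐PCR regression step itself is not included.

```python
import numpy as np


def fit_pca(X, n_components):
    """Fit PCA on standardized X (rows are observations)."""
    mean = X.mean(axis=0)
    scale = X.std(axis=0, ddof=1)
    Z = (X - mean) / scale
    # Rows of Vt are the principal directions of the standardized data.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[:n_components].T                    # shape (p, k)
    score_var = s[:n_components] ** 2 / (len(X) - 1)  # variance of each score
    return mean, scale, loadings, score_var


def t2_q(X, model):
    """Return Hotelling's T^2 and the Q statistic for each row of X."""
    mean, scale, loadings, score_var = model
    Z = (X - mean) / scale
    scores = Z @ loadings
    t2 = np.sum(scores ** 2 / score_var, axis=1)
    residual = Z - scores @ loadings.T
    q = np.sum(residual ** 2, axis=1)
    return t2, q


def iterative_outlier_flags(X_lab, X_unl, n_components=2,
                            quantile=0.99, max_iter=20):
    """Flag unlabeled rows as outliers (assumed scheme, see lead-in).

    Each pass refits PCA on the labeled rows plus the unlabeled rows
    currently trusted, sets T^2 and Q limits from the labeled rows only,
    and re-screens every unlabeled row against both limits.
    """
    keep = np.zeros(len(X_unl), dtype=bool)  # start by trusting no unlabeled row
    for _ in range(max_iter):
        reference = np.vstack([X_lab, X_unl[keep]])
        model = fit_pca(reference, n_components)
        t2_lab, q_lab = t2_q(X_lab, model)
        t2_lim = np.quantile(t2_lab, quantile)
        q_lim = np.quantile(q_lab, quantile)
        t2_unl, q_unl = t2_q(X_unl, model)
        new_keep = (t2_unl <= t2_lim) & (q_unl <= q_lim)
        if np.array_equal(new_keep, keep):
            break  # the trusted set stopped changing
        keep = new_keep
    return ~keep  # True marks a suspected outlier
```

Calling `iterative_outlier_flags(X_lab, X_unl)` with two NumPy arrays that share the same columns returns a boolean mask over the rows of `X_unl`. Anchoring the limits to the labeled rows is a design choice made here so that the thresholds do not tighten each time the reference set is trimmed; with very few labeled rows the empirical quantiles are coarse, and distribution-based limits may be preferable.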

Journal: Quality and Reliability Engineering International
Publication Date: Jul 1, 2019
Citations: 10
