Abstract

The development of artificial intelligence (AI) algorithms for classifying undesirable events has gained prominence in the industrial world. However, training such algorithms requires labeled data that identify the normal and anomalous operating conditions of the system, and labeled data are scarce or nonexistent because labeling demands a herculean effort from specialists. This chapter therefore compares the performance of six unsupervised Machine Learning (ML) algorithms for pattern recognition in multivariate time series data. The algorithms can identify patterns that assist, in a semiautomatic way, the data annotation process and thereby support the subsequent training of supervised AI models. To verify how well the unsupervised ML algorithms detect patterns of interest or anomalies in real time series data, the six algorithms were applied to two real cases: (i) meteorological data from a hurricane season and (ii) monitoring data from dynamic machinery for predictive maintenance purposes. Performance was evaluated with seven threshold indicators: accuracy, precision, recall, specificity, F1-Score, AUC-ROC and AUC-PRC. The results suggest that algorithms with a multivariate approach can be successfully applied to the detection of anomalies in multivariate time series data.
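
As a rough illustration of how the confusion-matrix indicators named above are computed, the sketch below derives accuracy, precision, recall, specificity and F1-Score from binary labels (1 = anomaly, 0 = normal). The function names and the ten example labels are hypothetical and are not taken from the chapter. AUC-ROC and AUC-PRC are left out because they require continuous anomaly scores rather than binary predictions.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = anomaly, 0 = normal)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def threshold_metrics(y_true, y_pred):
    """Return the five confusion-matrix indicators as a dictionary."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero when a class is absent.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical labels for ten time steps (not data from the chapter).
y_true = [0, 0, 0, 1, 1, 1, 0, 0, 1, 0]
y_pred = [0, 0, 1, 1, 1, 0, 0, 0, 1, 0]
metrics = threshold_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because anomalies are usually rare, accuracy alone can look high even when most anomalies are missed, which is one reason to report precision, recall and specificity alongside it.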

Highlights

  • Today, industry is being transformed by what experts call the “Fourth Industrial Revolution”, also known as Industry 4.0

  • The data were structured at hourly frequency and run from June 2012 to November 2012 (213 days and 22 hours), comprising 15,315 data points. This period corresponds to the hurricane season in the Atlantic Ocean, which that year was especially active with 19 tropical cyclones, 10 of which became hurricanes

  • This work demonstrated the effectiveness of a multivariate analysis using six different unsupervised Machine Learning (ML) algorithms for time series

Summary

Introduction

Industry is being transformed by what experts call the “Fourth Industrial Revolution”, also known as Industry 4.0. Anomaly detection techniques based on Principal Component Analysis (PCA) are able to extract the main features of a dataset without losing the ability to represent the original data; they use these features to characterize the normal class and apply distance metrics to identify cases that represent anomalies. This allows a model to be trained on existing imbalanced data. This indicates that most of these methods have little systematic advantage over one another when compared across many datasets. In this context, this chapter discusses the level of accuracy and reliability of six unsupervised ML algorithms for pattern recognition and anomaly detection with no need for labeled data. The real cases were: (i) meteorological data from a hurricane season and (ii) monitoring data from dynamic machinery for predictive maintenance purposes.
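
The PCA-based idea mentioned above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the chapter's method and not necessarily one of the six algorithms it compares. It assumes reconstruction error (the Euclidean distance between a point and its projection onto the principal subspace) as the distance metric, a 99th-percentile threshold, and synthetic two-channel data. The function name `pca_anomaly_scores` is hypothetical.

```python
import numpy as np


def pca_anomaly_scores(train, test, n_components=1):
    """Reconstruction-error score of each test row under a PCA fit on train."""
    mean = train.mean(axis=0)
    # Rows of vt are the principal directions of the centered training data.
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    components = vt[:n_components]
    centered = test - mean
    # Project onto the principal subspace and map back to the original space.
    reconstructed = centered @ components.T @ components
    # Distance between each point and its reconstruction.
    return np.linalg.norm(centered - reconstructed, axis=1)


rng = np.random.default_rng(0)

# Synthetic "normal" data (an assumption): two strongly correlated channels.
t = rng.normal(size=500)
normal = np.column_stack([t, 2.0 * t + rng.normal(scale=0.1, size=500)])

# Assumed rule: flag anything above the 99th percentile of normal scores.
threshold = np.quantile(pca_anomaly_scores(normal, normal), 0.99)

# The first point follows the normal correlation; the second breaks it.
test = np.array([[1.0, 2.0], [1.0, -2.0]])
scores = pca_anomaly_scores(normal, test)
flags = scores > threshold
print(scores, flags)
```

Only data assumed to be normal are needed to fit the components, which is why this family of techniques can work with heavily imbalanced data.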

Unsupervised ML algorithms
C-AMDATS
Luminol Bitmap
SAX-REPEAT
Bootstrap
Comparative analysis between the algorithms
Characterization of case studies
Limitations
Case study 01 - meteocean data in hurricane season
Case study 02 - monitoring data from dynamic machinery
Parameterization of the ML algorithms
Case study experiment 01 - meteocean data in hurricane season
Case study experiment 02 - monitoring data from dynamic machinery
Performance evaluation
Conclusions and future work recommendations
