Abstract

Machine learning methods require large amounts of training data. Real-world constraints and missing labels hinder the collection of large datasets and therefore limit the applicability of these methods. A commonly applied solution is the generation of synthetic data. Most existing approaches, however, act as black-box systems, which compromises the quality and interpretability of the data they generate. This work therefore proposes two interpretable methods for generating synthetic time series data, both using features drawn from a pool of continuous function candidates. In the first approach, we manipulate the features of a deep autoencoder to sample synthetic data while preserving the basic components. The second method uses generative adversarial learning, where the generation process is guided by the main characteristics of the source dataset. Compared to existing methods, we are able to create data patterns that are indistinguishable from the originals for multiple datasets, while retaining full control over every individual component of the time series. In addition, the approach yields interpretable and clearly separated features of the time series in both the source and the target dataset domain. The approach is evaluated on multiple industrial datasets and provides qualitative and quantitative benefits over existing methods.

