Recent advances in machine (deep) learning and the exponential growth of data collected by and shared between smart manufacturing processes offer a unique opportunity to extract additional value from that data. The use of public machine learning services, however, raises the issue of data privacy. Ordinary encryption protects the data but can render it useless for machine learning purposes. Therefore, “privacy of data vs. value from data” is the major dilemma in privacy-preserving machine learning. Special encryption techniques and synthetic data generation are the current focus of efforts to address the issue. In this paper, we discuss a complex hybrid protection algorithm that applies two components in sequence: homeomorphic data space transformation and synthetic data generation. Special attention is given to the privacy of image data, because the specifics of image representation require special approaches to encryption and synthetic image generation. We suggest using (convolutional, variational) autoencoders and pre-trained feature extractors so that privacy protection algorithms can be applied on top of the latent feature vectors captured from the images, and we update the hybrid algorithms, composed of homeomorphic transformation-as-encryption plus synthetic image generation, accordingly. We show that an encrypted image can be reconstructed into a secured representation: the feature vector is extracted (by either the Encoder or a feature extractor), encrypted (by a homeomorphic transformation of the latent space), and then decoded by the pre-trained Decoder component of the convolutional variational autoencoder.
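The extract, encrypt, and decode pipeline described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the names `make_key`, `encrypt`, `decrypt`, `toy_encode`, and `toy_decode` are hypothetical, the encoder and decoder are random linear maps standing in for a trained autoencoder, and the key-dependent map (an orthogonal rotation, a shift, and an element-wise `sinh`) is just one simple example of a homeomorphism of the latent space, not necessarily the transformation the authors use.

```python
import numpy as np


def make_key(dim, seed):
    """Build a secret key: a random orthogonal matrix and a random shift (hypothetical scheme)."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))  # q is orthogonal, hence invertible
    shift = rng.normal(size=dim)
    return q, shift


def encrypt(z, key):
    """Map latent vectors through a continuous bijection with a continuous inverse."""
    q, shift = key
    return np.sinh(z @ q + shift)


def decrypt(e, key):
    """Invert encrypt(): undo sinh, then the shift, then the rotation."""
    q, shift = key
    return (np.arcsinh(e) - shift) @ q.T


# Toy stand-ins for a pre-trained Encoder/Decoder (assumption: linear maps, not a real VAE).
rng = np.random.default_rng(0)
W = rng.normal(size=(64, 8)) / 8.0   # 64-pixel "image" -> 8-dimensional latent vector
W_pinv = np.linalg.pinv(W)           # latent vector -> 64-pixel "image"


def toy_encode(x):
    return x @ W


def toy_decode(z):
    return z @ W_pinv


images = rng.random((5, 64))          # five flattened 8x8 toy images
key = make_key(dim=8, seed=42)

z = toy_encode(images)                # extracted feature vectors
z_enc = encrypt(z, key)               # encrypted latent vectors
secured = toy_decode(z_enc)           # "secured" image representation
z_back = decrypt(z_enc, key)          # the key holder can recover the latent vectors

print("max round-trip error:", np.max(np.abs(z_back - z)))
```

Because the map is invertible and continuous in both directions, nearby latent vectors remain nearby after encryption, which is the property that lets a model be trained on the transformed data, while the decoded `secured` array no longer matches the plain reconstruction.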