Widespread innovation from artificial intelligence and machine learning (AI/ML) tools presents a lucrative opportunity for the nuclear industry to improve state-of-the-art analyses (e.g., condition monitoring and remote operation) through increased data visibility. In recent years, the risk posed by collaborative data exchange has received increased attention due, in part, to a potential adversary's ability to reverse-engineer intercepted data using domain knowledge and AI/ML tools. While typical encryption has proven effective for passive communication and data storage, collaborative exchange typically requires decryption for extended analyses by a third party, which poses an intrinsic risk because that party's trustworthiness cannot be guaranteed. The directed infusion of data (DIOD) paradigm (A. Al Rashdan and H. Abdel-Khalik, Deceptive Infusion of Data, Non-Provisional Patent, Application No. 63/227,389, September 2022) presented in this paper is a novel data masking technique that preserves the usable information of proprietary data while concealing its identity via reduced-order modeling. In contrast to existing state-of-the-art data masking methods, DIOD does not impose limiting assumptions, computational overheads, or induced uncertainties, thereby allowing for flexible data-level security that does not alter the inferential content of the data. This paper focuses on the application of DIOD to a process-based simulation in which a leaking reservoir with a controlled inlet pump is simulated under various experimental conditions, with the goal of producing masked data that preserve the information carried by anomalies. These experiments included the injection of statistically significant anomalies, subtle anomalies that occurred over an extended period, and the addition of several independent anomalous states. Each experiment showed that a classifier identifies the same anomalies whether it analyzes the original or the masked data.
An additional experiment tested the case of corrupted labeling information, in which labels were arbitrarily randomized; the loss in labeling accuracy was about the same for both datasets. Together, these experiments show that data obfuscated by DIOD may be used in place of real data for a variety of condition monitoring scenarios with no loss in performance.