Data analysis and artificial intelligence systems are becoming widely used in many spheres of human life. This is evident both in established applications, such as recommending products to users in e-commerce, detecting spam in e-mail services, and moderating user comments, and in the personal use of such tools (for example, the chatbots ChatGPT, Google Bard, and Microsoft Copilot have appeared and gained significant popularity in the last two years). One of the key elements of such systems is data, which is needed to train and test intelligent data analysis software. A large amount of diverse data helps to build a software system with high accuracy. The task of selecting and preparing datasets for use in building such systems is therefore important. One difficulty in this task is the presence of private information in the datasets, which limits their use in intelligent data analysis systems. This paper is devoted to the development of a software system architecture for solving the classification problem on private data. Existing methods and architectural approaches to privacy preservation in machine learning are reviewed. A software system architecture is proposed whose characteristic feature is the protection of private datasets by means of functional encryption, which makes it possible to increase the number of datasets available for training publicly available data analysis and artificial intelligence systems. The proposed architecture is based on the client-server model and functional encryption. Its components are a classifier, an encryption key generator, and functional encryption and decryption modules. Prospects for further research are discussed.