Abstract

Dynamic access control of big data has received much attention in recent years because big data resources are generated dynamically and aggregated from multiple sources. Provenance-based access control uses provenance data to define dependency paths between resource states, so access can be controlled dynamically according to the stage and state of each resource. However, big data systems involve a huge volume of resources, and their state transition processes are complex. Existing research on provenance-based access control faces problems such as the low efficiency of manually formulating dependency paths, redundant information in provenance data, and the uneven distribution of resources. To solve these problems, this paper proposes KPI-HGNN, a key provenance identification framework based on the heterogeneous graph neural network. Within the framework, a community detection algorithm based on the heterogeneous graph neural network automatically identifies and divides the corresponding regions of big data resources by fusing the features of multiple types of nodes and edges. A key node identification algorithm based on the heterogeneous graph attention network then identifies the key nodes in each community by weighted aggregation of the attention coefficients of neighboring nodes. In addition, a key dependency path discovery algorithm is designed, and access control rules are generated automatically from the key dependency paths. The experimental results indicate that, compared with the baseline method, the proposed method achieves a better community detection clustering index, a higher Top-5% key node identification accuracy, and a better resource coverage rate and average percentage number product of key nodes for the provenance-based access control rules. These results indicate that the proposed framework can efficiently solve the problems of identifying key provenance information faced by dynamic access control of big data.
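
The idea of ranking key nodes by weighted aggregation of neighboring attention coefficients can be illustrated with a small sketch. This is not the KPI-HGNN implementation: the dot-product attention, the toy feature vectors, the node names, and the helper names (`attention_coefficients`, `key_node_scores`, `top_key_nodes`) are all assumptions made for illustration. A real heterogeneous graph attention network would learn type-specific projections rather than use fixed features.

```python
import math
from collections import defaultdict

def attention_coefficients(features, neighbors):
    """Softmax-normalised dot-product attention of each node over its neighbours.

    Assumption: plain dot products stand in for a learned attention function.
    """
    alpha = {}
    for node, nbrs in neighbors.items():
        if not nbrs:
            continue
        logits = [sum(a * b for a, b in zip(features[node], features[n]))
                  for n in nbrs]
        peak = max(logits)  # subtract the maximum for numerical stability
        exps = [math.exp(v - peak) for v in logits]
        total = sum(exps)
        for n, e in zip(nbrs, exps):
            alpha[(node, n)] = e / total
    return alpha

def key_node_scores(alpha):
    """Score each node by the total attention it receives from its neighbours."""
    scores = defaultdict(float)
    for (_, target), weight in alpha.items():
        scores[target] += weight
    return dict(scores)

def top_key_nodes(scores, k):
    """Return the k highest-scoring nodes, breaking ties by node name."""
    return sorted(scores, key=lambda n: (-scores[n], n))[:k]

# Hypothetical provenance community: three data states and one processing job.
features = {
    "raw": [1.0, 0.0],
    "clean": [1.0, 1.0],
    "report": [0.0, 1.0],
    "job": [2.0, 2.0],
}
neighbors = {
    "raw": ["job"],
    "clean": ["job", "raw"],
    "report": ["job", "clean"],
}

alpha = attention_coefficients(features, neighbors)
scores = key_node_scores(alpha)
key_nodes = top_key_nodes(scores, 1)
```

In this toy community the processing job receives the most attention from the resource states that depend on it, so it is ranked first. A dependency path through such a node would be a natural candidate for an access control rule.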
