Abstract

When faced with massive data, a Knowledge Graph (KG) needs a scale-out storage schema and a distributed parallel query engine to guarantee storage and query performance. In this paper, we propose a Knowledge Graph Storage Access System (KGSAS) based on HBase to address these needs. Our approach presents a scalable storage schema that uses a random row-key prefix together with table pre-partitioning to ensure load-balanced entity storage. To improve query efficiency, we further propose two distributed parallel query engines: HBase with Spark and HBase with Coprocessor. The HBase with Spark engine accelerates queries by running them in parallel with Spark's in-memory computation. The HBase with Coprocessor engine uses an inverted index and HBase Coprocessor technology to speed up queries by scanning the cluster in parallel. The evaluation results show that, of the two, the HBase with Coprocessor engine achieves better performance for querying the KG.
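The prefix-plus-pre-partition idea in the storage schema can be illustrated with a minimal sketch. It is not the paper's implementation: the bucket count, the `|` separator, the function names, and the use of a hash-derived prefix in place of a truly random one are all assumptions made for illustration. A hash-derived prefix is a common variant because the same entity always maps to the same key, so point lookups remain possible.

```python
import bisect
import hashlib
from typing import List

NUM_BUCKETS = 16  # hypothetical number of pre-split regions


def salted_row_key(entity_id: str, num_buckets: int = NUM_BUCKETS) -> bytes:
    """Prepend a two-digit bucket prefix derived from a hash of the entity id.

    Assumption: a deterministic hash stands in for the "random prefix",
    so the key of a given entity can be recomputed at read time.
    """
    digest = hashlib.md5(entity_id.encode("utf-8")).digest()
    bucket = digest[0] % num_buckets
    return f"{bucket:02d}|{entity_id}".encode("utf-8")


def split_points(num_buckets: int = NUM_BUCKETS) -> List[bytes]:
    """Region boundaries that a table could be pre-split on, one per bucket."""
    return [f"{b:02d}|".encode("utf-8") for b in range(1, num_buckets)]


def region_index(row_key: bytes, splits: List[bytes]) -> int:
    """Index of the region whose key range contains row_key."""
    return bisect.bisect_right(splits, row_key)


if __name__ == "__main__":
    splits = split_points()
    for entity in ["Albert_Einstein", "Marie_Curie", "Alan_Turing"]:
        key = salted_row_key(entity)
        print(key, "-> region", region_index(key, splits))
```

Because the prefix spreads lexicographically adjacent entity ids across buckets, and each bucket corresponds to one pre-created region, writes are distributed over the cluster from the start instead of concentrating on a single region.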
